Neural Networks Training Overview

This NVIDIA Model Parallelism training course teaches attendees how to train, optimize, and deploy large-scale models. Participants learn model parallelism, inference optimization, and production deployment techniques for working with very large deep neural networks (DNNs). By the end of this course, students can train large neural networks and deploy them to production.

Location and Pricing

This course is taught as a private, live online class for teams of 3 or more. All our courses are hands-on, instructor-led, and tailored to fit your group’s goals and needs. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for online corporate training, please contact us.

In addition, some courses are available as live, instructor-led training from one of our partners.

Objectives

Understand the motivation for, and key challenges of, training very large neural networks

Master fundamental techniques and frameworks for distributed training across multiple servers

Implement advanced model parallelism strategies to overcome memory limitations and scale your models further

Tune model performance through profiling, auto-tuning, and mixture-of-experts architectures

Apply real-world deployment techniques, including model reduction, NVIDIA inference libraries, and production-ready inference servers

Outline

Introduction to Training of Large Models

Learn about the motivation behind and key challenges of training large models
Get an overview of the basic techniques and tools needed for large-scale training
Get an introduction to distributed training and the Slurm job scheduler
Train a Megatron-LM-based GPT model using data parallelism
Profile the training process and understand execution performance

Model Parallelism: Advanced Topics

Increase the model size using a range of memory-saving techniques
Get an introduction to tensor and pipeline parallelism
Go beyond natural language processing and get an introduction to DeepSpeed
Auto-tune model performance
Learn about mixture-of-experts models

Inference of Large Models

Understand the challenges of deployment associated with large models
Explore techniques for model reduction
Learn how to use the NVIDIA® TensorRT™ and FasterTransformer libraries
Learn how to use Triton Inference Server
Understand the process of deploying a GPT checkpoint to production
See an example of prompt engineering

Conclusion

Learn faster

Our live, instructor-led lectures are far more effective than pre-recorded classes

Satisfaction guarantee

If your team is not 100% satisfied with your training, we do what's necessary to make it right

Learn online from anywhere

Whether you are at home or in the office, we make learning interactive and engaging

Multiple Payment Options

We accept check, ACH/EFT, major credit cards, and most purchase orders

© 2013-2024 Accelebrate, LLC - All rights reserved. All trademarks are owned by their respective owners.