Apache Airflow for Machine Learning Operations


Course Number: PYTH-226
Duration: 3 days (19.5 hours)
Format: Live, hands-on

Airflow for ML Training Overview

This Apache Airflow for Machine Learning Operations training course teaches machine learning (ML) engineers how to build, train, and validate models, upload them to a model registry, and deploy them in a reproducible manner.

Attendees learn the fundamentals of machine learning operations and why creating a reproducible CI/CD pipeline for ML models is complex. Next, students explore how Apache Airflow can reduce this complexity in batch training scenarios, which make up the majority of cases. In addition, attendees learn the foundations of Airflow and how it creates reproducible and trustworthy pipelines via DAGs (Directed Acyclic Graphs).

This course focuses on real-world applications of ML, such as sentiment prediction in a stream of tweets, using both traditional machine learning algorithms and deep learning algorithms.

Throughout the course, students tackle diverse machine learning problems by creating reproducible pipelines with Airflow.
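
To make the DAG idea concrete, the following is a minimal sketch that uses only Python's standard library (`graphlib`, Python 3.9 or later), not Airflow itself. The task names are hypothetical and chosen to mirror the build, validate, register, and deploy steps described above. Because the graph has no cycles, a valid run order always exists, which is part of what makes such a pipeline repeatable.

```python
import graphlib  # standard library, Python 3.9+

# Hypothetical ML pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "prepare_dataset": set(),
    "train_model": {"prepare_dataset"},
    "validate_model": {"train_model"},
    "register_model": {"validate_model"},
    "deploy_model": {"register_model"},
}


def execution_order(dag):
    """Return one valid run order for the tasks.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    that is, if the graph is not actually a DAG.
    """
    return list(graphlib.TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

An orchestrator such as Airflow applies the same principle at a larger scale: a task runs only after the tasks it depends on have completed.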

Location and Pricing

Accelebrate offers instructor-led enterprise training for groups of 3 or more online or at your site. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for private corporate training on-site or online, please contact us.

In addition, some courses are available as live, instructor-led training from one of our partners.

Objectives

  • Migrate machine learning training workflows to scalable pipelines in Apache Airflow
  • Start with a raw dataset and a model architecture and take the project from beginning to end, culminating in deploying it in the cloud
  • Enforce reusability and modularization of pipelines for easy collaboration

Prerequisites

Students must have basic Python knowledge or object-oriented programming experience. Knowledge of machine learning is helpful but not required.

Outline

Introduction
The Scalability Problem of Machine Learning Pipelines
  • What problems arise when trying to create a machine learning model?
  • The components of a machine learning platform
  • Introducing Apache Airflow
  • Airflow architecture
  • How do we represent a machine learning pipeline?
  • Our first DAG
  • Tasks, TaskFlows, and Operators
  • First Pipeline
  • Creating the datasets for training
Creating our Machine Learning Pipeline
  • Using custom operators
  • Creating a Train Operator
  • Creating TaskGroups vs subDAGs
  • Sharing data with XComs
  • Branching and Triggers
  • Sensors and SmartSensors
  • Adding a sensor to validate enough new data
  • Adding training, validation, and delivery steps to our pipeline
Mastering Scheduling
  • execution_date, start_date, and schedule_interval
  • Handling non-default schedule_intervals
  • Playing with time
  • Using Sensors with a correct schedule_interval
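
The relationship between these three settings is a common source of confusion, so here is a minimal sketch in plain Python (standard library only, no Airflow import). It assumes the interval-based behavior of Airflow 2.1, the version listed under Software Requirements, with a fixed `timedelta` schedule: a run labeled with a given `execution_date` is triggered only after the interval starting at that date has ended. The helper name `first_runs` is hypothetical.

```python
from datetime import datetime, timedelta


def first_runs(start_date, schedule_interval, count):
    """Return (execution_date, earliest_trigger_time) pairs for the first runs.

    Assumes a fixed timedelta schedule that begins exactly at start_date.
    """
    runs = []
    for i in range(count):
        execution_date = start_date + i * schedule_interval
        # The run covering an interval starts only once that interval is over.
        trigger_time = execution_date + schedule_interval
        runs.append((execution_date, trigger_time))
    return runs


runs = first_runs(datetime(2023, 1, 1), timedelta(days=1), 3)
for execution_date, trigger_time in runs:
    print(execution_date.date(), "->", trigger_time.date())
```

Under these assumptions, a daily DAG with a `start_date` of January 1 runs for the first time on January 2, and that run carries an `execution_date` of January 1.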
Enabling Concurrency and Scalability
  • Moving from SQLite to PostgreSQL
  • Executors: Debug, Local, Celery
  • Concurrency and parallelism
  • Concurrency with Celery
Hackathon: Sentiment Prediction from Twitter
Conclusion

Training Materials

All Apache Airflow for Machine Learning training attendees receive comprehensive courseware.

Software Requirements

This course is taught using:

  • Python 3.8 or later
  • Apache Airflow 2.1 or later
  • scikit-learn 1.1 or later
  • PyTorch 1.8 or later

On request, we can provide either a remote VM environment for the class or directions for configuring this environment on your local PCs.



Learn faster

Our live, instructor-led lectures are far more effective than pre-recorded classes

Satisfaction guarantee

If your team is not 100% satisfied with your training, we do what's necessary to make it right

Learn online from anywhere

Whether you are at home or in the office, we make learning interactive and engaging

Multiple Payment Options

We accept check, ACH/EFT, major credit cards, and most purchase orders


