Course Number: AWS-154
Duration: 1 day (6.5 hours)
Format: Live, hands-on

Data Analytics on AWS Training Overview

This Building Batch Data Analytics Solutions on AWS training teaches attendees how to build batch data analytics solutions using Amazon EMR, a managed cluster platform that simplifies running big data frameworks such as Apache Spark and Apache Hadoop. Participants also learn how Amazon EMR integrates with open-source projects such as Hive, Hue, and HBase, as well as with other AWS services such as AWS Glue and AWS Lake Formation.

Accelebrate is an AWS Training Partner (ATP), and this hands-on, official AWS Classroom Training course is taught by an accredited Amazon Authorized Instructor (AAI).

Location and Pricing

Accelebrate offers instructor-led enterprise training for groups of 3 or more online or at your site. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for private corporate training on-site or online, please contact us.

In addition, some courses are available as live, instructor-led training from one of our partners.

Objectives

  • Compare the features and benefits of data warehouses, data lakes, and modern data architectures
  • Design and implement a batch data analytics solution
  • Identify and apply appropriate techniques, including compression, to optimize data storage
  • Select and deploy appropriate options to ingest, transform, and store data
  • Choose the appropriate instance and node types, clusters, auto-scaling, and network topology for a particular business use case
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
  • Secure data at rest and in transit
  • Monitor analytics workloads to identify and remediate problems
  • Apply cost management best practices

Prerequisites

Students should have a minimum of one year of experience managing open-source data frameworks such as Apache Spark or Apache Hadoop. It is recommended that attendees complete AWS Technical Essentials or Architecting on AWS. It is also recommended that students complete Building Data Lakes on AWS.

Outline


Overview of Data Analytics and the Data Pipeline
  • Data analytics use cases
  • Using the data pipeline for analytics
Introduction to Amazon EMR
  • Using Amazon EMR in analytics solutions
  • Amazon EMR cluster architecture
  • Launching an Amazon EMR cluster
  • Cost management strategies
Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage
  • Storage optimization with Amazon EMR
  • Data ingestion techniques
High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR
  • Apache Spark on Amazon EMR use cases
  • Why Apache Spark on Amazon EMR
  • Spark concepts
  • Connect to an EMR cluster and run Scala commands using the Spark shell
  • Transformation, processing, and analytics
  • Using notebooks with Amazon EMR
  • Low-latency data analytics using Apache Spark on Amazon EMR
Processing and Analyzing Batch Data with Amazon EMR and Apache Hive
  • Using Amazon EMR with Hive to process batch data
  • Transformation, processing, and analytics
  • Batch data processing using Amazon EMR with Hive
  • Introduction to Apache HBase on Amazon EMR
Serverless Data Processing
  • Serverless data processing, transformation, and analytics
  • Using AWS Glue with Amazon EMR workloads
  • Orchestrate data processing in Spark using AWS Step Functions
Security and Monitoring of Amazon EMR Clusters
  • Securing EMR clusters
  • Client-side encryption with EMRFS
  • Monitoring and troubleshooting Amazon EMR clusters
  • Reviewing Apache Spark cluster history
Designing Batch Data Analytics Solutions
  • Batch data analytics use cases
  • Designing a batch data analytics workflow
Developing Modern Data Architectures on AWS
  • Modern data architectures

Training Materials

All AWS training students receive comprehensive courseware.

Software Requirements

Attendees need a modern web browser and an Internet connection that allows SSH or Remote Desktop (RDP) connections to AWS virtual machines.





