Introduction to Apache Hadoop Development

HDP-100 (4 Days)

Request Pricing for Introduction to Apache Hadoop Development

Hadoop Developer Training Overview

Accelebrate's Introduction to Apache Hadoop Development training class teaches attendees how to build distributed, data-intensive applications using the Hadoop framework. Students learn the principles of parallel programming and how to use Big Data tools such as Pig, Hive, and HBase.

Location and Pricing

Most Accelebrate courses are taught as private, customized training for 3 or more attendees at our clients' sites worldwide. In addition, we offer live, private online classes for teams who may be in multiple locations or wish to save on travel costs. Please visit our client list for organizations for whom we have delivered onsite training. To receive a customized proposal and price quote for private on-site or online training, please contact us.

Hadoop Developer Training Objectives

  • Understand the principles of parallel computing
  • Understand Hadoop architecture (HDFS and MapReduce)
  • Use additional Big Data tools (Pig, Hive, HBase, etc.)
  • Learn Big Data patterns and best practices
  • Define Big Data project architecture
  • Understand and use NoSQL, Mahout, and Oozie

Hadoop Developer Training Outline

Introduction
  • Hadoop history and concepts
  • Ecosystem
  • Distributions
  • High-level architecture
  • Hadoop myths
  • Hadoop challenges (hardware / software)
HDFS
  • Concepts (horizontal scaling, replication, data locality, rack awareness)
  • Architecture
  • Namenode (function, storage, file system metadata, and block reports)
  • Secondary namenode
  • HA Standby namenode
  • Data node
  • Communications / heartbeats
  • Block manager / balancer
  • Health check / safemode
  • Read / write path
  • Navigating HDFS UI
  • Command-line interaction with HDFS
  • File system abstractions
  • WebHDFS
  • Reading / writing files using Java API
  • Getting Data into / out of HDFS (Flume, Sqoop)
  • Getting HDFS stats
  • Latest in HDFS
  • Namenode HA and Federation
  • HDFS roadmap
MapReduce
  • Parallel computing before MapReduce
  • MapReduce concepts
  • Daemons: jobtracker / tasktracker
  • Phases: driver, mapper, shuffle/sort, and reducer
  • First MapReduce job
  • MapReduce UI walkthrough
  • Counters
  • Distributed cache
  • Combiners
  • Partitioners
  • MapReduce configuration
  • Job config
  • MR types and formats
  • Sorting
  • Job schedulers
  • MapReduce best practices
  • MRUnit
  • Optimizing MapReduce
  • Foolproofing MR
  • Thinking in MapReduce
  • YARN: architecture and use
Pig
  • Intro: principles and use cases
  • Pig versus MapReduce
Hive
  • Intro: principles and use cases
  • Environment and configuration
  • Hive tables and metadata
  • Hive keywords
HBase
  • History and concepts
  • Architecture
  • HBase versus RDBMS
  • HBase shell
  • HBase Java API
  • Splits and compaction
  • Read path / write path
  • Schema design
Real-world Big Data skills and a hackathon
  • NoSQL design patterns: going from SQL to NoSQL
  • Smart Meter data collection with Flume
  • Sinks into HDFS and HBase
  • Analyzing smart meter data with Pig and Hive
  • Smart meter analytics with Mahout
  • Scheduling complete workflow with Oozie
Conclusion

Lecture/Demo:

50%

Lab:

50%

Course Number:

HDP-100

Duration:

4 Days

Prerequisites:

All attendees must be comfortable with the Java programming language (since all programming exercises are in Java), familiar with Linux commands, and proficient with an IDE such as Eclipse or a Linux editor (vi / nano) for modifying the code.

Training Materials:

All attendees receive courseware and a related textbook.

Software Requirements:

  • A web browser - any recent version of Chrome, Firefox, or Internet Explorer, with a recent version of Flash Player
  • An SSH client
  • We will provide Hadoop clusters in a remote environment.

Contact Us:

Accelebrate’s training classes are available for private groups of 3 or more people at your site or online anywhere worldwide.

Don't settle for a "one size fits all" public class! Have Accelebrate deliver exactly the training you want, privately at your site or online, for less than the cost of a public class.

For pricing and to learn more, please contact us.

Toll-free in US/Canada:
877 849 1850
International:
+1 678 648 3113

Toll-free in US/Canada:
866 566 1228
International:
+1 404 420 2491

925B Peachtree Street, NE
PMB 378
Atlanta, GA 30309-3918
USA

© 2013-2019 Accelebrate, Inc. All Rights Reserved. All trademarks are owned by their respective owners.