Big Data Infrastructure

DATA-101 (5 Days)

Request Pricing

Big Data Training Overview

Accelebrate's Big Data Infrastructure training course teaches developers, data scientists, and DevOps professionals how to navigate and understand Big Data services. Attendees learn how to leverage containers (Docker), the Hadoop Distributed File System (HDFS), Apache Spark, Natural Language Processing (NLP) applications, Cassandra, Kubernetes, and more.

Location and Pricing

Most Accelebrate courses are delivered as private, customized, on-site training at our clients' locations worldwide for groups of 3 or more attendees and are custom tailored to their specific needs. Please visit our client list to see organizations for whom we have delivered private in-house training. These courses can also be delivered as live, private online classes for groups that are geographically dispersed or wish to save on the instructor's or students' travel expenses. To receive a customized proposal and price quote for private training at your site or online, please contact us.

Big Data Training Objectives

This course will:

  • Explain “Big Data” and the challenges that can arise when trying to gain insights from it: volume, velocity, variety, variability, and complexity.
  • Introduce the open source systems commonly used to work with data at scale: Cassandra, Elasticsearch, Hadoop/HDFS, Spark, and Kafka.
  • Provide an overview of each member of the big data ecosystem and examples of the problems each is intended to address.
  • Take a deep dive into Kafka and see how it can be used to build pipelines for managing streams of data.
  • Take a deep dive into Spark to show how it can be used to analyze different types of data stored in relational databases, Cassandra, Elasticsearch, and Hadoop/HDFS.

Big Data Training Outline

Getting Started
  • Introduce tools to be used in the course
  • Docker
  • Python and Anaconda
  • Jupyter
Data versus Big Data
  • Challenges of Data: 5VC
  • What is Big Data?
  • Technology to the Rescue: Making it easier to work with large sets of data
  • Virtualization
  • Everything as a Service (EaaS): Strategies for managing complex computational tools
  • Distributed Storage and Computation
  • Properties of a Big Data Solution
Mesos and DC/OS: Operating Platform for the Modern Data Center
  • Architecture
  • Configuration and management
  • Application and service deployment
  • Networking, load balancing, and application isolation
Kafka
  • Rationale and role: what problem does Kafka solve?
  • Architecture and key components
  • Service installation within DC/OS
  • API: Consumers and Producers
  • Python and Java client libraries
  • Kafka Connect: Tools for moving and working with structured data
Hadoop Distributed File System (HDFS)
  • What is HDFS and how does it fit with the Hadoop world?
  • HDFS API and tools for ingesting data for later analysis
  • Python and Java client libraries
Apache Spark: General Engine for Large Scale Data Processing
  • What is Spark?
  • How is it used in practice?
  • Architecture and components
  • Ecosystem: Core, SQL, Machine Learning (MLlib), Graph
  • Service installation within DC/OS
  • API and environment
  • Analysis of structured data
  • How Spark can be used to analyze streaming datasets
Elasticsearch: Storage Versus Search
  • What is the role of search within Big Data? What benefit does a search engine provide?
  • Architecture and data storage
  • What enables Elasticsearch to be used as an analytics platform: aggregations and tokenization
  • Service installation within DC/OS
  • Key APIs
  • Role as a Natural Language Processing platform
  • Use of Kibana as a visualization platform to explore ES data
Cassandra: Unstructured Data Storage at Scale
  • Purpose of Cassandra: What problems does it solve?
  • Architecture and components
  • API and integration
  • Service installation in DC/OS
  • Python and Java client libraries
Kubernetes
  • What is DevOps?
  • How do DevOps practices relate to Big Data?
  • What role does Kubernetes play within DC/OS and how can it make the management of custom software easier?
  • Architecture and components
  • Service installation in DC/OS
  • Demonstrate workflows for deploying data services to Kubernetes
  • Package and deploy Kafka Connect applications as a pod
  • Package and deploy Apache Spark Streaming applications as a pod
Conclusion
Lecture/Demo: 40%

Lab: 60%

Course Number:

DATA-101

Duration:

5 Days

Prerequisites:

All students should meet the following prerequisites:

  • Students should be familiar with Unix operating systems and the Bash command line interface (CLI).
  • Students should be comfortable with executing commands from the terminal, capturing and redirecting output, and analyzing program logs and output.
  • Students will use the Git version control system and should be comfortable with the add, commit, push, pull, remote, and submodule commands.
  • Course examples are written in Java and Python. Students should be able to understand the basic syntax and program structure of both programming languages. The source code for all course examples will be provided.

Training Materials:

All students receive comprehensive courseware.

Software Requirements:

Each student will be provided access to a set of eight virtual machines with all of the materials required for the class. To access the remote lab system, students will need a laptop or desktop machine with a recent version of Mozilla Firefox, Google Chrome, or Apple Safari installed.

Contact Us:

Accelebrate’s training classes are available for private groups of 3 or more people at your site or online anywhere worldwide.

Don't settle for a "one size fits all" public class! Have Accelebrate deliver exactly the training you want, privately at your site or online, for less than the cost of a public class.

For pricing and to learn more, please contact us.

Toll-free in US/Canada:
877 849 1850
International:
+1 678 648 3113

Toll-free in US/Canada:
866 566 1228
International:
+1 404 420 2491

925B Peachtree Street, NE
PMB 378
Atlanta, GA 30309-3918
USA

Microsoft Gold Partner

Please see our complete list of Microsoft Official Courses.

Recent Training Locations

Alabama

Huntsville

Montgomery

Birmingham

Alaska

Anchorage

Arizona

Phoenix

Tucson

Arkansas

Fayetteville

Little Rock

California

San Francisco

Oakland

San Jose

Orange County

Los Angeles

Sacramento

San Diego

Colorado

Denver

Boulder

Colorado Springs

Connecticut

Hartford

DC

Washington

Florida

Fort Lauderdale

Miami

Jacksonville

Orlando

Saint Petersburg

Tampa

Georgia

Atlanta

Augusta

Savannah

Idaho

Boise

Illinois

Chicago

Indiana

Indianapolis

Iowa

Cedar Rapids

Des Moines

Kansas

Wichita

Kentucky

Lexington

Louisville

Louisiana

Baton Rouge

New Orleans

Maine

Portland

Maryland

Annapolis

Baltimore

Hagerstown

Frederick

Massachusetts

Springfield

Boston

Cambridge

Michigan

Ann Arbor

Detroit

Grand Rapids

Minnesota

Saint Paul

Minneapolis

Mississippi

Jackson

Missouri

Kansas City

St. Louis

Nebraska

Lincoln

Omaha

Nevada

Reno

Las Vegas

New Jersey

Princeton

New Mexico

Albuquerque

New York

Buffalo

Albany

White Plains

New York City

North Carolina

Charlotte

Durham

Raleigh

Ohio

Canton

Akron

Cincinnati

Cleveland

Columbus

Dayton

Oklahoma

Tulsa

Oklahoma City

Oregon

Portland

Pennsylvania

Pittsburgh

Philadelphia

Rhode Island

Providence

South Carolina

Columbia

Charleston

Spartanburg

Greenville

Tennessee

Memphis

Nashville

Knoxville

Texas

Dallas

El Paso

Houston

San Antonio

Austin

Utah

Salt Lake City

Virginia

Richmond

Alexandria

Arlington

Washington

Tacoma

Seattle

West Virginia

Charleston

Wisconsin

Madison

Milwaukee

Alberta

Edmonton

Calgary

British Columbia

Vancouver

Nova Scotia

Halifax

Ontario

Ottawa

Toronto

Quebec

Montreal

Puerto Rico

San Juan

© 2013-2019 Accelebrate, Inc. All Rights Reserved. All trademarks are owned by their respective owners.