Course Number: NSQL-110WA
Duration: 2 days (13 hours)
Format: Live, hands-on

NoSQL Training Overview

The variety of NoSQL (Not Only SQL) technologies can be overwhelming. Which NoSQL platform should you choose? This NoSQL Architecture Comparison training class cuts through the hype to explain the architectures of NoSQL systems such as Pig, Hive, HBase, Cassandra, and MongoDB. Attendees learn how to make informed big data decisions and identify suitable NoSQL database use cases. By the end of this course, students are equipped to confidently select NoSQL persistence systems for their organization's needs.

Location and Pricing

Accelebrate offers instructor-led enterprise training for groups of 3 or more online or at your site. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for private corporate training on-site or online, please contact us.

In addition, some courses are available as live, instructor-led training from one of our partners.

Objectives

Understand the core concepts of big data
Explore the most common NoSQL stores
Choose the correct NoSQL database for specific use cases
Understand the architecture of Hadoop and MongoDB

Prerequisites

All attendees must have a background in enterprise information systems design.

Outline

Introduction to NoSQL Systems

Gartner's Definition of Big Data
The V3
Properties
Limitations of Relational Databases
What are NoSQL Databases?
The Past and Present of the NoSQL World
NoSQL Database Properties
NoSQL Benefits
Use Cases for NoSQL Database Systems
NoSQL Database Storage Types
The CAP Theorem
Mechanisms to Guarantee a Single CAP Property
NoSQL Systems CAP Triangle
Limitations of NoSQL Databases
Mix-and-Match Approach
Big Data Sharding
Sharding Example
Google BigTable
BigTable-based Applications
BigTable Design
Barriers to Adoption
Dismantling Barriers to Adoption
Industry Trends
NoSQL Technology Adoption Action Plan

Apache HBase

What is HBase?
HBase Design
HBase Master (HMaster)
Sparse Data Sets
Regions and Region Servers
HBase Features
HBase High Availability
The Write-Ahead Log (WAL) and MemStore
HBase vs RDBMS
Interfacing with HBase
HBase Thrift and REST Gateway
HBase Table Design
Column Families
A Cell's Value Versioning
Timestamps
Accessing Cells
HBase Table Design Digest
The Conceptual View of an HBase Table
HBase Compaction
Loading Data in HBase
Column Families Notes
Cardinality of Column Families
Hotspotting
Rowkey Design Notes
Security
HBase Shell
HBase Shell Command Groups
Creating and Populating a Table Using HBase Shell
Getting a Cell's Value
Counting Rows in an HBase Table
HBase Java Client
HBase Scanners
The Scan Class
The KeyValue Class
The Result Class
Getting Versions of Cell Values Example
The Cell Interface
HBase Java Client Example
Scanning the Table Rows
Dropping a Table
The Bytes Utility Class
Table Schema Main Rules to Follow
Good Use Cases for HBase
Poor Use Cases for HBase
Business Continuity Caveats

Introduction to MongoDB

MongoDB
Main Features
MongoDB's Logo
Positioning of MongoDB
The CAP Placement
MongoDB Clients
MongoDB Nexus Architecture
Blending the Best of Both Worlds
What Makes MongoDB Fast?
Pluggable Storage Engines
The BSON Data Format
BSON Caveats
MongoDB Terminology
MongoDB Data Model
MongoDB Data Model (Cont'd)
The _id Primary Key Field Considerations
Indexes
(Traditional) Data Modeling in RDBMS
Data Modeling in MongoDB
An Example of a Data Model in MongoDB
MongoDB Data Modeling
A Sample JSON Document Matching the Schema
To Normalize or Denormalize? Is that a Question?
MongoDB Query Language (QL)
The find() Method
The limit() Method
A MongoDB QL Example
Query Syntax is Driver-Specific!
More Client Code Examples
MongoDB Query to SQL Select Comparison
Data Inserts
Data Lifecycle Management
Data Lifecycle Management: TTL
Data Lifecycle Management: Capped Collections
Data Sharding
Data Replication
GridFS
MongoDB Security
Authentication
Data and Network Encryption
MongoDB Limitations
MongoDB Use Cases

Apache Cassandra

What is Apache Cassandra?
Main Features
Peer-to-Peer (No Master)
Wide Column Store NoSQL Databases
Cassandra Model vs Relational Model
Column Families
Columns
Simplified Data Model
Data Model
The CAP Placement
CQL
CQL Simple Examples
The Update Statement
Update Caveats
Update Statement with TTL and TIMESTAMP Examples
Collections
Example of Using a Set Collection
Using the List Collection
Data Replication
Visualizing Data Replication
The Write Path
Sequential Data Storage Engine
Java Client Code Example
Data Distribution
Native Aggregate Functions
Creating UDFs
HBase vs. Apache Cassandra
Cassandra vs. MongoDB
Security
WAN-Wide High Availability

Introduction to Hadoop

The Client-Server Processing Pattern
Apache Hadoop
Apache Hadoop Logo
Typical Hadoop Applications
Hadoop Clusters
Hadoop Distributions
Hadoop's Main Components
Hadoop Distributed File System (HDFS)
HDFS Considerations
Data Blocks
HDFS NameNode Directory Diagram
HDFS Balancing
Accessing HDFS
Examples of HDFS Commands
Other Supported File Systems
YARN
Hadoop-based Systems for Data Analysis
MapReduce
Similarity with SQL Aggregation Operations
MapReduce Word Count Example
Distributed Computing Economics
Discussion: Divide and Conquer
Apache Pig
Pig Latin
Running Pig
Pig Latin Script Example
What is Hive?
Hive's Value Proposition
Who uses Hive?
What Hive Does Not Have
HiveQL
Working with Hive Tables

Introduction to Functional Programming

What is Functional Programming (FP)?
Terminology: Higher-Order Functions
Terminology: Lambda vs Closure
A Short List of Languages that Support FP
FP with Java
FP With JavaScript
Imperative Programming in JavaScript
The JavaScript map (FP) Example
The JavaScript reduce (FP) Example
Using reduce to Flatten an Array of Arrays (FP) Example
The JavaScript filter (FP) Example
Common Higher-Order Functions in Python
Common Higher-Order Functions in Scala
Elements of FP in R
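
As a rough illustration of the map, reduce, filter, and array-flattening topics listed in this module, here is a minimal Python sketch. The sample data and variable names are made up for illustration and are not taken from the courseware.

```python
from functools import reduce

# Hypothetical sample data, used only for illustration.
numbers = [1, 2, 3, 4, 5]
nested = [[1, 2], [3], [4, 5]]

# map: apply a function to every element.
squares = list(map(lambda n: n * n, numbers))

# filter: keep only the elements for which the predicate is true.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# reduce: fold the sequence into a single value, starting from 0.
total = reduce(lambda acc, n: acc + n, numbers, 0)

# reduce again: flatten a list of lists by concatenating each part
# onto an initially empty accumulator list.
flat = reduce(lambda acc, part: acc + part, nested, [])

print(squares, evens, total, flat)
```

All three functions take another function as an argument, which is what makes them higher-order functions; none of them modifies the input lists.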

Introduction to Apache Spark

What is Apache Spark?
A Short History of Spark
Where to Get Spark?
The Spark Platform
Spark Logo
Common Spark Use Cases
Languages Supported by Spark
Running Spark on a Cluster
The Driver Process
Spark Applications
Spark Shell
The spark-submit Tool
The spark-submit Tool Configuration
The Executor and Worker Processes
The Spark Application Architecture
Interfaces with Data Storage Systems
Limitations of Hadoop's MapReduce
Spark vs. MapReduce
Spark as an Alternative to Apache Tez
The Resilient Distributed Dataset (RDD)
Spark Streaming (Micro-batching)
Spark SQL
Example of Spark SQL
Spark Machine Learning Library
GraphX
Spark vs. R

The Spark Shell

The Spark Shell
The Spark Shell UI
Spark Shell Options
Getting Help
The Spark Context (sc) and SQL Context (sqlContext)
The Shell Spark Context
Loading Files
Saving Files
Basic Spark ETL Operations

Spark RDDs

The Resilient Distributed Dataset (RDD)
Ways to Create an RDD
Custom RDDs
Supported Data Types
RDD Operations
RDDs are Immutable
Spark Actions
RDD Transformations
Other RDD Operations
Chaining RDD Operations
RDD Lineage
The Big Picture
What May Go Wrong
Checkpointing RDDs
Local Checkpointing
Parallelized Collections
More on the parallelize() Method
The Pair RDD
Where do I use Pair RDDs?
Example of Creating a Pair RDD with Map
Example of Creating a Pair RDD with keyBy
Miscellaneous Pair RDD Operations
RDD Caching
RDD Persistence
The Tachyon Storage

Conclusion

Training Materials

All NoSQL Architecture training students will receive comprehensive courseware.

Software Requirements

Computer with Internet connectivity
Ability to install software on the computer
Recent 64-bit OS, such as Windows 10, macOS, or Linux

© 2013-2024 Accelebrate, LLC - All rights reserved. All trademarks are owned by their respective owners.