Comprehensive Data Science with Python

Course Number: PYTH-124

Duration: 5 days (32.5 hours)

Format: Live, hands-on

Python for Data Science Training Overview

This Data Science with Python training course teaches engineers, data scientists, statisticians, and other quantitative professionals the Python programming skills they need to analyze and chart data.

Location and Pricing

Accelebrate offers instructor-led enterprise training for groups of 3 or more online or at your site. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for private corporate training on-site or online, please contact us.

In addition, some programming courses are available as live, online classes for individuals.

Objectives

  • Understand the differences between Python's basic data types
  • Know when to use different Python collections
  • Implement Python functions
  • Understand control flow constructs in Python
  • Handle errors via exception handling constructs
  • Be able to quantitatively define an answerable, actionable question
  • Import both structured and unstructured data into Python
  • Parse unstructured data into structured formats
  • Understand the differences between NumPy arrays and pandas DataFrames
  • Simulate data through random number generation
  • Understand mechanisms for missing data and analytic implications
  • Explore and clean data
  • Create compelling graphics to reveal analytic results
  • Reshape and merge data to prepare for advanced analytics
  • Test for group differences using inferential statistics
  • Implement linear regression from a frequentist perspective
  • Understand non-linear terms, confounding, and interaction in linear regression
  • Extend to logistic regression to model binary outcomes

Prerequisites

All attendees should have prior programming experience and an understanding of basic statistics.

Outline

Base Python Introduction
  • History and current use
    • Installing the Software
    • Python Distributions
  • String Literals and numeric objects
  • Collections (lists, tuples, dicts)
  • Datetime classes in Python
  • Memory Management in Python
  • Control Flow
  • Functions
  • Exception Handling
Bringing Data In
  • Structured Data
    • Structured Text Files
    • Excel workbooks
    • SQL databases
  • Working with Unstructured Text Data
    • Reading Unstructured Text
    • Introduction to Natural Language Processing with Python
NumPy: Matrix Language
  • Introduction to the ndarray
  • NumPy operations
  • Broadcasting
  • Missing data in NumPy (masked array)
  • NumPy Structured arrays
  • Random number generation
Data Preparation with Pandas
  • Filtering
  • Creating and deleting variables
  • Discretization of Continuous Data
  • Scaling and standardizing data
  • Identifying Duplicates
  • Dummy Coding
  • Combining Datasets
  • Transposing Data
  • Long to wide and back
Exploratory Data Analysis with Pandas
  • Univariate Statistical Summaries and Detecting Outliers
  • Multivariate Statistical Summaries and Outlier Detection
  • Group-wise calculations using Pandas
  • Pivot Tables
Defining Actionable, Analytic Questions
  • Defining the quantitative construct used to draw inferences about the question
  • Identifying the data needed to support the constructs
  • Identifying limitations to the data and analytic approach
  • Constructing Sensitivity analyses
Exploring Data Graphically
  • Histogram
  • Box-and-whiskers plot
  • Scatter plots
  • Forest Plots
  • Group-by plotting
Missing Data
  • Exploring and understanding patterns in missing data
  • Missing at Random
  • Missing Not at Random
  • Missing Completely at Random
  • Data imputation methods
Traditional Inferential Statistics
  • Comparing Groups 
    • P-Values, summary statistics, sufficient statistics, inferential targets
    • T-Tests (equal and unequal variances)
    • ANOVA
    • Chi-Square Tests
  • Correlation
Frequentist Approaches to Multivariate Statistics
  • Linear Regression
    • Multivariate linear regression
    • Capturing Non-linear Relationships
    • Comparing Model Fits
    • Scoring new data
    • Poisson Regression Extension
  • Logistic regression 
    • Logistic Regression Example
    • Classification Metrics
Conclusion

Training Materials:

All attendees receive comprehensive courseware.

Software Requirements:

  • Anaconda Python 3.6 or later
  • Spyder IDE and Jupyter Notebook (both come with Anaconda)

