Comprehensive Data Science with Python

PYTH-124 (5 Days)
4.4 out of 5 (267 reviews)  

Data Science with Python Programming Training Overview

This Data Science with Python training course teaches engineers, data scientists, statisticians, and other quantitative professionals the Python skills they need to analyze and chart data.

Location and Pricing

Most Accelebrate courses are delivered as private, customized, on-site training at our clients' locations worldwide for groups of 3 or more attendees and are tailored to each group's specific needs. Please visit our client list to see organizations for whom we have delivered private in-house training. These courses can also be delivered as live, private online classes for groups that are geographically dispersed or wish to save on the instructor's or students' travel expenses. To receive a customized proposal and price quote for private training at your site or online, please contact us.

Data Science with Python Programming Training Objectives

All students will:

  • Understand the history of Python and the differences between Python 2.x and 3.x
  • Understand the differences between Python's basic data types
  • Know when to use each of Python's collection types
  • Implement Python functions
  • Understand control flow constructs in Python
  • Handle errors via exception handling constructs
  • Be able to quantitatively define an answerable, actionable question
  • Import both structured and unstructured data into Python
  • Parse unstructured data into structured formats
  • Understand the differences between NumPy arrays and pandas DataFrames
  • Understand where Python fits in the Python/Hadoop/Spark ecosystem
  • Simulate data through random number generation
  • Understand mechanisms for missing data and analytic implications
  • Explore and clean data
  • Create compelling graphics to reveal analytic results
  • Reshape and merge data to prepare for advanced analytics
  • Test for group differences using inferential statistics
  • Implement linear regression from a frequentist perspective
  • Understand non-linear terms, confounding, and interaction in linear regression
  • Extend to logistic regression to model binary outcomes
  • Understand the difference between machine learning and frequentist approaches to statistics
  • Implement classification and regression models using machine learning
  • Score new datasets, evaluate model fit, and quantify variable importance

Data Science with Python Programming Training Outline

Base Python Introduction
  • History and current use
    • Installing the Software
    • Python Distributions
  • String Literals and numeric objects
  • Collections (lists, tuples, dicts)
  • Datetime classes in Python
  • Memory Management in Python
  • Control Flow
  • Functions
  • Exception Handling
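
As a taste of the topics in this module, here is a minimal sketch that combines a function, a dict and a list, datetime parsing, a loop, and exception handling. It is not taken from the courseware; the function name `days_between` and the sample records are made up for illustration.

```python
from datetime import datetime


def days_between(start_text, end_text):
    """Return the number of days between two YYYY-MM-DD date strings."""
    fmt = "%Y-%m-%d"
    start = datetime.strptime(start_text, fmt).date()
    end = datetime.strptime(end_text, fmt).date()
    return (end - start).days


# Hypothetical input: a list of tuples, one of them deliberately malformed.
records = [
    ("a", "2019-01-01", "2019-01-31"),
    ("b", "2019-02-01", "not-a-date"),
]

durations = {}  # dict: record key -> duration in days
errors = []     # list of (key, message) for records that failed to parse

for key, start_text, end_text in records:
    try:
        durations[key] = days_between(start_text, end_text)
    except ValueError as exc:
        # strptime raises ValueError when the text does not match the format
        errors.append((key, str(exc)))

print(durations)
print(errors)
```

Collecting failures in a list rather than stopping at the first bad record is one possible design choice; it keeps the valid rows usable while still recording what went wrong.
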
Defining actionable, analytic questions
  • Defining the quantitative construct to make inference on the question
  • Identifying the data needed to support the constructs
  • Identifying limitations to the data and analytic approach
  • Constructing Sensitivity analyses
Bringing Data In
  • Structured Data
    • Structured Text Files
    • Excel workbooks
    • SQL databases
  • Working with Unstructured Text Data
    • Reading Unstructured Text
    • Introduction to Natural Language Processing with Python
NumPy: Matrix Language
  • Introduction to the ndarray
  • NumPy operations
  • Broadcasting
  • Missing data in NumPy (masked array)
  • NumPy Structured arrays
  • Random number generation
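
The NumPy topics above can be illustrated with a short sketch. It is not from the courseware: the seed, array shapes, and the `-999.0` missing-value sentinel are assumptions chosen for the example, and `np.random.default_rng` assumes NumPy 1.17 or later.

```python
import numpy as np

# Random number generation: simulate 5 observations of 3 variables.
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=10.0, scale=2.0, size=(5, 3))

# Broadcasting: the (3,) vector of column means is stretched across
# all 5 rows of the (5, 3) array without an explicit loop.
col_means = data.mean(axis=0)
centered = data - col_means

# Missing data with a masked array: treat -999.0 as "missing".
readings = np.array([1.0, -999.0, 3.0, 5.0])
masked = np.ma.masked_equal(readings, -999.0)
masked_mean = masked.mean()  # mean of the unmasked values only

print(centered.shape)
print(masked_mean)
```
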
Data Preparation with Pandas
  • Filtering
  • Creating and deleting variables
  • Discretization of Continuous Data
  • Scaling and standardizing data
  • Identifying Duplicates
  • Dummy Coding
  • Combining Datasets
  • Transposing Data
  • Long to wide and back
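
Several of the data preparation steps above (long to wide and back, combining datasets, dummy coding) can be sketched with pandas as follows. The column names and values are invented for illustration and do not come from the course materials.

```python
import pandas as pd

# Long format: one row per subject per visit.
long_df = pd.DataFrame({
    "subject": [1, 1, 2, 2],
    "visit": ["baseline", "followup", "baseline", "followup"],
    "score": [10.0, 12.0, 9.0, 15.0],
})

# Long to wide: one row per subject, one column per visit.
wide_df = long_df.pivot(index="subject", columns="visit", values="score").reset_index()

# Wide back to long.
back_to_long = wide_df.melt(id_vars="subject", var_name="visit", value_name="score")

# Combining datasets: attach a (hypothetical) study arm to each subject.
groups = pd.DataFrame({"subject": [1, 2], "arm": ["control", "treatment"]})
merged = wide_df.merge(groups, on="subject", how="left")

# Dummy coding: one indicator column per category of "arm".
dummies = pd.get_dummies(merged["arm"], prefix="arm")

print(merged)
print(dummies)
```
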
Exploratory Data Analysis with Pandas
  • Univariate Statistical Summaries and Detecting Outliers
  • Multivariate Statistical Summaries and Outlier Detection
  • Group-wise calculations using Pandas
  • Pivot Tables
Exploring Data Graphically
  • Histogram
  • Box-and-whiskers plot
  • Scatter plots
  • Forest Plots
  • Group-by plotting
Advanced Graphing with Matplotlib, Pandas, and Seaborn
Python, Hadoop and Spark
  • Introduction to the differences between Python, Hadoop, and Spark
  • Importing data from Spark and Hadoop to Python
  • Parallel execution leveraging Spark or Hadoop
Missing Data
  • Exploring and understanding patterns in missing data
  • Missing at Random
  • Missing Not at Random
  • Missing Completely at Random
  • Data imputation methods
Traditional Inferential Statistics
  • Comparing Groups
    • P-Values, summary statistics, sufficient statistics, inferential targets
    • T-Tests (equal and unequal variances)
    • ANOVA
    • Chi-Square Tests
  • Correlation
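
To make the equal-variance and unequal-variance t-tests listed above concrete, here is a minimal sketch using `scipy.stats.ttest_ind`. SciPy is an assumption here (the outline does not name a library), and the two small samples are made-up numbers, not course data.

```python
from scipy import stats

# Two small illustrative samples (invented values).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.3, 4.0, 4.6, 4.2]

# Student's t-test: assumes both groups share the same variance.
student = stats.ttest_ind(group_a, group_b, equal_var=True)

# Welch's t-test: does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print("Student:", student.statistic, student.pvalue)
print("Welch:  ", welch.statistic, welch.pvalue)
```

With equal group sizes the two tests produce the same t statistic and differ only in their degrees of freedom, so the p-values differ slightly.
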
Frequentist Approaches to Multivariate Statistics
  • Linear Regression
    • Multivariate linear regression
    • Capturing Non-linear Relationships
    • Comparing Model Fits
    • Scoring new data
    • Poisson Regression Extension
  • Logistic regression
    • Logistic Regression Example
    • Classification Metrics
Machine learning approaches to multivariate statistics
  • Machine Learning Theory
  • Data pre-processing
    • Missing Data
    • Dummy Coding
    • Standardization
    • Training/Test data
  • Supervised Versus Unsupervised Learning
  • Unsupervised Learning: Clustering
    • Clustering Algorithms
    • Evaluating Cluster Performance
  • Dimensionality Reduction
    • A-priori
    • Principal Components Analysis
    • Penalized Regression
Supervised Learning: Regression
  • Linear Regression
  • Penalized Linear Regression
  • Stochastic Gradient Descent
  • Scoring New Data Sets
  • Cross Validation
  • Bias-Variance Tradeoff
  • Feature Importance
Supervised Learning: Classification
  • Logistic Regression
  • LASSO
  • Random Forest
  • Ensemble Methods
  • Feature Importance
  • Scoring New Data Sets
  • Cross Validation
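
The classification workflow outlined above (training/test split, logistic regression, cross validation, scoring new data) might look like the sketch below. It assumes scikit-learn, which the outline does not name explicitly, and uses synthetic data from `make_classification` rather than any course dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic binary-outcome data: 200 rows, 5 features.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out 25% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)

# 5-fold cross validation on the training data only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Fit on all training data, then evaluate and score held-out rows.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
new_probs = model.predict_proba(X_test[:3])  # class probabilities for 3 rows

print(cv_scores.mean(), test_accuracy)
print(new_probs)
```
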
Conclusion

Lecture/Demo:

50%

Lab:

50%

Course Number:

PYTH-124

Duration:

5 Days

Prerequisites:

All attendees should have prior programming experience and an understanding of basic statistics.

Training Materials:

All attendees receive comprehensive courseware.

Software Requirements:

  • Anaconda Python 3.5 or later
  • Spyder IDE (included with Anaconda)

Contact Us:

Accelebrate’s training classes are available for private groups of 3 or more people at your site or online anywhere worldwide.

Don't settle for a "one size fits all" public class! Have Accelebrate deliver exactly the training you want, privately at your site or online, for less than the cost of a public class.

For pricing and to learn more, please contact us.

Toll-free in US/Canada:
877 849 1850
International:
+1 678 648 3113

Toll-free in US/Canada:
866 566 1228
International:
+1 404 420 2491

925B Peachtree Street, NE
PMB 378
Atlanta, GA 30309-3918
USA

© 2013-2019 Accelebrate, Inc. All Rights Reserved. All trademarks are owned by their respective owners.