Course Number: DATA-142
Duration: 5 days (32.5 hours)
Format: Live, hands-on

Modeling Data Training Overview

This Modeling Data for Inference course teaches attendees how to use Python to perform causal inference on observational data. Participants learn how to work with inferential models, missing data, and experimental design.

Location and Pricing

Accelebrate offers instructor-led enterprise training for groups of 3 or more online or at your site. Most Accelebrate classes can be flexibly scheduled for your group, including delivery in half-day segments across a week or set of weeks. To receive a customized proposal and price quote for private corporate training on-site or online, please contact us.

In addition, some courses are available as live, instructor-led training from one of our partners.

Objectives

  • Perform causal inference in observational data using Python
  • Perform and interpret null hypothesis testing in Python
  • Implement generalized linear models in statsmodels
  • Understand missing-data mechanisms (MCAR, MAR, MNAR)
  • Impute missing data
  • Generate accurate power calculations
  • Implement non-parametric methods to test hypotheses
  • Use causal inference frameworks to identify causal effects from observational data

Prerequisites

Attendees must have a solid foundation in Python programming for descriptive analytics.

Outline

Introduction
GLMs with Python using statsmodels
  • Applying Statistical Models for Analysis in Python: The A/B test
    • Explanation of statsmodels library of functions
    • Inferential and descriptive statistics refresher
    • Implementing A/B tests
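As a rough illustration of the A/B test topic above, the following is a minimal sketch of a two-sided, pooled two-proportion z-test written with only the Python standard library. It is not the course's own material: the function name `two_proportion_ztest` and the counts are invented for this example. statsmodels offers a comparable test as `proportions_ztest` in `statsmodels.stats.proportion`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis that both arms share one rate
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, invented for illustration only
z_stat, p_val = two_proportion_ztest(success_a=100, n_a=1000, success_b=130, n_b=1000)
print(f"z = {z_stat:.3f}, p = {p_val:.4f}")
```

The normal approximation used here assumes reasonably large samples in both arms; with small counts an exact or resampling-based test is usually a safer choice.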
Modeling Continuous Data (Linear models)
  • Formulation of the simple linear model
  • Application of the intercept-only (null) model
    • Binary predictor
    • Interpreting results
    • Categorical predictor
    • Continuous predictor
    • Polynomial expansions
    • Multiple linear regression
    • Spline models
    • Interaction terms
    • Picking the “best” model
    • Discussion of confounding, interaction terms, and model building approaches
  • Modeling Binary Data (Logistic models)
    • Discussion of the generalized linear model
    • The Logit link function
    • Binomial distribution
    • Intercept-only model
    • Back transformation of coefficients
    • Simple predictor
    • Multiple predictors
    • Odds ratio interpretations
    • Generating a scoring data set
    • Predicting from the model with new data
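To make the back-transformation and odds-ratio items above concrete, here is a small sketch using only the standard library. It relies on a known property of logistic regression: with a single binary predictor the model is saturated, so the maximum-likelihood coefficients can be read directly from the observed proportions. The 2x2 counts and all variable names are hypothetical.

```python
from math import exp, log


def logit(p):
    """Log-odds of a probability (the logit link)."""
    return log(p / (1 - p))


def inv_logit(x):
    """Back-transform a log-odds value to a probability."""
    return 1 / (1 + exp(-x))


# Hypothetical 2x2 table, invented for illustration: event counts by group
events_control, n_control = 20, 100
events_treated, n_treated = 40, 100

# Intercept-only model: the fitted intercept is the logit of the overall event rate
overall_rate = (events_control + events_treated) / (n_control + n_treated)
intercept_only = logit(overall_rate)

# Single binary predictor: intercept = control log-odds, slope = log odds ratio
b0 = logit(events_control / n_control)
b1 = logit(events_treated / n_treated) - b0
odds_ratio = exp(b1)

# Scoring new data: predicted probability for each value of the predictor
predicted = {x: inv_logit(b0 + b1 * x) for x in (0, 1)}
print(f"odds ratio = {odds_ratio:.3f}, predictions = {predicted}")
```

With continuous or multiple predictors the coefficients no longer have this closed form and would be estimated iteratively, for example by a GLM routine with a binomial family and logit link.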
  • Modeling Count Outcomes
    • How are count outcomes different?
    • Poisson models
    • Modeling options for overdispersed counts
    • Log link functions
    • Using offsets to model rates / uneven follow-up
Power Analyses/Study Design
  • Understanding and estimating statistical power
  • Type I and Type II errors
  • Using existing power estimators
  • Simulating power through the data-generating process
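The two approaches to power listed above, a closed-form estimator and simulation from the data-generating process, can be compared in a short sketch. It assumes a deliberately simple setting chosen for this example: two equal-sized groups, normally distributed outcomes with known unit variance, and a two-sided z-test. The function names and parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def analytic_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided, two-sample z-test with known unit variance."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)  # mean of the z statistic
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def simulated_power(effect_size, n_per_group, alpha=0.05, n_sims=2000, seed=0):
    """Estimate power by repeatedly simulating the assumed data-generating process."""
    rng = random.Random(seed)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sqrt(2 / n_per_group)  # standard error of the mean difference when sigma = 1
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            rejections += 1  # the null hypothesis was rejected in this simulated study
    return rejections / n_sims


power_formula = analytic_power(effect_size=0.5, n_per_group=64)
power_sim = simulated_power(effect_size=0.5, n_per_group=64)
print(f"analytic power = {power_formula:.3f}, simulated power = {power_sim:.3f}")
```

The simulation route generalizes more easily: swapping in a different data-generating process or test only changes the body of the loop, whereas a closed-form expression has to be rederived.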
Non-Parametric Analysis Methods
  • Using bootstrapping/permutation tests
    • Bootstrapping versus depending on asymptotic behavior to estimate confidence intervals
    • How different/stable are my results?
    • Resampling a data set
    • Bias-corrected bootstrap interval
    • Extending the bootstrap function to calculate more statistics
    • Permutation tests for p-values
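A minimal sketch of the resampling ideas above, using only the standard library. Note that it computes a plain percentile bootstrap interval, not the bias-corrected interval named in the outline, and that the data, function names, and default settings are all invented for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_percentile_ci(data, stat=mean, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample with replacement, recompute the statistic."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[int((1 - tail) * n_boot) - 1]
    return lower, upper


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_b) - mean(group_a))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[n_a:]) - mean(pooled[:n_a]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero
    return (extreme + 1) / (n_perm + 1)


# Hypothetical measurements, invented for illustration only
control = [4.1, 5.0, 4.7, 5.3, 4.4, 4.9, 5.1, 4.6]
treated = [5.6, 6.1, 5.2, 6.4, 5.9, 5.5, 6.0, 5.8]

ci_low, ci_high = bootstrap_percentile_ci(treated)
p_perm = permutation_p_value(control, treated)
print(f"95% CI for treated mean: ({ci_low:.2f}, {ci_high:.2f}); permutation p = {p_perm:.4f}")
```

Because `stat` is passed in as a function, the same bootstrap routine can be reused for a median, a trimmed mean, or any other statistic of a single sample.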
Missing data
  • Quantifying missing data
  • Visualizing missing data
  • MAR, MCAR, MNAR
  • Sensitivity analysis
  • Imputation
    • MICE/trees pre-processing
Time to Event (Survival) Analysis
  • Visualizing Hazards Across Time
  • Understanding the Log Rank Test
  • Cox Proportional Hazards Modeling
    • Understanding and interpreting the Hazard Ratio
    • Model diagnostics and assumptions
    • Implementing Time-Varying Covariates
  • Parametric Survival Models
    • Weibull Model
    • Exponential Model
    • Predicting Failure Times
Causal Inference: The Potential Outcomes Framework
  • Defining treatment effects (ATT, ATE)
  • Identifying populations of interest
  • Defining your causal hypothesis
  • Understanding the counterfactual
  • Establishing the causal diagram for your problem
  • Different methods for conditioning on variables:
    • Propensity Scores
    • Direct regression adjustment
    • G-computation formulas
  • Instrumental variable analysis
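One of the conditioning methods listed above, g-computation, can be sketched in its simplest non-parametric form: estimate the outcome mean in every confounder-by-treatment cell, then average the treated-minus-untreated contrast over the confounder distribution. The sketch assumes a single binary confounder, no unmeasured confounding, and that every cell contains data. The records and function names are invented for illustration and do not come from the course.

```python
from collections import defaultdict

# Hypothetical observational records, invented for illustration only:
# (confounder z, treatment a, outcome y)
records = (
    [(0, 0, 0)] * 60 + [(0, 0, 1)] * 20 + [(0, 1, 0)] * 10 + [(0, 1, 1)] * 10
    + [(1, 0, 0)] * 10 + [(1, 0, 1)] * 10 + [(1, 1, 0)] * 20 + [(1, 1, 1)] * 60
)


def stratum_means(rows):
    """Mean outcome within each (confounder, treatment) cell."""
    totals = defaultdict(lambda: [0, 0])  # (z, a) -> [sum of y, count]
    for z, a, y in rows:
        totals[(z, a)][0] += y
        totals[(z, a)][1] += 1
    return {cell: s / n for cell, (s, n) in totals.items()}


def g_computation_ate(rows):
    """Standardize stratum-specific contrasts over the confounder distribution."""
    means = stratum_means(rows)
    n = len(rows)
    ate = 0.0
    for z in {z for z, _, _ in rows}:
        weight = sum(1 for zz, _, _ in rows if zz == z) / n  # P(Z = z)
        ate += weight * (means[(z, 1)] - means[(z, 0)])
    return ate


def naive_difference(rows):
    """Unadjusted treated-minus-untreated difference in mean outcome."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


ate_adjusted = g_computation_ate(records)
ate_naive = naive_difference(records)
print(f"naive difference = {ate_naive:.2f}, g-computation ATE = {ate_adjusted:.2f}")
```

In this made-up data the confounder raises both the chance of treatment and the chance of the outcome, so the unadjusted difference is larger than the standardized estimate. With many or continuous confounders the cell means would typically be replaced by predictions from a fitted outcome model.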
Conclusion

Training Materials

All Data Modeling training students receive comprehensive courseware.

Software Requirements

  • Windows, Mac, or Linux
  • A current version of Anaconda for Python 3.x, or a comparable Python installation with the necessary libraries (Accelebrate can provide a list)




