Pandas for Everyone: Python Data Analysis

Pandas is an open-source Python library for data analysis. The Pandas for Everyone: Python Data Analysis course focuses on loading data into Python and analyzing it with the Pandas library. The course combines interactive lessons with knowledge checks, quizzes, and hands-on labs to build a deeper understanding of concepts such as Pandas DataFrame and data structure basics, plotting basics, tidy data, data assembly, data normalization, linear regression, and survival models.

Test Prep
50+ Pre-Assessment Questions | 50+ Post-Assessment Questions
Features
30+ LiveLab | 20+ Video tutorials | 43+ Minutes

Outline

Lessons 1:
Preface

  • Breakdown of the Course
  • How to Read This Course
  • Setup

Lessons 2:
Pandas DataFrame Basics

  • Introduction
  • Load Your First Data Set
  • Look at Columns, Rows, and Cells
  • Grouped and Aggregated Calculations
  • Basic Plot
  • Conclusion

Lessons 3:
Pandas Data Structures Basics

  • Create Your Own Data
  • The Series
  • The DataFrame
  • Making Changes to Series and DataFrames
  • Exporting and Importing Data
  • Conclusion

Lessons 4:
Plotting Basics

  • Why Visualize Data?
  • Matplotlib Basics
  • Statistical Graphics Using matplotlib
  • Seaborn
  • Pandas Plotting Method
  • Conclusion

Lessons 5:
Tidy Data

  • Columns Contain Values, Not Variables
  • Columns Contain Multiple Variables
  • Variables in Both Rows and Columns
  • Conclusion

Lessons 6:
Apply Functions

  • Primer on Functions
  • Apply (Basics)
  • Vectorized Functions
  • Lambda Functions (Anonymous Functions)
  • Conclusion

Lessons 7:
Data Assembly

  • Combine Data Sets
  • Concatenation
  • Observational Units Across Multiple Tables
  • Merge Multiple Data Sets
  • Conclusion

Lessons 8:
Data Normalization

  • Multiple Observational Units in a Table (Normalization)
  • Conclusion

Lessons 9:
Groupby Operations: Split-Apply-Combine

  • Aggregate
  • Transform
  • Filter
  • The pandas.core.groupby.DataFrameGroupBy Object
  • Working With a MultiIndex
  • Conclusion

Lessons 10:
Missing Data

  • What Is a NaN Value?
  • Where Do Missing Values Come From?
  • Working With Missing Data
  • Pandas Built-In NA Missing
  • Conclusion

Lessons 11:
Data Types

  • Data Types
  • Converting Types
  • Categorical Data
  • Conclusion

Lessons 12:
Strings and Text Data

  • Introduction
  • Strings
  • String Methods
  • More String Methods
  • String Formatting (F-Strings)
  • Regular Expressions (RegEx)
  • The regex Library
  • Conclusion

Lessons 13:
Dates and Times

  • Python’s datetime Object
  • Converting to datetime
  • Loading Data That Include Dates
  • Extracting Date Components
  • Date Calculations and Timedeltas
  • Datetime Methods
  • Getting Stock Data
  • Subsetting Data Based on Dates
  • Date Ranges
  • Shifting Values
  • Resampling
  • Time Zones
  • Arrow for Better Dates and Times
  • Conclusion

Lessons 14:
Linear Regression (Continuous Outcome Variable)

  • Simple Linear Regression
  • Multiple Regression
  • Models with Categorical Variables
  • One-Hot Encoding in scikit-learn with Transformer Pipelines
  • Conclusion

Lessons 15:
Generalized Linear Models

  • About This Lesson
  • Logistic Regression (Binary Outcome Variable)
  • Poisson Regression (Count Outcome Variable)
  • More Generalized Linear Models
  • Conclusion

Lessons 16:
Survival Analysis

  • Survival Data
  • Kaplan-Meier Curves
  • Cox Proportional Hazard Model
  • Conclusion

Lessons 17:
Model Diagnostics

  • Residuals
  • Comparing Multiple Models
  • k-Fold Cross-Validation
  • Conclusion

Lessons 18:
Regularization

  • Why Regularize?
  • LASSO Regression
  • Ridge Regression
  • Elastic Net
  • Cross-Validation
  • Conclusion

Lessons 19:
Clustering

  • k-Means
  • Hierarchical Clustering
  • Conclusion

Lessons 20:
Life Outside of Pandas

  • The (Scientific) Computing Stack
  • Performance
  • Dask
  • Siuba
  • Ibis
  • Polars
  • PyJanitor
  • Pandera
  • Machine Learning
  • Publishing
  • Dashboards
  • Conclusion

Lessons 21:
It’s Dangerous To Go Alone!

  • Local Meetups
  • Conferences
  • The Carpentries
  • Podcasts
  • Other Resources
  • Conclusion

Appendix A: Concept Maps

Appendix B: Installation and Setup

  • B.1 Install Python
  • B.2 Install Python Packages
  • B.3 Download Book Data

Appendix C: Command Line

  • C.1 Installation
  • C.2 Basics

Appendix D: Project Templates

Appendix E: Using Python

  • E.1 Command Line and Text Editor
  • E.2 Python and IPython
  • E.3 Jupyter
  • E.4 Integrated Development Environments (IDEs)

Appendix F: Working Directories

Appendix G: Environments

  • G.1 Conda Environments
  • G.2 Pyenv + Pipenv

Appendix H: Install Packages

  • H.1 Updating Packages

Appendix I: Importing Libraries

Appendix J: Code Style

  • J.1 Line Breaks in Code

Appendix K: Containers: Lists, Tuples, and Dictionaries

  • K.1 Lists
  • K.2 Tuples
  • K.3 Dictionaries

Appendix L: Slice Values

Appendix M: Loops

Appendix N: Comprehensions

Appendix O: Functions

  • O.1 Default Parameters
  • O.2 Arbitrary Parameters

Appendix P: Ranges and Generators

Appendix Q: Multiple Assignment

Appendix R: Numpy ndarray

Appendix S: Classes

Appendix T: SettingWithCopyWarning

  • T.1 Modifying a Subset of Data
  • T.2 Replacing a Value
  • T.3 More Resources

Appendix U: Method Chaining

Appendix V: Timing Code

Appendix W: String Formatting

  • W.1 C-Style
  • W.2 String Formatting: .format() Method
  • W.3 Formatting Numbers

Appendix X: Conditionals (if-elif-else)

Appendix Y: New York ACS Logistic Regression Example

Appendix Z: Replicating Results in R

  • Z.1 Linear Regression
  • Z.2 Logistic Regression
© 2024 TOPTALENT LEARNING.