Pandas for Everyone: Python Data Analysis
Pandas is an open-source Python library for data analysis. The Pandas for Everyone: Python Data Analysis course focuses on loading data into Python with the Pandas library. It includes interactive lessons with knowledge checks, quizzes, and hands-on labs that build a deeper understanding of topics such as Pandas DataFrame and data structure basics, plotting basics, tidy data, data assembly, data normalization, linear regression, and survival models.
- Price: $279.99
- Delivery Method: eLearning
Test Prep
- 50+ Pre Assessment Questions
- 50+ Post Assessment Questions
Features
- 30+ LiveLab
- 20+ Video tutorials
- 43+ Minutes
Why choose TOPTALENT?
- Get assistance every step of the way from our Texas-based team, ensuring your training experience is hassle-free and aligned with your goals.
- Access an expansive range of over 3,000 training courses with a strong focus on Information Technology, Business Applications, and Leadership Development.
- Have confidence in an exceptional 95% approval rating from our students, reflecting outstanding satisfaction with our course content, program support, and overall customer service.
- Benefit from being taught by Professionally Certified Instructors with expertise in their fields and a strong commitment to making sure you learn and succeed.
Outline
Lesson 1:
Preface
- Breakdown of the Course
- How to Read This Course
- Setup
Lesson 2:
Pandas DataFrame Basics
- Introduction
- Load Your First Data Set
- Look at Columns, Rows, and Cells
- Grouped and Aggregated Calculations
- Basic Plot
- Conclusion
Lesson 3:
Pandas Data Structures Basics
- Create Your Own Data
- The Series
- The DataFrame
- Making Changes to Series and DataFrames
- Exporting and Importing Data
- Conclusion
Lesson 4:
Plotting Basics
- Why Visualize Data?
- Matplotlib Basics
- Statistical Graphics Using matplotlib
- Seaborn
- Pandas Plotting Method
- Conclusion
Lesson 5:
Tidy Data
- Columns Contain Values, Not Variables
- Columns Contain Multiple Variables
- Variables in Both Rows and Columns
- Conclusion
Lesson 6:
Apply Functions
- Primer on Functions
- Apply (Basics)
- Vectorized Functions
- Lambda Functions (Anonymous Functions)
- Conclusion
Lesson 7:
Data Assembly
- Combine Data Sets
- Concatenation
- Observational Units Across Multiple Tables
- Merge Multiple Data Sets
- Conclusion
Lesson 8:
Data Normalization
- Multiple Observational Units in a Table (Normalization)
- Conclusion
Lesson 9:
Groupby Operations: Split-Apply-Combine
- Aggregate
- Transform
- Filter
- The pandas.core.groupby.DataFrameGroupBy Object
- Working With a MultiIndex
- Conclusion
Lesson 10:
Missing Data
- What Is a NaN Value?
- Where Do Missing Values Come From?
- Working With Missing Data
- Pandas Built-In NA Missing
- Conclusion
Lesson 11:
Data Types
- Data Types
- Converting Types
- Categorical Data
- Conclusion
Lesson 12:
Strings and Text Data
- Introduction
- Strings
- String Methods
- More String Methods
- String Formatting (F-Strings)
- Regular Expressions (RegEx)
- The regex Library
- Conclusion
Lesson 13:
Dates and Times
- Python’s datetime Object
- Converting to datetime
- Loading Data That Include Dates
- Extracting Date Components
- Date Calculations and Timedeltas
- Datetime Methods
- Getting Stock Data
- Subsetting Data Based on Dates
- Date Ranges
- Shifting Values
- Resampling
- Time Zones
- Arrow for Better Dates and Times
- Conclusion
Lesson 14:
Linear Regression (Continuous Outcome Variable)
- Simple Linear Regression
- Multiple Regression
- Models with Categorical Variables
- One-Hot Encoding in scikit-learn with Transformer Pipelines
- Conclusion
Lesson 15:
Generalized Linear Models
- About This Lesson
- Logistic Regression (Binary Outcome Variable)
- Poisson Regression (Count Outcome Variable)
- More Generalized Linear Models
- Conclusion
Lesson 16:
Survival Analysis
- Survival Data
- Kaplan-Meier Curves
- Cox Proportional Hazard Model
- Conclusion
Lesson 17:
Model Diagnostics
- Residuals
- Comparing Multiple Models
- k-Fold Cross-Validation
- Conclusion
Lesson 18:
Regularization
- Why Regularize?
- LASSO Regression
- Ridge Regression
- Elastic Net
- Cross-Validation
- Conclusion
Lesson 19:
Clustering
- k-Means
- Hierarchical Clustering
- Conclusion
Lesson 20:
Life Outside of Pandas
- The (Scientific) Computing Stack
- Performance
- Dask
- Siuba
- Ibis
- Polars
- PyJanitor
- Pandera
- Machine Learning
- Publishing
- Dashboards
- Conclusion
Lesson 21:
It’s Dangerous To Go Alone!
- Local Meetups
- Conferences
- The Carpentries
- Podcasts
- Other Resources
- Conclusion
Appendix A: Concept Maps
Appendix B: Installation and Setup
- B.1 Install Python
- B.2 Install Python Packages
- B.3 Download Book Data
Appendix C: Command Line
- C.1 Installation
- C.2 Basics
Appendix D: Project Templates
Appendix E: Using Python
- E.1 Command Line and Text Editor
- E.2 Python and IPython
- E.3 Jupyter
- E.4 Integrated Development Environments (IDEs)
Appendix F: Working Directories
Appendix G: Environments
- G.1 Conda Environments
- G.2 Pyenv + Pipenv
Appendix H: Install Packages
- H.1 Updating Packages
Appendix I: Importing Libraries
Appendix J: Code Style
- J.1 Line Breaks in Code
Appendix K: Containers: Lists, Tuples, and Dictionaries
- K.1 Lists
- K.2 Tuples
- K.3 Dictionaries
Appendix L: Slice Values
Appendix M: Loops
Appendix N: Comprehensions
Appendix O: Functions
- O.1 Default Parameters
- O.2 Arbitrary Parameters
Appendix P: Ranges and Generators
Appendix Q: Multiple Assignment
Appendix R: NumPy ndarray
Appendix S: Classes
Appendix T: SettingWithCopyWarning
- T.1 Modifying a Subset of Data
- T.2 Replacing a Value
- T.3 More Resources
Appendix U: Method Chaining
Appendix V: Timing Code
Appendix W: String Formatting
- W.1 C-Style
- W.2 String Formatting: .format() Method
- W.3 Formatting Numbers
Appendix X: Conditionals (if-elif-else)
Appendix Y: New York ACS Logistic Regression Example
Appendix Z: Replicating Results in R
- Z.1 Linear Regression
- Z.2 Logistic Regression