pandas is a Python library for data analysis. It is fast, and it lets you do exploratory work quickly.
The pandas library is a powerful tool for multiple phases of the data science workflow, including data cleaning, visualization, and exploratory data analysis. However, the size and complexity of the library make it challenging to discover the best way to accomplish any given task.
The goal of this cookbook is to give you some concrete examples for getting started with pandas. By the end of the tutorial, you'll be more fluent in using pandas to correctly and efficiently answer your own data science questions.
The tutorial code is available as a Jupyter notebook. You can run this notebook in the cloud (no installation required) by clicking the "launch binder" button.
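To run the notebook locally instead:

- Make sure that `numpy`, `numba`, `pandas`, `matplotlib`, `seaborn`, `sqlalchemy`, `pandarallel`, `swifter`, and `dask` are installed on your computer. `zipfile` is also used, but it is part of the Python standard library and does not need to be installed.
- Download `/datasets` from this repository.

pip install --upgrade numpy numba pandas matplotlib seaborn sqlalchemy pandarallel swifter dask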
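
One way to confirm the setup is to check that the interpreter can find each package listed above. The sketch below is not part of the tutorial code. The helper name `missing_packages` is hypothetical, and the check assumes each package's import name matches its install name, which holds for the packages listed here.

```python
import importlib.util

# Import names of the packages the tutorial relies on.
# zipfile is part of the Python standard library, so it should always be found.
REQUIRED = [
    "numpy", "numba", "pandas", "matplotlib", "seaborn",
    "sqlalchemy", "zipfile", "pandarallel", "swifter", "dask",
]


def missing_packages(names):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

`find_spec` only locates a package and does not import it, so the check stays quick even for heavy libraries. It does not verify versions or that a package imports without errors.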
This software is licensed under the MIT License. See license.txt for details.