This repository introduces applications of Python in economics and data-related industries. The Jupyter Notebook covers the following topics:
- I. Install and Import Packages
- II. Import Data Using Pandas
- III. Clean and Manipulate Data in Pandas DataFrame
- IV. Create Data Visualization Using Matplotlib
- V. Build Simple Linear Regression Model with StatsModels
For demonstration purposes, we suggest installing Anaconda on your machine to follow the workshop demo. Brief installation instructions for Anaconda follow.
Note: This documentation was written with Python 3.7, but you can use the newest version offered on the official Anaconda website.
Before writing Python code, we first need to set up the development environment, which includes an interpreter, the standard library, and other environment settings. If you want the official Python development environment, you can download it from the official Python web page, www.python.org. However, IDLE (Integrated Development and Learning Environment), which ships with Python, is not the most intuitive tool for developers or programmers. Here we instead use a more complete platform called "Anaconda", with "Jupyter Notebook" as our development environment. Jupyter Notebook is an open-source, web-based development environment for coding. It supports over 40 programming languages, including Python, R, Julia, and Scala.
We can download Anaconda from the official web page, www.anaconda.com, by clicking "Download".
Choose the latest Python 3 version (3.7 at the time of writing) and download the installer. Mac or Linux users can choose the corresponding installer by clicking the "macOS" or "Linux" logo at the top of the page. Follow the instructions to complete the installation. Note: Anaconda is a large package that requires at least 3 GB of disk space. Please check the free disk space on your machine before installing.
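If a Python interpreter is already available on the machine, the free-space check can be sketched as below. This is only an illustration and not part of the Anaconda installer: the helper name `free_space_gb` is hypothetical, and the sketch assumes Anaconda will be installed on the drive that holds the current working directory.

```python
import shutil
from pathlib import Path

REQUIRED_GB = 3  # minimum free space mentioned in the note above


def free_space_gb(path):
    """Return the free disk space, in GB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


# Assumption: Anaconda goes on the same drive as the current working directory.
free_gb = free_space_gb(Path.cwd())
status = "enough" if free_gb >= REQUIRED_GB else "not enough"
print(f"{free_gb:.1f} GB free: {status} space for Anaconda")
```

Pass a different path to `free_space_gb` to check another drive.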
After installation, we can launch Anaconda Navigator from the Windows Start Menu. Anaconda Navigator contains several IDEs, such as VS Code, Spyder, PyCharm, and RStudio. You can start a Jupyter Notebook by clicking the "Launch" button under Jupyter Notebook.
Jupyter Notebook is a web-based IDE, so it opens the Notebook Dashboard in your default web browser. The dashboard lists the notebooks, files, and subdirectories in the directory where the notebook server was started. Start a project by choosing a directory for it. For instance: step 1, create a project directory (a folder) on your desktop. Step 2, click the "New" drop-down menu. Step 3, create a new Python notebook in that folder (or directory) by choosing "Python 3".
Once you have created a new Python notebook file: step 1, name the notebook by clicking "Untitled" and entering a new file name. Step 2, choose the type of the coding block. If you are writing Python code in a block, make sure "Code" is selected in the drop-down menu. You can also choose "Markdown" to write formatted text in the document. Step 3, start coding in the coding block.
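A first code block might simply confirm that the environment is set up as expected. The sketch below is one possible check, not part of the workshop notebook: the helper `check_packages` is a hypothetical name, and the package list is assumed from the topics covered in this repository.

```python
import importlib.util
import sys
from pathlib import Path

# Import names of the packages used in the notebook topics (assumed list).
PACKAGES = ["pandas", "matplotlib", "statsmodels"]


def check_packages(names):
    """Map each package name to True if it can be imported, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


print("Python version:", sys.version.split()[0])
print("Working directory:", Path.cwd())
for name, available in check_packages(PACKAGES).items():
    print(f"{name}: {'installed' if available else 'missing'}")
```

The working directory printed here should match the project folder in which the notebook was created.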
- Python Fundamentals Repo
- Official Python Documentation
- Pandas Documentation
- Matplotlib Documentation
- StatsModels Documentation
- StackOverflow
- Python Data Science Handbook
For those interested in using Stata with Python integration, the documentation is available on the official Stata website:
https://www.stata.com/new-in-stata/python-integration/
Copyright © 2021 Norman Lo