Course Materials for Advanced Data Analytics in Economics

Nick Hagerty, Montana State University

Except where otherwise noted, this work is licensed under Creative Commons BY-NC-SA 4.0.


Skip to: Lecture slides | Supplemental labs | External resources


Lecture slides

Fall 2022

Lecture 1: R Basics (.pdf)

  • About R
  • Operators
  • Objects and functions
  • Data frames
  • Vectors
  • Indexing

Lecture 2: Programming in R

  • If/else statements
  • For-loops
  • Functions
  • Vectorization
  • Parallelization

Lecture 3: Productivity Tools

Lecture 4: Data Wrangling

  • Philosophy of tidy data
  • Wrangling data with dplyr
  • Joining data with dplyr
  • Tidying data with tidyr
  • Importing data with readr

Lecture 5: Data Cleaning

  • Join safety
  • Keys and relational data
  • String cleaning
  • Number storage
  • Data Cleaning Checklist (pdf version)

Lecture 6: Data Acquisition

  • Where data comes from
  • Webscraping
  • Using APIs

Lecture 7: Best Practices for Coding and Workflows

  • The perils of bad data cleaning
  • Reproducibility and transparency
  • Best practices (code organization, file organization, version control, abstraction, commenting, unit tests)

Lecture 8: Distinguishing Goals of Data Analysis

  • The Data Generating Process
  • Potential outcomes, counterfactuals, and causal inference
  • Descriptive, Predictive, or Causal Analysis?

Lecture 9: Exploratory Analysis

  • Part 1
    • Summaries, frequency tables and crosstabs in R
    • Characterizing distributions
    • Handling extreme values
    • Handling variable transformations
    • Handling missing data
  • Part 2
    • Characterizing relationships
    • Binscatter
    • The Conditional Expectation Function
    • Adjusting for other variables
    • Bin smoothing and local regression

Lecture 10: Spatial Analysis

  • Intro to Geospatial Data
  • Part 1
    • Spatial data and quick mapping
    • Reference systems and projections
  • Part 2
    • Spatial queries (measurement, relationships)
    • Spatial subsetting
    • Geometry operations
    • Spatial joins

Lecture 11: Data Visualization

  • Basics of ggplot2
  • Plotting examples
  • Colors and themes
  • Principles of data visualization
  • Case studies

Lecture 12: Regression Modeling

  • Basic regression in R
  • Review: Interpreting coefficients
  • Indicator and interaction terms
  • Econometrics packages in R
  • Modeling nonlinear relationships

Lecture 13: Machine Learning Fundamentals

  • Review: Prediction
  • Statistical learning
  • Model accuracy
  • Cross-validation

Lecture 14: Prediction Methods

Lecture 15: Classification Methods

  • Part 1: Methods
    • Classification
    • Logistic regression
    • k-nearest neighbors
    • Model assessment
    • Decision trees
  • Part 2: Examples
    • Logistic regression and KNN
    • Cross-validation
    • Decision trees
    • Teach your laptop to read

Lecture 16: Machine Learning in Economics

  • Predicting outcomes
  • Constructing new data
  • Selecting covariates
  • Predicting causal effects

Lecture 17: Databases and Big Data

  • Tools for big data
  • Databases in R
  • Writing SQL queries
  • Getting started with BigQuery

Supplemental labs

By Laura Sikoski


External resources

This is a list of further resources that you may find helpful throughout (and after!) this course. Start with the course materials above, but check these out for alternative explanations or if you want to take a deeper dive into a particular topic. If one isn't speaking to you, try another.

Basics of R

Programming in R

R Markdown

Git and GitHub

Data wrangling with the tidyverse

Data cleaning

Data acquisition and webscraping

Best practices for coding and workflows

Distinguishing goals of data analysis

Exploratory analysis

Spatial analysis

Data visualization

Regression modeling in R

Fundamentals of machine learning

Shrinkage methods

Classification methods

Machine learning with tidymodels

Unsupervised learning

Further methods in machine learning

  • ISLR (James, Witten, Hastie, Tibshirani).
    • Ch. 8: Tree-Based Methods
    • Ch. 9: Support Vector Machines
    • Ch. 10: Deep Learning
  • Prediction and Machine Learning Lectures (Ed Rubin).
    • Lecture 007: Decision Trees
    • Lecture 008: Ensemble Methods
    • Lecture 009: Support Vector Machines

Applications of machine learning in economics

Databases (SQL)

Distributed and cloud computing
