Machine Learning Repository

A place for sharing ML knowledge among SFS students and recent grads.

Introduction

When I was a student at SFS I fell in love with machine learning. It's technically difficult, often misunderstood, and it raises many ethical questions. I wanted to understand all of these things, so I created this repository for SFS students and recent grads to collaborate on machine learning projects. Feel free to clone, explore, or add. If something doesn't work, please tell me or make a pull request.

How to use this repository

If you don't know the math behind lasso and ridge, I recommend you start by reading the pdf in the theory folder. After that, the demo folder gives you the scripts and output to walk through the basics of the lasso and ridge commands. The lasso+ridge_demo.md file gives you a nice online rendering of the document but doesn't include the output; for that you'll want lasso+ridge_demo_output.pdf. Alternatively, the lasso+ridge_demo.do script lets you follow along in a plain Stata do-file, which is easier to edit and rerun. Finally, the bikeshare folder is where I'm (slowly) working with publicly available Capital Bikeshare data to see if I can track the effects of COVID-19 on system usage.
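If you'd like a feel for the commands before opening the demo, here is a minimal sketch. It is not taken from the demo do-file: the auto dataset, the variable list, and the seed are assumptions chosen only for illustration, and it assumes Stata 16 or later, where the lasso commands were introduced.

```stata
* Illustrative only: uses Stata's bundled auto dataset, not the demo's data
sysuse auto, clear

* Lasso: penalty chosen by cross-validation; rseed() makes the folds reproducible
lasso linear price mpg-foreign, rseed(12345)

* List the variables the lasso kept
lassocoef

* Ridge: elastic net with alpha(0) puts all the weight on the ridge penalty
elasticnet linear price mpg-foreign, alpha(0) rseed(12345)
```

Comparing the two fits shows the basic difference: lasso can drop variables entirely, while ridge shrinks coefficients without setting them to zero.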

General ML resources/things I've come across

Basically start with anything by Susan Athey (https://athey.people.stanford.edu/research).

Stata's [help file](https://www.stata.com/manuals/lassolassointro.pdf#lassoLassointro) for the newly incorporated lasso command.

The Hastie, Tibshirani, and Friedman textbook (https://web.stanford.edu/~hastie/ElemStatLearn/). It seems to be publicly available, but check the site for specifics on downloading.

Publicly-Available Datasets (make sure to check conditions before downloading)

1. UC Irvine Machine Learning Repo (http://archive.ics.uci.edu/ml/index.php).

2. Awesome Public Datasets via GitHub (https://github.com/awesomedata/awesome-public-datasets#climate-weather)

3. World Bank Open Data via GitHub (https://github.com/jpazvd/wbopendata) or the World Bank [website](https://data.worldbank.org/). Note that the former leads you to a World Bank-created Stata package that lets you download publicly available data directly within Stata. In Stata, type "ssc install wbopendata".

4. The Capital Bikeshare [datasets](https://www.capitalbikeshare.com/system-data).
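For the wbopendata package in item 3, a first session might look like the sketch below. The indicator code SP.POP.TOTL (total population) is just an example I picked for illustration; check the package's help file for the full set of options.

```stata
* One-time install from SSC
ssc install wbopendata

* Download one indicator for all countries in long format (one row per country-year)
* SP.POP.TOTL is an illustrative choice, not something this repository uses
wbopendata, indicator(SP.POP.TOTL) clear long

* Inspect what was loaded
describe
```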
