

Self-Taught Data Science Playground

This repository is a collection of my self-taught notebooks on data science theory and practice. Considerable effort goes into balancing methodology derivation (with math) and hands-on coding. The target audience is data science practitioners (including myself) with hands-on experience who are seeking a more in-depth understanding of machine learning algorithms and the relevant statistics.

Visit the website Hello, Data Science!, which hosts all the notebooks in nicely rendered HTML.

Notebooks Summary

notebooks/

Each notebook is written in either Jupyter or R Markdown. Most notebooks use Python and/or R as the main programming language. I sometimes interoperate the two languages within a single notebook, which is made possible by reticulate.

Laboratory Scripts

labs/

These are quick-and-dirty scripts for exploring a variety of open-source machine learning tools. They may be incomplete and messy to read.

[Optional] Setup Python Environment

To ensure reproducibility, it is recommended to use pyenv along with pyenv-virtualenv to control both the Python version and the package versions.

pyenv supports only Linux and macOS. For Windows users, it is recommended to use conda instead.

Install a Different Python Version

To use a virtualenv with reticulate in R Markdown, the Python interpreter involved must be built with a shared library:

PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.7.0
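
After installation, it may be worth confirming that the interpreter really was built with a shared library before pointing reticulate at it. The following is a minimal sketch using the standard-library `sysconfig` module. The helper name `built_with_shared_library` is hypothetical and not part of this repository. It assumes a Linux or macOS build, where the `Py_ENABLE_SHARED` configuration variable is defined; on platforms where the variable is missing, the helper simply reports `False`.

```python
import sysconfig


def built_with_shared_library() -> bool:
    """Return True if this interpreter reports a shared-library build.

    Hypothetical helper for illustration only. Py_ENABLE_SHARED is 1 when
    Python was configured with --enable-shared, 0 otherwise, and may be
    None on platforms that do not define it.
    """
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    print("Shared library build:", built_with_shared_library())
    # LDLIBRARY names the library the interpreter links against,
    # which can help when diagnosing reticulate discovery problems.
    print("LDLIBRARY:", sysconfig.get_config_var("LDLIBRARY"))
```

Run the script with the pyenv-installed interpreter. If it reports `False`, reinstalling that Python version with the `--enable-shared` option shown above is the likely fix.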

Create virtualenv

Each notebook has different package dependencies. Here is an example of creating an environment specific to the notebook on model explainability:

cd notebooks/ml/model_explain
pyenv virtualenv 3.7.0 k9-model-explain
pyenv local k9-model-explain
pip install --upgrade pip
pip install -r requirements.txt
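
As a quick sanity check after the steps above, the sketch below reports whether the running interpreter lives inside a virtual environment. It is an illustration only: the helper name `in_virtualenv` is hypothetical, and the check relies on the standard `sys.prefix` and `sys.base_prefix` attributes (plus `sys.real_prefix`, which older virtualenv releases set), on the assumption that pyenv-virtualenv creates the environment through venv or virtualenv.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the interpreter appears to run inside a virtual environment.

    Hypothetical helper for illustration only. Older virtualenv releases set
    sys.real_prefix; venv and newer virtualenv releases make sys.base_prefix
    differ from sys.prefix.
    """
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix
    )
    return base_prefix != sys.prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

Running it from `notebooks/ml/model_explain` should print an interpreter path that contains the environment name if `pyenv local` took effect.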

TODO

Topics

  • Machine Learning
    • Factorization Machines
    • Recurrent Neural Nets
    • Sequence-to-Sequence Models
    • GANs
    • Reinforcement Learning Basics
    • Approximate Nearest Neighbor
  • Statistics
    • Law of Large Numbers and Central Limit Theorem
    • On Linear Regression: Machine Learning vs Econometrics
    • Linear Mixed Effects Models
    • Naive Bayes
    • Bayesian Model Diagnostics
    • Bayesian Time Series Forecasting
  • Tools/Programming
    • PyTorch Hands-On
    • RASA Chatbot Framework Hands-On
  • Programming
    • R
      • Production Quality Shiny App Development
    • Python
      • Dash for Interactive Dashboarding
  • Projects
    • Model Deployment with gRPC

Site

  • Dockerize each notebook (for complete reproducibility and portability)?
  • Tidy up dependencies for each notebook
