
Descriptive Statistics with Python

The purpose of this repo is to introduce some simple data manipulation and visualization tasks in Python using the NumPy, Pandas, Matplotlib and Scikit-Learn libraries. All these Jupyter Notebooks are notes taken from the exercises of DataCamp's Data Analyst track. During my journey through the Udacity Deep Learning and Artificial Intelligence Engineer Nanodegrees, I found that, besides struggling with the neural architectures, tuning, statistical models, calculus and algebra needed to build and train any model, I also had to be fluent in manipulating data, since data pre-processing is usually required before feeding your model. These notebooks are not an attempt to teach Data Analysis to anyone; they simply cover some routine tasks in Python. Each notebook is named after its analogous DataCamp course.
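As a flavor of the routine tasks these notebooks cover, here is a minimal sketch of computing descriptive statistics with Pandas. The data is generated inline for illustration only; the column names are hypothetical and not taken from any of the notebooks.

import numpy as np
import pandas as pd

# Small illustrative DataFrame with two numeric columns (hypothetical data)
df = pd.DataFrame({
    "height_cm": np.random.normal(loc=170, scale=10, size=100),
    "weight_kg": np.random.normal(loc=70, scale=12, size=100),
})

# Count, mean, std, min, quartiles and max for every numeric column
print(df.describe())

# Individual statistics per column
print(df.mean())
print(df["height_cm"].median())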

DataCamp is a flexible, online learning platform offering tutorials and courses in data science. Students can master data analysis from the comfort of their browser, at their own pace, tailored to their needs and expertise. These Data Analyst courses are also available in R. Getting started is fast, since you do not need to install anything: from the very beginning you are encouraged to focus only on coding and on learning to code, which is, after all, your goal.

Since some people have asked me similar questions, here are some references in case you need a refresher course in Statistics:

ProbStat by Stanford Lagunita -> An awesome hands-on course; no programming experience needed. Udacity offers ud827, which is OK as well.

Statistical Thinking in Python I and Statistical Thinking in Python II -> Awesome courses from Justin Bois, offered by DataCamp. You go through exploratory and confirmatory data analysis (EDA and CDA), concept by concept and exercise by exercise. Some experience manipulating data frames is needed.

On March 30th, 2018, Rachael Tatman from Kaggle ran a nice set of tutorials designed to be completed in 5 days (one per day; the challenge is over now, so you can do it at your own pace). They cover ordinary operations such as scaling and normalizing your data, dealing with time series, data inconsistencies, missing data, etc. If you do not have a Kaggle profile yet, this could be your chance: fork her notebook and start working without installing anything. See the Data Cleaning Challenge.
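For context, here is a minimal sketch of two of those operations: filling missing values with Pandas and min-max scaling with Scikit-Learn. The column name and values are hypothetical and only serve to illustrate the calls.

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical column with a missing value
df = pd.DataFrame({"price": [10.0, 12.5, np.nan, 15.0, 11.0]})

# Missing data: replace NaN with the column median
df["price"] = df["price"].fillna(df["price"].median())

# Scaling: map the column to the [0, 1] range
scaler = MinMaxScaler()
df["price_scaled"] = scaler.fit_transform(df[["price"]]).ravel()

print(df)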

Contents

Requirements and environment

These are the packages you will need in your environment in order to run these notebooks. You are free to use conda (https://conda.io/docs/) or whatever package manager you feel comfortable with, of course.

name: data-analytics
channels:
  - anaconda
  - defaults
dependencies:
  - matplotlib=2.1.1
  - jupyter=1.0.0
  - nb_conda=2.2*
  - pandas=0.22.0
  - python=3.5.4
  - scikit-learn=0.19.1
  - scipy=1.0.0
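Assuming you save the listing above as environment.yml (the filename is just a convention, not something fixed by this repo), a minimal way to recreate and use the environment with conda is:

conda env create -f environment.yml    # creates the data-analytics environment
conda activate data-analytics          # on older conda versions: source activate data-analytics
jupyter notebook                       # launch Jupyter and open any of the notebooks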
