
sklearn-classification's Introduction

Census Income Dataset Classification

Data Science Notebook on a Classification Task

Objective

In the Jupyter Notebook included on this page, we will use the Census Income Dataset to predict whether an individual's income exceeds $50K/yr based on census data.

The Dataset can be found here:

The Notebook can be found here:

Companion Mindmap/Cheatsheet

This Jupyter Notebook has a companion Mindmap/Cheatsheet that lists most of the Data Science steps. It can be found at the following link:

Steps

In this Notebook, we'll perform:

  • Feature Exploration (Uni and Bi-variate)
  • Feature Imputation
  • Feature Selection
  • Feature Encoding
  • Feature Ranking
  • Machine Learning with sklearn and TensorFlow
  • Random Search
  • Accuracy, Precision, Recall, and F1 calculations
  • ROC Curve
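As a rough illustration of how the imputation, encoding, and random search steps fit together in sklearn, here is a minimal sketch. It is not the notebook's code: the toy table, the column names (`age`, `hours_per_week`, `workclass`), the choice of a random forest, and the parameter grid are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the census table (not the real dataset).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, n).astype(float),
    "hours_per_week": rng.integers(10, 60, n).astype(float),
    "workclass": pd.Series(rng.choice(["Private", "Self-emp", "Gov"], n), dtype=object),
})
y = (df["age"] + df["hours_per_week"] > 80).astype(int)  # made-up target

# Inject missing values so the imputers have something to do.
df.loc[::10, "age"] = np.nan
df.loc[::15, "workclass"] = np.nan

# Feature imputation and encoding, applied per column type.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["age", "hours_per_week"]),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), ["workclass"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)

# Random search over a small, illustrative hyperparameter space.
search = RandomizedSearchCV(
    model,
    param_distributions={
        "clf__n_estimators": [20, 50, 100],
        "clf__max_depth": [3, 5, None],
        "clf__min_samples_leaf": [1, 2, 5],
    },
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the imputers and encoder inside the pipeline means they are refit on each cross-validation fold, so the search does not leak information from the validation split.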

Setup

This Notebook is designed to run on the Jupyter TensorFlow Docker image found in the link below:

If you haven't installed Docker yet, please visit:

Then, open a shell or terminal session and copy/paste the following:

docker run -itd \
  --restart always \
  --name jupyter \
  --hostname jupyter \
  -p 8888:8888 \
  -p 6006:6006 \
  jupyter/tensorflow-notebook:latest \
  start-notebook.sh --NotebookApp.token=''

When you run the command, Docker automatically pulls the image it needs and starts the container for us.

Give it a minute or so for Jupyter to start, and head to the following URL: http://localhost:8888

You should now have Jupyter running. If you can't reach the URL after a minute, check that the container is running correctly by typing:

# Check that the container is running
docker ps -a
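If the container shows as running but the page still does not load, it can help to test the URL from a script. The following is a small sketch using only the Python standard library; the helper name `is_reachable` and the 3-second timeout are assumptions for illustration, not part of the original setup.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP request to `url` gets any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and so on.
        return False


if __name__ == "__main__":
    # URL taken from the setup instructions above.
    print(is_reachable("http://localhost:8888"))
```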

Loading the Notebook

Download it from this link:

Go back to:

Troubleshooting Docker

Here are a few useful commands in case something goes wrong with your Docker instance:

# Restart Jupyter Docker Container
docker restart jupyter

# Stop Jupyter Docker Container
docker stop jupyter

# Remove Jupyter Docker Container
docker rm jupyter


Screenshots

  • Feature Distribution Analysis
  • Feature Cleaning
  • Missing Values in Features
  • Bivariate Exploration
  • Feature Correlation
  • Feature Importance
  • Feature PCA
  • Results from Machine Learning Algorithms
  • ROC for each Algorithm
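The accuracy, precision, recall, F1, and ROC figures behind the last two items can be computed with `sklearn.metrics`. The sketch below assumes a synthetic dataset from `make_classification` and a logistic regression model, both chosen only for illustration; it does not reproduce the notebook's numbers.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
    roc_curve,
)
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption, not the census data).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)               # hard labels for threshold metrics
y_score = clf.predict_proba(X_test)[:, 1]  # positive-class scores for ROC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}

# Points of the ROC curve: false positive rate vs. true positive rate.
fpr, tpr, thresholds = roc_curve(y_test, y_score)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Plotting `fpr` against `tpr` for each fitted model on the same axes gives a per-algorithm ROC comparison.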

About Me

Twitter:

Linkedin:

Email:

sklearn-classification's People

Contributors

dformoso


sklearn-classification's Issues

Python 2 Deprecated in Colab

  1. I uploaded your notebook to my Colab.
  2. I ran: import io, os, sys, types, time, datetime, math, random, requests, subprocess, StringIO, tempfile
  3. I got a ModuleNotFoundError.
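The error is most likely caused by `StringIO`: in Python 3 it is no longer a top-level module and is provided by `io` instead. A minimal sketch of a Python 3 compatible version of that import follows. It assumes the rest of the list is unchanged and leaves out the third-party `requests` package so the snippet runs without extra installs; the sample CSV text is made up.

```python
# Standard-library imports from the issue's import line, minus the
# Python 2-only top-level `StringIO` module.
import io, os, sys, types, time, datetime, math, random, subprocess, tempfile
from io import StringIO  # Python 3 location of StringIO

# Quick check that StringIO behaves as an in-memory text file.
buffer = StringIO("age,income\n39,<=50K\n")
header = buffer.readline().strip()
print(header)  # age,income
```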
