Project 3: Unsupervised Learning & Dimensionality Reduction
###########################################################
GT CS7641 Machine Learning, Fall 2019
Eric W. Wallace, ewallace8-at-gatech-dot-edu, GTID 903105196

## Background ##
Classwork for Georgia Tech's CS7641 Machine Learning course. The project code is public for grading purposes, on the understanding that students must not plagiarize it and must do their own analysis.

These experiments compare 2 clustering algorithms (K-Means and EM) and 4 dimensionality reduction algorithms, then compare neural networks trained on the output of each.

## Requirements ##

* Python 3.7 or higher
* Essential Python machine learning libraries such as NumPy, pandas, and SciPy
* Python visualization libraries, including Matplotlib, Seaborn, and Yellowbrick
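
A quick way to confirm that an environment meets the requirements above is a small check script. The sketch below is illustrative and not part of the repository: the names `REQUIRED_MODULES`, `python_ok`, and `missing_modules` are hypothetical, and the import names are assumed from the libraries listed above. The authoritative list is whatever `requirements.txt` specifies.

```python
import importlib.util
import sys

# Assumed import names for the libraries listed above; requirements.txt
# remains the authoritative source.
REQUIRED_MODULES = ["numpy", "pandas", "scipy", "matplotlib", "seaborn", "yellowbrick"]
MIN_PYTHON = (3, 7)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.7 or higher is required")
    for name in missing_modules():
        print(f"missing module: {name}")
```

If the script prints nothing, the interpreter version is sufficient and every listed module can be located. It does not check installed package versions.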

## Instructions ##

* Clone this source repository from GitHub onto your computer using the following command:
	`git clone git@github.com:ewall/CS7641_Unsupervised_DimRed.git`

* From the source directory, run the following command to install the necessary Python modules:
	`pip install -r requirements.txt`

* Directories in the project include:
	* `data` contains the pre-processed datasets
	* `plots` is where the scripts save `*.png` files of the plots
	* `pickles` is where the scripts save serialized dataframes after processing
	* `experiments` contains miscellaneous scripts that receive only a passing mention in the analysis paper

* Here is a guide to the filenames of the Python scripts in the directory:
	* `explore_datasets.py` shows plots and statistics of the original datasets
	* `1_clust_*.py` are the clustering experiments for the 2 clustering algorithms
	* `2_dimred_*.py` are the dimension-reduction experiments for the 4 algorithms
	* `3_em_*.py` run EM clustering on the reduced feature sets from the 4 algorithms; similarly, `3_kmeans_*.py` run K-Means clustering on the same 4 reduced sets
	* `4_nn_base.py` is the baseline neural network run on the original data, for comparison
	* `4_nn_dimred.py` is the neural network run for all 4 dimensionality reduction algorithms
	* `5_nn_*.py` are the neural network runs on the output of the 2 clustering algorithms
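
The filename guide above can be expressed as a small lookup that groups scripts by experiment stage. This is a minimal sketch and not part of the repository: `STAGES` and `group_by_stage` are hypothetical names, the stage labels are paraphrased from the guide, and the ordering assumes the numeric prefixes reflect the intended run order.

```python
import os
from fnmatch import fnmatchcase

# Glob patterns copied from the filename guide; stage labels are paraphrased.
# Assumption: the numeric prefixes indicate the intended run order.
STAGES = [
    ("explore", ["explore_datasets.py"]),
    ("clustering", ["1_clust_*.py"]),
    ("dimensionality reduction", ["2_dimred_*.py"]),
    ("clustering on reduced data", ["3_em_*.py", "3_kmeans_*.py"]),
    ("neural network on original and reduced data", ["4_nn_base.py", "4_nn_dimred.py"]),
    ("neural network on clustering output", ["5_nn_*.py"]),
]


def group_by_stage(filenames):
    """Map each stage label to the sorted filenames matching its patterns."""
    grouped = {}
    for stage, patterns in STAGES:
        grouped[stage] = sorted(
            name
            for name in filenames
            if any(fnmatchcase(name, pattern) for pattern in patterns)
        )
    return grouped


if __name__ == "__main__":
    # List the scripts in the current directory, stage by stage.
    for stage, scripts in group_by_stage(os.listdir(".")).items():
        print(f"{stage}: {', '.join(scripts) or '(none found)'}")
```

Run from the source directory, this prints the scripts found for each stage. Files that match no pattern, such as the contents of `experiments`, are ignored.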
