This project forked from vincen-github/mlimpl

Python 58.36% R 0.88% MATLAB 0.98% Jupyter Notebook 39.78%

mlimpl

Machine Learning Implementation

Author: vincen | Build: passing | Python: 3.6, 3.7, 3.8, 3.9

Introduction

This repository gathers implementations of commonly used machine learning methods, built on NumPy and Pandas. You can refer to it when implementing common machine learning algorithms yourself, to deepen your understanding of them.

Features

  • Detailed documentation and comments.
  • Guidance on error-prone and difficult points.

Usage

I follow the class structure of sklearn in my implementation. Most classes have three methods (i.e. fit, predict, and score). An example follows:

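from Multiple_linear_regression import LinearRegression
# load_boston was removed in scikit-learn 1.2; load_diabetes is used instead
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)

reg = LinearRegression()
reg.fit(X, y)            # estimate the model parameters
y_pred = reg.predict(X)  # predict on the training data
score = reg.score(X, y)  # evaluate the fit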

As you can see, the usage is the same as with sklearn.
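The fit/predict/score structure described above can be illustrated with a minimal regressor. This is only a sketch, not the repository's code: the class name `TinyLinearRegression` and its attributes are hypothetical, and it assumes a least-squares solution with R² as the score.

```python
import numpy as np


class TinyLinearRegression:
    """Hypothetical minimal regressor exposing fit/predict/score."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Prepend a column of ones so the intercept is learned with the weights.
        Xb = np.hstack([np.ones((X.shape[0], 1)), X])
        # Least-squares solution; lstsq avoids explicitly inverting X^T X.
        coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
        self.intercept_ = coef[0]
        self.coef_ = coef[1:]
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_ + self.intercept_

    def score(self, X, y):
        # Coefficient of determination R^2 (an assumption made for this sketch).
        y = np.asarray(y, dtype=float)
        residual = np.sum((y - self.predict(X)) ** 2)
        total = np.sum((y - y.mean()) ** 2)
        return 1.0 - residual / total


# Noiseless synthetic data, so the true weights should be recovered.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0

reg = TinyLinearRegression().fit(X, y)
y_pred = reg.predict(X)
score = reg.score(X, y)
```

Returning `self` from `fit` allows the chained call `TinyLinearRegression().fit(X, y)`, a convention borrowed from sklearn estimators.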

Table of Contents

  • Gan:
    • Generate handwritten digit images with a GAN implemented in TensorFlow 1.
  • Cnn:
    • Recognize numeric verification codes with a convolutional neural network implemented in TensorFlow 1.
  • linear_model:
    • Linear Regression solved by the analytical solution, gradient descent, or the Adam optimizer.
    • Ridge solved by the analytical solution, gradient descent, or the Adam optimizer.
    • Lasso solved by coordinate descent or iterated ridge regression.
  • DecisionTree:
    • ID3: a tree-based algorithm for classification problems.
    • C4.5: an improvement on ID3. Note that these two methods only support discrete features and labels.
    • Cart: CartRegressor solves regression problems (i.e. continuous labels). This code handles both continuous and discrete features.
    • In addition, some .ipynb files implement decision trees without encapsulating them in a class.
  • NaiveBayes:
    • MultinomialNB: Naive Bayes for discrete labels that follow a multinomial distribution (the prior over categories). You need to ensure that the incoming features are categorical.
    • GaussianNB: same as above, except that the assumed distribution is Gaussian, which implies that this method can handle continuous features/labels.
  • ann_by_matlab:
    • A simple artificial neural network implemented in MATLAB to classify the MNIST digit dataset. The file ANN.m in this folder contains my own implementation of the network. The other files read the MNIST dataset into memory in MATLAB and come from another blog.
  • SVM:
    • Support vector machine for classification tasks, solved with the sequential minimal optimization (SMO) algorithm.
  • KMeans++:
    • A common unsupervised clustering algorithm that improves on k-means.
  • rejection_sampling:
    • Rejection sampling method.
  • l1/2(LHalf):
    • The l1/2 algorithm is an improved variant of lasso. Like lasso, it is a linear model, but its optimization objective is as follows:

      min_β  (1/2)‖Y - Xβ‖₂² + λ‖β‖_{1/2}

    • I use the iterated ridge method to solve this non-convex regularization problem.
    • The file energy_predict.py applies this method to energy-consumption prediction for CNC machine tools, using PySpark.
  • xgboost:
    • eXtreme Gradient Boosting (XGBoost): a class that implements the scalable tree boosting system proposed by Tianqi Chen.
    • I implement both the exact greedy algorithm and the approximate algorithm for split finding in this package.
  • RandomForest:
    • A random forest classifier. A random forest is a meta-estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control overfitting.
  • GMM:
    • Gaussian mixture model (one-dimensional) fitted with the EM algorithm.
  • MCMC:
    • Markov chain Monte Carlo, including the Metropolis–Hastings algorithm and Gibbs sampling.
  • High Confidence Predictions for Unrecognizable Images:
    • A simple demo of adversarial examples.
    • Reference: Anh Nguyen, Jason Yosinski and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 427-436.
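
For the Lasso entry in the list above, coordinate descent can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names are hypothetical, there is no intercept, and the objective is taken to be 0.5 * ‖y - Xβ‖² + λ‖β‖₁.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    col_sq = np.sum(X ** 2, axis=0)  # x_j^T x_j for every column
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Add back feature j's contribution to get the partial residual.
            residual += X[:, j] * beta[j]
            rho = X[:, j] @ residual
            # Closed-form one-dimensional lasso update for coordinate j.
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * beta[j]
    return beta


# Synthetic data in which only two of the five features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_beta + 0.01 * rng.normal(size=200)

beta_hat = lasso_coordinate_descent(X, y, lam=5.0)
```

Keeping a running residual makes each coordinate update cost O(n) instead of recomputing `X @ beta` from scratch.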

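For the MCMC entry in the list above, a random-walk sampler gives a feel for the Metropolis–Hastings algorithm. The sketch below is an assumption-laden illustration rather than the repository's code: the function names are hypothetical, the target is a one-dimensional standard normal, and the proposal is symmetric, so the Hastings correction cancels and the rule reduces to the plain Metropolis acceptance ratio.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal centred on the current state.
        candidate = x + rng.gauss(0.0, step)
        log_p_candidate = log_target(candidate)
        # Accept with probability min(1, p(candidate) / p(x)), in log space.
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_candidate - log_p:
            x, log_p = candidate, log_p_candidate
        samples.append(x)  # a rejected move repeats the current state
    return samples


def log_standard_normal(x):
    # Unnormalised log-density; the normalising constant is not needed.
    return -0.5 * x * x


samples = metropolis_hastings(log_standard_normal, x0=0.0, n_samples=20000)
kept = samples[2000:]  # discard the burn-in portion of the chain
mean = sum(kept) / len(kept)
variance = sum((s - mean) ** 2 for s in kept) / len(kept)
```

Working with log-densities avoids numerical underflow when the target takes very small values.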