YanLiang's Projects

2014

Official content for the Fall 2014 Harvard CS109 Data Science course

2014_data

Data directory for the CS109 Data Science course

angular-lynda

Building a Data-Driven App with AngularJS with Ray Villalobos

cs224n

CS224n: Natural Language Processing with Deep Learning, assignments (Winter 2017)

easynlp

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

hmeae

Source code for EMNLP-IJCNLP 2019 paper "HMEAE: Hierarchical Modular Event Argument Extraction".

machine-learning-classify-handwritten-digit

Classify handwritten digits using machine learning techniques

Yan Liang, Yunzhi Wang and Delong Zhao

Project scope

For our machine learning project, we propose to build several machine learning classifiers that recognize handwritten digits. Handwritten digit recognition has been a classic problem in machine learning for many years. We plan to run several experiments using different machine learning algorithms and compare their pattern recognition performance. We hope to create a classifier whose classification accuracy matches or exceeds the record performance reported in previous studies. Yan will focus on neural networks, Delong will focus on random forest methods, and Yunzhi will focus on SVMs and KNNs. We will also develop a final novel classifier that combines the best models from our different experiments. We hypothesize that the final classifier will achieve a classification accuracy of 0.99, meaning that it misclassifies only 1% of the handwritten digit images.

The goal of handwritten digit recognition is to determine which digit appears in an image of a single handwritten digit. The task can be used to test pattern recognition theories and machine learning algorithms. Standard preprocessed handwritten digit image databases have been developed to compare different digit recognizers. In our semester project, we will use the Modified National Institute of Standards and Technology (MNIST) handwritten digit dataset from the Kaggle Digit Recognizer project. The Kaggle MNIST dataset is freely available and contains 42,000 training images and 28,000 test images. Each image is a preprocessed grayscale image of a single digit, 28 x 28 pixels in size. Each pixel is an integer from 0 to 255 that represents the brightness of the pixel, with higher values meaning darker. Each training image also has a label, which is the correct digit for the handwritten image.

For each input handwritten image, our model will output the digit it predicts, and we will evaluate that prediction against the correct label. We will use the 42,000 training images to train our machine learning models and the 28,000 test images to test their performance. We will then calculate the percentage of test images that are correctly classified and compare the performance of the different machine learning algorithms.
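
The evaluation step described above (the fraction of images classified correctly) and the idea of combining several models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proposal does not say how the final classifier combines the individual models, so simple majority voting is used here as one possible choice, and the function names `accuracy` and `majority_vote` as well as the tiny prediction lists are hypothetical, not project results.

```python
from collections import Counter
from typing import List, Sequence


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Return the fraction of images whose predicted digit equals the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one labeled image")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def majority_vote(per_model_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine models by taking the most common predicted digit per image.

    Assumption: majority voting stands in for the unspecified combination
    method. On a tie, the digit from the earliest-listed model wins, because
    Counter.most_common keeps first-encountered order for equal counts.
    """
    if not per_model_predictions:
        raise ValueError("need predictions from at least one model")
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*per_model_predictions)
    ]


# Made-up labels and predictions for five images (illustration only).
labels = [7, 2, 1, 0, 4]
neural_net = [7, 2, 1, 0, 9]      # one mistake on the last image
svm = [7, 3, 1, 0, 4]             # one mistake on the second image
random_forest = [7, 2, 1, 6, 4]   # one mistake on the fourth image

combined = majority_vote([neural_net, svm, random_forest])
print("neural net accuracy:", accuracy(neural_net, labels))
print("combined accuracy:  ", accuracy(combined, labels))
```

In this toy case each model errs on a different image, so the vote recovers every label. Real models often make correlated mistakes, in which case voting helps less.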

maven-dataset

Source code and dataset for EMNLP 2020 paper "MAVEN: A Massive General Domain Event Detection Dataset".

n3-collection

N3 - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format
