The classification from bosgithub

Classification

What is classification?

Classification is the process of predicting a qualitative response. It assigns an observation to a specific category or class: the classifier estimates the probability that the observation belongs to each category, then assigns the category with the highest probability.

Some examples are eye color {green, brown, blue} and email filtering {spam, ham}.
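The assignment rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the probability values are made up for the example, and a real classifier would estimate the probabilities from data.

```python
def classify(class_probabilities):
    """Return the class label with the highest estimated probability."""
    return max(class_probabilities, key=class_probabilities.get)


# Hypothetical class probabilities for one email and one person's eye color.
email_probs = {"spam": 0.83, "ham": 0.17}
eye_probs = {"green": 0.2, "brown": 0.5, "blue": 0.3}

print(classify(email_probs))
print(classify(eye_probs))
```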

Common classification techniques(classifiers):

Logistic Regression
Linear Discriminant Analysis
K-nearest Neighbors

There are also more computer-intensive (non-linear) models, such as generalized additive models (GAM), trees, random forests, boosting, and support vector machines.

Would linear regression work?

No, linear regression is not suitable for qualitative responses, for two reasons:

Multiclass classification: coding the classes as numerical values would imply an ordering of the responses. In the case of the colors of the rainbow, it makes no sense to say that red is greater than violet.
Binary classification: we need to output the probability of class A or class B, so the output must be bounded between 0 and 1. Linear regression may produce a value outside this range, which cannot be interpreted as a probability.
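The binary case can be checked with a small experiment. The sketch below fits ordinary least squares to a 0/1 response using a toy data set invented for this illustration; the helper name `fit_simple_linear_regression` is also hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data (an assumption for this example): class B is coded 1, class A is coded 0.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, slope = fit_simple_linear_regression(xs, ys)


def predict(x):
    """Linear prediction; nothing constrains it to the interval [0, 1]."""
    return intercept + slope * x


for x in (1, 4.5, 8):
    print(x, predict(x))
```

On this data the fitted line gives a negative value at x = 1 and a value above 1 at x = 8, neither of which can be read as a probability.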

How does it work?

We start by building a classifier from training observations $(x_1, y_1), \ldots, (x_n, y_n)$, where each $x_i$ holds the features and $y_i$ is the response variable. After we obtain our model, we test the classifier on unseen test data.
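A minimal sketch of this train-then-test workflow follows. It assumes a single numeric feature and uses a 1-nearest-neighbor rule as a stand-in classifier; the data, the 25% test fraction, and all function names are illustrative choices rather than anything prescribed here.

```python
import random


def train_test_split(observations, test_fraction=0.25, seed=0):
    """Shuffle the observations and hold out a fraction as unseen test data."""
    rng = random.Random(seed)
    shuffled = observations[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Predict the label of the training observation whose feature is closest to x."""
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y


def accuracy(train, test):
    """Fraction of test observations whose predicted label matches the true label."""
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)


# Toy (x, y) observations with two well-separated classes.
data = [(0.5, "A"), (1.0, "A"), (1.5, "A"), (2.0, "A"),
        (8.0, "B"), (8.5, "B"), (9.0, "B"), (9.5, "B")]

train, test = train_test_split(data)
print(len(train), len(test), accuracy(train, test))
```

The test observations are never used when making the classifier, so the accuracy measures how well it handles data it has not seen.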

Logistic Regression

Logistic regression models the probability that the response variable belongs to a particular category (for example, the probability of race given the demographic region).

Logistic Model:

$p(X) = \frac{e^{\beta_{0} + \beta_{1}*X }} {1+ e^{\beta_{0} + \beta_{1}*X}}$
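As a rough illustration, the logistic model can be written as a Python function. The coefficient values passed in below are arbitrary placeholders, and the two-branch form is just one common way to avoid overflow in `math.exp` for large inputs.

```python
import math


def logistic(x, beta0, beta1):
    """p(X) = e^(beta0 + beta1*X) / (1 + e^(beta0 + beta1*X)), computed stably."""
    z = beta0 + beta1 * x
    if z >= 0:
        # Equivalent form 1 / (1 + e^(-z)); avoids overflow when z is large.
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Arbitrary example coefficients; the output always stays between 0 and 1.
for x in (-10, 0, 10):
    print(x, logistic(x, beta0=0.0, beta1=1.0))
```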

Sometimes we prefer to work with the odds, or with the log of the odds (the logit):

Odds:

$\frac{p(X)}{1 - p(X)} = {e^{\beta_{0} + \beta_{1}*X }}$

Logit:

$\log\left ( \frac{p(X)}{1 - p(X)}\right ) = \beta_{0} + \beta_{1}*X$
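A quick numerical check of the relationship between the probability, the odds, and the logit: taking the log of the odds recovers the linear term $\beta_{0} + \beta_{1}*X$. The coefficients below are hypothetical values chosen only for the check.

```python
import math


def logistic(x, beta0, beta1):
    """Probability p(X) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def odds(p):
    """Odds p / (1 - p) for a probability p strictly between 0 and 1."""
    return p / (1.0 - p)


def logit(p):
    """Log of the odds."""
    return math.log(odds(p))


beta0, beta1 = -1.0, 0.5  # hypothetical coefficients

for x in (0, 1, 2, 3):
    p = logistic(x, beta0, beta1)
    # The last two columns agree up to floating-point rounding.
    print(x, odds(p), logit(p), beta0 + beta1 * x)
```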

Training the logistic model means fitting it to the training data to find the best estimates of the beta values. This is done with the maximum likelihood method, which chooses the beta values under which the observed training responses are most probable.
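One way to carry out the maximization is plain gradient ascent on the log-likelihood, sketched below for a single feature. This is an illustrative choice, not necessarily how any particular library fits the model; the toy data, the learning rate, and the number of steps are all assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(beta0, beta1, xs, ys):
    """Sum over observations of y*log(p) + (1 - y)*log(1 - p)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta0 + beta1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, learning_rate=0.1, n_steps=5000):
    """Estimate beta0 and beta1 by gradient ascent on the log-likelihood."""
    beta0, beta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        residuals = [y - sigmoid(beta0 + beta1 * x) for x, y in zip(xs, ys)]
        grad0 = sum(residuals)                             # d(logL)/d(beta0)
        grad1 = sum(r * x for r, x in zip(residuals, xs))  # d(logL)/d(beta1)
        beta0 += learning_rate * grad0 / n
        beta1 += learning_rate * grad1 / n
    return beta0, beta1


# Toy data with overlapping classes, so the maximum likelihood estimate is finite.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

beta0_hat, beta1_hat = fit_logistic(xs, ys)
print(beta0_hat, beta1_hat)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(beta0_hat, beta1_hat, xs, ys))
```

The fitted coefficients give a higher log-likelihood than the starting point of all zeros, which is the sense in which they explain the training data better.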
