
Linear Regression From Scratch

An implementation of linear regression from scratch in Python 📈. Includes the hypothesis function, the partial derivatives of the cost function, the parameter update rule, and gradient descent.


Getting Started

  1. Download the project to your computer: `git clone https://github.com/sarvasvkulpati/LinearRegression`, or download the ZIP file.

  2. Go to the project directory: `cd LinearRegression`

  3. Install the required packages: `pip install -r requirements.txt`

  4. Run the project: `python3 LinearRegression.py`

How It Works

What Is Linear Regression?

Linear regression is a method for approximating a linear relationship between two variables. While that may sound complicated, all it really means is that it takes some input variable, like the age of a house, and finds out how it's related to another variable, for example, the price it sells at.

[Figure: example of data with a linear relationship]

We use it when the data has a linear relationship, which means that when you plot the points on a graph, the data lies approximately in the shape of a straight line.
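For intuition, here is a minimal sketch (using NumPy, with made-up slope and intercept values) that generates the kind of approximately linear data this method is suited to:

```python
import numpy as np

# Illustrative only: points that lie roughly along a straight line.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)                  # input variable, e.g. age of a house
y = 3 * x + 4 + rng.normal(0, 2, x.shape)   # roughly y = 3x + 4, plus noise
```

Plotted, these points scatter around a straight line rather than a curve.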

Linear regression involves a few steps:

  1. Randomly initialising parameters for the hypothesis function
  2. Computing the mean squared error
  3. Calculating the partial derivatives
  4. Updating the parameters based on the derivatives and the learning rate
  5. Repeating from 2 until the error is minimised

The Hypothesis Function

The linear equation, `y = mx + b`, is the standard form that represents a straight line on a graph, where m represents the gradient and b represents the y-intercept.

The hypothesis function is the exact same equation in the notation of linear regression: `h(x) = theta0 + theta1 * x`.

The two variables we can change, m and b, are represented as the parameters theta1 and theta0.

In the beginning, we randomly initialise our parameters, which means we give theta1 and theta0 random values to start with. This produces a random line that will almost certainly fit the data poorly.
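As a rough sketch (the variable names here are illustrative, not necessarily those used in `LinearRegression.py`), the hypothesis function and random initialisation might look like this:

```python
import random

def hypothesis(theta0, theta1, x):
    # h(x) = theta0 + theta1 * x: a straight line with intercept theta0
    # and gradient theta1.
    return theta0 + theta1 * x

# Step 1: randomly initialise the parameters.
theta0 = random.uniform(-1, 1)
theta1 = random.uniform(-1, 1)
```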

The Error Function

Clearly, the line drawn above is wrong. But how wrong is it? That's what the error function is for: it calculates the total error of your line. We'll be using an error function called the Mean Squared Error, or MSE, represented by the letter J:

`J(theta0, theta1) = (1 / 2m) * Σ (h(x_i) - y_i)²`

(The extra factor of 1/2 is a common convention that simplifies the derivatives below.)

Now, while that may look complicated, what it's doing is actually quite simple. The function J takes our parameters theta0 and theta1 as input. Then, for every point from 1 to m, where m is the number of points, it calculates the difference between the actual y value and the value predicted by our function, and squares it. Squaring ensures that the error is always positive, and that the error grows the further the predicted value is from the actual value.

Now that we have a value for how wrong our function is, we need to adjust the function to reduce this error.
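A minimal sketch of the MSE, assuming the 1/2m convention above (again, the names are illustrative):

```python
def mean_squared_error(theta0, theta1, xs, ys):
    # J(theta0, theta1) = (1 / 2m) * sum of (h(x_i) - y_i)^2 over all points.
    m = len(xs)
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        prediction = theta0 + theta1 * x_i   # h(x_i)
        total += (prediction - y_i) ** 2     # squared error, always positive
    return total / (2 * m)
```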

Calculating Derivatives

If we graph our parameters against the error (i.e. graph the cost function), we'll find that it forms a bowl-like curve. Our goal is to find the lowest point of that curve, where the error is at its lowest. This is called minimising the cost function.

To do this, we need to consider what happens at the bottom of the curve: the gradient there is zero. So to minimise the cost function, we need to drive the gradient towards zero.

The gradient is given by the derivative of the function, and the partial derivatives of the cost function with respect to each parameter are:

`∂J/∂theta0 = (1 / m) * Σ (h(x_i) - y_i)`

`∂J/∂theta1 = (1 / m) * Σ (h(x_i) - y_i) * x_i`
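In code, computing those two partial derivatives might look like this (a sketch consistent with the formulas above; the names are illustrative):

```python
def gradients(theta0, theta1, xs, ys):
    # Returns (dJ/dtheta0, dJ/dtheta1) for the MSE cost function.
    m = len(xs)
    d_theta0 = 0.0
    d_theta1 = 0.0
    for x_i, y_i in zip(xs, ys):
        error = (theta0 + theta1 * x_i) - y_i   # h(x_i) - y_i
        d_theta0 += error                       # term for dJ/dtheta0
        d_theta1 += error * x_i                 # term for dJ/dtheta1
    return d_theta0 / m, d_theta1 / m
```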

Updating The Parameters Based On The Learning Rate

Now we need to update our parameters to reduce the error. To do this, we use the gradient update rule:

`theta_j := theta_j - alpha * ∂J/∂theta_j`

Alpha is what we call the learning rate: a small number that scales each update so the parameters change by a small amount. As mentioned above, we are trying to bring the gradient closer to zero (the bottom of the curve). The learning rate guides the parameters towards the lowest point on the curve in small steps.
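One gradient-descent step, as a sketch (0.01 is just an illustrative learning rate):

```python
def update(theta0, theta1, d_theta0, d_theta1, alpha=0.01):
    # Gradient update rule: theta_j := theta_j - alpha * dJ/dtheta_j.
    return theta0 - alpha * d_theta0, theta1 - alpha * d_theta1
```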

Minimising the Cost Function

Now we repeat these steps, checking the error, calculating the derivatives, and updating the parameters, until the error is as low as possible. This is called minimising the cost function.

Once your error is minimised, your line should now be the best possible fit to approximate the data!
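Putting it all together, a self-contained sketch of the whole loop might look like this (the function name, learning rate, and iteration count are all illustrative choices, not necessarily what `LinearRegression.py` uses):

```python
import random

def train(xs, ys, alpha=0.05, iterations=1000):
    # Fit y ≈ theta0 + theta1 * x by gradient descent on the MSE.
    m = len(xs)
    theta0 = random.uniform(-1, 1)   # step 1: random initialisation
    theta1 = random.uniform(-1, 1)
    for _ in range(iterations):      # step 5: repeat until the error is low
        # steps 2-3: compute the partial derivatives of the error
        d_theta0 = sum((theta0 + theta1 * x - y) for x, y in zip(xs, ys)) / m
        d_theta1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xs, ys)) / m
        # step 4: update the parameters using the learning rate
        theta0 -= alpha * d_theta0
        theta1 -= alpha * d_theta1
    return theta0, theta1

# Example usage on roughly linear data (approximately y = 3x + 4):
xs = [0, 1, 2, 3, 4, 5]
ys = [4.1, 6.9, 10.2, 12.8, 16.1, 19.0]
print(train(xs, ys))
```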
