
warsaw_apartment_prices's Introduction

By Jan Czałbowski

Want to predict the price of renting an apartment in Warsaw?

Check out this apartment price prediction script, which uses simple linear regression and this database to predict the price of an apartment. For an explanation of the mathematics, scroll down.

How does it work?

The script predicts the price of an apartment with simple linear regression, which models one variable as a linear function of another. Here the area of an apartment is $X$ and its price is $Y$.

To find the optimal regression line, the sum of squared residuals $\sum (Yi - \hat{Y})^2$ has to be minimized, where $\hat{Y} = \hat{\beta_0} + \hat{\beta_1}Xi$ is the price the line predicts for $Xi$. This sum measures how far the observed values $Yi$ lie from the line, that is, how precise the linear function is. The parameters of the linear function are $\hat{\beta_0}$ and $\hat{\beta_1}$. The optimal parameters can be calculated in 3 steps.

  1. Deriving the partial derivatives of the sum of squares with respect to $\hat{\beta_0}$ and $\hat{\beta_1}$
    -> This gives the slope of the function at a given point
  2. Calculating the zeroes of the two derivatives
    -> Where both partial derivatives equal $0$ the function has a stationary point; because the sum of squares is convex, that point is its minimum
  3. Solving the simultaneous equations for $\hat{\beta_0}$ and $\hat{\beta_1}$
    -> To derive the complete formula for the linear regression $\hat{Y} = \hat{\beta_1}X + \hat{\beta_0}$
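
The quantity that these steps minimize can be sketched in a few lines of Python. This is only an illustration: the function name `residual_sum_of_squares` and the sample areas and prices are made up here and are not taken from the script or its database.

```python
def residual_sum_of_squares(areas, prices, beta_0, beta_1):
    """Sum of squared differences between observed and predicted prices."""
    return sum((y - (beta_0 + beta_1 * x)) ** 2 for x, y in zip(areas, prices))


# Hypothetical sample: areas in square metres and monthly rents.
areas = [30.0, 45.0, 60.0]
prices = [2000.0, 2900.0, 3800.0]

# This line passes through every sample point, so nothing is left over.
perfect = residual_sum_of_squares(areas, prices, 200.0, 60.0)

# A flatter line misses every point and gives a larger sum.
worse = residual_sum_of_squares(areas, prices, 200.0, 50.0)

print(perfect, worse)
```

The better the line fits the points, the smaller the returned value, which is why the derivation below looks for the parameters at which this sum is lowest.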

To simplify the notation, $\sum_{i = 1}^{n}$ is written as $\sum$

Formulas for calculating the mean:

$\overline{X} = \frac{\sum Xi}{n}$

$\overline{Y} = \frac{\sum Yi}{n}$

Step 1.

For $\hat{\beta_0}$

$\frac{\partial }{\partial \beta_0} \sum (Yi - \hat{Y})^2 = \sum \frac{\partial }{\partial \beta_0} (Yi - \hat{\beta_0} - \hat{\beta_1}Xi)^2 = -2\sum (Yi - \hat{\beta_0} - \hat{\beta_1}Xi)$

For $\hat{\beta_1}$

$\frac{\partial }{\partial \beta_1} \sum (Yi - \hat{Y})^2 = \sum \frac{\partial }{\partial \beta_1} (Yi - \hat{\beta_0} - \hat{\beta_1}Xi)^2 = -2\sum Xi(Yi - \hat{\beta_0} - \hat{\beta_1}Xi)$
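
One way to gain confidence in the two derivatives is to compare them with a numerical approximation. The sketch below assumes Python and uses invented data; the helper names `rss`, `rss_gradient` and `numeric_gradient` are hypothetical and not part of the script.

```python
def rss(xs, ys, b0, b1):
    """Sum of squared residuals for the line b0 + b1 * x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


def rss_gradient(xs, ys, b0, b1):
    """Analytic partial derivatives, written exactly as in Step 1."""
    d_b0 = -2 * sum(y - b0 - b1 * x for x, y in zip(xs, ys))
    d_b1 = -2 * sum(x * (y - b0 - b1 * x) for x, y in zip(xs, ys))
    return d_b0, d_b1


def numeric_gradient(xs, ys, b0, b1, h=1e-4):
    """Central finite differences, used only as a cross-check."""
    d_b0 = (rss(xs, ys, b0 + h, b1) - rss(xs, ys, b0 - h, b1)) / (2 * h)
    d_b1 = (rss(xs, ys, b0, b1 + h) - rss(xs, ys, b0, b1 - h)) / (2 * h)
    return d_b0, d_b1


# Invented data and an arbitrary, non-optimal pair of parameters.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 5.0]
analytic = rss_gradient(xs, ys, 0.5, 1.0)
numeric = numeric_gradient(xs, ys, 0.5, 1.0)

print(analytic, numeric)
```

Both pairs should agree up to floating-point rounding, because the sum of squares is quadratic in $\hat{\beta_0}$ and $\hat{\beta_1}$.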

Step 2.

$-2\sum (Yi - \hat{\beta_0} - \hat{\beta_1}Xi) = 0$
$-2\sum Xi(Yi - \hat{\beta_0} - \hat{\beta_1}Xi) = 0$

$\sum(Yi - \hat{\beta_0} - \hat{\beta_1}Xi) = 0$
$\sum Xi(Yi - \hat{\beta_0} - \hat{\beta_1}Xi) = 0$

Step 3.

Now derive the formula for $\hat{\beta_0}$ from the first equation, so that it can be substituted into the second equation

$\sum Yi - \sum \hat{\beta_0} - \sum \hat{\beta_1}Xi = 0$

$\hat{\beta_0}$ and $\hat{\beta_1}$ are treated as constants

$\sum Yi - \hat{\beta_0}\sum 1 - \hat{\beta_1}\sum Xi = 0$

$\sum Yi - \hat{\beta_0}n - \hat{\beta_1}\sum Xi = 0$

$\sum Yi - \hat{\beta_1}\sum Xi = \hat{\beta_0}n$

$\hat{\beta_0} = \frac{\sum Yi - \hat{\beta_1}\sum Xi}{n}$


$\hat{\beta_0} = \overline{Y} - \hat{\beta_1}\overline{X}$

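We take the second equation and substitute $\hat{\beta_0} = \overline{Y} - \hat{\beta_1}\overline{X}$

$\sum Xi(Yi - \overline{Y} + \hat{\beta_1}\overline{X} - \hat{\beta_1}Xi) = 0$

$\sum Xi((Yi - \overline{Y}) - \hat{\beta_1}(Xi - \overline{X})) = 0$

$\sum Xi(Yi - \overline{Y}) - \sum Xi \hat{\beta_1}(Xi - \overline{X}) = 0$

$\sum Xi(Yi - \overline{Y}) - \hat{\beta_1}\sum Xi (Xi - \overline{X}) = 0$

$\sum Xi(Yi - \overline{Y}) = \hat{\beta_1}\sum Xi (Xi - \overline{X})$

Because $\sum (Yi - \overline{Y}) = 0$ and $\sum (Xi - \overline{X}) = 0$, subtracting $\overline{X}\sum (Yi - \overline{Y})$ from the numerator and $\overline{X}\sum (Xi - \overline{X})$ from the denominator changes neither of them

$\hat{\beta_1} = \frac{\sum Xi(Yi - \overline{Y})}{\sum Xi (Xi - \overline{X})} = \frac{\sum (Xi-\overline{X})(Yi - \overline{Y})}{\sum (Xi - \overline{X})^2}$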


$\hat{\beta_1} = \frac{\sum (Xi-\overline{X})(Yi - \overline{Y})}{\sum (Xi - \overline{X})^2}$

$\hat{\beta_0} = \overline{Y} - \hat{\beta_1}\overline{X}$


$\hat{Y} = X\hat{\beta_1} + \hat{\beta_0}$

This is the optimal linear regression formula.
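
As a minimal sketch of how the closed-form result could be turned into code, the Python below computes the slope from the centred sums and the intercept from the two means. Python is an assumption here, the names `fit_line` and `predict` are hypothetical, and the sample data is invented rather than taken from the database.

```python
def fit_line(xs, ys):
    """Return (beta_0, beta_1) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all X values are equal, so the slope is undefined")
    beta_1 = s_xy / s_xx
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1


def predict(area, beta_0, beta_1):
    """Predicted price for an apartment of the given area."""
    return beta_0 + beta_1 * area


# Invented sample: areas in square metres and monthly rents.
areas = [30.0, 45.0, 60.0, 75.0]
prices = [2100.0, 2800.0, 3900.0, 4400.0]

beta_0, beta_1 = fit_line(areas, prices)
estimate = predict(50.0, beta_0, beta_1)
print(beta_0, beta_1, estimate)
```

The guard on `s_xx` covers the one case where the formula breaks down: if every apartment has the same area, the denominator is zero and no unique slope exists.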
