real-estate-price-predictions's Introduction

My name is Greg Frasco, and I am a full-stack and app developer.

real-estate-price-predictions's People

Contributors

bapower, goudete, gregfrasco, vyedin

real-estate-price-predictions's Issues

Clean up LOTSIZE data

If we choose to use LOTSIZE in our models, we need to clean the data.

  • Some lot sizes are listed in acres and some in square feet, which skews the data significantly
  • Some lot sizes are 0 or extremely high, and these values should be validated
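A minimal pandas sketch of this cleanup, under stated assumptions: the data is loaded into a DataFrame with a `LOTSIZE` column, any positive value below `acre_cutoff` is treated as acres, and zero or very large values are flagged rather than corrected. The two thresholds and the `LOTSIZE_SQFT` and `LOTSIZE_SUSPECT` column names are hypothetical and would need tuning against the real data.

```python
import pandas as pd

SQFT_PER_ACRE = 43_560  # one acre is 43,560 square feet


def clean_lotsize(df, acre_cutoff=100.0, max_sqft=5_000_000):
    """Normalize LOTSIZE to square feet and flag values that need review.

    Assumptions (not from the original data spec): positive values below
    `acre_cutoff` are acres; values above `max_sqft` are implausible.
    """
    out = df.copy()
    lot = pd.to_numeric(out["LOTSIZE"], errors="coerce")

    # Convert values that look like acres into square feet.
    looks_like_acres = (lot > 0) & (lot < acre_cutoff)
    lot = lot.where(~looks_like_acres, lot * SQFT_PER_ACRE)

    # Flag missing, zero, negative, or extremely large lots for validation.
    suspect = lot.isna() | (lot <= 0) | (lot > max_sqft)
    out["LOTSIZE_SQFT"] = lot.where(~suspect)  # suspect rows become NaN
    out["LOTSIZE_SUSPECT"] = suspect
    return out
```

Keeping the flag in its own column means suspect rows can be reviewed by hand or excluded later, instead of being silently dropped.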

Encode categorical variables of interest and add them in columns to the data

For this first pass, we dropped some of the non-numeric features of the data, such as style, dates (we used days on market, but not the list and sold dates), cooling/heating, and other details like basement and fireplace.

Not all of these will be relevant, but some might be. If anything stands out to you, you can play around with one-hot encoding additional features: https://towardsdatascience.com/categorical-encoding-using-label-encoding-and-one-hot-encoder-911ef77fb5bd. They will be easy to add to the model later on.
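As a rough illustration of the one-hot encoding step, here is a small sketch using `pandas.get_dummies`. The column names passed in (`STYLE`, `COOLING`) are hypothetical stand-ins for whichever categorical fields turn out to matter, and the helper name `add_one_hot` is made up for this example.

```python
import pandas as pd


def add_one_hot(df, columns):
    """Replace the given categorical columns with one-hot indicator columns.

    Columns that are not present in `df` are skipped, so the same call
    works across CSVs with slightly different fields.
    """
    present = [c for c in columns if c in df.columns]
    # dummy_na=True adds a separate indicator for missing values, so
    # listings with no recorded category are not mistaken for a real one.
    return pd.get_dummies(df, columns=present, dummy_na=True, dtype=int)


# Example with made-up values:
listings = pd.DataFrame({
    "SOLDPRICE": [608273.0, 810000.0, 99000.0],
    "STYLE": ["Colonial", "Ranch", "Colonial"],
})
encoded = add_one_hot(listings, ["STYLE", "COOLING"])
```

The encoded frame keeps the numeric columns untouched and adds one column per category value, which can then be appended to the model's feature set.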

Add column for number of photos to the data

According to a realtor friend, one tell-tale sign of a fixer-upper is a small number of photos attached to the listing. To capture this, we don't need to analyze the photos themselves; we just need to add a column to the data that says how many photos were included in the listing.

The imgs.zip file that we received has images with the MLS number as part of the filename. I'm not sure it has every listing, but we should be able to count what's there and add it to the data.

Note: We'll need to treat listings with 0 photos as a special case, since a count of 0 means we don't have the photos in the zip, not that the listing had no photos. We don't want this to skew the data. But let's first figure out what we have.
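The counting step could be sketched with the standard library as below. The exact filename layout inside imgs.zip is not known here, so this assumes only that each image name contains an 8-digit MLS number (like 71947648); the regular expression and the helper names are illustrative. Listings missing from the archive get `None` rather than 0, which keeps "unknown" separate from "no photos".

```python
import re
import zipfile
from collections import Counter

# Assumption: an MLS number is a standalone run of exactly 8 digits.
MLS_PATTERN = re.compile(r"(?<!\d)(\d{8})(?!\d)")


def count_photos(zip_file):
    """Count images per MLS number in a zip archive (path or file object)."""
    counts = Counter()
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            base = name.rsplit("/", 1)[-1]  # ignore any folder prefix
            match = MLS_PATTERN.search(base)
            if match:
                counts[int(match.group(1))] += 1
    return counts


def photo_count_column(mls_numbers, counts):
    """Return one photo count per listing; None means 'not in the zip'."""
    return [counts.get(mls) for mls in mls_numbers]
```

With pandas, the result of `photo_count_column(df["MLSNUM"], counts)` could be assigned to a new column, where the `None` entries become missing values instead of zeros.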

Map

Create Map Component

Clean up listings with negative age

FLIPPABLE MLSNUM SOLDPRICE DOM BEDS BATHS SQFT AGE GARAGE
True 71947648 608273.0 41 3 2.5 2086 -68 2
True 71947648 608273.0 41 3 2.5 2086 -68 2
False 72191043 810000.0 22 5 3.0 4000 -7981 0
False 72029110 99000.0 3 4 2.0 2190 -172 0
False 71920084 150000.0 52 3 2.0 1305 -7981 1
False 71980474 185000.0 134 3 2.0 1280 -7981 1
False 72003822 389000.0 84 3 2.5 2321 -6478 2
False 72110860 2900000.0 193 6 3.5 3469 -7981 0
False 72179044 564000.0 66 4 2.5 2430 -7981 1
False 72205606 559000.0 80 4 2.5 2520 -7981 2
False 72212450 665000.0 28 6 3.0 3000 -7981 2
False 72082350 140000.0 330 4 2.0 1472 -7981 0
False 72232439 264500.0 26 6 4.0 4718 -7981 3
False 72121524 539900.0 19 3 2.0 1830 -88 2
False 72089520 641657.0 81 3 2.5 2086 -68 2
False 72142398 1515000.0 23 6 6.0 3850 -7981 0
False 72080444 245000.0 16 3 1.0 1286 -7981 1
False 71916045 151000.0 358 4 2.0 1561 -7981 1
False 71928371 350000.0 188 4 1.5 2166 -7981 2
False 71988620 348000.0 74 3 2.0 1536 -7981 0
False 71892962 160000.0 292 2 1.0 677 -830 0
False 71939481 299293.0 145 3 2.0 1456 -88 2
False 72094644 659525.0 80 3 2.5 2086 -68 2
False 72144583 429000.0 1 3 1.5 1554 -7981 0
False 72242689 900000.0 47 4 3.5 2848 -899 2
False 72132722 675000.0 46 3 2.5 2086 -68 2
False 72229389 430000.0 23 3 2.5 2074 -56 0
False 72149033 655145.0 81 3 2.5 2086 -68 2
False 72222081 1066875.0 81 4 3.0 2240 -188 0
False 72252296 255000.0 235 4 2.5 2348 -263 2
False 72093762 740000.0 47 9 4.0 3539 -7981 1
False 72045435 631890.0 16 3 2.5 2086 -68 2
False 72044555 166000.0 135 4 2.0 1453 -7981 2
False 72083972 403000.0 60 3 2.5 1536 -88 0
False 72088506 1027500.0 149 6 5.0 4500 -7981 3
False 72061311 651700.0 44 3 2.5 2086 -68 2
False 72111244 646665.0 14 3 2.5 2086 -68 2
False 72113042 628590.0 63 3 2.5 2086 -68 2
False 71886289 1029000.0 99 4 3.0 2500 -7981 1
False 71942158 507000.0 161 4 2.5 2430 -7981 1
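One possible way to handle rows like those above, sketched with pandas: mark negative ages as invalid and blank them out so they cannot distort a model, and tally which negative values recur so the source of the bad data can be investigated. The `AGE_INVALID` column name and both helper names are hypothetical; only the `AGE` column comes from the table.

```python
import pandas as pd


def flag_negative_age(df):
    """Mark listings with a negative AGE and replace the value with NaN."""
    out = df.copy()
    bad = out["AGE"] < 0
    out["AGE_INVALID"] = bad
    out["AGE"] = out["AGE"].where(~bad)  # invalid ages become NaN
    return out


def negative_age_summary(df):
    """Count how often each negative AGE value occurs."""
    return df.loc[df["AGE"] < 0, "AGE"].value_counts()
```

A value that repeats many times in the summary is more likely a placeholder or data-entry pattern than a set of independent typos, which helps decide whether to fix rows individually or in bulk.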

DATA CLEANING

The following listings need to be cleaned. Here is the process:

1. Paste the MLS number into Redfin search.
2. Find the listing in the correct CSV (see the sale date).
3. Fix the listing details.
4. Save, then commit and push the changes.

72250832 - SOLD Jan 2018
71902243 - SOLD Jan 2017
72214658 - SOLD Oct 2017
72099376 - SOLD April 2017
72032454 - This looks to be a listing in San Diego, so the MLS number is probably wrong too
72027853 - SOLD Nov 2016
72018311 - SOLD May 2017
71955378 - SOLD Oct 2016
72045937 - SOLD Dec 2016
72133139 - SOLD May 2017
72144618 - SOLD May 2017
