Home Price Prediction Project
This is the repository for Jacob Crabb and Taeho Jeon's Flatiron School end of module 1 project.
Our assignment has three parts:
- to use kc_house_data.csv and its description file, column_names.md, to build a model for predicting home sale prices.
- to make a PowerPoint/Keynote/Google Slides presentation explaining our model.
- to write a blog post about our project and publish it for other aspiring data scientists to see.
A link to the blog on our project's visualisations: https://medium.com/@alludedwinter/visualizations-a-regressive-gene-235a4334276f
Link to the slides:
Business understanding/the problem we're fixing:
We are to build a model that predicts home price based on customer needs such as number of bedrooms, number of bathrooms, location, and others.
Data understanding/what are we working with:
We are provided two years of housing data from King County in the kc_house_data.csv file, and we are allowed to pull from other sources as needed.
Data preparation/Data cleaning:
We check for duplicates, correct missing or incorrect values, and remove outliers.
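The cleaning steps above can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual code: the function name `clean_housing`, the median fill, and the 1.5 x IQR outlier rule are all assumptions chosen for the example, and the column names are assumed to match kc_house_data.csv.

```python
import numpy as np
import pandas as pd


def clean_housing(df, price_col="price", iqr_factor=1.5):
    """Drop duplicate rows, fill missing numeric values, and trim price outliers.

    Hypothetical helper: the fill and outlier policies are illustrative choices.
    """
    out = df.drop_duplicates().copy()

    # Fill missing numeric values with each column's median.
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # Keep rows whose price lies within iqr_factor * IQR of the quartiles.
    q1, q3 = out[price_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out[price_col].between(q1 - iqr_factor * iqr, q3 + iqr_factor * iqr)
    return out[in_range].reset_index(drop=True)


# Toy data: one exact duplicate row, one missing value, one extreme price.
raw = pd.DataFrame({
    "id": [1, 2, 2, 3, 4, 5, 6],
    "price": [300_000, 320_000, 320_000, 310_000, 305_000, 315_000, 5_000_000],
    "bedrooms": [3, 4, 4, np.nan, 3, 2, 5],
})
cleaned = clean_housing(raw)
print(cleaned)
```

Median filling and an IQR rule are only one reasonable policy; which values count as "incorrect" depends on the column and is a judgment call for each dataset.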
Modeling:
We build functions to perform train/test splits and cross-validation. Then we test out different predictors with sklearn feature selection.
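A sketch of what such helper functions might look like is shown below. It assumes a plain linear regression and scikit-learn's `SelectKBest` with `f_regression` as the feature selection method; the README does not say which estimator or selector was used, so both are assumptions. The helper names and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split


def split_and_cross_validate(X, y, test_size=0.2, folds=5, seed=0):
    """Hold out a test set, cross-validate on the rest, then score on the test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    model = LinearRegression()
    cv_r2 = cross_val_score(model, X_train, y_train, cv=folds, scoring="r2")
    model.fit(X_train, y_train)
    return model, float(cv_r2.mean()), float(model.score(X_test, y_test))


def top_k_features(X, y, k=2):
    """Return the k column names with the highest univariate F-scores."""
    selector = SelectKBest(score_func=f_regression, k=k).fit(X, y)
    return list(X.columns[selector.get_support()])


# Synthetic stand-in for the housing data (not real King County values).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "sqft_living": rng.uniform(500, 4000, n),
    "bedrooms": rng.integers(1, 6, n),
    "unrelated": rng.normal(size=n),
})
y = 200 * X["sqft_living"] + 50_000 * X["bedrooms"] + rng.normal(0, 20_000, n)

model, cv_r2, test_r2 = split_and_cross_validate(X, y)
selected = top_k_features(X, y, k=2)
print(f"CV R^2: {cv_r2:.3f}, test R^2: {test_r2:.3f}, selected: {selected}")
```

Cross-validating only on the training portion keeps the held-out test set untouched until the final score.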
Evaluation:
We test our model in general, on areas outside of Seattle, inside of Seattle, and on each individual zip code. We then show the results for each area.
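One way to produce this kind of per-area breakdown is to compute an error metric on grouped predictions. The sketch below assumes RMSE as the metric (the README does not name one), uses made-up prices and predictions, and uses `SEATTLE_ZIPS` as a deliberately incomplete, illustrative set of Seattle zip codes.

```python
import numpy as np
import pandas as pd

# Illustrative subset only; a real list of Seattle zip codes would be longer.
SEATTLE_ZIPS = {98101, 98103}


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def rmse_by_group(df, group_col, actual_col="price", pred_col="predicted"):
    """Compute RMSE separately for each value of group_col."""
    scores = {key: rmse(g[actual_col], g[pred_col]) for key, g in df.groupby(group_col)}
    return pd.Series(scores, name="rmse")


# Toy results table: actual sale prices next to hypothetical model predictions.
results = pd.DataFrame({
    "zipcode": [98101, 98101, 98004, 98004],
    "price": [500_000, 600_000, 900_000, 1_000_000],
    "predicted": [510_000, 590_000, 950_000, 950_000],
})
results["region"] = np.where(
    results["zipcode"].isin(SEATTLE_ZIPS), "Seattle", "outside Seattle"
)

overall = rmse(results["price"], results["predicted"])
by_region = rmse_by_group(results, "region")
by_zip = rmse_by_group(results, "zipcode")
print(overall, by_region, by_zip, sep="\n")
```

Reporting the overall score next to the grouped scores makes it easier to spot areas where the model is noticeably weaker than average.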
Deployment:
We build a usable predictor function based on our model, then add some final conclusions about our model.
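A predictor function of this kind could be wrapped around a fitted model roughly as follows. This is a hedged sketch, not the project's actual function: `make_predictor`, the `FEATURES` list, and the toy training rows are hypothetical, and the toy prices follow an exact linear formula so the output is easy to check by hand.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumed feature set for the example; the real model may use different columns.
FEATURES = ["bedrooms", "bathrooms", "sqft_living"]


def make_predictor(model, feature_names):
    """Wrap a fitted model in a function that takes home features as keywords."""

    def predict_price(**home):
        missing = [name for name in feature_names if name not in home]
        if missing:
            raise ValueError(f"missing features: {missing}")
        # Build a one-row frame with columns in the order the model was trained on.
        row = pd.DataFrame([[home[name] for name in feature_names]], columns=feature_names)
        return float(model.predict(row)[0])

    return predict_price


# Toy training data: price = 100000 + 20000*bedrooms + 10000*bathrooms + 150*sqft_living.
train = pd.DataFrame({
    "bedrooms": [2, 3, 4, 3, 5],
    "bathrooms": [1, 2, 2, 1, 3],
    "sqft_living": [1000, 1500, 2500, 1800, 3000],
})
train["price"] = (
    100_000
    + 20_000 * train["bedrooms"]
    + 10_000 * train["bathrooms"]
    + 150 * train["sqft_living"]
)

model = LinearRegression().fit(train[FEATURES], train["price"])
predict_price = make_predictor(model, FEATURES)
estimate = predict_price(bedrooms=3, bathrooms=2, sqft_living=2000)
print(round(estimate))
```

Validating the inputs inside the wrapper means a caller who forgets a feature gets a clear error instead of a silently wrong estimate.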
Future:
It would be nice at some point to iron out the weaknesses of our model and make improvements to the code.