

Zillow Housing Time Series Market Analysis Phase-4-project


Moringa Phase 4 Project Submission

GROUP 4:

  • Student name: Kenneth Karanja
  • Student name: Pete Njagi
  • Student name: James Koli (Group Leader)
  • Student name: Tom Mwabire
  • Student name: Paul Mwangi
  • Student name: Lee Ndung'u
  • Student name: Edwin Mwenda

Scheduled project review date/time: April 12th 2024

Forecasting Housing Market Trends for Real Estate Investment Strategy


Important Project Files:

  1. Phase4.ipynb (Main Jupyter document)
  2. Presentation.pdf (Presentation)
  3. zillow_data.csv (Main Data)
  4. Data Science Report

Business Problem

The primary business problem is the lack of visibility and understanding of the US housing market among Kenyan investors. Additionally, there is a need to identify regions that offer promising investment opportunities within the low-cost housing segment. Another challenge is ensuring the accuracy and relevance of predictions amidst periods of market instability, such as the 2008 housing market crash. Therefore, the project aims to address these challenges by providing comprehensive data analysis and forecasting to guide investment decisions effectively.

Overview

Our client, a distinguished real estate investment firm based in Kenya, is dedicated to facilitating Kenyan investors' access to the thriving US housing market. Specializing in low-cost housing investments, they cater to the needs of local Kenyans and diaspora members looking to diversify their investment portfolios. Through rigorous data analysis and strategic insights, the firm empowers investors with the knowledge required to navigate the complexities of the US real estate market successfully. By democratizing access to wealth-building opportunities and fostering financial inclusion, they contribute to the economic growth and prosperity of Kenya while enabling individuals to secure their financial futures.

Business Stakeholders

The primary stakeholders for this project include:

  • Kenyans looking to invest in the US housing market
  • Kenyan diaspora planning to buy or rent properties in the USA
  • Real estate investment firms
  • Financial analysts focusing on real estate investments
  • Policy makers interested in foreign investments

This project seeks to empower our stakeholders with precise forecasts and strategic insights into the US real estate market, facilitating optimal investment decisions.

Business Objective

Our key objectives are:

  • Developing an accurate time series forecasting model using Zillow data.
  • Providing market trend analysis to identify promising investment opportunities.
  • Offering actionable investment strategies based on model insights.
  • Enhancing the decision-making process for buyers, renters, and investors.
  • Strengthening the investment portfolio of our clients through data-driven insights.

Through this initiative, we aim to bridge the gap between Kenyan investors and the US real estate market, ensuring lucrative and informed investment choices.

Data Understanding

The data source for our analysis is primarily the zillow_data.csv file, which encompasses historical housing price data across various US regions.

zillow_data.csv (Main data source)

  • Source: Zillow's publicly available dataset.
  • Contents: This dataset includes monthly housing prices across different US regions, spanning from April 1996 to April 2018. It covers details such as region name, city, state, metro area, and county, alongside the housing prices for each month.

Our analysis will focus on understanding the trends, patterns, and factors influencing these housing prices, thereby enabling us to forecast future market trends effectively.

Univariate Analysis

Unit of analysis: housing prices in USD

Identifier columns:

  • RegionID
  • RegionName (ZIP code)
  • City
  • State
  • Metro
  • CountyName
  • SizeRank
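The dataset stores one row per ZIP code with one price column per month, so time series work usually starts by reshaping it into one record per ZIP code and month. The sketch below illustrates that step with the standard library only. It assumes the monthly columns are named like `1996-04` and that missing prices are empty cells; the two sample rows and the `wide_to_long` helper are made up for illustration and are not taken from the project notebook.

```python
import csv
import io

# Identifier columns as listed in this README.
ID_COLUMNS = ["RegionID", "RegionName", "City", "State", "Metro", "CountyName", "SizeRank"]

# Two invented rows in the assumed wide layout (values are illustrative only).
SAMPLE_CSV = """RegionID,RegionName,City,State,Metro,CountyName,SizeRank,1996-04,1996-05,1996-06
1,00001,Alpha,AA,Alpha Metro,Alpha County,1,100000,101000,102000
2,00002,Beta,BB,Beta Metro,Beta County,2,200000,,202000
"""


def wide_to_long(csv_file):
    """Turn one row per ZIP code into one record per (ZIP code, month)."""
    records = []
    for row in csv.DictReader(csv_file):
        ids = {col: row[col] for col in ID_COLUMNS}
        for col, value in row.items():
            if col in ID_COLUMNS or value == "":
                continue  # skip identifier columns and missing prices
            records.append({**ids, "month": col, "price": float(value)})
    return records


long_rows = wide_to_long(io.StringIO(SAMPLE_CSV))
for record in long_rows:
    print(record["RegionName"], record["month"], record["price"])
```

With the real file, the same function could be called on `open("zillow_data.csv", newline="")`. A data-frame library would do the same reshape in one call; the plain version is shown only to make the transformation explicit.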

Data Visualizations

drawing

The graph shows property counts by county. Maricopa, Los Angeles, and Jefferson top the list. Middlesex, Jackson, and Harris follow closely. Cook, Montgomery, Washington, and Orange also have significant counts. Counts range from around 200 to over 250.

drawing

The graph displays property counts by state. California (CA) has the highest count. New York (NY) and Texas (TX) follow closely. Pennsylvania (PA), Florida (FL), and Ohio (OH) also have substantial counts. Illinois (IL), New Jersey (NJ), Michigan (MI), and Indiana (IN) complete the list.

drawing

The graph shows the 10 states with the highest property ROI. DC, HI, and CA have the highest ROI. Other states, like SD, CO, and MA, also show significant ROI. NY, WA, VT, and FL complete the list with varying ROI levels.
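The README does not spell out how ROI is computed. A minimal sketch, assuming ROI is the relative price change between the first and last observation of a series, could look like the following; the `roi` and `rank_by_roi` helpers and the example numbers are hypothetical.

```python
def roi(prices):
    """Return (last - first) / first for a chronologically ordered price list."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    first, last = prices[0], prices[-1]
    if first <= 0:
        raise ValueError("first price must be positive")
    return (last - first) / first


def rank_by_roi(series_by_region, top_n=10):
    """Return the top_n (region, ROI) pairs, highest ROI first."""
    scored = [(name, roi(prices)) for name, prices in series_by_region.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Invented price series, used only to show the calculation.
example = {
    "region_a": [100.0, 150.0],
    "region_b": [200.0, 220.0],
    "region_c": [50.0, 40.0],
}
top = rank_by_roi(example, top_n=2)
print(top)
```

A negative value, as for `region_c` here, means the price fell over the period.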

drawing

The scatter plot illustrates ROI for the 10 states with the most properties. CA, NY, and TX exhibit the highest ROIs. PA, FL, and OH also show notable ROIs. IL, NJ, MI, and IN complete the list, each with varying ROI levels.

drawing

The scatter plot shows ROI for the 10 counties with the most properties. Los Angeles, Jefferson, and Orange counties exhibit the highest ROIs. Washington, Montgomery, and Cook also show notable ROIs. Harris, Jackson, Middlesex, and Maricopa complete the list, each with varying ROI levels.

ARIMA & Modelling

image

To satisfy the model assumptions, the residuals must be uncorrelated and approximately normally distributed. For this case:

The residuals are approximately normally distributed: in the Q-Q plot on the bottom left they follow the straight reference line, and the histogram shows a bell-shaped curve. The correlogram shows minimal correlation between the residuals and their lagged versions, which suggests that no clear seasonality is left unexplained by the model.
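The correlogram check can also be done numerically. The sketch below computes the sample autocorrelation of a residual series and lists the lags that fall outside the approximate 95% band of ±1.96/√n expected for white noise. It is an illustration under that standard approximation, not the project's own code, and the alternating example series is deliberately artificial.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given non-negative lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom


def significant_lags(residuals, max_lag=10):
    """Lags whose autocorrelation exceeds the approximate 95% white-noise band."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [
        k for k in range(1, max_lag + 1)
        if abs(autocorrelation(residuals, k)) > bound
    ]


# Artificial, strongly autocorrelated "residuals": every lag gets flagged.
alternating = [1.0, -1.0] * 10
print(significant_lags(alternating, max_lag=3))
```

For well-behaved residuals the returned list should be empty or nearly so; many flagged lags would indicate structure the model has not captured.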

Testing the model's performance

image

The graph above shows how the forecasting model performs on the test data. By comparing the observed returns (blue line) with the predicted series (red line), we can assess how well the model captures the underlying patterns and trends in the data. The close alignment between the two lines indicates that the model captures the dynamics of the time series well, which supports its use for predicting future returns.
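Visual alignment between the two lines can be backed by an error metric. The README does not say which metrics were used, so the sketch below assumes two common choices, root mean squared error (RMSE) and mean absolute percentage error (MAPE); the function names and sample numbers are illustrative only.

```python
import math


def _check(observed, predicted):
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equally long")


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the series."""
    _check(observed, predicted)
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    _check(observed, predicted)
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Invented values, used only to show the calculation.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(round(rmse(observed, predicted), 3), round(mape(observed, predicted), 3))
```

RMSE penalises large misses more heavily, while MAPE is scale-free and therefore easier to compare across ZIP codes with very different price levels.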

Model Evaluation

image

The forecasted values (depicted by the red line) in the graph above show a consistent trend in home price movements over time, aligning closely with the actual observed values (represented by the blue line). This indicates the model's effectiveness in capturing the underlying patterns in the data, providing valuable insights for stakeholders. Such accurate forecasts are crucial for making informed decisions regarding investments or understanding market trends within the 85035 area.

Conclusion & Recommendations

image

With the exception of zip code 85035, every zip code has an encouraging projected price because they are all in the green.

We can determine our top five recommendations and their anticipated return on investment after three years based on the graph above.

Zip code 94804 = Richmond, California: This area's housing prices have been steadily rising, and a high return on investment is expected. Its stable market dynamics and favourable price trends make it a good fit for our client's investment portfolio.

Zip code 75217 = Dallas, Texas: With its promising combination of affordability and appreciation potential, this area represents an excellent opportunity for our client. Our analysis shows a positive trajectory in housing prices, indicating a high ROI potential.

Zip code 19143 = Kingsessing, Philadelphia County: This zip code demonstrates resilience in the face of market fluctuations, with consistent growth and promising investment opportunities. Its affordability and upward price trends make it an appealing option for a client looking for long-term returns.

Zip code 60628 = Roseland, Chicago, Illinois: This area has strong growth potential despite fluctuations in the overall market, especially in low-cost housing segments. Given the current favourable market conditions and anticipated growth, it is highly recommended for our client's investment plan.

Zip code 48227 = Wayne County, Detroit, Michigan: This area offers our client an appealing investment opportunity because of its market stability and affordability. It has the potential to yield substantial returns over time due to its steady increase in housing prices and bright future prospects.

The investor can therefore invest in any of the zip codes listed above. Zip code 85035 should be avoided because its projected return on investment is negative.

In conclusion, our thorough examination of the US housing market, with an emphasis on affordable housing investment opportunities, offers our Kenyan real estate investment firm client insightful information. By using rigorous data preprocessing, time series modelling, and exploratory data analysis, we have predicted future results and found encouraging trends for a number of zip codes. Our research showed that although housing prices fluctuated in some areas, they consistently increased in other areas, providing favourable returns on investment.

For example, zip code 94804 turned out to be the best suggestion, with the highest estimated return on investment. In addition, our analysis of the dynamics of the housing market, which included price trends, seasonality, and market stability, allowed us to formulate well-informed recommendations customised to our client's investment goals. By using these insights, our client can diversify their investment portfolio, allocate resources wisely to take advantage of profitable opportunities, and maximise returns.

In order to maximise long-term profitability and adjust to shifting market conditions, it will be essential to regularly review investment strategies and keep a close eye on market trends. Our client can confidently and precisely navigate the ever-changing US real estate market by incorporating data-driven decision-making processes into their investment strategy, setting themselves up for long-term success and sustainable growth.

Deployment

Future work in this domain could focus on several key areas:

1. Expanding the dataset: Incorporating additional data sources, such as new geographical areas, economic indicators, demographic trends, and policy changes, can provide a more comprehensive understanding of the factors influencing housing prices and enhance the accuracy of the forecasts.

2. Refining the modeling approach: Exploring advanced time series modeling techniques, such as SARIMA (Seasonal ARIMA) or machine learning algorithms like Long Short-Term Memory (LSTM) networks, could potentially improve the predictive power of the models and capture more complex patterns in the data.

3. Developing a real-time monitoring system: Implementing a system that continuously updates the analysis with the latest housing market data and generates automated alerts for significant changes or investment opportunities could help the investment firm stay ahead of the curve and make timely decisions.

4. Conducting ongoing performance evaluation: Regularly assessing the performance of the recommended investments against the forecasted trends and ROI projections can help validate the effectiveness of the data-driven approach and identify areas for improvement.
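As a rough illustration of the automated alerts mentioned in item 3, the sketch below flags months whose price moved by more than a chosen percentage relative to the previous month. The 5% threshold, the `flag_large_moves` name, and the sample prices are assumptions made for the example; a real monitoring system would need its own alerting rules and data feed.

```python
def flag_large_moves(prices, threshold=0.05):
    """Return (index, relative_change) for month-over-month moves above threshold."""
    alerts = []
    for i in range(1, len(prices)):
        previous = prices[i - 1]
        if previous <= 0:
            raise ValueError("prices must be positive")
        change = (prices[i] - previous) / previous
        if abs(change) > threshold:
            alerts.append((i, change))
    return alerts


# Invented monthly prices, used only to show the behaviour.
monthly_prices = [100.0, 101.0, 110.0, 109.0, 98.0]
alerts = flag_large_moves(monthly_prices)
for index, change in alerts:
    print(f"month {index}: {change:+.1%}")
```

Both sharp rises and sharp drops are reported, since either could signal an opportunity or a risk for the investor.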

drawing

A further step would be a real-time application built with Flask and React.js and hosted on the real estate company's website, where clients could check a live interactive map to see which zip codes are recommended and which should be avoided.

Open the file usa_zip_codes_roi_map.html in the project folder using a browser to see a basic example of this application.
