The analysis was performed on a publicly available dataset from the NYC Department of Finance. The data comprised five datasets, one for each of the five boroughs of New York City. A synthetic problem statement was created before the analysis to define the insights to look for.
The dataset from the NYC Department of Finance covers residential and commercial properties sold from June 2022 to May 2023. The raw data was uncleaned and had numerous missing values. The five datasets were first cleaned and then merged into a single dataset for analysis using Python.
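The clean-then-merge step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual cleaning script: the column names (`sale_price`, `land_square_feet`), the sample rows, and the rule of dropping rows without a positive sale price are all assumptions made for the example.

```python
# Minimal sketch of cleaning per-borough records and merging them.
# Column names, sample values, and the drop rule are hypothetical.

def parse_number(raw):
    """Convert strings such as '1,250,000' or '$480,000' to float.

    Returns None when the value is missing or cannot be parsed.
    """
    if raw is None:
        return None
    text = str(raw).replace(",", "").replace("$", "").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_rows(rows, borough):
    """Parse numeric fields, drop rows without a positive sale price,
    and tag every remaining row with its borough."""
    cleaned = []
    for row in rows:
        price = parse_number(row.get("sale_price"))
        area = parse_number(row.get("land_square_feet"))
        if price is None or price <= 0:
            continue  # assumed rule: a sale without a price is unusable
        cleaned.append(
            {"borough": borough, "sale_price": price, "land_square_feet": area}
        )
    return cleaned


def merge_boroughs(datasets):
    """Combine the cleaned rows of every borough into one list."""
    merged = []
    for borough, rows in datasets.items():
        merged.extend(clean_rows(rows, borough))
    return merged


# Made-up sample input standing in for two of the five borough files.
sample = {
    "Manhattan": [
        {"sale_price": "1,250,000", "land_square_feet": "2,000"},
        {"sale_price": "", "land_square_feet": "1,500"},
    ],
    "Bronx": [
        {"sale_price": "$480,000", "land_square_feet": " - "},
    ],
}

combined = merge_boroughs(sample)
for record in combined:
    print(record)
```

Missing areas are kept as `None` rather than dropped here, so that a row can still count toward sales totals; whether the original cleaning did the same is not stated.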
The problem statement was generated with AI from the dataset. It is purely synthetic, and the analysis was carried out to answer its questions from the data. The dashboard answers the following questions:
- What is the average sales price?
- How many residential and commercial properties were sold?
- What is the average area of a property?
- What is the average price per square foot?
- Which borough has the highest number of properties sold?
- How does the sale price vary over the months?
- Is there a correlation between land square feet and sale price?
- What tax class is used the most for commercial properties?
- Which ZIP codes have the most properties sold?
- In which years were the most properties built?
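
Several of these questions reduce to simple aggregates, which can be cross-checked outside the dashboard. The sketch below is a hedged illustration in plain Python, not the DAX measures used in the project: the records are made up, the field names are hypothetical, and "average price per square foot" is assumed to mean total sale price divided by total land area (a mean of per-property ratios would give a different number).

```python
from collections import Counter
from math import sqrt
from statistics import mean

# Made-up records with hypothetical field names.
sales = [
    {"borough": "Brooklyn", "sale_price": 900000.0, "land_square_feet": 2000.0},
    {"borough": "Brooklyn", "sale_price": 600000.0, "land_square_feet": 1500.0},
    {"borough": "Queens", "sale_price": 750000.0, "land_square_feet": 2500.0},
]


def average_sale_price(rows):
    """Arithmetic mean of the sale prices."""
    return mean(r["sale_price"] for r in rows)


def average_price_per_sqft(rows):
    """Total price over total area, using only rows with a known,
    non-zero area (assumed definition). Returns None if no row qualifies."""
    usable = [r for r in rows if r["land_square_feet"]]
    if not usable:
        return None
    total_price = sum(r["sale_price"] for r in usable)
    total_area = sum(r["land_square_feet"] for r in usable)
    return total_price / total_area


def top_borough(rows):
    """Borough with the most sales, returned as (name, count)."""
    return Counter(r["borough"] for r in rows).most_common(1)[0]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


avg_price = average_sale_price(sales)
price_per_sqft = average_price_per_sqft(sales)
busiest = top_borough(sales)
area_price_corr = pearson(
    [r["land_square_feet"] for r in sales],
    [r["sale_price"] for r in sales],
)

print(avg_price, price_per_sqft, busiest, area_price_corr)
```

The correlation helper assumes both inputs vary; constant values would make the denominator zero and need separate handling.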
The dashboard was created using Power BI. DAX was used to build measures for the data, and Power Query was used to create conditional columns for a few visuals. The dashboard has one visual for each question in the problem statement. A static version of the dashboard is available as a PDF file; to view the interactive dashboard, download the .pbix file.