- Installation
- Project Motivation
- File Descriptions
- How to Interact with the project
- Results
- Licensing, Authors, Acknowledgements, etc.
To run this project, you need Jupyter Notebook with Python 3. The project uses the following libraries:
- numpy.
- pandas.
- matplotlib.
- sklearn.
- seaborn.
Note: The Jupyter notebook "Airbnb Cost per Date on the year.ipynb" reads the CSV file "calendar.csv", so you must first unzip the file "calendar.csv.zip".
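The unzip step from the note above can also be done from Python. The following is a minimal sketch, not part of the project: the helper name `load_calendar` is hypothetical, and it assumes the archive sits in the working directory and contains a member named "calendar.csv".

```python
import zipfile
from pathlib import Path

import pandas as pd


def load_calendar(zip_path="calendar.csv.zip", extract_dir="."):
    """Extract calendar.csv from the archive if needed, then load it.

    Hypothetical helper: the default paths are assumptions about where
    the files live relative to the notebook.
    """
    csv_path = Path(extract_dir) / "calendar.csv"
    if not csv_path.exists():
        with zipfile.ZipFile(zip_path) as archive:
            # Assumes the archive holds a member named "calendar.csv".
            archive.extract("calendar.csv", path=extract_dir)
    return pd.read_csv(csv_path)
```

Calling `calendar = load_calendar()` in a notebook cell would then return the data as a DataFrame, skipping the extraction if the CSV is already present.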
My motivation for this project was to explore Airbnb's public data for Seattle and answer the following questions:
- What is the most expensive city in Seattle to rent an Airbnb?
- What is the best date to travel to Seattle and find a good price for an Airbnb?
- Which city in Seattle has the most Airbnb options that meet my requirements?
- What variables are important for predicting the price of an Airbnb in Seattle?
- Knowing the datasets.ipynb: This notebook gives an overview of the data and identifies the variables that matter for answering the questions. It uses basic exploration functions such as describe and head.
- Mean of price per location and Distribution.ipynb: This notebook answers the question about the mean price per city and shows how Airbnb listings are distributed around Seattle. It merges datasets and answers the questions with bar plots and box plots.
- Mean price during the year.ipynb: This notebook answers the question about the mean price over the year, to find the best time to travel to Seattle. It uses functions to format the dates and a time series plot to show the insights.
- Prediction Price Model.ipynb: This notebook identifies the variables that are correlated with the price. It uses two datasets, listings.csv and scraped.csv, and a heatmap to show the correlations. It also includes functions that fill NAs and create dummy variables to improve the prediction.
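The preprocessing described for the prediction notebook (filling NAs, creating dummy variables, then checking correlation with the price) can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the function name `prepare_features`, the mean-fill strategy, and the toy columns `bedrooms` and `room_type` are all made up for the example.

```python
import pandas as pd


def prepare_features(df, target="price"):
    """Fill missing values and one-hot encode non-numeric columns.

    Hypothetical helper; mean imputation is just one common choice.
    """
    # Rows without a target value cannot be used for prediction.
    df = df.dropna(subset=[target]).copy()
    numeric_cols = df.select_dtypes(include="number").columns.drop(target)
    # Fill numeric gaps with the column mean.
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
    categorical_cols = list(df.select_dtypes(exclude="number").columns)
    # dummy_na=True keeps "missing" as its own category instead of dropping it.
    return pd.get_dummies(df, columns=categorical_cols, dummy_na=True, dtype=float)


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "price": [85.0, 150.0, 60.0, 200.0],
    "bedrooms": [1.0, 2.0, None, 3.0],
    "room_type": ["Private room", "Entire home/apt", "Private room", None],
})

features = prepare_features(toy)
# Correlation of every prepared feature with the price, strongest first.
correlations = features.corr()["price"].drop("price").sort_values(ascending=False)
print(correlations)
```

The same correlation matrix (`features.corr()`) is what a heatmap such as seaborn's `heatmap` would visualize.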
The project is split by the questions it answers. The best way to start is with the file Business Understanding and Data Understanding.ipynb; from there, open the notebook for the question you are interested in. The following list maps each question to its file.
- What is the most expensive city in Seattle to rent an Airbnb? This question is answered in the file Mean of price per location and Distribution.ipynb.
- What is the best date to travel to Seattle and find a good price for an Airbnb? This question is answered in the file Mean price during the year.ipynb.
- Which city in Seattle has the most Airbnb options that meet my requirements? This question is answered in the file Mean of price per location and Distribution.ipynb.
- What variables are important for predicting the price of an Airbnb in Seattle? This question is answered in the file Prediction Price Model.ipynb.
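As a rough idea of how the travel-date question can be approached, the sketch below parses the dates, converts the prices to numbers, and averages them per month. It is not the notebook's code: the function name `monthly_mean_price` is hypothetical, and it assumes the calendar data has a `date` column and a `price` column stored as text such as "$85.00". The toy rows are invented for illustration.

```python
import pandas as pd


def monthly_mean_price(calendar):
    """Return the mean price per calendar month (1 to 12).

    Hypothetical helper; the column names and price format are assumptions.
    """
    # Rows without a price (for example, unavailable dates) are skipped.
    df = calendar.dropna(subset=["price"]).copy()
    df["date"] = pd.to_datetime(df["date"])
    # Strip the dollar sign and thousands separators before converting.
    df["price"] = (
        df["price"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
    )
    return df.groupby(df["date"].dt.month)["price"].mean()


# Toy rows, invented for illustration only.
toy_calendar = pd.DataFrame({
    "date": ["2016-01-04", "2016-01-05", "2016-07-10", "2016-07-11"],
    "price": ["$80.00", "$100.00", "$1,200.00", None],
})

monthly = monthly_mean_price(toy_calendar)
print(monthly)
```

Plotting the result, for instance with `monthly.plot()`, gives a simple view of which months tend to be cheaper.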
The main findings of the code can be found at the post available here.
Credit goes to Airbnb for the data. You can find the licensing for the data and other descriptive information at the Airbnb link available here. Otherwise, feel free to use the code here as you would like!