The ds-project1-airbnb-seattle from tienlx93

Please allow me 1 more day to complete this project. Thanks very much !

Installation
Project Motivation
File Descriptions
Results
Licensing, Authors, and Acknowledgements

Installation

There should be no necessary libraries to run the code here beyond the Anaconda distribution of Python. The code should run with no issues using Python versions 3.*.

Project Motivation

This is my first project in the data science field, and it applies the CRISP-DM (link) methodology.

For this project, I was interested in using Airbnb listing data for Seattle to better understand:

What are the interesting parts of the data, including pricing, size, and how to write a good description for a listing
Which aspects impact the pricing and popularity of a listing

File Descriptions

Seattle Airbnb analysis.ipynb: The main notebook file, which contains all of the work. It has 4 parts:

Part 1: Review & cleanup data
Part 2: Pricing overview
Part 3: How to make a good impression with the title, description & user reviews
Part 4: Which aspects affect pricing & user reviews

data/listings.csv: details of each listing, including the rental, place, pricing, and user reviews
data/reviews.csv: details of all reviews of the places
data/calendar.csv: details of pricing & availability of listings by day
output: folder containing output tables and images

Results

The project file Seattle Airbnb analysis.ipynb demonstrates the CRISP-DM method throughout.

Business Understanding
- Question 1: When is the busiest time to visit Seattle? Which places are most commonly visited?
- Question 2: Based on the descriptions & reviews, what can we learn to create a more attractive listing?
- Question 3: Which aspects have the most impact on nightly pricing? Which aspects impact the popularity of a listing, based on the number of reviews per month?
Data Understanding: The provided data includes:
- listings.csv: details of each listing, including the rental, place, pricing, and user reviews
- reviews.csv: details of all reviews of the places
- calendar.csv: details of pricing & availability of listings by day
Prepare Data: In the first part, 'Review & cleanup data', we run through several cleanup & preparation steps:
- Clean up the numeric & pricing values to get clean numeric output
- Remove the unnecessary data
- Create the statistics tables, correlation matrix, and data distributions for a better understanding of the data
- For Question 1, we merge listings.csv and calendar.csv to see the pricing distribution. This is covered in Part 2 of the project.
- For Question 2, we merge listings.csv and reviews.csv to get the reviews & descriptions of each listing. We split the text into words, then count the occurrences of each word, to compare the most attractive listings (based on number of reviews per month) against all listings. The findings are available in Part 3 of the project.
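
The price cleanup and word-count comparison described in the steps above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the notebook's actual code: the function names (parse_price, word_counts, overrepresented) and the sample descriptions are hypothetical, and the notebook may tokenize or rank words differently.

```python
import re
from collections import Counter


def parse_price(text):
    """Convert a price string such as '$1,250.00' to a float (None if empty)."""
    cleaned = re.sub(r"[^0-9.]", "", text or "")
    return float(cleaned) if cleaned else None


def word_counts(texts):
    """Count lowercase word occurrences across an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def overrepresented(top_texts, all_texts, n=5):
    """Return the n words whose share among the top listings most exceeds
    their share among all listings."""
    top = word_counts(top_texts)
    everything = word_counts(all_texts)
    top_total = sum(top.values())
    all_total = sum(everything.values())
    lift = {
        word: count / top_total - everything[word] / all_total
        for word, count in top.items()
    }
    return sorted(lift, key=lift.get, reverse=True)[:n]


# Hypothetical sample data, for illustration only.
all_descriptions = ["quiet garden view", "downtown loft", "downtown studio"]
top_descriptions = ["quiet garden view"]
top_words = overrepresented(top_descriptions, all_descriptions, n=3)
```

Comparing word shares rather than raw counts keeps common words from dominating just because the full set of listings is larger than the top group.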
Modeling
- Fit model: After cleaning & enriching the data and creating columns for the categorical values, we fit a standard linear regression model. This is done in Part 4 of the notebook.
- Validate the model: After adjusting the parameters, we removed some columns that caused overfitting and focused only on the most interesting data.
- After applying the model, we can answer Question 3: which aspects have the most impact on pricing and popularity. The result is available in Part 4 of the notebook.
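
As a rough illustration of the modeling step, the sketch below one-hot encodes a categorical column and fits a linear regression with pandas and scikit-learn (both ship with Anaconda). The column names and values are made-up toy data, not taken from listings.csv, and the notebook's actual feature set and validation approach may differ.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data; the real notebook works on data/listings.csv.
listings = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Shared room", "Private room", "Entire home/apt"],
    "bedrooms": [2, 1, 3, 1, 1, 1],
    "price": [150.0, 60.0, 220.0, 35.0, 70.0, 110.0],
})

# Create one indicator column per category; drop_first avoids a redundant
# column that would be perfectly collinear with the intercept.
X = pd.get_dummies(
    listings[["room_type", "bedrooms"]],
    columns=["room_type"],
    drop_first=True,
    dtype=float,
)
y = listings["price"]

model = LinearRegression().fit(X, y)

# Pair each coefficient with its feature name to see which features
# move the predicted price the most.
coefficients = pd.Series(model.coef_, index=X.columns)

# R^2 on the training data only; a held-out test set would be needed
# to check for overfitting.
r_squared = model.score(X, y)
```

The coefficient for each indicator column is read relative to the dropped category, so here the room type effects are measured against "Entire home/apt".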
Evaluation: The main findings of the project can be found at the post available here.

Licensing, Authors, Acknowledgements

Credit for the data in the data folder goes to Airbnb. You can find the licensing for the data and other descriptive information at the Kaggle link available here.

The notebook was created for educational purposes. Please feel free to contribute, remix, and make it your own work, with or without credit to the original.
