Airbnb Data Warehouse Project

Project Overview

This project builds a data warehouse for Airbnb listings in Barcelona. It centralizes data on listings, bookings, reviews, and hosts in one place so that the data can be stored, organized, and queried for analysis and decision-making.

The project involved several key stages:

  1. Data Scraping: Python scripts collect the Airbnb data.
  2. Data Loading: pandas and SQLAlchemy load the scraped data into a PostgreSQL database, which serves as the starting point for further processing and analysis.
  3. Data Profiling: An initial assessment of the data, using the Airbnb Data Dictionary as a reference, identifies data quality issues and documents the dataset's structure and content.
  4. Data Warehouse Design: A schema tailored to Airbnb data analysis, with two fact tables (reviews and bookings) and dimension tables for dates, listings, reviewers, locations, and hosts.
  5. Data Transformation: SQL statements transform the raw data so that it is correctly formatted and matches the data warehouse schema.
  6. Data Loading into Data Warehouse: SQL scripts with insert statements populate the data warehouse, making the data ready for analysis.
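
The loading, profiling, and transformation stages listed above can be sketched end to end. The snippet below is only a minimal illustration, not the project's actual scripts: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL and for the pandas/SQLAlchemy loader, and the table names (`stg_listings`, `dim_listing`), column names, and sample rows are all hypothetical. It also assumes that prices arrive as text such as "$1,050.00", which may not match the real raw data.

```python
import sqlite3

# Hypothetical sample rows standing in for the raw listings data.
RAW_LISTINGS = [
    (101, "Sunny flat near Sagrada Familia", 5001, "$120.00"),
    (102, "Gothic Quarter studio", 5002, "$1,050.00"),
    (103, "Room in Gracia", 5001, None),  # missing price
]


def load_staging(conn, rows):
    """Stage 2: load raw rows into a staging table without changing them."""
    conn.execute(
        "CREATE TABLE stg_listings (id INTEGER, name TEXT, host_id INTEGER, price TEXT)"
    )
    conn.executemany("INSERT INTO stg_listings VALUES (?, ?, ?, ?)", rows)


def profile(conn):
    """Stage 3: basic profiling (row count, missing prices, distinct hosts)."""
    total, missing_price, hosts = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN price IS NULL THEN 1 ELSE 0 END),
               COUNT(DISTINCT host_id)
        FROM stg_listings
        """
    ).fetchone()
    return {"rows": total, "missing_price": missing_price, "distinct_hosts": hosts}


def transform_and_load(conn):
    """Stages 5-6: strip currency formatting and insert into a typed table."""
    conn.execute(
        "CREATE TABLE dim_listing (listing_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.execute(
        """
        INSERT INTO dim_listing (listing_id, name, price)
        SELECT id, name,
               CAST(REPLACE(REPLACE(price, '$', ''), ',', '') AS REAL)
        FROM stg_listings
        """
    )


conn = sqlite3.connect(":memory:")
load_staging(conn, RAW_LISTINGS)
report = profile(conn)
transform_and_load(conn)
prices = conn.execute(
    "SELECT listing_id, price FROM dim_listing ORDER BY listing_id"
).fetchall()
print(report)
print(prices)
```

Keeping the staging table untouched and doing the cleanup in a separate SQL step mirrors the split between the loading and transformation stages: profiling sees the data as it arrived, and a missing price stays NULL rather than being silently replaced.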

Data Warehouse Schema

Data Sources

  • Airbnb Dataset: The primary dataset for this project was obtained from the "Get the Data" page of Inside Airbnb, which provides detailed data on Airbnb listings and activity.
  • Airbnb Data Dictionary: The Airbnb Data Dictionary was instrumental in understanding the dataset's attributes and guiding the data profiling stage.

Technologies Used

  • Python: For data scraping and initial data processing.
  • Pandas & SQLAlchemy: For data manipulation and loading the data into PostgreSQL.
  • PostgreSQL: As the relational database management system to store the initial datasets.
  • SQL: For data transformation, schema creation, and data loading processes in the data warehouse.

Data Warehouse Structure

The data warehouse is designed with analytical queries in mind, structured around two fact tables:

  • Reviews Fact Table: Captures details about reviews made by guests.
  • Bookings Fact Table: Contains information on bookings.

The supporting dimension tables are:

  • Date Dimension: Holds information on dates to enable time-based analyses.
  • Listings Dimension: Contains detailed information about the listings.
  • Reviewers Dimension: Stores information about the reviewers.
  • Location Dimension: Captures geographical information about the listings.
  • Hosts Dimension: Contains details about the hosts of the listings.
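
As a rough illustration of how a fact table and its dimensions fit together, the sketch below builds a reduced version of the structure described above and runs one analytical query against it. It is an assumption-laden example: sqlite3 replaces PostgreSQL, only the reviews fact table and three of the dimensions are included, and every table name, column name, key format, and sample row is hypothetical rather than taken from the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified star schema: one fact table keyed to three dimensions.
conn.executescript(
    """
    CREATE TABLE dim_date (
        date_key  INTEGER PRIMARY KEY,  -- assumed YYYYMMDD integer key
        full_date TEXT,
        year      INTEGER,
        month     INTEGER
    );
    CREATE TABLE dim_listing (
        listing_key INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE dim_reviewer (
        reviewer_key  INTEGER PRIMARY KEY,
        reviewer_name TEXT
    );
    CREATE TABLE fact_reviews (
        review_id    INTEGER PRIMARY KEY,
        date_key     INTEGER REFERENCES dim_date(date_key),
        listing_key  INTEGER REFERENCES dim_listing(listing_key),
        reviewer_key INTEGER REFERENCES dim_reviewer(reviewer_key)
    );
    """
)

# Made-up sample rows, loaded dimensions first and facts last.
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
    [(20240115, "2024-01-15", 2024, 1), (20240203, "2024-02-03", 2024, 2)],
)
conn.executemany(
    "INSERT INTO dim_listing VALUES (?, ?)",
    [(1, "Gothic Quarter studio"), (2, "Room in Gracia")],
)
conn.executemany(
    "INSERT INTO dim_reviewer VALUES (?, ?)",
    [(10, "Ana"), (11, "Ben")],
)
conn.executemany(
    "INSERT INTO fact_reviews VALUES (?, ?, ?, ?)",
    [(1000, 20240115, 1, 10), (1001, 20240115, 1, 11), (1002, 20240203, 2, 10)],
)

# Time-based analysis: number of reviews per listing per month.
reviews_per_month = conn.execute(
    """
    SELECT l.name, d.year, d.month, COUNT(*) AS review_count
    FROM fact_reviews AS f
    JOIN dim_listing AS l ON l.listing_key = f.listing_key
    JOIN dim_date    AS d ON d.date_key = f.date_key
    GROUP BY l.name, d.year, d.month
    ORDER BY l.name, d.year, d.month
    """
).fetchall()
print(reviews_per_month)
```

The fact table holds only keys, so descriptive attributes such as the listing name or the calendar month live in one place and are joined in at query time. A bookings fact table and the location and host dimensions would attach in the same way.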

Contributors

hoperighthere
