AirBnB-Clustering-Analysis

We use several different clustering algorithms, both hierarchical (HC) and non-hierarchical (NHC), to assess Airbnb listings in the Cape Town region. The idea is to cluster properties on a variety of features without using their spatial location.

Introduction

This project aims to uncover trends and hidden relationships from the property listings (data points) in the Airbnb data set through the use of clustering algorithms. The main methods used in this project include hierarchical and non-hierarchical clustering analysis and self-organizing maps.

Data

The data used for this project is the Airbnb property listings data for Cape Town. It includes information about each property, such as location, price, and amenities. The data was scraped from Airbnb.

Methodology

The following methods were used for this project:

  • Hierarchical Clustering Analysis
    • A type of clustering algorithm that creates a tree-like structure, called a dendrogram, to represent the relationships between the data points. In its agglomerative form, it starts with each data point as its own cluster and repeatedly merges the most similar clusters into bigger ones.

  • Non-Hierarchical Clustering Analysis
    • A type of clustering algorithm that does not create a dendrogram to represent the relationships between the data points. Instead, it divides the data into a pre-defined number of clusters based on the similarity between the data points.

  • Self-Organizing Maps
    • A type of artificial neural network used for clustering and dimensionality reduction. It creates a 2-dimensional grid of nodes and trains them to map the high-dimensional data points to the grid, grouping similar data points together.

These methods were used to cluster the properties based on their similarities and find any trends or hidden relationships.
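As a minimal sketch of the first two approaches, the R snippet below clusters a small synthetic data set with both hierarchical clustering and k-means. The feature names (`price`, `bedrooms`, `rating`), the Ward linkage, and `k = 2` are assumptions chosen for illustration; they are not taken from the project's code or data.

```r
# Illustrative only: synthetic stand-in for listing features.
set.seed(42)
listings <- data.frame(
  price    = c(rnorm(20, 800, 100), rnorm(20, 2500, 300)),
  bedrooms = c(rpois(20, 1) + 1, rpois(20, 3) + 1),
  rating   = runif(40, 3.5, 5)
)

# Scale the features so that no single one dominates the Euclidean distances.
features <- scale(listings)

# Hierarchical: build the dendrogram bottom-up, then cut it into k clusters.
k <- 2  # assumed number of clusters
hc <- hclust(dist(features), method = "ward.D2")
hc_labels <- cutree(hc, k = k)

# Non-hierarchical: k-means with a pre-defined number of clusters.
km <- kmeans(features, centers = k, nstart = 25)
km_labels <- km$cluster

# Cross-tabulate the two labelings to see how far they agree.
print(table(hierarchical = hc_labels, kmeans = km_labels))
```

Calling `plot(hc)` on the fitted object draws the dendrogram, which can help when choosing where to cut the tree.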

Results

The results of the clustering analysis showed some interesting trends and hidden relationships in the property listings data. The clustering algorithms were able to group similar properties together in the same wards and uncover patterns in the data. Hierarchical clustering achieved a higher average silhouette width (0.31) than non-hierarchical clustering with the k-means algorithm (0.29). Both values suggest a weak clustering structure, with hierarchical clustering fitting slightly better. Although the silhouette widths are low and leave clear room for improvement, even at this early stage we are able to identify trends and patterns across the Airbnb listings in the Cape Town region.
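For reference, the silhouette width of a point i is s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is its mean distance to the other points in its own cluster and b(i) is its mean distance to the points in the nearest other cluster. Values near 1 indicate a well-separated point and values near 0 indicate a point lying between clusters. The sketch below shows one way to compute the average silhouette width for both methods with the `cluster` package. It runs on synthetic data, so it does not reproduce the 0.31 and 0.29 reported above, and the helper `avg_sil` is a hypothetical name introduced here.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Hypothetical data: two loose groups in two dimensions.
x <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 3), ncol = 2)
)
d <- dist(x)

# Cluster labels from each method, with an assumed k = 2.
hc_labels <- cutree(hclust(d, method = "ward.D2"), k = 2)
km_labels <- kmeans(x, centers = 2, nstart = 25)$cluster

# Average silhouette width for a labeling, given the distance matrix.
avg_sil <- function(labels, d) {
  sil <- silhouette(labels, d)
  mean(sil[, "sil_width"])
}

scores <- c(hierarchical = avg_sil(hc_labels, d),
            kmeans       = avg_sil(km_labels, d))
print(round(scores, 2))
```

Comparing both methods on the same distance matrix keeps the comparison fair, since silhouette widths depend on the distance measure used.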

Conclusions

This project demonstrates the use of clustering algorithms to uncover trends and hidden relationships in the Airbnb property listings data. The results showed some interesting patterns in the data that can be useful for further analysis and decision making.

Getting Started

This project is implemented in the R programming language. To run it, the following software and packages are required:

  • R
  • RStudio
  • tidyverse
  • cluster
  • factoextra

To run this project, clone or download the repository and open the R project file in RStudio. The code and data are included in the repository.
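One possible way to check that the listed packages are available before opening the project is sketched below. The helper `missing_packages` is a hypothetical name used only for this example. The snippet reports what is missing instead of installing anything, so it is safe to run as is.

```r
# Packages listed above as requirements.
pkgs <- c("tidyverse", "cluster", "factoextra")

# Return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(pkgs)
if (length(to_install) > 0) {
  # Print the install command instead of running it.
  message("Run: install.packages(c(",
          paste0('"', to_install, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```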

Contributions

If you want to contribute to this project, feel free to submit a pull request or open an issue in the repository.

Contact

For any questions or queries, please email: [email protected]

Contributors

  • pavsingh7
