Giter VIP home page Giter VIP logo

antoniskl / amsterdam-metro-crowdedness-prediction Goto Github PK

View Code? Open in Web Editor NEW
1.0 3.0 0.0 1022 KB

The aim of this full-stack project is to predict with RandomForest and visualize crowdedness for metro stations of Amsterdam by using external factors.

Batchfile 0.09% Python 68.44% Jupyter Notebook 20.97% CSS 10.34% JavaScript 0.16%
regression random-forest python prediction-model crowded-areas metro dash scikit-learn ticketmaster-api amsterdam

amsterdam-metro-crowdedness-prediction's Introduction

Amsterdam Metro Crowdedness Prediction

The aim of this full-stack project is to predict and visualize crowdedness for 1 week ahead in 3 metro stations of Amsterdam: Centraal Station, Station Zuid and Station Bijlmer ArenA. Except for the number of check-ins & check-outs for each station, external factors are considered such as weather, events, holidays, vacations and COVID-19 pandemic.

Description

The project consists of the following components:

  • instagram-event-scraper → scraper for events from Instagram using instagram's public URLs
  • ticketmaster-event-fetcher → fetcher for events from Ticketmaster API
  • model → back-end and front-end for making predictions
    • data_utils.py → helper functions for data manipulation and logging
    • model_utils.py → functions for model pipeline
    • predictions.ipynb → notebook for running model pipeline
    • predictions_server.py → Flask server for running model pipeline
    • UI → front-end for running model pipeline

Model Pipeline

  1. Read and preprocess data
  2. Merge data of external factors (e.g. weather) with check-ins & check-outs per hour
  3. Interpolate missing check-ins & check-outs by using Random Forest algorithm
  4. Split dataset into training, validation and test set
  5. Create a separate Random Forest model for each of the 3 metro stations
  6. Train each model with historical data (X)
  7. Predict the check-ins & check-outs for each hour for 1 week ahead (Y)

Getting Started

Dependencies

  • Python 3.7+
  • All the libraries included in requirements.txt

Installing

  • Run pip install -r requirements.txt
  • Datasets for check-ins & check-outs (model/data/gvb/ & model/data/gvb-herkomst/), ** weather** (model/data/knmi/) and events (model/data/events/) are expected to be in model/ as per this directory structure:
model
└───data
    └───gvb
    │   └───<year>
    │   │   └───<month_number>
    │   │   │   └───<day_number>
    │   │   │       │   <csv_or_json.gz>
    │   │   │       │   ...
    │   │   │
    │   │   └───...
    │   └───...
    └───gvb-herkomst
    │   └───<year>
    │   │   └───<month_number>
    │   │   │   └───<day_number>
    │   │   │       │   <csv_or_json.gz>
    │   │   │       │   ...
    │   │   │
    │   │   └───...
    │   └───...
    └───knmi
    │   └───knmi
    │   │   └───<year>
    │   │   │   └───<month_number>
    │   │   │   │   └───<day_number>
    │   │   │   │       │   <json>
    │   │   │   │       │   ...
    │   │   │   │
    │   │   │   └───...
    │   │   └───...
    │   └───knmi-observations
    │       └───<year>
    │       │   └───<month_number>
    │       │   │   └───<day_number>
    │       │   │       │   <json>
    │       │   │       │   ...
    │       │   │
    │       │   └───...
    │       └───...
    └───events
        │   events_zuidoost.xlsx
        │
        └───instagram
        │   │   <csv>
        │   │   ...
        │
        └───ticketmaster
            │   <csv>
            │   ...
  • WARNING: For the model to produce valid predictions, check-ins & check-outs (model/data/gvb/ & model/data/gvb-herkomst/) and weather data (model/data/knmi/) should be manually up-to-date

Executing programs

instagram-event-scraper

  • Modify usernames array in scraper.py to include the usernames of the accounts which you want to be scraped
  • Go to instagram-event-scraper/ and run python scraper.py
  • After execution, instagram-event-scraper/events.csv will be updated with the scraped events

ticketmaster-event-fetcher

  • Create ticketmaster-event-fetcher/config.py containing api_key=EXAMPLE where EXAMPLE is a placeholder for your Ticketmaster API key
  • Modify year_to_fetch variable in fetcher.py to fetch events for the year of your choice
  • Go to ticketmaster-event-fetcher/ and run python fetcher.py
  • After execution, a file with format ticketmaster-event-fetcher/events_amsterdam_center_DATE_TIME_UTC.csv will be created with the fetched events

model

  • Using model/predictions.ipynb:
    • Modify config.ini for the model to use the feature configuration of your choice
    • Run model/predictions.ipynb
    • See below bullet point "After execution"
  • Using front-end and back-end server:
    • Go to model/, run python predictions_server.py and wait for the server output to show "Preprocessing finished" and be up
    • Go to model/UI/, run python test.py and wait for the front-end server to be up
    • Open the URL of the front-end server on a browser
    • Choose your desired parameters for the model and press "Submit"
    • After execution
      • If you press to any of the 3 available metro stations in the map, the graph should be updated with the current predictions
      • Each station's folder in model/output/ will be updated with a new file with format prediction_next_week_CURRENT-DATE.csv which will contain the current predictions
      • NOTE: Only if you ran the model using model/predictions.ipynb notebook, then model/output/models_log.csv will be updated with the model's parameters and metrics

Acknowledgments

amsterdam-metro-crowdedness-prediction's People

Contributors

antoniskl avatar ayannaalma avatar priyeshj12 avatar ramondop avatar sirjan16 avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.