Uncertainty Quantification for Traffic Forecasting Using Deep-Ensemble-Based Spatiotemporal Graph Neural Networks

Deep ensemble with simultaneous quantile regression (DESQRUQ) is a spatiotemporal graph neural network (STGNN) with a simultaneous quantile regression (SQR) loss for estimating both aleatoric (data) and epistemic (model) uncertainties. The SQR loss is used to predict the quantiles of the forecast traffic distribution. DeepHyper, a scalable Bayesian-optimization-based hyperparameter search, is used to perform hyperparameter optimization on DESQRUQ. A selected set of high-performing configurations is then used to fit a Gaussian copula model that captures the joint distribution of the hyperparameter configurations. Finally, a set of high-performing configurations is sampled from this distribution and used to train an ensemble of STGNN models.
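
The SQR loss mentioned above is built on the pinball (quantile) loss. The following is a minimal NumPy sketch of that idea, not the training code from this repository. The function names and the `predict` callable are hypothetical, and drawing one random quantile level per sample is an assumption based on the usual formulation of simultaneous quantile regression.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for quantile level(s) tau in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def sqr_loss(y_true, predict, rng):
    """One SQR evaluation: draw a quantile level per sample and score it.

    `predict(tau)` is a hypothetical stand-in for a model that receives the
    quantile levels as an extra input and returns one prediction per sample.
    """
    y_true = np.asarray(y_true, dtype=float)
    tau = rng.uniform(0.0, 1.0, size=y_true.shape)
    return pinball_loss(y_true, predict(tau), tau)
```

With a high quantile level such as 0.9, predicting below the observed value is penalized more heavily than predicting above it, which is what pushes the prediction toward the upper part of the distribution.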

Environment information

The experiments were executed on the Cooley GPU cluster. Cooley has a total of 126 compute nodes; each node has 12 CPU cores and one NVIDIA Tesla K80 dual-GPU card.

OS Login Node: Red Hat Enterprise Linux Server release 7.9 (Maipo)
OS Compute Node: Red Hat Enterprise Linux Server release 7.9 (GNU/Linux 3.10.0-1160.59.1.el7.x86_64)
Python: Miniconda Python 3.8

For more information about the environment, refer to env-sc22.txt, which was generated with the provided SC Author-Kit.

Requirements

torch
scipy>=0.19.0
numpy>=1.12.1
pandas>=0.19.2
pyyaml
statsmodels
tensorflow>=1.3.0
tables
future
mpi4py

Installation

Install Miniconda: conda.io. Then create a Python environment:

conda create -n gcntf2 python=3.8
conda activate gcntf2

Dependencies can be installed using the following command:

pip install -r requirements.txt

Download Data

The traffic data files for Los Angeles (METR-LA) are available at METR-LA. The train, test, and validation data are available at data/METR-LA/{train,val,test}.npz. The adjacency matrix and configuration file are available at METR-LA/sensor_graph/adj_mx.pkl and METR-LA/model/dcrnn_la.yaml.
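
After downloading, it can be useful to check what each .npz split actually contains. The sketch below lists the stored arrays and their shapes without assuming any particular array names; `summarize_npz` is a hypothetical helper, not part of this repository.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz file."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Example usage, assuming the files are in the location given above:
# for split in ("train", "val", "test"):
#     print(split, summarize_npz(f"data/METR-LA/{split}.npz"))
```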

Generate datasets

Run the following commands to create the dataset with 100 hyperparameter configurations:

mkdir data100
mv METR-LA data100
cd data100
mv METR-LA data00
for i in {1..99}; do cp -r data00 "data0$i"; done
python change_yaml.py

Run the following command to create 100 yaml files based on the synthetic hyperparameter configurations. change_yaml.py takes as input synthetic_hyperparameters.csv, which contains the high-performing hyperparameter configurations generated by the Gaussian copula.

python change_yaml.py
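
For readers who want to see how synthetic configurations can be drawn from a Gaussian copula, here is a minimal sketch. It is not the procedure used to produce the CSV file in this repository: the function name is hypothetical, all hyperparameters are assumed to be numeric, and the marginals are assumed to be the empirical distributions of the observed configurations.

```python
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist()
_norm_ppf = np.vectorize(_STD_NORMAL.inv_cdf)  # inverse standard normal CDF
_norm_cdf = np.vectorize(_STD_NORMAL.cdf)      # standard normal CDF


def sample_gaussian_copula(configs, n_samples, rng):
    """Sample new rows that keep the rank dependence between columns.

    configs: array of shape (n_configs, n_hyperparameters), numeric only.
    """
    configs = np.asarray(configs, dtype=float)
    n, d = configs.shape
    # 1. Rank-transform each column into (0, 1), then into normal scores.
    ranks = configs.argsort(axis=0).argsort(axis=0) + 1
    z = _norm_ppf(ranks / (n + 1.0))
    # 2. The correlation of the normal scores describes the dependence.
    corr = np.atleast_2d(np.corrcoef(z, rowvar=False))
    # 3. Draw correlated normal samples and map them to uniforms.
    z_new = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u_new = _norm_cdf(z_new)
    # 4. Map each uniform back through the column's empirical quantiles.
    columns = [np.quantile(configs[:, j], u_new[:, j]) for j in range(d)]
    return np.column_stack(columns)
```

Because the last step uses empirical quantiles, every sampled value stays within the range of the observed configurations. Integer or categorical hyperparameters would need rounding or a separate encoding, which this sketch does not cover.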

Run the experiments

We construct a model ensemble using 100 synthetic hyperparameter configurations. The 100 DCRNN-SQR models, one per synthetic hyperparameter configuration, are trained simultaneously on multiple GPUs.
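
Once the ensemble members are trained, their predictions can be combined into aleatoric and epistemic uncertainty estimates. The sketch below uses the law of total variance, a common choice for deep ensembles; it is an assumption that this matches the repository's post-processing. The function name is hypothetical, and each member is assumed to supply a predictive mean and variance, which would have to be derived from its predicted quantiles.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Split ensemble uncertainty into aleatoric and epistemic parts.

    means, variances: arrays of shape (n_models, ...), one entry per member.
    Returns (ensemble_mean, aleatoric, epistemic).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean(axis=0)  # average predicted data noise
    epistemic = means.var(axis=0)       # disagreement between members
    return means.mean(axis=0), aleatoric, epistemic
```

The total predictive variance under this decomposition is the sum of the two returned parts.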

To submit and run an experiment on the Cooley GPU cluster, use the following command:

qsub -n 50 -t 12:00:00 -A hpcbdsm qsub_100uq.sh

where

-n denotes the number of nodes requested.
-t denotes the allocation time requested (12 hours here, given as HH:MM:SS).
-A denotes the project's name at the ALCF.

Docker image

The docker image is available here:

https://anl.box.com/s/5zs8wu1se4r84npw6w45gdqo7a0wpytj

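To run the image, follow these instructions. If docker run cannot find an image named dcrnn.tar, pass the image name that docker load prints instead: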

docker load < dcrnn.tar
docker run dcrnn.tar

Repository: tanwimallick/desqruq (GitHub)
