servicenow / tactis

TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series, from ServiceNow Research

License: Apache License 2.0

Python 100.00%

deep-learning machine-learning neural-network time-series transformers copulas forecasting time-series-prediction timeseries

tactis's Introduction

TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series

Arjun Ashok, Étienne Marcotte, Valentina Zantedeschi, Nicolas Chapados, Alexandre Drouin. TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series. (Accepted at ICLR 2024)

We introduce a new model for multivariate probabilistic time series prediction, designed to flexibly address a range of tasks including forecasting, interpolation, and their combinations. Building on copula theory, we propose a simplified objective for the recently-introduced transformer-based attentional copulas (TACTiS), wherein the number of distributional parameters now scales linearly with the number of variables instead of factorially. The new objective requires the introduction of a training curriculum, which goes hand-in-hand with necessary changes to the original architecture. We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks, while maintaining the flexibility of prior work, such as seamless handling of unaligned and unevenly-sampled time series.

[Paper]

Installation

You can install the TACTiS-2 model with pip:

pip install tactis

Alternatively, the research version also installs gluonts and pytorchts as dependencies, which are required to replicate the experiments from the paper:

pip install tactis[research]

Note: tactis has so far been tested with Python 3.10.8.

Instructions

With the research version of the code, train.py can be used to train the TACTiS-2 model for a specific dataset. The arguments in train.py can be used to specify the dataset, the training task (forecasting or interpolation), the hyperparameters of the model and a whole range of other training options.

The repository includes notebooks that help with setting up training and evaluation pipelines: random_walk.ipynb demonstrates TACTiS-2 on a simple low-dimensional random walk dataset, and gluon_fred_md_forecasting.ipynb demonstrates how to train and evaluate TACTiS-2 on the FRED-MD dataset used in the paper. Note that the gluon_fred_md_forecasting.ipynb notebook requires GluonTS and PyTorchTS to be installed.

Note

For an implementation of the original version of TACTiS, please see here.

Citing this work

Please use the following BibTeX entry to cite TACTiS-2.

@misc{ashok2023tactis2,
      title={TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series}, 
      author={Arjun Ashok and Étienne Marcotte and Valentina Zantedeschi and Nicolas Chapados and Alexandre Drouin},
      year={2023},
      eprint={2310.01327},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

tactis's Issues

Using Custom Dataset with Different Input Lengths for Model

My input dimensions are [1500, 50, 6] and my label dimensions are [1500, 1]. Can I use this model? If so, could you please explain how to configure it correctly?

Imputation task

Thank you for your valuable contributions! I have some confusion regarding the imputation task. While the provided code showcases the prediction task, the loss calculation in decoder.py appears to generate hist_encoded, pred_encoded, hist_true_x, and pred_true_x using a mask. This seems to imply that the number of missing values is assumed to be the same for every sample in a batch. If the number of missing values in the historical data varies, could you kindly suggest how to adjust the code to accommodate this scenario? Thank you for your patience!

Specify Numpy version in requirements.txt

Hi! The following error is raised in generate_backtesting_datasets() whenever I run a demo that calls this function.

setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (1918,) + inhomogeneous part.

I followed the solution here and downgraded NumPy from 1.24.1 to 1.21.6, which solved the problem. My coworker is using numpy=1.23.4 and can run the demo without issues. It seems this error only occurs with numpy>=1.24.
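
For context, here is a minimal sketch of what triggers this kind of error and one way to sidestep it without downgrading. NumPy 1.24 turned the creation of ragged arrays from nested sequences into a ValueError, whereas older releases only emitted a warning. The helper `to_object_array` is hypothetical and not part of tactis, and the sample series are made up for illustration.

```python
import numpy as np


def to_object_array(sequences):
    """Pack variable-length sequences into a 1-D object array.

    Hypothetical helper, not part of tactis. Filling the array element by
    element avoids NumPy trying to infer a rectangular shape.
    """
    out = np.empty(len(sequences), dtype=object)
    for i, seq in enumerate(sequences):
        out[i] = np.asarray(seq)
    return out


# Made-up series of different lengths (a "ragged" collection).
series = [[1.0, 2.0, 3.0], [4.0, 5.0]]

try:
    # Raises ValueError on NumPy >= 1.24; older versions only emit a warning.
    ragged = np.array(series)
except ValueError as err:
    print(f"NumPy {np.__version__} refused the ragged input: {err}")

packed = to_object_array(series)
print(packed.shape, packed.dtype)
```

Whether this pattern applies inside generate_backtesting_datasets() depends on how the series are collected there, so treat it as a starting point rather than a confirmed fix.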

ModuleNotFoundError: No module named 'pts'

How can I solve it?
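
The README states that the research extra installs PyTorchTS, whose import name is `pts`, so this error may simply mean the base package was installed without that extra. The sketch below checks which optional modules are importable. The function name `missing_modules` is hypothetical, and the list of module names is an assumption based on the dependencies named in the README.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the research dependencies named in the README.
missing = missing_modules(["gluonts", "pts"])
if missing:
    print(f"Missing modules: {missing}. Try: pip install tactis[research]")
else:
    print("All research dependencies are importable.")
```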

Multi input example?

Hi,

Do you have an example in which the model predicts the future values of one input from a dataset containing this input and many other features?

I saw in the paper you tested on kdd-cup, but I only see single input series in your demo/ folder

If you don't have such an example, could you please recommend some changes I can make to try?

It seems that hist_value and pred_value must be of the same dimension, and that's where I'm having trouble

Excellent paper, thank you.

Irregular Sampling

Hello, during the previous talk by @aldro61 at the MS Montreal office, we discussed the irregular sampling portion of the code. Would it be possible for you to add it to the repo? I found it very interesting.
Thank you!

Proof of Property (2) in Theorem 1

Hi, I believe there is an error in the proof of Theorem 1, regarding Property (2).

The inequality between the arithmetic mean and the geometric mean appears to be used incorrectly. “δ ∈ R+ is exactly zero i.i.f. the density estimated by the model is permutation invariant”. The fact that the loss attains its minimum does not imply that the equality holds. The claim is correct only if you assume that the arithmetic mean sums to a constant regardless of the optimisation. Could you help clarify this, in case I am missing something?

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/ivan/Projects/tactis/demo/gluon_solar.ipynb Cell 5 in <cell line: 4>()
      1 history_factor = 2
      2 backtest_id = 4
----> 4 metadata, train_data, test_data = generate_backtesting_datasets("solar_10min", backtest_id, history_factor)

File ~/Projects/tactis/tactis/gluon/dataset.py:305, in generate_backtesting_datasets(name, backtest_id, history_length_multiple, use_cached)
    303 train_data = []
    304 for i, series in enumerate(raw_dataset):
--> 305     train_end_index = _count_timesteps(series["start"], backtest_timestamp, timestep_delta)
    307     s_train = series.copy()
    308     s_train["target"] = series["target"][:train_end_index]

File ~/Projects/tactis/tactis/gluon/dataset.py:117, in _count_timesteps(left, right, delta)
    112 def _count_timesteps(left: pd.Timestamp, right: pd.Timestamp, delta: pd.DateOffset) -> int:
    113     """
    114     Count how many timesteps there are between left and right, according to the given timesteps delta.
    115     If the number if not integer, round down.
    116     """
--> 117     assert right >= left, f"Case where left ({left}) is after right ({right}) is not implemented in _count_timesteps()."
    118     try:
    119         return (right - left) // delta

TypeError: '>=' not supported between instances of 'Timestamp' and 'Period'
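
The traceback shows `_count_timesteps` receiving a `pd.Period` as `left` and a `pd.Timestamp` as `right`, which pandas refuses to compare. One possible workaround, sketched below, is to normalize both values to `pd.Timestamp` before comparing. The helper names `as_timestamp` and `count_timesteps` are hypothetical, the inputs are made up, and using `pd.Timedelta` for the step is a simplification of the `pd.DateOffset` in the original signature.

```python
import pandas as pd


def as_timestamp(value):
    """Convert a pd.Period to the pd.Timestamp at its start; pass Timestamps through."""
    if isinstance(value, pd.Period):
        return value.to_timestamp()
    return pd.Timestamp(value)


def count_timesteps(left, right, delta):
    """Count whole steps of size `delta` between left and right, rounding down."""
    left, right = as_timestamp(left), as_timestamp(right)
    if right < left:
        raise ValueError(f"left ({left}) is after right ({right})")
    return (right - left) // delta


# Made-up inputs mirroring the failing call: a Period start and a Timestamp cutoff.
start = pd.Period("2006-01-01 00:00", freq="10min")
cutoff = pd.Timestamp("2006-01-01 01:00")
steps = count_timesteps(start, cutoff, pd.Timedelta(minutes=10))
print(steps)
```

Converting at the call site, for example by passing `series["start"].to_timestamp()` when the start is a Period, would be an equivalent option under the same assumption.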