
iTransformer

This repo is the official implementation of the paper: iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. It currently includes code implementations for the following tasks:

Multivariate Forecasting: We provide all scripts and datasets for reproducing the forecasting results in this repo.

Boosting Forecasting of Transformers: We are continuously incorporating Transformer variants. If you are interested in how well other Transformer variants perform on forecasting tasks, feel free to contact us.

Generalization on Unseen Variates: iTransformer is demonstrated to generalize well to unseen time series variates, making it a promising backbone for large time series models.

Better Utilization of Lookback Windows: While the vanilla Transformer does not necessarily benefit from a larger lookback window, iTransformer makes better use of enlarged lookback windows.

Efficient Attention and Training Strategy: Efficient attention mechanisms, as well as the feasibility of extrapolating to unseen variates, can be leveraged to reduce complexity when the number of variates is large.

Updates

🚩 News (2023.10) All the scripts for the above tasks in our paper are available in this repo.

🚩 News (2023.10) iTransformer has been included in [Time-Series-Library] and achieves consistent state-of-the-art performance in long-term time series forecasting.

Introduction

🌟 Considering the characteristics of multivariate time series, iTransformer breaks with the conventional model structure without modifying any Transformer modules. Inverting the Transformer is all you need in MTSF (multivariate time series forecasting).

๐Ÿ† iTransformer takes an overall lead in complex time series forecasting tasks and solves several pain points of Transformer modeling extensive time series data.

😊 iTransformer is built by repurposing the vanilla Transformer. We think the "passionate modification" of Transformer components has received too much attention in time series research. Hopefully, future mainstream work can focus more on dataset infrastructure and on the scalability of the Transformer.

Overall Architecture

iTransformer regards independent time series as variate tokens, capturing multivariate correlations with attention and utilizing layer normalization and feed-forward networks to learn better series representations.

And the pseudo-code of iTransformer is as simple as the following:
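(The original pseudo-code figure is not reproduced in this page. Below is a minimal PyTorch sketch of the inverted design; the class and argument names (ITransformerSketch, lookback_len, pred_len) are illustrative assumptions, not the repo's exact API.)

import torch
import torch.nn as nn

class ITransformerSketch(nn.Module):
    def __init__(self, lookback_len, pred_len, d_model=512, n_heads=8, n_layers=2):
        super().__init__()
        # Embed each variate's whole lookback series into a single "variate token"
        self.embed = nn.Linear(lookback_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
        # Self-attention now mixes information across variates, not across time steps
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Project each variate token onto the future horizon
        self.project = nn.Linear(d_model, pred_len)

    def forward(self, x):
        # x: (batch, lookback_len, n_variates)
        x = x.transpose(1, 2)           # the "inversion": (batch, n_variates, lookback_len)
        tokens = self.embed(x)          # (batch, n_variates, d_model)
        tokens = self.encoder(tokens)   # attention captures multivariate correlations
        y = self.project(tokens)        # (batch, n_variates, pred_len)
        return y.transpose(1, 2)        # (batch, pred_len, n_variates)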

Usage

  1. Install PyTorch and other necessary dependencies.

pip install -r requirements.txt

  2. The datasets can be obtained from Google Drive or Tsinghua Cloud.

  3. Train and evaluate the model. We provide scripts for all the above tasks under the folder ./scripts/. You can reproduce the results as in the following examples:

# Task: Multivariate forecasting with iTransformer
bash ./scripts/multivariate_forecast/Traffic/iTransformer.sh

# Task: Compare the performance of Transformer and iTransformer
bash ./scripts/boost_performance/Weather/iTransformer.sh

# Task: Train the model with partial variates, and generalize on the unseen variates
bash ./scripts/variate_generalization/Electricity/iTransformer.sh

# Task: Test the performance on the enlarged lookback window
bash ./scripts/increasing_lookback/Traffic/iTransformer.sh

# Task: Utilize FlashAttention for acceleration (hardware-friendly and almost computationally equivalent to Transformer)
bash ./scripts/efficient_attentions/iFlashTransformer.sh

Main Result of Multivariate Forecasting

We evaluate iTransformer on six challenging multivariate forecasting benchmarks, as well as server-load prediction of Alipay online transactions (generally hundreds of variates, denoted as Dim). iTransformer consistently achieves the lowest prediction errors (MSE/MAE).

Challenging Multivariate Time Series Forecasting Benchmarks (Avg Results)

Online Transaction Load Prediction of Alipay Trading Platform (Avg Results)

General Performance Boosting on Transformers

With the proposed inverted framework, the Transformer and its variants achieve significant performance improvements, demonstrating the generality of the iTransformer approach and the feasibility of benefiting from efficient attention mechanisms.

Generalization on Unseen Variates

Technically, iTransformer can forecast an arbitrary number of variates during inference. We also dive into this capability and find that iTransformer achieves smaller generalization errors than channel independence when only partial variates are used for training, as illustrated by the sketch below.
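A rough illustration, reusing the hypothetical ITransformerSketch defined in the architecture sketch above: because every variate becomes one token, the same weights accept any number of variates at inference time.

# Continues the sketch above (torch and ITransformerSketch already defined).
model = ITransformerSketch(lookback_len=96, pred_len=24)

x_train = torch.randn(8, 96, 20)    # a batch with 20 variates, as seen in training
x_unseen = torch.randn(8, 96, 50)   # a batch with 50 variates, never seen in training

print(model(x_train).shape)    # torch.Size([8, 24, 20])
print(model(x_unseen).shape)   # torch.Size([8, 24, 50]) -- same parameters, more variate tokens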

Better Utilization of Lookback Windows

While previous Transformers do not necessarily benefit from longer historical observations, iTransformer shows a surprising improvement in forecasting performance as the length of the lookback window increases.

Model Analysis

Benefiting from inverted Transformer modules:

  • (Left) Inverted Transformer modules learn better time series representations (higher CKA similarity), which is favored by time series forecasting.
  • (Right) The inverted self-attention module learns interpretable multivariate correlations.

Model Ablations

iTransformer, which applies attention over the variate dimension and the feed-forward network over the temporal dimension, generally achieves the best performance. In contrast, the vanilla Transformer (the third row) performs the worst among these designs, indicating a mismatch of module responsibilities in the conventional architecture.

Model Efficiency

iTransformer improves efficiency over the previous channel-independence mechanism. We further propose a training strategy for multivariate series that takes advantage of its variate generalization ability: only a sampled ratio of the variates is trained on in each batch. While the performance (Left) remains stable across sampled ratios, the memory footprint (Right) of the training process can be reduced significantly. A minimal sketch of this strategy follows.
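A minimal sketch of that strategy, under the same assumptions as the earlier code (the helper name train_step and the sample_ratio argument are illustrative): each batch trains on a random subset of variates, and the full variate set is fed to the same model at test time.

import torch
import torch.nn as nn

def train_step(model, optimizer, x, y, sample_ratio=0.5):
    # x: (batch, lookback_len, n_variates), y: (batch, pred_len, n_variates)
    n_var = x.shape[-1]
    k = max(1, int(n_var * sample_ratio))
    idx = torch.randperm(n_var)[:k]   # random subset of variates for this batch
    pred = model(x[..., idx])         # fewer variate tokens -> smaller attention maps and memory
    loss = nn.functional.mse_loss(pred, y[..., idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()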

Citation

If you find this repo helpful, please cite our paper.

@article{liu2023itransformer,
  title={iTransformer: Inverted Transformers Are Effective for Time Series Forecasting},
  author={Liu, Yong and Hu, Tengge and Zhang, Haoran and Wu, Haixu and Wang, Shiyu and Ma, Lintao and Long, Mingsheng},
  journal={arXiv preprint arXiv:2310.06625},
  year={2023}
}

Future Work

  • iTransformer for other time series tasks.
  • Integrating more Transformer variants.
  • iTransformer Scalability.

Acknowledgement

We greatly appreciate the following GitHub repos for their valuable code and efforts.

Contact

If you have any questions or want to use the code, feel free to contact:
