
Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

Python 3.10 PyTorch 2.1.1 numpy 1.24.1 pandas 2.0.3 optuna 3.6.1 einops 0.7.0

This is the official implementation of Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting.

Key Designs of the proposed Bi-Mamba+🔑

🤠 Exploring the validity of Mamba in long-term time series forecasting (LTSF).

🤠 Proposing a unified architecture for channel-independent and channel-mixing tokenization strategies based on a novel series-relation-aware (SRA) decider.
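
As a rough illustration (not the paper's exact procedure), an SRA-style decider can be sketched as a rule that measures pairwise Spearman correlations between channels and picks the channel-mixing strategy when enough pairs are strongly correlated; the threshold values below are assumptions for demonstration only:

```python
import numpy as np

def _rank(x):
    # Convert each series to ranks; Spearman correlation is
    # Pearson correlation computed on ranks (assuming no ties).
    return np.argsort(np.argsort(x, axis=-1), axis=-1).astype(float)

def sra_decide(series, threshold=0.6):
    """Illustrative decider. `series` has shape (num_channels, length).
    Returns 'channel-mixing' if at least half of the channel pairs have
    Spearman correlation above `threshold`, else 'channel-independent'.
    Both cutoffs are hypothetical, not the paper's exact values."""
    r = np.corrcoef(_rank(series))     # pairwise Spearman matrix
    iu = np.triu_indices_from(r, k=1)  # unique channel pairs
    frac = np.mean(r[iu] > threshold)
    return "channel-mixing" if frac >= 0.5 else "channel-independent"
```

Monotonically related channels (e.g. a series and its scaled copy) trigger channel-mixing, while unrelated or anti-correlated channels fall back to channel-independent tokenization.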

🤠 Proposing Mamba+, an improved Mamba block specifically designed for LTSF to preserve historical information in a longer range.

🤠 Introducing a bidirectional Mamba+ in a patching manner. The model can capture intra-series or inter-series dependencies at a finer granularity.
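
A minimal sketch of the patching step (the patch length and stride here are illustrative placeholders, not the repository's defaults):

```python
import numpy as np

def make_patches(series, patch_len=16, stride=8):
    """Split a 1-D series into overlapping patches of length `patch_len`
    taken every `stride` steps; each patch becomes one token."""
    n = (len(series) - patch_len) // stride + 1
    return np.stack([series[i * stride : i * stride + patch_len]
                     for i in range(n)])
```

In a bidirectional setup, the forward branch consumes the patch sequence as-is while the backward branch consumes it in reversed order, e.g. `make_patches(x)[::-1]`.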

Architecture of Bi-Mamba+

Architecture of Bi-Mamba+ encoder

Datasets

We test Bi-Mamba+ on 8 real-world datasets: (a) Weather, (b) Traffic, (c) Electricity, (d) ETTh1, (e) ETTh2, (f) ETTm1, (g) ETTm2 and (h) Solar.

All datasets are widely used and are publicly available at https://github.com/zhouhaoyi/Informer2020 and https://github.com/thuml/Autoformer.

Results✅

Main Results

Compared to iTransformer, the current SOTA Transformer-based model, Bi-Mamba+ reduces MSE by 4.85% and MAE by 2.70% on average. Compared to S-Mamba, the improvements are 3.85% and 2.75%.

main results

Ablation Study

We calculate the average MSE and MAE results of (i) removing the SRA decider (w/o SRA-I & w/o SRA-M); (ii) removing the bidirectional design (w/o Bi); (iii) replacing Mamba+ with Mamba (Bi-Mamba); (iv) removing the residual connection (w/o Residual); (v) S-Mamba; and (vi) PatchTST. The results show that the SRA decider, the added forget gate, the bidirectional design and the residual connection are all effective.

ablation

Model Efficiency

We conduct the following experiments to comprehensively evaluate model efficiency in terms of (a) prediction accuracy, (b) memory usage and (c) training speed. We set $L=96,H=96$ as the forecasting task and use $Batch=32$ for ETTh1 and Traffic. Bi-Mamba+ strikes a good balance among prediction performance, training speed and memory usage.

Efficiency comparison on ETTh1 and Traffic

Getting Started🛫

  1. Install the required packages (Linux only)

Run pip install -r requirements.txt to install the necessary Python packages.

Tips for installing mamba-ssm and our proposed mamba_plus: run the following commands in conda, strictly in order:

conda create -n your_env_name python=3.10.13
conda activate your_env_name
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install packaging
cd causal-conv1d;CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .;cd ..
cd mamba_plus;MAMBA_FORCE_BUILD=TRUE pip install .;cd ..

These installation instructions were verified as of April 24, 2024.

We strongly recommend doing all of this on Linux, or on WSL2 under Windows! The CUDA version should be at least 11.8 (possibly 11.6; newer releases seem to allow lower CUDA versions).

The tips listed here force local compilation of causal-conv1d and mamba_plus. mamba_plus is the modified hardware-aware parallel computing algorithm of our proposed Mamba+. If you want to run S-Mamba or other Mamba-based models, simply run cd mamba;pip install . or pip install mamba-ssm in a new Python environment to install the original mamba_ssm of Mamba. Please use separate Python environments for mamba_plus and mamba_ssm, because the selective_scan program of one may be overwritten by the other.

Taking CUDA 11.8 as an example, there should be a directory named 'cuda-11.8' in /usr/local. Make sure CUDA is on your path. For bash, run vi ~/.bashrc and make sure the following lines exist:

export CPATH=/usr/local/cuda-11.8/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.8/bin:$PATH

After saving the profile, start a new bash session (or source ~/.bashrc) so your shell picks up the new environment paths.

Of course, if you do not want to force local compilation, these paths are not necessary.

  2. Run the script: find the model you want to run in /scripts and choose the dataset you want to use.

Run sh ./scripts/{model}/{dataset}.sh 1 to start training.

Run sh ./scripts/{model}/{dataset}.sh 0 to start testing.

Run sh ./scripts/{model}/{dataset}.sh -1 to start predicting.
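
The first script argument selects the run mode. A hypothetical sketch of the dispatch logic (the actual scripts in /scripts pass additional model and dataset flags):

```shell
# Hypothetical dispatcher: 1 = train, 0 = test, -1 = predict.
run_mode() {
  case "$1" in
    1)  echo "train"   ;;
    0)  echo "test"    ;;
    -1) echo "predict" ;;
    *)  echo "usage: {1|0|-1}" >&2; return 1 ;;
  esac
}
```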

We provide trained models in checkpoints; currently the Bi-Mamba+ checkpoint for Weather is offered.

Acknowledgements🙏

We are grateful for the following awesome works when implementing Bi-Mamba+:

Mamba

iTransformer

Contributors

aoboliang00208, leopold2333

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.