SPMamba: State-space model is all you need in speech separation

🔥 News

[2024-05-09] Update SPMamba WHAM! Result: SI-SNRi=17.4 dB, SDRi=17.6 dB

[2024-04-23] Update SPMamba MACs: 238.21 G/s using code

[2024-04-18] Update SPMamba WSJ0-2Mix Result: SI-SNRi=22.5 dB, SDRi=22.7 dB

Introduction

SPMamba is a speech separation model that combines Mamba with the TF-GridNet architecture. By replacing TF-GridNet's conventional bidirectional LSTM with a bidirectional Mamba model, SPMamba improves separation quality while requiring less computation than TF-GridNet.

SPMamba is built on the open-source ESPnet framework. This repository provides the training script and configuration file needed to train the model yourself, whether you are a researcher, developer, or enthusiast in speech processing.

Technology Stack

This repository is implemented using the ESPnet framework, a comprehensive platform for speech processing. SPMamba integrates Mamba, a state-space model, into the TF-GridNet architecture. On the self-built dataset reported under Performance, this raises SI-SNRi from 12.81 dB (TF-GridNet) to 15.33 dB while reducing MACs from 445.56 G/s to 238.21 G/s.
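To make the idea of bidirectional state-space processing concrete, here is a minimal sketch in plain Python. It is not the SPMamba implementation and not Mamba itself: it uses a toy scalar, non-selective linear recurrence, shares one set of hypothetical parameters (`a`, `b`, `c`) across both directions, and merges the two directions by summation. All of these are assumptions made for illustration only.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state-space recurrence over a sequence.

    h[t] = a * h[t-1] + b * x[t],  y[t] = c * h[t],  with h[-1] = 0.
    """
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t  # update the hidden state
        y.append(c * h)      # read out the current state
    return y


def bidirectional_ssm(
    x: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0
) -> List[float]:
    """Sum a forward scan and a time-reversed scan (illustrative merge rule)."""
    forward = ssm_scan(x, a, b, c)
    # Scan the reversed sequence, then flip the result back to the original order.
    backward = ssm_scan(x[::-1], a, b, c)[::-1]
    return [f + g for f, g in zip(forward, backward)]
```

With an impulse in the middle of the sequence, `bidirectional_ssm([0.0, 1.0, 0.0])` produces a response on both sides of the impulse, which shows that each output position depends on past and future inputs. A single forward scan would only respond at and after the impulse.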

Installation

Clone the repository, then create and activate the conda environment:

git clone https://github.com/JusperLee/SPMamba.git && cd SPMamba
conda env create -f look2hear.yml
conda activate look2hear

Usage

To train the SPMamba model, run the following command:

python audio_train.py --conf_dir=configs/spmamba.yml

Performance

The table below compares SPMamba with baseline speech separation models on our own private (self-built) dataset.

Model        SDR (dB)  SDRi (dB)  SI-SNR (dB)  SI-SNRi (dB)  Params (M)  MACs (G/s)
Conv-TasNet      7.58       7.69         6.71          6.89        5.62       10.23
DualPathRNN      5.76       5.87         4.88          5.06        2.72       85.32
SudoRM-RF        7.59       7.70         6.66          6.84        2.72        4.60
A-FRCNN          9.53       9.64         8.58          8.76        6.13       81.20
TDANet           9.93      10.14         8.95          9.21        2.33        9.13
BSRNN           12.64      12.75        12.04         12.23       25.97       98.69
TF-GridNet      13.59      13.70        12.62         12.81       14.43      445.56
SPMamba         16.01      16.14        15.20         15.33        6.14      238.21
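For readers unfamiliar with the SI-SNR and SI-SNRi columns, the following is a minimal sketch of the standard scale-invariant SNR definition in plain Python. It is an illustration under stated assumptions, not the evaluation code used in this repository; the function names `si_snr` and `si_snr_improvement` and the `eps` stabilizer are hypothetical choices made here.

```python
import math
from typing import Sequence


def si_snr(estimate: Sequence[float], target: Sequence[float], eps: float = 1e-8) -> float:
    """Scale-invariant signal-to-noise ratio in dB (standard definition)."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)
    # Remove the mean from both signals.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [v - mean_e for v in estimate]
    t = [v - mean_t for v in target]
    # Project the estimate onto the target to get the scaled target component.
    dot = sum(ei * ti for ei, ti in zip(e, t))
    scale = dot / (sum(ti * ti for ti in t) + eps)
    s_target = [scale * ti for ti in t]
    # Whatever is left after the projection is treated as noise.
    e_noise = [ei - si for ei, si in zip(e, s_target)]
    num = sum(v * v for v in s_target)
    den = sum(v * v for v in e_noise) + eps
    return 10.0 * math.log10(num / den + eps)


def si_snr_improvement(
    estimate: Sequence[float], target: Sequence[float], mixture: Sequence[float]
) -> float:
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged. The improvement variant reports how much better the separated signal is than the raw mixture.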

SPMamba in Self-built, WSJ0 and WHAM!

Self-built

[Image: SPMamba on the self-built dataset]

WSJ0

[Image: SPMamba on WSJ0]

License

SPMamba is licensed under the Apache License 2.0. For more details, see the LICENSE file in the repository.

Acknowledgements

SPMamba is developed by the Look2Hear team at Tsinghua University. We would like to thank the ESPnet team for their contributions to the open-source community and for providing a solid foundation for our work.

Citation

If you use SPMamba in your research or project, please cite the following paper:

@article{li2024spmamba,
  title={SPMamba: State-space model is all you need in speech separation},
  author={Li, Kai and Chen, Guo},
  journal={arXiv preprint arXiv:2404.02063},
  year={2024}
}

Contact

For any questions or feedback regarding SPMamba, feel free to reach out to us via email: [email protected]
