mjhydri / 1D-StateSpace

This repository contains the implementation of an efficient joint beat, downbeat, tempo, and meter tracking system using a compact 1D probabilistic state space and a jump-back reward technique (ICASSP 2022).

License: MIT License

Topics: beats, beat-detection, beat-tracking, beat-time, downbeat, tempo, tempo-tracking, tempo-estimation, 1d-statespace, bar-detection


A Novel 1D State Space for Efficient Music Rhythmic Analysis

An implementation of the probabilistic jump-reward semi-Markov inference model for music rhythmic analysis, leveraging the proposed 1D state space.

This repository contains the source code and demo videos of a joint music rhythmic analyzer system using the 1D state space and jump-back reward technique proposed in our ICASSP 2022 paper. The implementation performs music beat, downbeat, tempo, and meter tracking jointly and in a causal fashion.

The model first transforms the waveform into the spectral domain and feeds the spectral features into one of the pre-trained BeatNet models to obtain beat/downbeat activations. The activations are then passed to the jump-reward inference model, which infers beats, downbeats, tempo, and meter.
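The pipeline above can be sketched end to end. The code below is a conceptual illustration only: a crude spectral-flux function stands in for the pre-trained BeatNet models, and simple thresholded peak picking stands in for the jump-reward inference; none of it reflects the package's actual internals.

```python
import numpy as np

def to_spectral_domain(waveform, frame_len=1024, hop=441):
    """Slice the waveform into overlapping frames and take magnitude spectra."""
    n_frames = 1 + (len(waveform) - frame_len) // hop
    frames = np.stack([waveform[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))

def beat_activations(spectra):
    """Stand-in for a pre-trained BeatNet model: spectral flux serves as a
    crude beat/downbeat activation (hypothetical, for illustration only)."""
    flux = np.maximum(np.diff(spectra, axis=0), 0.0).sum(axis=1)
    act = flux / (flux.max() + 1e-9)
    return np.stack([act, 0.5 * act], axis=1)  # columns: beat, downbeat

def infer_beats(activations, fps=100, threshold=0.5):
    """Stand-in for the jump-reward inference: thresholded peak picking."""
    a = activations[:, 0]
    peaks = (a[1:-1] > threshold) & (a[1:-1] >= a[:-2]) & (a[1:-1] > a[2:])
    return (np.flatnonzero(peaks) + 1) / fps  # frame index -> seconds

# Demo on 2 s of a 120 BPM click train at 44.1 kHz (a click every 0.5 s).
sr = 44100
t = np.arange(2 * sr)
wave = np.where(t % (sr // 2) < 200, 1.0, 0.0)
beats = infer_beats(beat_activations(to_spectral_domain(wave)))
print(np.round(beats, 2))
```

In the real system the neural activations and the probabilistic inference are far more robust than this sketch; the point is only the data flow: waveform, spectra, activations, beat times.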

System Input:

Raw audio waveform

System Output:

An array with one row per beat and four columns holding, respectively, the beat time, the downbeat flag, the local tempo, and the local meter, with the following shape: numpy_array(num_beats, 4).

Installation Command:

Approach #1: Installing binaries from PyPI:

pip install jump-reward-inference

Approach #2: Installing directly from the Git repository:

pip install git+https://github.com/mjhydri/1D-StateSpace
  • Note that with either approach, all dependencies and required packages are installed automatically, except PyAudio, which cannot be installed that way. PyAudio is a Python binding for PortAudio that handles audio streaming.

If PyAudio is not installed on your machine, download a wheel appropriate for your machine from here. Then navigate to the file location on the command line and use the following command to install the wheel file locally:

pip install <Pyaudio_file_name.whl>   

Usage Example:

from jump_reward_inference.joint_tracker import joint_inference


estimator = joint_inference(1, plot=True) 

output = estimator.process("path/to/music_file")
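The returned array can then be sliced by column. Below is a minimal post-processing sketch, assuming the documented numpy_array(num_beats, 4) layout and that a value of 1 in the second column marks downbeats; the numbers themselves are made up for illustration:

```python
import numpy as np

# Hypothetical output in the documented numpy_array(num_beats, 4) format:
# columns are beat time (s), downbeat flag (1 = downbeat, 2 = other beat),
# local tempo (BPM), and local meter.
output = np.array([
    [4.02, 1., 120., 4.],
    [4.52, 2., 120., 4.],
    [5.02, 2., 118., 4.],
    [5.52, 2., 120., 4.],
    [6.02, 1., 122., 4.],
])

beat_times = output[:, 0]                      # every beat time
downbeat_times = output[output[:, 1] == 1, 0]  # rows flagged as downbeats
median_tempo = float(np.median(output[:, 2]))  # robust tempo summary
meter = int(output[0, 3])                      # meter at the first beat
print(downbeat_times, median_tempo, meter)
```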

Video Tutorial:

1: In this tutorial, we explain the proposed 1D state space and the mechanism of the jump-back reward technique.

Tutorial

Video Demos:

This section demonstrates the system performance for several music genres. Each demo comprises four plots that are described as follows:

  • The first plot: 1D state space for music beat and tempo tracking. Each bar represents the posterior probability of the corresponding state at each time frame.
  • The second plot: The jump-back reward vector for the corresponding beat states.
  • The third plot: 1D state space for music downbeat and meter tracking.
  • The fourth plot: The jump-back reward vector for the corresponding downbeat states.
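The interplay between the two plot types can be illustrated with a toy filter: states index the position within a beat interval, probability mass advances one state per frame, and a strong beat activation rewards a jump of mass from later states back to the beat state. This is a simplified illustration of the idea, not the paper's actual inference rule:

```python
import numpy as np

def filter_step(post, act, reward=0.5):
    """One frame of a toy 1D state-space filter with a jump-back reward.

    States index the position within a beat interval; mass advances one
    state per frame.  A strong activation `act` rewards a jump of mass
    from later states back to the beat state (state 0).  Simplified
    illustration only, not the paper's exact inference rule.
    """
    n = post.size
    pred = np.roll(post, 1)               # advance every state by one frame
    jump = reward * act * pred[1:].sum()  # mass pulled back by the reward
    pred[1:] *= 1.0 - reward * act
    pred[0] += jump
    # Observation model: the beat state is likely when the activation is
    # high, the other states when it is low.
    lik = np.where(np.arange(n) == 0, 0.1 + act, 1.0 - 0.9 * act)
    post = pred * lik
    return post / post.sum()

# Simulate a 10-frame beat period with an activation pulse on each beat.
n_states, n_frames = 10, 40
post = np.full(n_states, 1.0 / n_states)
for t in range(n_frames):
    act = 0.9 if t % n_states == 0 else 0.05
    post = filter_step(post, act)
print(np.argmax(post))  # position within the beat interval after the run
```

Run frame by frame, the posterior (the bars in the first and third plots) concentrates on one state that sweeps through the interval, while the activation-driven jump term plays the role of the jump-back reward vector in the second and fourth plots.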

1: Music Genre: Pop

Easy song

2: Music Genre: Country

Easy song

3: Music Genre: Reggae

Easy song

4: Music Genre: Blues

Easy song

5: Music Genre: Classical

Easy song

Demos Discussion:

1- As the demo videos suggest, the system infers multiple music rhythmic parameters, including beat, downbeat, tempo, and meter, jointly and in an online fashion using very compact 1D state spaces and the jump-back reward technique. The system performs well across different music genres. However, the process is relatively more straightforward for genres such as pop and country, thanks to their rich percussive content, solid attacks, and simpler rhythmic structures. In contrast, it is more challenging for genres with a poor percussive profile, longer attack times, and more complex rhythmic structures, such as classical music.

2- Since both the neural networks and the inference model are designed for online/real-time applications, causality constraints are applied and future data is not accessible. As a result, the jump-back weights are weak initially and become stronger over time.

3- Since a longer listening time is required to infer the higher hierarchies, i.e., downbeat and meter, their results are less confident than those of the lower hierarchies, i.e., beat and tempo, within the very first few seconds; however, they become accurate after observing a full bar period.

Acknowledgement

Many thanks to the Pandora/SiriusXM Inc. research team for making it possible to publish the project's source code. The Librosa and Madmom libraries are utilized to load the raw audio and extract the input features, respectively; many thanks for their great work. This work has been partially supported by National Science Foundation grant 1846184.

arXiv 2111.00704

References:

M. Heydari, M. McCallum, A. Ehmann and Z. Duan, "A Novel 1D State Space for Efficient Music Rhythmic Analysis", In Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2022.

M. Heydari, F. Cwitkowitz, and Z. Duan, "BeatNet: CRNN and particle filtering for online joint beat, downbeat and meter tracking," in Proc. of the 22nd Intl. Conf. on Music Information Retrieval (ISMIR), 2021.

M. Heydari and Z. Duan, “Don’t Look Back: An online beat tracking method using RNN and enhanced particle filtering,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2021.


1d-statespace's Issues

Results are off, what am I missing?

Hi, I installed the package and ran it to reproduce the GTZAN evaluation, and got fairly bad results. I also ran it on the debug sample 808kick120bpm.mp3 from the BeatNet repository and got the outputs below.
I get:

# Ex: RUN 1
array([[  4.02,   1.  , 130.  ,   4.  ],
       [  4.52,   2.  , 115.  ,   4.  ],
       [  5.02,   2.  , 125.  ,   4.  ],
       [  5.54,   2.  , 111.  ,   4.  ],
       [  6.04,   1.  , 115.  ,   4.  ],
       [  7.14,   2.  , 115.  ,   4.  ],
       [  8.24,   2.  , 120.  ,   4.  ],
       [  9.34,   2.  , 115.  ,   4.  ]])
       
# Ex: RUN 2
array([[  4.02,   1.  , 107.  ,   4.  ],
       [  4.52,   2.  , 115.  ,   4.  ],
       [  5.02,   2.  , 125.  ,   4.  ],
       [  6.12,   2.  , 125.  ,   4.  ],
       [  7.22,   1.  , 107.  ,   4.  ],
       [  8.32,   1.  , 125.  ,   4.  ],
       [  9.42,   1.  , 107.  ,   4.  ]])       

Beat times are clearly off/missing, and the results actually differ on each run. I don't know what is wrong; has anyone dug into the code and found a quick fix? Thanks in advance!

incorrect process result

I processed the sample 808kick120bpm.mp3 from BeatNet like this:
estimator = joint_inference(1, plot=False)
output = estimator.process('808kick120bpm.mp3')
print(output)

and got this result:

[[ 4.02   1.   200.    4. ]
 [ 4.52   2.   125.    4. ]
 [ 5.62   2.   125.    4. ]
 [ 6.72   2.    60.    4. ]
 [ 7.82   1.   120.    4. ]
 [ 8.92   2.    60.    4. ]
 [ 9.52   1.    97.    4. ]]

I think this is incorrect. How can I get the correct result? Thank you.

Tempo off by 5 consistently

Hi Mojtaba,

I was trying out your package but found that the reported tempo is consistently off by 5. The easiest way to test this is with 808kick120bpm.mp3 from the BeatNet package, though I found the same thing with another music sample. BeatNet itself reports the correct tempo.

Any idea what might cause this?

Best,
Alex
