fftea's Introduction

fftea

A simple and efficient Fast Fourier Transform (FFT) library.

FFT converts a time domain signal to the frequency domain, and back again. This is useful for all sorts of applications:

  • Filtering or synthesizing audio
  • Compression algorithms such as JPEG and MP3
  • Computing a spectrogram (most AI applications that analyze audio actually do visual analysis of the spectrogram)
  • Convolutions, such as reverb filters for audio, or blurring filters for images

This library supports FFT of real or complex arrays of any size. It also includes some related utilities, such as windowing functions, STFT, inverse FFT, and circular and linear convolution.

Usage

Running a basic real-valued FFT:

import 'package:fftea/fftea.dart';

List<double> myData = ...;

final fft = FFT(myData.length);
final freq = fft.realFft(myData);

freq is a Float64x2List representing a list of complex numbers. See ComplexArray for helpful extension methods on Float64x2List.
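
For example, you can take the magnitudes of those complex bins, or run an inverse FFT to get back to the time domain (a minimal sketch; magnitudes comes from the ComplexArray extension, and realInverseFft is assumed from this package's API docs):

import 'package:fftea/fftea.dart';

final myData = List<double>.generate(64, (i) => i.toDouble());

final fft = FFT(myData.length);
final freq = fft.realFft(myData);

// Magnitude of each complex frequency bin.
final mags = freq.magnitudes();

// Back to the time domain (assuming realInverseFft, per the package docs).
final roundTrip = fft.realInverseFft(freq);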

Running an STFT to calculate a spectrogram:

import 'package:fftea/fftea.dart';

List<double> audio = ...;

final chunkSize = 1234;
final stft = STFT(chunkSize, Window.hanning(chunkSize));

final spectrogram = <Float64List>[];
stft.run(audio, (Float64x2List freq) {
  spectrogram.add(freq.discardConjugates().magnitudes());
});

The result of a real valued FFT is about half redundant data, so discardConjugates removes that data from the result (a common practice for spectrograms):

[sum term, ...terms..., nyquist term, ...conjugate terms...]
 ^----- These terms are kept ------^     ^- Discarded -^

Then magnitudes discards the phase data of the complex numbers and just keeps the amplitudes, which is usually what you want for a spectrogram.
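
As a concrete example of the sizes involved: an 8 sample real input gives 8 complex terms, and discardConjugates keeps the first 5 of them, ie N / 2 + 1 (a quick sketch):

import 'package:fftea/fftea.dart';

final fft = FFT(8);
final freq = fft.realFft([1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]);
final kept = freq.discardConjugates();  // kept.length == 5, ie 8 / 2 + 1.
final spectrum = kept.magnitudes();     // 5 amplitude values.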

If you want to know the frequency of one of the elements of the spectrogram, use stft.frequency(index, samplingFrequency), where samplingFrequency is the frequency that audio was recorded at, eg 44100. The maximum frequency (aka the Nyquist frequency) of the spectrogram will be stft.frequency(chunkSize / 2, samplingFrequency).
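
For example, continuing the STFT snippet above, you could report the loudest frequency in each chunk (a minimal sketch; the peak search is plain Dart, and only stft.frequency and the ComplexArray methods come from this library):

const samplingFrequency = 44100.0;

stft.run(audio, (Float64x2List freq) {
  final magnitudes = freq.discardConjugates().magnitudes();
  var peak = 0;
  for (var i = 1; i < magnitudes.length; ++i) {
    if (magnitudes[i] > magnitudes[peak]) peak = i;
  }
  // Frequency, in Hz, of the strongest bin in this chunk.
  print(stft.frequency(peak.toDouble(), samplingFrequency));
});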

See the example for more detailed usage.

Performance Tips

If you need to squeeze every last CPU cycle out of your FFT, try these tips:

  • Construct your FFT object once, and reuse it, rather than constructing it each time you want to do an FFT.
  • Avoid unnecessary array allocations. If you need to do many FFTs, try reusing the same input array, and just overwrite the data. All the FFT methods have documentation stating whether they allocate any new arrays.
  • If you're doing something like: input data source -> manually convert to a List<double> -> pass to FFT.realFft, this is a bit inefficient. Instead do: input data source -> manually convert to a Float64x2List -> pass to FFT.inPlaceFft. This eliminates an array allocation and conversion (see the sketch after this list).
  • Try to use power of two sized arrays. For example, if you're computing a spectrogram, you can usually just round up the chunk size to the next power of two. Or if you're filtering some audio, try padding your input with silence until it's a power of two, and then trimming the silence afterwards.
  • Try to use smaller FFTs if possible. For example, if you're doing STFT, smaller chunk sizes are faster. It's faster to do two FFTs of size N than one FFT of size 2N.
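
For example, here's a minimal sketch of that reuse pattern (processChunk is a hypothetical callback, and the exact in-place conversion details are described in the API docs):

import 'dart:typed_data';
import 'package:fftea/fftea.dart';

// Construct once and reuse, rather than creating FFT(1024) for every chunk.
final fft = FFT(1024);
final buffer = Float64x2List(1024);

void processChunk(List<double> samples) {
  // Copy the samples straight into the reused complex buffer, instead of
  // allocating a new List<double> for realFft to convert.
  for (var i = 0; i < samples.length; ++i) {
    buffer[i] = Float64x2(samples[i], 0.0);
  }
  fft.inPlaceFft(buffer);
  // buffer now holds the frequency domain data for this chunk.
}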

Technical Details

Fast Fourier Transform isn't really a single algorithm. It's a family of algorithms that all try to perform the Discrete Fourier Transform (DFT) in O(n*log(n)) time (a naive implementation of DFT is O(n^2)), where n is the size of the input array. These algorithms differ mainly by what kind of array sizes they can handle. For example, some FFT algorithms can only handle power of two sized arrays, while others can only handle prime number sized arrays.
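
For reference, this is roughly what a naive O(n^2) DFT looks like (purely illustrative Dart, not how this library computes it):

import 'dart:math';
import 'dart:typed_data';

// Naive O(n^2) DFT of a real signal: out[k] = sum over t of x[t] * e^(-2*pi*i*k*t/n).
Float64x2List naiveDft(List<double> x) {
  final n = x.length;
  final out = Float64x2List(n);
  for (var k = 0; k < n; ++k) {
    var re = 0.0;
    var im = 0.0;
    for (var t = 0; t < n; ++t) {
      final angle = -2 * pi * k * t / n;
      re += x[t] * cos(angle);
      im += x[t] * sin(angle);
    }
    out[k] = Float64x2(re, im);
  }
  return out;
}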

Most FFT libraries only support power of two arrays. The reason for this is that it's relatively easy to implement an efficient FFT algorithm for power of two sized arrays, using the Cooley-Tukey algorithm. To handle all array sizes, you need a patchwork of different implementations for different kinds of sizes.

In general, the Cooley-Tukey algorithm actually works for any non-prime array size, by breaking it down into its prime factors. Powers of two just happen to be particularly fast and easy to implement.

This library handles arbitrary sizes by using Cooley-Tukey to break the size into its prime factors, and then using Rader's algorithm to handle large prime factors, or a naive O(n^2) implementation which is faster for small factors.

There's also a special Radix2FFT implementation for power of two FFTs that is much faster than the other implementations.

Rader's algorithm handles a prime numbered size, n, by transforming it into an FFT of size n - 1, which is non-prime, so it can be handled by Cooley-Tukey. Alternatively, the n - 1 FFT can be zero padded up to a power of two, which is usually faster because the special Radix2FFT can be used. See the primePaddingHeuristic function.

There are also a few special implementations for handling fixed sized FFTs of very small sizes. There's never really a practical use case for an FFT of size 2 or 3, but these form the base cases of Cooley-Tukey, and make larger FFTs much faster.
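
For illustration, the size-2 base case is just a single "butterfly", a sum and a difference with no twiddle factors (a sketch of the math only, not the actual code in lib/impl.dart):

import 'dart:typed_data';

// Size-2 DFT: X0 = x0 + x1, X1 = x0 - x1 (complex values stored as Float64x2).
void fft2(Float64x2List x) {
  final a = x[0];
  final b = x[1];
  x[0] = a + b;
  x[1] = a - b;
}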

Check out the various implementations of FFT in lib/impl.dart for more details.

Benchmarks

This package was built because package:fft is not actively maintained anymore, isn't very efficient, and only supports power of two FFTs. This package aims to support FFTs of any size efficiently, and to make power of two FFTs particularly fast. There are a few improvements that make this Radix2FFT implementation efficient:

  • The main FFT class is constructed with a given size, so that the twiddle factors only have to be calculated once. This is particularly handy for STFT.
  • Doesn't use a wrapper class for complex numbers, just uses a Float64x2List. Every little wrapper class is a separate allocation and dereference. For inner loop code, like FFT's complex number math, this makes a big difference. Float64x2 can also take advantage of SIMD optimisations (see the sketch after this list).
  • The FFT algorithm is in-place, so no additional arrays are allocated.
  • Loop based FFT, rather than recursive.
  • Using trigonometric tricks to calculate fewer twiddle factors.
  • Bit hacks.
  • FFT implementations are cached by size.
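
For example, complex multiplication can be written directly on Float64x2 values, with no wrapper object per number (an illustrative sketch, not the library's internal code):

import 'dart:typed_data';

// (a.x + a.y*i) * (b.x + b.y*i), using .x as the real part and .y as the
// imaginary part. No wrapper class, so no extra allocations in the inner loop.
Float64x2 complexMultiply(Float64x2 a, Float64x2 b) =>
    Float64x2(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x);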

I found some other promising Dart FFT implementations, so I decided to benchmark them too: scidart and smart_signal_processing. package:fft and smart_signal_processing only support power of two arrays. Scidart supports arrays of any size, but for non-power-of-two sizes it uses a naive DFT, which is much slower. So for this first set of benchmarks I'm just testing power of two sizes.

To run the benchmarks:

  1. Go to the bench directory and fetch the dependencies: cd bench && dart pub get
  2. Run dart run bench.dart

fftea gains some of its speed by caching the twiddle factors between runs, and by doing the FFT in-place. So the benchmarks are broken out into cached and in-place variants. Caching and running in-place is also applied to some of the other libraries, where appropriate.

In the table below, "cached" means the construction of the FFT object is not included in the benchmark time. And "in-place" means using the in-place FFT, and the conversion and copying of the input data into whatever format the FFT wants is not included in the benchmark. Run in Ubuntu on a Dell XPS 13.

| Size | package:fft | smart | smart, in-place | scidart | scidart, in-place* | fftea | fftea, cached | fftea, in-place, cached |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 16 | 424.4 us | 33.5 us | 23.3 us | 359.8 us | 133.6 us | 71.0 us | 48.2 us | 38.7 us |
| 64 | 657.7 us | 7.6 us | 2.9 us | 310.2 us | 271.8 us | 136.8 us | 134.6 us | 103.5 us |
| 256 | 614.1 us | 15.5 us | 11.1 us | 984.9 us | 1.02 ms | 100.3 us | 51.9 us | 38.3 us |
| 2^10 | 1.74 ms | 50.0 us | 46.2 us | 9.14 ms | 9.17 ms | 39.5 us | 25.7 us | 23.5 us |
| 2^12 | 8.01 ms | 219.7 us | 203.1 us | 133.10 ms | 138.19 ms | 119.8 us | 109.9 us | 104.8 us |
| 2^14 | 39.69 ms | 903.5 us | 860.3 us | 2.15 s | 2.03 s | 677.9 us | 536.0 us | 436.2 us |
| 2^16 | 225.97 ms | 5.36 ms | 4.76 ms | 42.53 s | 42.71 s | 3.43 ms | 3.14 ms | 2.21 ms |
| 2^18 | 1.21 s | 27.89 ms | 25.84 ms | Skipped | Skipped | 12.95 ms | 12.53 ms | 10.99 ms |
| 2^20 | 7.25 s | 164.35 ms | 149.33 ms | Skipped | Skipped | 89.84 ms | 85.69 ms | 74.99 ms |

In practice, you usually know how big your FFT is ahead of time, so it's pretty easy to construct your FFT object once, to take advantage of the caching. It's sometimes possible to take advantage of the in-place speed up too, for example if you have to copy your data from another source anyway you may as well construct the flat complex array yourself. Since this isn't always possible, the "fftea, cached" times are probably the most representative. In that case, fftea is about 60-80x faster than package:fft, and about 70% faster than smart_signal_processing. Not sure what's going on with scidart, but it seems to be O(n^2) even for power of two sizes.

* Scidart's FFT doesn't have an in-place mode, but they do use a custom format, so in-place means that the time to convert to that format is not included in the benchmark.

I also benchmarked fftea's various implementations of FFT at different sizes, using bench/impl_bench.dart.

Performance of different fftea implementations

This graph shows how the different implementations perform at different sizes.

Performance of FFT.FFT constructor

This graph shows the performance of the implementation selected by the FFT.FFT constructor, which attempts to automatically pick the right implementation for the given size. Although the overall O(n*log(n)) worst case slope is maintained, there is a lot of variation between specific sizes.

Looking at the first graph, you can see that Radix2FFT is consistently the fastest, the PrimeFFT variants are the slowest, and CompositeFFT is in between. Generally, the more composite the size is, the faster the FFT will be. So if you have some flexibility in the FFT size you can choose, try to choose a highly composite size (ie, one where the prime factors are small), or ideally choose a power of two.
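
For example, if you can pad your input, rounding its length up to a power of two gets you the Radix2FFT (a sketch; nextPowerOf2 is a hand-written helper, not part of fftea):

import 'package:fftea/fftea.dart';

// Hand-written helper (not part of fftea): smallest power of two >= n.
int nextPowerOf2(int n) {
  var p = 1;
  while (p < n) {
    p *= 2;
  }
  return p;
}

List<double> audio = ...;

// Pad with silence so that FFT() selects the fast Radix2FFT implementation.
final padded = audio + List.filled(nextPowerOf2(audio.length) - audio.length, 0.0);
final fft = FFT(padded.length);
final freq = fft.realFft(padded);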

fftea's People

Contributors

liamappelbe

Forkers

codakkk

fftea's Issues

How to compare two wav files?

Hello.
I have a reference wav file and want to compare other wav files with it to check whether the same pattern is played or not.
I wrote the code below (I also used the wav package to read the wav file data and fl_chart to draw the line charts).

import 'package:wav/wav.dart';
import 'package:fftea/fftea.dart';
...

    final Wav wav1 = await Wav.readFile("path-to-reference-sound.wav");
    List<double> wavData1 = wav1.channels[0];
    final fftxWav1 = FFT(wavData1.length);
    final freqListWav1 = fftxWav1.realFft(wavData1).discardConjugates().magnitudes();
    final List<double> decibelsWav1 = freqListWav1.map((amp) => 20 * math.log(amp) / math.ln10).toList();

    final Wav wav2 = await Wav.readFile("path-to-second-sound.wav");
    List<double> wavData2 = wav2.channels[0];
    final fftxWav2 = FFT(wavData2.length);
    final freqListWav2 = fftxWav2.realFft(wavData2).discardConjugates().magnitudes();
    final List<double> decibelsWav2 = freqListWav2.map((amp) => 20 * math.log(amp) / math.ln10).toList();

    for (int i = 0; i < decibelsWav1.length; ++i) {
      final freq = fftxWav1.frequency(i, 44100);
      chartPointsWave1.add(FlSpot(freq, freqListWav1[i]));
    }
    for (int i = 0; i < decibelsWav2.length; ++i) {
      final freq = fftxWav2.frequency(i, 44100);
      chartPointsWave2.add(FlSpot(freq, freqListWav2[i]));
    }

Now I can draw 2 line charts from chartPointsWave1 and chartPointsWave2.
Is this approach correct?
Could you please suggest a solution?

Not an issue, just a question

I don't know much about audio data / data science, but can this library produce sound from text?

Sorry if this question is stupid; I'm already dizzy from searching on Google.

OpenGL accelerated

I would like to have a GPU-accelerated FFT.
Is it possible?
This would have applications for real-time simulations and games.

Switch to faster bit reversal algorithm

There's a faster 64-bit reversal algorithm than the one we're using. It's O(log(bits)) instead of O(bits), using clever bit hacks:

x = ((x >> 32) & 0x00000000ffffffff) | (x << 32);
x = ((x >> 16) & 0x0000ffff0000ffff) | ((x & 0x0000ffff0000ffff) << 16);
x = ((x >> 8) & 0x00ff00ff00ff00ff) | ((x & 0x00ff00ff00ff00ff) << 8);
x = ((x >> 4) & 0x0f0f0f0f0f0f0f0f) | ((x & 0x0f0f0f0f0f0f0f0f) << 4);
x = ((x >> 2) & 0x3333333333333333) | ((x & 0x3333333333333333) << 2);
x = ((x >> 1) & 0x5555555555555555) | ((x & 0x5555555555555555) << 1);

This is slightly different to our case because we want to reverse the bit string based on the size of the FFT. So for a 256 element FFT we want to turn 0x1 into 0x80, not 2^63. So we'll need to shift the result based on the log2 of the FFT size.

Also rerun the benchmarks with this improvement.

Found Key & BPM from Audio byte

Hi.
I'm trying to get information, like the key and BPM, from audio bytes.
Can I use this package to detect the audio key and BPM?

My audio bytes are in a PCM16 buffer with a sample rate of 16000, but stored in a Uint8List, so I convert it to an Int16List.

List<double> pcmDoubleData = audioBytes.buffer
    .asInt16List()
    .map((value) => value / 32768.0)
    .toList();

But I don't know how to do the next part using this FFT.

How to split range in frequency domain and inverse FT?

Hi!
First of all, thank you for your fast library for FT.

I tried to split the range of frequencies and perform an inverse FT.
In detail, my EEG data is sampled at 200Hz.
Here are the steps I followed:

  1. Convert from time domain to frequency domain using STFT
  2. split the frequencies into three ranges: 1~4Khz, 4~16Khz, 16~64Khz
  3. Perform an Inverse FT

As far as I know, a spectrogram is Frequency + Amplitude (or Magnitude) + Time.
When I ran the Spectrogram example, I couldn't understand the result:

        final chunkSize = 300;
        final stft = STFT(chunkSize, Window.hanning(chunkSize));
        final spectrogram = <Float64List>[];
        stft.run(d, (Float64x2List freq) {
          spectrogram.add(freq.discardConjugates().magnitudes());
        });

The result of the spectrogram is a list of 151 double arrays, etc:

stft = {STFT} 
 _fft = {CompositeFFT} CompositeFFT(300)
 _win = {_Float64List} size = 300
 _chunk = {_Float64x2List} size = 300

Where can I find the frequency information in this spectrogram, and how do I split it?

Audio resampling util?

Changing the sample rate of an audio signal is another super common use case of FFT that would make for a good util, like the convolution util.

The integer literal can't be represented exactly in JavaScript

Thank you a lot for this great plugin !
It works perfectly on Android, but I get this error on web :

The integer literal 0xAAAAAAAAAAAAAAAA can't be represented exactly
in JavaScript

I was planning on replacing all those 64-bit integer literals with something along the lines of this
for my app. Do you know if there would be a better way?

Realtime frequency peak detection

Hello,

I would like some guidance on how to implement a real-time FFT on microphone data to detect the peak frequency. However, most of the examples in the repo use WAV files, and I am unsure of how to get this working with the microphone data directly on the device.

Currently, I am using the mic_stream lib to obtain real-time readings of the microphone as 32-bit float Pulse Code Modulation (PCM) values. My current approach is to buffer X number of samples from mic_stream, run an STFT on the buffered input, and then use the magnitudes method to perform a max search to get the index of the peak frequency. Finally, I use the stft.frequency method to obtain the actual frequency.

I'm having trouble selecting the buffer sample size X. My current approach is to collect values from the mic_stream lib for 250ms, then process them via STFT. Unfortunately, in 90% of cases, the STFT function returns a list of NaN values, and in cases where it does not, the detected peak is not even close to the frequency I am playing into the input (a 200 Hz wave).

Now I am unsure if the data I feed into the STFT is simply wrong (each sample in [-1.0, 1.0), 32-bit precision) or if the samplingFrequency I am using for the index-to-frequency conversion is wrong. Currently, I am using the sampling frequency of the mic that the mic_stream lib reports.

Some example code

I wrote a small bit of code to make sure that I understand how this package works and that I get correct results. Maybe this is useful for someone else.

import 'dart:math';
import 'package:fftea/fftea.dart';
import 'dart:typed_data';

const double time = 1; // Seconds
const int fs = 48000; // Samplerate
const double delta_t = 1/fs; // Time between each audio sample
const double delta_f = fs/(time*fs); // Frequency resolution of fft output
const double ampl1 = 0.8, freq1 = 256, angle1 = 45; // Sine wave parameters (for generating input waveform)
double log10(num x) => log(x) / ln10;

void myfft(List<double> tvec, [List<double>? window, double windowcorrectionfactor = 1]) {
  List<double> itvec = List.from(tvec);
  if (window != null) {
    // Apply window
    for (var i = 0; i < itvec.length; i++) {
      itvec[i] *= window[i];
    }
  }
  // Calculate fft
  final fft = FFT(itvec.length);
  final freqbins = fft.realFft(itvec);
  // Remove frequencies above the Nyquist frequency
  Float64x2List fftresultF64x2 = freqbins.discardConjugates();
  // Calculate the phase/angle of each frequency component.
  List<double> fftangle = fftresultF64x2.map((i) => atan(i.y/i.x).toDouble()).toList();
  // Get the magnitude of each frequency component.
  Float64List fftamplF64 = fftresultF64x2.magnitudes();
  // Normalise (multiply by two because we discarded half the energy, divide by number of input samples) and correct for Window function
  List<double> fftresult = fftamplF64.map((i) => windowcorrectionfactor*2*i.toDouble()/itvec.length ).toList();

  // Find max, this should be from our input sine wave
  double max = 0;
  int maxindex = 0;
  for (int i = 0; i < fftresult.length; i++) {
    if (fftresult[i] > max) {
      max = fftresult[i];
      maxindex = i;
    }
  }
  // Look at bins around the max bin.
  for (int i = maxindex-3; i <= maxindex+3; i++) {
    //print("bin: ${i}, value: ${fftresult[i]}, ${freqbins[i].x}, ${freqbins[i].y}, ${freqbins[48000-i].x}, ${freqbins[48000-i].y}");
  }
  
  print("Number of input samples: ${itvec.length}, output bins: ${freqbins.length}, after discard: ${fftamplF64.length}");
  print("Amplitude is ${max} and phase is ${360*fftangle[maxindex]/(2*pi)} at frequency ${maxindex*delta_f}");
  print("");

}

void fftstuff() {
  List<double> tvec = []; // A list for holding the input samples
  // Generate sine to feed the fft.
  for (double i = 0; i <= time; i += delta_t) {
    tvec.add(ampl1 * cos(2*pi*freq1*i+angle1*2*pi/360));
  }

  // Perform fft with no window (rectangular), then hanning and hamming.
  myfft(tvec);
  myfft(tvec, Window.hanning(tvec.length), 2);
  myfft(tvec, Window.hamming(tvec.length), 1.85);
}

void main() {
  fftstuff();
}

The output is:

Number of input samples: 48000, output bins: 48000, after discard: 24001
Amplitude is 0.800000000000011 and phase is 45.000000029805 at frequency 256.0

Number of input samples: 48000, output bins: 48000, after discard: 24001
Amplitude is 0.7999833333354626 and phase is 45.00000002217547 at frequency 256.0

Number of input samples: 48000, output bins: 48000, after discard: 24001
Amplitude is 0.7991858166684803 and phase is 45.000000023305795 at frequency 256.0

How to convert to decibels

Hi, I have used FFT() like this:
...
final fftx = FFT(audio.length);
final freqlist = fftx.realFft(audio);
...
The result is real and imaginary numbers.
How do I convert it to decibels?

Pitch Shift using FFT?

Hello. I'm wondering how I can use the FFT to change the pitch of audio data in pcm16 format. I've already been using the flutter ffmpeg package, which works great with this argument:

   "-i test.mp3 -af  asetrate=$sampleRate*$shift=$sampleRate,atempo=1/$shift output.mp3";

However, since ffmpeg requires writing and reading files, it takes too long to change the pitch in real time. So, how can this FFT be used to achieve pitch changes more efficiently?

Investigate supporting non-power-of-two FFT sizes

How hard would this be? Is it worth the effort? Last time I implemented non-power-of-two sizes it was really complicated and had serious numerical precision issues. Maybe we could do one of the factorization-based algorithms with an O(n^2) fallback for prime sizes?

Propose to merge with SciDart

Hello!
Congratulations on this great project. I'm glad that you mentioned SciDart in your benchmarks.
I dropped the Radix implementation because of its complexity and the short time I had. I'm just wondering: what do you think about bringing this implementation into SciDart? For me it makes sense, because SciDart was born exactly with the purpose of merging all the Dart scientific stuff.

Gaussian Window

Hello, I'm looking for a Gaussian window implementation. Is there any plan to add one in the future?

`primePaddingHeuristic` is inefficient

primePaddingHeuristic is built on largestPrimeFactor, but that function is overkill. We don't care what the largest prime factor is, just whether it's larger than 5. Write a new function that returns a bool indicating whether the number's largest prime factor is larger than some threshold: largestPrimeFactorIsLargerThan(n, p). This function can bail out early if, after dividing out all the prime factors below that limit, the number still hasn't been reduced to 1.

Stein's algorithm

Is there anywhere we're calculating the GCD? If so, implement Stein's algorithm to make it faster.

Help me with the use of STFT.

Hello, I'm trying to implement a function to extract MFCCs and I need to understand how to use STFT.

I implemented this same algorithm in JavaScript and it gave great results, but for that I used TensorFlow.
I used it like this:

const stfts = tf.signal.stft(input, n_fft, 512);

How could I use your function to do something like this?

Please understand that I don't have much knowledge of this subject.
