I'm trying to run Fast FullSubNet in a real-time audio streaming context.


Real-time streaming Fast FullSubNet (LSTMCell) (fullsubnet issue, open)

fronx commented on June 12, 2024


from fullsubnet.

Comments (3)

fronx commented on June 12, 2024

More details on the model latency:

On a MacBook Pro with an M1 Pro, one pass of the model's forward function typically takes between 30 ms and 40 ms.
That's too long to keep up with a stream of 256-frame blocks at 16 kHz, which leaves only a 16 ms window (256 / 16,000 s) for processing each block.
So the RTF (real-time factor) is between about 1.9 (30/16) and 2.5 (40/16). It needs to be below 1.0 for the model to qualify as real-time.
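For reference, the arithmetic above can be written out as a small sketch. The function names are made up for illustration and are not part of fullsubnet; the only inputs are the numbers quoted in this comment (256 frames per block, 16 kHz, 30 ms to 40 ms per forward pass).

```python
def block_budget_ms(block_size: int, sample_rate_hz: int) -> float:
    """Time available to process one block before the next one arrives."""
    return 1000.0 * block_size / sample_rate_hz


def real_time_factor(processing_ms: float, block_size: int, sample_rate_hz: int) -> float:
    """Processing time divided by the audio duration of one block."""
    return processing_ms / block_budget_ms(block_size, sample_rate_hz)


# Numbers from the comment above: 256 frames at 16 kHz.
budget = block_budget_ms(256, 16_000)
rtf_fast = real_time_factor(30.0, 256, 16_000)
rtf_slow = real_time_factor(40.0, 256, 16_000)

print(f"budget per block: {budget:.1f} ms")
for label, rtf in (("30 ms", rtf_fast), ("40 ms", rtf_slow)):
    verdict = "real-time" if rtf < 1.0 else "too slow"
    print(f"{label} per pass -> RTF {rtf:.3f} ({verdict})")
```

With this definition, any processing time under the 16 ms budget gives an RTF below 1.0.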

How did you calculate your RTFs?


fronx commented on June 12, 2024

Good news: I updated my operating system to macOS Sonoma 14.3.1, and that fixed it without any further code changes. The processing time is now consistently between 13 ms and 15 ms, which puts the RTF at roughly 0.8 to 0.9 for a 16 ms block.
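In case it helps anyone compare numbers, here is a minimal sketch of how per-pass timings like these could be collected. It is not the code used for the measurements in this thread: `measure_latency_ms` and `dummy_step` are hypothetical names, and `dummy_step` is a stand-in workload that would be replaced by one call to the model's forward function.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(step: Callable[[], object], runs: int = 100,
                       warmup: int = 10) -> Dict[str, float]:
    """Time repeated calls to `step` and summarize the wall-clock latency in ms."""
    for _ in range(warmup):  # warm-up calls are excluded from the statistics
        step()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def dummy_step() -> int:
    # Hypothetical stand-in workload; replace with one forward pass of the model.
    return sum(i * i for i in range(10_000))


BUDGET_MS = 1000.0 * 256 / 16_000  # 16 ms per 256-frame block at 16 kHz

stats = measure_latency_ms(dummy_step, runs=50, warmup=5)
print(f"min {stats['min']:.2f} ms, median {stats['median']:.2f} ms, max {stats['max']:.2f} ms")
print(f"worst-case RTF: {stats['max'] / BUDGET_MS:.2f}")
```

Looking at the maximum rather than the mean is a deliberate choice here, since a streaming pipeline drops audio whenever a single block misses its budget.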


fronx commented on June 12, 2024

For potential future reference, here's the torch.profiler output of a single inference run:

-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                         Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
              model_inference         5.41%     780.000us       100.00%      14.412ms      14.412ms             1
                   aten::lstm         6.87%     990.000us        86.02%      12.397ms       2.479ms             5
                 aten::linear         1.02%     147.000us        39.57%       5.703ms     126.733us            45
                  aten::addmm        29.68%       4.277ms        35.56%       5.125ms     122.024us            42
               aten::sigmoid_        15.89%       2.290ms        15.89%       2.290ms      21.204us           108
                  aten::tanh_         8.48%       1.222ms         8.48%       1.222ms      33.944us            36
                   aten::tanh         7.88%       1.136ms         7.88%       1.136ms      31.556us            36
                  aten::copy_         6.45%     930.000us         6.45%     930.000us      17.222us            54
                   aten::add_         5.68%     818.000us         5.68%     818.000us      10.907us            75
                 aten::matmul         0.15%      22.000us         2.46%     354.000us      88.500us             4
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 14.412ms
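A table like the one above can be produced roughly as follows. This is a sketch, not the exact script used here: the `nn.LSTM` module and its sizes are arbitrary stand-ins for the real Fast FullSubNet model, and `profile_forward` is a hypothetical helper. The sort key and row limit are chosen to be consistent with the table (ten rows, ordered by total CPU time).

```python
import torch
from torch import nn
from torch.profiler import ProfilerActivity, profile, record_function


def profile_forward(model: nn.Module, example_input: torch.Tensor,
                    label: str = "model_inference", row_limit: int = 10) -> str:
    """Profile one forward pass on the CPU and return the summary table as text."""
    model.eval()
    with torch.no_grad():
        model(example_input)  # warm-up pass, not profiled
        with profile(activities=[ProfilerActivity.CPU]) as prof:
            with record_function(label):  # shows up as a named row in the table
                model(example_input)
    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=row_limit)


# Assumption: a small LSTM stands in for the real model; shapes are arbitrary.
stand_in_model = nn.LSTM(input_size=64, hidden_size=128, num_layers=2, batch_first=True)
example_input = torch.randn(1, 1, 64)  # (batch, time steps, features)

table = profile_forward(stand_in_model, example_input)
print(table)
```

Inside the `profile` context, the same `prof` object also offers `export_chrome_trace(path)`, which writes a trace file that can be opened in a timeline viewer.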

And here's a pretty timeline view: