Comments (12)

mreinstein commented on July 19, 2024

lamejs should be the terminating node in an audio pipeline. Shouldn't it?

The audio pipeline in my use case:

                          ┌---------------------┐
                      ┌-->|watson speech-to-text|
┌---┐    ┌------┐     |   └---------------------┘
|mic├--->|lamejs├-----┤
└---┘    └------┘     |   ┌---------------------------┐
                      └-->|indexeddb (browser storage)|
                          └---------------------------┘

MP3 is a stream of bytes, hence uint8; outputting it as float32 makes no sense.

Yeah, this is what I'm struggling with. Unless I'm mistaken, per https://webaudio.github.io/web-audio-api/#defining-a-valid-audioworkletprocessor, AudioWorklet outputs are Float32Arrays. I'm trying to figure out how to package this sensibly as an AudioWorklet so that lamejs can be used as a normal Web Audio node.

from lamejs.

zhuker commented on July 19, 2024

lamejs is just a bit-manipulation library: it transforms wave audio bits into MP3 bits.
Integrations into APIs are welcome as pull requests.

mreinstein commented on July 19, 2024

lamejs is just a bit-manipulation library: it transforms wave audio bits into MP3 bits.

@zhuker yeah, I get that. Maybe I misunderstand, but it seems that lamejs essentially takes Uint16Array data as input and produces Uint8Array encoded data as output. Is that right? I think the AudioWorklet API takes Float32Array data as both input and output.
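To make the type mismatch concrete, here is a minimal sketch of the conversion step that would sit between the two. It assumes the encoder consumes signed 16-bit samples (an Int16Array rather than a Uint16Array, which is an assumption not confirmed in this thread), and `floatTo16BitPCM` is a hypothetical helper name, not part of lamejs.

```javascript
// Hypothetical helper (not part of lamejs): convert Web Audio samples
// in [-1, 1] to signed 16-bit integers.
function floatTo16BitPCM(float32) {
  const int16 = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, float32[i]));
    // The signed 16-bit range is asymmetric: -32768..32767.
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return int16;
}

const pcm = floatTo16BitPCM(new Float32Array([0, 0.5, -0.5, 1, -1, 2]));
console.log(Array.from(pcm)); // [0, 16383, -16384, 32767, -32768, 32767]
```

The clamp matters because storing an out-of-range value into an Int16Array wraps instead of saturating.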

Integrations into APIs are welcome as pull requests.

I started working on this in a branch, but ran into the aforementioned issue. Would be happy to send a PR if/when it works!

zhuker commented on July 19, 2024

guest271314 commented on July 19, 2024

Did you solve this?

mreinstein commented on July 19, 2024

That was 5 years ago; I haven't been working on audio processing lately.

Audio Worklet support has gotten pretty decent now though. It should be pretty feasible in theory.

guest271314 commented on July 19, 2024

I achieved the requirement using the details here e18447f.

Is the issue resolved?

mreinstein commented on July 19, 2024

I guess I can take a look and see if I can make that work via an AudioWorklet.

guest271314 commented on July 19, 2024

This is what I am doing with raw PCM input: I simultaneously stream it with MediaStreamTrackGenerator and record it with lamejs, which I modified to be a Module export. In pertinent part:

async importEncoder() {
    if (this.mimeType.includes('mp3')) {
      const lamejs = (await import('./lame.min.js')).default;
      this.mp3encoder = new lamejs.Mp3Encoder(2, 44100, 128);
      this.mp3Data = [];
    } else if (this.mimeType.includes('opus')) {
      const { Decoder, Encoder, tools, Reader, injectMetadata } = (await import('./ts-ebml.min.js'));
      Object.assign(this, { Decoder, Encoder, tools, Reader, injectMetadata });
    }
}
// ... elsewhere, for each 441-frame block read from the input stream:
const int8 = new Int8Array(441 * 4);
const { value, done } = await this.inputReader.read();
// value: raw PCM from parec -d @DEFAULT_MONITOR@
if (!done) int8.set(new Int8Array(value));
const int16 = new Int16Array(int8.buffer);
// https://stackoverflow.com/a/35248852
const channels = [new Float32Array(441), new Float32Array(441)];
for (let i = 0, j = 0, n = 1; i < int16.length; i++) {
  const int = int16[i];
  // Int16Array values are already signed, so scale each half of the range separately.
  const float = int < 0 ? int / 0x8000 : int / 0x7fff;
  // deinterleave: even indices go to the left channel, odd indices to the right
  channels[(n = ++n % 2)][!n ? j++ : j - 1] = float;
}
// var floatSamples = new Float32Array(44100); // Float sample from an external source
const left = channels.shift();
const right = channels.shift();
let leftChannel, rightChannel;
if (this.mimeType.includes('mp3')) {
  const sampleBlockSize = 441;
  leftChannel = new Int32Array(left.length);
  rightChannel = new Int32Array(right.length);
  for (let i = 0; i < left.length; i++) {
    leftChannel[i] = left[i] < 0 ? left[i] * 32768 : left[i] * 32767;
    rightChannel[i] = right[i] < 0 ? right[i] * 32768 : right[i] * 32767;
  }
}
const data = new Float32Array(882);
data.set(left, 0);
data.set(right, 441);
const frame = new AudioData({
  timestamp,
  data,
  sampleRate: 44100,
  format: 'f32-planar',
  numberOfChannels: 2,
  numberOfFrames: 441,
});
this.duration += frame.duration;
await this.audioWriter.write(frame);
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.encodeBuffer(leftChannel, rightChannel);
  if (mp3buf.length > 0) {
    this.mp3Data.push(mp3buf);
  }
}
// ... later, once the input stream has ended:
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.flush(); //finish writing mp3
  if (mp3buf.length > 0) {
    this.mp3Data.push(new Int8Array(mp3buf));
  }
  this.resolve(new Blob(this.mp3Data, { type: 'audio/mp3' }));
}

In an AudioWorklet we can use top-level import and modify sampleBlockSize to 128.
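As a rough sketch of what that could look like, the helper below encodes one 128-frame render quantum. `encodeRenderQuantum` is a hypothetical name, and `encoder` is only assumed to expose `encodeBuffer(left, right)` like the `Mp3Encoder` instance in the snippet above; a stand-in encoder is used so the sketch runs outside a browser.

```javascript
// Hypothetical helper, not part of lamejs. `encoder` is assumed to expose
// encodeBuffer(left, right) like the Mp3Encoder used in the snippet above.
function encodeRenderQuantum(encoder, input) {
  // `input` is one entry of process()'s `inputs`: an array of Float32Array
  // channels, normally 128 frames each. It is empty when nothing is connected.
  if (input.length === 0) return null;
  const toInt16 = (f32) => Int16Array.from(f32, (s) => {
    const c = Math.max(-1, Math.min(1, s));
    return c < 0 ? c * 0x8000 : c * 0x7fff;
  });
  const left = toInt16(input[0]);
  const right = toInt16(input[1] ?? input[0]); // duplicate a mono input
  const chunk = encoder.encodeBuffer(left, right);
  // The encoder may return nothing for a given block, hence the length check.
  return chunk.length > 0 ? chunk : null;
}

// Inside an AudioWorkletProcessor this could be used roughly like:
//
//   process(inputs) {
//     const chunk = encodeRenderQuantum(this.mp3encoder, inputs[0]);
//     if (chunk) this.port.postMessage(chunk); // bytes leave via the MessagePort
//     return true; // keep the processor alive
//   }

// Stand-in encoder so the sketch runs outside a browser.
const fakeEncoder = { encodeBuffer: (l, r) => new Uint8Array(l.length + r.length) };
const quantum = [new Float32Array(128), new Float32Array(128)];
console.log(encodeRenderQuantum(fakeEncoder, quantum).length); // 256
```

Posting the encoded bytes through the processor's port, rather than writing them to `outputs`, sidesteps the Float32Array constraint on node outputs.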

guest271314 commented on July 19, 2024

You should be able to incorporate the changes from https://github.com/guest271314/AudioWorkletStream. FWIW, for speech synthesis processing you can also use https://github.com/guest271314/native-messaging-espeak-ng. I am currently updating https://github.com/guest271314/captureSystemAudio for MP3 support. Next I will substitute https://github.com/davedoesdev/webm-muxer.js for MediaRecorder.

mreinstein commented on July 19, 2024

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

None of the existing webaudio graph nodes can accept lamejs encoded mp3. It only really makes sense as an intermediate node. My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefits from being represented as a webaudio node.

guest271314 commented on July 19, 2024

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

Yes, it does make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

Not necessarily: parse and convert the data to the expected TypedArray.
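One direction of such a conversion, sketched under the assumption that the source is interleaved signed 16-bit stereo PCM (as in the earlier snippet); `deinterleaveToFloat32` is a hypothetical helper name:

```javascript
// Hypothetical helper: split interleaved signed 16-bit PCM (L R L R ...)
// into one Float32Array per channel, scaled to [-1, 1].
function deinterleaveToFloat32(int16, numberOfChannels = 2) {
  const frames = Math.floor(int16.length / numberOfChannels);
  const channels = Array.from(
    { length: numberOfChannels },
    () => new Float32Array(frames),
  );
  for (let frame = 0; frame < frames; frame++) {
    for (let ch = 0; ch < numberOfChannels; ch++) {
      const sample = int16[frame * numberOfChannels + ch];
      // Scale each half of the asymmetric int16 range separately.
      channels[ch][frame] = sample < 0 ? sample / 0x8000 : sample / 0x7fff;
    }
  }
  return channels;
}

const [left, right] = deinterleaveToFloat32(new Int16Array([0, 32767, -32768, 16384]));
console.log(Array.from(left)); // [0, -1]
console.log(right[0]); // 1
```

The resulting planar Float32Arrays are the shape Web Audio nodes expect on their inputs and outputs.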

None of the existing webaudio graph nodes can accept lamejs encoded mp3.

Technically they can, via an HTML <audio> element with MediaElementAudioSourceNode, or with captureStream() connected to MediaStreamAudioDestinationNode or MediaStreamAudioSourceNode, which is in turn connected to AudioWorkletNode.

My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefits from being represented as a webaudio node.

The benefits are flexibility and fidelity, particularly for speech-to-text, though Mozilla Voice does use MP3.

You can certainly pipe a MediaStreamTrack through AudioWorkletNode to encode the stream in "real-time" and send to other destinations and save simultaneously.
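A minimal sketch of the "send to other destinations and save simultaneously" part, mirroring the two branches in the original diagram. `createFanOut` and the sink callbacks are hypothetical names for illustration; in a page, `push` would presumably be called from the AudioWorkletNode's `port.onmessage` handler.

```javascript
// Hypothetical fan-out: forward every encoded chunk to all sinks and keep a
// copy so the whole recording can be assembled into one Blob at the end.
function createFanOut(sinks) {
  const chunks = [];
  return {
    push(chunk) {
      chunks.push(chunk);
      for (const sink of sinks) sink(chunk);
    },
    finish() {
      return new Blob(chunks, { type: 'audio/mp3' });
    },
  };
}

// Two arrays stand in for the speech-to-text and storage sinks.
const sent = [];
const stored = [];
const fanOut = createFanOut([(c) => sent.push(c), (c) => stored.push(c)]);
fanOut.push(new Uint8Array([1, 2, 3]));
fanOut.push(new Uint8Array([4, 5]));
console.log(sent.length, stored.length, fanOut.finish().size); // 2 2 5
```

Keeping the fan-out on the main thread leaves the worklet responsible only for encoding, so slow sinks do not stall audio rendering.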

The requirement is possible.
