Comments (21)

jmvalin commented on September 21, 2024

Can you explain a bit more here?

from opus.

wumaster commented on September 21, 2024

Hi, I'm very happy to get your reply. I am very interested in your work and would like to use your PLC and FEC methods, so I have recently been studying the opus-ng code.
I have tested three PLC methods: SILK PLC, LPCNet PLC, and FARGAN PLC. FARGAN PLC seems to generate more artifacts than SILK or LPCNet PLC. I also evaluated them with subjective tests and PESQ scores. The results showed that LPCNet PLC has the best quality of the three, and that FARGAN PLC is sometimes worse than SILK PLC (more audible artifacts). My guess is that FARGAN has worse audio quality than LPCNet as a vocoder.

jmvalin commented on September 21, 2024

Actually, fargan as a vocoder gives better quality than LPCNet. Can you provide the two commits you're comparing, what command line you're using, along with the input and output files so that we can reproduce what you're getting?

wumaster commented on September 21, 2024

Thank you a lot for the reply. Please wait a moment and let me prepare these materials.

wumaster commented on September 21, 2024

The LPCNet test branch: https://github.com/xiph/opus/tree/neural_plc. I modified opus_demo.c so that it accepts a loss file as input. The base commit is 4e46ccd. The command line is: ./opus_demo voip 16000 1 64000 -use_lost_file -complexity 5 arctic_a0023_16k.pcm out_plc.pcm arctic_a0023_16k_is_lost.txt
The FARGAN test branch: https://github.com/xiph/opus/tree/opus-ng. The build option is ./configure --enable-deep-plc. The base commit is 591c8ba. The command line is: ./opus_demo voip 16000 1 64000 -lossfile arctic_a0023_16k_is_lost.txt -dec_complexity 10 -complexity 5 arctic_a0023_16k.pcm out_plc.pcm. I use encoder complexity 5 to make sure that only the SILK encoder/decoder is used.
The results show that FARGAN PLC generates more audio data than SILK or LPCNet PLC, but it can also generate more artifacts than the other PLC methods.
test_and_res_pcm.zip
At 2.0 s, 3.7 s, 5.5 s, and so on, FARGAN PLC generates a pitched signal even though the original signal is not pitched. I think this could be solved by using SILK PLC instead of FARGAN PLC when the lost signal is of type TYPE_UNVOICED or TYPE_NO_VOICE_ACTIVITY.
At 3.0 s, FARGAN PLC generates some artifacts. The other methods generate artifacts there too, but the FARGAN artifacts are easier to hear, which sometimes gives FARGAN PLC worse subjective test scores.
We tested many files, and the same problems seem to occur in other files as well.
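
As a rough illustration of the fallback idea above, the choice of concealment method could be keyed on the signal type of the last correctly decoded frame. This is only a sketch and not code from Opus: the constant names mirror the signal types mentioned above, while their numeric values, the function name `choose_plc`, and the returned labels are assumptions made for the example.

```python
# Hypothetical sketch: pick a concealment method from the signal type of the
# last good frame. Values and names are assumptions for illustration only.
TYPE_NO_VOICE_ACTIVITY = 0
TYPE_UNVOICED = 1
TYPE_VOICED = 2


def choose_plc(last_signal_type: int) -> str:
    """Return "silk" for inactive or unvoiced frames, "fargan" otherwise."""
    if last_signal_type in (TYPE_NO_VOICE_ACTIVITY, TYPE_UNVOICED):
        # Avoid synthesizing pitched content when the lost signal was not voiced.
        return "silk"
    return "fargan"
```

Whether such a switch actually reduces the audible artifacts would still need to be checked by listening.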

wumaster commented on September 21, 2024

I also tested the 2022 PLC challenge test database using the clean signals and loss files. The results show that LPCNet PLC gets a higher PLCMOS score.

jmvalin commented on September 21, 2024

There are hundreds of changes between the two points you're comparing (not just switching from LPCNet to FARGAN). Are you able to narrow it down further?

wumaster commented on September 21, 2024

Sorry, I have only been studying your code for a short time, and for now I can't work out the detailed differences between the two PLC algorithms. I simply tested both of them, and the results showed that FARGAN PLC sometimes gets worse results both in PLCMOS and in our subjective tests. If I may ask, what results did your research team get when comparing the two PLC algorithms?
Here are the clean speech, loss file, and PLC results for one file from the PLC challenge test data (54.wav). In the subjective tests, FARGAN PLC gets worse results. The command lines are the same as those mentioned above.
plc-challenge-54.zip
I am currently trying to figure out what causes the differences.

jmvalin commented on September 21, 2024

I was just saying that if you have some time it may be useful to look at intermediate versions between the two you tested. There have been many more changes between the two, including a different pitch estimator, a smaller feature predictor, etc. In terms of objective results, we don't use PLCMOS as we've seen it to be unreliable in the past. I'll still see if I can find anything.

wumaster commented on September 21, 2024

OK, thanks a lot. I need to take more time to look into the detailed differences between the two. In my tests, FARGAN PLC sometimes generates more artifacts (more harmonic noise) than SILK or LPCNet PLC. I think decoder information such as the signal type could help FARGAN generate fewer artifacts.
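
One way to put a number on "more harmonic noise" is to measure how periodic a short frame of the concealed output is, for example with the peak of the normalized autocorrelation over a range of candidate pitch lags. The sketch below is only an illustration: `pitch_strength` is a hypothetical helper, the lag range of 32 to 320 samples (roughly 50 to 500 Hz at 16 kHz) is an assumption, and this is not the pitch estimator used in Opus.

```python
import math


def pitch_strength(frame, min_lag=32, max_lag=320):
    """Peak normalized autocorrelation of `frame` over lags [min_lag, max_lag].

    Values near 1.0 indicate a strongly periodic (pitched) frame; values near
    0.0 indicate a noise-like frame. `frame` is a sequence of samples.
    """
    n = len(frame)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        if lag >= n:
            break
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        num = sum(frame[i] * frame[i - lag] for i in range(lag, n))
        e_cur = sum(frame[i] ** 2 for i in range(lag, n))
        e_del = sum(frame[i - lag] ** 2 for i in range(lag, n))
        den = math.sqrt(e_cur * e_del)
        if den > 0.0:
            best = max(best, num / den)
    return best
```

Comparing this value at the same lost frames in the original and in each concealed output would show where a method adds pitched content that the original does not have.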

jmvalin commented on September 21, 2024

If you want to see just the effect of FARGAN, you could test commit d1c5b32, which is just before FARGAN got added.

wumaster commented on September 21, 2024

Thanks a lot!

mklingb commented on September 21, 2024

I did some investigation and found some commits where I think there is a regression. I only did subjective listening on the arctic_a0023_16k.pcm example. On the opus-ng branch, the original LPCNet PLC is at 4414db0.

First potential regression is seen at 2d98ced. I notice that some of the PLC includes a bit more pitched content mixed in. I think it actually sounds fine but it is a change. I didn't run PESQ or PLCMOS on this.

Next potential regression is f0ec990. Here there are some strange choices of pitch, and again the pitched (voiced) segments are louder.

All of these predate the changeover to FARGAN. There is an additional possible regression somewhere between f0ec990 and 591c8ba, but I haven't tracked that down yet.

There were changes to the PLC predictor and pitch models prior to the switch to FARGAN, so we're going to be looking at these as well as other possible root causes.
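
Narrowing down a regression between two commits can be framed as a bisection over the ordered commit list, which is the same search that `git bisect` automates. The sketch below is a minimal illustration under strong assumptions: `first_bad` and the `is_bad` callback are hypothetical (in practice `is_bad` would mean building that commit and listening to the output), and the search assumes a single good-to-bad transition, which may not hold here since several separate regressions are suspected.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Usage with placeholder commit labels (not real commit hashes).
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
regressed = {"c5", "c6", "c7"}
print(first_bad(history, lambda c: c in regressed))  # prints c5
```

With N commits in the range, this needs about log2(N) listening tests instead of N.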

wumaster commented on September 21, 2024

thanks!

jmvalin commented on September 21, 2024

Still looking into this, but can you give the exp_plc_fix1 branch (commit c1b80a7) a try and let me know?

wumaster commented on September 21, 2024

OK. I'm a little busy these days, but I'll test it soon.

jmvalin commented on September 21, 2024

Well, you can now compare to the latest commit on opus-ng, which has the changes from exp_plc_fix1 and more.

wumaster commented on September 21, 2024

I just tested the new commit. The pitch-like content seems to have decreased, but the problem is still there.
test_and_res_pcm-1_22.zip
[image]
It seems that the network judged the signal type incorrectly, whereas LPCNet PLC and SILK PLC get the correct signal type.
[image]
