conferencingspeech2021's Issues

Can you use PR?

How about using a PR to commit the sources, even for internal changes?
This would make it easy for us to review changes to the source code.
(In other words, it is not easy to review changes if you commit directly to master without a PR.)

exported by onnx and inference scripts?

Hi, thanks for the challenge and the baseline. The description says: "baseline, this folder contains baseline system include inference model exported by onnx and inference scripts;" But where is the ONNX part actually, and where are the inference scripts?

Thanks a lot.

Missing data in Audioset

Hello,

I was trying to run the simulation with the given selected_list, but I found that some of the Audioset IDs are no longer accessible.
Below I list some of them (I haven't checked all of the sample IDs):

HKTIe6piDOI
M7GmqUqVQEA
Hm20kZ7QzO0
oz3LrVaXMb4
6-kHUulyCog
TGd5kPDdN_I
IjoePLT_cFw
dKK-JaIzwS4
Cmhpj4MJ_hQ
NbBM82N1Xos
2JoJ_1agmTk
8YIELHXpf3g
AdLiRtpI01s
AgVZ65Hr9rw
4fh52mLYBYw
KKoTQfro920
L6DFGW6jeV8
X61ftZ590Uc
pK1ucosjoRo
Lpzx6N2aCMY
lnWP_zWFpBg
mg2rhu_HHR0

For example, if you go to https://www.youtube.com/watch?v=6-kHUulyCog, it says the video is unavailable.
If you go to https://www.youtube.com/watch?v=Lpzx6N2aCMY, it says the video is private.

Could you release the unavailable Audioset samples directly, or change the selected list for Audioset?
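In the meantime, a minimal sketch of how the affected entries could be located in a selected list. It assumes that each entry contains the YouTube ID as a substring; the entry format shown in `sample_lines` and the name `find_unavailable` are hypothetical and not taken from the repository.

```python
def find_unavailable(selected_lines, unavailable_ids):
    """Return the selected-list entries that mention any unavailable ID.

    Assumption: the YouTube ID appears verbatim somewhere in each entry.
    """
    hits = []
    for line in selected_lines:
        entry = line.strip()
        if entry and any(vid in entry for vid in unavailable_ids):
            hits.append(entry)
    return hits


# Two of the IDs reported above.
unavailable = {"6-kHUulyCog", "Lpzx6N2aCMY"}

# Hypothetical entries; the real selected_list format may differ.
sample_lines = [
    "audioset/6-kHUulyCog.wav",
    "audioset/abcdefghijk.wav",
]

print(find_unavailable(sample_lines, unavailable))
```

Entries found this way could be dropped or replaced before running the simulation.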

Task2 : RIR file

According to the RIR files, it seems that there is no correlation between the different microphone arrays.

The rooms are different and the source positions are different.

Are these RIRs useful for task2?

Version mismatch of the VCTK corpus.

Hi, all

VCTK was not available on my cluster, so I used the given link to download the corpus.
However, I found that the given link may not be the correct one.

The VCTK corpus has now been updated to version 0.92, which is what the link in https://github.com/ConferencingSpeech/ConferencingSpeech2021/blob/master/simulation/ReadMe.md and https://github.com/ConferencingSpeech/ConferencingSpeech2021/blob/master/ReadMe.md points to.
The new version is actually quite different from the widely used original version 0.80 (which I believe is also the one used by the official baseline, judging from the selected_list content).
The new version provides two tracks of audio, with a _mic1.flac or _mic2.flac suffix.
However, the VCTK files in selected_list have a .wav suffix.

So I think the correct version of VCTK may be 0.80, which can be downloaded from https://datashare.ed.ac.uk/handle/10283/2651.
Please check it, thanks.

Also, version 0.80 contains both the raw recordings and the waveforms with the silence removed, and both use the same file names, such as p376_295.wav.
I'd also like to know which one should be used, because quick_select.py may run into problems here.

Thanks.
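A quick way to tell which layout a local copy has, sketched under the assumptions stated in this post: files ending in _mic1.flac or _mic2.flac indicate the 0.92 layout, and plain .wav files indicate the older layout that selected_list appears to expect. The function name `vctk_layout` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def vctk_layout(filenames):
    """Guess the VCTK layout from file suffixes alone (a heuristic, not a version check)."""
    names = [PurePosixPath(f).name for f in filenames]
    if any(n.endswith(("_mic1.flac", "_mic2.flac")) for n in names):
        return "0.92-style"
    if any(n.endswith(".wav") for n in names):
        return "wav-style"
    return "unknown"


# Hypothetical paths; only the suffixes matter here.
print(vctk_layout(["p376/p376_295_mic1.flac", "p376/p376_295_mic2.flac"]))
print(vctk_layout(["p376/p376_295.wav"]))
```

This only inspects names, so it cannot tell the raw 0.80 recordings from the silence-trimmed ones, which share the same file names.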

Cannot get the dataset

We have already registered for the competition using an educational email address. However, we have not received the sharing code and do not have permission to download the data. Could you help us fix this?

.so files

Are they necessary?
We usually don't commit these library files, since they depend on the environment.

Generating the synth examples. Step 3 not clear.

In simulation/README.md:

What does step 3 mean:

Attention to the data/[dev | train]_[linear|circle]_simu_mix.config . In the config file path should be replaced with the corresponding path.

Do we have to write a script to replace the paths with our own?
If so, could you include in the repo the script you used to replace the paths, so that each participant does not have to write their own? (I am lazy :) ).
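For anyone else stuck on this step, a minimal sketch of such a replacement script. It assumes the config files are plain text in which every path starts with a common prefix; the prefixes `/old/root` and `/data`, the sample config line, and the function names are hypothetical, not taken from the repository.

```python
from pathlib import Path


def replace_prefix(text, old_prefix, new_prefix):
    """Replace every occurrence of old_prefix in the config text."""
    return text.replace(old_prefix, new_prefix)


def rewrite_config(config_path, old_prefix, new_prefix):
    """Rewrite one config file in place (keep a backup before running this)."""
    path = Path(config_path)
    original = path.read_text(encoding="utf-8")
    path.write_text(replace_prefix(original, old_prefix, new_prefix), encoding="utf-8")


# Hypothetical config line; the real file layout may differ.
sample = "/old/root/clean/p376_295.wav /old/root/noise/n1.wav\n"
print(replace_prefix(sample, "/old/root", "/data"), end="")
```

`rewrite_config` would then be called once per file matching data/[dev | train]_[linear|circle]_simu_mix.config, with the prefix that actually appears in those files.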
