Comments (6)
Hey @ChangLee0903 !
Thanks for the feedback! I am not sure what you mean by training cost. I did not use state-of-the-art GPUs (I used NVIDIA GeForce GTX 1080 Ti cards), so most experiments took about 12-15 hours on a single GPU, depending on the experiment. Yes, I started from a clean-speech checkpoint to train the robust ASR (to save compute and to start with a reasonable ASR system). This link will point you to the training command used for the base clean-speech model, along with the download link for the checkpoint. Hope this answers your questions.
from robust-e2e-asr.
Thanks a billion for your fast reply; all of my questions have been answered. I'll cite you if my paper gets published.
Hi @archiki,
I tested your pretrained model with the LM you pointed to.
Since I could not find a link to the LM, I downloaded one from LibriSpeech's official website.
Here is the link: https://www.openslr.org/resources/11/4-gram.arpa.gz
The WER (7.334) and CER (3.028) I got are different from your results.
BTW, the arguments in your trainEnhanced.py are different from those of the pretrained checkpoint. I guess all the arguments I need can follow the pretrained checkpoint's settings, right?
best,
Chi-Chang Lee
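For anyone comparing WER and CER numbers like the ones above, here is a minimal sketch of how the two metrics are commonly computed from a reference and a hypothesis transcript. It is not the repository's evaluation code. The function names are hypothetical, and dropping spaces before computing CER is an assumption that varies between toolkits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits / reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (assumption: spaces are ignored)."""
    ref_chars = reference.replace(" ", "")
    hyp_chars = hypothesis.replace(" ", "")
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    print(wer("the cat sat", "the cat sit"))
    print(cer("the cat sat", "the cat sit"))
```

Because both metrics are computed on the decoder's final output, any change to the decoding setup (language model, beam parameters) can shift them even with identical acoustic model weights.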
The difference in the WER and CER numbers must be due to different beam-decoding parameters such as --beam-width, --alpha, and --beta. Let me know which arguments you are talking about. If I remember correctly, only the noise-injection-related arguments should differ; in that case, please use the arguments in trainEnhanced.py. The remaining ones, like --hidden-size and --hidden-layers, are inherited from the pretrained checkpoint.
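To illustrate why --alpha and --beta can change WER and CER, here is a small sketch of the scoring rule commonly used for CTC beam search with an external language model (the Deep Speech 2 style formulation). That this repository's decoder uses exactly this formula is an assumption, and the hypotheses and log-probabilities below are made-up illustrative values, not real outputs.

```python
def beam_score(log_p_acoustic, log_p_lm, word_count, alpha, beta):
    """Combined score: acoustic log-prob + alpha * LM log-prob + beta * word count."""
    return log_p_acoustic + alpha * log_p_lm + beta * word_count


def pick_best(hypotheses, alpha, beta):
    """Return the name of the highest-scoring hypothesis."""
    return max(
        hypotheses,
        key=lambda name: beam_score(*hypotheses[name], alpha=alpha, beta=beta),
    )


# Illustrative values only: (acoustic log-prob, LM log-prob, word count).
hypotheses = {
    "A": (-4.0, -9.0, 3),  # acoustically likelier, but the LM dislikes it
    "B": (-4.5, -6.0, 3),  # slightly worse acoustically, preferred by the LM
}

if __name__ == "__main__":
    print(pick_best(hypotheses, alpha=0.0, beta=0.0))  # LM ignored
    print(pick_best(hypotheses, alpha=0.5, beta=0.0))  # LM weighted in
```

With the LM weight at zero the acoustically likelier hypothesis wins, while a larger weight flips the choice, so two runs with the same checkpoint and LM can still report different error rates.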
Hi @archiki,
Thanks for your reply. The arguments I am concerned about are "--learning-anneal 1.01 --batch-size 64 --no-sortaGrad --opt-level O1 --loss-scale 1". Are these arguments the same as those of the pretrained checkpoint? BTW, did you keep the log files from your training runs? I noticed that the loss in trainTLNNoisy increases at the beginning. Is that normal?
best,
Chi-Chang Lee
Yes, @ChangLee0903, these arguments are taken from the checkpoint. Note: --learning-anneal and --batch-size depend on your dataset, GPU memory, and type of compute. This applies especially to the batch size, which can be changed to 32 or 16 if need be. The learning-anneal value depends on the training profile and might be different for a different dataset.
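As a rough illustration of what a value like --learning-anneal 1.01 does, the sketch below assumes the learning rate is divided by the anneal factor once per epoch, which is how DeepSpeech-style training scripts typically apply it. Whether this repository applies it the same way is an assumption, and the initial learning rate used here is a hypothetical value.

```python
def annealed_lr(initial_lr, learning_anneal, epoch):
    """Learning rate after `epoch` epochs of dividing by the anneal factor."""
    return initial_lr / (learning_anneal ** epoch)


if __name__ == "__main__":
    initial_lr = 3e-4  # hypothetical starting value, not taken from the thread
    for epoch in (0, 10, 50):
        print(epoch, annealed_lr(initial_lr, 1.01, epoch))
```

A factor this close to 1 decays the rate slowly, which is one reason the best value depends on how many epochs a given dataset needs.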