Comments (26)
It would be really nice if you could provide a pretrained model that is compatible with DeepSpeech-0.6.0, which was released a few days ago. The authors state that models trained with older versions are not compatible.
I would do it on my own if I had enough computing power. 😔
from deepspeech-german.
A pretrained model compatible with DeepSpeech-0.6.0 would indeed be great.
@DanBmh : Thank you for putting up the numbers.
We had split the training data into 10 sub-sets and then trained the model on each sub-set, repeatedly over 10 cycles. Example:
cycle 1: subset 1
cycle 2: subset 1 + subset 2
cycle 3: subset 1 + subset 2 + subset 3
so on ....
You might like to refer to Section 4.1 "Influence of Training Size" of our paper for details. It explains the training strategy.
We trained on Tuda-De (5 subsets) followed by Voxforge (5 subsets).
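In code, the cumulative schedule described above looks roughly like this (a minimal sketch; `subsets` and `train_one_pass` are hypothetical placeholders, not names from the actual training scripts):

```python
# Sketch of the cumulative training schedule: cycle 1 trains on subset 1,
# cycle 2 on subsets 1+2, cycle 3 on subsets 1+2+3, and so on.

def cumulative_training(subsets, train_one_pass):
    """Run one training pass per cycle, each cycle seeing all subsets so far."""
    results = []
    for cycle in range(1, len(subsets) + 1):
        # Flatten every subset up to and including the current one.
        training_data = [sample for s in subsets[:cycle] for sample in s]
        results.append(train_one_pass(training_data))
    return results

# Example: 3 subsets of file names; train_one_pass just reports the size.
subsets = [["a.wav"], ["b.wav", "c.wav"], ["d.wav"]]
sizes = cumulative_training(subsets, lambda data: len(data))
print(sizes)  # [1, 3, 4]
```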
Regarding the WER numbers: they depend on the data splits. If you reshuffle the dataset and make a new train/dev/test split, you will surely get a different WER. The numbers will differ between splits. We are currently writing a paper discussing these challenges and training a more robust model.
Thanks for the code. I tested it, but for me it does not filter any additional files.
Note that I had already filtered some files beforehand using different metrics like the file length and the characters-per-second rate. Without this prefiltering I get an error after some time; I think this is caused by some invalid/empty file. Tested on the Tuda train + dev datasets.
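For reference, a prefilter of that kind can be sketched like this (my own approximation; the thresholds are assumptions, not the values actually used):

```python
# Hypothetical prefilter: drop clips that are empty, too short, or whose
# transcript is implausibly dense for the audio duration.

def keep_sample(duration_s, transcript,
                min_duration_s=0.5, max_chars_per_s=25.0):
    """Return True if the clip passes the rough sanity checks."""
    if duration_s < min_duration_s or not transcript.strip():
        return False
    # A real utterance rarely exceeds ~25 characters per second.
    return len(transcript) / duration_s <= max_chars_per_s

print(keep_sample(2.0, "guten morgen"))  # True
print(keep_sample(0.1, "guten morgen"))  # False: clip too short
print(keep_sample(1.0, "x" * 100))       # False: 100 chars per second
print(keep_sample(3.0, "   "))           # False: empty transcript
```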
@SantosSi @nanosonde @LarsScha @photoszzt : We have released v0.6.0. You can find the link in the ReadMe.
Added to the todo list.
You can still use v0.5.0, as there is no major model-architecture difference between these two versions, apart from performance and some added features.
Trying your v0.5.0 model with stock deepspeech 0.6.0 as provided through pip3 I get the following error message:
Not found: Op type not registered 'VariableV2' in binary running on MyMachine. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.)
tf.contrib.resampler
should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
Traceback (most recent call last):
  File "/home/user/.local/bin/deepspeech", line 10, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.7/site-packages/deepspeech/client.py", line 113, in main
    ds = Model(args.model, args.beam_width)
  File "/home/user/.local/lib/python3.7/site-packages/deepspeech/__init__.py", line 42, in __init__
RuntimeError: CreateModel failed with error code 12294
You should install v0.5.0 of deepspeech using pip3.
A v0.5.0 model won't be compatible with v0.6.0.
Then I misunderstood your answer:
You can still use v0.5.0 as there is no major model architecture difference between these two versions
to @nanosonde, who was asking for a:
pretrained model that is compatible with DeepSpeech-0.6.0
You seem to have been referring to the software release version, not the model version. That was not completely clear.
I made a fork using the new DeepSpeech version. See: https://github.com/DanBmh/deepspeech-german
But for now only the Voxforge dataset is working; I have some problems with Tuda in the validation step and did not test the other datasets (CommonVoice, SWC, M-AILABS) yet.
@DanBmh I didn't face this issue with v0.5.0, but I had read about this error before (you mention it in your ReadMe), and most people were able to resolve it by setting ignore_longer_outputs_than_inputs=True
in DeepSpeech.py.
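That flag tells TensorFlow's CTC loss to skip clips whose transcript has more labels than the audio has output frames, instead of raising an error. The condition it guards can be checked stand-alone roughly like this (my own sketch; the 20 ms frame step is an assumption based on DeepSpeech's default feature step):

```python
# CTC requires label_length <= number of output frames per clip; clips that
# violate this crash training unless ignore_longer_outputs_than_inputs=True.

def ctc_length_ok(duration_s, transcript, frame_step_s=0.02):
    """True if the transcript can fit into the clip's CTC output frames."""
    n_frames = int(duration_s / frame_step_s)
    return len(transcript) <= n_frames

print(ctc_length_ok(1.0, "hallo"))        # True: 5 labels, 50 frames
print(ctc_length_ok(0.05, "hallo welt"))  # False: 10 labels, 2 frames
```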
I did that already. Training and testing for Tuda are working, but the validation after each epoch always breaks with a segmentation fault after some steps.
@DanBmh thanks for the fork!
Would you also consider uploading a built model to your fork?
Otherwise I will try to build it myself during the weekend :-)
If I can choose, I would love to test deepspeech with this German model: Tuda-De + Voxforge
Now I got Tuda to work, but my results are much worse than in your paper.
With my best Tuda-only version I get WER: 0.41; with Tuda+Voxforge I only get WER: 0.65.
Did you train Tuda+Voxforge with the datasets combined, or first Tuda and then Voxforge?
@Mexxxo If I get good results, I will try to upload them, but I don't think I'll have them before the weekend. If you train it yourself, please post the scores you got.
I now tried your stepped training with 10 steps for Tuda (without Voxforge), but somehow this is not working for me.
The network is not learning much and gets worse after cycle 5 (WER 0.684).
I also found out there are many dataset errors in Tuda, resulting in infinite loss when training from the English checkpoint. Did you encounter this too? I solved it by excluding those files, but I think there are still some files with errors.
Any update on a model compatible with DeepSpeech 0.6?
I also found out there are many dataset errors in Tuda, resulting in infinite loss when training from the English checkpoint. Did you encounter this too? I solved it by excluding those files, but I think there are still some files with errors.
@DanBmh : I remember I ran a check to find erroneous files across all the datasets:
- Used SoX to read all the files. This way I could remove the corrupted ones. (I don't have the code handy now.)
- Checked that each WAV was long enough for its transcript. (I can't recall whether the code found an erroneous file.)
Code: find_erroneous_files.py
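A stdlib-only approximation of those two checks (not the original script; the frame step and thresholds are my assumptions):

```python
import os
import tempfile
import wave

def check_wav(path, transcript, frame_step_s=0.02):
    """Return None if the file looks usable, else a reason string."""
    try:
        with wave.open(path, "rb") as w:  # fails on corrupted headers
            frames, rate = w.getnframes(), w.getframerate()
    except (wave.Error, EOFError, OSError) as err:
        return "unreadable: %s" % err
    duration_s = frames / rate
    if duration_s == 0:
        return "empty audio"
    # Second check from the list above: transcript too long for the audio.
    if len(transcript) > duration_s / frame_step_s:
        return "transcript longer than audio"
    return None

# Demo: write a one-second silent 16 kHz mono WAV and check it.
tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
tmp.close()
with wave.open(tmp.name, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)
ok = check_wav(tmp.name, "hallo")
too_long = check_wav(tmp.name, "x" * 200)
os.remove(tmp.name)
print(ok, too_long)  # None transcript longer than audio
```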
@photoszzt : We are currently writing a paper discussing the challenges and training a more robust model using DeepSpeech.
We should be able to release the model with the new datasets and the updated Mozilla release by June/July.
I now uploaded a checkpoint of one of my models. I did use the master branch of DeepSpeech, the version should be v0.7.0a2. It has a WER of 0.19, tested with Tuda + CommonVoice dataset. You can find the model files here: https://github.com/DanBmh/deepspeech-german#language-model-and-checkpoints
@AASHISHAG I ran a test with your uploaded checkpoints and my test dataset. I only reached 0.68 WER with both datasets and 0.79 with Tuda only. Which training run do your checkpoints come from, the Tuda-De+Voxforge+Mozilla one? I know my test set is larger and the data is not the same, but shouldn't the difference be smaller? You can find the full results and the instructions I used here.
@DanBmh This could be because of two reasons:
- Dataset size. We used ~300h of data, compared to the ~600h you used, so the model you trained learned better.
- The random data splits. We are aware of these issues and have submitted a paper on them. I will share it as soon as it's accepted.
@AASHISHAG thanks for the new version! Do you have any performance data for the new model available yet?
@sebastiantilman : Unfortunately not, but this model should be more robust than the previous release, as it is trained on ~4 times the previous data.
hi,
I tested out the model performance (for version v0.6.0) on the test set of common voice dataset which amounted to about 18 hours of audio after data preparation. The WER I got was 31.65% and the CER was 16.05%.
The quality of transcriptions in general is much better than the previous version of the model (haven't evaluated WER/CER on the old model version), but I still feel that the WER for this model is too high.
To prepare the dataset, I downloaded the dataset from the common voice website and followed the instructions in your Readme.
@AASHISHAG Do the reported numbers make sense to you?
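For context on how such numbers are obtained: WER and CER are the Levenshtein edit distance over words or characters, divided by the reference length. A self-contained sketch (not the repo's evaluation code):

```python
# WER/CER via classic dynamic-programming Levenshtein distance.

def edit_distance(ref, hyp):
    """Minimum insertions/deletions/substitutions to turn ref into hyp."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,        # deletion
                                   d[j - 1] + 1,    # insertion
                                   prev + (r != h)) # substitution
    return d[len(hyp)]

def wer(ref, hyp):
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    return edit_distance(list(ref), list(hyp)) / len(ref)

print(round(wer("das ist ein test", "das ist kein test"), 2))  # 0.25
```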
@DanBmh That you have a 0.7 model is awesome! But I can't find the files on your fork. The links at the bottom do not seem to work. Can you link the .pbmm and .scorer files? :)
PS: Could you open your fork for issues?
@erksch Updated the links. Only the checkpoints are not yet uploaded again.
@ALL
We will release version v0.7.0 soon.
I am closing this issue since it relates to the old v0.6.0 release.