nvidia-riva / tutorials
NVIDIA Riva runnable tutorials
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/deploy-eks.html
This is after changing the k8s version to 1.22 as mentioned here and running eksctl create cluster -f eks_launch_conf.yaml
Hello,
I am currently working with the German speech recognition data provided by this project and came across the following line in the README:
"In addition, we also filter out samples that are considered 'noisy', that is, samples having very high WER (word error rate) or CER (character error rate) w.r.t. a previously trained German model."
Unfortunately, I do not have access to a pre-trained German model to calculate the WER or CER for my dataset. This makes it challenging for me to filter out the noisy samples effectively.
Could you please provide a list of these 'noisy' samples or the criteria used to identify them?
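Absent the original filter list, the criterion can be approximated locally once you have hypothesis transcripts from any reasonably good German ASR model. A stdlib-only sketch of the WER/CER computation (the 0.75/0.5 thresholds are illustrative assumptions; the project's actual cutoffs are not stated):

```python
# Approximate the "noisy sample" filter: drop samples whose WER or CER
# against a reference model's transcript exceeds a threshold.
# The thresholds below are assumptions, not the project's actual criteria.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # deletion, insertion, substitution
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def wer(ref: str, hyp: str) -> float:
    return edit_distance(ref.split(), hyp.split()) / max(1, len(ref.split()))

def cer(ref: str, hyp: str) -> float:
    return edit_distance(list(ref), list(hyp)) / max(1, len(ref))

def is_noisy(ref: str, hyp: str, wer_thresh=0.75, cer_thresh=0.5) -> bool:
    return wer(ref, hyp) > wer_thresh or cer(ref, hyp) > cer_thresh
```

Running this over your manifest with transcripts from any pre-trained German model (e.g. one from NGC) should let you reproduce a comparable filter even without the exact model the README refers to.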
Hello everyone!
After the riva model deployment, I'm generating transcriptions from my audio files. It's working fine.
I wanted a head-to-head comparison of the raw .nemo model with the Riva model repositories. I've noticed that in a few cases, such as some (but not all) shorter audio files, I get no transcription from Riva where the raw .nemo model produced one. I've tried both the Quick Start Scripts approach and the Docker approach, with no luck; both give the same results, as expected.
Here's the command I used to build the rmir model:
riva-build speech_recognition /servicemaker-dev/<output name of rmir model> /servicemaker-dev/<riva model name> \
    --name=conformer-bn-BD-asr-streaming \
    --featurizer.use_utterance_norm_params=False \
    --featurizer.precalc_norm_time_steps=0 \
    --featurizer.precalc_norm_params=False \
    --ms_per_timestep=40 \
    --endpointing.start_history=200 \
    --nn.fp16_needs_obey_precision_pass \
    --endpointing.residue_blanks_at_start=-2 \
    --chunk_size=0.16 \
    --left_padding_size=1.92 \
    --right_padding_size=1.92 \
    --decoder_type=flashlight \
    --decoding_language_model_binary=<lm_binary> \
    --decoding_vocab=<decoder_vocab_file> \
    --flashlight_decoder.lm_weight=0.2 \
    --flashlight_decoder.word_insertion_score=0.2 \
    --flashlight_decoder.beam_threshold=20. \
    --language_code=bn-BD
What could be the underlying cause for not receiving transcriptions after the model transformation?
CC: @vinhngx
Hello everyone!
I've created my own NeMo model and then completed all the other steps (riva model, rmir model, and Riva repositories) according to the documentation.
I'm using the Quick Start Scripts to deploy the model. After running riva_streaming_asr_client --audio_file <wav file location>, I'm not getting any transcription.
Here is the output I'm getting:
I1031 08:47:32.209681 109 riva_streaming_asr_client.cc:154] Using Insecure Server Credentials
Loading eval dataset...
filename: <wav file location>
Done loading 1 files
File: <wav file location>
Final transcripts:
Audio processed: 7.34695e-40 sec.
Not printing latency statistics because the client is run without the --simulate_realtime option and/or the number of requests sent is not equal to number of requests received. To get latency statistics, run with --simulate_realtime and set the --chunk_duration_ms to be the same as the server chunk duration
Run time: 4.68129 sec.
Total audio processed: 286.728 sec.
Throughput: 61.2499 RTFX
So, as you can see, the Final transcripts response is empty.
From the Docker log, I'm getting this:
> Triton server is ready...
I1031 04:19:51.898931 423 riva_server.cc:120] Using Insecure Server Credentials
I1031 04:19:51.944115 423 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
W1031 04:19:51.961644 423 grpc_riva_asr.cc:157] citrinet-1024-en-US-asr-streaming has no configured wfst normalizer model
I1031 04:19:51.980005 423 riva_server.cc:160] Riva Conversational AI Server listening on 0.0.0.0:50051
W1031 04:19:51.980062 423 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
I1031 08:47:00.767860 428 grpc_riva_asr.cc:892] ASRService.StreamingRecognize called.
I1031 08:47:00.768599 428 grpc_riva_asr.cc:919] ASRService.StreamingRecognize performing streaming recognition with sequence id: 1779700260
I1031 08:47:00.800891 428 grpc_riva_asr.cc:976] Using model citrinet-1024-en-US-asr-streaming for inference
I1031 08:47:00.801008 428 grpc_riva_asr.cc:992] Model sample rate= 16000 for inference
I1031 08:47:00.848378 428 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 16000 bitspersample = 16
I1031 08:47:03.263229 428 grpc_riva_asr.cc:1093] ASRService.StreamingRecognize returning OK
I1031 08:47:32.256738 428 grpc_riva_asr.cc:892] ASRService.StreamingRecognize called.
I1031 08:47:32.257011 428 grpc_riva_asr.cc:919] ASRService.StreamingRecognize performing streaming recognition with sequence id: 2124845530
I1031 08:47:32.257077 428 grpc_riva_asr.cc:976] Using model citrinet-1024-en-US-asr-streaming for inference
I1031 08:47:32.257154 428 grpc_riva_asr.cc:992] Model sample rate= 16000 for inference
I1031 08:47:32.257484 428 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 16000 bitspersample = 16
I1031 08:47:36.936904 428 grpc_riva_asr.cc:1093] ASRService.StreamingRecognize returning OK
Can somebody direct me on how to work out what went wrong?
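Before digging into the server, it may be worth confirming (an assumption about common failure modes, not a diagnosis) that the WAV file really matches what the log's "Detected format" line claims: 16 kHz, mono, 16-bit PCM, and long enough to produce output. A stdlib-only sketch:

```python
# Sanity-check a WAV file against the format the Riva server log reports.
import wave

def describe_wav(path: str) -> dict:
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),          # log expects 1
            "sample_rate": rate,                   # log expects 16000
            "bits_per_sample": w.getsampwidth() * 8,  # log expects 16
            "duration_sec": frames / rate,         # should be non-trivial
        }
```

If the file checks out, the empty transcript more likely comes from the pipeline configuration (e.g. chunk size or endpointing) than from the audio itself.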
Training or Fine-Tuning an Acoustic Model:
Model fine-tuning is a set of techniques that makes small adjustments to a pre-existing model using new data, adapting it to new situations while retaining its original capabilities.
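As a concrete toy illustration of that idea (unrelated to Riva or NeMo internals): start from "pretrained" parameters and take small gradient steps on new data, so the model adapts rather than being re-initialized:

```python
# Toy illustration of fine-tuning (not Riva/NeMo code): adapt a
# "pretrained" linear model to new data with a small learning rate
# so the original fit is adjusted, not destroyed.

def fine_tune(w, b, data, lr=0.05, epochs=200):
    """One-feature linear regression, plain SGD over (x, y) pairs."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error on the new sample
            w -= lr * err * x      # small step: nudge the weights
            b -= lr * err
    return w, b

# "Pretrained" model y = 2x; the new domain behaves like y = 2x + 1.
w, b = fine_tune(2.0, 0.0, [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

The same principle, scaled up, is what TAO/NeMo fine-tuning does: start from the checkpoint's weights and continue training on the new dataset with a conservative learning rate.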
So I downloaded this model, speechtotext_zh_cn_conformer.tlt, modified test_ds.labels in evaluate.yaml for Mandarin, and then used this API:
!tao speech_to_text_citrinet evaluate \
    -e $SPECS_DIR/speech_to_text_citrinet/evaluate.yaml \
    -g 1 \
    -k $KEY \
    -m $RESULTS_DIR/speechtotext_zh_cn_conformer.tlt \
    -r $RESULTS_DIR/citrinet/evaluate \
    test_ds.manifest_filepath=$DATA_DIR/train.json
but I got this error:
raise ValueError("cfg must have tokenizer config to create a tokenizer !")
ValueError: cfg must have tokenizer config to create a tokenizer !
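The error means the model config loaded for evaluation has no tokenizer section. Note that the checkpoint is a Conformer model being run through the speech_to_text_citrinet task, which expects a tokenizer-based (BPE) model; that mismatch may be the cause. A small guard sketch (the dict layout is assumed from typical NeMo configs, not verified against this .tlt file):

```python
# Hypothetical guard: check that a config carries the model.tokenizer
# section NeMo needs before it tries to build a tokenizer.
# Field names ("dir", "type") are assumptions from typical NeMo configs.

def has_tokenizer(cfg: dict) -> bool:
    tok = cfg.get("model", {}).get("tokenizer")
    return bool(tok) and "dir" in tok and "type" in tok

# Tokenizer-based (BPE) config vs. a character-label config.
cfg_ok = {"model": {"tokenizer": {"dir": "/data/tokenizer", "type": "bpe"}}}
cfg_bad = {"model": {"labels": ["a", "b", "c"]}}
```

If the Conformer checkpoint is character-based, evaluating it through a task/spec that assumes a tokenizer would raise exactly this ValueError; matching the task to the checkpoint type is worth checking first.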
riva_target_gpu_family="tegra"
riva_arm64_legacy_platform="xavier"
service_enabled_asr=true
service_enabled_nlp=false
service_enabled_tts=false
I just started the Riva server via:
bash riva_init.sh
bash riva_start.sh
then tried the command:
riva_streaming_asr_client --audio_file=/opt/riva/wav/en-US_sample.wav
_InactiveRpcError                         Traceback (most recent call last)
Cell In [5], line 1
----> 1 response = riva_asr.offline_recognize(content, config)
      2 asr_best_transcript = response.results[0].alternatives[0].transcript
      3 print("ASR Transcript:", asr_best_transcript)

File ~/study/python/nemo/.nemo/lib/python3.8/site-packages/riva/client/asr.py:352, in ASRService.offline_recognize(self, audio_bytes, config, future)
    350 request = rasr.RecognizeRequest(config=config, audio=audio_bytes)
    351 func = self.stub.Recognize.future if future else self.stub.Recognize
--> 352 return func(request, metadata=self.auth.get_auth_metadata())

File ~/study/python/nemo/.nemo/lib/python3.8/site-packages/grpc/_channel.py:946, in _UnaryUnaryMultiCallable.__call__(self, request, timeout, metadata, credentials, wait_for_ready, compression)
    944 state, call, = self._blocking(request, timeout, metadata, credentials,
    945                               wait_for_ready, compression)
--> 946 return _end_unary_response_blocking(state, call, False, None)

File ~/study/python/nemo/.nemo/lib/python3.8/site-packages/grpc/_channel.py:849, in _end_unary_response_blocking(state, call, with_call, deadline)
    847     return state.response
    848 else:
--> 849     raise _InactiveRpcError(state)

_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Error: Unavailable model requested. Lang: en-US, Type: offline"
    debug_error_string = "UNKNOWN:Error received from peer ipv6:[::1]:50051 {grpc_message:"Error: Unavailable model requested. Lang: en-US, Type: offline", grpc_status:3, created_time:"2022-10-12T13:47:30.950278184+08:00"}"
>
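One way to see why the server rejects (Lang: en-US, Type: offline) is to check which pipelines it actually registered at startup. A small sketch that parses the "Successfully registered" lines from the container log (log format as it appears in the Riva 2.x logs quoted in these issues); if only *-streaming models show up, an offline request has nothing to match:

```python
# Parse a Riva container startup log to list the deployed ASR pipelines.
# Log line format copied from the Riva 2.x logs quoted elsewhere in
# these issues; treat the exact pattern as an assumption.
import re

def registered_models(log: str) -> list:
    return re.findall(r"Successfully registered: (\S+) for ASR", log)

def serves_offline(models) -> bool:
    return any(name.endswith("-offline") for name in models)

# Excerpt from a startup log above: only a streaming pipeline registered.
log = "I1031 04:19:51.944115 423 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR"
```

If only streaming pipelines are deployed, either re-run riva_init.sh with a config that also deploys offline ASR models, or call the streaming API instead of offline_recognize.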
ASR model export
!pip install nemo2riva fails to resolve dependencies
Cell:
print(riva_version)
!pip install nvidia-pyindex
!ngc registry resource download-version "nvidia/riva/riva_quickstart:"$riva_version
!pip install nemo2riva
!pip install protobuf==3.20.0
Output:
2.10.0
I'm getting the following resolution error:
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/, https://pypi.ngc.nvidia.com/
Collecting nemo2riva
Using cached nemo2riva-2.11.0-py3-none-any.whl (33 kB)
Collecting pyarmor<8 (from nemo2riva)
Using cached pyarmor-7.7.4-py2.py3-none-any.whl (2.3 MB)
Requirement already satisfied: nemo-toolkit>=1.13 in /usr/local/lib/python3.10/dist-packages (from nemo2riva) (1.19.0rc0)
INFO: pip is looking at multiple versions of nemo2riva to determine which version is compatible with other requirements. This could take a while.
Collecting nemo2riva
Using cached nemo2riva-2.10.0-py3-none-any.whl (33 kB)
Using cached nemo2riva-2.9.0-py3-none-any.whl (32 kB)
ERROR: Cannot install nemo2riva==2.10.0, nemo2riva==2.11.0 and nemo2riva==2.9.0 because these package versions have conflicting dependencies.
The conflict is caused by:
nemo2riva 2.11.0 depends on nvidia-eff<=0.6.2 and >=0.5.3
nemo2riva 2.10.0 depends on nvidia-eff<=0.6.2 and >=0.5.3
nemo2riva 2.9.0 depends on nvidia-eff<=0.6.2 and >=0.5.3
To fix this you could try to:
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
I deployed NVIDIA Riva on a remote machine using instructions from this quick start guide, with version nvidia/riva/riva_quickstart:2.6.0 of the quickstart.
I was trying to run the asr-python-basics.ipynb notebook; the prediction worked, but the container crashed.
Code for reproduction:
import io
import IPython.display as ipd
import grpc
import riva.client
auth = riva.client.Auth(uri='localhost:50051')
riva_asr = riva.client.ASRService(auth)
path = "./audio_samples/en-US_sample.wav"
with io.open(path, 'rb') as fh:
    content = fh.read()
ipd.Audio(path)
config = riva.client.RecognitionConfig()
config.language_code = "en-US"
config.max_alternatives = 1
config.enable_automatic_punctuation = True
config.audio_channel_count = 1
response = riva_asr.offline_recognize(content, config)
asr_best_transcript = response.results[0].alternatives[0].transcript
print("ASR Transcript:", asr_best_transcript)
print("\n\nFull Response Message:")
print(response)
Error from the Riva container logs:
I1024 14:07:25.004743 91 grpc_server.cc:4544] Started GRPCInferenceService at 0.0.0.0:8001
I1024 14:07:25.004979 91 http_server.cc:3242] Started HTTPService at 0.0.0.0:8000
I1024 14:07:25.045824 91 http_server.cc:180] Started Metrics Service at 0.0.0.0:8002
> Triton server is ready...
I1024 14:07:25.194975 403 riva_server.cc:120] Using Insecure Server Credentials
I1024 14:07:25.198563 403 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-offline for ASR
I1024 14:07:25.202152 403 model_registry.cc:110] Successfully registered: citrinet-1024-en-US-asr-streaming for ASR
I1024 14:07:25.205432 403 model_registry.cc:110] Successfully registered: conformer-en-US-asr-offline for ASR
I1024 14:07:25.208616 403 model_registry.cc:110] Successfully registered: conformer-en-US-asr-streaming for ASR
I1024 14:07:25.272111 403 model_registry.cc:110] Successfully registered: riva-punctuation-en-US for NLP
I1024 14:07:25.277859 403 model_registry.cc:110] Successfully registered: riva_intent_weather for NLP
I1024 14:07:25.278462 403 model_registry.cc:110] Successfully registered: riva_ner for NLP
I1024 14:07:25.279049 403 model_registry.cc:110] Successfully registered: riva_qa for NLP
I1024 14:07:25.279526 403 model_registry.cc:110] Successfully registered: riva_text_classification_domain for NLP
I1024 14:07:25.603746 403 model_registry.cc:110] Successfully registered: riva-punctuation-en-US for NLP
I1024 14:07:25.609462 403 model_registry.cc:110] Successfully registered: riva_intent_weather for NLP
I1024 14:07:25.610060 403 model_registry.cc:110] Successfully registered: riva_ner for NLP
I1024 14:07:25.610651 403 model_registry.cc:110] Successfully registered: riva_qa for NLP
I1024 14:07:25.611116 403 model_registry.cc:110] Successfully registered: riva_text_classification_domain for NLP
I1024 14:07:25.628804 403 model_registry.cc:110] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS
I1024 14:07:25.644143 403 riva_server.cc:160] Riva Conversational AI Server listening on 0.0.0.0:50051
W1024 14:07:25.644161 403 stats_reporter.cc:41] No API key provided. Stats reporting disabled.
W1024 14:07:26.005081 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:26.005117 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:26.005121 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W1024 14:07:27.005261 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:27.005292 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:27.005296 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W1024 14:07:28.006394 91 metrics.cc:426] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W1024 14:07:28.006411 91 metrics.cc:444] Unable to get power usage for GPU 0. Status:Success, value:0.000000
W1024 14:07:28.006415 91 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
I1024 14:19:04.273377 410 grpc_riva_asr.cc:484] ASRService.Recognize called.
I1024 14:19:04.273463 410 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 16000 bitspersample = 16
I1024 14:19:04.273468 410 grpc_riva_asr.cc:550] ASRService.Recognize performing streaming recognition with sequence id: 1093626779
I1024 14:19:04.273550 410 grpc_riva_asr.cc:580] Using model citrinet-1024-en-US-asr-offline for inference
I1024 14:19:04.273597 410 grpc_riva_asr.cc:595] Model sample rate= 16000 for inference
terminate called after throwing an instance of 'std::runtime_error'
what(): punct_logits: failed to perform CUDA copy: invalid argument
Signal (6) received.
0# 0x000056392435A7E9 in tritonserver
1# 0x00007F011980A0C0 in /usr/lib/x86_64-linux-gnu/libc.so.6
2# gsignal in /usr/lib/x86_64-linux-gnu/libc.so.6
3# abort in /usr/lib/x86_64-linux-gnu/libc.so.6
4# 0x00007F0119BC3911 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
5# 0x00007F0119BCF38C in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
6# 0x00007F0119BCF3F7 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
7# 0x00007F0119BCF37F in /usr/lib/x86_64-linux-gnu/libstdc++.so.6
8# 0x00007F0085B76B1E in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
9# 0x00007F0085B88A1C in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
10# 0x00007F0085B38CC2 in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
11# 0x00007F0085B38B34 in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
12# 0x00007F0085C34B1F in /opt/tritonserver/backends/riva_nlp_pipeline/libtriton_riva_nlp_pipeline.so
E1024 14:19:10.933988 1386 client_object.cc:116] error: failed to do inference: Socket closed
I1024 14:19:10.934067 1386 grpc_riva_asr.cc:243] Could not get punctuated transcript from punctuator model for transcript "what is natural language processing", adding basic punctuation
I1024 14:19:10.935387 410 grpc_riva_asr.cc:664] ASRService.Recognize returning OK
/opt/riva/bin/start-riva: line 55: 91 Aborted (core dumped) ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --strict-model-config=true $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
W1024 14:19:16.092974 403 riva_server.cc:184] Signal: 15
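Since the crash happens inside the punctuation pipeline ("punct_logits: failed to perform CUDA copy") and the server already falls back to basic punctuation, one possible mitigation (an assumption, not a confirmed fix) is to request recognition without automatic punctuation so the punctuator is never invoked. The request settings, mirrored here as a plain dict whose field names match the riva.client.RecognitionConfig attributes used in the repro:

```python
# Mirror of the repro's RecognitionConfig fields as a plain dict.
# Turning off enable_automatic_punctuation keeps the request away
# from the punctuation model that crashed the container.

def recognition_settings(punctuate: bool) -> dict:
    return {
        "language_code": "en-US",
        "max_alternatives": 1,
        "enable_automatic_punctuation": punctuate,
        "audio_channel_count": 1,
    }

settings = recognition_settings(punctuate=False)
```

In the notebook repro this corresponds to setting config.enable_automatic_punctuation = False before calling offline_recognize; it works around the crash rather than fixing the punctuator itself.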
https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tutorials/deploy-eks.html
Repro (bash):
eksctl create cluster -f eks_launch_conf.yaml
Hi, I'm trying to deploy a NeMo Conformer CTC model using Riva. It works well when evaluated with NeMo, but Riva fails at inference. This is the command I used during riva-build:
riva-build speech_recognition -f /servicemaker-dev/ASR-Model-Language-bn-val-wer-0.132.rmir /servicemaker-dev/ASR-Model-Language-bn-val-wer-0.132.riva \
--name=conformer-ctc-med-voicebook-it1-run3 \
--featurizer.use_utterance_norm_params=False \
--featurizer.precalc_norm_time_steps=0 \
--featurizer.precalc_norm_params=False \
--ms_per_timestep=40 \
--nn.fp16_needs_obey_precision_pass \
--chunk_size=0.16 \
--left_padding_size=1.92 \
--right_padding_size=1.92 \
--decoder_type=greedy \
--language_code=bn-BD --force
I'm using this command to transcribe an audio file:
/opt/riva/clients/riva_streaming_asr_client --audio_file /audios/a1.wav --language-code bn-BD --riva_uri localhost:50051
and I'm getting this error in the Riva speech container:
I0105 06:11:48.388618 195 grpc_riva_asr.cc:1464] ASRService.StreamingRecognize called.
I0105 06:11:48.388924 195 grpc_riva_asr.cc:1491] ASRService.StreamingRecognize performing streaming recognition with sequence id: 761739119
I0105 06:11:48.389112 195 grpc_riva_asr.cc:1556] Using model conformer-ctc-med-voicebook-it1-run3 for inference
I0105 06:11:48.389196 195 grpc_riva_asr.cc:1573] Model sample rate= 16000 for inference
I0105 06:11:48.389746 195 riva_asr_stream.cc:214] Detected format: encoding = 1 numchannels = 1 samplerate = 8000 bitspersample = 16
I0105 06:11:48.390592 263 grpc_riva_asr.cc:1227] Creating resampler, audio file sample rate=8000 model sample_rate=16000
E0105 06:11:48.681532 92 ctc-decoder.cc:328] Inference failed in ASR decoder: basic_string::_M_construct null not valid
E0105 06:11:48.681618 92 backend_triton_api.cc:111] Model 'conformer-ctc-med-voicebook-it1-run3-ctc-decoder-cpu-streaming', instance: 'conformer-ctc-med-voicebook-it1-run3-ctc-decoder-cpu-streaming_0': failed executing 1 request(s) as one batch on device 0
W0105 06:13:28.664069 263 grpc_riva_asr.cc:1332] Response timeout. requests sent: 814 received: 52
E0105 06:13:28.664247 195 grpc_riva_asr.cc:1677] ASRService.StreamingRecognize returning failure
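The log shows the server resampling the 8 kHz file to the model's 16 kHz before the decoder fails. To rule the server-side resampler out, one option is to resample on the client before sending. A stdlib-only sketch using naive linear interpolation over PCM16 mono samples (good enough for a debugging check, not for production audio quality):

```python
# Client-side PCM16 mono resampler (naive linear interpolation).
# Useful only to test whether feeding the server native 16 kHz audio
# avoids the decoder failure seen with the built-in resampler.
import struct

def resample_pcm16(raw: bytes, src_hz: int, dst_hz: int) -> bytes:
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    n_out = int(len(samples) * dst_hz / src_hz)
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz          # fractional source position
        j = int(pos)
        frac = pos - j
        a = samples[j]
        b = samples[min(j + 1, len(samples) - 1)]
        out.append(int(a + (b - a) * frac))  # linear interpolation
    return struct.pack("<%dh" % n_out, *out)
```

If a pre-resampled 16 kHz file transcribes correctly, the problem is narrowed to the resampler path; if it still fails, the decoder configuration (e.g. the vocabulary passed at riva-build time) is the more likely suspect.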
I've downloaded the models from https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_fastpitch_ipa and https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_en_us_hifigan_ipa, but I have no idea how to use them. The tutorials only show how to generate speech with the Riva TTS APIs. Can you provide a tutorial on how to generate speech using the downloaded models?
I tried the ASR basic example using the notebook hosted by NVIDIA in the Getting Started with Riva course, and got this error when using .wav audio files other than the sample audio:
<_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Error: config format doesn't match with header format"
    debug_error_string = "{"created":"@1682956061.861311444","description":"Error received from peer ipv4:172.18.0.4:50051","file":"src/core/lib/surface/call.cc","file_line":1069,"grpc_message":"Error: config format doesn't match with header format","grpc_status":3}"
>
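The message suggests the RecognitionConfig disagrees with the WAV header of the new files (for example a different sample rate or channel count than the sample audio). One way to avoid that is to derive those fields from the file itself; a sketch, where sample_rate_hertz and audio_channel_count are the RecognitionConfig field names used by the Riva Python client:

```python
# Derive RecognitionConfig fields from a WAV file's own header so the
# request config cannot disagree with the audio format.
import wave

def config_fields_for(path: str) -> dict:
    with wave.open(path, "rb") as w:
        return {
            "sample_rate_hertz": w.getframerate(),
            "audio_channel_count": w.getnchannels(),
        }
```

Setting those two fields on the config before calling offline_recognize (or converting the files to match the sample audio's format) should resolve the header mismatch.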
I noticed that once a riva_server is started, it never appears to terminate. I'm using StreamingRecognize and I call end() in Node.js, but that fails to kill the process.