Comments (27)
import yaml
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow_tts.processor.ljspeech import LJSpeechProcessor
from tensorflow_tts.processor.ljspeech import symbols, _symbol_to_id
from tensorflow_tts.configs import Tacotron2Config
from tensorflow_tts.models import TFTacotron2
with open('./tacotron2.v1.yaml') as f:
    config = yaml.load(f, Loader=yaml.Loader)
config = Tacotron2Config(**config["tacotron2_params"])
tacotron2 = TFTacotron2(config=config, training=False, name="tacotron2")
input_text = "i think it work"
input_ids = LJSpeechProcessor(None, "english_cleaners").text_to_sequence(input_text.lower())
input_ids = np.concatenate([input_ids, [len(symbols) - 1]], -1)
# pass input to build model.
decoder_output, mel_outputs, stop_token_prediction, alignment_history = tacotron2.inference(
    input_ids=np.expand_dims(input_ids, 0),
    input_lengths=np.array([len(input_ids)]),
    speaker_ids=np.array([0], dtype=np.int32),
    maximum_iterations=4000,
    use_window_mask=False,
    win_front=6,
    win_back=6,
)
# load weight and save to pb.
tacotron2.load_weights("./tacotron2.v1/checkpoints/model-120000.h5")
tf.saved_model.save(tacotron2, "./test_saved")
# load and inference again to check.
tacotron2 = tf.saved_model.load("./test_saved")
decoder_output, mel_outputs, stop_token_prediction, alignment_history = tacotron2.inference(
    input_ids=np.expand_dims(input_ids, 0),
    input_lengths=np.array([len(input_ids)]),
    speaker_ids=np.array([0], dtype=np.int32),
    maximum_iterations=4000,
    use_window_mask=False,
    win_front=6,
    win_back=6,
)
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111)
ax.set_title('Alignment steps')
im = ax.imshow(
    alignment_history[0].numpy(),
    aspect='auto',
    origin='lower',
    interpolation='none')
fig.colorbar(im, ax=ax)
xlabel = 'Decoder timestep'
plt.xlabel(xlabel)
plt.ylabel('Encoder timestep')
plt.tight_layout()
plt.show()
plt.close()
@anasvaf let's try this :)). Somehow tacotron._build() prevents the model from being saved to pb. :))
from tensorflowtts.
@manmay-nakhashi i fixed it today :)) pls git pull :D
from tensorflowtts.
OK, I will try to fix those issues tonight. Maybe we should merge call and inference into a single call function, or call the inference function from inside call.
from tensorflowtts.
@anasvaf TFLite uses the FlatBuffers format, while a TensorFlow pb file is protobuf. FlatBuffers is faster, mostly on low-end devices.
from tensorflowtts.
@dathudeptrai thank you so much for all your help!! :))
from tensorflowtts.
@anasvaf I wouldn't suggest saving the entire model as h5; it is not guaranteed to work. After you load the weights from the h5 file, you can save the model to a pb file and then run inference on the server. Or you can try save_format="tf". In TF 2, we no longer use h5 to save the entire model :)), see https://www.tensorflow.org/api_docs/python/tf/saved_model/save.
from tensorflowtts.
@dathudeptrai Thank you for the prompt response!
Unfortunately, adding save_format="tf" did not work either.
I have read the API docs regarding the saved model, but I am still confused about how to go from the saved HDF5 weights to a frozen pb file. I would still need the names of the input and output tensors from the graph. Is this right?
from tensorflowtts.
@anasvaf I will do it for you tonight :))). You just want to know how to save to pb?
from tensorflowtts.
@dathudeptrai Yes, saving the model as pb would be really helpful, so I can use post-training quantization and try to import it on a raspberry to check the latency on the inference of mel prediction :)
Thank you for all the help!
from tensorflowtts.
@dathudeptrai Thank you so much! :) works like a charm :). I can get the pb file. Do you know the name of the mel_outputs tensor? I mean in the variables.data file what should be the name, as a string? Something like: "post_net/tf_tacotron_conv_batch_norm_9/batch_norm_._4/moving_variance"
from tensorflowtts.
@anasvaf why do you need the name? You can use tf.saved_model.load and run inference as in the code above. You can print(mel_outputs) to get the name.
from tensorflowtts.
tensorflow.python.saved_model.nested_structure_coder.NotEncodableError: No encoder for object [tf.Tensor(2000, shape=(), dtype=int32)] of type [<class 'tensorflow.python.framework.ops.EagerTensor'>].
I am getting this error. Could it be a version issue?
from tensorflowtts.
ok thanks
from tensorflowtts.
@manmay-nakhashi @anasvaf I think you guys should "watch" my repo so you won't miss any updates. I will add multi-band MelGAN soon; it's 3x faster than MelGAN and the quality is better.
from tensorflowtts.
sure @dathudeptrai : ))
from tensorflowtts.
@dathudeptrai I will try printing the tf.Tensor to check its node name. The reason I asked is that if you build TF from source and deploy it on Android, my guess is that you would need to specify the input/output node names for the .pb file (as in line 74 of https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/tfmobile/src/org/tensorflow/demo/ClassifierActivity.java)
Also another question for the frozen file. When loading the model, it holds the property of the input_id length and cannot accept smaller or larger sentences. I tried to zero pad for smaller ones but I get a weird wav file. Any thoughts on that?
from tensorflowtts.
Also another question for the frozen file. When loading the model, it holds the property of the input_id length and cannot accept smaller or larger sentences. I tried to zero pad for smaller ones but I get a weird wav file. Any thoughts on that?
Send me the code that you are using.
from tensorflowtts.
@anasvaf @dathudeptrai I am trying to convert this model to a tflite model, but the saved_model doesn't have any signatures. Do you know why, and how I can add them?
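For the signature question above, here is a minimal sketch of one way to attach a named signature when exporting. `ToyTTS` and its `scale` variable are hypothetical stand-ins, not part of tensorflow_tts; the only thing assumed about the real model is that it exposes a `tf.function` with an `input_signature`. Whether TFTacotron2 can be exported this way is not confirmed in this thread.

```python
import tempfile

import tensorflow as tf


class ToyTTS(tf.Module):
    """Hypothetical stand-in for a model exposing an `inference` function."""

    def __init__(self):
        super().__init__()
        self.scale = tf.Variable(2.0)

    @tf.function(input_signature=[
        tf.TensorSpec(shape=[1, None], dtype=tf.int32, name="input_ids"),
    ])
    def inference(self, input_ids):
        return tf.cast(input_ids, tf.float32) * self.scale


model = ToyTTS()
export_dir = tempfile.mkdtemp()

# Passing `signatures=` registers the function under a named signature.
tf.saved_model.save(
    model, export_dir, signatures={"serving_default": model.inference}
)

loaded = tf.saved_model.load(export_dir)
signature_names = list(loaded.signatures.keys())
serving_fn = loaded.signatures["serving_default"]

# The signature function records its input names and output structure.
print(signature_names)
print(serving_fn.structured_input_signature)
print(serving_fn.structured_outputs)
```

Signature functions are called with keyword arguments named after the `TensorSpec` names and return a dict of output tensors.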
from tensorflowtts.
@dathudeptrai This is the code for inference.
import yaml
import numpy as np
import matplotlib.pyplot as plt
import soundfile as sf
import tensorflow as tf
from tensorflow_tts.processor.ljspeech import LJSpeechProcessor
from tensorflow_tts.processor.ljspeech import symbols, _symbol_to_id
from tensorflow_tts.configs import Tacotron2Config
from tensorflow_tts.models import TFTacotron2
from tensorflow_tts.configs import MelGANGeneratorConfig
from tensorflow_tts.models import TFMelGANGenerator
input_text = "Hello! World"
input_ids = LJSpeechProcessor(None, "english_cleaners").text_to_sequence(input_text.lower())
input_ids = np.concatenate([input_ids, [len(symbols) - 1]], -1)
# load and inference again to check.
tacotron2 = tf.saved_model.load("test_saved")
decoder_output, mel_outputs, stop_token_prediction, alignment_history = tacotron2.inference(
    input_ids=np.expand_dims(input_ids, 0),
    input_lengths=np.array([len(input_ids)]),
    speaker_ids=np.array([0], dtype=np.int32),
    maximum_iterations=4000,
    use_window_mask=False,
    win_front=6,
    win_back=6,
)
print(mel_outputs)
Since I saved the pb file with a longer sentence, this is the output I get:
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (7 total):
* Tensor("input_ids:0", shape=(1, 13), dtype=int64)
* Tensor("input_lengths:0", shape=(1,), dtype=int64)
* Tensor("speaker_ids:0", shape=(1,), dtype=int32)
* False
* 6
* 6
* 4000
Keyword arguments: {}
Expected these arguments to match one of the following 1 option(s):
Option 1:
Positional arguments (7 total):
* TensorSpec(shape=(1, 58), dtype=tf.int64, name='input_ids')
* TensorSpec(shape=(1,), dtype=tf.int64, name='input_lengths')
* TensorSpec(shape=(1,), dtype=tf.int32, name='speaker_ids')
* False
* 6
* 6
* 4000
Keyword arguments: {}
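The error above shows a saved function that only accepts `input_ids` of shape (1, 58). A common way to avoid freezing the sequence length is to declare it as `None` in a `tf.function` `input_signature` before saving. The sketch below uses a hypothetical `ToyEncoder` module rather than TFTacotron2, and it is only an assumption that the same approach applies to the real model.

```python
import tempfile

import tensorflow as tf


class ToyEncoder(tf.Module):
    """Hypothetical module; only the input signature matters here."""

    def __init__(self, vocab_size=100, dim=8):
        super().__init__()
        self.embedding = tf.Variable(tf.random.normal([vocab_size, dim]))

    @tf.function(input_signature=[
        # `None` leaves the sequence length open at export time.
        tf.TensorSpec(shape=[1, None], dtype=tf.int32, name="input_ids"),
        tf.TensorSpec(shape=[1], dtype=tf.int32, name="input_lengths"),
    ])
    def inference(self, input_ids, input_lengths):
        embedded = tf.nn.embedding_lookup(self.embedding, input_ids)
        mask = tf.sequence_mask(
            input_lengths, maxlen=tf.shape(input_ids)[1], dtype=tf.float32
        )
        # Sum the embeddings of the valid (unmasked) positions.
        return tf.reduce_sum(embedded * mask[:, :, tf.newaxis], axis=1)


export_dir = tempfile.mkdtemp()
tf.saved_model.save(ToyEncoder(), export_dir)
loaded = tf.saved_model.load(export_dir)

# The same loaded function accepts both sentence lengths from the error above.
output_shapes = []
for length in (13, 58):
    ids = tf.ones([1, length], dtype=tf.int32)
    out = loaded.inference(ids, tf.constant([length], dtype=tf.int32))
    output_shapes.append(out.shape.as_list())
print(output_shapes)
```

Note that the dtypes must match the signature exactly, so the inputs are built as int32 tensors rather than default int64 NumPy arrays.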
from tensorflowtts.
@manmay-nakhashi I am not sure if you can get a tflite from the pb file, since there are multiple @tf.function definitions in the model. E.g., on the call and infer function located at models/tacotron2.
Maybe I am wrong. This is the error I am getting when I am trying to do the following:
# Convert the model.
converter = tf.lite.TFLiteConverter.from_saved_model("test_saved")
tflite_model = converter.convert()
# Save the TF Lite model (tf.gfile moved to tf.io.gfile in TF 2).
with tf.io.gfile.GFile('model_tacotron2.tflite', 'wb') as f:
    f.write(tflite_model)
Traceback (most recent call last):
  File "convert_h5_to_pb.py", line 59, in <module>
    tflite_model = converter.convert()
  File "/home/anasvaf/anaconda3/envs/tf2/lib/python3.6/site-packages/tensorflow/lite/python/lite.py", line 452, in convert
    raise ValueError("This converter can only convert a single "
ValueError: This converter can only convert a single ConcreteFunction. Converting multiple functions is under development.
from tensorflowtts.
@anasvaf
Yes, I have tried this, but it seems you can convert any function to tflite if you add a signature and make it a concrete function.
tensorflow/tensorflow#34350
I was looking into this issue.
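To illustrate the "single ConcreteFunction" idea mentioned above, here is a minimal sketch that converts one concrete function of a toy module with `tf.lite.TFLiteConverter.from_concrete_functions`. `ToyProjection` is hypothetical and contains only a matmul. The sketch says nothing about whether Tacotron2's ops and while loop can be converted, which this thread leaves unresolved.

```python
import numpy as np
import tensorflow as tf


class ToyProjection(tf.Module):
    """Hypothetical module with a single fixed-shape function."""

    def __init__(self):
        super().__init__()
        self.kernel = tf.Variable(tf.ones([4, 2]))

    @tf.function(input_signature=[
        tf.TensorSpec(shape=[1, 4], dtype=tf.float32, name="features"),
    ])
    def inference(self, features):
        return tf.matmul(features, self.kernel)


model = ToyProjection()

# Pick exactly one concrete function and hand only that to the converter.
concrete_fn = model.inference.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
tflite_model = converter.convert()

# Run the converted model once with the TFLite interpreter as a sanity check.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_detail = interpreter.get_input_details()[0]
output_detail = interpreter.get_output_details()[0]
interpreter.set_tensor(input_detail["index"], np.ones([1, 4], dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(output_detail["index"])
print(result.shape)
```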
from tensorflowtts.
@dathudeptrai @anasvaf I think it would be best if we could convert this to tflite for faster inference on mobile and embedded devices.
from tensorflowtts.
@manmay-nakhashi you can still quantize the weights in the .pb file. At the moment it is only 2.6 MB (consisting of variable ops). If you build TensorFlow for mobile from source, you can still perform quite fast inference on mobile using only the CPU. I am not sure how much you can speed up Tacotron-2 with TFLite, which can use the GPU. Note that the most computationally intensive operations, given the dynamic input, are entering and exiting the while loop in the encoder-decoder.
But we can give it a try with a TFLite file :)
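As a rough illustration of what quantizing weights means, the sketch below maps a float32 weight matrix to uint8 with a per-tensor scale and offset, then restores it. This is a generic NumPy illustration with made-up weights and hypothetical helper names. It is not the procedure of any TensorFlow tool, and it does not operate on the .pb file discussed here.

```python
import numpy as np


def quantize_uint8(weights):
    """Map float weights to uint8 using a per-tensor scale and minimum."""
    w_min, w_max = float(weights.min()), float(weights.max())
    # Fall back to 1.0 so a constant tensor does not divide by zero.
    scale = (w_max - w_min) / 255.0 or 1.0
    quantized = np.round((weights - w_min) / scale).astype(np.uint8)
    return quantized, scale, w_min


def dequantize(quantized, scale, w_min):
    """Recover approximate float32 weights from the uint8 representation."""
    return quantized.astype(np.float32) * scale + w_min


# Made-up weights standing in for one layer's kernel.
rng = np.random.default_rng(0)
weights = rng.normal(size=(256, 80)).astype(np.float32)

quantized, scale, w_min = quantize_uint8(weights)
restored = dequantize(quantized, scale, w_min)
max_error = float(np.abs(weights - restored).max())

# Storage drops by 4x; the rounding error is bounded by about half a step.
print(weights.nbytes, quantized.nbytes, max_error)
```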
from tensorflowtts.
@anasvaf https://github.com/dathudeptrai/TensorflowTTS/pull/31 for you.
from tensorflowtts.
@anasvaf @manmay-nakhashi pls close the issue if it solves your problem. I don't think we can convert Tacotron to tflite; even if we could, there is no way to make it run in real time on mobile devices.
from tensorflowtts.
@dathudeptrai Yes, saving the model as pb would be really helpful, so I can use post-training quantization and try to import it on a raspberry to check the latency on the inference of mel prediction :)
Thank you for all the help!
@anasvaf Thanks for the discussion. Would you mind telling me how to do the post-training quantization on the pb file? I used TensorFlow Lite, but I encountered some errors: https://github.com/dathudeptrai/TensorflowTTS/issues/47#issue-639624372
from tensorflowtts.
@gongchenghhu Unfortunately I was not able to do it. There are also some missing ops for Tacotron2 that need to be written in C++.
I will try with FastSpeech, though.
from tensorflowtts.