Inference time is more than the PyTorch version of FastSpeech; are there any additional layers added? [tensorflowtts, closed]

manmay-nakhashi commented on May 14, 2024

Inference time is more than the PyTorch version of FastSpeech. Are there any additional layers added?

from tensorflowtts.

Comments (16)

dathudeptrai commented on May 14, 2024

Can you provide code to compare? In my experiments TF is faster. You should also check the number of parameters.
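A minimal sketch of the parameter check, assuming you can list the weight shapes of each model. The shapes below are hypothetical and only illustrate the arithmetic; they are not the real FastSpeech weights. In practice Keras exposes `model.count_params()` and in PyTorch `sum(p.numel() for p in model.parameters())` gives the same kind of total, so the two numbers can be compared directly.

```python
from math import prod


def count_params(shapes):
    """Return the total number of scalar weights for an iterable of tensor shapes."""
    return sum(prod(shape) for shape in shapes)


# Hypothetical shapes for illustration only (not taken from either model).
example_shapes = [(384, 384), (384,), (384, 80), (80,)]
total = count_params(example_shapes)
print(f"total parameters: {total}")
```

If the two totals differ noticeably, the configs are probably not equivalent and the timings are not comparing like with like.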

manmay-nakhashi commented on May 14, 2024

The experiment was done on a single CPU. All times are in seconds.
TensorFlowTTS
I put a time.time() before and after the inference functions:
encode time: 2.9387922286987305
decode time: 0.8694911003112793
decode time: 0.3629953861236572
Fastspeech Time Calculation:
6.878948211669922
Melgen Time Calculation:
0.5189647674560547
TTS Time Calculation:
8.024913549423218
PyTorch FastSpeech
Fastspeech MEL Calculation:
0.10805535316467285
TTS with SqueezeWave output:
4.426220178604126

dathudeptrai commented on May 14, 2024

Did TF run once or in a loop? TF is very slow on the first run. You should run it around 50 times and average the last 40, ignoring the first 10.
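A minimal sketch of that measurement scheme, assuming the inference call can be wrapped in a zero-argument function. The `benchmark` helper and the `fake_inference` stand-in are hypothetical names used for illustration; replace `fake_inference` with the real model call.

```python
import time
from statistics import mean


def benchmark(fn, n_runs=50, n_warmup=10):
    """Call fn() n_runs times and return the mean wall-clock time in seconds
    of the runs that come after the first n_warmup (warm-up) runs."""
    if not 0 <= n_warmup < n_runs:
        raise ValueError("need 0 <= n_warmup < n_runs")
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    # Discard the warm-up runs, which include one-time setup costs.
    return mean(timings[n_warmup:])


def fake_inference():
    """Hypothetical stand-in for the real model call."""
    sum(i * i for i in range(10_000))


avg = benchmark(fake_inference)
print(f"average over the last 40 of 50 runs: {avg:.6f} s")
```

The same helper can be pointed at the PyTorch model so both frameworks are measured with identical warm-up and averaging.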

manmay-nakhashi commented on May 14, 2024

I did this experiment on a single CPU.

manmay-nakhashi commented on May 14, 2024

I have ignored the data loading part in the time calculation.

manmay-nakhashi commented on May 14, 2024

I have ignored the data loading part in the time calculation.

Did TF run once or in a loop? TF is very slow on the first run. You should run it around 50 times and average the last 40, ignoring the first 10.

I ran it a single time, but the time difference is already huge (on CPU).

dathudeptrai commented on May 14, 2024

Did you run it many times and average? It does not make sense to me: TF MelGAN runs 2.5x faster than PyTorch MelGAN, and FastSpeech is 2x faster too. Please send me the code you use to measure the time.

manmay-nakhashi commented on May 14, 2024

import time

import numpy as np
import soundfile as sf
import yaml

import tensorflow as tf

from tensorflow_tts.processor import LJSpeechProcessor

from tensorflow_tts.configs import FastSpeechConfig
from tensorflow_tts.configs import MelGANGeneratorConfig

from tensorflow_tts.models import TFFastSpeech
from tensorflow_tts.models import TFMelGANGenerator

# initialize fastspeech model.
with open('./examples/fastspeech/conf/fastspeech.v1.yaml') as f:
    fs_config = yaml.load(f, Loader=yaml.Loader)
fs_config = FastSpeechConfig(**fs_config["fastspeech_params"])
fastspeech = TFFastSpeech(config=fs_config, name="fastspeech")
fastspeech._build()
fastspeech.load_weights("./examples/fastspeech/pretrained/model-195000.h5")

# initialize melgan model
with open('./examples/melgan/conf/melgan.v1.yaml') as f:
    melgan_config = yaml.load(f, Loader=yaml.Loader)
melgan_config = MelGANGeneratorConfig(**melgan_config["generator_params"])
melgan = TFMelGANGenerator(config=melgan_config, name='melgan_generator')
melgan._build()
melgan.load_weights("./examples/melgan/pretrained/generator-280000.h5")

# start = time.time()

# inference
processor = LJSpeechProcessor(None, cleaner_names="english_cleaners")

ids = processor.text_to_sequence("do you want me to ask this to alexaa")
ids = tf.expand_dims(ids, 0)

# fastspeech inference (timed)
start = time.time()
masked_mel_before, masked_mel_after, duration_outputs = fastspeech.inference(
    ids,
    attention_mask=tf.math.not_equal(ids, 0),
    speaker_ids=tf.zeros(shape=[tf.shape(ids)[0]]),
    duration_gts=None,
    speed_ratios=tf.constant([1.0], dtype=tf.float32)
)
# Use a separate name for the timestamp so the model variable is not overwritten.
fastspeech_end = time.time()
print("Fastspeech Time Calculation:")
print(fastspeech_end - start)

# melgan inference (timed; note that the vocoder runs twice here)
audio_before = melgan(masked_mel_before)[0, :, 0]
audio_after = melgan(masked_mel_after)[0, :, 0]
melgan_end = time.time()
print("Melgen Time Calculation:")
print(melgan_end - fastspeech_end)

# save to file
# sf.write('./audio_before.wav', audio_before, 22050, "PCM_16")
sf.write('./audio_after.wav', audio_after, 22050, "PCM_16")
end = time.time()
print("TTS Time Calculation:")
print(end - start)

manmay-nakhashi commented on May 14, 2024

For the encoder and decoder times, I put the timing calls inside the model code.

manmay-nakhashi commented on May 14, 2024

The command I ran for the time calculation: CUDA_VISIBLE_DEVICES=-1 taskset --cpu-list 0 python3 inference.py

dathudeptrai commented on May 14, 2024

You ran it only once, so it will be slow. You should loop and average the time, ignoring roughly the first 5 iterations.

manmay-nakhashi commented on May 14, 2024

sure let me do that

manmay-nakhashi commented on May 14, 2024

Where did you loop it? Inside the script or in bash?

dathudeptrai commented on May 14, 2024

Loop the inference function inside the script.

manmay-nakhashi commented on May 14, 2024

It is comparable now, thanks. Still, PyTorch FastSpeech time: 0.09 sec and TensorFlow FastSpeech time: 0.19 sec.

dathudeptrai commented on May 14, 2024

I think there is a mismatch in the parameter config; in my experiments, TF is always faster than PyTorch. I will let you check the speed :)
