iamjanvijay / rnnt
An implementation of RNN-Transducer loss in TF-2.0.
License: MIT License
Hello, I have a question: why is 'labels - 1' used in compute_rnnt_loss_and_grad_helper?
b = tf.reshape(labels - 1, shape=(batch_size, 1, target_max_len - 1, 1))
2020-07-10 09:52:09.715599: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 09:52:13.875948: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-10 09:52:13.910317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce RTX 2080 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.23GiB/s
2020-07-10 09:52:13.910441: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 09:52:13.966109: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 09:52:14.009304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 09:52:14.028718: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 09:52:14.068297: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 09:52:14.094965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 09:52:14.171005: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 09:52:14.171229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-10 09:52:14.173103: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-10 09:52:14.199127: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18498c47600 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-10 09:52:14.199354: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-10 09:52:14.200187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce RTX 2080 computeCapability: 7.5
coreClock: 1.71GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.23GiB/s
2020-07-10 09:52:14.200339: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-10 09:52:14.200510: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-10 09:52:14.200624: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-10 09:52:14.200728: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-10 09:52:14.200834: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-10 09:52:14.200937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-10 09:52:14.201032: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-10 09:52:14.201209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-10 09:52:15.580864: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-10 09:52:15.581199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-07-10 09:52:15.581289: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-07-10 09:52:15.582336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6609 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:08:00.0, compute capability: 7.5)
2020-07-10 09:52:15.586161: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x184c2fd8b50 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-10 09:52:15.586307: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
Model: "EncoderModel"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(4, None, 20)] 0
_________________________________________________________________
EncoderBlock (EncoderBlock) (4, None, 100) 370000
=================================================================
Total params: 370,000
Trainable params: 370,000
Non-trainable params: 0
_________________________________________________________________
None
Model: "PredictorModel"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(4, None, 28)] 0
_________________________________________________________________
PredictionBlock (PredictionB (4, None, 100) 212400
=================================================================
Total params: 212,400
Trainable params: 212,400
Non-trainable params: 0
_________________________________________________________________
None
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(4, None, 20)] 0
__________________________________________________________________________________________________
input_2 (InputLayer) [(4, None, 28)] 0
__________________________________________________________________________________________________
EncoderBlock (EncoderBlock) (4, None, 100) 370000 input_1[0][0]
__________________________________________________________________________________________________
PredictionBlock (PredictionBloc (4, None, 100) 212400 input_2[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims (TensorF [(4, None, 1, 100)] 0 EncoderBlock[0][0]
__________________________________________________________________________________________________
tf_op_layer_ExpandDims_1 (Tenso [(4, 1, None, 100)] 0 PredictionBlock[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [(4, None, None, 100 0 tf_op_layer_ExpandDims[0][0]
tf_op_layer_ExpandDims_1[0][0]
__________________________________________________________________________________________________
time_distributed (TimeDistribut (None, None, None, 1 10100 tf_op_layer_AddV2[0][0]
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, None, None, 2 2828 time_distributed[0][0]
==================================================================================================
Total params: 595,328
Trainable params: 595,328
Non-trainable params: 0
__________________________________________________________________________________________________
None
Epoch 1/10
(4, 391, 172, 28)
(4, 172)
(4, 1)
(4, 1)
Traceback (most recent call last):
File "run_model.py", line 74, in <module>
train(t_model)
File "run_model.py", line 67, in train
loss = train_step(t_model, t_data, optimizer)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\def_function.py", line 580, in __call__
result = self._call(*args, **kwds)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\def_function.py", line 627, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\def_function.py", line 506, in _initialize
*args, **kwds))
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\function.py", line 2446, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\function.py", line 2777, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\function.py", line 2667, in _create_graph_function
capture_by_value=self._capture_by_value),
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\func_graph.py", line 981, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\eager\def_function.py", line 441, in wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
File "C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\func_graph.py", line 968, in wrapper
raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
run_model.py:54 train_step *
loss = loss_fn(logits, labels, label_lens, mfcc_lens)
run_model.py:42 loss_fn *
return rnnt_loss(logits, labels, label_length, logit_length)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\rnnt-0.0.5-py3.7.egg\rnnt\rnnt.py:195 compute_rnnt_loss_and_grad *
result = compute_rnnt_loss_and_grad_helper(**kwargs)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\rnnt-0.0.5-py3.7.egg\rnnt\rnnt.py:112 compute_rnnt_loss_and_grad_helper *
blank_probs, truth_probs = transition_probs(one_hot_labels, log_probs)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\rnnt-0.0.5-py3.7.egg\rnnt\rnnt.py:36 transition_probs *
truth_probs = tf.reduce_sum(tf.multiply(log_probs[:, :, :-1, :], one_hot_labels), axis=-1)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\util\dispatch.py:180 wrapper **
return target(*args, **kwargs)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\ops\math_ops.py:381 multiply
return gen_math_ops.mul(x, y, name)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\ops\gen_math_ops.py:6092 mul
"Mul", x=x, y=y, name=name)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\op_def_library.py:744 _apply_op_helper
attrs=attr_protos, op_def=op_def)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\func_graph.py:595 _create_op_internal
compute_device)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\ops.py:3327 _create_op_internal
op_def=op_def)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\ops.py:1817 __init__
control_input_ops, op_def)
C:\Users\jtdut\anaconda3\envs\rnnt\lib\site-packages\tensorflow\python\framework\ops.py:1657 _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 171 and 172 for '{{node rnnt_loss/Mul}} = Mul[T=DT_FLOAT](rnnt_loss/strided_slice_1, rnnt_loss/one_hot)' with input shapes: [4,391,171,28], [4,391,172,28].
I got this error.
This multiply operator would fail when input_max_len != (target_max_len-1).
Basically, labels is batch x (target_max_len-1). When converted to one_hot_labels, it becomes batch x (target_max_len-1) x (target_max_len-1) x vocab_size.
logits is batch x input_max_len x target_max_len x vocab_size.
So when we do tf.multiply(log_probs[:, :, :-1, :], one_hot_labels), it should fail if input_max_len != (target_max_len-1).
Our test cases succeed only because input_max_len == (target_max_len-1) in all of them, i.e. input_max_len = 5 and target_max_len = 6.
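 
For reference, the two shapes in the error message, [4,391,171,28] and [4,391,172,28], differ only on the label axis, which suggests that the labels tensor was 172 wide where the sliced log-probabilities expect target_max_len - 1 = 171. The NumPy sketch below reproduces that check on small shapes. It assumes the one-hot labels are broadcast along the time axis, as the one_hot shape in the traceback suggests; truth_log_probs is a hypothetical stand-in and not the package's transition_probs.

```python
import numpy as np


def truth_log_probs(log_probs, labels):
    """Pick the log-probability of each target label at every (t, u) position.

    log_probs: (batch, input_max_len, target_max_len, vocab_size)
    labels:    (batch, target_max_len - 1), integer class ids
    Hypothetical helper that mirrors the multiply shown in the traceback.
    """
    batch, input_max_len, target_max_len, vocab_size = log_probs.shape
    expected = (batch, target_max_len - 1)
    if labels.shape != expected:
        raise ValueError(f"labels has shape {labels.shape}, expected {expected}")
    # One-hot encode the labels: (batch, target_max_len - 1, vocab_size).
    one_hot = np.eye(vocab_size, dtype=log_probs.dtype)[labels]
    # Add a time axis of size 1 so the product broadcasts over input_max_len.
    one_hot = one_hot[:, np.newaxis, :, :]
    return np.sum(log_probs[:, :, :-1, :] * one_hot, axis=-1)


# Scaled-down shapes: input_max_len (9) differs from target_max_len - 1 (5).
log_probs = np.zeros((4, 9, 6, 28))
good_labels = np.zeros((4, 5), dtype=int)   # width target_max_len - 1
bad_labels = np.zeros((4, 6), dtype=int)    # width target_max_len, as in the log

print(truth_log_probs(log_probs, good_labels).shape)
try:
    truth_log_probs(log_probs, bad_labels)
except ValueError as err:
    print("rejected:", err)
```

Under this assumption the time axis never has to match the label axis; only the width of labels relative to the third axis of the logits matters.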
Hi, I have studied the source code, but the matrix operations are used in such an unusual way that they really confuse me. Could you kindly point to a good reference on the algorithm that explains the diagonal-matrix formulation? Now, to the problem. In the example:
pred_loss, pred_grads = loss_grad_gradtape(logits, labels, label_lengths, logit_lengths)
Is pred_loss meant to be the loss function for a TensorFlow model? What is pred_grads used for?
When I checked the source code, I found that the loss is
loss = -final_state_probs
and
final_state_probs = beta[:, 0, 0]
so the loss comes only from backward_dp(), with no connection to forward_dp(). I therefore think pred_loss can't simply be used in a TensorFlow model. What is the correct training method for TensorFlow? Is the following correct?
logits = some_deep_network(...)
pred_loss, pred_grads = loss_grad_gradtape(logits, labels, label_lengths, logit_lengths)
rnnt_model = tf.keras.Model(inputs=[logits, labels, label_lengths, logit_lengths], outputs=pred_loss)
rnnt_model.compile(optimizer='adam', loss=lambda y_true, y_pred: y_pred)
rnnt_model.fit(...)
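On the point about the loss depending only on backward_dp(): in the standard RNN-T lattice recursion, the total log-likelihood can be read either from the forward variable at the final node or from the backward variable at the start node, and the two values agree. The NumPy sketch below illustrates this for a single utterance with plain loops. It is a generic illustration under that standard formulation, not the repository's vectorized implementation, and all names in it are hypothetical.

```python
import numpy as np


def forward_log_likelihood(blank, emit):
    """blank[t, u]: log-prob of blank at (t, u), shape (T, U + 1).
    emit[t, u]: log-prob of emitting label u + 1 at (t, u), shape (T, U)."""
    T, U1 = blank.shape
    alpha = np.full((T, U1), -np.inf)
    alpha[0, 0] = 0.0
    for t in range(T):
        for u in range(U1):
            if t > 0:  # arrive by emitting blank from (t - 1, u)
                alpha[t, u] = np.logaddexp(alpha[t, u], alpha[t - 1, u] + blank[t - 1, u])
            if u > 0:  # arrive by emitting a label from (t, u - 1)
                alpha[t, u] = np.logaddexp(alpha[t, u], alpha[t, u - 1] + emit[t, u - 1])
    # Finish with the final blank out of the last node.
    return alpha[T - 1, U1 - 1] + blank[T - 1, U1 - 1]


def backward_log_likelihood(blank, emit):
    """Same quantities as above, accumulated from the last node to the first."""
    T, U1 = blank.shape
    beta = np.full((T, U1), -np.inf)
    beta[T - 1, U1 - 1] = blank[T - 1, U1 - 1]
    for t in range(T - 1, -1, -1):
        for u in range(U1 - 1, -1, -1):
            if t < T - 1:  # leave by emitting blank towards (t + 1, u)
                beta[t, u] = np.logaddexp(beta[t, u], beta[t + 1, u] + blank[t, u])
            if u < U1 - 1:  # leave by emitting a label towards (t, u + 1)
                beta[t, u] = np.logaddexp(beta[t, u], beta[t, u + 1] + emit[t, u])
    return beta[0, 0]


rng = np.random.default_rng(0)
blank = np.log(rng.uniform(0.1, 0.9, size=(5, 4)))
emit = np.log(rng.uniform(0.1, 0.9, size=(5, 3)))
print(forward_log_likelihood(blank, emit), backward_log_likelihood(blank, emit))
```

Both functions sum over the same set of alignment paths, so a loss of the form -beta[0, 0] is a complete negative log-likelihood on its own; the forward variables are typically needed only when the gradient is computed analytically.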
Are there benchmarks with other implementations?
logits shape: (10, 79, 18, 6485)
labels shape (10, 17)
labels_length: tf.Tensor([ 2 14 9 9 9 13 17 9 9 17], shape=(10,), dtype=int64)
logit_length : tf.Tensor([20 47 36 35 41 58 64 38 45 78], shape=(10,), dtype=int64)
I got this error:
2020-12-11 10:48:09.516460: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at scatter_nd_op.cc:133 : Invalid argument: indices[0,0,2] = [0, 0, 2, -1] does not index into shape [10,79,18,6484]
Traceback (most recent call last):
File "/home/dapeng/PycharmProjects/convTT/train.py", line 68, in <module>
train(model, train_set, optimizer, train_loss, epoch)
File "/home/dapeng/PycharmProjects/convTT/train.py", line 33, in train
label_length=labels_length)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/rnnt/rnnt.py", line 204, in rnnt_loss
return compute_rnnt_loss_and_grad(*args)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/ops/custom_gradient.py", line 264, in __call__
return self._d(self._f, a, k)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/ops/custom_gradient.py", line 218, in decorated
return _eager_mode_decorator(wrapped, args, kwargs)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/ops/custom_gradient.py", line 412, in _eager_mode_decorator
result, grad_fn = f(*args, **kwargs)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/rnnt/rnnt.py", line 195, in compute_rnnt_loss_and_grad
result = compute_rnnt_loss_and_grad_helper(**kwargs)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/rnnt/rnnt.py", line 168, in compute_rnnt_loss_and_grad_helper
[batch_size, input_max_len, target_max_len, vocab_size - 1])
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8842, in scatter_nd
indices, updates, shape, name=name, ctx=_ctx)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8885, in scatter_nd_eager_fallback
attrs=_attrs, ctx=ctx, name=name)
File "/home/dapeng/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0,2] = [0, 0, 2, -1] does not index into shape [10,79,18,6484] [Op:ScatterNd]
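The failing index [0, 0, 2, -1] is consistent with a padded label of 0 being shifted by the 'labels - 1' step: labels_length[0] is 2, so position 2 of the first sample is padding. The sketch below lists the positions where a shifted label falls outside the valid range. It assumes that 0 is reserved for blank or padding and that shifted labels index a last axis of size vocab_size - 1, as the scatter_nd shape in the traceback suggests. The helper name is hypothetical and not part of the rnnt package.

```python
import numpy as np


def out_of_range_positions(labels, label_lengths, vocab_size):
    """Return (batch, position, shifted_label, is_padding) for every entry of
    labels whose shifted value (label - 1) cannot index an axis of size
    vocab_size - 1. Hypothetical diagnostic helper."""
    labels = np.asarray(labels)
    label_lengths = np.asarray(label_lengths)
    shifted = labels - 1
    bad = (shifted < 0) | (shifted >= vocab_size - 1)
    positions = np.argwhere(bad)  # rows of (batch index, label position)
    # A position at or beyond the sample's true length is padding.
    is_padding = positions[:, 1] >= label_lengths[positions[:, 0]]
    return [
        (int(b), int(u), int(shifted[b, u]), bool(p))
        for (b, u), p in zip(positions, is_padding)
    ]


# Toy batch: the first sample has two real labels followed by zero padding.
labels = np.array([[5, 7, 0, 0],
                   [1, 2, 3, 4]])
label_lengths = np.array([2, 4])
for entry in out_of_range_positions(labels, label_lengths, vocab_size=10):
    print(entry)
```

If every reported position is padding, the shift itself is behaving as intended for real labels, and the question becomes whether padded positions should be masked or padded with a different value before the loss is called.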
Hello, I like your work. I wonder if you can support this package for the newer version of tensorflow and tf-nightly?