lucko515 / speech-recognition-neural-network

This is an end-to-end speech recognition neural network implemented in Keras. It was my final project for the Artificial Intelligence Nanodegree @ Udacity.

Topics: aind, recurrent-neural-networks, speech-recognition, deep-learning, gru, lstm-neural-networks

speech-recognition-neural-network's Introduction

Project Overview

In this notebook, you will build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline!

[Figure: the ASR pipeline]

We begin by investigating the LibriSpeech dataset that will be used to train and evaluate your models. Your algorithm will first convert any raw audio to feature representations that are commonly used for ASR. You will then move on to building neural networks that can map these audio features to transcribed text. After learning about the basic types of layers that are often used for deep learning-based approaches to ASR, you will engage in your own investigations by creating and testing your own state-of-the-art models. Throughout the notebook, we provide recommended research papers for additional reading and links to GitHub repositories with interesting implementations.
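
For example, the feature-extraction step can turn raw audio into MFCC vectors. A minimal sketch, assuming the python_speech_features package is installed and that sample.wav is a 16 kHz LibriSpeech recording (both names are placeholders, not files shipped with the project):

    import scipy.io.wavfile as wav
    from python_speech_features import mfcc

    # Read a wav file and compute 13 MFCC coefficients per frame.
    rate, signal = wav.read("sample.wav")
    features = mfcc(signal, samplerate=rate, numcep=13)
    print(features.shape)  # (timesteps, 13)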

Project Instructions

Getting Started

  1. Clone the repository, and navigate to the downloaded folder.
git clone https://github.com/udacity/AIND-VUI-Capstone.git
cd AIND-VUI-Capstone
  2. Create (and activate) a new environment with Python 3.5 and the numpy package.

    • Linux or Mac:
    conda create --name aind-vui python=3.5 numpy
    source activate aind-vui
    
    • Windows:
    conda create --name aind-vui python=3.5 numpy scipy
    activate aind-vui
    
  3. Install TensorFlow.

    • Option 1: To install TensorFlow with GPU support, follow the guide to install the necessary NVIDIA software on your system. If you are using the Udacity AMI, you can skip this step and only need to install the tensorflow-gpu package:
    pip install tensorflow-gpu==1.1.0
    
    • Option 2: To install TensorFlow with CPU support only,
    pip install tensorflow==1.1.0
    
  4. Install a few pip packages.

pip install -r requirements.txt
  5. Switch the Keras backend to TensorFlow.

    • Linux or Mac:
    KERAS_BACKEND=tensorflow python -c "from keras import backend"
    
    • Windows:
    set KERAS_BACKEND=tensorflow
    python -c "from keras import backend"
    
  6. Obtain the libav package.

    • Linux: sudo apt-get install libav-tools
    • Mac: brew install libav
    • Windows: Browse to the Libav website
      • Scroll down to "Windows Nightly and Release Builds" and click on the appropriate link for your system (32-bit or 64-bit).
      • Click nightly-gpl.
      • Download the most recent archive file.
      • Extract the file. Move the usr directory to your C: drive.
      • Go back to your terminal window from above.
    rename C:\usr avconv
    set PATH=C:\avconv\bin;%PATH%
    
  7. Obtain the appropriate subsets of the LibriSpeech dataset, and convert all flac files to wav format. (A rough Python equivalent of the conversion script is sketched after this step.)

    • Linux or Mac:
    wget http://www.openslr.org/resources/12/dev-clean.tar.gz
    tar -xzvf dev-clean.tar.gz
    wget http://www.openslr.org/resources/12/test-clean.tar.gz
    tar -xzvf test-clean.tar.gz
    mv flac_to_wav.sh LibriSpeech
    cd LibriSpeech
    ./flac_to_wav.sh
    
    • Windows: Download dev-clean.tar.gz and test-clean.tar.gz (the same URLs as above) via browser and save them in the AIND-VUI-Capstone directory. Extract them with an application compatible with tar and gz archives, such as 7-Zip or WinZip. Then convert the files from your terminal window.
    move flac_to_wav.sh LibriSpeech
    cd LibriSpeech
    powershell ./flac_to_wav.sh
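
For reference, a rough Python equivalent of what flac_to_wav.sh is assumed to do (walk the LibriSpeech tree and convert each .flac file to a .wav file next to it using avconv):

    import os
    import subprocess

    # Convert every .flac file under LibriSpeech/ to .wav alongside the original.
    for root, _, files in os.walk("LibriSpeech"):
        for name in files:
            if name.endswith(".flac"):
                src = os.path.join(root, name)
                subprocess.run(["avconv", "-i", src, src[:-5] + ".wav"], check=True)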
    
  8. Create JSON files corresponding to the train and validation datasets.

cd ..
python create_desc_json.py LibriSpeech/dev-clean/ train_corpus.json
python create_desc_json.py LibriSpeech/test-clean/ valid_corpus.json
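
A quick sanity check of the generated corpus files; each line is assumed to be a JSON object with "key" (wav path), "duration", and "text" fields, following the description format of the ba-dls-deepspeech repository these scripts come from:

    import json

    # Inspect the first entry of the training corpus.
    with open("train_corpus.json") as f:
        entry = json.loads(f.readline())
    print(entry["key"], entry["duration"], entry["text"])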
  9. Create an IPython kernel for the aind-vui environment. Open the notebook.
python -m ipykernel install --user --name aind-vui --display-name "aind-vui"
jupyter notebook vui_notebook.ipynb
  10. Before running code, change the kernel to match the aind-vui environment by using the drop-down menu. Then, follow the instructions in the notebook.

[Screenshot: select the aind-vui kernel]

NOTE: While some code has already been implemented to get you started, you will need to implement additional functionality to successfully answer all of the questions included in the notebook. Unless requested, do not modify code that has already been included.

Amazon Web Services

If you do not have access to a local GPU, you could use Amazon Web Services to launch an EC2 GPU instance. Please refer to the Udacity instructions for setting up a GPU instance for this project.

Evaluation

Your project will be reviewed by a Udacity reviewer against the project rubric below. Review this rubric thoroughly, and self-evaluate your project before submission. All criteria found in the rubric must meet specifications for you to pass.

Project Submission

When you are ready to submit your project, collect the following files and compress them into a single archive for upload:

  • The vui_notebook.ipynb file with fully functional code, all code cells executed and displaying output, and all questions answered.
  • An HTML or PDF export of the project notebook with the name report.html or report.pdf.
  • The sample_models.py file with all model architectures that were trained in the project Jupyter notebook.
  • The results/ folder containing all HDF5 and pickle files corresponding to trained models.

Alternatively, your submission could consist of the GitHub link to your repository.

Project Rubric

Files Submitted

  • Submission Files: The submission includes all required files.

STEP 2: Model 0: RNN

  • Trained Model 0: The submission trained the model for at least 20 epochs, and none of the loss values in model_0.pickle are undefined. The trained weights for the model specified in simple_rnn_model are stored in model_0.h5.

STEP 2: Model 1: RNN + TimeDistributed Dense

  • Completed rnn_model Module: The submission includes a sample_models.py file with a completed rnn_model module containing the correct architecture.
  • Trained Model 1: The submission trained the model for at least 20 epochs, and none of the loss values in model_1.pickle are undefined. The trained weights for the model specified in rnn_model are stored in model_1.h5.
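
For orientation, a hedged sketch of one plausible rnn_model completion (GRU, batch normalization, TimeDistributed Dense, softmax). This is an illustration of the named architecture, not the project's reference solution:

    from keras.models import Model
    from keras.layers import (Input, GRU, BatchNormalization,
                              TimeDistributed, Dense, Activation)

    def rnn_model(input_dim, units, activation, output_dim=29):
        # Acoustic features arrive with shape (batch, timesteps, input_dim).
        input_data = Input(name='the_input', shape=(None, input_dim))
        simp_rnn = GRU(units, activation=activation,
                       return_sequences=True, name='rnn')(input_data)
        bn_rnn = BatchNormalization(name='bn_rnn')(simp_rnn)
        # The same Dense layer is applied independently at every timestep.
        time_dense = TimeDistributed(Dense(output_dim))(bn_rnn)
        y_pred = Activation('softmax', name='softmax')(time_dense)
        model = Model(inputs=input_data, outputs=y_pred)
        # The training utilities expect an output_length attribute; an RNN
        # leaves the sequence length unchanged.
        model.output_length = lambda x: x
        return model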

STEP 2: Model 2: CNN + RNN + TimeDistributed Dense

  • Completed cnn_rnn_model Module: The submission includes a sample_models.py file with a completed cnn_rnn_model module containing the correct architecture.
  • Trained Model 2: The submission trained the model for at least 20 epochs, and none of the loss values in model_2.pickle are undefined. The trained weights for the model specified in cnn_rnn_model are stored in model_2.h5.

STEP 2: Model 3: Deeper RNN + TimeDistributed Dense

  • Completed deep_rnn_model Module: The submission includes a sample_models.py file with a completed deep_rnn_model module containing the correct architecture.
  • Trained Model 3: The submission trained the model for at least 20 epochs, and none of the loss values in model_3.pickle are undefined. The trained weights for the model specified in deep_rnn_model are stored in model_3.h5.

STEP 2: Model 4: Bidirectional RNN + TimeDistributed Dense

  • Completed bidirectional_rnn_model Module: The submission includes a sample_models.py file with a completed bidirectional_rnn_model module containing the correct architecture.
  • Trained Model 4: The submission trained the model for at least 20 epochs, and none of the loss values in model_4.pickle are undefined. The trained weights for the model specified in bidirectional_rnn_model are stored in model_4.h5.
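
Similarly, a hedged sketch of a bidirectional variant, again an illustration rather than the reference solution: the recurrent layer is wrapped in Keras's Bidirectional wrapper so that each timestep sees both past and future context.

    from keras.models import Model
    from keras.layers import (Input, GRU, Bidirectional,
                              TimeDistributed, Dense, Activation)

    def bidirectional_rnn_model(input_dim, units, output_dim=29):
        input_data = Input(name='the_input', shape=(None, input_dim))
        # Forward and backward GRU passes, concatenated at each timestep.
        bidir_rnn = Bidirectional(GRU(units, return_sequences=True),
                                  name='bidir_rnn')(input_data)
        time_dense = TimeDistributed(Dense(output_dim))(bidir_rnn)
        y_pred = Activation('softmax', name='softmax')(time_dense)
        model = Model(inputs=input_data, outputs=y_pred)
        model.output_length = lambda x: x
        return model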

STEP 2: Compare the Models

  • Question 1: The submission includes a detailed analysis of why different models might perform better than others.

STEP 2: Final Model

  • Completed final_model Module: The submission includes a sample_models.py file with a completed final_model module containing a final architecture that is not identical to any of the previous architectures.
  • Trained Final Model: The submission trained the model for at least 20 epochs, and none of the loss values in model_end.pickle are undefined. The trained weights for the model specified in final_model are stored in model_end.h5.
  • Question 2: The submission includes a detailed description of how the final model architecture was designed.

Suggestions to Make your Project Stand Out!

(1) Add a Language Model to the Decoder

The performance of the decoding step can be greatly enhanced by incorporating a language model. Build your own language model from scratch, or leverage a repository or toolkit that you find online to improve your predictions.
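
One simple way to do this is shallow fusion at decoding time: add a weighted language-model score to each beam-search candidate. A hedged sketch, where alpha and beta are illustrative tuning parameters rather than values from this project:

    # Score a candidate transcription: acoustic log-probability plus a
    # weighted language-model log-probability plus a word-count bonus.
    def rescore(acoustic_logprob, lm_logprob, n_words, alpha=0.8, beta=1.0):
        return acoustic_logprob + alpha * lm_logprob + beta * n_words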

(2) Train on Bigger Data

In the project, you used some of the smaller downloads from the LibriSpeech corpus. Try training your model on larger datasets: instead of using dev-clean.tar.gz, download one of the larger training sets from the website.

(3) Try out Different Audio Features

In this project, you had the choice to use either spectrogram or MFCC features. Take the time to test the performance of both of these features. For a special challenge, train a network that uses raw audio waveforms!

Special Thanks

We have borrowed the create_desc_json.py and flac_to_wav.sh files from the ba-dls-deepspeech repository, along with some functions used to generate spectrograms.

speech-recognition-neural-network's People

Contributors

lucko515


speech-recognition-neural-network's Issues

Layer softmax was called with an input that isn't a symbolic tensor.

Windows 10
Python: 3.7.5
tensorflow: 2.0.0
Keras: 2.3.1

model_1 = rnn_model(input_dim=13,  # change to 13 if you would like to use MFCC features
                    units=200,
                    activation='relu')

This error occurs:


ValueError Traceback (most recent call last)
~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in assert_input_compatibility(self, inputs)
309 try:
--> 310 K.is_keras_tensor(x)
311 except ValueError:

~\AppData\Roaming\Python\Python37\site-packages\keras\backend\tensorflow_backend.py in is_keras_tensor(x)
696 raise ValueError('Unexpectedly found an instance of type ' +
--> 697 str(type(x)) + '. '
698 'Expected a symbolic tensor instance.')

ValueError: Unexpectedly found an instance of type <class 'ellipsis'>. Expected a symbolic tensor instance.

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
1 model_1 = rnn_model(input_dim=13, # change to 13 if you would like to use MFCC features
2 units=200,
----> 3 activation='relu')

~\STT\AIND-VUI-Capstone\sample_models.py in rnn_model(input_dim, units, activation, output_dim)
33 time_dense = ...
34 # Add softmax activation layer
---> 35 y_pred = Activation('softmax', name='softmax')(time_dense)
36 # Specify the model
37 model = Model(inputs=input_data, outputs=y_pred)

~\AppData\Roaming\Python\Python37\site-packages\keras\backend\tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
73 if _SYMBOLIC_SCOPE.value:
74 with get_graph().as_default():
---> 75 return func(*args, **kwargs)
76 else:
77 return func(*args, **kwargs)

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in call(self, inputs, **kwargs)
444 # Raise exceptions in case the input is not compatible
445 # with the input_spec specified in the layer constructor.
--> 446 self.assert_input_compatibility(inputs)
447
448 # Collect input shapes to build layer.

~\AppData\Roaming\Python\Python37\site-packages\keras\engine\base_layer.py in assert_input_compatibility(self, inputs)
314 'Received type: ' +
315 str(type(x)) + '. Full input: ' +
--> 316 str(inputs) + '. All inputs to the layer '
317 'should be tensors.')
318

ValueError: Layer softmax was called with an input that isn't a symbolic tensor. Received type: <class 'ellipsis'>. Full input: [Ellipsis]. All inputs to the layer should be tensors.

I have made the necessary changes to accommodate newer versions of TensorFlow, Keras, and so on.
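
(The traceback above already points at the cause: time_dense = ... in sample_models.py is the unfilled TODO placeholder, a Python Ellipsis, so the softmax layer receives an Ellipsis rather than a Keras tensor. Completing that line, for example with a TimeDistributed(Dense(...)) layer as sketched under Model 1 above, should resolve the error regardless of the TensorFlow/Keras version.)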

Error

I am trying to run the code, but there is a problem after model_1 = rnn_model(...). Could you tell me how to solve this problem, please?

UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes in position 54-55: truncated \uXXXX

This is really great work first of all.

I do see a bug when I get to train_model(...) step.

I'm using Windows 10, my Keras version is 2.0.5, and my TensorFlow version is 1.2.1.

The stack trace is as follows:

Epoch 1/20
100/101 [============================>.] - ETA: 0s - loss: 876.5908
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-7-8b7039598497> in <module>()
      2             pickle_path='model_0.pickle',
      3             save_model_path='model_0.h5',
----> 4             spectrogram=True) # change to False if you would like to use MFCC features

~\Deep Learning\AIND-VUI-Capstone\train_utils.py in train_model(input_to_softmax, pickle_path, save_model_path, train_json, valid_json, minibatch_size, spectrogram, mfcc_dim, optimizer, epochs, verbose, sort_by_duration, max_duration)
     74     hist = model.fit_generator(generator=audio_gen.next_train(), steps_per_epoch=steps_per_epoch,
     75         epochs=epochs, validation_data=audio_gen.next_valid(), validation_steps=validation_steps,
---> 76         callbacks=[checkpointer], verbose=verbose)
     77 
     78     # save model loss

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
     86                 warnings.warn('Update your `' + object_name +
     87                               '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 88             return func(*args, **kwargs)
     89         wrapper._legacy_support_signature = inspect.getargspec(func)
     90         return wrapper

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_q_size, workers, pickle_safe, initial_epoch)
   1937                             epoch_logs['val_' + l] = o
   1938 
-> 1939                 callbacks.on_epoch_end(epoch, epoch_logs)
   1940                 epoch += 1
   1941                 if callback_model.stop_training:

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\callbacks.py in on_epoch_end(self, epoch, logs)
     75         logs = logs or {}
     76         for callback in self.callbacks:
---> 77             callback.on_epoch_end(epoch, logs)
     78 
     79     def on_batch_begin(self, batch, logs=None):

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\callbacks.py in on_epoch_end(self, epoch, logs)
    426                     self.model.save_weights(filepath, overwrite=True)
    427                 else:
--> 428                     self.model.save(filepath, overwrite=True)
    429 
    430 

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\engine\topology.py in save(self, filepath, overwrite, include_optimizer)
   2504         """
   2505         from ..models import save_model
-> 2506         save_model(self, filepath, overwrite, include_optimizer)
   2507 
   2508     def save_weights(self, filepath, overwrite=True):

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\models.py in save_model(model, filepath, overwrite, include_optimizer)
    104     f.attrs['model_config'] = json.dumps({
    105         'class_name': model.__class__.__name__,
--> 106         'config': model.get_config()
    107     }, default=get_json_type).encode('utf8')
    108 

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\engine\topology.py in get_config(self)
   2320         for layer in self.layers:  # From the earliest layers on.
   2321             layer_class_name = layer.__class__.__name__
-> 2322             layer_config = layer.get_config()
   2323             filtered_inbound_nodes = []
   2324             for original_node_index, node in enumerate(layer.inbound_nodes):

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\layers\core.py in get_config(self)
    659     def get_config(self):
    660         if isinstance(self.function, python_types.LambdaType):
--> 661             function = func_dump(self.function)
    662             function_type = 'lambda'
    663         else:

~\AppData\Local\conda\conda\envs\aind-vui\lib\site-packages\keras\utils\generic_utils.py in func_dump(func)
    174         A tuple `(code, defaults, closure)`.
    175     """
--> 176     code = marshal.dumps(func.__code__).decode('raw_unicode_escape')
    177     defaults = func.__defaults__
    178     if func.__closure__:

UnicodeDecodeError: 'rawunicodeescape' codec can't decode bytes in position 54-55: truncated \uXXXX

It's possible that this is a bug in Keras as per this issue:
https://stackoverflow.com/questions/41847376/keras-model-to-json-error-rawunicodeescape-codec-cant-decode-bytes-in-posi
keras-team/keras#8572

I'm not sure what to try next. Anyone else seeing this issue?

Kernel Error

jupyter notebook, getting error on kernel connection.

File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/site-packages/tornado/platform/asyncio.py", line 148, in start self.asyncio_loop.run_forever() File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/asyncio/base_events.py", line 408, in run_forever raise RuntimeError('This event loop is already running') RuntimeError: This event loop is already running [I 10:22:07.220 NotebookApp] KernelRestarter: restarting kernel (4/5), keep random ports Traceback (most recent call last): File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/runpy.py", line 85, in _run_code exec(code, run_globals) File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/site-packages/ipykernel_launcher.py", line 16, in <module> app.launch_new_instance() File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance app.start() File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 477, in start ioloop.IOLoop.instance().start() File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/site-packages/tornado/platform/asyncio.py", line 148, in start self.asyncio_loop.run_forever() File "/home/ubuntu/anaconda3/envs/aind-vui/lib/python3.5/asyncio/base_events.py", line 408, in run_forever raise RuntimeError('This event loop is already running') RuntimeError: This event loop is already running [W 10:22:10.245 NotebookApp] KernelRestarter: restart failed [W 10:22:10.246 NotebookApp] Kernel 7008793e-d30b-4056-b45f-854269c1b400 died, removing from map. [W 10:22:41.367 NotebookApp] Timeout waiting for kernel_info reply from 8c712274-6346-4a59-acdb-b48b57744973 **[E 10:22:41.372 NotebookApp] Error opening stream: HTTP 404: Not Found (Kernel does not exist: 8c712274-6346-4a59-acdb-b48b57744973) [W 10:22:41.391 NotebookApp] 404 GET /api/kernels/7008793e-d30b-4056-b45f-854269c1b400/channels?session_id=a0a7cb01c3c644a3a135bedafc1996ea (127.0.0.1): Kernel does not exist: 7008793e-d30b-4056-b45f-854269c1b400 [W 10:22:41.424 NotebookApp] 404 GET /api/kernels/7008793e-d30b-4056-b45f-854269c1b400/channels?session_id=a0a7cb01c3c644a3a135bedafc1996ea (127.0.0.1) 39.32ms referer=None [W 10:22:42.438 NotebookApp] Replacing stale connection: 7008793e-d30b-4056-b45f-854269c1b400:a0a7cb01c3c644a3a135bedafc1996ea [3126:3126:0513/102307.294600:ERROR:gcm_channel_status_request.cc(151)] GCM channel request failed. [3126:3126:0513/102335.242424:ERROR:gcm_channel_status_request.cc(151)] GCM channel request failed. [I 10:23:40.856 NotebookApp] Saving file at /vui_notebook.ipynb [3126:3126:0513/102428.160312:ERROR:gcm_channel_status_request.cc(151)] GCM channel request failed. **

Validation loss fluctuating

I am trying to train model_end with a few hyperparameter tuning changes. I am also training on my own dataset of 600 wave files, split as 10% test, 10% validation, and 80% training data.

def train_model(input_to_softmax,
                pickle_path,
                save_model_path,
                train_json='train_corpus.json',
                valid_json='valid_corpus.json',
                test_json='test_corpus.json',
                minibatch_size=8,
                spectrogram=True,
                mfcc_dim=13,
                optimizer=Adam(lr=0.0001, decay=1e-6),
                epochs=3000,
                verbose=1,
                sort_by_duration=False,
                max_duration=18.0):

Model summary:

Layer (type)                          Output Shape        Param #
=================================================================
the_input (InputLayer)                (None, None, 13)    0
layer_1_conv (Conv1D)                 (None, None, 100)   39100
conv_batch_norm (BatchNormalization)  (None, None, 100)   400
rnn_1 (GRU)                           (None, None, 100)   60300
bt_rnn_1 (BatchNormalization)         (None, None, 100)   400
rnn_bi (GRU)                          (None, None, 100)   60300
bt_rnn_bi (BatchNormalization)        (None, None, 100)   400
time_distributed_6 (TimeDistributed)  (None, None, 29)    2929
softmax (Activation)                  (None, None, 29)    0
=================================================================
Total params: 163,829
Trainable params: 163,229
Non-trainable params: 600

First 20 epochs:
Epoch 1/3000
69/70 [============================>.] - ETA: 0s - loss: 828.7257 - acc: 0.0000e+00Epoch 00000: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 824.4764 - acc: 0.0000e+00 - val_loss: 1025.0829 - val_acc: 0.0000e+00
Epoch 2/3000
69/70 [============================>.] - ETA: 0s - loss: 576.0773 - acc: 0.0000e+00Epoch 00001: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 574.0380 - acc: 0.0000e+00 - val_loss: 965.9543 - val_acc: 0.0000e+00
Epoch 3/3000
69/70 [============================>.] - ETA: 0s - loss: 475.5182 - acc: 0.0000e+00Epoch 00002: saving model to results/model_end.h5
70/70 [==============================] - 71s - loss: 475.8133 - acc: 0.0000e+00 - val_loss: 840.2710 - val_acc: 0.0000e+00
Epoch 4/3000
69/70 [============================>.] - ETA: 0s - loss: 446.0698 - acc: 0.0000e+00Epoch 00003: saving model to results/model_end.h5
70/70 [==============================] - 70s - loss: 446.3768 - acc: 0.0000e+00 - val_loss: 649.6033 - val_acc: 0.0000e+00
Epoch 5/3000
69/70 [============================>.] - ETA: 0s - loss: 420.5421 - acc: 0.0000e+00Epoch 00004: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 420.1927 - acc: 0.0000e+00 - val_loss: 505.9156 - val_acc: 0.0000e+00
Epoch 6/3000
69/70 [============================>.] - ETA: 0s - loss: 412.7203 - acc: 0.0000e+00Epoch 00005: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 412.8178 - acc: 0.0000e+00 - val_loss: 450.0329 - val_acc: 0.0000e+00
Epoch 7/3000
69/70 [============================>.] - ETA: 0s - loss: 397.5730 - acc: 0.0000e+00Epoch 00006: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 398.1866 - acc: 0.0000e+00 - val_loss: 395.4442 - val_acc: 0.0000e+00
Epoch 8/3000
69/70 [============================>.] - ETA: 0s - loss: 388.5409 - acc: 0.0000e+00Epoch 00007: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 386.8799 - acc: 0.0000e+00 - val_loss: 401.9153 - val_acc: 0.0000e+00
Epoch 9/3000
69/70 [============================>.] - ETA: 0s - loss: 389.5977 - acc: 0.0000e+00Epoch 00008: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 389.4545 - acc: 0.0000e+00 - val_loss: 388.4293 - val_acc: 0.0000e+00
Epoch 10/3000
69/70 [============================>.] - ETA: 0s - loss: 378.1360 - acc: 0.0000e+00Epoch 00009: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 376.7665 - acc: 0.0000e+00 - val_loss: 407.0841 - val_acc: 0.0000e+00
Epoch 11/3000
69/70 [============================>.] - ETA: 0s - loss: 374.1938 - acc: 0.0000e+00Epoch 00010: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 374.0701 - acc: 0.0000e+00 - val_loss: 361.8077 - val_acc: 0.0000e+00
Epoch 12/3000
69/70 [============================>.] - ETA: 0s - loss: 373.0912 - acc: 0.0000e+00Epoch 00011: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 373.9879 - acc: 0.0000e+00 - val_loss: 362.8776 - val_acc: 0.0000e+00
Epoch 13/3000
69/70 [============================>.] - ETA: 0s - loss: 370.4228 - acc: 0.0000e+00Epoch 00012: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 370.9717 - acc: 0.0000e+00 - val_loss: 353.5565 - val_acc: 0.0000e+00
Epoch 14/3000
69/70 [============================>.] - ETA: 0s - loss: 363.2626 - acc: 0.0000e+00Epoch 00013: saving model to results/model_end.h5
70/70 [==============================] - 69s - loss: 364.6332 - acc: 0.0000e+00 - val_loss: 350.5256 - val_acc: 0.0000e+00
Epoch 15/3000
69/70 [============================>.] - ETA: 0s - loss: 361.7289 - acc: 0.0000e+00Epoch 00014: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 362.7544 - acc: 0.0000e+00 - val_loss: 391.8794 - val_acc: 0.0000e+00
Epoch 16/3000
69/70 [============================>.] - ETA: 0s - loss: 360.1477 - acc: 0.0000e+00Epoch 00015: saving model to results/model_end.h5
70/70 [==============================] - 70s - loss: 358.5634 - acc: 0.0000e+00 - val_loss: 389.8897 - val_acc: 0.0000e+00
Epoch 17/3000
69/70 [============================>.] - ETA: 0s - loss: 363.4254 - acc: 0.0000e+00Epoch 00016: saving model to results/model_end.h5
70/70 [==============================] - 70s - loss: 362.3484 - acc: 0.0000e+00 - val_loss: 347.7054 - val_acc: 0.0000e+00
Epoch 18/3000
69/70 [============================>.] - ETA: 0s - loss: 358.4653 - acc: 0.0000e+00Epoch 00017: saving model to results/model_end.h5
70/70 [==============================] - 70s - loss: 357.7992 - acc: 0.0000e+00 - val_loss: 382.3785 - val_acc: 0.0000e+00
Epoch 19/3000
69/70 [============================>.] - ETA: 0s - loss: 355.4213 - acc: 0.0000e+00Epoch 00018: saving model to results/model_end.h5
70/70 [==============================] - 68s - loss: 355.0019 - acc: 0.0000e+00 - val_loss: 378.6394 - val_acc: 0.0000e+00
Epoch 20/3000
69/70 [============================>.] - ETA: 0s - loss: 356.1490 - acc: 0.0000e+00Epoch 00019: saving model to results/model_end.h5
70/70 [==============================] - 70s - loss: 355.6650 - acc: 0.0000e+00 - val_loss: 369.9678 - val_acc: 0.0000e+00

As you can see, the training and validation losses are very high, and the validation loss is going up and down. Can anyone suggest how this loss can be reduced?

Create a TCN model

Hey, I am trying to create a TCN model, but it always shows a large loss in each epoch, and prediction is also not working. Can anyone help me with a TCN model implementation?

flac_to_wav.sh

Hi, I am using Windows 10 to run flac_to_wav.sh; for some reason it is not working and says the command is not recognized. Does anyone else have this issue?

How does performance change with the length of the audio?

Is this method more suited to shorter bursts of audio, or can it be employed for transcription of longer audio, say 30-40 minutes? That is, is there a considerable decay in performance when applied to longer audio files?
