Giter VIP home page Giter VIP logo

birdsong-keras's Introduction

Trainig scripts for deep convolutional neural network based audio classification in Keras

The following scripts were created for the BirdCLEF 2016 competition by Bálint Czeba and Bálint Pál Tóth.

The LifeCLEF bird identification challenge provides a largescale testbed for the system-oriented evaluation of bird species identifi- cation based on audio recordings. One of its main strength is that the data used for the evaluation is collected through Xeno-Canto, the largest network of bird sound recordists in the world. This makes the task closer to the conditions of a real-world application than previous, similar initiatives. The main novelty of the 2016-th edition of the challenge was the inclusion of soundscape recordings in addition to the usual xeno-canto recordings that focus on a single foreground species. This paper reports the methodology of the conducted evaluation, the overview of the systems experimented by the 6 participating research groups and a synthetic analysis of the obtained results. (More details: http://www.imageclef.org/lifeclef/2016/bird)

With some tweeks (reading meta-data and modifing network structure / how the spectogram is preprocessed) it is possible to apply it to arbitrary audio classification problems.

Citation

Please cite the following paper if this code was useful for your research:

Tóth Bálint Pál, Czeba Bálint, "Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment", In: Working Notes of Conference and Labs of the Evaluation Forum, Évora, Portugália, 2016, p. 8 Download from here (PDF): http://ceur-ws.org/Vol-1609/16090560.pdf

@article{tothczeba,
    author =       "B\'{a}lint P\'{a}l T\'{o}th, B\'{a}lint Czeba",
    title =        "{Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment}",
    booktitle =    "{Working Notes of Conference and Labs of the Evaluation Forum},
    pages =        "8",
    year =         "2016",
}

Prerequisites

You will need SOX for wave file resampling and Keras deep learning frameworks and some necessary modules. At the time of writeing you can install them in the following way:

sudo apt-get install sox
sudo apt-get install python-tk
sudo pip install scipy
sudo pip install matplotlib
sudo pip install sklearn
sudo pip install tensorflow-gpu
sudo pip install keras

The code is tested under Python 2.7. with TensorFlow (GPU) 1.0.0a0 and Keras 1.1.1. backend, NVidia Titan X 12GB GPU.

If you use TensorFlow as a backend with Keras 1.x you should set

"image_dim_ordering": "th",

in ~/.keras/keras.json configuration file.

In Keras 2 "image_dim_ordering" is deprecated. If you use TensorFlow + Keras 2.x, you should change the "image_data_format" setting to "channels_first".

Directory structure and files

doall.sh                - run this script and it will do everything (you will need plenty of disk space > 100 GB)
preprocess/loadData.py  - responsible for preprocessing the data (wavs and XML meta-data)
preprocess/sample_wavs_to_16k.sh - simple script that resamples wave files to 16 kHz with SOX
preprocess/xmltodict.py - XML processing from https://github.com/martinblech/xmltodict
train/trainModel.py     - after preprocessing this script trains the neural networks
train/model-AlexNet.py  - AlexNet inspired model for audio classification
train/model-BirdClef.py - Another convolutional neural net model for audio classification
train/MAPCallback.py    - Script to calculate MAP scores during training the neural nets
train/generateImages.py - Generate images from the preprocessed spectogram for visualization purposes
train/io_utils_mod.py   - Functions for loading and saving data to HDF5
train/log.py            - Functions for logging purposes
predict/predict.py      - Predict after preprocessing and training is done

Training (and download data and preprocess)

For training you have to simply run

./doall.sh

Be aware that this will download all the data (>50 GB) from http://otmedia.lirmm.fr/LifeCLEF/BirdCLEF2016/ to

birdclef_data

directory, unpack it and resample to 16 kHz and preprocess it into HDF5 files. You need cca. 280 GB of free space for the whole process. If you would like to put the data to somewhere else, please modify the doall.sh, preprocess/loadData.py and train/trainModel.py scripts.

The download process, preprocessing and training takes 4-5 days on an i7 CPU + Titan X GPU.

Prediction

After the preprocessing and training is do simpy run the following script to make predictions on test data:

./predict.sh

The prediction results will be written in a .csv file in the predict/ directory.

birdsong-keras's People

Contributors

bapalto avatar bmsasilva avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

birdsong-keras's Issues

AlexNet ValueError: Negative dimension size caused by subtracting 16 from 1

Hi,

When running trainModel.py, I get the following error:

model-AlexNet.py:45: UserWarning: Update your `Conv2D` call to the Keras 2 API: `Conv2D(kernel_initializer="glorot_normal", activation="relu", input_shape=(1, 200, 3..., padding="valid", strides=(6, 6), filters=96, kernel_size=(16, 16))`
  subsample=(6, 6)
Traceback (most recent call last):
  File "model-AlexNet.py", line 45, in <module>
    subsample=(6, 6)
  File "/anaconda/lib/python2.7/site-packages/keras/models.py", line 430, in add
    layer(x)
  File "/anaconda/lib/python2.7/site-packages/keras/engine/topology.py", line 578, in __call__
    output = self.call(inputs, **kwargs)
  File "/anaconda/lib/python2.7/site-packages/keras/layers/convolutional.py", line 164, in call
    dilation_rate=self.dilation_rate)
  File "/anaconda/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2893, in conv2d
    data_format='NHWC')
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 639, in convolution
    op=op)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 308, in with_space_to_batch
    return op(input, num_spatial_dims, padding)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 631, in op
    name=name)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 129, in _non_atrous_convolution
    name=name)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 396, in conv2d
    data_format=data_format, name=name)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Negative dimension size caused by subtracting 16 from 1 for 'conv2d_1/convolution' (op: 'Conv2D') with input shapes: [?,1,200,310], [16,16,310,96].

Perhaps it's a change from Keras 1 to 2, but it seems that the `input_shape' should be in form of (200,310,1), the latter being depth, rather (1,200,310).

The Version of tensorflow and keras

I had installed the tensorflow-cpu 1.0.1 and keras 2.0.2.

When I ran the script of trainModel.py, I got an error as below:

Traceback (most recent call last):
File "trainModel.py", line 101, in
execfile(modelPath)
File "./model-AlexNet.py", line 25, in
from keras.models import Sequential
File "/usr/local/lib/python2.7/dist-packages/keras/init.py", line 3, in
from . import activations
File "/usr/local/lib/python2.7/dist-packages/keras/activations.py", line 3, in
from . import backend as K
File "/usr/local/lib/python2.7/dist-packages/keras/backend/init.py", line 64, in
from .tensorflow_backend import *
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1, in
import tensorflow as tf
File "/usr/local/lib/python2.7/dist-packages/tensorflow/init.py", line 24, in
from tensorflow.python import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/init.py", line 75, in
from tensorflow.core.framework.graph_pb2 import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/core/framework/graph_pb2.py", line 16, in
from tensorflow.core.framework import node_def_pb2 as tensorflow_dot_core_dot_framework_dot_node__def__pb2
File "/usr/local/lib/python2.7/dist-packages/tensorflow/core/framework/node_def_pb2.py", line 16, in
from tensorflow.core.framework import attr_value_pb2 as tensorflow_dot_core_dot_framework_dot_attr__value__pb2
File "/usr/local/lib/python2.7/dist-packages/tensorflow/core/framework/attr_value_pb2.py", line 16, in
from tensorflow.core.framework import tensor_pb2 as tensorflow_dot_core_dot_framework_dot_tensor__pb2
File "/usr/local/lib/python2.7/dist-packages/tensorflow/core/framework/tensor_pb2.py", line 16, in
from tensorflow.core.framework import resource_handle_pb2 as tensorflow_dot_core_dot_framework_dot_resource__handle__pb2
File "/usr/local/lib/python2.7/dist-packages/tensorflow/core/framework/resource_handle_pb2.py", line 22, in
serialized_pb=_b('\n/tensorflow/core/framework/resource_handle.proto\x12\ntensorflow"m\n\x0eResourceHandle\x12\x0e\n\x06\x64\x65vice\x18\x01 \x01(\t\x12\x11\n\tcontainer\x18\x02 \x01(\t\x12\x0c\n\x04name\x18\x03 \x01(\t\x12\x11\n\thash_code\x18\x04 \x01(\x04\x12\x17\n\x0fmaybe_type_name\x18\x05 \x01(\tB4\n\x18org.tensorflow.frameworkB\x13ResourceHandleProtoP\x01\xf8\x01\x01\x62\x06proto3')
TypeError: init() got an unexpected keyword argument 'syntax'

So can you tell me the version of tensorflow and keras?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.