Music Genre Classification

Overview

Recognizing music genre is a challenging task in the area of music information retrieval. Two approaches are studied here:

  1. Spectrogram-based end-to-end image classification using a CNN (VGG-16)
  2. Feature engineering approach using Logistic Regression, SVMs, Random Forest and eXtreme Gradient Boosting (XGB)

For a detailed description of the project, please refer to Music Genre Classification using Machine Learning Techniques, published on arXiv.

Datasets

The Audio Set data released by Google is used in this study. Only the clips belonging to the genre class labels selected for this study are used; each one is extracted from YouTube as a wav file, based on its video link, start time and end time.
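As a rough sketch of the selection step, the snippet below filters segment rows by label. It assumes each row has the layout `video id, start seconds, end seconds, "comma-separated label ids"` with `#` comment lines at the top; the function name `select_clips`, the video ids and the label ids are made up for illustration and are not taken from audio_retrieval.py.

```python
import csv

def select_clips(csv_text, wanted_labels):
    """Return (video_id, start_s, end_s) for rows tagged with any wanted label."""
    wanted = set(wanted_labels)
    # Assumed layout: '#' comment lines first, then one clip per line.
    data_lines = [ln for ln in csv_text.splitlines()
                  if ln.strip() and not ln.startswith("#")]
    # skipinitialspace lets the quoted label field follow ", " cleanly.
    reader = csv.reader(data_lines, skipinitialspace=True)
    clips = []
    for video_id, start, end, labels in reader:
        if wanted & set(labels.split(",")):
            clips.append((video_id, float(start), float(end)))
    return clips

# Hypothetical rows; real ids and labels come from the dataset files.
sample = '''# YTID, start_seconds, end_seconds, positive_labels
vidAAAAAAAA, 30.000, 40.000, "/m/label_a,/m/label_b"
vidBBBBBBBB, 0.000, 10.000, "/m/label_c"
'''
clips = select_clips(sample, ["/m/label_a"])
```

The resulting start and end times are what a downloader would need to cut each clip out of the full video.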



Requirements

  • tensorflow-gpu==1.3.0
  • Keras==2.0.8
  • numpy==1.12.1
  • pandas==0.22.0
  • youtube-dl==2018.2.4
  • scipy==0.19.0
  • librosa==0.5.1
  • tqdm==4.19.1
  • Pillow==4.1.1

Note: If you have trouble installing any of the modules, you can install them from the Python unofficial binaries, choosing the build that matches your Python version.

Instructions

  1. First, download the audio wav files using the tool youtube-dl. To do this, run audio_retrieval.py. Note that each file is about 880 KB, adding up to about 34 GB in total.
  2. Next, generate mel spectrograms by running generate_spectrograms.py. If needed, you can modify that file to change the Short-Time Fourier Transform (STFT) parameters.
  3. Finally, run the models. Please refer to the corresponding Jupyter notebooks. The deep learning models are in notebooks 3.1, 3.2 and 3.3. Notebooks 4 and 5 contain the steps for feature extraction (run feature_extraction.py) and for building the classifiers using sklearn.
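To give a feel for what the STFT parameters in step 2 control, here is a minimal NumPy sketch of a power spectrogram. It is not the code in generate_spectrograms.py: the names `n_fft` and `hop_length`, their default values and the 22050 Hz sample rate are assumptions, and the mel filterbank that would map the frequency bins onto the mel scale is left out.

```python
import numpy as np

def power_spectrogram(signal, n_fft=1024, hop_length=512):
    """STFT power with shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

sample_rate = 22050  # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz test tone
spec = power_spectrogram(tone)
# Frequency of the strongest bin, averaged over time.
peak_hz = spec.mean(axis=1).argmax() * sample_rate / 1024
```

A larger `n_fft` gives finer frequency resolution at the cost of time resolution, and a smaller `hop_length` produces more frames (a wider image) per clip.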

Results

The models are evaluated on the basis of AUC, accuracy and F-score.
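The three metrics can be sketched in plain NumPy as follows. This is an illustration only: the README does not say how the multi-class scores are averaged, so the macro averaging and the one-vs-rest AUC used here are assumptions, and the labels and probabilities are toy values.

```python
import numpy as np

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def macro_f_score(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def binary_auc(is_positive, scores):
    """Chance that a random positive outscores a random negative."""
    diff = scores[is_positive][:, None] - scores[~is_positive][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def macro_auc(y_true, proba):
    """One-vs-rest AUC averaged over classes (assumed averaging)."""
    return float(np.mean([binary_auc(y_true == c, proba[:, c])
                          for c in range(proba.shape[1])]))

# Toy labels and predicted class probabilities for three classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
proba = np.array([[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.6, 0.2],
                  [0.1, 0.8, 0.1], [0.1, 0.2, 0.7], [0.3, 0.3, 0.4]])
y_pred = proba.argmax(axis=1)
acc = accuracy(y_true, y_pred)
f_score = macro_f_score(y_true, y_pred, n_classes=3)
auc = macro_auc(y_true, proba)
```

Accuracy and F-score depend only on the predicted labels, while AUC uses the predicted probabilities, so a model can rank classes well and still mislabel some clips.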

The 20 most important features according to the XGB classifier are shown below. The metric on the x-axis is the number of times a given feature appears as a decision node across all of the decision trees that make up the gradient boosting predictor.
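The split-count metric described above can be illustrated with a small pure-Python sketch. The nested-dict tree representation and the feature names (`mfcc_1`, `tempo`) are hypothetical and chosen only for the example; they are not the model's real internals or the project's real feature names. In XGBoost, this kind of count is what the "weight" importance type reports.

```python
from collections import Counter

def count_splits(node, counts):
    """Walk one tree, counting each feature used in a decision node."""
    if "feature" not in node:  # leaf node: nothing to count
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)

def split_count_importance(trees):
    """Rank features by how often they appear as a split across all trees."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return counts.most_common()

# Two tiny hypothetical trees.
leaf = {"value": 0.0}
trees = [
    {"feature": "mfcc_1",
     "left": {"feature": "tempo", "left": leaf, "right": leaf},
     "right": leaf},
    {"feature": "mfcc_1", "left": leaf, "right": leaf},
]
ranking = split_count_importance(trees)
```

Note that this count says how often a feature is used, not how much each split improves the model.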

The confusion matrix of the ensemble XGB and CNN classifier:
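For readers unfamiliar with how such a matrix is built, here is a minimal sketch. The README does not state how the two classifiers are combined, so averaging their class probabilities is purely an assumption made for illustration, and the probabilities below are toy values.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # add.at accumulates correctly even when an index pair repeats.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

# Toy class probabilities from two hypothetical models, two classes.
cnn_proba = np.array([[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]])
xgb_proba = np.array([[0.9, 0.1], [0.1, 0.9], [0.4, 0.6]])
y_true = np.array([0, 0, 1])

# Assumed ensembling rule: plain average of the two probability tables.
ensemble_proba = (cnn_proba + xgb_proba) / 2
y_pred = ensemble_proba.argmax(axis=1)
cm = confusion_matrix(y_true, y_pred, n_classes=2)
```

Off-diagonal entries show which genres are mistaken for which, which a single accuracy number cannot reveal.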

Contributors

hareeshbahuleyan, dhruvvats-011
