BSD10k

The BSD10k dataset is the initial version of the Broad Sound Dataset (BSD), a collection of ~10k annotated sounds aligned with the second level of the classes defined in the BST taxonomy.

Dataset characteristics

The dataset consists of 10,309 audio clips from Freesound, totalling 32.5 hours of single-labeled audio (after cropping sounds to a maximum length of 30 seconds). Each sound has been manually labeled. The dataset categorizes the sounds into 23 classes, which are the second-level categories of the BST taxonomy (see the Taxonomy section below). The accompanying metadata file contains the train/test split of the sounds, the license of each sound, and the tags and title of each audio file. The sounds are unevenly distributed across the classes.
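
Because the classes are unevenly populated, it can help to count the sounds per class in each split before training. The following is only a minimal sketch: it assumes the metadata file is a CSV, and the column names `sound_id`, `class` and `split`, as well as the sample rows, are invented for illustration and may not match the real file.

```python
import csv
import io
from collections import Counter

# Invented stand-in for the metadata file. The real file's format and
# column names may differ; adjust them after inspecting the download.
SAMPLE_METADATA = """sound_id,class,split
1,class_a,train
2,class_a,train
3,class_b,train
4,class_a,test
5,class_b,test
"""


def count_by_split_and_class(csv_text):
    """Count rows per (split, class) pair in CSV text."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[(row["split"], row["class"])] += 1
    return counts


counts = count_by_split_and_class(SAMPLE_METADATA)
for (split, label), n in sorted(counts.items()):
    print(f"{split:5s} {label:8s} {n}")
```

With the real metadata, the same counts could be used to derive class weights or to choose a stratified validation subset of the train split.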

Taxonomy

The Broad Sound Taxonomy (BST) organizes sounds into a two-level hierarchical structure with 5 top-level and 23 second-level classes. The taxonomy is designed to classify any kind of sound and to be easy to use, broad and comprehensive. It can be used to organize and pre-filter sounds on platforms such as Freesound, as well as in personal sound libraries. The definitions of the taxonomy classes are given in the taxonomy file.

A journal article (“A General-Purpose Broad Taxonomy for Sound Classification”) with further details about the taxonomy, including its design principles, creation methodology and evaluation, will be linked here in the near future.

Audio data

The original files downloaded from Freesound were converted to a standardized format of uncompressed 44.1 kHz, 16-bit, mono audio, and any sound longer than 30 seconds was cropped to that duration. The audio files of the dataset can be downloaded as a single .zip file (~7.4 GB):
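
The format described above can be checked, and the 30-second crop reproduced, with Python's standard `wave` module. This is a sketch under the assumption that the files are plain PCM WAV; it is not the conversion pipeline used to build the dataset, and the function names are illustrative.

```python
import wave

TARGET_RATE = 44_100     # Hz, as stated above
TARGET_WIDTH = 2         # bytes per sample (16-bit)
TARGET_CHANNELS = 1      # mono
MAX_SECONDS = 30         # crop limit stated above


def matches_target_format(path):
    """Return True if the WAV file is 44.1 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == TARGET_RATE
            and wav.getsampwidth() == TARGET_WIDTH
            and wav.getnchannels() == TARGET_CHANNELS
        )


def crop_wav(src_path, dst_path, max_seconds=MAX_SECONDS):
    """Copy src to dst, keeping at most max_seconds of audio.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        max_frames = max_seconds * src.getframerate()
        frames = src.readframes(min(src.getnframes(), max_frames))
    with wave.open(dst_path, "wb") as dst:
        # The frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return len(frames) // (params.sampwidth * params.nchannels)
```

Resampling or downmixing files that are not already in the target format is outside the scope of this sketch and would need an audio library.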

Download BSD10k dataset

License

BSD10k as a whole is released under CC-BY. Note, however, that each audio file is released under its own Creative Commons (CC) license, as chosen by its uploader on Freesound. Some sounds require attribution to their original authors, while others forbid commercial reuse. If the dataset is used in a commercial setting, the sounds with CC-BY-NC licenses should be excluded.

This is the distribution of sounds per license:

  • CC0: 3,187
  • CC-BY: 5,534
  • CC-BY-NC: 1,192
  • CC Sampling+: 396

Links to the license deed for each sound are available in the metadata file.
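
As a quick check of what excluding CC-BY-NC sounds leaves for commercial use, the counts listed above can be tallied as follows. The sketch uses only the per-license figures from this section; the helper name is illustrative, and the terms of the remaining licenses (for example attribution under CC-BY, or the conditions of CC Sampling+) still need to be checked separately.

```python
# Per-license counts as listed in this section.
LICENSE_COUNTS = {
    "CC0": 3187,
    "CC-BY": 5534,
    "CC-BY-NC": 1192,
    "CC Sampling+": 396,
}


def exclude_licenses(counts, excluded=("CC-BY-NC",)):
    """Return the counts without the excluded license names."""
    return {name: n for name, n in counts.items() if name not in excluded}


total = sum(LICENSE_COUNTS.values())
remaining = exclude_licenses(LICENSE_COUNTS)
remaining_total = sum(remaining.values())

for name, n in LICENSE_COUNTS.items():
    print(f"{name:13s} {n:5d}  {100 * n / total:.1f}%")
print(f"Sounds left after excluding CC-BY-NC: {remaining_total} of {total}")
```

The counts sum to the 10,309 clips of the dataset, and 9,117 sounds remain once the CC-BY-NC ones are removed.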
