BirdCLEF-2023-Identify-bird-calls-in-soundscapes 4th place solution

My writeup for this solution can be found on Kaggle.

Hardware

  • CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, 12 cores
  • CPU memory: 64 GB
  • GPU: 1 x RTX 3090

OS/platform

  • Linux Ubuntu 20.04 LTS
  • python==3.7.13

Training

Data preparation

  • Download BirdCLEF data for 2021, 2022, and 2023
  • Download additional datasets here
  • Copy the no-call directory of ff1010bird_nocall to the BirdCLEF 2023 train_audio directory.
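
The copy step above can be scripted. The following is a minimal sketch, not part of the repository: `copy_nocall_dir` is a hypothetical helper, and the sub-directory name `nocall` is an assumption that should be adjusted to the actual layout of ff1010bird_nocall. It copies file by file instead of using `shutil.copytree(..., dirs_exist_ok=True)`, which needs Python 3.8 or later, so it also runs on the Python 3.7 listed above.

```python
import shutil
from pathlib import Path


def copy_nocall_dir(nocall_root, train_audio_dir, class_dir="nocall"):
    """Copy every file of the no-call class into train_audio/<class_dir>.

    `class_dir` is an assumed directory name, not confirmed by the repository.
    Returns the number of files copied.
    """
    src = Path(nocall_root) / class_dir
    dst = Path(train_audio_dir) / class_dir
    if not src.is_dir():
        raise FileNotFoundError("no-call directory not found: {}".format(src))
    # Create the target class directory; re-running the copy is harmless.
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(src.iterdir()):
        if path.is_file():
            shutil.copy2(str(path), str(dst / path.name))
            copied += 1
    return copied


# Example call, using the paths from the directory structure in this README:
# copy_nocall_dir("/input/ff1010bird_nocall", "/input/birdclef-2023/train_audio")
```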

Directory structure example

/input/
    ┣ aicrowd2020_noise_30sec/
    ┣ birdclef-2021/
        └ train_short_audio
    ┣ birdclef-2022/
        └ train_audio
    ┣ birdclef-2023/
        ├ train_audio <- add no-call
        └ train_meta_pseudo.pickle
    ┣ esc50/
    ┣ ff1010bird_nocall/
        └ ff1010bird_metadata_v1_pseudo.pickle
    ┣ train_soundscapes/
    ┣ xeno-canto/
    ┣ xeno-canto_nd/
    ┣ zenodo_nocall_30sec/
    ┣ pretrain_metadata_10fold_pseudo.pickle
    ┣ xeno-canto_audio_meta_pseudo.pickle
    ┗ xeno-canto_nd_audio_meta_pseudo.pickle
/src/
    ┗ ...
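
Before starting a long training run, it can help to confirm that the input directory matches the structure shown above. The sketch below is one possible check, not something shipped with the repository: `missing_inputs` and `EXPECTED` are hypothetical names, and the relative paths are copied from the example tree.

```python
from pathlib import Path

# Relative paths taken from the directory structure example in this README.
EXPECTED = [
    "aicrowd2020_noise_30sec",
    "birdclef-2021/train_short_audio",
    "birdclef-2022/train_audio",
    "birdclef-2023/train_audio",
    "birdclef-2023/train_meta_pseudo.pickle",
    "esc50",
    "ff1010bird_nocall/ff1010bird_metadata_v1_pseudo.pickle",
    "train_soundscapes",
    "xeno-canto",
    "xeno-canto_nd",
    "zenodo_nocall_30sec",
    "pretrain_metadata_10fold_pseudo.pickle",
    "xeno-canto_audio_meta_pseudo.pickle",
    "xeno-canto_nd_audio_meta_pseudo.pickle",
]


def missing_inputs(root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Example call:
# for rel in missing_inputs("/input"):
#     print("missing:", rel)
```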

If you train on your own data

  • Get predicted values from Kaggle Models, as in this notebook.
  • Store each sample's vector of predicted values in a single column (teacher_preds) of the training data, and save it as a pickle file such as ○○○.pickle.
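
The second step can be sketched as follows, assuming the training data is held in a pandas DataFrame with one row per audio file and that the teacher predictions are available as a 2-D NumPy array with one row per file. Only the column name teacher_preds comes from this README; `add_teacher_preds`, the float32 dtype, and the output file name in the example call are illustrative assumptions.

```python
import numpy as np
import pandas as pd


def add_teacher_preds(meta, preds):
    """Return a copy of `meta` with one prediction vector per row.

    `meta`  : DataFrame with one row per training sample (assumed layout).
    `preds` : array-like of shape (n_samples, n_classes) from the teacher model.
    """
    if len(meta) != len(preds):
        raise ValueError("need exactly one prediction vector per training row")
    out = meta.copy()
    # Each cell holds a 1-D array, so the whole vector lives in one column.
    out["teacher_preds"] = [np.asarray(p, dtype=np.float32) for p in preds]
    return out


# Example call (file name is a placeholder):
# meta = add_teacher_preds(meta, preds)
# meta.to_pickle("my_train_meta_pseudo.pickle")
```

Pickle is used here because a CSV file would flatten each array cell to a string; `pd.read_pickle` returns the arrays unchanged.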

Run

# The -C flag specifies a config file.
# Replace NAME_OF_CONFIG with an appropriate config file name, such as exp105.

python pretrain_net.py -C NAME_OF_CONFIG  # for pretraining using BirdCLEF 2021, 2022

python train_net.py -C NAME_OF_CONFIG  # for training using BirdCLEF 2021, 2022

Weights of the trained model

Inference

The inference code is published in a Kaggle kernel here.

Ablation study

| Name | Public LB | Private LB |
|------|-----------|------------|
| BaseModel | 0.80603 | 0.70782 |
| BaseModel + Knowledge Distillation | 0.82073 | 0.72752 |
| BaseModel + Knowledge Distillation + Adding xeno-canto | 0.82905 | 0.74038 |
| BaseModel + Knowledge Distillation + Adding xeno-canto + Pretraining | 0.8312 | 0.74424 |
| BaseModel + Knowledge Distillation + Adding xeno-canto + Pretraining + Ensemble (4 models) | 0.84019 | 0.75688 |

Contributors

  • atsunorifujita
