
1st Place Solution for the Zindi CGIAR Wheat Growth Stage Challenge

Competition website

The problem is to estimate the growth stage of a wheat crop from an image sent in by a farmer. The model must take an image as input and output a prediction of the growth stage of the wheat shown, on a scale from 1 (crop just showing) to 7 (mature crop).

Instructions to run the code

System Requirements

The following system requirements should be satisfied:

  • OS: Ubuntu 16.04
  • Python: 3.6
  • CUDA: 10.1
  • cudnn: 7
  • pipenv (pip install pipenv)

Training was done on 2 x GeForce RTX 2080 GPUs (training the final ensemble takes about 15 hours), and the batch sizes are selected accordingly.

  • Change the list of available GPUs in ./conf/config.yaml; the parameter is called gpu_list
  • Override the batch size if needed by passing the training.batch_size argument to train.sh and inference.sh (see the example below)
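
For example, assuming train.sh forwards Hydra-style overrides to the underlying training script (an assumption based on the ./conf/config.yaml layout and the dotted training.batch_size syntax), a hypothetical invocation could look like ./train.sh gpu_list=[0] training.batch_size=8.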

Environment Setup

  1. Navigate to the project directory
  2. Run pipenv install --deploy --ignore-pipfile to install dependencies
  3. Run pipenv shell to activate the environment

Data Setup

  1. Download data from the competition website and save it to the ./data/ directory.
  2. Unzip Images.zip there: unzip Images.zip

Best ensemble submission

  • The best Private LB submission (0.39949 RMSE) is available in ./lightning_logs/best_model.csv

Best ensemble inference

  • Download zindi_wheat_weights.zip to the project directory
  • Unzip it there: unzip zindi_wheat_weights.zip
  • For inference, run ./inference.sh. Final ensemble predictions will be saved to ./lightning_logs/2_3_4_5_6_ens.csv

Train the model from scratch

  • To start training the model from scratch, run ./train.sh (takes about 15 hours on 2x2080)
  • Afterwards, run ./inference.sh to generate the ensemble predictions

Solution Description

The dataset has two sets of labels, bad and good quality, but the test set consists only of good-quality labels. First, there is no clear correspondence between bad and good labels (good labels contain only five classes: 2, 3, 4, 5, 7). Second, bad- and good-quality images can easily be distinguished by a simple binary classifier, so they come from different distributions. Looking at the Grad-CAM of such a model suggests that the major difference between the two sets of images is the white sticks (poles) visible in the photos.

That's why the training process consists of two steps:

  1. Pre-train the model on a mix of bad- and good-quality labels
  2. Fine-tune the model on the good-quality labels only

Single best model

The best single model (an average of 5 folds on the test set) achieves 0.40327 RMSE on the Private LB.

Model hyperparameters

  • Architecture: ResNet101
  • Problem type: classification
  • Loss: cross entropy
  • FC Dropout probability: 0.3
  • Input size: (256, 256)
  • Predicted probabilities are multiplied by the class labels and summed up, i.e., the prediction is the expectation over classes (see the sketch below)
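
The last bullet converts classifier outputs into a continuous growth-stage estimate by taking the expectation over the class labels. A minimal runnable sketch, assuming the five good-quality classes and using random logits as a stand-in for real model outputs:

    import torch

    # the five classes present in the good-quality labels
    class_labels = torch.tensor([2.0, 3.0, 4.0, 5.0, 7.0])

    logits = torch.randn(4, 5)                 # stand-in for model(images), batch of 4
    probs = torch.softmax(logits, dim=1)       # per-class probabilities
    preds = (probs * class_labels).sum(dim=1)  # expected growth stage per image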

Augmentations

The median image size in the data is about 180 x 512. For preprocessing, each image is first padded to img_width // 2 x img_width and then resized to 256 x 256 (see the sketch after the list below). The augmentation list includes:

  • Horizontal flips
  • RandomBrightnessContrast
  • ShiftScaleRotate
  • Imitating additional white sticks (poles) on the images
  • Label augmentation (changing class labels to neighboring classes with a low probability)
  • Horizontal flips as test-time augmentation (TTA)
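
A minimal sketch of the preprocessing and augmentation pipeline using albumentations; the probabilities and the label-augmentation helper below are illustrative assumptions, not the author's exact settings:

    import random

    import albumentations as A

    img_width = 512  # roughly the median image width in the data

    train_transforms = A.Compose([
        A.PadIfNeeded(min_height=img_width // 2, min_width=img_width),  # pad to img_width // 2 x img_width
        A.Resize(256, 256),                                             # then resize to the model input size
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.5),
        A.ShiftScaleRotate(p=0.5),
        # the white-stick (pole) imitation is a custom transform and is omitted here
    ])

    # label augmentation: with a low probability, move the label to a neighboring class
    def augment_label(label, classes=(2, 3, 4, 5, 7), p=0.05):
        if random.random() < p:
            i = classes.index(label)
            i = min(max(i + random.choice([-1, 1]), 0), len(classes) - 1)
            return classes[i]
        return label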


Training process

  1. Pre-train on the mix of good and bad labels for 10 epochs
  2. Fine-tune on the good labels for 50 epochs, reducing the learning rate on plateau (see the sketch below)
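
A minimal runnable sketch of the fine-tuning schedule; the model, optimizer, and scheduler settings are assumptions, only the epoch count comes from the text:

    import torch
    from torch import nn

    model = nn.Linear(10, 5)  # stand-in for the ResNet classifier
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=3
    )

    for epoch in range(50):
        val_loss = torch.rand(1).item()  # stand-in for the real validation loss
        scheduler.step(val_loss)         # reduces the LR when the loss plateaus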

Ensemble

Ensembling multiple models worked quite well in this problem. My final solution is just an average of five 5-fold models (25 checkpoints overall; a sketch of the averaging follows the list):

  • Architectures: ResNet50, ResNet101, ResNeXt50
  • Input sizes: 256x256 and 512x512
  • It achieved 0.39949 RMSE on Private LB
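
The averaging step itself is simple. A sketch with stand-in predictions (five models, one growth-stage estimate per test image):

    import numpy as np

    # stand-in predictions; in practice these come from the five trained models
    per_model = np.stack([np.random.uniform(2, 7, size=100) for _ in range(5)])
    ensemble = per_model.mean(axis=0)  # simple average across the five models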

What didn't work

  • Various pseudolabeling options: generating pseudolabels for the bad-quality labels and the test set; pre-training on them; mixing them with the actual labels, etc.
  • MixUp and CutMix augmentations
  • EfficientNet architectures
  • Treating the problem as a regression (didn't help even in the ensemble)
  • Stacking over the first level models predictions
