
noisystudent's Introduction

Noisy Student Training

Overview

Noisy Student Training is a semi-supervised learning method that achieves 88.4% top-1 accuracy on ImageNet (SOTA) and surprising gains on robustness and adversarial benchmarks. It is based on the self-training framework and consists of 4 simple steps (a sketch of the loop follows the list):

  1. Train a classifier on labeled data (teacher).
  2. Infer labels on a much larger unlabeled dataset.
  3. Train a larger classifier on the combined set, adding noise (noisy student).
  4. Go back to step 2, using the student as the teacher.
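
The loop below is a minimal, runnable sketch of these four steps using scikit-learn on synthetic data; the classifiers, the data, and the input-noise stand-in are hypothetical placeholders for the EfficientNet models, the ImageNet/unlabeled images, and the RandAugment/dropout/stochastic-depth noise used in this repository.

# Minimal self-training sketch of the four steps above (illustration only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(200, 16))
y_labeled = (X_labeled[:, 0] > 0).astype(int)
X_unlabeled = rng.normal(size=(2000, 16))

# Step 1: train a teacher on labeled data only.
teacher = RandomForestClassifier(n_estimators=50, random_state=0)
teacher.fit(X_labeled, y_labeled)

for it in range(3):
    # Step 2: infer pseudo-labels on the much larger unlabeled set.
    pseudo_labels = teacher.predict(X_unlabeled)

    # Step 3: train an equal-or-larger student on labeled + pseudo-labeled data,
    # adding noise to the student (input jitter as a crude stand-in here).
    X_all = np.concatenate([X_labeled,
                            X_unlabeled + rng.normal(scale=0.1, size=X_unlabeled.shape)])
    y_all = np.concatenate([y_labeled, pseudo_labels])
    student = RandomForestClassifier(n_estimators=100, random_state=it)
    student.fit(X_all, y_all)

    # Step 4: the student becomes the teacher for the next iteration.
    teacher = student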

For ImageNet checkpoints trained with Noisy Student Training, please refer to the EfficientNet GitHub repository.

SVHN Experiments

Here we show an implementation of Noisy Student Training on SVHN, which boosts the performance of a supervised model from 97.9% accuracy to 98.6% accuracy.

# Download and preprocess SVHN. Download the teacher model trained on labeled data (97.9% accuracy).
bash local_scripts/svhn/prepro.sh

# Train & Eval (expected accuracy: 98.6 +- 0.1) 
# The teacher model generates predictions on the fly in this script. To store the teacher model's predictions and save training time, see the instructions below.
bash local_scripts/svhn/run.sh

The following instructions cover running prediction on unlabeled data, filtering and balancing the data, and training with the stored predictions.

# Run prediction on multiple shards.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/svhn/predict.sh

# Get statistics of different shards (parallelizable).
bash local_scripts/svhn/filter_unlabel.sh 1

# Output the filtered and balanced data (parallelizable).
bash local_scripts/svhn/filter_unlabel.sh 0

# Train & eval using the stored predictions.
bash local_scripts/svhn/run_offline.sh
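
Conceptually, the filtering and balancing step keeps only unlabeled images whose pseudo-label confidence is high enough and then takes roughly the same number of images per class. The snippet below is a minimal sketch of that idea on an array of teacher softmax probabilities; it is not the repository's implementation, and the threshold and per-class cap are hypothetical values.

# Sketch of confidence filtering and class balancing for pseudo-labeled data.
import numpy as np

def filter_and_balance(probs, threshold=0.5, per_class=100, seed=0):
    """probs: (N, num_classes) teacher softmax outputs for unlabeled images."""
    rng = np.random.default_rng(seed)
    labels = probs.argmax(axis=1)             # pseudo-label = argmax class
    confidence = probs.max(axis=1)
    keep = confidence >= threshold            # drop low-confidence pseudo-labels
    selected = []
    for c in range(probs.shape[1]):
        idx = np.flatnonzero(keep & (labels == c))
        rng.shuffle(idx)
        selected.append(idx[:per_class])      # cap each class at per_class images
    return np.concatenate(selected)

# Example: a balanced, high-confidence subset of 10-class predictions.
probs = np.random.default_rng(1).dirichlet(np.ones(10), size=5000)
subset_indices = filter_and_balance(probs, threshold=0.3, per_class=50)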

If you obtain a better model, you can use it to re-predict pseudo-labels on the filtered data.

# Reassign pseudo-labels.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/svhn/reassign.sh

You can also use the Colab notebook noisystudent_svhn.ipynb to try the method on free Colab GPUs.

ImageNet Experiments

Scripts used for our ImageNet experiments:

# Train:
# See the scripts for hyperparameters for EfficientNet-B0 to B7.
# You need to fill in label_data_dir, unlabel_data_dir, model_name, and teacher_model_path in the script.
bash local_scripts/imagenet/train.sh

# Eval
bash local_scripts/imagenet/eval.sh

Similar scripts run predictions on unlabeled data, filter and balance the data, and train using the filtered data.

# Run prediction on multiple shards.
bash local_scripts/imagenet/predict.sh

# Get statistics of different shards (parallelizable).
bash local_scripts/imagenet/filter_unlabel.sh 1

# Output the filtered and balanced data (parallelizable).
bash local_scripts/imagenet/filter_unlabel.sh 0

# Train & eval using the filtered data.
bash local_scripts/imagenet/run_offline.sh
bash local_scripts/imagenet/eval.sh

Use a model to predict pseudo-labels on the filtered data:

# Reassign pseudo-labels.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/imagenet/reassign.sh

BibTeX

@article{xie2019self,
  title={Self-training with Noisy Student improves ImageNet classification},
  author={Xie, Qizhe and Luong, Minh-Thang and Hovy, Eduard and Le, Quoc V},
  journal={arXiv preprint arXiv:1911.04252},
  year={2019}
}

This is not an officially supported Google product.

noisystudent's People

Contributors

lmthang, michaelpulsewidth


noisystudent's Issues

tabular data/ noisy instances

Hi,
thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Test accuracy?

Testing the pretrained model:
I1020 09:23:09.595353 140204815505152 main.py:805] test, results: {'loss': 2.5957024, 'top_1_accuracy': 0.0988783, 'top_5_accuracy': 0.5069914, 'global_step': 0}
This does not match the reported 97.9% accuracy. Why?

HOW

Hi! How can I use your trained model to test my own dataset? Thank you.

Which TensorFlow version are you using?

Hi guys, thanks for the very interesting work!

Could you please provide the list of required packages? I am just trying to reproduce the experiments. Thanks!

Confusion about the loss function

Hello, when I looked at this code after reading the paper, I was quite confused about how the loss function is calculated. The final loss in the paper is the sum of the cross-entropy losses computed on the two kinds of data (with and without labels), but the calculation in the code is not the same:

real_lab_bsz = tf.to_float(lab_bsz) * FLAGS.label_data_sample_prob
real_unl_bsz = batch_size * FLAGS.label_data_sample_prob * FLAGS.unlabel_ratio
data_loss = lab_loss * real_lab_bsz + unl_loss * real_unl_bsz
data_loss = data_loss / real_lab_bsz

The losses earlier in this part of the code have already been averaged, but here they are multiplied by the sample counts first and then divided by the number of labeled samples. What is the significance of this calculation? It does not feel the same as described in the paper.
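
A short worked expansion of those lines may help; this is only a reading of the code quoted above, not an official explanation, and it assumes lab_bsz equals batch_size (an assumption, not something stated on this page).

# Substituting the definitions (p = FLAGS.label_data_sample_prob, r = FLAGS.unlabel_ratio):
#   data_loss = (lab_loss * lab_bsz * p + unl_loss * batch_size * p * r) / (lab_bsz * p)
# With lab_bsz == batch_size this simplifies to:
#   data_loss = lab_loss + unl_loss * r
# i.e. both terms are per-example averages, and the unlabeled average is
# re-weighted by the number of unlabeled examples seen per labeled example.
lab_loss, unl_loss = 0.4, 0.9     # hypothetical per-example average losses
r = 3                             # hypothetical FLAGS.unlabel_ratio
data_loss = lab_loss + unl_loss * r   # 0.4 + 0.9 * 3 = 3.1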

What is the structure of the training data for ImageNet?

Thank you for your great work! I am trying to redo the ImageNet experiment as described in the README file.

However, I could not find anywhere in the instructions an explanation of how the folders label_data_dir and unlabel_data_dir should be structured. Could you please clarify? Thank you!

Teacher model link not working

Hi, I am trying to run your script on Colab, but the error message shows that the URL for pulling teacher_ckpt is not working. Is there a new link to the teacher model?
