
noisystudent's Introduction

Noisy Student Training

Overview

Noisy Student Training is a semi-supervised learning method that achieves 88.4% top-1 accuracy on ImageNet (SOTA) and surprising gains on robustness and adversarial benchmarks. It is based on the self-training framework and consists of 4 simple steps (a sketch of the loop follows the list):

  1. Train a classifier on labeled data (teacher).
  2. Infer labels on a much larger unlabeled dataset.
  3. Train a larger classifier on the combined set, adding noise (noisy student).
  4. Go back to step 2, using the student as the teacher.
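
The loop below is a minimal, runnable sketch of these four steps using scikit-learn on synthetic data; the classifiers, the data, and the input-noise stand-in are hypothetical placeholders for the EfficientNet models, the ImageNet/unlabeled images, and the RandAugment/dropout/stochastic-depth noise used in this repository.

# Minimal self-training sketch of the four steps above (illustration only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(200, 16))
y_labeled = (X_labeled[:, 0] > 0).astype(int)
X_unlabeled = rng.normal(size=(2000, 16))

# Step 1: train a teacher on labeled data only.
teacher = RandomForestClassifier(n_estimators=50, random_state=0)
teacher.fit(X_labeled, y_labeled)

for it in range(3):
    # Step 2: infer pseudo-labels on the much larger unlabeled set.
    pseudo_labels = teacher.predict(X_unlabeled)

    # Step 3: train an equal-or-larger student on labeled + pseudo-labeled data,
    # adding noise to the student (input jitter as a crude stand-in here).
    X_all = np.concatenate([X_labeled,
                            X_unlabeled + rng.normal(scale=0.1, size=X_unlabeled.shape)])
    y_all = np.concatenate([y_labeled, pseudo_labels])
    student = RandomForestClassifier(n_estimators=100, random_state=it)
    student.fit(X_all, y_all)

    # Step 4: the student becomes the teacher for the next iteration.
    teacher = student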

For ImageNet checkpoints trained with Noisy Student Training, please refer to the EfficientNet GitHub repository.

SVHN Experiments

Here we show an implementation of Noisy Student Training on SVHN, which boosts the performance of a supervised model from 97.9% accuracy to 98.6% accuracy.

# Download and preprocess SVHN. Download the teacher model trained on labeled data (97.9% accuracy).
bash local_scripts/svhn/prepro.sh

# Train & Eval (expected accuracy: 98.6 +- 0.1) 
# The teacher model generates predictions on the fly in this script. To store the teacher model's predictions and save training time, see the instructions below.
bash local_scripts/svhn/run.sh

The following instructions cover running prediction on unlabeled data, filtering and balancing the data, and training with the stored predictions.

# Run prediction on multiple shards.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/svhn/predict.sh

# Get statistics of different shards (parallelizable).
bash local_scripts/svhn/filter_unlabel.sh 1

# Output the filtered and balanced data (parallelizable).
bash local_scripts/svhn/filter_unlabel.sh 0

# Train & eval using the stored predictions.
bash local_scripts/svhn/run_offline.sh
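
Conceptually, the filtering and balancing step keeps only unlabeled images whose pseudo-label confidence is high enough and then takes roughly the same number of images per class. The snippet below is a minimal sketch of that idea on an array of teacher softmax probabilities; it is not the repository's implementation, and the threshold and per-class cap are hypothetical values.

# Sketch of confidence filtering and class balancing for pseudo-labeled data.
import numpy as np

def filter_and_balance(probs, threshold=0.5, per_class=100, seed=0):
    """probs: (N, num_classes) teacher softmax outputs for unlabeled images."""
    rng = np.random.default_rng(seed)
    labels = probs.argmax(axis=1)             # pseudo-label = argmax class
    confidence = probs.max(axis=1)
    keep = confidence >= threshold            # drop low-confidence pseudo-labels
    selected = []
    for c in range(probs.shape[1]):
        idx = np.flatnonzero(keep & (labels == c))
        rng.shuffle(idx)
        selected.append(idx[:per_class])      # cap each class at per_class images
    return np.concatenate(selected)

# Example: a balanced, high-confidence subset of 10-class predictions.
probs = np.random.default_rng(1).dirichlet(np.ones(10), size=5000)
subset_indices = filter_and_balance(probs, threshold=0.3, per_class=50)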

If you obtain a better model, you can use it to re-predict pseudo-labels on the filtered data.

# Reassign pseudo-labels.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/svhn/reassign.sh

You can also use the Colab notebook noisystudent_svhn.ipynb to try the method on free Colab GPUs.

ImageNet Experiments

Scripts used for our ImageNet experiments:

# Train:
# See the scripts for hyperparameters for EfficientNet-B0 to B7.
# You need to fill in label_data_dir, unlabel_data_dir, model_name, and teacher_model_path in the script.
bash local_scripts/imagenet/train.sh

# Eval
bash local_scripts/imagenet/eval.sh

Similar scripts run predictions on unlabeled data, filter and balance the data, and train using the filtered data.

# Run prediction on multiple shards.
bash local_scripts/imagenet/predict.sh

# Get statistics of different shards (parallelizable).
bash local_scripts/imagenet/filter_unlabel.sh 1

# Output the filtered and balanced data (parallelizable).
bash local_scripts/imagenet/filter_unlabel.sh 0

# Train & eval using the filtered data.
bash local_scripts/imagenet/run_offline.sh
bash local_scripts/imagenet/eval.sh

Use a model to predict pseudo-labels on the filtered data:

# Reassign pseudo-labels.
# Run predictions in parallel if you have multiple GPUs/TPUs
bash local_scripts/imagenet/reassign.sh

BibTeX

@article{xie2019self,
  title={Self-training with Noisy Student improves ImageNet classification},
  author={Xie, Qizhe and Luong, Minh-Thang and Hovy, Eduard and Le, Quoc V},
  journal={arXiv preprint arXiv:1911.04252},
  year={2019}
}

This is not an officially supported Google product.

noisystudent's People

Contributors

lmthang, michaelpulsewidth


noisystudent's Issues

tabular data/ noisy instances

Hi,
thanks for sharing your implementation. I have two questions about it:

  1. Does it also work on tabular data?
  2. Is it possible to identify the noisy instances (return the noisy IDs or the clean set)?

Thanks!

Test accuracy?

Testing the pretrained model:
I1020 09:23:09.595353 140204815505152 main.py:805] test, results: {'loss': 2.5957024, 'top_1_accuracy': 0.0988783, 'top_5_accuracy': 0.5069914, 'global_step': 0}
This does not match the reported 97.9% accuracy. Why?

HOW

Hi! How can I use your trained model to test my own dataset? Thank you.

Which TensorFlow version are you using?

Hi guys, thanks for the very interesting work!

Could you please provide the list of required packages? I am just trying to reproduce the experiments. Thanks!

Confusion about the loss function

Hello, when I looked at this code after reading the paper, I was quite confused about how the loss function is calculated. The final loss in the paper is the sum of the cross-entropy losses computed on the two kinds of data (with and without labels), but the calculation in the code is not the same:

real_lab_bsz = tf.to_float(lab_bsz) * FLAGS.label_data_sample_prob
real_unl_bsz = batch_size * FLAGS.label_data_sample_prob * FLAGS.unlabel_ratio
data_loss = lab_loss * real_lab_bsz + unl_loss * real_unl_bsz
data_loss = data_loss / real_lab_bsz

The losses earlier in this part of the code have already been averaged, but here they are multiplied by the sample counts first and then divided by the number of labeled samples. What is the significance of this calculation? It does not feel the same as described in the paper.
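
A short worked expansion of those lines may help; this is only a reading of the code quoted above, not an official explanation, and it assumes lab_bsz equals batch_size (an assumption, not something stated on this page).

# Substituting the definitions (p = FLAGS.label_data_sample_prob, r = FLAGS.unlabel_ratio):
#   data_loss = (lab_loss * lab_bsz * p + unl_loss * batch_size * p * r) / (lab_bsz * p)
# With lab_bsz == batch_size this simplifies to:
#   data_loss = lab_loss + unl_loss * r
# i.e. both terms are per-example averages, and the unlabeled average is
# re-weighted by the number of unlabeled examples seen per labeled example.
lab_loss, unl_loss = 0.4, 0.9     # hypothetical per-example average losses
r = 3                             # hypothetical FLAGS.unlabel_ratio
data_loss = lab_loss + unl_loss * r   # 0.4 + 0.9 * 3 = 3.1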

What is the structure of the training data for ImageNet?

Thank you for your great work! I am trying to redo the ImageNet experiment as described in the README file.

However, I could not find anywhere in the instructions an explanation of how the folders label_data_dir and unlabel_data_dir should be structured. Could you please clarify? Thank you!

Teacher model link not working

Hi, I am trying to run your script on Colab, but the error message shows that the URL for pulling teacher_ckpt is not working. Is there a new link to the teacher model?
