
yyzharry / imbalanced-semi-self

730 stars · 14 watchers · 115 forks · 16.94 MB

[NeurIPS 2020] Semi-Supervision (Unlabeled Data) & Self-Supervision Improve Class-Imbalanced / Long-Tailed Learning

Home Page: https://arxiv.org/abs/2006.07529

License: MIT License

Python 100.00%
imbalanced-learning imbalanced-classification semi-supervised-learning unlabeled-data self-supervised-learning long-tail long-tailed-recognition class-imbalance neurips neurips-2020

imbalanced-semi-self's People

Contributors: yyzharry

imbalanced-semi-self's Issues

Question about the Self-supervised pre-trained models (MoCo)

Thanks for your excellent code! I reproduced the results of CE(Uniform) + SSP and cRT + SSP on ImageNet_LT based on the self-supervised pre-trained MoCo models, and got the same results as reported in your paper.
But I still have a question about the MoCo SSP checkpoint.
I directly evaluated the MoCo checkpoint + cRT (without the CE-uniform supervised training stage), and the accuracy is 0.118, which is poor. According to the original MoCo paper, however, MoCo's linear-evaluation accuracy on full ImageNet should be 0.60+, which is not far from supervised learning.
So is the 0.118 accuracy reasonable? It is much lower than the supervised accuracy on ImageNet_LT.
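
For reference, the usual way to linear-probe a MoCo checkpoint follows the recipe in MoCo's official main_lincls.py: copy only the query-encoder weights into a fresh backbone and train just the classifier. The key names below assume MoCo's 'module.encoder_q.*' naming convention, so adjust them if this repo's checkpoint differs:

    import torch
    import torchvision.models as models

    # Minimal linear-evaluation sketch; assumes MoCo's checkpoint naming.
    model = models.resnet50()
    ckpt = torch.load('moco_ckpt_0200.pth.tar', map_location='cpu')
    state_dict = ckpt['state_dict']

    for k in list(state_dict.keys()):
        # Keep only the query encoder, and drop its projection head (fc).
        if k.startswith('module.encoder_q') and not k.startswith('module.encoder_q.fc'):
            state_dict[k[len('module.encoder_q.'):]] = state_dict[k]
        del state_dict[k]

    msg = model.load_state_dict(state_dict, strict=False)
    assert set(msg.missing_keys) == {'fc.weight', 'fc.bias'}

    # Freeze everything except the (randomly re-initialized) linear classifier.
    for name, param in model.named_parameters():
        if name not in ('fc.weight', 'fc.bias'):
            param.requires_grad = False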

Can't achieve the given performance: ResNet-50 + SSP + CE(Uniform) for ImageNet-LT

I downloaded the pre-trained model from the given path (Resnet-50-rot) and trained the model with the given config imagenet_inat/config/ImageNet_LT/feat_uniform.yaml.
The training command is:
python imb_cls/imagenet_inat/main.py --cfg 'imb_cls/imagenet_inat/config/ImageNet_LT/feat_uniform.yaml' --model_dir workdir/pretrain/moco_ckpt_0200.pth.tar
I only get 41.1 top-1 accuracy, but the given model achieved 45.6 [CE(Uniform) + SSP].

Can you help me check where the problem is?

What's the required hardware to reproduce the result?

Thanks for sharing this code. It's interesting.
May I know the hardware required to reproduce the results?

The reason I'm asking is that I tried to run "pretrain_rot.py --dataset 'cifar10' --imb_factor 0.01", but the system doesn't respond for a long time when it reaches "output = model(inputs)".
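
One way to narrow this down is a hypothetical debugging snippet (not part of the repo; torchvision's resnet18 stands in here for the repo's resnet32) that times a single forward pass on a dummy batch before launching the full script:

    import time
    import torch
    import torchvision.models as models

    # Sanity check: is the GPU visible, and does one forward pass finish?
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print('device:', device)

    model = models.resnet18(num_classes=4).to(device).eval()  # 4 rotation classes
    inputs = torch.randn(8, 3, 32, 32, device=device)         # CIFAR-sized batch

    start = time.time()
    with torch.no_grad():
        output = model(inputs)
    if device == 'cuda':
        torch.cuda.synchronize()
    print('forward pass: %.3fs, output shape %s' % (time.time() - start, tuple(output.shape)))

If this hangs too, the problem is in the environment (driver/CUDA mismatch, or silently falling back to CPU) rather than in pretrain_rot.py.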

MoCo on CIFAR datasets

Thanks for the great repo!

I have a quick question: is there any specific reason for not adding the CIFAR & SVHN datasets to the MoCo training script? For example, is MoCo not suitable for them, or is its performance poor on small datasets?

Thanks!
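
One common obstacle is that the ImageNet-style ResNet stem downsamples 32x32 inputs too aggressively. A frequently used adaptation in CIFAR variants of MoCo/SimCLR (an assumed sketch, not code from this repo) shrinks the stem:

    import torch.nn as nn
    import torchvision.models as models

    # Assumed adaptation for 32x32 inputs: replace the 7x7 stride-2 conv
    # with a 3x3 stride-1 conv and remove the max-pool, so the early layers
    # do not downsample the small image away.
    encoder = models.resnet50()
    encoder.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    encoder.maxpool = nn.Identity()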

Training on a custom dataset

Thank you very much for the authors' contribution. It would be even better if you could give the concrete steps for training on one's own dataset.

How to get the image in the readme

Could you give some hints on how to generate the image stored in the assets directory?
I want to produce that type of image on my own dataset.
Thanks
[figure: tsne_self]
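
A minimal way to produce that kind of figure (a sketch using scikit-learn and matplotlib, not the authors' plotting code) is to extract penultimate-layer features for the test set and project them with t-SNE:

    import matplotlib.pyplot as plt
    import numpy as np
    import torch
    from sklearn.manifold import TSNE

    def tsne_plot(model, loader, out_path='tsne.png'):
        # `model` is assumed to return features (classifier head removed)
        # and to live on the same device as the batches from `loader`.
        model.eval()
        feats, labels = [], []
        with torch.no_grad():
            for inputs, targets in loader:
                feats.append(model(inputs).cpu().numpy())
                labels.append(targets.numpy())
        emb = TSNE(n_components=2).fit_transform(np.concatenate(feats))
        labels = np.concatenate(labels)
        plt.figure(figsize=(6, 6))
        plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap='tab10', s=4)
        plt.savefig(out_path, dpi=200)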

Why use 5 times more unlabeled data?

I read the paper.
Question about Appendices E3: Effect of Unlabeled Data amount.

The results of CE + D_U are 21.75, 20.35, 18.36, and 16.88 for {0.5x, 1x, 5x, 10x} unlabeled data (these appear to be error rates, so lower is better).
The 10x result is therefore better than the 5x result.
But the paper selected 5x.

Is there a reason?

Questions about self-supervised learning on cifar10

Thanks for sharing the code! This work is really interesting to me. My questions are as follows:

I'm trying to reproduce the results in Table 2. Specifically, I trained the models with and without self-supervised pre-training (SSP). However, the baselines (without SSP) consistently outperform those with SSP under different training rules (including None, Resample, and Reweight). The best precisions are presented below. For each experimental setting, I ran twice to see whether the results are stable, so there are two numbers per cell.

[screenshot of the results table]

For your reference, I used the following commands:

  • Train Rotation
python pretrain_rot.py --dataset cifar10  --imb_factor 0.01 --arch resnet32
  • Train baseline
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None 
  • Train baseline + SSP
python train.py --dataset cifar10 --imb_factor 0.01 --arch resnet32 --train_rule None --pretrained_model xxx 

Category imbalance

I'm going to do semantic segmentation with DeepLabv3+, but I have a problem with category imbalance. Please help me solve this problem. Thanks

What is the intended learning rate schedule?

def adjust_learning_rate(optimizer, epoch, args):
    epoch = epoch + 1
    if epoch <= 5:
        lr = args.lr * epoch / 5
    elif epoch > 160:
        lr = args.lr * 0.01
    elif epoch > 180:
        lr = args.lr * 0.0001
    else:
        lr = args.lr
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

Hi, thanks for sharing your code!

I have a question about the code referenced above.
In the 'adjust_learning_rate' function, the 'elif epoch > 180' branch (lines 34 and 35 of the source) can never be reached, because the 'epoch > 160' condition is checked first.
Could I ask which learning rate schedule you used for the experiments in the paper?

According to the 'adjust_learning_rate' function, the learning rate may change as follows.

epoch lr
0: args.lr * 1 / 5
1: args.lr * 2 / 5
2: args.lr * 3 / 5
3: args.lr * 4 / 5
4: args.lr * 5 / 5
5 ~ 159: args.lr
160 ~: args.lr * 0.01
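
For what it's worth, if the intent is a 5-epoch warmup followed by a 100x decay after epoch 160 and a 10000x decay after epoch 180, the fix is simply to check the larger threshold first. This is a presumed correction, since only the authors can confirm the intended schedule:

    def adjust_learning_rate(optimizer, epoch, args):
        # Presumed intent: 5-epoch linear warmup, then two step decays.
        # Checking the larger threshold first makes the > 180 branch reachable.
        epoch = epoch + 1
        if epoch <= 5:
            lr = args.lr * epoch / 5      # linear warmup
        elif epoch > 180:
            lr = args.lr * 0.0001         # second decay step
        elif epoch > 160:
            lr = args.lr * 0.01           # first decay step
        else:
            lr = args.lr
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr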

Have you ever tried "Semi-Supervised Imbalanced Learning on ImageNet-LT"?

Hi, have you ever tried "Semi-Supervised Imbalanced Learning" on ImageNet-LT?

According to the experimental results in the paper, semi-supervised imbalanced learning seems to perform better than self-supervised imbalanced learning on CIFAR-10-LT.

If I want to try this experiment, how can I modify dataset/imagenet.py into a dataset/imblance_imagenet.py (similar to imblance_cifar.py)?
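
One possible starting point, mirroring the exponential per-class subsampling that imbalance_cifar.py uses for CIFAR (the class below is a hypothetical sketch, not code from this repo):

    import numpy as np
    from torchvision.datasets import ImageFolder

    class ImbalanceImageNet(ImageFolder):
        # Hypothetical sketch: subsample an ImageFolder with an exponential
        # class-size profile, as imbalance_cifar.py does for CIFAR.
        def __init__(self, root, imb_factor=0.01, rand_seed=0, **kwargs):
            super().__init__(root, **kwargs)
            rng = np.random.RandomState(rand_seed)
            targets = np.array(self.targets)
            num_classes = len(self.classes)
            img_max = np.bincount(targets).max()
            keep = []
            for cls in range(num_classes):
                # Class c keeps img_max * imb_factor**(c / (C-1)) samples.
                num = int(img_max * imb_factor ** (cls / max(num_classes - 1, 1)))
                idx = np.where(targets == cls)[0]
                keep.extend(rng.choice(idx, size=min(num, len(idx)), replace=False))
            self.samples = [self.samples[i] for i in keep]
            self.imgs = self.samples
            self.targets = [s[1] for s in self.samples]

A separate unlabeled pool with pseudo-labels would still be needed on top of this to mirror the semi-supervised CIFAR pipeline.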

About the method

Thank you for sharing your interesting work. Would you mind clarifying what the method "CE(Balanced)" is?

About the proof of Theorem 1

At the end of the proof, the probability of event E is 1 - P1 - P2 - P3.
But why is it not the product of the three probabilities, (1 - P1)(1 - P2)(1 - P3)?
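
For context, that step reads like a standard union bound rather than an independence argument (a reconstruction under the assumption that E is the intersection of three events, each failing with probability P_i):

    % Union bound: write E = E_1 \cap E_2 \cap E_3 with P(E_i^c) = P_i. Then
    \begin{align*}
      P(E) &= 1 - P\big(E_1^c \cup E_2^c \cup E_3^c\big) \\
           &\ge 1 - P(E_1^c) - P(E_2^c) - P(E_3^c) = 1 - P_1 - P_2 - P_3 .
    \end{align*}
    % The product (1-P_1)(1-P_2)(1-P_3) would be the exact probability only
    % if E_1, E_2, E_3 were independent; the union bound needs no independence.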

Error: No module named 'dataset.resnet_cifar' when running

When I run this command:
python train_semi.py --dataset cifar10 --imb_factor 0.02 --imb_factor_unlabel 0.02

I got this error:
Traceback (most recent call last):
  File "train_semi.py", line 15, in <module>
    from dataset.imbalance_cifar import SemiSupervisedImbalanceCIFAR10
  File "/home/insights-user/imbalanced-semi-self/dataset/__init__.py", line 1, in <module>
    from .resnet_cifar import *
ModuleNotFoundError: No module named 'dataset.resnet_cifar'
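
A plausible workaround, assuming the CIFAR ResNet definitions live outside the dataset/ package (check where resnet_cifar.py actually sits in your checkout; the lines below are a hypothetical fix, not from the repo):

    # dataset/__init__.py  (assumed fix)
    # The stale re-export below triggers the ModuleNotFoundError, because
    # dataset/ contains no resnet_cifar.py; delete it or redirect it to the
    # module's real location:
    # from .resnet_cifar import *        # <- remove this line
    from .imbalance_cifar import *       # keep the dataset classes importable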

Where can I set CE(Uniform) and CE(Balanced)?

I see the self-supervised pre-training (SSP) setup.
There are several models trained with SSP:

  1. CE(Uniform) + SSP
  2. CE(Balanced) + SSP

Where can I set the CB (class-balanced) option in the train.py code?
In my opinion, per_cls_weights seems to switch between uniform and balanced.
Does the CB setting correspond to 'Reweight' in args.train_rule?

    if args.train_rule == 'Reweight':
        # Class-balanced re-weighting via the "effective number" of samples
        # (Cui et al., CVPR 2019).
        beta = 0.9999
        effective_num = 1.0 - np.power(beta, cls_num_list)
        per_cls_weights = (1.0 - beta) / np.array(effective_num)
        per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
        per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
    elif args.train_rule == 'DRW':
        # Deferred re-weighting: uniform weights (beta = 0) for the first
        # 160 epochs, then switch to class-balanced weights.
        idx = epoch // 160
        betas = [0, 0.9999]
        effective_num = 1.0 - np.power(betas[idx], cls_num_list)
        per_cls_weights = (1.0 - betas[idx]) / np.array(effective_num)
        per_cls_weights = per_cls_weights / np.sum(per_cls_weights) * len(cls_num_list)
        per_cls_weights = torch.FloatTensor(per_cls_weights).cuda(args.gpu)
    else:
        per_cls_weights = None  # uniform cross-entropy

    if args.loss_type == 'CE':
        criterion = nn.CrossEntropyLoss(weight=per_cls_weights).cuda(args.gpu)
    elif args.loss_type == 'LDAM':
        criterion = LDAMLoss(cls_num_list=cls_num_list, max_m=0.5, s=30, weight=per_cls_weights).cuda(args.gpu)
    elif args.loss_type == 'Focal':
        criterion = FocalLoss(weight=per_cls_weights, gamma=1).cuda(args.gpu)
    else:
        warnings.warn('Loss type is not listed')
        return
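
If that reading is correct, the two configurations would be selected via --train_rule. These are hypothetical invocations (flag values taken from the code above; note the authors may instead realize CE(Balanced) with Resample):

    python train.py --dataset cifar10 --imb_factor 0.01 --loss_type CE --train_rule None      # CE(Uniform)
    python train.py --dataset cifar10 --imb_factor 0.01 --loss_type CE --train_rule Reweight  # CE(Balanced), if CB means re-weighting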

Some problems with the assumption in the paper

Hi, I'm very interested in your paper. In particular, the proofs attract me. However, I have some questions about understanding the proof.

"We assume a properly designed black-box self-supervised task so that the learned representation is Z = k1 ||X||^2 + k2, where k1, k2 > 0. Precisely, this means that we have access to the new features Zi for the i-th data after the black-box self-supervised step, without knowing explicitly what the transformation ψ is."

I'm confused by the following questions:
(1) Why can a properly designed black-box self-supervised task obtain the learned representation Z = k1 ||X||^2 + k2? Do MoCo or the rotation-based self-supervised method satisfy this assumption?

(2) Why can the supervised classification task not obtain a similar representation Z = k1 ||X||^2 + k2?
