kaihuatang / long-tailed-recognition.pytorch

[NeurIPS 2020] This project provides a strong single-stage baseline for Long-Tailed Classification, Detection, and Instance Segmentation (LVIS). It is also a PyTorch implementation of the NeurIPS 2020 paper 'Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect'.

License: GNU General Public License v3.0

Python 46.66% Shell 0.04% Jupyter Notebook 50.06% Dockerfile 0.02% Makefile 0.01% Batchfile 0.02% C++ 1.29% Cuda 1.91%

long-tailed-recognition.pytorch's Introduction

A Strong Single-Stage Baseline for Long-Tailed Problems

This project provides a strong single-stage baseline for Long-Tailed Classification (on the ImageNet-LT and Long-Tailed CIFAR-10/-100 datasets), Detection, and Instance Segmentation (on the LVIS dataset). It is also a PyTorch implementation of the NeurIPS 2020 paper Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect, which proposes a general solution to remove the bad momentum causal effect for a variety of Long-Tailed Recognition tasks. The code is organized into three folders:

  1. The classification folder supports long-tailed classification on the ImageNet-LT and Long-Tailed CIFAR-10/CIFAR-100 datasets.
  2. The lvis_old folder (deprecated) supports long-tailed object detection and instance segmentation on the LVIS V0.5 dataset, and is built on top of mmdet V1.1.
  3. The latest version of long-tailed detection and instance segmentation is in the lvis1.0 folder. Since both LVIS V0.5 and mmdet V1.1 are no longer available on their homepages, we had to re-implement our method on mmdet V2.4 using LVIS V1.0 annotations.

Slides

If you want to present our work in your group meeting, introduce it to your friends, or seek answers to some ambiguous parts of the paper, feel free to use our slides. They come in two versions: a one-hour full version and a five-minute short version.

New Long-tailed Settings

If you are interested in a more general long-tailed classification setting that considers both class-wise (inter-class) imbalance and attribute-wise (intra-class) imbalance, please refer to our ECCV 2022 paper Invariant Feature Learning for Generalized Long-Tailed Classification and the corresponding project.

Installation

The classification part also works with lower versions of the following requirements. For detection and instance segmentation (mmdet V2.4), however, I tested some lower versions of Python and PyTorch, and all of them failed. If you want to try other environments, please check the mmdetection updates.

Requirements:

  • PyTorch >= 1.6.0
  • Python >= 3.7.0
  • CUDA >= 10.1
  • torchvision >= 0.7.0
  • gcc version >= 5.4.0

Step-by-step installation

conda create -n longtail pip python=3.7 -y
source activate longtail
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
pip install pyyaml tqdm matplotlib scikit-learn h5py

# download the project
git clone https://github.com/KaihuaTang/Long-Tailed-Recognition.pytorch.git
cd Long-Tailed-Recognition.pytorch

# the following part is only used to build mmdetection 
cd lvis1.0
pip install mmcv-full
pip install mmlvis
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

Additional Notes

When we wrote the paper, we were using LVIS V0.5 and mmdet V1.1 for our long-tailed instance segmentation experiments, but both have since been deprecated. If you want to reproduce our results on LVIS V0.5, you have to find a way to build an mmdet V1.1 environment and use the code in the lvis_old folder.

Datasets

ImageNet-LT

ImageNet-LT is a long-tailed subset of the original ImageNet. You can download the dataset from its homepage. After downloading it, change the data_root of 'ImageNet' in the ./classification/main.py file.

CIFAR-10/-100

When you run the code for the first time, our dataloader will automatically download CIFAR-10/-100. Set the data_root in ./classification/main.py to the path where you want to store the CIFAR data.

LVIS

The Large Vocabulary Instance Segmentation (LVIS) dataset uses the COCO 2017 train, validation, and test image sets. If you have already downloaded the COCO images, you only need to download the LVIS annotations. The LVIS val set contains images from COCO 2017 train in addition to the COCO 2017 val split.

You need to put all the annotations and images under ./data/LVIS like this:

data
  |-- LVIS
    |--lvis_v1_train.json
    |--lvis_v1_val.json
      |--images
        |--train2017
          |--.... (images)
        |--test2017
          |--.... (images)
        |--val2017
          |--.... (images)

Getting Started

For long-tailed classification, please go to [link]

For long-tailed object detection and instance segmentation, please go to [link]

Advantages of the Proposed Method

  • Compared with the previous state-of-the-art Decoupling, our method requires only one-stage training.
  • Most existing methods for long-tailed problems use the data distribution to conduct re-sampling or re-weighting during training, which rests on the inappropriate assumption that you can know the future distribution before you start to learn. The proposed method, in contrast, doesn't need to know the data distribution during training; we only need to use an average feature for inference after the model is trained.
  • Our method can be easily transferred to other tasks. We outperform the previous state-of-the-art methods Decoupling, BBN, and OLTR in image classification, and we achieve better results than EQL, the winner of the 2019 LVIS challenge, in long-tailed object detection and instance segmentation (under the same settings with even fewer GPUs).
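
To make "use an average feature for inference" more concrete, here is a minimal single-head sketch in plain Python. It follows one reading of the inference formula in the paper: a normalized logit minus the part of it that lies along the averaged feature direction. The function name tde_logits and the helper names are hypothetical, the default values of tau, gamma and alpha are taken from the config quoted in the issues on this page, and this is not the repository's implementation (which uses PyTorch and multiple heads).

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def tde_logits(x, weights, d_hat, tau=16.0, gamma=0.03125, alpha=1.0):
    """Single-head sketch: normalized logit minus the share explained by d_hat.

    x       : feature vector of one sample
    weights : one weight vector per class
    d_hat   : direction of the averaged training feature
    """
    x_norm = _norm(x)
    cos_xd = _dot(x, d_hat) / (x_norm * _norm(d_hat))
    logits = []
    for w in weights:
        scale = _norm(w) + gamma
        direct = _dot(w, x) / (scale * x_norm)           # normalized logit
        along_d = alpha * cos_xd * _dot(w, d_hat) / scale  # effect along d_hat
        logits.append(tau * (direct - along_d))
    return logits

# Toy case: class 0 is aligned with the averaged feature direction.
weights = [[1.0, 0.0], [0.0, 1.0]]
d_hat = [1.0, 0.0]
x = [1.0, 1.0]
plain = tde_logits(x, weights, d_hat, alpha=0.0)     # no subtraction
adjusted = tde_logits(x, weights, d_hat, alpha=1.0)  # with subtraction
```

In this toy case the two classes tie when alpha is 0, and the class aligned with d_hat loses its advantage when alpha is 1. Nothing here depends on the class frequencies, which is the point made in the list above.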

Citation

If you find our paper or this project helps your research, please kindly consider citing our paper in your publications.

@inproceedings{tang2020longtailed,
  title={Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect},
  author={Tang, Kaihua and Huang, Jianqiang and Zhang, Hanwang},
  booktitle= {NeurIPS},
  year={2020}
}

long-tailed-recognition.pytorch's Issues

about optimizer

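Does de-confounding work with other optimizers? After I switched the optimizer in the code to Adam, the results were very poor. When changing the optimizer, which other settings need to be changed accordingly?

About the multi_head classifier

What is the motivation for using a multi-head classifier? The experimental results show that it does help. Is that because the multi-head classifier introduces the idea of a voting ensemble?

About do(X=x)

Thank you for your work; it has greatly broadened my horizons. I would like to ask about the do(X=x) operation in your paper, which removes the influence of M on X. I don't quite understand how this is achieved, and reading your code did not make it clearer. Could you tell me where this part is implemented in the code? I look forward to your reply. Thanks.

On the effectiveness of TDE when the training and test sets are both long-tailed

Dear author, thank you for your work and for open-sourcing the code.
I tried training on my own dataset with your method (5 classes; the training and test sets are both long-tailed and identically distributed).

Training and model parameters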

# default num_head = 2
criterions:
  PerformanceLoss:
    def_file: ./loss/SoftmaxLoss.py
    loss_params: {}
    optim_params: null
    weight: 1.0
last: false
# apply incremental pca to remove main components
apply_ipca: false
num_components: 512
model_dir: null
tuning_memory: false
networks:
  classifier:
    def_file: ./models/CausalNormClassifier.py
    optim_params: {lr: 0.001, momentum: 0.9, weight_decay: 0}
    scheduler_params: {coslr: false, endlr: 0.0, gamma: 0.1, step_size: 30, warmup: true, lr_step: [60, 80], lr_factor: 0.1, warm_epoch: 5}
    params: {dataset: GCAssCls, feat_dim: 128, num_classes: 5, stage1_weights: false, use_effect: true, num_head: 2, tau: 16.0, alpha: 1.0, gamma: 0.03125}
  feat_model:
    def_file: ./models/ResNet18Feature.py
    fix: false
    optim_params: {lr: 0.001, momentum: 0.9, weight_decay: 0}
    scheduler_params: {coslr: false, endlr: 0.0, gamma: 0.1, step_size: 30, warmup: true, lr_step: [60, 80], lr_factor: 0.1, warm_epoch: 5}
    params: {dataset: GCAssCls, dropout: 0.5, stage1_weights: false, use_fc: True, fc_channel: 128, pretrained: True}
shuffle: false
training_opt:
  backbone: resnet18
  batch_size: 128
  dataset: GCAssCls
  display_step: 10
  display_grad: False
  display_grad_step: 10
  feature_dim: 128
  log_dir: ./logs/GCAssCls/models/resnet18_e100_C5_warmup_causal_norm_lr1e-3_adam
  log_root: /logs/GCAssCls
  num_classes: 5
  num_epochs: 100
  num_freeze_epochs: 6
  num_workers: 12
  open_threshold: 0.1
  sampler: null
  sub_dir: models
  optimizer: adam

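I used ResNet18 as the backbone, initialized from an ImageNet-pretrained model, and attached an fc layer and dropout to the backbone. After 100 epochs, accuracy on the training set reaches about 99.9%, but the resulting model performs poorly on the test set.
Setting use_effect to false in the classifier gives the test result without TDE.

Results on the long-tailed test set

[image]

A few questions

  • After adding TDE, accuracy on the many-shot classes drops sharply, while recall on the few-shot classes improves somewhat. Is this normal?
  • In the paper's classification experiments, the test sets are all class-balanced: ImageNet-LT has 50 samples per class, and CIFAR-10/100 is sampled in an imbalanced way only for training. In real production scenarios, however, many classification tasks are themselves long-tailed, and the collected training set mostly follows the same distribution as the application. Doesn't evaluating the model under a distribution different from the real scenario defeat the purpose?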

Question about methods Cosine and Capsule in the paper

Hi @KaihuaTang, thank you for doing such inspiring work and opening the source code. Would you mind sharing some details of the Cosine and Capsule methods in Table 2 of the paper?

  1. I am a bit confused because I don't know whether the cosine similarity is used in the training phase or only in the test phase. I got no performance improvement when using it in the training phase for either cifar100-LT or ImageNet-LT.

  2. For Capsule, I also don't know how it is done.

Thanks a lot!

about dataset

Hello, the ImageNet-LT dataset includes three folders: "val", "train" and "test". Are the results in the paper based on the val folder or the test folder?

Moving average for d_hat

In the code the moving average is like this,

self.embed_mean = self.mu * self.embed_mean + self.features.detach().mean(0).view(-1).cpu().numpy()

In the above scenario, more importance is being given to newer epochs, in that case, why can't we just use the final epoch's model?
Or is there any other rationale behind it?

Should the line have been like the following instead ?
self.embed_mean = self.mu * self.embed_mean + (1-self.mu)*self.features.detach().mean(0).view(-1).cpu().numpy()

where the self.mu = 0.9.
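
To compare the two update rules quoted above, here is a small scalar sketch in plain Python (the function name accumulate is hypothetical, and scalars stand in for the feature vectors). Starting from zero, the two rules differ only by the constant factor 1 - mu, so they give the same direction but a different scale.

```python
def accumulate(features, mu=0.9, normalize=False):
    """Run m <- mu * m + c * f over a sequence of scalar features.

    normalize=False uses c = 1 (the line quoted from the code);
    normalize=True uses c = 1 - mu (the variant proposed above).
    """
    coeff = (1.0 - mu) if normalize else 1.0
    m = 0.0
    for f in features:
        m = mu * m + coeff * f
    return m

constant = [2.0] * 200
unnormalized = accumulate(constant, mu=0.9)                # approaches 2 / (1 - 0.9) = 20
normalized = accumulate(constant, mu=0.9, normalize=True)  # approaches 2
```

With a constant input of 2.0, the quoted rule settles near 20 and the proposed rule near 2. Whether the scale matters depends on how the consumer of embed_mean uses it, for example whether it normalizes the vector first.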

Focal loss on ImageNet-LT

Thank you for your extraordinary work. Could you provide the config file (.yaml) for Focal Loss on ImageNet-LT?

Migration to multi label long tailed recognition

Thank you for your excellent work. I want to try to migrate the classifier (CausalNormClassifier) in this work to the task of multi-label long-tailed recognition. However, the following problem appeared. I believe the model has not learned anything at all, or perhaps the output is smoothed out at the inference stage, because the loss curve during training looks normal. Can you give me some advice or tell me the possible problems?

[image: Capture]

About checkpoints

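I noticed that your code loads ./data/checkpoints/final_model_checkpoint.pth. I found that in the lws method this has a large impact on model performance. Could you tell me how this pretrained model was obtained, and why it is used?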

Cifar Validation set

Hello. I found in ImbalanceCIFAR.py, the validation set and the test set seem to be the same. Is it better to use a balanced training set (50000 frames) as the validation set instead of using the 10000 frame test set for validation? Thanks!

iNaturalist18

Is the process for importing iNaturalist18 incomplete?

imagenet-LT

Thank you to the author. I would like to ask how ImageNet-LT is generated: according to what rules are the images sampled?

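About classifier normalization?

Hello, I learned a lot from your paper.
I have one question: the paper mentions normalizing the logits, borrowing the idea of the propensity score. What is the principle behind this? If the normalization is omitted, does it have a large impact on de-confounding?

Does the cls branch of the rcnn head in detection apply softmax twice?

The first time is in the cos_forward function, where softmax is applied if self.KEEP_FG.

The second time is in the get_bboxes function of the base class bbox_head:

scores = F.softmax(cls_score, dim=1) if cls_score is not None else None

Did I misunderstand, or is this the intended design?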
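
As background for this question, the sketch below shows in plain Python what applying softmax twice does to a score vector in general. It does not settle whether the repository actually does so; it only illustrates the effect. The ranking is preserved, but the second pass flattens the probabilities because its inputs already lie in [0, 1].

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.0]
once = softmax(logits)   # sharply peaked on the first class
twice = softmax(once)    # same ordering, much flatter
```

A flatter distribution matters mainly where a fixed score threshold is applied afterwards, since fewer predictions would clear it.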

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

                   OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine           -                      0.x
MMCV               1.x                    2.x
MMDetection        0.x, 1.x, 2.x          3.x
MMAction2          0.x                    1.x
MMClassification   0.x                    1.x
MMSegmentation     0.x                    1.x
MMDetection3D      0.x                    1.x
MMEditing          0.x                    1.x
MMPose             0.x                    1.x
MMDeploy           0.x                    1.x
MMTracking         0.x                    1.x
MMOCR              0.x                    1.x
MMRazor            0.x                    1.x
MMSelfSup          0.x                    1.x
MMRotate           1.x                    1.x
MMYOLO             -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

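Question

Hello, I have a question. You say griffin = lion + eagle, which is a case where the classes differ a lot and head-class information can be used. But in fine-grained classification tasks, where all classes are very similar and the distribution is also severely long-tailed, can head-class information still be borrowed? That is, does griffin = lion + eagle still hold?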

embed_mean grow to inf

Hello, I have added CausalNormClassifier to my own project and recorded embed_mean. However, I ran into the problem that embed_mean grows to inf.
I print torch.sum(embed_mean):

tensor(1902.6863, device='cuda:0')
Train:   2%|                                   | 1/45 [00:07<05:36,  7.65s/it]
tensor(8754.8096, device='cuda:0')
Train:   4%|                                   | 2/45 [00:06<03:03,  4.27s/it]
tensor(33422.0312, device='cuda:0')
Train:   7%|                                  | 3/45 [00:08<02:47,  3.99s/it]
tensor(122225.3984, device='cuda:0')

the embed_mean is updated as follows:

self.embed_mean = torch.zeros(int(self.training_opt['feature_dim'])).numpy()
self.embed_mean = self.mu * self.embed_mean + self.features.detach().mean(0).view(-1).cpu().numpy()

During training, the gradient becomes smaller and smaller, so the velocity will not grow to inf. But the features generated by the model may not become smaller, so embed_mean seems to grow to inf and I can't store this variable. How can I solve this problem? Or is there anything I missed?
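
One general observation: for 0 <= mu < 1 and features whose magnitude stays below some bound F, the recursion m <- mu * m + f cannot exceed F / (1 - mu). Unbounded growth therefore points to feature magnitudes that keep growing, or to mu >= 1. If only the direction of the averaged feature is needed downstream (an assumption; check how your classifier consumes embed_mean), one option is to keep a unit vector alongside the running mean, as in this plain Python sketch. The function name update_direction is hypothetical.

```python
import math

def update_direction(mean, feature, mu=0.9):
    """One normalized moving-average step, plus the unit vector of the result."""
    mean = [mu * m + (1.0 - mu) * f for m, f in zip(mean, feature)]
    norm = math.sqrt(sum(m * m for m in mean))
    unit = [m / norm for m in mean] if norm > 0 else list(mean)
    return mean, unit

# Features whose magnitude grows every step, always along (3, 4).
mean = [0.0, 0.0]
for step in range(1, 50):
    mean, unit = update_direction(mean, [3.0 * step, 4.0 * step])
# mean keeps growing with the features; unit stays at length 1.
```

The unit vector is always safe to store, whatever happens to the raw magnitude.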

ImageNet-LT

When using ImageNet-LT, how do I switch between many-shot, medium-shot and few-shot?
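
For context, the three groups are usually not separate datasets but a partition of the classes by their number of training images. The plain Python sketch below assumes the thresholds commonly used with ImageNet-LT (more than 100 images for many-shot, fewer than 20 for few-shot); both the thresholds and the function name split_by_shot are assumptions, not taken from this repository.

```python
def split_by_shot(train_counts, many_thr=100, few_thr=20):
    """Group class ids by their number of training images.

    train_counts maps class id -> number of training samples.
    """
    groups = {"many": [], "medium": [], "few": []}
    for cls, count in sorted(train_counts.items()):
        if count > many_thr:
            groups["many"].append(cls)
        elif count < few_thr:
            groups["few"].append(cls)
        else:
            groups["medium"].append(cls)
    return groups

groups = split_by_shot({0: 500, 1: 100, 2: 20, 3: 19})
```

Per-group accuracy is then the accuracy over the test samples whose class falls in each group.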

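A serious bug

Thanks for the excellent idea contributed by this paper. However, the multi-head classifier causes the loss and the weights to suddenly become NaN during training. Please look into this issue.

About training accuracy on CIFAR-10

@KaihuaTang I came here after reading about this work on Zhihu. Thank you very much for your work and for open-sourcing it. I want to use this model to train on my own data. First I trained on the full CIFAR-10 data for 400 epochs with batch_size set to 32 and almost no other parameter changes. The final result is 91%+ on the training set and 86% on the test set, which seems far from the numbers reported in the paper. Or is this training method ineffective for non-long-tailed problems?

About the loss suddenly becoming NaN and acc becoming 0 during training

Hello, thank you for sharing the code. While training the model, the loss suddenly became NaN and acc became 0. I trained from scratch twice, but the same problem occurred both times.
My training environment is:

  • Nvidia RTX 2080Ti cuda10.1+cudnn7.6.3
  • python3.8.5+pytorch1.5.1
  • The dataset is ILSVRC2015
  • Since the GPU has only 11 GB of memory, I changed batch_size to 64 in the config
  • Everything else is unchanged

Do you know what the possible cause might be? Is the ImageNet-LT you use extracted from ILSVRC2015?
[image: loss2nan]

[image: loss2nan2]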

A question regarding the assignment of num_head, tau, alpha and gamma

Hey @KaihuaTang, I am an NLPer. Thanks for your interesting work. I've observed that you use the same configs for the parameters of class Causal_Norm_Classifier, namely num_head, tau, alpha and gamma, for both the CIFAR and ImageNet datasets. How come? Could you also please explain the idea behind that kind of assignment? Thanks a lot.

LVIS training bug: TypeError: can't pickle _thread.RLock objects

Describe the bug
Training on the COCO dataset works, but when I train on LVIS I hit this bug.

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
sys.platform: linux
Python: 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54) [GCC 7.3.0]
CUDA available: False
GCC: gcc (GCC) 5.2.0
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

TorchVision: 0.5.0
OpenCV: 4.4.0
MMCV: 1.1.2
MMDetection: 2.4.0+
MMDetection Compiler: GCC 7.3
MMDetection CUDA Compiler: 10.1

Error traceback
If applicable, paste the error traceback here.

2020-11-14 20:28:12,249 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
  File "./tools/train.py", line 177, in <module>
    main()
  File "./tools/train.py", line 173, in main
    meta=meta)
  File "/data/cdp_algo_ceph_ssd/users/georgeni/causallvis/mmdet/apis/train.py", line 143, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 27, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 719, in __init__
    w.start()
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _thread.RLock objects
^C^C^C^C^C^C^C^C^C^C^C^C^CTraceback (most recent call last):
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/data/anaconda3/envs/zxcheng/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/data/anaconda3/envs/zxcheng/bin/python', '-u', './tools/train.py', '--local_rank=1', 'configs/lvis/htcnosemlvis.py', '--launcher', 'pytorch', '--work-dir', 'work_bendilvis/lvis/htcnosemlvis', '--no-validate']' returned non-zero exit status 1.
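
The traceback passes through popen_spawn_posix, so the DataLoader workers are started with the spawn method, which pickles the worker process object together with what it references. The plain Python sketch below shows one way to find which attribute of an object cannot be pickled. The class DatasetWithLock is a hypothetical stand-in, and the sketch does not identify which object is responsible in the LVIS case.

```python
import pickle
import threading

def unpicklable_attributes(obj):
    """Return the names of instance attributes that pickle cannot serialize."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad

class DatasetWithLock:
    """Hypothetical stand-in for a dataset that keeps a lock as an attribute."""
    def __init__(self):
        self.annotations = {"image_id": 1}
        self.lock = threading.RLock()

culprits = unpicklable_attributes(DatasetWithLock())
```

Running the same check on the real dataset object before training starts narrows down which member holds the lock.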

About computing d in lvis1.0

def update_embed(self, targets, gt_label):
    if self.training:
        # remove background
        with torch.no_grad():
            fg_target = targets[gt_label > 0].clone().detach().mean(0, keepdim=True)
            self.causal_embed = self.MU * self.causal_embed + fg_target
    return

In newer mmdet versions, the positive samples are labeled 0 to (num_classes-1). Is gt_label > 0 above a mistake?
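
To show what is at stake, here is a plain Python sketch of the foreground mask under the two label conventions. In mmdet 2.x the background label is num_classes and the foreground labels are 0 to num_classes-1, whereas mmdet 1.x used 0 for background. The function name foreground_mask is hypothetical, and whether this repository remaps labels before calling update_embed is not visible from the snippet above.

```python
def foreground_mask(labels, num_classes, background_first):
    """Return a list of booleans marking the foreground entries.

    background_first=True  : background is 0, foreground is 1..num_classes
    background_first=False : foreground is 0..num_classes-1, background is num_classes
    """
    if background_first:
        return [label > 0 for label in labels]
    return [0 <= label < num_classes for label in labels]

labels = [0, 2, 3]  # with num_classes=3, label 3 is background in mmdet 2.x
old_style = foreground_mask(labels, 3, background_first=True)
new_style = foreground_mask(labels, 3, background_first=False)
```

Under the 2.x convention, a `> 0` test drops class 0 and keeps the background entries.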

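Question about the paper

Hello, I would like to ask how Equation 6 and the corresponding f(x,d;w) and g(x,d;w) in the paper "Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect" are derived. Thanks.

A question about the code that computes the moving-average feature

Hello, first of all, thank you very much for your solid work.

While reading the code, I ran into a question I would like to ask:

In the update_embed function in lvis1.0\mmdet\models\roi_heads\bbox_heads\convfc_bbox_head.py: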

        if self.training:
            # remove background
            with torch.no_grad():
                fg_target = targets[gt_label > 0].clone().detach().mean(0, keepdim=True)   
                self.causal_embed = self.MU * self.causal_embed + fg_target    
        return

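As I understand it, this adds the value of the feature map (fg_target) to the moving average rather than the gradient, which does not match g_t in the equation shown in the image below. Is my understanding of the code wrong? Could you please correct me? Thank you very much!
[image]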

Regarding training_opt: {open_threshold: 0.1}

What does training_opt: {open_threshold: 0.1} do? It looks like it points to the theta below, but theta is not used at all.

def F_measure(preds, labels, theta=None):
    # Regular f1 score
    return f1_score(labels.detach().cpu().numpy(), preds.detach().cpu().numpy(), average='macro')
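
For readers unfamiliar with the idea, an open-set threshold is typically used to reject low-confidence predictions before scoring. The plain Python sketch below shows that general pattern; the function name apply_open_threshold and the use of -1 as the "unknown" label are assumptions, and this is not code from the repository.

```python
def apply_open_threshold(probabilities, theta):
    """Return the arg-max class per sample, or -1 if its confidence is below theta."""
    preds = []
    for row in probabilities:
        best = max(range(len(row)), key=row.__getitem__)
        preds.append(best if row[best] >= theta else -1)
    return preds

probs = [[0.40, 0.35, 0.25],   # no class is confident enough
         [0.10, 0.80, 0.10]]   # class 1 clearly wins
preds = apply_open_threshold(probs, theta=0.5)
```

In a closed-set setting such as the one in F_measure above, no prediction is ever rejected, which would explain why theta goes unused.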

Cifar experiment setting

Thank you for sharing your interesting work. Would you mind clarifying some of the CIFAR experiment settings:

  1. Are the CIFAR results in the paper initialised from ImageNet-pretrained weights? If so, how long did you train the ImageNet-pretrained model?

  2. Do we need to fix the random seed when generating the CIFAR-LT dataset, since it uses np.random.shuffle()? If not, the dataset might not be exactly the same across runs?
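
On the second point, here is a plain Python sketch of reproducible long-tailed subsampling. It assumes the exponential profile commonly used for CIFAR-LT, where class c keeps about n_max * imb_factor ** (c / (num_classes - 1)) samples, and it uses a local random.Random(seed) generator instead of np.random so that it has no dependencies. The function name long_tailed_indices is hypothetical, and this is not the repository's ImbalanceCIFAR.py.

```python
import random

def long_tailed_indices(labels, num_classes, imb_factor=0.01, seed=0):
    """Pick a reproducible long-tailed subset of sample indices.

    labels must contain integers in 0..num_classes-1, with num_classes > 1.
    """
    rng = random.Random(seed)  # local generator, independent of global state
    per_class = {c: [] for c in range(num_classes)}
    for index, label in enumerate(labels):
        per_class[label].append(index)
    n_max = max(len(v) for v in per_class.values())
    selected = []
    for c in range(num_classes):
        keep = int(n_max * imb_factor ** (c / (num_classes - 1.0)))
        rng.shuffle(per_class[c])
        selected.extend(per_class[c][:keep])
    return selected

labels = [c for c in range(3) for _ in range(100)]
run_a = long_tailed_indices(labels, num_classes=3, seed=0)
run_b = long_tailed_indices(labels, num_classes=3, seed=0)
```

With the same seed the two runs select identical indices. With different seeds the per-class counts stay the same, but the chosen images differ.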

feat_uniform.yaml

  1. Is feat_uniform.yaml just normal training?
  2. For ResNet32Feature.py, why are you using the BBN_ResNet_Cifar class instead of using the parent class ResNetFeature ?

Lvis setting

Hello, I see that in the config, num_class is set to 80 for COCO and 1203 for LVIS. Why isn't the background counted here? Thank you very much~

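Help with the paper

Hello author, I only recently came across this paper on long-tailed classification. There are a few points I don't quite understand.
1) The paper states that the effect of the bias toward head classes only needs to be removed at validation or test time. After removing it, how does the model perform on the training set?
2) This bias can be seen as the feature output for a zero input, so could the bias of every convolution and BatchNormalization layer in the network simply be disabled (set to False)?
3) What does alpha mean, and why can it be greater than 1?
[image]

Thanks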
