
Camera-based Person Re-identification

The official code for Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization. It implements the fundamental idea of our paper: aligning all training and testing cameras. This code is based on an early version of Cysu/open-reid.

Demonstration

Details

The goal of our code is to provide a generic camera-aligned framework for future research. The fundamental principle is therefore to make the entire camera-alignment process transparent to the neural network and the loss functions. To this end, we make two major changes.

First: we avoid customizing the BatchNorm layer. Otherwise, the forward pass would require an additional input for the camera IDs, and since the nn.Sequential module is widely used in PyTorch, a customized BatchNorm layer would force massive changes to network definitions. Instead, we use the official BatchNorm layer. During training, we can simply use the official BatchNorm implementation and feed the network with images from the same camera. At this stage, the collected running_mean and running_var are ignored, since they will always be overridden in the testing stage; the BN parameter momentum can therefore be set to any value. For testing, we change the default definition of BatchNorm layers from:

nn.BatchNorm2d(planes, momentum=0.1)

to:

nn.BatchNorm2d(planes, momentum=None)

Note:

In PyTorch, momentum=None is not equivalent to momentum=0.0: it computes a cumulative moving average rather than an exponential one. See https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html for details.
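
A quick standalone check of this behavior (not part of this repo): with momentum=None, the running statistics converge to the plain arithmetic mean over all forwarded batches.

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(1, momentum=None)  # cumulative moving average
bn.train()
for value in (0.0, 1.0, 2.0):
    # each batch is constant, so its per-channel batch mean is exactly `value`
    bn(torch.full((4, 1, 8, 8), value))
print(bn.running_mean)  # tensor([1.]) -- the arithmetic mean of 0.0, 1.0, 2.0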

Then, given several mini-batches from a specific camera, we simply set the network to train mode and forward all of them. After forwarding all these batches, the running_mean and running_var in each BatchNorm layer are the statistics of this exact camera. We then set the network to eval mode and process images from this specific camera.
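
A minimal sketch of this procedure (not the repo's exact code; camera_loader, a DataLoader that yields images from a single camera, is a hypothetical placeholder):

import torch
import torch.nn as nn

def adapt_bn_to_camera(model, camera_loader):
    # reset BN statistics so nothing leaks in from other cameras
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
    model.train()  # in train mode, BN accumulates running statistics
    with torch.no_grad():
        for images in camera_loader:
            model(images)  # with momentum=None, running_mean/var become this camera's average
    model.eval()  # BN now normalizes with this camera's statistics

After adapt_bn_to_camera returns, features for this camera are extracted with the model left in eval mode, as usual.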

Second: during training, we need to re-organize mini-batches, as sketched below. Given a tensor produced by an arbitrary sampler, we split it by the corresponding camera IDs and re-organize it as a list of tensors. This is achieved by our customized Trainer. Our DataParallel then forwards these tensors one by one, assembles all outputs, and feeds them to the loss function in the same way as the conventional DataParallel.
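
A minimal sketch of the splitting step (assuming each sampled batch comes with a camids tensor; this is not the Trainer's exact code):

import torch

def split_by_camera(images, camids):
    # split one sampled batch into a list of per-camera sub-batches
    return [images[camids == cam] for cam in torch.unique(camids)]

images = torch.randn(8, 3, 256, 128)
camids = torch.tensor([0, 0, 1, 1, 1, 2, 2, 0])
sub_batches = split_by_camera(images, camids)
print([x.shape[0] for x in sub_batches])  # [3, 3, 2]

The network then forwards each sub-batch separately, and the outputs are concatenated before being passed to the loss function.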

Preparation

1. Download Market-1501, DukeMTMC-reID, and MSMT17 and organize them as follows:

.
+-- data
|   +-- market
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- duke
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- msmt17
|       +-- train
|       +-- test
|       +-- list_train.txt
|       +-- list_val.txt
|       +-- list_query.txt
|       +-- list_gallery.txt
+-- other files in this repo

Note: For MSMT17, we highly recommend the V1 version. Our experiments show that the noise introduced in the V2 version hurts the performance of both the fully supervised learning and direct transfer tasks.

2. Install the required packages

pip install -r requirements.txt

Note: Our code is only tested with Python 3.

3. Put the official PyTorch ResNet-50 pretrained model into your home folder: '~/.torch/models/'
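
For example, the checkpoint can be fetched with torch's model zoo (the URL below is torchvision's standard ResNet-50 release; adjust the target filename if your setup expects a different one):

import os
from torch.utils import model_zoo

model_dir = os.path.expanduser('~/.torch/models')
os.makedirs(model_dir, exist_ok=True)
# downloads resnet50-19c8e357.pth into ~/.torch/models/ unless it is already there
model_zoo.load_url('https://download.pytorch.org/models/resnet50-19c8e357.pth',
                   model_dir=model_dir)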

Usage

1. Train a ReID model

Reproduce the results in our paper

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python train_model.py train --trainset_name market --save_dir='market_demo'

Note that our training code also supports an arbitrary number of GPUs.

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0,1,2,3 \
python train_model.py train --trainset_name market --save_dir='market_demo'

However, since the current implementation is immature, the speedup ratio is not ideal. Any advice on parallel acceleration is welcome.

2. Evaluate a trained model

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python test_model.py test --testset_name market --save_dir='market_demo'

To reproduce our reported performance, each experiment should be conducted 10 times.

Trained Models

You can download our trained models via Google Drive.

Cite our paper

If you use our code in your paper, please use the following BibTeX entry.

@inproceedings{zhuang2020rethinking,
  title={Rethinking the Distribution Gap of Person Re-identification with Camera-Based Batch Normalization},
  author={Zhuang, Zijie and Wei, Longhui and Xie, Lingxi and Zhang, Tianyu and Zhang, Hengheng and Wu, Haozhe and Ai, Haizhou and Tian, Qi},
  booktitle={European Conference on Computer Vision},
  pages={140--157},
  year={2020},
  organization={Springer}
}


camera-based-person-reid's Issues

Why is momentum set to None?

Hi,

First and foremost, thanks for your code!

As shown in your code, you set the momentum of BN to None. In the testing stage, this means that:

running_mean = mean of the last mini-batch
running_var = var of the last mini-batch

So I wonder why the number of mini-batches influences your results.

I think this is a matter of chance: if you happen to choose the best mini-batch, you will get the best results, even if you only use one mini-batch to calculate the running mean and var of the test camera.

Testing without updating BN

CBN is a great idea: it achieves remarkable performance gains while updating only a small number of parameters. To reproduce the baseline in your paper, I added a normal testing function after line 49 of your test_model.py, before the BN statistics are updated for each camera. However, the performance I get this way is more than 10 points below your baseline. May I ask whether something was missed when (or after) copying the model's state dict, or whether the saved state dict itself is problematic?
The code is as follows:

def _normal_testing(data):
    data_loader = DataLoader(
        data_manager.init_datafolder(opt.testset_name, data, TestTransform(opt.height, opt.width)),
        batch_size=opt.test_batch, num_workers=opt.workers,
        pin_memory=pin_memory, shuffle=False
    )
    fs, pids, camids = reid_evaluator.produce_features(data_loader, normalize=True)
    return fs, pids, camids

print('Processing query features...')
qf, q_pids, q_camids = _normal_testing(dataset.query)
print(qf.shape)
print('Processing gallery features...')
gf, g_pids, g_camids = _normal_testing(dataset.gallery)
print(gf.shape)
print('Computing CMC and mAP...')
reid_evaluator.get_final_results_with_features(qf, q_pids, q_camids, gf, g_pids, g_camids)

Using CBN during the training stage

Hello! I read your paper and think it is very nice work. In the ablation study, Table 6 reports the effect of using CBN during the training stage, which confuses me a little. How exactly is CBN used during training? Do you sample the mean and variance of each camera's images before each epoch?

On the design of the training loss

This is excellent work, achieving a large performance gain by aligning the statistics of different cameras. However, I noticed that your loss function is CrossEntropyLoss and your sampler is IdentitySampler, so why don't you add a triplet loss? Does adding a triplet loss degrade your model's performance? I trained a fastreid model and applied your CBN at test time, and the mAP dropped to single digits, for both the cross-domain and the supervised settings. Does this indicate that CBN depends on a fixed training strategy?

About CamDataParallel

The key part of the code should be CamDataParallel, right? If I want to use DistributedDataParallel instead, how should I modify it?

How are the BatchNorm layers updated during the training process?

Hello, I greatly appreciate the work; it is pretty resourceful.
I am trying to understand the training process, and my question is: how are the running means and variances of the BatchNorm layers affected over time during training if I have data from multiple cameras?

Say I am training on a single GPU: there would be only one replica created in your custom DataParallel module, and all the sub-batches belonging to different cameras would be sent through the same replica or model. Wouldn't this cause the running means and variances of the BatchNorm layers to be updated using data from all the cameras? I understand that we eventually drop these values during the testing phase, but this would make the BatchNorm updates uneven during the training phase, and the network wouldn't learn efficiently. Moreover, wouldn't it defeat the purpose of isolating the BatchNorm statistics per camera? Please share your thoughts on this. Thanks in advance.

On the implementation of CBN

Hello, thank you very much for your work, which I greatly admire! I see that in trainer.py, the argument passed into self._forward() is a list whose elements are the images of the different cameras in the current batch. During the forward pass, does the code iterate over this list, compute the mean and standard deviation of each element (camera) separately, and then perform per-camera normalization? Thanks.

Does a larger model make the loss hard to converge?

Hello, I reproduced the results of the paper on the Market and Duke datasets. I then tried to use this method in my own larger network, but the cross-entropy loss (xent) became difficult to converge. Have you tried other baselines, for example the strong baseline? Does a larger model make it difficult for the loss to converge?
