
Camera-based Person Re-identification

The official code for Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization. It implements the fundamental idea of our paper: aligning all training and testing cameras. This code is based on an early version of Cysu/open-reid.

Demonstration

Details

The goal of our code is to provide a generic camera-aligned framework for future research. The fundamental principle is therefore to make the entire camera-alignment process transparent to the neural network and the loss functions. To this end, we make two major changes.

First: we avoid customizing the BatchNorm layer. Otherwise, the forward pass would require an additional input for the camera IDs, and since the nn.Sequential module is widely used in PyTorch, a customized BatchNorm layer would force massive changes to network definitions. Instead, we use the official BatchNorm layer. During training, we can simply use the official BatchNorm implementation and feed the network with images from the same camera. At this stage, the collected running_mean and running_var are ignored, since they will always be overridden in the testing stage; the BN parameter momentum can therefore be set to any value. For testing, we change the default definition of BatchNorm layers from:

nn.BatchNorm2d(planes, momentum=0.1)

to:

nn.BatchNorm2d(planes, momentum=None)

Note:

In PyTorch, momentum=None is not equivalent to momentum=0.0: it computes a cumulative moving average rather than an exponential one. See https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html for details.
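
A quick standalone check of this behavior (not part of this repo): with momentum=None, the running statistics converge to the plain arithmetic mean over all forwarded batches.

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(1, momentum=None)  # cumulative moving average
bn.train()
for value in (0.0, 1.0, 2.0):
    # each batch is constant, so its per-channel batch mean is exactly `value`
    bn(torch.full((4, 1, 8, 8), value))
print(bn.running_mean)  # tensor([1.]) -- the arithmetic mean of 0.0, 1.0, 2.0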

Then, given several mini-batches from a specific camera, we simply set the network to train mode and forward all of them. After forwarding all these batches, the running_mean and running_var in each BatchNorm layer are the statistics of this exact camera. We then set the network to eval mode and process images from this specific camera.
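
A minimal sketch of this procedure (not the repo's exact code; camera_loader, a DataLoader that yields images from a single camera, is a hypothetical placeholder):

import torch
import torch.nn as nn

def adapt_bn_to_camera(model, camera_loader):
    # reset BN statistics so nothing leaks in from other cameras
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
    model.train()  # in train mode, BN accumulates running statistics
    with torch.no_grad():
        for images in camera_loader:
            model(images)  # with momentum=None, running_mean/var become this camera's average
    model.eval()  # BN now normalizes with this camera's statistics

After adapt_bn_to_camera returns, features for this camera are extracted with the model left in eval mode, as usual.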

Second: during training, we need to re-organize mini-batches, as sketched below. Given a tensor produced by an arbitrary sampler, we split it by the corresponding camera IDs and re-organize it as a list of tensors. This is achieved by our customized Trainer. Our DataParallel then forwards these tensors one by one, assembles all outputs, and feeds them to the loss function in the same way as the conventional DataParallel.
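
A minimal sketch of the splitting step (assuming each sampled batch comes with a camids tensor; this is not the Trainer's exact code):

import torch

def split_by_camera(images, camids):
    # split one sampled batch into a list of per-camera sub-batches
    return [images[camids == cam] for cam in torch.unique(camids)]

images = torch.randn(8, 3, 256, 128)
camids = torch.tensor([0, 0, 1, 1, 1, 2, 2, 0])
sub_batches = split_by_camera(images, camids)
print([x.shape[0] for x in sub_batches])  # [3, 3, 2]

The network then forwards each sub-batch separately, and the outputs are concatenated before being passed to the loss function.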

Preparation

1. Download Market-1501, DukeMTMC-reID, and MSMT17 and organize them as follows:

.
+-- data
|   +-- market
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- duke
|       +-- bounding_box_train
|       +-- query
|       +-- bounding_box_test
|   +-- msmt17
|       +-- train
|       +-- test
|       +-- list_train.txt
|       +-- list_val.txt
|       +-- list_query.txt
|       +-- list_gallery.txt
+-- other files in this repo

Note: For MSMT17, we highly recommend the V1 version. Our experiments show that the noise introduced in the V2 version hurts the performance of both the fully supervised learning and direct transfer tasks.

2. Install the required packages

pip install -r requirements.txt

Note: Our code is only tested with Python 3.

3. Put the official PyTorch ResNet-50 pretrained model into your home folder: '~/.torch/models/'
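
For example, the checkpoint can be fetched with torch's model zoo (the URL below is torchvision's standard ResNet-50 release; adjust the target filename if your setup expects a different one):

import os
from torch.utils import model_zoo

model_dir = os.path.expanduser('~/.torch/models')
os.makedirs(model_dir, exist_ok=True)
# downloads resnet50-19c8e357.pth into ~/.torch/models/ unless it is already there
model_zoo.load_url('https://download.pytorch.org/models/resnet50-19c8e357.pth',
                   model_dir=model_dir)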

Usage

1. Train a ReID model

Reproduce the results in our paper

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python train_model.py train --trainset_name market --save_dir='market_demo'

Note that our training code also supports an arbitrary number of GPUs.

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0,1,2,3 \
python train_model.py train --trainset_name market --save_dir='market_demo'

However, since the current implementation is immature, the speedup ratio is not ideal. Any advice on parallel acceleration is welcome.

2. Evaluate a trained model

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 \
python test_model.py test --testset_name market --save_dir='market_demo'

To reproduce our reported performance, each experiment should be conducted 10 times.

Trained Models

You can download our trained models via Google Drive.

Cite our paper

If you use our code in your paper, please use the following BibTeX entry.

@inproceedings{zhuang2020rethinking,
  title={Rethinking the Distribution Gap of Person Re-identification with Camera-Based Batch Normalization},
  author={Zhuang, Zijie and Wei, Longhui and Xie, Lingxi and Zhang, Tianyu and Zhang, Hengheng and Wu, Haozhe and Ai, Haizhou and Tian, Qi},
  booktitle={European Conference on Computer Vision},
  pages={140--157},
  year={2020},
  organization={Springer}
}


camera-based-person-reid's Issues

Why is momentum set to None?

Hi,

First and foremost, thanks for your code!

As shown in your code, you set the momentum of BN to None. In the testing stage, this means that:

running_mean = mean of the last mini-batch
running_var = var of the last mini-batch

So I wonder why the number of mini-batches influences your results.

I think this is a matter of chance: if you happen to choose the best mini-batch, you will get the best results, even if you only use one mini-batch to calculate the running mean and var of the test camera.

Testing without updating BN

CBN is a great idea: it achieves remarkable performance gains while updating only a small number of parameters. To reproduce the baseline in your paper, I added a normal testing function after line 49 of your test_model.py, before the BN statistics are updated for each camera. However, the performance I get this way is more than 10 points below your baseline. May I ask whether something was missed when (or after) copying the model's state dict, or whether the saved state dict itself is problematic?
The code is as follows:

def _normal_testing(data):
    data_loader = DataLoader(
        data_manager.init_datafolder(opt.testset_name, data, TestTransform(opt.height, opt.width)),
        batch_size=opt.test_batch, num_workers=opt.workers,
        pin_memory=pin_memory, shuffle=False
    )
    fs, pids, camids = reid_evaluator.produce_features(data_loader, normalize=True)
    return fs, pids, camids

print('Processing query features...')
qf, q_pids, q_camids = _normal_testing(dataset.query)
print(qf.shape)
print('Processing gallery features...')
gf, g_pids, g_camids = _normal_testing(dataset.gallery)
print(gf.shape)
print('Computing CMC and mAP...')
reid_evaluator.get_final_results_with_features(qf, q_pids, q_camids, gf, g_pids, g_camids)

Using CBN during the training stage

Hello! I read your paper and think it is very nice work. In the ablation study, Table 6 reports the effect of using CBN during the training stage, which confuses me a little. How exactly is CBN used during training? Do you sample the mean and variance of each camera's images before each epoch?

On the design of the training loss

This is excellent work, achieving a large performance gain by aligning the statistics of different cameras. However, I noticed that your loss function is CrossEntropyLoss and your sampler is IdentitySampler, so why don't you add a triplet loss? Does adding a triplet loss degrade your model's performance? I trained a fastreid model and applied your CBN at test time, and the mAP dropped to single digits, for both the cross-domain and the supervised settings. Does this indicate that CBN depends on a fixed training strategy?

About CamDataParallel

The key part of the code should be CamDataParallel, right? If I want to use DistributedDataParallel instead, how should I modify it?

How are the BatchNorm layers updated during the training process?

Hello, I greatly appreciate the work; it is pretty resourceful.
I am trying to understand the training process, and my question is: how are the running means and variances of the BatchNorm layers affected over time during training if I have data from multiple cameras?

Say I am training on a single GPU: there would be only one replica created in your custom DataParallel module, and all the sub-batches belonging to different cameras would be sent through the same replica or model. Wouldn't this cause the running means and variances of the BatchNorm layers to be updated using data from all the cameras? I understand that we eventually drop these values during the testing phase, but this would make the BatchNorm updates uneven during the training phase, and the network wouldn't learn efficiently. Moreover, wouldn't it defeat the purpose of isolating the BatchNorm statistics per camera? Please share your thoughts on this. Thanks in advance.

On the implementation of CBN

Hello, thank you very much for your work, which I greatly admire! I see that in trainer.py, the argument passed into self._forward() is a list whose elements are the images of the different cameras in the current batch. During the forward pass, does the code iterate over this list, compute the mean and standard deviation of each element (camera) separately, and then perform per-camera normalization? Thanks.

Does a larger model make the loss hard to converge?

Hello, I reproduced the results of the paper on the Market and Duke datasets. I then tried to use this method in my own larger network, but the cross-entropy loss (xent) became difficult to converge. Have you tried other baselines, for example the strong baseline? Does a larger model make it difficult for the loss to converge?
