dk-liang / transcrowd

[SCIS] TransCrowd: Weakly-Supervised Crowd Counting with Transformers

License: MIT License

Python 100.00%

transcrowd's Introduction

Website: http://dk-liang.github.io/

Google Scholar: https://scholar.google.com/dk-liang

transcrowd's People

harrylan cv-ip mcasterline-nvidia feixuedudiao wz940216 tracyummy summerlovestudy zykooooo syedayazsa zengjunxia winner-4 tianyouzisyd h-hui2277 cxmmaycxm blre6 rainbow2811 20224073gist

transcrowd's Issues

How can we generate the attention weight map?

Small mistake in an example call

Really quick to fix:

In one of your training calls, there is a "--" missing before "batch_size":
"python train.py --dataset ShanghaiA --save_path ./save_file/ShanghaiA batch_size 24 --model_type 'gap'"
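A minimal sketch of why the missing "--" matters, assuming train.py reads its flags with Python's argparse. The parser below is a hypothetical stand-in that declares only the flags seen in the call above, and its default batch size is an assumption. Without the dashes, "batch_size" and "24" are no longer an option and its value; they are left over as unrecognized words, and the batch size silently stays at its default.

```python
import argparse


def build_parser():
    # Hypothetical stand-in: only the flags that appear in the quoted call.
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", type=str)
    parser.add_argument("--save_path", type=str)
    parser.add_argument("--batch_size", type=int, default=8)  # assumed default
    parser.add_argument("--model_type", type=str)
    return parser


broken = "--dataset ShanghaiA --save_path ./save_file/ShanghaiA batch_size 24 --model_type gap".split()
fixed = "--dataset ShanghaiA --save_path ./save_file/ShanghaiA --batch_size 24 --model_type gap".split()

parser = build_parser()
# parse_known_args returns the leftover words instead of exiting with an error.
broken_args, broken_leftovers = parser.parse_known_args(broken)
fixed_args, fixed_leftovers = parser.parse_known_args(fixed)

print(broken_args.batch_size, broken_leftovers)
print(fixed_args.batch_size, fixed_leftovers)
```

With plain parse_args, the broken call would stop with an "unrecognized arguments" error instead of returning leftovers.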

Does the QNRF training data contain fully supervised information?

TransCrowd/data/predataset_qnrf.py

Line 96 in 1615f99

result = np.hstack((density_map, crop_img))

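Hello, your work achieves excellent results, and I would like to ask a question. In this QNRF preprocessing code, the density map (the supervision signal) and the image are combined with hstack and then saved as training data. Doesn't that amount to adding fully supervised label information to the training image data?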
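For readers unfamiliar with the call, here is a small sketch of what np.hstack does to two arrays of equal height. The shapes and values are made up, and both arrays are assumed to be 2-D; this is not the repository's preprocessing. The two parts sit side by side and stay separable by column index, so whether the density columns are ever used as supervision depends on how the training loader reads the saved file, which the quoted line does not show.

```python
import numpy as np

# Made-up shapes for illustration; both arrays are assumed 2-D with equal height.
height, width = 4, 6
density_map = np.zeros((height, width), dtype=np.float32)
density_map[1, 2] = 1.0  # one annotated head
crop_img = np.full((height, width), 128, dtype=np.float32)  # stand-in grayscale crop

# Same call as in the quoted line: the arrays are joined along the column axis.
result = np.hstack((density_map, crop_img))

# The two halves can be recovered by slicing the columns.
recovered_density = result[:, :width]
recovered_img = result[:, width:]
count = float(recovered_density.sum())

print(result.shape, count)
```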

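How were the results on ShanghaiTech Part A obtained without pre-training?

Does the model need to be trained for the full 20,000 epochs? My reproduced results stall at an MAE of around 200.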

A small question

Hello, I used your trained model for testing. Running test.py raises an UnboundLocalError. How can I fix this? Thanks!

Request to release other pretrained models

Hi, thanks for your novel work. Could you release the pre-trained models and the data preparation for the other datasets, so that I can compare against your work? Thanks again.

AssertionError: Input image size (768*1152) doesn't match model (384*384)

Input image size ({H}*{W}) doesn't match model ({self.img_size[0]}*{self.img_size[1]})

I want to reproduce the code but got the error above, and I do not know the cause.

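Hello, I recently tried to reproduce your code. Following the instructions in the README, I hit an error in the training stage:
RuntimeError: stack expects each tensor to be equal size, but got [3, 384, 384] at entry 0 and [3, 768, 1152] at entry 1
I tried to fix it with transformer.Resize(384,384), without success, so I am asking here. I hope you can help.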
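Both error messages point at the same mismatch: the model expects 384×384 inputs, while at least one sample is still 768×1152. One way to bring such an image to the expected size, sketched under the assumption that full-size images are split into fixed-size crops before training, is to tile it into non-overlapping 384×384 boxes. The helper tile_boxes is hypothetical and not part of the repository.

```python
def tile_boxes(height, width, patch=384):
    """Return (top, left, bottom, right) boxes of non-overlapping patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return [
        (top, left, top + patch, left + patch)
        for top in range(0, height, patch)
        for left in range(0, width, patch)
    ]


# The size reported in the error message: 768 x 1152 gives a 2 x 3 grid of crops.
boxes = tile_boxes(768, 1152)
for box in boxes:
    print(box)
```

Every crop then has the same shape, so a batch of crops can be stacked without the size error.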

About NNI hyperparameter tuning

tuner_params = nni.get_next_parameter() returns an empty tuner_params here, so it seems NNI is not actually tuning any parameters. Why is this call needed?
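A common pattern, and only an assumption about how the training script uses the call, is to overlay whatever the tuner returns on top of fixed defaults. Under that pattern an empty dict simply leaves the defaults in place, which would match the behavior described in the issue when no tuning experiment is running. The helper merge_params and the default values are illustrative, and the sketch deliberately avoids importing nni.

```python
def merge_params(defaults, tuner_params):
    """Overlay tuner-provided values on the defaults; unknown keys are rejected."""
    unknown = set(tuner_params) - set(defaults)
    if unknown:
        raise KeyError(f"unexpected hyperparameters: {sorted(unknown)}")
    merged = dict(defaults)
    merged.update(tuner_params)
    return merged


defaults = {"lr": 1e-5, "batch_size": 8}       # illustrative values only
standalone = merge_params(defaults, {})        # empty dict, as observed in the issue
tuned = merge_params(defaults, {"lr": 3e-5})   # what a tuner could supply

print(standalone)
print(tuned)
```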

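How do I generate the npy files for the NWPU dataset?

prepare_nwpu.py is not provided, which does not match the description.

Hello, I loaded the npy files and found that the images are incomplete. What is going on? Thanks.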

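Question about parameter settings

Hello, I am reproducing your code and have two questions:

1. What is the batch_size? The paper does not mention it.
2. The paper says the regression token has two linear layers. Is the weight dimension of the first linear layer the representation_size from the ViT code? What is its value? Also, is the activation function between the two linear layers 'tanh' or 'gelu'? Do the parameters of the pretrained model's head need to be loaded?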

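Parameter settings

Has the author tried adjusting some of the hyperparameters, for example using the AdamW optimizer? Several tricks for optimizing ViT have also appeared recently; could they improve your method?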

Training token model.... where is the regression token?

I am playing around with this model to see exactly how it works at the code level. As far as I know, the 'token' model adds a regression token to the input sequence Z0 for counting, which gives an input of length HW/K² + 1 (K being the patch size and H, W the image dimensions). However, I cannot find the explicit difference between the inputs of the 'token' and 'gap' regression heads in the code.

Could you explain in more detail how and where this "regression token" is created, and what exactly it is? The paper does not give enough information about it.
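The sequence length in the question can be checked with a few lines of arithmetic. This sketch assumes K is the side length of a square patch, takes 384×384 as the input size (the size named in the error messages elsewhere on this page) and uses K = 16 purely as an illustrative value. It also assumes that the 'gap' variant pools over the patch tokens without an extra token; that reading is an assumption, not something confirmed by the code.

```python
def sequence_length(height, width, patch_size, use_regression_token):
    """Number of tokens the transformer sees for one image."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    num_patches = (height // patch_size) * (width // patch_size)  # HW / K^2
    # The 'token' variant is assumed to prepend one extra learnable token.
    return num_patches + 1 if use_regression_token else num_patches


token_len = sequence_length(384, 384, 16, use_regression_token=True)
gap_len = sequence_length(384, 384, 16, use_regression_token=False)

print(token_len, gap_len)
```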

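Attention weight visualization

Could the author provide code for visualizing the attention weights, to make it easier to analyze what the network attends to in follow-up work?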

The performance cannot be reproduced

The performance cannot be reproduced. The experiment was conducted on ShanghaiTech A: the MAE of the method using the token is 70.4, and the method using the gap only reaches 74. All experiments were performed in accordance with the stated experimental hyperparameters.
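For reference when comparing numbers, MAE in crowd counting is usually the mean absolute difference between predicted and ground-truth counts over the test images. A minimal sketch follows; the counts are made up and are not results from this repository.

```python
def mean_absolute_error(pred_counts, gt_counts):
    """Mean absolute difference between predicted and ground-truth counts."""
    if len(pred_counts) != len(gt_counts) or not gt_counts:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - g) for p, g in zip(pred_counts, gt_counts)) / len(gt_counts)


pred_counts = [102.0, 250.0, 48.0]  # made-up predictions, one per test image
gt_counts = [100.0, 260.0, 50.0]    # made-up ground-truth counts
mae_value = mean_absolute_error(pred_counts, gt_counts)

print(round(mae_value, 3))
```

If the test images are evaluated as crops, the crop predictions would need to be summed per image before this comparison; whether the evaluation here does that is not stated in the issue.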

We should not crop the images in weakly-supervised counting.

In weakly-supervised counting, only the global crowd numbers are available. If you crop one image into several patches, there will be more supervision information, and the comparison may be unfair because the number of patches differs between methods.
I wonder what the performance is with only the global ground-truth number of each image.
Thanks!
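The concern can be made concrete with a small sketch. It assumes that per-crop counts are derived from point annotations during preprocessing, which is the scenario the post describes rather than a confirmed detail of the repository; the point coordinates and the helper crop_counts are hypothetical. One 768×1152 image then yields six count labels instead of one global count.

```python
def crop_counts(points, height, width, patch=384):
    """Count annotated points that fall inside each non-overlapping patch x patch crop."""
    rows, cols = height // patch, width // patch
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore points outside the area covered by full crops.
        if 0 <= x < cols * patch and 0 <= y < rows * patch:
            counts[int(y // patch)][int(x // patch)] += 1
    return counts


# Hypothetical head positions as (x, y) pixel coordinates.
points = [(10, 10), (400, 20), (1100, 700), (1000, 760)]
per_crop = crop_counts(points, 768, 1152)
global_count = sum(sum(row) for row in per_crop)
num_labels = sum(len(row) for row in per_crop)

print(per_crop, global_count, num_labels)
```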

pretrained model

I am very interested in your proposal and would love to use it.
However, since I cannot create a Baidu account, could you distribute the trained model through another cloud service?

Thank you for your work

UnboundLocalError: local variable 'img_return' referenced before assignment

Thank you for your work

Now, when I run test.py and train.py, I get the error "UnboundLocalError: local variable 'img_return' referenced before assignment".

Can you solve this problem?
I'm running it with the commands below.

python test.py --dataset ShanghaiA --pre ./Networks/model_best_A_gap.pth --model_type gap

python train.py --dataset ShanghaiA --save_path ./save_file/ShanghaiA --batch_size 1 --model_type gap
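The cause inside the repository is not shown here, but in general this error means that some code path reached a use of img_return without any assignment having run first, for example because no branch of a conditional matched. A minimal sketch of that mechanism, with a hypothetical loader that is not the repository's code:

```python
def load_image(dataset):
    # Hypothetical loader: img_return is assigned only for the names it knows.
    if dataset == "A":
        img_return = "image for A"
    elif dataset == "B":
        img_return = "image for B"
    return img_return  # fails when neither branch ran


def try_load(dataset):
    """Return the loaded value, or the name of the exception that was raised."""
    try:
        return load_image(dataset)
    except UnboundLocalError as exc:
        return f"failed: {type(exc).__name__}"


ok = try_load("A")
failed = try_load("ShanghaiA")  # no branch matches this name

print(ok)
print(failed)
```

Checking which condition is expected to assign the variable, and why it is skipped for the given arguments or data, is usually the quickest way to locate such a bug.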