zzzxxxttt / pytorch_simple_centernet_45

A simple pytorch implementation of CenterNet (Objects as Points)

Python 43.01% Shell 0.09% C++ 7.25% Cuda 22.14% C 13.11% Lua 3.79% MATLAB 10.57% Makefile 0.04%

deep-learning object-detection pytorch

pytorch_simple_centernet_45's Issues

COCO for 140 epochs??

May I ask how many GPUs you trained with? How long does training 140 epochs on COCO take? It must take a very long time.

What is the meaning of "self.max_objs = 128"

Hi guys,
I am trying to train this repo on my custom dataset,
and I want to know what self.max_objs = 128 means. Does it mean that there can be no more than 128 objects in any image?

Your answer or idea will be appreciated!
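
For readers with the same question: in CenterNet-style data loaders, a constant like this usually sizes the fixed-length per-image target arrays so that samples with different object counts can be stacked into one batch. The sketch below illustrates that idea under that assumption. `build_targets` and the array names are hypothetical and are not taken from this repo's code, and dropping objects beyond the cap is just one possible policy.

```python
import numpy as np

MAX_OBJS = 128  # assumed cap on annotated objects per image


def build_targets(boxes, max_objs=MAX_OBJS):
    """Pack a variable number of boxes into fixed-size arrays.

    boxes: array-like of shape (N, 4) as [x1, y1, x2, y2].
    Returns (w_h, mask), where mask marks which rows hold real objects.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    w_h = np.zeros((max_objs, 2), dtype=np.float32)
    mask = np.zeros((max_objs,), dtype=np.uint8)
    n = min(len(boxes), max_objs)  # extra objects are truncated in this sketch
    w_h[:n, 0] = boxes[:n, 2] - boxes[:n, 0]  # widths
    w_h[:n, 1] = boxes[:n, 3] - boxes[:n, 1]  # heights
    mask[:n] = 1
    return w_h, mask


w_h, mask = build_targets([[0, 0, 10, 20], [5, 5, 8, 9]])
print(w_h.shape, int(mask.sum()))
```

If a dataset really contains images with more objects than the cap, raising the constant would be the natural adjustment under this reading.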

Train centernet with custom dataset

Hi, could you please add the steps required to train this network on a new dataset with new classes?

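Error when loading the pretrained model

Loading the offline weights fails. ResNet and Hourglass behave the same way: every parameter is reported as "No Param". The file path configuration is fine.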

There is no nms folder in the repo

Hi, guys,
I am continuing to study this repo today,
and I find there is no nms folder in this repo, although it is imported in test.py.

So, how can I fix this?

Your answer and idea will be appreciated!

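Error when training CenterNet on my own data

When I train CenterNet on my own data, a single object gets several overlapping predicted boxes. After adding NMS, there are still cases where one object has two detection boxes.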
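
To make clear what NMS does and does not remove, here is a generic greedy IoU-based NMS in NumPy. It is an illustration only, not the NMS module imported by this repo's test.py; the function name `nms` and the 0.5 threshold are assumptions. Two boxes on the same object survive whenever their IoU is below the threshold, which is one possible explanation for the remaining duplicates.

```python
import numpy as np


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS. boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,)."""
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    order = scores.argsort()[::-1]  # highest score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the top box with every remaining box
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]  # drop boxes that overlap too much
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_thresh=0.5))
print(nms(boxes, scores, iou_thresh=0.7))
```

In this toy input the first two boxes have an IoU of about 0.68, so the stricter call suppresses one of them while the looser call keeps both. Lowering the threshold, or raising the score threshold before NMS, are the usual knobs to try.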

cuda error

When I run it in a Docker container, it shows:

Epoch: 1 [2020-09-02 08:56:59,475]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
Traceback (most recent call last):
  File "train.py", line 237, in <module>
    main()
  File "train.py", line 226, in main
    train(epoch)
  File "train.py", line 143, in train
    outputs = model(batch['image'])
  File "/opt/conda/envs/centernet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/centernet/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/envs/centernet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dyl/Centernet/nets/resdcn.py", line 220, in forward
    x = self.conv1(x)
  File "/opt/conda/envs/centernet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/centernet/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:405

Can you share your pre-trained model weights?

Training a model on big datasets takes a lot of time. Could you share your pre-trained model weights for COCO or Pascal VOC?

Result Extraction

Hi @zzzxxxttt,

Thank you for your amazing work. I have completed my training and testing. However, I require two more results for documentation purposes.

Precision-Recall curve: May I know which Python file the repo uses? I found two copies of cocoeval.py after compiling PythonAPI, under the two paths stated below. I have tried modifying both, but neither change takes effect (it looks like test.py uses neither). Can you let me know how I can extract the 101 data points for the PR curve?
- \lib\cocoapi\PythonAPI\build\lib.linux-x86_64-3.6\pycocotools, and
- \lib\cocoapi\PythonAPI\pycocotools

[email protected] by classes: I can't find the line of code that gives me this. Can you let me know how? I already have the overall COCO results; I just need the results per class.
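
One way to get both results without editing cocoeval.py is to post-process the precision array that the evaluator accumulates. The sketch below works on a plain NumPy array and assumes it has the layout [T, R, K, A, M] (IoU thresholds, recall thresholds, classes, area ranges, max detections), with -1 marking missing entries. The helper name `pr_curve_and_class_ap` and its default indices are hypothetical.

```python
import numpy as np


def pr_curve_and_class_ap(precision, iou_index=0, area_index=0, max_det_index=-1):
    """Slice a [T, R, K, A, M] precision array.

    Returns (curve, class_ap): curve is the precision averaged over classes
    at each of the R recall thresholds; class_ap has one AP value per class.
    Entries equal to -1 are treated as missing and stay -1 in the output.
    """
    p = np.asarray(precision, dtype=np.float64)
    p = p[iou_index, :, :, area_index, max_det_index]  # shape [R, K]
    valid = p > -1
    masked = np.where(valid, p, 0.0)

    curve = np.full(p.shape[0], -1.0)
    np.divide(masked.sum(axis=1), valid.sum(axis=1), out=curve,
              where=valid.sum(axis=1) > 0)

    class_ap = np.full(p.shape[1], -1.0)
    np.divide(masked.sum(axis=0), valid.sum(axis=0), out=class_ap,
              where=valid.sum(axis=0) > 0)
    return curve, class_ap


# Synthetic array standing in for real evaluation output
demo = np.full((10, 101, 3, 4, 3), -1.0)
demo[0, :, 0, 0, -1] = 0.5
demo[0, :, 1, 0, -1] = 1.0
curve, class_ap = pr_curve_and_class_ap(demo)
print(curve.shape, class_ap)
```

With pycocotools, the input would be `coco_eval.eval['precision']` after calling `accumulate()`, and the matching recall values are in `coco_eval.params.recThrs`.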

Hope to hear from you soon. Thank you.

Cheers,
JiaLim98

It seems the affine transformation in coco.py forces the image to a size of 512x512

Hi guys,
After reading the code of coco.py, I find that the affine transformation there seems to force the image to a size of 512x512, as

    trans_img = get_affine_transform(center, scale, 0, [self.img_size['w'], self.img_size['h']])
    img = cv2.warpAffine(img, trans_img, (self.img_size['w'], self.img_size['h']))

I am not sure whether this is a general trick to resize the images of different sizes.

Any idea or answer will be appreciated!
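
A simplified view of what such a transform does may help. The sketch below builds a rotation-free 2x3 affine matrix that maps a square region of side `scale` around `center` onto a fixed output canvas. It assumes uniform scaling and `scale` equal to the longer image side; `simple_affine` and `apply` are hypothetical helpers, not the repo's `get_affine_transform`.

```python
import numpy as np


def simple_affine(center, scale, out_w, out_h):
    """2x3 matrix mapping a square region of side `scale` around `center`
    onto an out_w x out_h canvas, with uniform scaling and no rotation."""
    s = out_w / float(scale)  # same factor for x and y: aspect ratio is kept
    tx = out_w / 2.0 - s * center[0]
    ty = out_h / 2.0 - s * center[1]
    return np.array([[s, 0.0, tx], [0.0, s, ty]], dtype=np.float32)


def apply(matrix, point):
    """Transform one (x, y) point with a 2x3 affine matrix."""
    return matrix @ np.array([point[0], point[1], 1.0], dtype=np.float32)


# A 640x480 image, with scale taken as the longer side (an assumption)
m = simple_affine(center=(320, 240), scale=640, out_w=512, out_h=512)
print(apply(m, (320, 240)))  # where the image center lands
print(apply(m, (0, 0)))      # where the top-left corner lands
```

Under these assumptions the image is scaled by one factor and centered on the canvas, so a non-square image is padded rather than stretched. Fixing the output size this way is a common way to batch images of different sizes.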

Why can't I reach the reported performance?

I use the COCO dataset with resdcn_18 at 512, and I followed the instructions: "python train.py --log_name coco_resdcn18_512 --dataset coco --arch resdcn_18 --lr 5e-4 --lr_step 90,120 --batch_size 24 --num_epochs 100 --num_workers 8 "
After 100 epochs, I get very poor performance. Am I missing some important step? Thanks, everyone.

Hello everyone, I am a beginner. I used the author's preset command, as above. After 100 epochs the IoU is very low. Did I get some important step wrong? Thanks for any advice.

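Running demo.py fails for me, although training and testing both run fine

During validation, all the output results are -1

Hello, may I ask why all the output results are -1 during validation, as shown below: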

RuntimeError: The size of tensor a (152) must match the size of tensor b (150) at non-singleton dimension 3

I trained on my custom dataset using this PyTorch CenterNet, but the following error was raised:

RuntimeError: The size of tensor a (152) must match the size of tensor b (150) at non-singleton dimension 3

Traceback (most recent call last):
  File "train.py", line 238, in <module>
    main()
  File "train.py", line 227, in main
    train(epoch)
  File "train.py", line 149, in train
    hmap_loss = _neg_loss(hmap, batch['hmap'])
  File "pytorch_simple_CenterNet_45/utils/losses.py", line 47, in _neg_loss
    pos_loss = torch.log(pred) * torch.pow(1 - pred, 2) * pos_inds
RuntimeError: The size of tensor a (152) must match the size of tensor b (150) at non-singleton dimension 3

Could you help me to fix it? Thank you.
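
One possible cause, offered as an assumption since the traceback does not state the input size: the image side is not a multiple of 32. The sketch below models a network with five stride-2 stages followed by three 2x upsampling layers (an assumed ResNet-style layout) and compares its output size with a target heatmap built as input size divided by 4. Both function names are hypothetical.

```python
import math


def predicted_map_size(input_size, num_down=5, num_up=3):
    """Output size of a network with `num_down` stride-2 stages followed by
    `num_up` 2x upsampling layers (an assumed ResNet-style layout)."""
    size = input_size
    for _ in range(num_down):
        size = math.ceil(size / 2)  # padded stride-2 conv/pool rounds up
    return size * 2 ** num_up


def round_up(size, multiple=32):
    """Smallest multiple of `multiple` that is >= size."""
    return ((size + multiple - 1) // multiple) * multiple


for s in (600, 608):
    print(s, "target:", s // 4, "prediction:", predicted_map_size(s))
print("suggested size for 600:", round_up(600))
```

In this model a 600-pixel side gives a 150-wide target but a 152-wide prediction, which matches the numbers in the error message. Choosing an image size that is a multiple of 32 would make the two agree.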

Illegal Instruction

Hi @zzzxxxttt,

I have already got this repo to work on Google Colab. However, when I try to install it on another Linux computer, it shows this when I try to compile DCNv2.

Do you have any idea why? Hope to hear from you soon.

Best regards,
JiaLim98

How can we train on our own custom dataset?

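Does this project use DCN?

Dear developer, hello,
I would like to ask whether this CenterNet uses DCN.
Also, regarding the compilation of DCN,

should the command on the first line be changed to
cd $CenterNet_ROOT/lib/DCNv2_new

Looking forward to your reply!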

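Why can't I get the expected results when training with this code?

I trained on the Pascal VOC dataset using the command-line interface given in README.md. The final mAP on validation is only 10%. The w_h_ loss drops to around 10 and then stops decreasing, whereas the w_h_ loss of the already trained model parameters that README.md provides for the same setup is around 2. Why won't my loss go down?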

python train.py --log_name pascal_resdcn18_384_dp \
                --dataset pascal \
                --arch resdcn_18 \
                --img_size 384 \
                --lr 1.25e-4 \
                --lr_step 45,60 \
                --batch_size 32 \
                --num_epochs 70 \
                --num_workers 10

Hello, could you share the trained hourglass-104 model with 39.9 mAP?

Could you provide a demo? I spent a long time writing my own, but the results are wrong.

RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCBlas.cu:425

How can I solve this error? Please help!

FPS

How is FPS computed? Is it shown anywhere in the code?
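
A generic way to measure FPS is to time repeated calls of the inference step after a few warm-up runs. The sketch below is a minimal illustration of that approach, not code from this repo; `measure_fps` and the `run_once` callable are hypothetical, and the lambda only stands in for a real per-image inference call.

```python
import time


def measure_fps(run_once, num_iters=50, warmup=5):
    """Average calls per second of `run_once` (a hypothetical callable that
    processes one image end to end)."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(num_iters):
        run_once()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed


fps = measure_fps(lambda: sum(range(10000)))
print(f"{fps:.1f} FPS")
```

For a model running on a GPU, `torch.cuda.synchronize()` should be called before reading the timer, because CUDA kernels run asynchronously and the measured time would otherwise be too short.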

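Replacing the backbone

Hi, I would like to ask: if I want to modify or replace the backbone, how should I go about it? I mean, are there any format requirements, such as what the model's return value must contain?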

AttributeError: 'list' object has no attribute 'to'

Hi @zzzxxxttt,

Thank you for your interesting work.

I followed everything about setting up this repo. However, when I start to train, it gives me this error. Do you know what's wrong?

Hope to hear from you soon.

Regards,
JiaLim98

Why can't I achieve the given mAP after retraining by following the steps?

Are there any training details I need to pay attention to? I reach 70% mAP using flip test. I'm looking forward to your reply.

No such file or directory: './data/voc/VOCdevkit/VOC2007/annotations_cache'

Hello, how can I resolve this error: No such file or directory: './data/voc/VOCdevkit/VOC2007/annotations_cache'

Hi, why do you set torch.backends.cudnn.enabled to False?

Disable cudnn batch normalization. Open torch/nn/functional.py and find the line with torch.batch_norm and replace the torch.backends.cudnn.enabled with False.

Gaussian radius implementation is wrong

The official CenterNet repo is based on the CornerNet repo, and the code for the Gaussian radius is wrong. Please refer to these issues:
princeton-vl/CornerNet#110
Duankaiwen/CenterNet#47
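
For context, the Gaussian radius is the largest corner shift that still keeps the IoU with the ground-truth box above a threshold, which leads to one quadratic inequality per case. The sketch below solves the three cases with the standard quadratic formula (dividing by 2a and picking the root that satisfies the inequality). It is derived here from the IoU constraints as an illustration and is not copied from this repo or the linked issues, which remain the reference for the actual discussion.

```python
import math


def gaussian_radius(height, width, min_overlap=0.7):
    """Largest corner shift r that keeps IoU >= min_overlap in three cases,
    using the standard quadratic formula (divide by 2a, pick the valid root)."""
    # Case 1: one corner moves inward, the other outward (box is shifted)
    a1, b1 = 1.0, height + width
    c1 = width * height * (1 - min_overlap) / (1 + min_overlap)
    r1 = (b1 - math.sqrt(b1 ** 2 - 4 * a1 * c1)) / (2 * a1)
    # Case 2: both corners move inward (predicted box shrinks)
    a2, b2 = 4.0, 2 * (height + width)
    c2 = (1 - min_overlap) * width * height
    r2 = (b2 - math.sqrt(b2 ** 2 - 4 * a2 * c2)) / (2 * a2)
    # Case 3: both corners move outward (predicted box grows)
    a3, b3 = 4 * min_overlap, -2 * min_overlap * (height + width)
    c3 = (min_overlap - 1) * width * height
    r3 = (b3 + math.sqrt(b3 ** 2 - 4 * a3 * c3)) / (2 * a3)
    return min(r1, r2, r3)


r = gaussian_radius(10, 10, min_overlap=0.7)
shrunk_iou = (10 - 2 * r) ** 2 / (10 * 10)  # IoU when both corners move in by r
print(round(r, 4), round(shrunk_iou, 4))
```

For a 10x10 box the shrinking case is the binding one, and the IoU at the returned radius equals the threshold, which is a quick sanity check on the root selection.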

IndexError: index 14 is out of bounds for axis 0 with size 2

Epoch: 1 [2020-08-31 11:13:22,017]
Traceback (most recent call last):
  File "train.py", line 241, in <module>
    main()
  File "train.py", line 229, in main
    train(epoch)
  File "train.py", line 139, in train
    for batch_idx, batch in enumerate(train_loader):
IndexError: index 14 is out of bounds for axis 0 with size 2

This dimension problem occurs when training on the pascal_voc dataset.
