easton-cau / sotr
SOTR: Segmenting Objects with Transformers
License: MIT License
Hello, author! I used visualize_data.py to visualize the images, but the instance segmentation results include bounding boxes. However, there are no bounding boxes in the paper. Can you tell me why, and how to turn them off?
How do I run the demo? A lot of files are reported as not found.
Hello, I ran into several problems when running testing and training, and I hope you can help.
First, I ran:
python tools/train_net.py \
--config-file configs/SOTR/R101.yaml \
--eval-only \
--num-gpus 4 \
MODEL.WEIGHTS work_dir/SOTR_R101/SOTR_R101.pth
The result I got was:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 39.730 | 60.303 | 42.707 | 18.045 | 43.414 | 59.794 |
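This is somewhat lower than the numbers you reported, and I don't know what went wrong.
Also, when running the training code, line 52 of tools/train_net.py,
super(DefaultTrainer, self).__init__(model, data_loader, optimizer)
seems to be broken. After I changed it to
super(Trainer, self).__init__(cfg)
that problem went away, but then I get: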
FloatingPointError: Loss became infinite or NaN at iteration=2!
loss_dict = {'loss_ins': nan, 'loss_cate': nan}
My learning rate settings are:
SOLVER:
  IMS_PER_BATCH: 4
  BASE_LR: 0.00001
  WARMUP_FACTOR: 0.00001
How should I solve the NaN problem?
I look forward to your reply. Thank you.
I noticed that SOTR's APs is low: 10.7 versus 20 for Mask R-CNN. Is there a reason for this? I wonder whether boosting APs could push the overall AP even higher.
super(DefaultTrainer, self).__init__(model, data_loader, optimizer)
TypeError: __init__() takes 1 positional argument but 4 were given
Does anyone know how to solve this problem?
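Dear author, I would like to test your demo program on my own images, but I can never get the arguments right. I have already downloaded the .pth file. Could you provide an example of how to run demo.py?
Hi, while reproducing your work I could not find the SOTR-RT-736 model for testing images at high FPS. Could you help me with this?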
Hi, thanks for your great work!
I tested your pre-trained model (R-101 3x) on test-dev2017 in the coco evaluation server.
When I extract only the mask results and save them to JSON files, the segmentation scores match the results reported in the paper. However, when I save the mask results together with the box results (generated by lines 511~514 in sotr.py), the scores (AP_s, AP_m, AP_l) differ from the paper.
This is the result from the JSON file that contains only mask information:
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.402
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.612
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.434
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.102
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.590
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.731
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.328
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.512
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.536
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.301
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.590
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.733
This is the result from the JSON file that contains mask information together with box information:
overall performance
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.402
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.612
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.434
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.194
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.440
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.552
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.328
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.512
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.536
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.301
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.590
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.733
As can be seen, (AP_s, AP_m, AP_l) for the first and second results are (0.102, 0.590, 0.731) and (0.194, 0.440, 0.552), respectively.
I don't know the details of how the COCO evaluation server works, so I wonder why the segmentation AP differs depending on whether box information is present.
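One likely explanation, offered as a sketch and not as a confirmed description of the evaluation server: the COCO API's `loadRes` fills in each detection's `area` from the `bbox` field (width times height) when a box is present, and only uses the mask area when it is absent. The small/medium/large buckets are defined on `area`, so unmatched detections can land in different buckets. That would shift AP_s, AP_m and AP_l while leaving the overall AP and the AR values unchanged, which matches the two tables above. detectron2's COCO evaluator removes the `bbox` field before segmentation evaluation for the same reason. The helper name `strip_boxes_for_segm` and the sample entries are hypothetical.

```python
import copy


def strip_boxes_for_segm(results):
    """Return a copy of COCO-style result entries with the 'bbox' key removed.

    Without 'bbox', the COCO API derives each detection's area from its mask.
    """
    cleaned = copy.deepcopy(results)  # leave the caller's list untouched
    for det in cleaned:
        det.pop("bbox", None)
    return cleaned


# Hypothetical result entries; the RLE string is a placeholder, not real data.
results = [
    {
        "image_id": 1,
        "category_id": 1,
        "score": 0.9,
        "segmentation": {"size": [480, 640], "counts": "<RLE string>"},
        "bbox": [10.0, 20.0, 100.0, 50.0],
    }
]

cleaned = strip_boxes_for_segm(results)
```

Writing `cleaned` instead of `results` to the submission JSON should give the mask-area behaviour of the first table.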
Hello! I read your code recently and have some questions.
First, if xy_pos_emb_shaped=None, the returned tensor is the same as the input. But your paper says: "Position embeddings are added to the blocks to retain positional information, meaning that the position embedding spaces for the column and row are 1∗N∗C and N∗1∗C." How can I add the position information?
Second, the YAML files use a different training setting from other models, (1024, 2048). Is this an unfair setting?
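For the first question, here is one way to read the quoted sentence, as a minimal sketch and not the repository's implementation: a column embedding of shape 1×N×C and a row embedding of shape N×1×C are broadcast-added to an N×N×C block. The function name `add_axial_position_embeddings` and the toy values are assumptions made for illustration. It does not describe what the code does when xy_pos_emb_shaped is None.

```python
def add_axial_position_embeddings(x, col_emb, row_emb):
    """Add column and row embeddings to an N x N x C block (nested lists).

    x:       N x N x C
    col_emb: 1 x N x C, shared by every row
    row_emb: N x 1 x C, shared by every column
    """
    n = len(x)
    c = len(x[0][0])
    return [
        [
            [x[i][j][k] + col_emb[0][j][k] + row_emb[i][0][k] for k in range(c)]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy example with N = 2 and C = 1 so the result is easy to check by hand.
N, C = 2, 1
x = [[[0.0] * C for _ in range(N)] for _ in range(N)]
col_emb = [[[10.0 * j] for j in range(N)]]  # shape 1 x N x C
row_emb = [[[1.0 * i]] for i in range(N)]   # shape N x 1 x C

out = add_axial_position_embeddings(x, col_emb, row_emb)
```

Each position (i, j) ends up with its own combination of row and column offsets, which is what lets the block keep positional information.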
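Thanks. The CPU test now passes and takes about 0.3 s. When I then tried the GPU version, I found that the time was the same as on CPU. I had already set parallel to true when initializing VisualizationDemo and ran the test in a loop many times, but the time is still comparable to CPU. I added logging, which confirms that AsyncPredictor is indeed being called. Could you give me some guidance? I have confirmed that the environment is installed correctly.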
With this command
python tools/train_net.py \
--config-file configs/SOTR/R101.yaml \
--num-gpus 1
I get this error:
File "tools/train_net.py", line 52, in __init__
super(DefaultTrainer, self).__init__(model, data_loader, optimizer)
TypeError: __init__() takes 1 positional argument but 4 were given
My environment
torch==1.7.1
torchvision==0.9.2
detectron2==0.5
I am not sure what to do with it. Maybe try
detectron2==0.2.1 with torch==1.6
python -m pip install detectron2==0.2.1 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.6/index.html
Can you help me with this?
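A possible cause, sketched with stand-in classes and not with detectron2 itself: as far as I can tell, in detectron2 0.2.1 `DefaultTrainer` subclasses `SimpleTrainer`, whose `__init__` takes `(model, data_loader, optimizer)`, while in later releases such as 0.5 `DefaultTrainer` subclasses `TrainerBase`, whose `__init__` takes no arguments. `super(DefaultTrainer, self).__init__(...)` skips `DefaultTrainer` and calls its parent, so the same line works on the old layout and raises this TypeError on the new one. The classes `OldDefaultTrainer`, `NewDefaultTrainer` and the helper `make_trainer` are hypothetical names used only to reproduce the pattern.

```python
class TrainerBase:
    def __init__(self):
        self.iter = 0


class SimpleTrainer(TrainerBase):
    def __init__(self, model, data_loader, optimizer):
        super().__init__()
        self.model = model
        self.data_loader = data_loader
        self.optimizer = optimizer


class OldDefaultTrainer(SimpleTrainer):
    """Stand-in for the older layout: parent is SimpleTrainer."""


class NewDefaultTrainer(TrainerBase):
    """Stand-in for the newer layout: parent is TrainerBase."""


def make_trainer(default_trainer_cls):
    class Trainer(default_trainer_cls):
        def __init__(self):
            # Same call pattern as line 52 of tools/train_net.py: skip the
            # DefaultTrainer stand-in and call its parent's __init__ directly.
            super(default_trainer_cls, self).__init__("model", "loader", "optim")

    return Trainer


old_ok = make_trainer(OldDefaultTrainer)()  # parent accepts three arguments

try:
    make_trainer(NewDefaultTrainer)()       # parent accepts none
    error = None
except TypeError as exc:
    error = str(exc)
```

If this reading is right, the options are to install the detectron2 version the repository was written against, or to adapt the `Trainer` constructor to the newer class layout.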
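Did the author use distributed training?
detectron2==0.2.1 is only available for PyTorch 1.6.
PyTorch versions above 1.7 correspond to detectron2==0.5.
But the requirements call for pytorch>1.7 and detectron2==0.2.1.
How should I resolve this?
I look forward to the author's reply.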
I encountered the following problem when running SOTR with a new data set.
File "/home/xxx/EndoCV2022/detectron2/SOTR/adet/modeling/sotr/sotr.py", line 33, in __init__
self.scale_ranges = cfg.MODEL.SOTR.FPN_SCALE_RANGES
I checked the configuration file R50.yaml and found that there was no such variable. Do you know how to solve it?
Traceback (most recent call last):
  File "tools/train_net.py", line 218, in <module>
    launch(
  File "/root/miniconda3/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "tools/train_net.py", line 206, in main
    trainer = Trainer(cfg)
  File "tools/train_net.py", line 52, in __init__
    super(DefaultTrainer, self).__init__(model, data_loader, optimizer)
TypeError: __init__() takes 1 positional argument but 4 were given
Excuse me, what is this problem?
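Hello, I ran into a problem when training on a cluster and hope you can help.
My environment is:
torch == 1.7.1
torchvision == 0.8.2
detectron == 0.2.1
GPU used on the cluster:
one V100 with 12 GB of memory
Learning rate settings:
IMS_PER_BATCH: 2
BASE_LR: 0.00001
WARMUP_FACTOR: 0.00001
Result: NaN
Learning rate settings:
IMS_PER_BATCH: 4
BASE_LR: 0.00001
WARMUP_FACTOR: 0.00001
Result: CUDA out of memory
How can I solve this problem?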