
seam's People

Contributors

yudewang


seam's Issues

crf code should be changed

    def _crf_with_alpha(cam_dict, alpha):
        # stack the per-class CAMs into one array, one channel per class in cam_dict
        v = np.array(list(cam_dict.values()))
        # background score: high where no foreground class responds strongly
        bg_score = np.power(1 - np.max(v, axis=0, keepdims=True), alpha)
        bgcam_score = np.concatenate((bg_score, v), axis=0)
        crf_score = imutils.crf_inference(orig_img, bgcam_score, labels=bgcam_score.shape[0])
        pred_map = crf_score.argmax(0).astype(np.uint8)
        # map channel positions back to class ids (0 = background, key + 1 otherwise)
        keys = np.array(list(cam_dict.keys()))+1
        keys = np.pad(keys, (1, 0), mode='constant')
        pred_map = keys[pred_map]
        return pred_map


    import imageio  # imported once, outside the loop

    for t in crf_alpha:
        crf = _crf_with_alpha(cam_dict, t)
        folder = args.out_crf + ('_%.1f' % t)
        os.makedirs(folder, exist_ok=True)
        # write the CRF label map as a PNG
        imageio.imsave(os.path.join(folder, "%s.png" % img_name), crf.astype(np.uint8))



    print(iter)


Training the segmentation code

Hello!

Thank you for sharing the excellent code.

I am trying to reproduce the performance you reported. I trained on the output of the affinity network [Ahn et al.] using the segmentation code from https://github.com/itijyou/ademxapp

But the training failed. Could you share the hyper-parameters, or any changes you made, for training?
From the AffinityNet code I found that its author switched from SGD to Adam.

You may not remember the details, but even a small clue would help.

Thank you.

Replace AffinityNet with IRN

Dear YudeWang,
Thanks for your code!
Have you tried replacing AffinityNet with IRN? I got worse results when I did. Could you give me some advice about this?

dCRF on CAMs

Hi, thanks for sharing your great work. I have one question about the dCRF in your paper and hope you can reply.
According to your paper, a set of CAMs is generated after training with SEAM.py. How should dCRF be applied to these CAMs (56.83% in Table 1)? Is it applied after combining the CAMs with the best background score from the (0, 60) sweep, or only to the foreground maps?

cam multiplied by GT label?

Both during training and inference the cam output is multiplied by the ground truth label.

training:

Line 123: cam_rv1 = F.interpolate(visualization.max_norm(cam_rv1),scale_factor=scale_factor,mode='bilinear',align_corners=True)*label
Line 129: cam_rv2 = visualization.max_norm(cam_rv2)*label

inference:
Line 63: cam = cam.cpu().numpy() * label.clone().view(20, 1, 1).numpy()

Is this a mistake? How can we assume that labels are available during inference?
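To make the effect of the quoted lines concrete, here is a minimal NumPy sketch of what multiplying a CAM by an image-level label does. The function name, shapes, and class indices are illustrative assumptions, not code from the repository.

```python
import numpy as np

def mask_cam_with_labels(cam, label):
    """Zero the CAM channels of classes the image-level label marks as absent.

    cam:   (C, H, W) array of per-class activations
    label: (C,) multi-hot vector, 1 for classes present in the image
    """
    return cam * label.reshape(-1, 1, 1)

# Hypothetical example: 20 classes, a 4x4 map, classes 1 and 14 present.
cam = np.random.default_rng(0).random((20, 4, 4)).astype(np.float32)
label = np.zeros(20, dtype=np.float32)
label[[1, 14]] = 1.0
masked = mask_cam_with_labels(cam, label)
```

Channels of absent classes become all zeros while the present ones are unchanged, which is why this step needs the image-level labels of every image it is applied to.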

Huge Time Cost When Running the Code

Hi, thanks for your work!
But I ran into a problem when running this code on a server with 8 A100 GPUs: it gets stuck on these two lines
model = torch.nn.DataParallel(model).cuda()
for iter, pack in enumerate(train_data_loader):
It also takes a long time to run every operation in the training process, such as F.interpolate.
I wonder if there is something wrong with my conda env?

Question about the classification loss

SEAM is really excellent work. After reading the paper, I have two questions:

  1. How do you get the final segmentation mask? As I understand it, SEAM outputs a CAM, and the random walk is then used to produce the final mask. Am I right?

  2. How is the classification loss calculated? For example, the final output is
    image
    and we can also calculate the background as:
    image
    But how can we use these two results to calculate the loss? How is the ground truth generated? Is img(m, n) = c (the true label) the ground truth?

Any suggestion is appreciated!

About training deeplab

Hi, thank you for your excellent work,
Can you provide the code of training the fully-supervised deeplab model? Or can you give me some hints about how you initialize the deeplab to train on pseudo labels? Did you just train it from scratch, or use the backbone trained on imagenet, or use the pretrained parameters on coco?
Thank you for your reply.

Background threshold?

I notice that you sweep over all background threshold options and report the best mIoU of the pseudo labels. This setting assumes that the ground-truth masks are available when the pseudo labels are generated. In practice, however, if the GT masks were available, we would simply use them as labels. So I think a background threshold selection strategy that does not depend on GT masks is needed in practice. What do you think? Thanks!
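For readers unfamiliar with the sweep, the following sketch shows how a single background threshold might turn CAMs into a pseudo label. It assumes the CAM scores lie in [0, 1] and that the threshold acts as a constant background channel; `cam_to_pseudo_label` is a hypothetical helper, not the repository's evaluation code.

```python
import numpy as np

def cam_to_pseudo_label(cam, bg_threshold):
    """Turn (C, H, W) foreground scores into an (H, W) label map.

    A constant background channel is stacked in front of the CAMs, so a pixel
    becomes background (0) when no class score exceeds the threshold, and
    class c becomes label c + 1 otherwise.
    """
    bg = np.full((1,) + cam.shape[1:], bg_threshold, dtype=cam.dtype)
    return np.concatenate((bg, cam), axis=0).argmax(axis=0).astype(np.uint8)

# Two classes on a 1x2 image (made-up scores).
cam = np.array([[[0.05, 0.30]], [[0.10, 0.25]]], dtype=np.float32)
low = cam_to_pseudo_label(cam, 0.01)   # almost nothing is background
high = cam_to_pseudo_label(cam, 0.20)  # weak responses fall to background
```

Selecting the threshold by mIoU, as the sweep does, needs ground-truth masks for the images being evaluated; the sketch itself only shows the labelling step for one fixed threshold.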

Can you share the segmentation code you used?

Hey, thanks for sharing your code!
I've run your code and achieved results close to those reported on the training set. But I couldn't find the segmentation code. Can you share it?
Many thanks!

Some issues about the performance(mIoU)

Hi,

Thanks for sharing the code.

I am trying to reproduce your results, but the final result I got differs by 3% from the result in your paper.

I followed the steps in the readme exactly. The results of local training are Train: 63.420%, Val: 60.336%; the results obtained by using the model you provided are: Train: 63.606%, Val: 60.076%.

I am not sure where the problem occurred, and I look forward to your answer, thank you.

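About the CRF parameter settings

Hello, I find this repository quite convenient, but I have a question: how did you choose the CRF parameters? For example, why are they 4 and 24 in infer_SEAM? I saw that someone asked this before, and your answer was to just set bkg_score_low<best bkg_score<bkg_score_high. When experimenting with my own model, I found that the best bkg_score is 0.21, so in principle your settings should work, but the results after CRF refinement were even lower. Is there a good way to tune these parameters?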
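One way to relate an alpha value to a background score is sketched below. It assumes the background score is (1 - m)^alpha for a pixel whose largest foreground activation is m (the form used in the CRF snippet earlier on this page), and it ignores the CRF itself; `crossover` is a hypothetical helper.

```python
def background_score(max_fg, alpha):
    """Background score of a pixel whose largest foreground activation is max_fg."""
    return (1.0 - max_fg) ** alpha

def crossover(alpha, tol=1e-9):
    """Foreground activation m at which (1 - m)**alpha == m, found by bisection.

    Below this value the background score is larger than the foreground
    activation, so the pixel would be labelled background before any CRF step.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if background_score(mid, alpha) > mid:
            lo = mid  # background still wins, the crossover lies further right
        else:
            hi = mid
    return (lo + hi) / 2

cut_alpha_4 = crossover(4)
cut_alpha_24 = crossover(24)
```

Under these assumptions alpha = 4 corresponds to a cut of roughly 0.28 and alpha = 24 to roughly 0.09, so the two values bracket a best background score of about 0.21. The CRF changes the scores afterwards, so this is only a starting point for tuning.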

Why using cls labels to generate CAMs at inference time? Is it valid?

At val / test time, in infer_SEAM.py (line 79 to line 82), you use GT cls labels to choose the CAMs of these categories and save the selected CAMs as .npy files. I am wondering whether using GT cls labels at inference time is valid in weakly-supervised semantic segmentation. Could you give me some hints? Many thanks!

Optimization problem when training SEAM from scratch

Hi, firstly thank you for releasing the code, I've successfully reproduced part of the result by using the provided weights.

However, when I tried to train SEAM from scratch (not using any pretrained weights), it seems ER loss easily goes down to 0 and ECR loss just cannot go down, then the model cannot improve anymore. I've tried to increase the loss weight of ECR loss but the outcome is still the same.
Could you provide more details or suggestions on how you train SEAM without pretrained weights?

Thanks!

How is CAM mIOU validated?

Hi @YudeWang, was the CAM mIoU reported in your paper validated on the original PASCAL VOC train set or on the SBD-augmented train set? When I use the same hyperparameters as your code, I only get 43.9% mIoU with single-scale testing on the augmented train set, lower than the reported 46.1%.

Large performance gap between trained model using default setting and the provided trained model.

With the provided trained 'resnet38_SEAM.pth', the results of SEAM step evaluation:

0/60 background score: 0.000 mIoU: 28.861%
1/60 background score: 0.010 mIoU: 32.021%
2/60 background score: 0.020 mIoU: 35.937%
3/60 background score: 0.030 mIoU: 39.372%
4/60 background score: 0.040 mIoU: 42.470%
5/60 background score: 0.050 mIoU: 45.309%
6/60 background score: 0.060 mIoU: 47.967%
7/60 background score: 0.070 mIoU: 50.436%
8/60 background score: 0.080 mIoU: 52.721%
9/60 background score: 0.090 mIoU: 54.865%
10/60 background score: 0.100 mIoU: 56.885%
11/60 background score: 0.110 mIoU: 58.777%
12/60 background score: 0.120 mIoU: 60.595%
13/60 background score: 0.130 mIoU: 62.310%
14/60 background score: 0.140 mIoU: 63.905%
15/60 background score: 0.150 mIoU: 65.372%
16/60 background score: 0.160 mIoU: 66.710%
17/60 background score: 0.170 mIoU: 67.907%
18/60 background score: 0.180 mIoU: 68.925%
19/60 background score: 0.190 mIoU: 69.758%
20/60 background score: 0.200 mIoU: 70.414%
21/60 background score: 0.210 mIoU: 71.014%
22/60 background score: 0.220 mIoU: 71.291%
23/60 background score: 0.230 mIoU: 71.324%
24/60 background score: 0.240 mIoU: 71.143%
25/60 background score: 0.250 mIoU: 70.799%
26/60 background score: 0.260 mIoU: 70.287%
27/60 background score: 0.270 mIoU: 69.664%
28/60 background score: 0.280 mIoU: 68.952%
29/60 background score: 0.290 mIoU: 68.148%
30/60 background score: 0.300 mIoU: 67.274%
31/60 background score: 0.310 mIoU: 66.322%
32/60 background score: 0.320 mIoU: 65.305%
33/60 background score: 0.330 mIoU: 64.232%
34/60 background score: 0.340 mIoU: 63.105%
35/60 background score: 0.350 mIoU: 61.939%
36/60 background score: 0.360 mIoU: 60.727%
37/60 background score: 0.370 mIoU: 59.485%
38/60 background score: 0.380 mIoU: 58.215%
39/60 background score: 0.390 mIoU: 56.921%
40/60 background score: 0.400 mIoU: 55.609%
41/60 background score: 0.410 mIoU: 54.281%
42/60 background score: 0.420 mIoU: 52.940%
43/60 background score: 0.430 mIoU: 51.605%
44/60 background score: 0.440 mIoU: 50.279%
45/60 background score: 0.450 mIoU: 48.955%
46/60 background score: 0.460 mIoU: 47.630%
47/60 background score: 0.470 mIoU: 46.303%
48/60 background score: 0.480 mIoU: 44.982%
49/60 background score: 0.490 mIoU: 43.653%
50/60 background score: 0.500 mIoU: 42.330%
51/60 background score: 0.510 mIoU: 41.015%
52/60 background score: 0.520 mIoU: 39.709%
53/60 background score: 0.530 mIoU: 38.409%
54/60 background score: 0.540 mIoU: 37.119%
55/60 background score: 0.550 mIoU: 35.848%
56/60 background score: 0.560 mIoU: 34.601%
57/60 background score: 0.570 mIoU: 33.372%
58/60 background score: 0.580 mIoU: 32.158%
59/60 background score: 0.590 mIoU: 30.959%

When using the 'resnet38_SEAM.pth' that I trained myself with the default settings (except that I used two GPU cards; the batch size was still set to 8), the results of SEAM step evaluation:

0/60 background score: 0.000 mIoU: 22.938%
1/60 background score: 0.010 mIoU: 26.294%
2/60 background score: 0.020 mIoU: 30.367%
3/60 background score: 0.030 mIoU: 33.779%
4/60 background score: 0.040 mIoU: 36.815%
5/60 background score: 0.050 mIoU: 39.461%
6/60 background score: 0.060 mIoU: 41.722%
7/60 background score: 0.070 mIoU: 43.691%
8/60 background score: 0.080 mIoU: 45.386%
9/60 background score: 0.090 mIoU: 46.875%
10/60 background score: 0.100 mIoU: 48.230%
11/60 background score: 0.110 mIoU: 49.466%
12/60 background score: 0.120 mIoU: 50.592%
13/60 background score: 0.130 mIoU: 51.575%
14/60 background score: 0.140 mIoU: 52.443%
15/60 background score: 0.150 mIoU: 53.182%
16/60 background score: 0.160 mIoU: 53.806%
17/60 background score: 0.170 mIoU: 54.334%
18/60 background score: 0.180 mIoU: 54.759%
19/60 background score: 0.190 mIoU: 55.087%
20/60 background score: 0.200 mIoU: 55.339%
21/60 background score: 0.210 mIoU: 55.510%
22/60 background score: 0.220 mIoU: 55.590%
23/60 background score: 0.230 mIoU: 55.594%
24/60 background score: 0.240 mIoU: 55.525%
25/60 background score: 0.250 mIoU: 55.382%
26/60 background score: 0.260 mIoU: 55.169%
27/60 background score: 0.270 mIoU: 54.892%
28/60 background score: 0.280 mIoU: 54.556%
29/60 background score: 0.290 mIoU: 54.155%
30/60 background score: 0.300 mIoU: 53.685%
31/60 background score: 0.310 mIoU: 53.182%
32/60 background score: 0.320 mIoU: 52.640%
33/60 background score: 0.330 mIoU: 52.064%
34/60 background score: 0.340 mIoU: 51.445%
35/60 background score: 0.350 mIoU: 50.793%
36/60 background score: 0.360 mIoU: 50.107%
37/60 background score: 0.370 mIoU: 49.380%
38/60 background score: 0.380 mIoU: 48.624%
39/60 background score: 0.390 mIoU: 47.837%
40/60 background score: 0.400 mIoU: 47.029%
41/60 background score: 0.410 mIoU: 46.199%
42/60 background score: 0.420 mIoU: 45.353%
43/60 background score: 0.430 mIoU: 44.483%
44/60 background score: 0.440 mIoU: 43.593%
45/60 background score: 0.450 mIoU: 42.681%
46/60 background score: 0.460 mIoU: 41.749%
47/60 background score: 0.470 mIoU: 40.809%
48/60 background score: 0.480 mIoU: 39.855%
49/60 background score: 0.490 mIoU: 38.890%
50/60 background score: 0.500 mIoU: 37.914%
51/60 background score: 0.510 mIoU: 36.934%
52/60 background score: 0.520 mIoU: 35.954%
53/60 background score: 0.530 mIoU: 34.974%
54/60 background score: 0.540 mIoU: 33.988%
55/60 background score: 0.550 mIoU: 32.998%
56/60 background score: 0.560 mIoU: 32.011%
57/60 background score: 0.570 mIoU: 31.033%
58/60 background score: 0.580 mIoU: 30.064%
59/60 background score: 0.590 mIoU: 29.102%

is there any ablation study of ECR loss

The performance improvement is mostly obtained by the PCM module, which is constrained by the ECR loss, so the effect of the ECR loss is more important than that of the ER loss.
Can you provide an ablation study of the ECR loss?
It would be better to add an ablation study of the ECR loss to Table 1.

Can you give some details about training?

Dear Wang, I just used the default training script to run train.py. However, the results are lower than reported in the paper (the mIoU I measured on VOC2012 val is 0.388). Could you give some details about training, such as the hyper-parameters?

Question about affinityNet Inference

Dear YudeWang,
Thanks for sharing your code!
Should I pass the CAM results obtained from the existing SEAM step as the cam_dir parameter in infer_aff.py?
Does anyone know the answer?

Performance about `SEAM step`, step3 and `Random walk step`, step3 in README?

Thanks for sharing!
Can you report the mIoU in SEAM step, step3 and Random walk step, step3 in README?
The earlier AffinityNet work trains another segmentation network with pseudo labels, and the related source code is not public. I am not sure whether I can reproduce the results reported for AffinityNet because I am not familiar with training DeepLab. Thanks!

GPU and batch size?

Thanks for your great work!
I noticed that in your paper you mentioned: The model is trained on 4 TITAN-Xp GPUs with batch size 8 for 8 epochs.
However, I train the SEAM on 4 2080Ti GPUs with batch size 8, and find that each card only took up about 4G memory.
So I wonder, are 4×12G GPUs necessary?
Thanks for your reply.

The parameters in optimizer

Hello, I noticed that the order of parameters (params, lr, wd) in PolyOptimizer is different from that of the official SGD (params, lr, momentum). So I think the value of wd will actually be assigned to momentum. Is that so?

class PolyOptimizer(torch.optim.SGD):

    def __init__(self, params, lr, weight_decay, max_step, momentum=0.9):
        super().__init__(params, lr, weight_decay)
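The concern can be illustrated without PyTorch. The sketch below uses a stand-in base class that only copies the leading argument order of torch.optim.SGD (params, lr, momentum, dampening, weight_decay); `BaseSGD`, `PositionalPoly` and `KeywordPoly` are hypothetical names, and the keyword version is one possible fix, not necessarily the author's.

```python
class BaseSGD:
    """Stand-in that mirrors the positional order of torch.optim.SGD's first arguments."""

    def __init__(self, params, lr, momentum=0.0, dampening=0.0, weight_decay=0.0):
        self.defaults = dict(lr=lr, momentum=momentum,
                             dampening=dampening, weight_decay=weight_decay)


class PositionalPoly(BaseSGD):
    """Passes weight_decay positionally, as in the snippet quoted above."""

    def __init__(self, params, lr, weight_decay, max_step, momentum=0.9):
        # The third positional slot of the base class is momentum, not weight_decay.
        super().__init__(params, lr, weight_decay)


class KeywordPoly(BaseSGD):
    """Passes both values by keyword, so each one lands in the intended slot."""

    def __init__(self, params, lr, weight_decay, max_step, momentum=0.9):
        super().__init__(params, lr, momentum=momentum, weight_decay=weight_decay)


positional = PositionalPoly([], lr=0.01, weight_decay=5e-4, max_step=1000).defaults
keyword = KeywordPoly([], lr=0.01, weight_decay=5e-4, max_step=1000).defaults
```

With the positional call, momentum receives 5e-4 and weight_decay stays at its default of 0, which matches the behaviour described in the question.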

Loss_ER is too small. Is it really helpful?

image
Thanks for your wonderful work.
However, when reading your code, I found that loss_er is very small compared with the other two losses. The reason is that you apply the mean operation directly, but there are a lot of zeros (you set C-1 channels to 0).
image
I also find that the improvement from loss_er in your paper is much smaller than that from loss_ecr. Could this be a bug?
image
I am sorry that I do not have enough GPUs to reproduce it.
Looking forward to your reply.
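The scaling effect described here can be checked with a small NumPy sketch. It assumes, as the question states, that the channels of absent classes are exactly zero; the shapes and the single present class are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, height, width = 20, 8, 8
present = [3]  # hypothetical: only one class is present in the image

# Per-pixel absolute difference between two CAMs; absent channels are zero.
diff = np.zeros((num_classes, height, width), dtype=np.float32)
diff[present] = rng.random((len(present), height, width))

mean_over_all = np.abs(diff).mean()               # averages over all 20 channels
mean_over_present = np.abs(diff[present]).mean()  # averages over present channels only
ratio = mean_over_present / mean_over_all
```

With one present class out of 20, the plain mean is 20 times smaller than the mean over the present channel. Whether this rescaling matters for training depends on the loss weights and learning rate, which this sketch does not address.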

segmentation training problems

  1. It seems that you use the train set to train the segmentation model. Why not use trainaug?
  2. Following the setting in #11, my results are 61.5 when training with trainaug and 56.7 with train. Why do they differ so much from the results in the paper? (Note that the weight is from ilsvrc-cls_rna-a1_cls1000_ep-0001.params. The test resolution is (1024*512) * [0.5, 0.75, 1.0, 1.25, 1.5, 1.75].)
  3. Why does the performance drop after applying CRF in the RW step?

infer_seam npy

First of all, thanks for this code, it is very useful!
However, when I use infer_seam, the result is saved as .npy. How can I use it to produce a heatmap?
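A possible way to turn such a file into heatmaps is sketched below. It assumes the .npy file holds a dict mapping class index to an (H, W) CAM, which is how another snippet on this page loads it (np.load(..., allow_pickle=True).item()); `cam_to_heatmap` and the sample data are hypothetical.

```python
import numpy as np

def cam_to_heatmap(cam):
    """Scale one (H, W) CAM to the 0..255 range as uint8."""
    cam = cam - cam.min()
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return (cam * 255).astype(np.uint8)

# Stand-in for: cam_dict = np.load("<name>.npy", allow_pickle=True).item()
cam_dict = {14: np.linspace(0, 1, 12, dtype=np.float32).reshape(3, 4)}

# One grayscale heatmap per class stored in the file.
heatmaps = {k: cam_to_heatmap(v) for k, v in cam_dict.items()}
```

Each uint8 array can then be written with an image writer such as imageio.imsave, which another issue on this page uses, or passed through a colormap before being overlaid on the original image.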

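How to train on my own dataset

Hello, my own dataset contains two classes. Do I need to convert it to the VOC2012 format for training? Also, can the pretrained model you provide be used for training on my own dataset?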

Exception during running train_SEAM.py

When I run the train_SEAM script with the weights ilsvrc-cls_rna-a1_cls1000_ep-0001.params, the code stops at random without any message. How can I solve this? Does anybody have the same problem?

CAM is not accurate.

Hi, I know that weakly-supervised semantic segmentation does not perform as well as supervised semantic segmentation.

But I am still confused about the result; it is much worse than I expected.

Original image:
image

CAM:
image

I just ran this command: python infer_SEAM.py --weights ../resnet38_SEAM.pth --infer_list voc12/val.txt --out_cam_pred out_cam_pred

cam_full_arr[k+1] = v out of bounds

Thanks for posting your excellent code!
I ran into some problems when using the code. In line 95, the npy files store the class information of a training sample. The max k in line 98 is 21 for the VOC dataset, because the VOC dataset contains 21 categories. In line 99, the index goes out of bounds, because k+1 can be 22 but cam_full_arr's size is 21. How can I solve this error?
And what does line 100 mean? What is filled into cam_full_arr[0]? I am confused.

SEAM/infer_aff.py

Lines 95 to 100 in 3212261

    cam = np.load(os.path.join(args.cam_dir, name + '.npy'), allow_pickle=True).item()
    cam_full_arr = np.zeros((21, orig_shape[2], orig_shape[3]), np.float32)
    for k, v in cam.items():
        cam_full_arr[k+1] = v
    cam_full_arr[0] = (1 - np.max(cam_full_arr[1:], (0), keepdims=False))**args.alpha

Looking forward to your reply.
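The quoted lines can be read as follows, under the assumption that the dict keys are 0-based indices of the 20 foreground classes, so that k+1 stays within 1..20. That assumption is consistent with other snippets on this page but is not confirmed here; `build_full_cam` and its explicit bounds check are illustrative additions.

```python
import numpy as np

def build_full_cam(cam, height, width, alpha=1.0, num_fg=20):
    """Place per-class CAMs into a (num_fg + 1, H, W) array with background at index 0."""
    full = np.zeros((num_fg + 1, height, width), np.float32)
    for k, v in cam.items():
        if not 0 <= k < num_fg:
            # A key outside 0..num_fg-1 would write past the last channel.
            raise IndexError(f"class key {k} is outside 0..{num_fg - 1}")
        full[k + 1] = v  # shift by one to leave channel 0 for the background
    # Background is strong where no foreground class responds.
    full[0] = (1 - np.max(full[1:], axis=0)) ** alpha
    return full

# Hypothetical CAMs for the first and last foreground class on a 2x2 image.
cam = {0: np.full((2, 2), 0.8, np.float32), 19: np.full((2, 2), 0.1, np.float32)}
full = build_full_cam(cam, 2, 2)
```

If a custom dataset stores 1-based keys or has a different number of classes, the array size and the k+1 offset would both need to change accordingly.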

exception during SEAM inference

Dear YudeWang,
I have successfully trained the model, but during the SEAM inference stage the code stops at a random iteration (38, 14, or whatever) and does not continue, and no more files are produced in the out_cam (or out_crf) folder. How can I solve this?

About code

First of all, thank you for sharing!
What do the model outputs crm1 and crm_rv1 represent in the train_SEAM.py file?

Performance from the provided weights

Hello!

Thanks for sharing the code.

I ran your code with the trained weights and got lower performance than reported in the paper:
60.076% mIoU on the validation set.

My inference steps are

  1. Infer the CAMs (npy files) with infer_SEAM.py and the trained model
  2. Infer the segmentation maps (png files) with infer_aff.py and the trained model
  3. Evaluate the png files against the GT files.

Is there anything I missed?
Thank you!

CRF Inference

Hi,

Thank you for sharing the code. While trying to understand the code, I had trouble understanding crf_inference. In line 96 of infer_SEAM.py, bgcam_score does not look like probabilities (I checked, and the max value for some images is 1.2). But unary_from_softmax takes softmax probabilities as input. I am not sure whether something is wrong here or whether I am missing something.
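To illustrate why the value range matters, here is a NumPy-only sketch. It assumes the unary term of a dense CRF is the negative log of a per-pixel probability, and it shows one possible rescaling; `to_probabilities` and `unary_energy` are hypothetical helpers and do not reproduce the repository's crf_inference or any pydensecrf function.

```python
import numpy as np

def to_probabilities(scores, eps=1e-5):
    """Rescale (C, H, W) non-negative scores so the channels sum to 1 at every pixel."""
    scores = np.clip(scores, eps, None)
    return scores / scores.sum(axis=0, keepdims=True)

def unary_energy(probs, eps=1e-5):
    """Negative log-probability, clipped so that log(0) never occurs."""
    return -np.log(np.clip(probs, eps, 1.0))

# One pixel, two channels; 1.2 mimics the out-of-range value mentioned above.
scores = np.array([[[1.2]], [[0.3]]], dtype=np.float32)
raw_energy = -np.log(scores)       # negative for the channel scored 1.2
probs = to_probabilities(scores)   # channels now sum to 1
energy = unary_energy(probs)       # non-negative energies
```

A score above 1 yields a negative energy when the log is taken directly, while the rescaled version keeps every energy non-negative. Whether the original pipeline needs such a normalization is a question for the author.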
It would be highly appreciated if you could clarify it. Thank you.

training with my custom dataset

Hi. Thank you for sharing your code.

I'm trying to train the model with my custom dataset.
The number of classes is 3, so I changed the code in resnet38_SEAM.py

line 16 : self.fc8 = nn.Conv2d(4096, 4, 1, bias=False)

I only changed the dimension and ran the code, but an error occurred.
I thought it was a CUDA issue, so I changed the batch size to 2.
But the result was the same.

I found that the error occurs when the loss is NaN.
After some iterations, loss_cls1 and loss_cls2 become NaN.

THCudaCheck FAIL file=C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src\THC/generic/THCTensorMathPointwise.cu line=253 error=59 : device-side assert triggered
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [0,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [1,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [2,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [3,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [4,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src/THC/THCTensorScatterGather.cu:130: block: [0,0,0], thread: [5,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
Traceback (most recent call last):
  File "C:/Users/Goeun/PycharmProjects/SEAM2/train_SEAM.py", line 144, in <module>
    loss.backward()
  File "C:\Users\Goeun\miniconda3\envs\seam\lib\site-packages\torch\tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "C:\Users\Goeun\miniconda3\envs\seam\lib\site-packages\torch\autograd\__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at C:/w/1/s/tmp_conda_3.6_045031/conda/conda-bld/pytorch_1565412750030/work/aten/src\THC/generic/THCTensorMathPointwise.cu:253

Process finished with exit code 1

OHEM

Hello, Yude.

Thanks for sharing this great work!

I have one question about table 1. You mentioned that you reported results in table 1 with the training set. Then, it seems OHEM process should be involved with train_SEAM.py. Is that correct? Does your repo include OHEM process? How can I use OHEM in your code?

Thanks

result images

How can I get the final result images? Is the generated npy file converted to an image?
