Questions regarding mIoU, accuracy, FID about spade HOT 7 CLOSED

nvlabs commented on July 23, 2024

Questions regarding mIoU, accuracy, FID

from spade.

Comments (7)

taesungp commented on July 23, 2024

Hello,

I did not crop; I resized the whole image to the same size as the generated ones
(512x256 for Cityscapes and 256x256 for the others).
You are correct about all the models.
I downsampled the labels using nearest neighbor interpolation.
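
For readers reimplementing this step, here is a minimal sketch of nearest neighbor downsampling for a label map. It is not the authors' evaluation code, which was not released. The function name `resize_label_nearest` and the pixel-center sampling rule are assumptions made for illustration. Nearest neighbor sampling matters here because every output pixel copies an existing class ID, whereas bilinear interpolation would blend IDs into values that correspond to no class.

```python
import numpy as np


def resize_label_nearest(label, out_h, out_w):
    """Resize a 2-D integer label map with nearest neighbor sampling.

    Hypothetical helper, not taken from the SPADE repository.
    """
    in_h, in_w = label.shape
    # Map the center of each output pixel back to a source row/column index.
    rows = ((np.arange(out_h) + 0.5) * in_h / out_h).astype(np.int64)
    cols = ((np.arange(out_w) + 0.5) * in_w / out_w).astype(np.int64)
    # Guard against rounding past the last valid index.
    rows = np.minimum(rows, in_h - 1)
    cols = np.minimum(cols, in_w - 1)
    # Fancy indexing copies existing class IDs; no new values are created.
    return label[rows[:, None], cols[None, :]]


# Example: a Cityscapes-sized label map (height 1024, width 2048) reduced
# to the 512x256 (width x height) evaluation size mentioned above.
full = np.random.randint(0, 19, size=(1024, 2048), dtype=np.uint8)
small = resize_label_nearest(full, out_h=256, out_w=512)
```

Note that the sizes in this thread are written as width x height, so 512x256 corresponds to `out_h=256, out_w=512`.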

from spade.

Godo1995 commented on July 23, 2024

Thank you so much for your quick response.

from spade.

SolidShen commented on July 23, 2024

Hello !
Can you achieve the mIoU score reported in the paper (62.3 on Cityscapes)? I followed your guide and only got 48.7 mIoU on the Cityscapes val set.

from spade.

zkchen95 commented on July 23, 2024

Hello !
Can you achieve the mIoU score reported in the paper (62.3 on Cityscapes)? I followed your guide and only got 48.7 mIoU on the Cityscapes val set.

When I resize the label to the size of the generated image (512x256), the result is 53.5 mIoU and 91.0 accuracy.
When I resize both the label and the generated image to 1024x512, the result is 58 mIoU and 92.9 accuracy.

I have a question for the author: what sizes of the label and the generated image did you use for evaluation?
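
Since the numbers in this thread depend on how the metrics are computed, here is a minimal sketch of the usual confusion-matrix definition of mIoU and pixel accuracy. It is not the authors' protocol, which is not given in this thread. The function names, the `ignore_index=255` convention, and the choice to average IoU only over classes that appear in the prediction or the ground truth are all assumptions.

```python
import numpy as np


def confusion_matrix(pred, gt, num_classes, ignore_index=255):
    """Accumulate a (num_classes x num_classes) matrix; rows are ground truth."""
    mask = gt != ignore_index  # assumption: 255 marks pixels to skip
    idx = num_classes * gt[mask].astype(np.int64) + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def miou_and_accuracy(conf):
    """Return (mean IoU, pixel accuracy) from a confusion matrix."""
    tp = np.diag(conf).astype(np.float64)
    # union = predicted-as-class + labeled-as-class - intersection
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0  # assumption: skip classes that never occur
    miou = (tp[valid] / union[valid]).mean()
    accuracy = tp.sum() / conf.sum()
    return miou, accuracy


# Toy example: prediction and label must share the same spatial size,
# so one of them has to be resized before this step.
gt = np.array([[0, 0], [1, 255]])
pred = np.array([[0, 1], [1, 1]])
conf = confusion_matrix(pred, gt, num_classes=2)
miou, acc = miou_and_accuracy(conf)
```

In practice the confusion matrix would be summed over all validation images before calling `miou_and_accuracy`. Averaging per-image mIoU instead gives a different number, which is one possible source of mismatched scores.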

from spade.

ShihuaHuang95 commented on July 23, 2024

@ZzzackChen @SolidShen Hi, guys. I have tested the pretrained models (Cityscapes and ADE20K) and got 64.07 and 43.02 mIoU, respectively. I downsampled the labels using nearest neighbor interpolation as the authors suggested (512x256 for Cityscapes and 256x256 for the others). However, I am confused that these scores are higher than those reported in the paper, especially for ADE20K. @taesungp Would you mind sharing more details about the evaluation?

from spade.

fido20160817 commented on July 23, 2024

Hi,

Thank you for sharing this awesome code! Based on this issue, I understand that you are not going to release the evaluation code, so I'm working on reimplementing it myself. I have the following questions:

When computing the FID scores, do you compare the generated images to the original images or to cropped images (the same size as the generated ones)?

What image sizes did you use for evaluation? Did you generate higher-resolution images for evaluation or just use the default size (512x256 for Cityscapes and 256x256 for the others)?

What pre-trained segmentation models and code bases did you use for each dataset? Based on the paper, I assume these are the ones you used. Could you please confirm them?

COCO-Stuff: code: kazuto1011/deeplab-pytorch model: deeplabv2_resnet101_msc-cocostuff164k-100000.pth

ADE20K: code: CSAILVision/semantic-segmentation-pytorch model: baseline-resnet101-upernet

Cityscapes: code: fyu/drn model: drn-d-105_ms_cityscapes.pth

When you evaluate mIoU and accuracy, do you upsample the images or downsample the labels? In either case, how do you interpolate them?

Thanks in advance.

Best, Godo

Hi, the link for "baseline-resnet101-upernet" is no longer valid. Could you share this model with me?

from spade.

fido20160817 commented on July 23, 2024

I have successfully downloaded it using the information from https://github.com/CSAILVision/semantic-segmentation-pytorch/blob/master/demo_test.sh and https://github.com/CSAILVision/semantic-segmentation-pytorch/blob/master/config/ade20k-resnet101-upernet.yaml

from spade.
