lukemelas / deep-spectral-segmentation
[CVPR 2022] Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Hi, thank you for providing this awesome work!
After reading the paper, I realized that the method relies heavily on color, spatial position, and the features extracted by DINO.
So if we train a DINO model on a dataset that doesn't include background classes like road and sidewalk, and these two classes share similar color and spatial features, can the eigen map still differentiate them?
Thanks again for the discussion!!
Kevin
Thank you for sharing great work!!
I have two questions about extracting eigenvectors.
python extract.py extract_eigs
generates the eigenvectors for the input image path (e.g., /home/naoki/deep-spectral-segmentation/testdata/images/014583.pth) and takes about 10 seconds per image. Is this normal? Thank you in advance.
Hi,
Thank you for your beautiful research. I am facing an issue with the data loader for object localization. The path to the VOC12 dataset seems to be incorrect, and there is no guide for setting it up properly. The code expects the VOC dataset to exist in the directory "datasets"; however, the previous sections set it up in "data". Even when I change the path to "data", the dataset still cannot be loaded. Would you please guide me on this?
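One way to narrow this down is to check which of the two directory names actually exists before the loader runs. The snippet below is only a sketch: `find_dataset_root` is a hypothetical helper, not part of the repository, and the candidate names "datasets" and "data" are simply the two mentioned in this issue.

```python
from pathlib import Path


def find_dataset_root(candidates, base="."):
    """Return the first candidate directory that exists under `base`, else None.

    Hypothetical helper for debugging a path mismatch; it is not part of
    the repository's code.
    """
    for name in candidates:
        root = Path(base) / name
        if root.is_dir():
            return root
    return None


if __name__ == "__main__":
    # Check both names mentioned in the issue, relative to the current directory.
    root = find_dataset_root(["datasets", "data"])
    print("dataset root:", root)
```

If this prints `None` from the directory where the localization script is launched, the loader is probably being run from a different working directory than the one used for the setup steps.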
Thanks.
Hi,
in
https://github.com/lukemelas/deep-spectral-segmentation/blob/main/extract/extract.py#L204
and
https://github.com/lukemelas/deep-spectral-segmentation/blob/main/extract/extract.py#L208
https://github.com/lukemelas/deep-spectral-segmentation/blob/main/extract/extract.py#L210
you convert image_lr to float by dividing by 255.
Looks like a bug to me.
Hi, after running extract_features, extract_eigs, extract_single_region_segmentations and extract_crf_segmentations, I get a single_region_segmentation that contains some segmentation, but extract_crf_segmentations (which I guess is a simple upscaling) produces black images.
Hi! I am trying to run the self-training part of semantic segmentation, but it does not run successfully. Is this a defect in the code itself, or are there tricks for running the code that are not mentioned in the readme file? Many thanks!
I am getting very small segmap files, e.g. 189 bytes, 215 bytes, 139 bytes.
The segmaps for the images in VOC2012 also vary in size.
How can I increase the size of those segmaps?
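A possible explanation, stated here as an assumption rather than a confirmed answer: if the segmaps are stored at patch resolution (one label per 16x16 patch for a `/16` ViT), they will be tiny files whose dimensions follow the input image. Under that assumption, a minimal sketch of nearest-neighbour upsampling back to pixel resolution looks like this. `upsample_nearest` is a hypothetical helper written with plain lists; real code would more likely use NumPy or PIL.

```python
def upsample_nearest(segmap, factor):
    """Repeat every label `factor` times along both axes (nearest-neighbour).

    `segmap` is a list of rows of integer labels. Hypothetical helper, not
    part of the repository.
    """
    out = []
    for row in segmap:
        # Stretch the row horizontally.
        wide = [label for label in row for _ in range(factor)]
        # Repeat the stretched row vertically (copy so rows are independent).
        out.extend(list(wide) for _ in range(factor))
    return out


# A 2x2 patch-level map, assuming 16x16 pixel patches.
patch_map = [[0, 1],
             [1, 1]]
full = upsample_nearest(patch_map, 16)
print(len(full), len(full[0]))  # 32 32
```

Nearest-neighbour is used because the values are class labels; interpolating between them would create labels that do not exist.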
Great work, thank you!
When I run the code, the resulting eigenvectors are much worse than those from the Hugging Face API. Was the model used on Hugging Face fine-tuned on other datasets? I'm not sure what the difference is between the API provided in this project and the one on Hugging Face.
Hi,
I am following the steps for object segmentation exactly. I can see the segmentations in the folder "patch". However, when I upsample them using CRF, every output becomes an entirely black picture. Do you know what might be wrong?
Hi, @lukemelas !
Thank you very much for providing your cool work!
I have a question about matting.
In the eigenvalue calculation, you do not distinguish between hard and soft decomposition in
deep-spectral-segmentation/extract/extract.py
Line 175 in c90e382
How do I reproduce your results in Figure 6? Could you show me how?
Actually, the matting method is not implemented in https://github.com/lukemelas/deep-spectral-segmentation/blob/main/object-localization/object_discovery.py#L45.
Hi @lukemelas, fascinating work, thank you for your contribution!
While looking at the semantic segmentation results, I had several questions regarding the baselines used.
Additionally, we give results for directly clustering DINO-pretrained features masked with Deep-USPS saliency maps
Can you explain how you obtained the features for clustering?
we also train a version of MaskContrast based on a DINO-pretrained model
Hi!
Great work on unsupervised detection and segmentation. When I try to reproduce the dino-segmentation result in Table 4, I only get 19.55, which is far lower than the value reported in your paper (30.8 ± 2.7). Did I miss something? Looking forward to your reply.
Hi!
I was trying to run the semantic segmentation example, following the readme, and it failed to find the directory. After some debugging, the issue turned out to be the MODEL variable at the very start of the example, which was set to "dino_vitb16" instead of "dino_vits16" (b instead of s).
Just in case anyone had the same small issue ;)
I have been following the instructions for object segmentation, and the output is as expected until the CRF segmentation step, at which point the output images are entirely black. The masks produced in the previous step are correct, and upscaling the mask also works; however, the output from the denseCRF function is a completely black image.
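Since several reports here describe the same symptom, a quick check may help separate two cases: an output that is truly empty, and an output that holds small integer labels (for example 0 and 1) that merely look black in an image viewer. The sketch below assumes the second case is possible; it is not a confirmed diagnosis of the repository's CRF step, and both helpers are hypothetical and use plain lists in place of an image array.

```python
def unique_values(mask):
    """Return the sorted distinct values in a 2D mask (list of rows)."""
    return sorted({v for row in mask for v in row})


def stretch_for_display(mask):
    """Rescale integer labels to the 0..255 range so they are visible.

    Hypothetical helper for inspection only; it should not be applied to
    masks that are later used as labels.
    """
    top = max(unique_values(mask))
    if top == 0:
        # The mask really is empty; nothing to stretch.
        return [list(row) for row in mask]
    return [[v * 255 // top for v in row] for row in mask]


# Stand-in for a CRF output that stores labels 0 (background) and 1 (object).
crf_output = [[0, 0, 1],
              [0, 1, 1]]
print(unique_values(crf_output))  # [0, 1]
shown = stretch_for_display(crf_output)
```

If the unique values are only `[0]`, the CRF output is genuinely empty and the problem lies in the CRF inputs or parameters rather than in how the file is viewed.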
Hi!
I used ViT-base/16 pretrained with DINO to reproduce the 61.6 result in Table 2, but I only get 56.70.
I strictly followed the readme instructions. Do you have any idea why?
Looking forward to your reply.
Hi
Thanks for sharing your code.
Hi!
Thank you for your work!
I was testing your project on the VOC dataset when it broke at Step 2 because the module "pymatting" was missing. I added it to the requirements and installed it.
Very minor thing, but wanted to let you know:)