nv-nguyen / cnos
[ICCV 2023 R6D] PyTorch implementation of CNOS: A Strong Baseline for CAD-based Novel Object Segmentation based on Segmenting Anything and DINOv2
License: MIT License
Thanks for the elegant code!
After I get the segmentation results using the command "python run_inference.py dataset_name=$DATASET_NAME model.onboarding_config.rendering_type=pyrender", I get a JSON file named "CustomSamAutomaticMaskGenerator_template_pyrender0_aggavg_5_lmo.json".
When I try to evaluate it with bop_toolkit using scripts/eval_bop22_coco.py, a bug occurs due to inappropriate name splitting.
I believe the cause is this:
# From bop_toolkit's eval script: it assumes the filename pattern
# METHOD_DATASET-SPLIT[-SPLITTYPE].json. The CNOS output name above has no
# '-' inside result_info[1], so dataset_info[1] raises an IndexError.
result_name = os.path.splitext(os.path.basename(result_filename))[0]
result_info = result_name.split('_')
method = str(result_info[0])
dataset_info = result_info[1].split('-')
dataset = str(dataset_info[0])
split = str(dataset_info[1])
split_type = str(dataset_info[2]) if len(dataset_info) > 2 else None
Can you check this and tell me how to fix this ? Thanks
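One workaround (not from either repo; the helper names below are mine) is to rename the CNOS output to the METHOD_DATASET-SPLIT.json pattern that bop_toolkit parses, e.g.:

```python
import os

# bop_toolkit parses result filenames as METHOD_DATASET-SPLIT[-SPLITTYPE].json.
# These helpers are illustrative, not part of CNOS or bop_toolkit.

def bop_result_name(method: str, dataset: str, split: str = "test") -> str:
    """Build a filename bop_toolkit can parse, e.g. 'cnos_lmo-test.json'."""
    return f"{method}_{dataset}-{split}.json"

def rename_for_bop(result_path: str, method: str, dataset: str,
                   split: str = "test") -> str:
    """Rename an existing result JSON in place to the BOP pattern."""
    new_path = os.path.join(os.path.dirname(result_path),
                            bop_result_name(method, dataset, split))
    os.rename(result_path, new_path)
    return new_path
```

With the long CNOS filename renamed to something like cnos_lmo-test.json, the split logic above parses cleanly.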
Is there any plan to release the pre-computed Linemod and YCBV segmentation results? People interested in evaluating benchmarks might want to use your segmentation directly without needing to set up the code and re-run it themselves.
Hello, I have a question about using DINOv2. Could you please help me? I instantiated a vit_small ViT model and tried to load the pretrained weights using the load_pretrained_weights function from utils. Here's the code I wrote:
self.vit_model = vits.__dict__['vit_small'](patch_size=14)
load_pretrained_weights(self.vit_model, 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth', None)
However, I encountered the following error:
Traceback (most recent call last):
  File "/data/PycharmProjects/train.py", line 124, in <module>
    model = model(aff_classes=args.num_classes)
  File "/data/PycharmProjects/models/locate.py", line 89, in __init__
    load_pretrained_weights(self.vit_model, pretrained_url, None)
  File "/data/PycharmProjects/models/dinov2/dinov2/utils/utils.py", line 32, in load_pretrained_weights
    msg = model.load_state_dict(state_dict, strict=False)
  File "/home/ustc/anaconda3/envs/locate/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DinoVisionTransformer:
    size mismatch for pos_embed: copying a param with shape torch.Size([1, 1370, 384]) from checkpoint, the shape in current model is torch.Size([1, 257, 384]).
Could you please help me understand what might be causing this issue? Thank you for your assistance.
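For context on the numbers in that error (a sketch, assuming the standard ViT token layout): the dinov2_vits14 checkpoint was trained at image size 518 with patch size 14, while a model built at the default image size 224 expects far fewer position-embedding tokens.

```python
def num_pos_tokens(img_size: int, patch_size: int) -> int:
    """Positional-embedding length: one token per patch plus the [CLS] token."""
    return (img_size // patch_size) ** 2 + 1

# Checkpoint:  518 // 14 -> 37 x 37 patches + CLS = 1370 tokens
# Your model:  224 // 14 -> 16 x 16 patches + CLS = 257 tokens
```

So instantiating the model with a matching input size, e.g. vits.__dict__['vit_small'](img_size=518, patch_size=14) (parameter names as in the DINOv2 repo's vision_transformer.py), should make pos_embed line up.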
This is the result (scene 000048, image 001087) that I parsed from the released precomputed results sam_pbr_ycbv.json. Each pixel shows the object ID.
As you can see, the "chef can" is under-segmented. There are also many other false positives (objects detected where they shouldn't be). Is this the expected result, or am I missing something when parsing the json file?
Hi, can the renderer provide correspondences between pixels on a rendered image and the points on the CAD model?
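I don't know whether the repo's renderer exposes this directly, but a common workaround is to render a depth map alongside the RGB template and back-project each pixel with the camera intrinsics K; since the object pose used for rendering is known, the camera-space points map straight onto the CAD model. A sketch (all names are mine):

```python
import numpy as np

def backproject_depth(depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Return an (H, W, 3) map of camera-space 3D points for a depth image;
    pixels with depth 0 map to the origin."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1)
```

Transforming these points by the inverse of the object-to-camera pose used for rendering gives model-space coordinates, i.e. the pixel-to-CAD correspondence.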
I need to render objects with a custom rotation. When I use the script you provided to render the YCBV and HB datasets, I get normal results,
but I get strange results when rendering objects from several BOP subsets. For example:
lmo obj000008
icbin obj000001
itodd obj000008
Have you ever encountered a similar situation, or could you give some advice about what might be causing the problem? Thanks!
Could you please provide the output result of the rendering templates again? The Google Drive link you provided is invalid. Thank you.
Thank you for your great work; so far I have found it very helpful!
To run your work on a dataset I created, I wrote the attached script. For the most part, I'm getting good results, but I'm also having problems with a few predictions (missing detections, incorrect class assignments and incorrect detections), as the following visualizations show, for example:
I would be very grateful if you could, based on your experience in using the model, give me some hints on what changes I could perhaps try to improve the results a bit more.
Thank you very much!
I have some custom CAD models whose dimensions are originally on the order of 0.1 m. However, they don't appear at all in the rendered images. So I scaled the mesh by 1000x using Blender and then rendered. My rendered images are attached below: some of the regions are getting cut off. (This is mesh_001.ply in this link.) Is there a standard for scaling the meshes so that I get correct renders?
In turn, I'm also not getting any segmentation results on the test RGB image below:
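For reference, BOP-format CAD models are stored in millimetres, so a mesh authored in metres (extent ~0.1) is about 1000x smaller than the renderer expects, which matches the symptom above. A minimal unit check on a vertex array (helper names are mine; any loader that yields an (N, 3) array, e.g. trimesh, works):

```python
import numpy as np

def looks_like_metres(vertices: np.ndarray, threshold: float = 1.0) -> bool:
    """Heuristic: BOP meshes in mm span tens to hundreds of units, so an
    overall extent below `threshold` suggests the mesh is still in metres."""
    extent = vertices.max(axis=0) - vertices.min(axis=0)
    return float(extent.max()) < threshold

def metres_to_millimetres(vertices: np.ndarray) -> np.ndarray:
    """Uniformly scale metre-unit vertices to millimetres."""
    return vertices * 1000.0
```

If parts are still clipped after converting to millimetres, the camera distance in the template-rendering config may also need adjusting to fit the object's new extent, but that part is a guess.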
Hi, I was installing CNOS and found an issue with the installation: the newest version of ultralytics doesn't work with CNOS, as ultralytics.yolo is deprecated since version '8.0.136' and was later removed. Adding the '<=8.0.135' requirement solves the issue; it should be added to the README.
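The pin described above, as a one-liner (or the equivalent requirements.txt line):

```shell
pip install "ultralytics<=8.0.135"
```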
I would like to know how to perform inference on a custom dataset using FastSAM instead of regular SAM.
When using the script inference_custom.py, it does not seem possible to change the model, as regular SAM is predetermined.
If somebody could lend me a hand, I would really appreciate it. :)
Thanks in advance
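One thing worth trying (an assumption on my side, not confirmed from the repo): CNOS is Hydra-configured, and the BOP scripts select FastSAM through a model config group, so an override along these lines may also work for the custom script:

```shell
# Hypothetical Hydra override; 'cnos_fast' is the FastSAM variant name
# suggested by the repo's BOP instructions — verify it exists in configs/model/.
python run_inference_custom.py model=cnos_fast
```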
Hello once again! I return with another surgical tool that I am unable to segment. The original image and segmentation is shown below:
As you can see, absolutely nothing is detected. In order to make sure that it is not a problem with my CAD model, I masked out the entire tool to make it a single color. See below:
With the same templates, it is suddenly detected. This brings me to the conclusion that it has something to do with the SAM segmentation. I ran the tool through SAM Segment Everything and found this:
I believe the problem is that the tool has multiple colors and is segmented in a fragmented way instead of as one entire tool. Is there some way in this code to influence it to connect masks and check the combined mask for a match?
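One post-processing idea (not a feature of CNOS, purely illustrative): union SAM proposals whose masks touch, so an object fragmented into several colour-based proposals can also be scored against the templates as a single mask.

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """Grow a boolean mask by one pixel in the four axis directions."""
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]
    out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]
    out[:, :-1] |= mask[:, 1:]
    return out

def merge_touching(masks: list[np.ndarray]) -> list[np.ndarray]:
    """Greedily union boolean masks that overlap after a one-pixel dilation."""
    merged = [m.copy() for m in masks]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if np.logical_and(dilate(merged[i]), merged[j]).any():
                    merged[i] |= merged[j]
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```

The merged masks could then go through the same descriptor/matching step as the original proposals, keeping whichever variant scores higher against the templates.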
Hi,
Thank you for sharing this excellent baseline for unseen object segmentation.
I wonder whether you have plans to release the CAD-free novel object segmentation results on the BOP datasets, which are mentioned in the Discussion section?
Thank you
Thanks for the great work! I now want to get multiple objects from one image. Is there any reference code for this?
After reading the code in detail, it seems I may need to change this line so that ref_feats has a multi-object shape, but I am not confident about how to solve it:
scores = metric(descriptors[:, None, :], self.ref_feats[None, :, :])
In the end, I hope to get an all_masks np.array that includes the masks of all objects.
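A sketch of one way to extend that scoring line to multiple objects (my own illustration, not the repo's code): stack per-object reference features as (O, T, D) with T templates per object, score each proposal descriptor against every object's templates, and assign each proposal the best-matching object.

```python
import torch
import torch.nn.functional as F

def assign_objects(descriptors: torch.Tensor, ref_feats: torch.Tensor):
    """descriptors: (P, D) proposal features; ref_feats: (O, T, D) template
    features per object. Returns (scores, obj_ids), one entry per proposal."""
    d = F.normalize(descriptors, dim=-1)        # (P, D)
    r = F.normalize(ref_feats, dim=-1)          # (O, T, D)
    sim = torch.einsum("pd,otd->pot", d, r)     # (P, O, T) cosine similarities
    per_object = sim.max(dim=-1).values         # (P, O): best template per object
    scores, obj_ids = per_object.max(dim=-1)    # best object per proposal
    return scores, obj_ids
```

The all_masks array can then be built by grouping the proposal masks by obj_ids (optionally thresholding scores to drop poor matches).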
Hello, I have a question. I couldn't find where src/model/dinov2.py is referenced. In which file do you import the DINOv2 model? And have you not provided a training file?
Dear author, thanks for open-sourcing this great work. I am confused about the result on the BOP leaderboard (https://bop.felk.cvut.cz/leaderboards/detection-unseen-bop23/core-datasets/). The FastSAM result is different from that in Table 1. What causes this difference?
First of all, thank you for sharing such an outstanding piece of work. I have an issue here. I have a pair of pliers where you can observe two distinct colors on the upper and lower parts. Consequently, CNOS is only segmenting the yellow section on top. Do you have any suggestions to address this problem? I believe this might be an inherent limitation of the SAM model.
Hi, thanks for your great work. When running run_inference.py, I got an error about wandb.
wandb: WARNING `resume` will be ignored since W&B syncing is set to `offline`. Starting a new run with run id cizd5iz4.
wandb: Tracking run with wandb version 0.15.5
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
[2023-07-22 19:31:46,008][pytorch_lightning.utilities.rank_zero][INFO] - ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
Error executing job with overrides: ['dataset_name=', 'model.onboarding_config.rendering_type=pyrender']
Error in call to target 'pytorch_lightning.trainer.trainer.Trainer':
MisconfigurationException('You requested gpu: [0, 1, 2, 3]\n But your machine only has: [0]')
full_key: machine.trainer
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
wandb: Waiting for W&B process to finish... (failed 1).
wandb: You can sync this run to the cloud by running:
wandb: wandb sync ./datasets/bop23_challenge/results/cnos_exps/wandb/offline-run-20230722_193143-cizd5iz4
Do I have to use some wandb account to restore stuff?
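For what it's worth, the failure above isn't really wandb (the offline run started fine); the traceback says the Hydra machine.trainer config requests GPUs [0, 1, 2, 3] while only one is available. A heavily hedged example override; the exact key and field name are assumptions, so check configs/machine/ in the repo:

```shell
# Hypothetical override: limit the Lightning Trainer to the one visible GPU.
# Verify the actual field name (devices vs. gpus) in configs/machine/.
python run_inference.py dataset_name=$DATASET_NAME machine.trainer.devices=1
```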
Hi, thanks for your nice work!
I have some questions I hope you can help me with:
The dissimilarity I have observed primarily revolves around the number of reference images used. Could that be the main contributing factor?
Would it be possible for you to kindly provide the code necessary for replicating the baseline results? I sincerely appreciate your assistance in this matter.
Hello, thanks for the great work. While testing the repo on my custom CAD model, I ran into memory issues; the memory requirement seems to be huge. I tried setting CUDA_VISIBLE_DEVICES=0,1 before the bash command, but it still uses only one GPU. Let me know what I might be doing wrong, or a solution to this problem.
Thanks a lot for releasing the pre-computed segmentations!
However, I found that in the YCBV data the frame_ids are not complete. E.g. 0048/001012 is a keyframe in the evaluation set, but it doesn't exist in sam_pbr_ycbv.json.
I tried this code and it works beautifully on the provided datasets. Considering the shots are cluttered with all kinds of objects, there are occlusions, etc. I am extremely impressed by the performance on this.
However, as soon as I move to a custom dataset, this performance is not repeatable whatsoever. I am trying this out on surgical tools, for which I have an accurate CAD model. The images I tried it on are close-ups of the objects, the single object only, a white background, and no occlusion; in other words, as simple as it gets and technically a perfect template match. However, the model either only predicts a tiny part of the object or just something completely wrong like the entire background (everything but the object).
Could you maybe comment on the types of objects this works well on (the objects in your datasets seem a bit more bulky while the surgical tools are more skinny) or whether there are any tricks to improving this performance?