
gen6d's Introduction

Gen6D

Gen6D is able to estimate 6DoF poses for unseen objects, as demonstrated in the project's demo video.

Todo List

  • Pretrained models and evaluation code.
  • Pose estimation on custom objects.
  • Training code.

Installation

Required packages are listed in requirements.txt. To determine how to install PyTorch along with CUDA, please refer to the PyTorch documentation.
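A typical setup might look like the following (a sketch; the exact PyTorch build to install first depends on your CUDA version and should be taken from the PyTorch documentation):

# install a matching PyTorch build first (see https://pytorch.org), then:
pip install -r requirements.txt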

Download

  1. Download the pretrained models, the GenMOP dataset and the processed LINEMOD dataset here.
  2. Organize the files as follows:
Gen6D
|-- data
    |-- model
        |-- detector_pretrain
            |-- model_best.pth
        |-- selector_pretrain
            |-- model_best.pth
        |-- refiner_pretrain
            |-- model_best.pth
    |-- GenMOP
        |-- chair 
            ...
    |-- LINEMOD
        |-- cat 
            ...

Evaluation

# Evaluate on the object TFormer from the GenMOP dataset
python eval.py --cfg configs/gen6d_pretrain.yaml --object_name genmop/tformer

# Evaluate on the object cat from the LINEMOD dataset
python eval.py --cfg configs/gen6d_pretrain.yaml --object_name linemod/cat

The ADD-0.1d and Prj-5 metrics will be printed on the screen.
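For reference, the following is a minimal sketch of how the ADD-0.1d metric is commonly defined; it illustrates the standard computation and is not necessarily the exact implementation in eval.py:

import numpy as np

def add_01d(model_pts, pose_gt, pose_pr, diameter):
    # poses are 3x4 [R|t] matrices mapping object points into camera space
    pts_gt = model_pts @ pose_gt[:, :3].T + pose_gt[:, 3]
    pts_pr = model_pts @ pose_pr[:, :3].T + pose_pr[:, 3]
    # ADD: mean distance between correspondingly transformed model points
    add = np.linalg.norm(pts_gt - pts_pr, axis=1).mean()
    # the pose counts as correct if ADD is below 10% of the object diameter
    return add < 0.1 * diameter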

Qualitative results

3D bounding boxes of the estimated poses will be saved in data/vis_final/gen6d_pretrain/genmop/tformer. The ground truth is drawn in green, while the prediction is drawn in blue.

Intermediate results for detection, viewpoint selection and pose refinement will be saved in data/vis_inter/gen6d_pretrain/genmop/tformer.

This image shows detection results.

This image shows viewpoint selection results. The first row shows the input image to the selector. The second row shows the input image rotated by the estimated in-plane rotation (left column) or the ground-truth in-plane rotation (right column). The subsequent 5 rows show the predicted (left) or ground-truth (right) 5 reference images with the nearest viewpoints to the input image.

This image shows the pose refinement process. The red bbox represents the input pose, the green one represents the ground-truth and the blue one represents the output pose for the current refinement step.

Pose estimation on custom objects

Please refer to custom_object.md.

Training

  1. Download the processed co3d data (co3d.tar.gz), Google Scanned Objects data (google_scanned_objects.tar.gz) and ShapeNet renderings (shapenet.tar.gz) here.
  2. Download the COCO 2017 training set.
  3. Organize the files as follows:
Gen6D
|-- data
    |-- GenMOP
        |-- chair 
            ...
    |-- LINEMOD
        |-- cat 
            ...
    |-- shapenet
        |-- shapenet_cache
        |-- shapenet_render
        |-- shapenet_render_v1.pkl
    |-- co3d_256_512
        |-- apple
            ...
    |-- google_scanned_objects
        |-- 06K3jXvzqIM
            ...
    |-- coco
        |-- train2017
  4. Train the detector
python train_model.py --cfg configs/detector/detector_train.yaml
  5. Train the selector
python train_model.py --cfg configs/selector/selector_train.yaml
  6. Prepare the validation data for training the refiner
python prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database linemod/cat \
                  --que_split linemod_val \
                  --ref_database linemod/cat \
                  --ref_split linemod_val

python prepare.py --action gen_val_set \
                  --estimator_cfg configs/gen6d_train.yaml \
                  --que_database genmop/tformer-test \
                  --que_split all \
                  --ref_database genmop/tformer-ref \
                  --ref_split all 

This command will generate the information in data/val, which will be used to produce validation data for the refiner.

  7. Train the refiner

python train_model.py --cfg configs/refiner/refiner_train.yaml
  8. Evaluate all components together.
# Evaluate on the object TFormer from the GenMOP dataset
python eval.py --cfg configs/gen6d_train.yaml --object_name genmop/tformer

# Evaluate on the object cat from the LINEMOD dataset
python eval.py --cfg configs/gen6d_train.yaml --object_name linemod/cat

How to make a GenMOP object for evaluation

The process of making the GenMOP dataset is described as follows:

  1. Run SfM on the reference sequence using COLMAP.
  2. Run SfM on the test sequence using COLMAP. Note that the test sequence for evaluation needs to be captured in a static scene.
  3. Manually label at least 4 keypoints on two images from the reference sequence and label the same 4 keypoints on two images from the test sequence. For example, we label 4 keypoints on frame40.jpg and frame620.jpg from the reference sequence of the TFormer object, stored in align-data/tformer-anno/ref-frame40(620), and we label the same 4 keypoints on frame130.jpg and frame540.jpg from the test sequence, stored in align-data/tformer-anno/test-frame130(540).
  4. Compute the alignment poses and scale for the two sequences and save the results in align.pkl in tformer-test/. We provide an example in compute_align_poses.py; a simplified sketch follows this list.
  5. With align.pkl, you may use the GenMOP database via parse_database_name('genmop/tformer-ref') and parse_database_name('genmop/tformer-que').
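For illustration, the alignment between the two SfM reconstructions is a similarity transform; below is a minimal sketch of the standard closed-form (Umeyama) solution between corresponding 3D keypoints. compute_align_poses.py is the authoritative reference, and this function is only an assumption of the general approach:

import numpy as np

def umeyama(src, dst):
    # similarity transform (scale s, rotation R, translation t) such that
    # dst ≈ s * R @ src + t, for corresponding (N, 3) point sets
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # avoid a reflection
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / ((src_c ** 2).sum() / len(src))
    t = mu_d - s * (R @ mu_s)
    return s, R, t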

We use the annotation tool from https://github.com/luigivieira/Facial-Landmarks-Annotation-Tool to label keypoints for the GenMOP dataset.

Acknowledgements

In this repository, we have used code and datasets from the following repositories. We thank all the authors for sharing their great code and datasets.

We provide a paper list about recent generalizable 6-DoF object pose estimators at https://github.com/liuyuan-pal/Awsome-generalizable-6D-object-pose.

Citation

@inproceedings{liu2022gen6d,
  title={Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images},
  author={Liu, Yuan and Wen, Yilin and Peng, Sida and Lin, Cheng and Long, Xiaoxiao and Komura, Taku and Wang, Wenping},
  booktitle={ECCV},
  year={2022}
}


gen6d's Issues

Pose output

Hi,

This is a very interesting paper and I was just trying out the code for custom objects. Does it output the 6DoF pose of the object being tracked in the video, for example in the mouse video?

Rotate volume to align with the input pose

Dear author,
Thanks for your creative work. I encountered an issue when reading your code, concerning the construction of the feature volume. Specifically, I can't understand why you first rotate the volume to align with the input pose before projecting the volume points onto the query image and reference images based on the input pose and reference poses.

Contents of align of genmop dataset

Dear Liu Yuan,
In each test folder of the GenMOP dataset there is an align.pkl file. I would like to know what each value in the file represents and how it was generated.

Code error on custom test

The line img = self.img_norm(imgs) in network/detector.py reports: ValueError: Expected tensor to be a tensor image of size (C, H, W). Got tensor.size() = torch.Size([32, 3, 120, 120]).
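For context, this error typically comes from older torchvision versions whose Normalize transform only accepts a single (C, H, W) image rather than a batch. A hedged workaround sketch, assuming img_norm is such a transform, is to normalize each image in the batch individually (upgrading torchvision is the other option):

# inside network/detector.py, replacing img = self.img_norm(imgs):
import torch
imgs = torch.stack([self.img_norm(img) for img in imgs])  # per-image normalize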

Training speed way too slow

Hi! Thanks for your work and sharing!

I installed all the requirements and tried training the view selector on an RTX 2080Ti and a V100.
I changed the batch size to 10 and 64 for the two GPUs respectively, and changed the total iterations to 60000 and 10000 accordingly.
Both GPUs show very slow iteration speeds, as below.
I used my own synthetic data (just one object, formatted almost the same as LINEMOD).

  1. RTX 2080Ti (batch size = 10): 14/60000 [06:52<489:44:52, 29.39s/it, loss=0.879, lr=0.0001]
  2. V100 (batch size = 64): 86/10000 [1:25:35<163:57:49, 59.54s/it, loss=0.572, lr=0.0001]

Which versions of CUDA/cuDNN/PyTorch/torchvision/PyTorch3D do you use?
Thank you!

Pose shaking when doing pose estimation on custom objects

Hi, I followed the instructions under "Pose estimation on custom objects" and tried the code on Ubuntu 18.04 with the mouse example. When I ran predict.py, the result was wrong: the estimated pose was shaking around the mouse instead of staying fixed on it.
I tried downloading the mouse_processed folder and running it directly. I also tried manually processing the reference video with COLMAP and cropping with CloudCompare, but both finally gave the same result. Could you help with that? I used Python 3.9 in Anaconda with PyTorch 1.12.1+cu102.


Multiple objects and performance

Hi,

Thank you for your interesting work.
I noticed from the videos on the project page that this pose estimator should be able to work for multiple objects at the same time. How would we have to adapt custom_object.md to achieve that?

Also, do you see a big drop off in fps when training the model to detect multiple objects at the same time?

colmap issue

Excuse me, I have a question about the instructions in 'Pose estimation on custom objects'. For the command
'python prepare.py --action sfm --database_name custom/mouse --colmap <path-to-your-colmap-exe>', I could not find the path to my COLMAP executable. I installed COLMAP on Ubuntu 18.04. Could you give me a hand?
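For reference, on Ubuntu a system-wide COLMAP install can usually be located with which colmap; that path (e.g. /usr/bin/colmap), or simply the name colmap if it is on PATH, is what the --colmap flag expects. A hedged example, assuming a system-wide install:

# locate the executable, then pass it to prepare.py
which colmap
python prepare.py --action sfm --database_name custom/mouse --colmap colmap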

custom error

Hi, when I try to use this network to estimate pose and scale for co3d's car class, I get an error. Although I tried with the provided mouse sample, it still doesn't work.

> Hi, @CharlesHuan, it seems that your `object_point_cloud.ply` has some problem, so the generated reference images are not centered on the object (we use `object_point_cloud.ply` to specify the size and center of the object you want to track): ![image](https://user-images.githubusercontent.com/23974453/183577051-a511a512-4cc0-4e0e-8b62-737c02dbdd48.png) The reference images should look like: ![image](https://user-images.githubusercontent.com/23974453/183577525-20d1e365-7428-4ef1-8b3b-7bd9bb7d19b0.png)

You may open both pointcloud.ply and object_point_cloud.ply in the same CloudCompare window to see whether they are aligned.

Thanks for your reply! Through your prompts, I found a problem in generating the database: the dense cloud generated by the GUI I used may have caused the database to be misaligned with the generated cloud. The problem has now been solved, thanks again!

Originally posted by @CharlesHuan in #12 (comment)

Question about the poor performance on custom data

Thank you so much for this great work. I also tried configuring Gen6D and successfully applied it to the mouse data provided.

However, when I tried to apply Gen6D to my own collected data, the results were not satisfying, as the detector could not work properly.

There are some previous trials in #4, #11 and #24. I noticed that you mentioned the issue of z-flip or the reference target being too small, so I was careful with those factors. I also tried putting an ArUco tag in the background. I once thought my target object might be too hard, so I changed to a mouse like your demo one. But Gen6D still did not work on my own data.

A sample is attached; my segmentation and labeling should be good.

Do you have any thoughts on the reasons for the failure cases? I wonder if I have to print out another ArUco tag (denser than the one I am using right now), similar to the one used in your demo mouse video or #24 (comment). Have you tried capturing the registration video without the ArUco tag board?

Thanks a lot.

Hardware Requirements

Hello, first of all, thank you for sharing such excellent work. However, I found some minor problems during operation. First, how much GPU memory is needed to run this project? When I ran the first example, I ran out of GPU memory. Second, for the custom object, COLMAP showed the warning "No images with matches found in the database." What kind of problem is that? Looking forward to your reply @liuyuan-pal

ShapeNet renderings (shapenet.tar.gz) downloading problem

When I download the ShapeNet renderings (shapenet.tar.gz) from the OneDrive link, the download always fails partway through and I have to cancel it. I wonder whether there are other ways to obtain the corresponding data. (The other files in the OneDrive link can be downloaded smoothly.)

Custom Object X and Z direction

Hello, I am trying to get the forward and up directions for x and z (step 6 of custom objects). I am not sure how to get the values for meta_info.txt. In your example, the generated numbers were:

2.396784 -0.617233 0.282476
-0.0452898 -0.550408 -0.833667

However, when I look at the numbers for the x forward direction and the z plane, nothing matches those values.

Thanks

About the intermediate results

Hello author! Glad to find such an interesting project. I'm currently doing similar research, but I'm just getting started.

At present, I can complete the whole process with the mouse custom data you provided, and the result is as expected. However, the performance on the data I captured myself was not very satisfying.

The reference images that match the query image are transposed.

But the second half of the test video is as expected. Why?
The result video is here:
https://user-images.githubusercontent.com/114153764/195965944-60b5bdf2-ed43-4f88-ac75-edbe353fa665.mp4

Looking forward to your reply!

Inference on custom dataset producing key reference error

python predict.py --cfg configs/gen6d_pretrain.yaml --database custom/kappa --video data/custom/video/kappa-test.MOV --resolution 960 --transpose --output data/custom/kappa/test --ffmpeg ffmpeg
/home/fftai/anaconda3/envs/gen6d/lib/python3.7/site-packages/torchvision/models/_utils.py:136: UserWarning: Using 'weights' as positional parameter(s) is deprecated since 0.13 and may be removed in the future. Please use keyword parameter(s) instead.
f"Using {sequence_to_str(tuple(keyword_only_kwargs.keys()), separate_last='and ')} as positional "
/home/fftai/anaconda3/envs/gen6d/lib/python3.7/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=VGG11_BN_Weights.IMAGENET1K_V1. You can also use weights=VGG11_BN_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
/home/fftai/anaconda3/envs/gen6d/lib/python3.7/site-packages/torch/cuda/__init__.py:497: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
load from detector_pretrain/model_best.pth step 240000
load from selector_pretrain/model_best.pth step 270000
load from refiner_pretrain/model_best.pth step 260000
Traceback (most recent call last):
File "predict.py", line 97, in <module>
main(args)
File "predict.py", line 32, in main
estimator.build(ref_database, split_type='all')
File "/home/fftai/working/6D Pose Estimation/Gen6D-main/estimator.py", line 145, in build
ref_ids = select_reference_img_ids_fps(database, ref_ids_all, self.cfg['ref_view_num'])
File "/home/fftai/working/6D Pose Estimation/Gen6D-main/utils/database_utils.py", line 115, in select_reference_img_ids_fps
poses = [database.get_pose(ref_id) for ref_id in ref_ids_all]
File "/home/fftai/working/6D Pose Estimation/Gen6D-main/utils/database_utils.py", line 115, in <listcomp>
poses = [database.get_pose(ref_id) for ref_id in ref_ids_all]
File "/home/fftai/working/6D Pose Estimation/Gen6D-main/dataset/database.py", line 290, in get_pose

pytorch version issue

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Hello, first of all, thank you for sharing such excellent work. This error occurred when I was running the code; it may be caused by a different version of PyTorch. I used 10.1. Which version do you use?

custom dataset

Hi author,
It's amazing work. Can the prediction module be embedded in ROS? I used my own custom dataset and the results are not very ideal. It is not clear whether the problem is with the definition of the X and Z directions.


Error on training code

Thank you for sharing your great work!
I found an error when training the detector with

python train_model.py --cfg configs/detector/detector_train.yaml

which returns

Exception has occurred: AttributeError
'GoogleScannedObjectDatabase' object has no attribute 'model_name'

on line 476 of dataset/database.py.

Actually, I found that self._get_diameter() on line 464 requires self.model_name, which has not been declared yet.
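A hedged fix sketch; the actual constructor of GoogleScannedObjectDatabase in dataset/database.py may differ, so the assignment below is only an assumption about where the attribute should be set:

# in GoogleScannedObjectDatabase.__init__, set model_name before it is used
self.model_name = model_name          # hypothetical: taken from the database name
self.diameter = self._get_diameter()  # _get_diameter() reads self.model_name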

Also, how much time does it take to train each of the detector, selector, and refiner with your settings?

Any help would be appreciated.

how to run it faster

Can the model be converted into ONNX format to run faster? Looking forward to your reply.
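For illustration, a minimal sketch of exporting a single PyTorch module to ONNX; the module and input shape here are hypothetical stand-ins, and the Gen6D networks take several inputs, so each component would likely need its own export:

import torch
import torchvision

# stand-in module; replace with a Gen6D component and its real input shapes
model = torchvision.models.vgg11_bn().eval()
dummy = torch.randn(1, 3, 128, 128)
torch.onnx.export(model, dummy, 'model.onnx', opset_version=11)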

CloudCompare error

Hello, thanks for your excellent work. When I tried the method on a custom object, I got the following error from CloudCompare. Have you ever met this error? Thanks!

Realtime inference

Hello,

Is there an easy way to compute the pose of a custom object in real time, for example from a webcam stream?

Thanks!

How to leverage pre-made CAD model for Custom Object training

Hey @liuyuan-pal, thanks for publishing this amazing research along with the code.

I was wondering if there is any way we can use the CAD model instead of the reconstruction step using COLMAP. I already have a high-quality obj and can convert it to a point cloud. I want to do this because reconstruction using COLMAP is not very robust. Will that be more accurate? If yes, how can I do that?

Also, is there a way to completely replace the custom data pipeline of creating a video, poses, etc. with a third-party data generator using the CAD model?

Thanks!

Question about training settings

Hi, authors!

Thank you for your inspiring work! However, I have a question about the training settings.

In the paper, you mentioned that you train the network on ShapeNet and Google Scanned Objects, but I'd like to know more details. For example, how do you generate the training data, how many epochs do you train the network for, and how long does it take?

Looking forward to your reply!

About the 3D translation

Hi, authors!

Thanks for your work!

I got the pose output shown above. Is the last column the 3D translation of the object in the camera coordinate system? I can draw the correct bounding box in the picture, but I don't think the 3D coordinates are correct, especially the depth z. What is the reason? And what is the unit of the 3D translation?

Looking forward to your reply!
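For reference, a sketch of the usual interpretation of a 3x4 object-to-camera pose matrix; this is an assumption about the output format rather than confirmed documentation, and for COLMAP-reconstructed custom objects the translation unit would be the arbitrary scale of the reconstruction:

import numpy as np

pose = np.eye(3, 4)  # placeholder 3x4 [R|t] pose for illustration
# [R|t] maps object coordinates into camera coordinates: x_cam = R @ x_obj + t
R, t = pose[:, :3], pose[:, 3]
depth = t[2]  # depth of the object origin along the camera z-axis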

meta_info

Hi,

I was trying to make my own custom object to track and was following your tutorial. I have two questions:

  1. In your mouse-ref file, you had a mouse on an ArUco board. Do I need an ArUco board as well when I am recording my custom object? If yes, can I just use any ArUco board online for this purpose?
  2. I am confused about how to get the meta_info using CloudCompare. It says to get the x+ and z+ directions from the plane point cloud. I was wondering how to find those, since I do not see those values in the image. Are those values in the properties panel or the console of CloudCompare?


The size of the bounding box recognized by the Detector

Dear Liu Yuan,
I have a question about the detector. You detect the center coordinates of the object and the scale of the bounding box. I think the detected bounding box is slightly larger than the object. If I crop with a box that is exactly the same size as the object, is there any bad effect?

About selected objects from GSO dataset

Hi, authors!

Thanks for your work.
I've noticed that you chose some models from the GSO dataset for training. I'm wondering if you picked those models on purpose or selected them randomly. Also, the model names downloaded from OneDrive are different from those at the official link; could you please offer me a list of the selected raw model names?

RSVP.

Issue with provided datasets

Dear authors,

thanks for your great work. I have a problem extracting the datasets. I've tried to decompress the GenMOP and Google Scanned datasets from your link, and both fail with "tar: Unexpected EOF in archive". I'm wondering if this happens during downloading or because of the compressed data itself. Could you comment on that? Thanks!

colmap reports "invalid texture reference" when running patch_match_stereo

When running python prepare.py --action sfm --database_name custom/mouse --colmap, the earlier steps (feature_extractor, exhaustive_matcher, mapper, image_undistorter) all run to completion, but it fails at patch_match_stereo. It seems the error occurs when the GPU is invoked, but I don't know how to solve it.


Custom object : How to filter out bad detections ?

Hi, I managed to run Gen6D live with a camera; the performance is not so bad with my laptop's 3070 Ti, by the way!

An issue I have is that if the object I want to detect is not in the frame, the model still outputs a guess, while I would like it to know that the object is not in the scene.

I looked at the detection and selection scores. I can see a difference in the highest scores between frames where the object is present and frames where it is not, but it is quite small. Moreover, I sometimes get higher scores on frames with no object than on frames containing the object, so I don't think I can just apply a threshold on the scores.

Is there something I missed in the code? Can the detector tell if it did not find a strong object candidate?

Thank you!

colmap : WARNING: No images with matches found in the database.

Hi there, I am not able to run the COLMAP code.

OS: Ubuntu 20

Here is the error:
'''

Loading cameras... 146 in 0.000s
Loading matches... 0 in 0.001s
Loading images... 146 in 0.000s (connected 0)
Building correspondence graph... in 0.000s (ignored 0)

Elapsed time: 0.000 [minutes]

WARNING: No images with matches found in the database.

ERROR: failed to create sparse model
Traceback (most recent call last):
  File "prepare.py", line 98, in <module>
    build_colmap_model_no_pose(parse_database_name(args.database_name),args.colmap_path)
  File "/home/fftai/working/6D Pose Estimation/Gen6D-main/colmap_script.py", line 104, in build_colmap_model_no_pose
    run_sfm(colmap_path, sparse_model_path, database_path, image_path)
  File "/home/fftai/working/6D Pose Estimation/Gen6D-main/colmap_script.py", line 24, in run_sfm
    subprocess.run(cmd, check=True)
  File "/home/fftai/anaconda3/envs/gen6d/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['colmap', 'mapper', '--database_path', 'data/custom/mouse/colmap/database.db', '--image_path', 'data/custom/mouse/colmap/images', '--output_path', 'data/custom/mouse/colmap/sparse']' returned non-zero exit status 1.

'''

Additional error from my run:
'''
Name: 139.jpg
ERROR: Failed to extract features.
FilterH: no kernel image is available for execution on the device
FilterV: no kernel image is available for execution on the device
(the two lines above repeat many times)
PyramidCU::GenerateFeatureList: no kernel image is available for execution on the device

'''
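"no kernel image is available for execution on the device" usually means the COLMAP build does not include CUDA kernels for this GPU architecture, so feature extraction produces nothing and the mapper then finds no matches. A hedged workaround, assuming the standard COLMAP CLI flags, is to run SIFT on the CPU (or rebuild COLMAP for the right GPU architecture):

# run SIFT extraction on the CPU to bypass the GPU kernel mismatch
colmap feature_extractor --database_path data/custom/mouse/colmap/database.db \
                         --image_path data/custom/mouse/colmap/images \
                         --SiftExtraction.use_gpu 0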

How do I choose the forward (x+) direction?

Hello, I read the manual and learned the steps to generate the forward (x+) direction. But when I operate, I find that I can generate many possible values for the forward (x+) direction, right? So what I want to confirm is whether the forward (x+) direction can be chosen arbitrarily, or whether it is unique.

Object textures

Hi,

I was just playing around with the code a bit more to see which objects pose estimation works on. It seems that objects that are either too small (like Lego pieces) or too big lose their texture for some reason.

I have two questions:

  1. You mentioned earlier to use ArUco markers for better texture; however, I am unable to get enough texture in a 1-minute video for the object I was testing. Any suggestions on how to tackle this?
  2. Thoughts on detecting Lego pieces? I.e., can I do motion reconstruction on a single Lego piece?


PermissionError: [WinError 5] Access is denied

Dear Author

Thank you very much for your great and amazing work.

I get the following error when I run this line of code:

python prepare.py --action sfm --database_name custom/Drill2 --colmap C:\Users\User\Desktop\colmap\COLMAP-3.7-windows-no-cuda\bin


I have also tried the CUDA version, but I have the same issue.

Please help me in this case. Thanks in advance.

Custom object : KeyError in database.py get_pose()

Hello,

I think I followed the instructions for custom objects quite accurately, but when I try to run predict.py like this:

python3 predict.py --cfg configs/gen6d_pretrain.yaml --database custom/pince --video data/custom/video/pince_test.mp4 --resolution 960 --transpose --output data/custom/pince/test --ffmpeg /usr/bin/ffmpeg

I get the following error:

Traceback (most recent call last):
  File "predict.py", line 99, in <module>
    main(args)
  File "predict.py", line 34, in main
    estimator.build(ref_database, split_type='all')
  File "<...>/Gen6D/estimator.py", line 147, in build
    ref_ids = select_reference_img_ids_fps(database, ref_ids_all, self.cfg['ref_view_num'])
  File "<...>/Gen6D/./utils/database_utils.py", line 115, in select_reference_img_ids_fps
    poses = [database.get_pose(ref_id) for ref_id in ref_ids_all]
  File "<...>/Gen6D/./utils/database_utils.py", line 115, in <listcomp>
    poses = [database.get_pose(ref_id) for ref_id in ref_ids_all]
  File "<...>/Gen6D/dataset/database.py", line 282, in get_pose
    return self.poses[img_id].copy()

When I print the keys of self.poses, the keys '2', '14', '20', and '25' are missing:

dict_keys(['0', '1', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '15', '16', '17', '18', '19', '21', '22', '23', '24', '26', '27', '28', '29', '30', '31', '32', '33', '34'])
dict_keys(['0', '1', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '15', '16', '17', '18', '19', '21', '22', '23', '24', '26', '27', '28', '29', '30', '31', '32', '33', '34'])
dict_keys(['0', '1', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '15', '16', '17', '18', '19', '21', '22', '23', '24', '26', '27', '28', '29', '30', '31', '32', '33', '34'])

Any ideas why?

Thanks!
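A likely cause (an assumption, not a confirmed diagnosis) is that COLMAP failed to register those four images, so they have no pose in the database. A minimal workaround sketch would be to keep only the reference ids that actually have poses before the farthest-point sampling:

# hypothetical filter before select_reference_img_ids_fps in estimator.py;
# assumes the database exposes its pose dict as `poses` (as in the traceback)
ref_ids_all = [i for i in ref_ids_all if i in database.poses]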

Custom Model Training without Colmap

Hey @liuyuan-pal, thanks for publishing code for this amazing repo

Can I use the custom detector without the COLMAP process? I have the following ground-truth data: the mesh, synthetic RGB images, depth images, masks, and ground-truth poses. Basically, I have used BlenderProc to get the data.

The reason for not using COLMAP is the depth ambiguity due to multi-view reconstruction, as you mentioned in one of the issues.

Why not use cropped images when training the selector?

After reviewing your code, I found that it uses the whole original image when training the selector.
Is there a reason why it doesn't use object-centric cropped images?

Also, my custom dataset is generated by a 3D modeling tool (so it's not real data),
and the world environment of the object keeps changing (the background image and light color/intensity change frame by frame) while the virtual camera spins.
Does this kind of augmentation have a bad impact on training the selector? I think your model already uses the COCO dataset to augment the background when training on custom data, so isn't it OK?

Thank you!

About meta_info

Dear authors,

thanks for your great work!

I was trying to make my own custom object to track, following your tutorial, and I have some questions. I wrote my own meta_info.txt based on the documented format, but the visualizations I got weren't good enough. However, directly downloading the reference images you processed (mouse_processed.tar.gz) gives good visualizations.

So I guess the problem is in my meta_info.txt. I need some help T-T

Custom Object prediction - "Attempting to deserialize object on a CUDA device" exception

Hello,

First of all thank you for the work provided here.
I have been following the manual step by step to add custom objects, and everything has been OK, but when launching the last prediction command I encounter this error:


raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
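For reference, the change the error message asks for looks like the following sketch, applied wherever the checkpoints are loaded; the path is one of the pretrained models from the repository layout above:

import torch

# map CUDA-saved checkpoints onto the CPU when no GPU is available
state = torch.load('data/model/detector_pretrain/model_best.pth',
                   map_location=torch.device('cpu'))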

I have tried the suggested CPU option, but it doesn't work either.

I would say that I have everything installed correctly; I have followed the main instructions and the requirements.

Any idea what could be going on? Thank you very much!
