jac99 / MinkLocMultimodal
MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition
License: MIT License
I have a few questions regarding the KITTI dataset.
I look forward to your reply. Thank you!
Hi, thanks for your great work on multi-modal fusion for place recognition.
However, the Oxford RobotCar dataset has been unavailable since 2022-11-08, and there is no sign of the download re-opening. Could you please provide the centre RGB images via Google Drive?
Thanks in advance.
According to your paper, the database and query split of the KITTI dataset was as follows:
We take Sequence 00 which visits the same places repeatedly and construct the reference database using the data gathered during the first 170 seconds. The rest is used as localization queries.
However, when I tried this split (sequence 00 -> database: first 170 s, query: the rest, i.e. roughly 170~470 s), most queries did not have a nearby database element. Could you please explain this setting in more detail, or send me the code?
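For reference, here is a minimal sketch of how such a split could be checked, assuming KITTI odometry sequence 00 ground-truth poses and the standard times.txt timestamps; this is not the authors' code, and the 5 m success threshold is an illustrative assumption:

```python
import numpy as np

# Sketch of the split described above, assuming KITTI odometry sequence 00
# ground-truth poses (poses/00.txt) and per-scan timestamps (times.txt).
poses = np.loadtxt('poses/00.txt').reshape(-1, 3, 4)   # one 3x4 pose per scan
timestamps = np.loadtxt('sequences/00/times.txt')      # seconds from sequence start

db_mask = timestamps <= 170.0                # first 170 s -> reference database
db_xy = poses[db_mask][:, [0, 2], 3]         # planar (x, z) positions in camera frame
query_xy = poses[~db_mask][:, [0, 2], 3]     # remaining scans -> localization queries

# Distance from each query to its nearest database element; a query is
# localizable only if some database scan lies within the threshold.
dists = np.linalg.norm(query_xy[:, None, :] - db_xy[None, :, :], axis=2)
print('queries with a database element within 5 m:', (dists.min(axis=1) < 5.0).mean())
```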
Hi, in your paper "As point coordinates in the Oxford RobotCar dataset are normalized to be within [−1,1] range, this gives up to 200 voxels in each spatial direction." May I know what algorithm was used to get the normalized point cloud between -1 and 1?
Hello, I have another question. When I read a point cloud .bin file, I found that the values inside are very small, all less than 1. What preprocessing has been applied? I look forward to your reply. Thank you!
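For anyone else wondering: the benchmark clouds (from the PointNetVLAD benchmark) are typically already ground-removed, downsampled to 4096 points, and rescaled into [-1, 1]. A minimal sketch of such a normalization, as an assumption about the exact procedure rather than the authors' code:

```python
import numpy as np

def normalize_cloud(points: np.ndarray) -> np.ndarray:
    # Shift to zero mean and rescale so every coordinate lies in [-1, 1].
    # This mirrors the preprocessing described for the PointNetVLAD benchmark
    # clouds; the exact upstream procedure may differ.
    centered = points - points.mean(axis=0)
    return centered / np.abs(centered).max()

# The benchmark .bin files store 4096 points as float64 (x, y, z),
# already normalized, which is why all values are smaller than 1.
cloud = np.fromfile('example.bin', dtype=np.float64).reshape(-1, 3)
print(cloud.min(), cloud.max())
```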
Hello, thanks for your great work.
I have run your training code and found something confusing. The image loss is considerably larger than the point cloud loss (around 100 times), which the paper also mentions as an overfitting problem. Could you please explain this in more detail? Why are there fewer active triplets for the RGB image modality than for the 3D modality, and does an active triplet correspond to num_non_zero_triplets in the code? resnetFPN is already pretrained. Is the large training loss mainly caused by the huge difference in illumination across traversals in the Oxford RobotCar dataset? It is also a bit worrying that the image loss goes down even when its weight (beta) is set to 0.0. Looking forward to your reply. Thanks.
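For context, a minimal sketch of what "active" (non-zero) triplets means in a standard batch-hard triplet loss; the formulation is generic, not copied from the repository:

```python
import torch

def batch_hard_triplet_stats(dist_ap, dist_an, margin=0.2):
    # dist_ap: anchor-positive distances, shape (B,)
    # dist_an: anchor-negative distances, shape (B,)
    # A triplet is "active" (non-zero) when it still violates the margin,
    # i.e. d(a, p) - d(a, n) + margin > 0; satisfied triplets contribute 0.
    losses = torch.clamp(dist_ap - dist_an + margin, min=0.0)
    num_non_zero_triplets = (losses > 0).sum()
    return losses.mean(), num_non_zero_triplets
```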
Hello~ Thanks for your work!
I saw that your paper reports results using only RGB images, but I could not find the corresponding weight file or run instructions. Could you tell me the steps to reproduce this?
Thanks for the work!
I'm a beginner with point clouds. When I run generate_rgb_for_lidar.py, it seems that lidar2image_ndx.pickle is required, but I cannot figure out where it is first generated: the error is raised before the function create_lidar2img_ndx has been run. Could you please give me a hint about that?
Thanks for your great work.
I didn't see inference time discussed in the paper. How does MinkLoc++ inference efficiency compare to MinkLoc3D (22 ms per cloud)?
Hi,
Thanks for your nice work and shared code.
To get the image corresponding to each lidar point cloud, I followed your instructions and ran the generate_rgb_for_lidar.py script first. But the code shown below confuses me: may I know what lidar2image_ndx_path, lidar2image_ndx.pickle and pickle..., etc. are? Do I need to run other scripts to get these files?
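As an illustration only (the actual structure built by create_lidar2img_ndx in the repository may differ), such an index is plausibly a pickled dict mapping each lidar scan timestamp to the timestamps of images captured close in time; the image timestamps below are made up:

```python
import pickle

# Hypothetical illustration of a lidar-to-image index: one lidar timestamp
# mapped to nearby camera image timestamps. Not the repository's real format.
lidar2image_ndx = {1435937763823973: [1435937763812000, 1435937763874000]}

with open('lidar2image_ndx.pickle', 'wb') as f:
    pickle.dump(lidar2image_ndx, f)

# Presumably the script then loads it from a path like lidar2image_ndx_path:
with open('lidar2image_ndx.pickle', 'rb') as f:
    ndx = pickle.load(f)
```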
Hello~ Thanks for your work!
How can I reproduce the result of Table 3 in your paper? Can you provide the script that finds, for each image in the RobotCar Seasons dataset, the LiDAR readings with corresponding timestamps in the original RobotCar dataset?
Hello!
I ran your code and found that the loss becomes NaN after 40 epochs. It is caused by the values of all embeddings becoming NaN. Did you encounter this during your research? Is there a bug, or is it just an accident?
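For anyone debugging the same symptom, a minimal sketch of how to track down where NaNs first appear, using standard PyTorch tools (generic debugging advice, not a fix specific to this repository):

```python
import torch

# Raise an error at the first backward op that produces NaN/Inf, instead of
# discovering the problem only when the loss turns NaN many epochs later.
torch.autograd.set_detect_anomaly(True)

def check_embeddings(embeddings: torch.Tensor, step: int):
    # Fail fast as soon as any embedding value becomes non-finite.
    if not torch.isfinite(embeddings).all():
        raise RuntimeError(f'non-finite embeddings at step {step}')

# A common mitigation while the root cause is investigated (assumes `model`):
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```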
Hi, thanks for your great work.
The pre-processed and downsampled RobotCar images are unavailable here. Could you please re-upload the pre-processed images via Google Drive?
Thanks in advance.
Hi, I wanted to recreate the generalisation results on the KITTI dataset that you mention in the paper. I would appreciate any advice on how to run the model on KITTI.
I've tried to implement validation during training and ran into a problem: in the validation phase the script returns an error:
AssertionError: Unknown lidar timestamp: 1435937763823973
I've checked the index generation script and found that there is a bug in the code:
I am pretty sure that it should be ts, traversal = get_ts_traversal(val_queries[e].rel_scan_filepath)
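To make the suspected issue concrete, a sketch under the assumption that the index-generation script reads timestamps from the wrong query list; the surrounding loop is reconstructed, not copied from the repository:

```python
for e in range(len(val_queries)):
    # Suspected bug: timestamps taken from a different query list, so
    # validation-only lidar timestamps never enter the index and later raise
    # "Unknown lidar timestamp: ...".
    # ts, traversal = get_ts_traversal(train_queries[e].rel_scan_filepath)

    # Fix suggested in this issue:
    ts, traversal = get_ts_traversal(val_queries[e].rel_scan_filepath)
```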
I don't know if it's just me, but I am using it to map lidar scans to images of the Oxford dataset, and I end up with a very low number of samples per run. I suppose the mapping doesn't map every point cloud to an image.
When evaluating the KITTI dataset, how did you map the point cloud to the image?
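For reference, the standard way to associate KITTI velodyne points with camera pixels uses the calibration matrices shipped with the dataset; a minimal sketch of that usual recipe (not necessarily what the authors did):

```python
import numpy as np

def project_velo_to_image(points, Tr_velo_to_cam, R0_rect, P2):
    # points:          (N, 3) velodyne coordinates
    # Tr_velo_to_cam:  (3, 4) velodyne -> camera transform from the calib file
    # R0_rect:         (3, 3) rectifying rotation
    # P2:              (3, 4) projection matrix of the left colour camera
    # Standard KITTI recipe: x_img ~ P2 @ R0_rect @ Tr_velo_to_cam @ x_velo
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    cam = R0_rect @ (Tr_velo_to_cam @ pts_h.T)                  # (3, N)
    in_front = cam[2] > 0                                       # points ahead of the camera
    img = P2 @ np.vstack([cam, np.ones((1, cam.shape[1]))])     # (3, N)
    uv = (img[:2] / img[2]).T                                   # (N, 2) pixel coordinates
    return uv, in_front
```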
Hi,
My question is about the way you generate batches, in samplers.py:
MinkLocMultimodal/datasets/samplers.py, line 92 at commit 683ef1a.
Hi, I am confused about positives_masks and negatives_masks. Could you please explain them in more detail?
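For context, a minimal sketch of how such boolean masks are typically built for a batch in metric-learning pipelines; this is an assumption about their meaning, and the .positives / .non_negatives attributes are illustrative, not the repository's exact API:

```python
import torch

def build_masks(batch_ids, queries):
    # positives_mask[i, j] is True when element j is a true positive for
    # anchor i (e.g. captured within a few metres of the same place);
    # negatives_mask[i, j] is True when j is certainly a different place.
    # Elements that are neither (an uncertain middle zone) get False in both.
    n = len(batch_ids)
    positives_mask = torch.zeros((n, n), dtype=torch.bool)
    negatives_mask = torch.zeros((n, n), dtype=torch.bool)
    for i, a in enumerate(batch_ids):
        for j, b in enumerate(batch_ids):
            if i == j:
                continue
            if b in queries[a].positives:
                positives_mask[i, j] = True
            elif b not in queries[a].non_negatives:
                negatives_mask[i, j] = True
    return positives_mask, negatives_mask
```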
Hi, in your work you used the 2D LiDAR from Oxford RobotCar (2 x SICK LMS-151 2D LiDAR, 270° FoV, 50 Hz, 50 m range, 0.5° resolution). According to the documentation, each 2D scan consists of 541 triplets of (x, y, R), where x, y are the 2D Cartesian coordinates of the LiDAR return relative to the sensor (in metres), and R is the measured infrared reflectance value. Is the point cloud data you used therefore not a real 3-dimensional dataset, with the z coordinate representing the reflectance?
Hi, thanks for your great work.
I have one question.
The stereo/center camera data downloaded from the official website is single-channel, but I haven't seen any code in your implementation for handling single-channel data. Did I overlook something?
Thanks in advance.
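For anyone hitting the same issue: the raw RobotCar camera images are single-channel Bayer mosaics and are meant to be demosaiced into RGB, e.g. with the robotcar-dataset-sdk or OpenCV. A minimal sketch; the Bayer pattern chosen below is an assumption and should be verified against the SDK for the camera used:

```python
import cv2

# Raw RobotCar images are single-channel Bayer mosaics; demosaic to RGB.
# The 'GB' pattern here is an assumption for the Grasshopper2 stereo cameras;
# check the robotcar-dataset-sdk before relying on it.
raw = cv2.imread('raw_image.png', cv2.IMREAD_GRAYSCALE)
rgb = cv2.cvtColor(raw, cv2.COLOR_BayerGB2RGB)
cv2.imwrite('demosaiced.png', cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))
```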