jytime / Deep-SfM-Revisited
[CVPR 2021] Deep Two-View Structure-from-Motion Revisited
License: MIT License
Hi, thank you for your paper contribution.
Sorry to bother you. When reproducing the code, I cannot find these files: train_files.txt, test_files.txt, and kitti_raw_calib_dict.npy. I can only find calib_cam_to_cam.txt. Do I need to perform some data format conversion?
Could you tell me the correct way to find them, or the source of these files?
Thanks!
import os
import numpy as np
import torch.utils.data as data

class KITTIRAWLoaderGT(data.Dataset):
    def __init__(self, root, transform=None, target_transform=None, co_transform=None, train=True):
        self.root = root
        # The loader expects these split files and the calibration dictionary in `root`:
        train_files = os.path.join(self.root, 'train_files.txt')
        test_files = os.path.join(self.root, 'test_files.txt')
        self.calib_dict = np.load(os.path.join(self.root, 'kitti_raw_calib_dict.npy'), allow_pickle=True).item()
Hey @jytime, thanks for sharing your work!
I was trying to replicate your experiments using the ground-truth pose on KITTI, which should provide an upper bound for your method. I used the GT_POSE and GT_POSE_NORMALIZED flags to perform inference with GT poses.
For some reason, the performance is much worse than when estimating the pose with RANSAC, which I am able to replicate. Can you provide some guidance on what might be the problem?
Thanks a lot!
Hi, thanks for sharing the impressive work.
According to the code at lines 576-585 in main.py, you use the ratio between the median values of the predicted and GT depths to scale the predicted depth. However, the predicted depth has already been scaled by the GT scale \alpha_gt (see lines 536-541 in main.py). Hence, I am confused about why the rescaling by the ratio of median values is necessary (the performance drops significantly without it).
Could you kindly help me resolve this confusion? Thank you so much.
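For context, the median-ratio step discussed here mirrors the standard per-image median alignment used in monocular depth evaluation on KITTI. A minimal sketch (function name and thresholds are illustrative, not the repository's code):

```python
import numpy as np

def median_scale(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Rescale predicted depth so its median matches the GT median.

    This is the usual per-image alignment for scale-ambiguous monocular
    depth; the depth range thresholds here are common KITTI choices.
    """
    mask = (gt > min_depth) & (gt < max_depth)
    ratio = np.median(gt[mask]) / np.median(pred[mask])
    return pred * ratio

# Example: a prediction off by a global factor of 2 is recovered exactly.
gt = np.array([[10.0, 20.0], [30.0, 40.0]])
pred = gt / 2.0
aligned = median_scale(pred, gt)
assert np.allclose(aligned, gt)
```

The alignment only removes a global scale per image; any non-uniform scale error in the prediction is untouched.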
I've installed the essential_matrix module as per the README, and I am able to import it. However, I ran into "module 'essential_matrix' has no attribute 'initialise'".
My installation process completed successfully. Could you provide some leads on how to solve this?
Thank you
Hi, thanks for sharing the impressive work.
After browsing the code, I am confused about the hyper-parameter NORM_TARGET. Could you kindly explain its usage?
Thanks.
Sorry, my coding skills are limited.
I keep running out of GPU memory when training.
I have two GPUs.
GPU 0: NVIDIA GeForce RTX 2070 SUPER Memory 8192MiB
GPU 1: NVIDIA GeForce RTX 2070 SUPER Memory 8192MiB
args.batch_size = 4
args.lr = 0.005
args.epoch_size=0 # help='manual epoch size (will match dataset size if set to 0)'
However, I encountered the following error during the first training iteration.
2023-10-22 01:30:20,463 INFO Epoch: [0][0/19905] Time 18.368 (18.368) Data 2.604 (2.604) Loss 13.457 (13.457)
Traceback (most recent call last):
...
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
...
RuntimeError: CUDA out of memory. Tried to allocate 98.00 MiB (GPU 0; 7.78 GiB total capacity; 5.65 GiB already allocated; 108.31 MiB free; 5.75 GiB reserved in total by PyTorch)
May I ask which GPU you are using, or whether you have any suggestions?
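One common workaround when a batch of 4 does not fit in 8 GiB is gradient accumulation: run smaller micro-batches and step the optimizer once per effective batch. A minimal sketch with a stand-in model (this is an illustration, not the repository's training loop):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                          # stand-in for the real network
opt = torch.optim.SGD(model.parameters(), lr=0.005)
accum_steps = 4                                   # 4 micro-batches of 1 ~ batch_size 4

opt.zero_grad()
for step in range(8):
    x = torch.randn(1, 10)                        # micro-batch that fits in memory
    loss = model(x).pow(2).mean() / accum_steps   # scale so gradients average
    loss.backward()                               # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        opt.step()                                # one update per effective batch
        opt.zero_grad()
```

BatchNorm statistics are still computed per micro-batch, so this is not exactly equivalent to a larger batch, but it usually recovers most of the behavior while keeping peak memory low.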
Hi, I saw some works evaluating on the ScanNet dataset. May I ask if you have evaluated on ScanNet and how is the performance? Thanks!
Hello!
Great job!
Hope to see the pre-trained model and more complete code soon!
Hello @jytime, thanks for the great work!
I followed your README, but when running demo.py and evaluate.py in model.RAFT I ran into some issues.
Running demo.py:
I already downloaded the Sintel dataset and the pretrained models, but when I try python evaluate.py --model=models/raft-things.pth --dataset=sintel --mixed_precision I get some errors.
Running evaluate.py errors:
I found this is caused by line 116 of evaluate.py, "flow_low, flow_pr = model(image1, image2, iters=iters, test_mode=True)"; it seems only one image can be passed as a parameter, otherwise it causes the errors shown in the error picture. But changing it also doesn't work. I could not solve it; could you give me some advice? Thanks very much.
My environment: Ubuntu 20.04, CUDA 11.1 (required by the RTX 3060, otherwise it errors), torch==1.8.0; everything else satisfies README.md.
Hi,
First thanks for the great work and nice paper.
I have one question about the dataset. The README mentions downloading the RAW data from KITTI. However, I wonder which scene categories are needed (e.g., city, campus, road, or all of them)?
And maybe I missed this information in the paper, but which KITTI scenes are used for training? I only see that the Eigen split is used for depth evaluation on KITTI, but I am not sure which split is used for training.
Thanks again!
To evaluate VO on KITTI, I first predicted the poses of Sequence 09 with cfg.PRED_POSE_GT_SCALE = True. After that, I multiplied the relative poses between frames with the current pose to get the absolute pose of each frame, and used the KITTI odometry evaluation toolbox mentioned in README.md. However, the result is not good: the translational error is 4.182%, the rotational error is 1.352 deg/100m, and the ATE is as large as 62.991 m.
sequence_09.pdf
Did I make a mistake or miss any steps? @jytime
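The chaining of relative poses into absolute ones described above can be sketched as follows (assuming each relative pose is a 4x4 homogeneous transform; the composition order and direction depend on the toolbox's convention, which is worth double-checking):

```python
import numpy as np

def accumulate_poses(rel_poses):
    """Chain 4x4 relative transforms into absolute poses.

    Convention assumed here: rel_poses[k] maps frame k+1 into frame k,
    so the absolute pose of frame k+1 is abs_poses[k] @ rel_poses[k].
    Some toolboxes expect the inverse; a wrong convention typically
    shows up as a trajectory mirrored or drifting the wrong way.
    """
    abs_poses = [np.eye(4)]
    for T in rel_poses:
        abs_poses.append(abs_poses[-1] @ T)
    return abs_poses

# Example: two unit translations along z compose to a translation of 2.
T = np.eye(4)
T[2, 3] = 1.0
poses = accumulate_poses([T, T])
assert np.isclose(poses[2][2, 3], 2.0)
```

A large ATE with small relative errors, as reported above, is often a sign of scale or convention mismatch in exactly this accumulation step rather than of bad per-frame predictions.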
Hello,
I want to reproduce the pose evaluation results of your method. In the process, I was confused by a few problems, as below:
Following issue #8, I first predicted rel_pose with your model, transformed the predictions into abs_pose using the VO evaluation code provided in your answer, and evaluated them with the KITTI odometry evaluation toolbox mentioned in README.md. But the result is also not good, close to the results in that issue. I found in the code that the pose is calculated by RANSAC with the default settings, which means the pose comes from flow matches alone (is that right?). Is the default choice the cause of the poor result, or is it enough to reach the best result? I wonder how to set the config to get the best result, as in Table 3 of your paper.
As in Table 5, there are many choices for calculating pose, but I don't know how to use them in your code. There are many flags in config.py, and some of them are dummy. Which flags should I set to True to use, for example, the best-performing method in Table 5: '5-point' + 'Flow matches' + 'SIFT Loc'?
@jytime, can you help me with this? Thank you very much!
Best
Great work!
I have installed the essential_matrix module following the steps in the README and run test.py successfully.
However, the results are confusing:
Ground-truth E matrix
tensor([[-0.1391, -0.0442, 0.5476],
[ 0.1396, -0.0444, -0.1480],
[-0.4839, 0.2972, -0.1344]])
Start essential matrix initialisation
The number of inliers: 0
Initialized E matrix
tensor([[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]])
Error
tensor(nan)
tensor(nan)
Start essential matrix optimisation
E matrix after optimization (polishing)
tensor([[nan, nan, nan],
[nan, nan, nan],
[nan, nan, nan]])
Error
tensor(nan)
tensor(nan)
I wonder if something went wrong in my installation? How can I get the correct pose?
Thanks!
Would you please provide the DeMoN loader? I really want to know the pre-processing settings for the DeMoN data.
Could you also provide the Eigen SfM splits?
Hello there, I loved reading your paper and am very keen to test your code. Can you tell us when the complete code will be made available?
Thanks!
Hey @jytime, thank you very much for your support, really appreciate your time!
As your work seems to outperform DeepSfM on all datasets tested, I wanted to replicate this result on the ETH3D dataset.
I tried to run the network as is, but the results are not great. I suspect the KITTI checkpoint requires some fine-tuning of the feature extractor, which I will perform on the DeMoN datasets.
Is there any chance you can share the checkpoints on the DeMoN datasets?
Do you believe that there is anything else which should be changed to test the method on ETH3D?
Thanks!
Best
Hi, when using 2011_09_26/2011_09_26_drive_0002_sync for training, I found that the loss_depth at line 392 of the main function becomes NaN. I suspected the learning rate was too large, but even when I set --lr=0 the following picture still appears. Please give me some advice if you know how to solve this, thanks.
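A NaN depth loss with LiDAR ground truth often comes from evaluating the loss over invalid (zero) pixels of the sparse velodyne depth map. A hedged sketch of masking invalid depths before reduction (illustrative, not the repository's loss function):

```python
import torch

def safe_depth_loss(pred, gt):
    """L1 depth loss over valid pixels only.

    KITTI velodyne depth maps are sparse: invalid pixels are 0, and any
    loss that divides by or takes the log of them yields NaN/Inf, which
    then poisons all gradients regardless of the learning rate.
    """
    mask = gt > 0
    if mask.sum() == 0:
        return pred.sum() * 0.0  # keep the graph, contribute nothing
    return (pred[mask] - gt[mask]).abs().mean()

gt = torch.tensor([[0.0, 10.0], [20.0, 0.0]])    # sparse GT, two valid pixels
pred = torch.tensor([[5.0, 12.0], [18.0, 7.0]])
loss = safe_depth_loss(pred, gt)                  # mean(|12-10|, |18-20|) = 2.0
```

That NaNs persist even at --lr=0 is consistent with this explanation: a zero learning rate stops the weights from changing but does not stop the forward pass from producing NaN on invalid pixels.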
Hello,
Not all dependencies are listed in requirements.txt. Specifically, I have problems installing minieigen. How do you install the package, and which version do you use? sudo apt-get install python3-minieigen installs a version for Python 3.8. I found no other way except possibly building it from source. Is that right?
The RANSAC CUDA code cannot launch on my GPU (TITAN Xp).
~/Deep-SfM-Revisited-main/RANSAC_FiveP$ python test.py
Ground-truth E matrix
tensor([[-0.1391, -0.0442, 0.5476],
[ 0.1396, -0.0444, -0.1480],
[-0.4839, 0.2972, -0.1344]])
Start essential matrix initialisation
CUDA Error (/data3/liubq/Deep-SfM-Revisited-main/RANSAC_FiveP/essential_matrix/essential_matrix.cu:145): invalid device symbol
Segmentation fault (core dumped)
Setting SET(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS}; --std=c++11; -gencode=arch=compute_61,code=sm_61) in CMakeLists.txt, and ext_modules=[CUDAExtension('essential_matrix', sources, extra_compile_args={'cxx': ['-O2'], 'nvcc': ['-gencode', 'arch=compute_61,code=sm_61']})] in setup.py, does not help.
How should this be handled on different devices?
Thanks!
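Rather than hard-coding sm_61, the -gencode flag can be derived at build time from the local GPU's compute capability, which torch.cuda.get_device_capability(0) reports. A small helper, as a sketch:

```python
def gencode_flags(capability):
    """Build the nvcc -gencode flag for a device capability tuple.

    `capability` is what torch.cuda.get_device_capability(0) returns at
    build time, e.g. (6, 1) for a TITAN Xp. Hard-coding sm_61 only works
    on Pascal cards; deriving the flag per machine is more portable.
    """
    major, minor = capability
    arch = f"{major}{minor}"
    return [f"-gencode=arch=compute_{arch},code=sm_{arch}"]

# For a TITAN Xp (compute capability 6.1):
print(gencode_flags((6, 1)))  # ['-gencode=arch=compute_61,code=sm_61']
```

In setup.py the result can be passed as the 'nvcc' entry of extra_compile_args; alternatively, setting the TORCH_CUDA_ARCH_LIST environment variable (e.g. "6.1") before building lets PyTorch's extension builder pick the architecture without editing the build files.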
Great work!
I am confused about the gt_depth_dir: should it point to the raw velodyne_points (.bin) in the KITTI RAW data, the projected velodyne_raw (.png), or the groundtruth (.png) in the 14 GB official depth maps? I guess it would be the raw velodyne_points (.bin)? Looking forward to your reply!
Thanks!
Great work!
Could you release your pre-trained Posenet model trained on KITTI?
Thanks!