borglab / gtsfm
End-to-end SFM framework based on GTSAM
License: Other
Replace current test with more carefully-thought-out camera configuration
In tests/data_assocation/test_data_assoc.py
The Ransac-F code fails for 5 points each from 2 planes in the current configuration. As a result, we need to add more unit tests to cover more cases.
Add method in SFMResult or GtsfmData that returns
The links from the README to gtsam-4.1.1-py3-none-any.whl don't work.
Make each config more human readable by setting defaults in a default.py file.
Some form of inheritance is supported in these YAML files. That way, we could have a default configuration in default.yaml, include it in deep_front_end.yaml and change/override only the feature_extractor and two_view_estimator.
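The override behavior described above can be pictured as a recursive dictionary merge. The sketch below is plain Python and does not use GTSFM's actual config loader; the keys and values (`max_keypoints`, `threshold_px`, `"SuperPoint"`, and so on) are illustrative assumptions, not the project's real schema.

```python
import copy
from typing import Any, Dict


def merge_config(default: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `default` with the values from `override` applied recursively."""
    merged = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Leaf value (or new section): the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults (placeholder names, not GTSFM's real configuration).
default_cfg = {
    "feature_extractor": {"name": "SIFT", "max_keypoints": 5000},
    "two_view_estimator": {"verifier": "RANSAC", "threshold_px": 4},
    "bundle_adjustment": {"max_iterations": 100},
}

# A derived config states only what differs from the defaults.
deep_front_end_cfg = merge_config(
    default_cfg,
    {
        "feature_extractor": {"name": "SuperPoint"},
        "two_view_estimator": {"verifier": "Degensac"},
    },
)
```

With this approach, each derived config stays short, and settings it does not mention (here `bundle_adjustment`) are inherited unchanged.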
Test the effect of image size on frontend and overall performance.
Currently, a few metrics from BA are only logged and not dumped.
Example:
[2021-04-07 01:19:01,343 INFO bundle_adjustment.py line 165 569967] initial error: 64453.72
[2021-04-07 01:19:01,344 INFO bundle_adjustment.py line 166 569967] final error: 1372.83
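A minimal sketch of dumping these values instead of only logging them, assuming a JSON file is an acceptable output format. The function name `save_ba_metrics` and the file name are hypothetical; the two numbers are copied from the log lines above.

```python
import json
import os
import tempfile
from typing import Dict


def save_ba_metrics(metrics: Dict[str, float], path: str) -> None:
    """Write bundle adjustment metrics to a JSON file (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)


# Values taken from the example log above.
ba_metrics = {"initial_error": 64453.72, "final_error": 1372.83}

# Hypothetical output location; a real pipeline would use its own results folder.
out_path = os.path.join(tempfile.mkdtemp(), "bundle_adjustment_metrics.json")
save_ba_metrics(ba_metrics, out_path)

# Read the file back to confirm the round trip.
with open(out_path) as f:
    loaded = json.load(f)
```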
We cannot recover the correct poses right now and the unit test requires very high tolerances to pass.
In #143, we added a warning log in cases where the decomposed R, t do not form the same input E again. This happens for 9/66 pairs on the door dataset using RANSAC-E.
Although we can handle dropped cameras after DA, we can't before rot averaging: None elements.
Maybe to the rotation averaging wrapper we want to add:
+    # indices of cameras found inside connected subgraph
+    connected_idxs = set()
     for (i1, i2), i2Ri1 in i2Ri1_dict.items():
         if i2Ri1 is not None:
+            connected_idxs = connected_idxs.union(set([i1,i2]))
-    return [result_values.atRot3(i) for i in range(num_images)]
+    # some camera indices may not have been present in connected component subgraph that is passed in
+    return [result_values.atRot3(i) if i in connected_idxs else None for i in range(num_images)]
The reconstruction from GTSFM on my tree dataset is as shown below:
Visualizing the tracks and the correspondences, I came across some failure cases, examples of which I am attaching here.
Correspondences:
Tracks:
This is the link to my dataset for reproducing the above results: https://drive.google.com/drive/folders/11cg8mio8zTVu-Kow4rH9RZumv3qJ_Iq1?usp=sharing
OpenCV keypoints are needed as inputs to descriptors like SIFT. In this case, we might need to use a default value for the keypoint size, which is the diameter of the meaningful neighborhood around the point.
Proposed experiment:
We have created a class called Keypoints, which can be confused with cv2.KeyPoint. We should rename it to GtsfmKeypoints.
Could you please help me to check whether these metrics based on dense point clouds make sense?
from typing import List

import numpy as np
from gtsam import Pose3, Rot3

from gtsfm.utils.geometry_comparisons import align_poses_sim3

gtsfm_results_path = 'results_densify/dense_tools/gtsfm_results'
olsson_results_path = 'results_densify/dense_tools/olsson_results'

# default transformation: identity matrix
DEFAULT_ROTATION = Rot3.RzRyRx(0, 0, 0)
DEFAULT_TRANSLATION = np.array([0, 0, 0])


def read_data(path: str) -> (np.ndarray, np.ndarray):
    """load vertices and masks data
    Args:
        path: path where stores the vertices and masks data. there will be two files in the path folder
            1. all_masks.npy:
                stores all final masks for each view, which is a np.ndarray of [N, H, W] shape
            2. all_vertices.npy:
                stores all coordinates of candidate dense points (not filtered by masks) in the first camera's view,
                a np.ndarray of [N*W*H, 3]
    Returns:
        masks: a np.ndarray of [N, H, W] shape
        vertices: a np.ndarray of [N*W*H, 3] shape
    """
    masks = np.load(f'{path}/all_masks.npy')
    vertices = np.load(f'{path}/all_vertices.npy')
    return masks, vertices


def metrics(gt_path: str, sample_path: str) -> (float, float, List[float]):
    """ calculate accuracy and completeness as metrics
    Args:
        gt_path: path where stores the vertices and masks data as ground truth
        sample_path: path where stores the vertices and masks data as the sample to test
    Returns:
        Completeness: float
        Absolute Accuracy in depth direction: float
        Value range of ground truth coordinates in depth direction: List[float]
    """
    s_m, s_v = read_data(sample_path)
    g_m, g_v = read_data(gt_path)

    """completeness is defined as ratio of the number of valid vertices among all masks"""
    completeness = np.mean(s_m) / np.mean(g_m)

    """accuracy is defined as mean depth coordinate difference among all points pairs that are both valid for sample and gt"""
    """ 1. convert coordinates list to poses list"""
    s_vertices = []
    g_vertices = []
    num_img = s_m.shape[0]
    for i in range(num_img):
        mask = s_m[i] * g_m[i]
        s_vertices.append(s_v[i][mask.flatten()])
        g_vertices.append(g_v[i][mask.flatten()])
    s_vertices = np.concatenate(s_vertices, axis=0)
    g_vertices = np.concatenate(g_vertices, axis=0)
    num_valid_vertices = s_vertices.shape[0]
    s_poses = [Pose3(DEFAULT_ROTATION, DEFAULT_TRANSLATION + s_vertices[i]) for i in range(num_valid_vertices)]
    g_poses = [Pose3(DEFAULT_ROTATION, DEFAULT_TRANSLATION + g_vertices[i]) for i in range(num_valid_vertices)]

    """ 2. run SIM(3) alignment"""
    s2g_poses = align_poses_sim3(g_poses, s_poses)

    """ 3. calculate mean differences"""
    error = [0, 0, 0]
    g_range = [[100000, 0], [100000, 0], [100000, 0]]
    for i in range(num_valid_vertices):
        error[0] += np.abs((g_poses[i].x() - s2g_poses[i].x()))
        error[1] += np.abs((g_poses[i].y() - s2g_poses[i].y()))
        error[2] += np.abs((g_poses[i].z() - s2g_poses[i].z()))
        g_range[0][0] = min(g_range[0][0], g_poses[i].x())
        g_range[0][1] = max(g_range[0][1], g_poses[i].x())
        g_range[1][0] = min(g_range[1][0], g_poses[i].y())
        g_range[1][1] = max(g_range[1][1], g_poses[i].y())
        g_range[2][0] = min(g_range[2][0], g_poses[i].z())
        g_range[2][1] = max(g_range[2][1], g_poses[i].z())
    accuracy = np.array(error) / num_valid_vertices
    return completeness, accuracy[2], g_range[2]
Windows users need this
Metrics:
[2021-04-23 12:32:54,514 INFO sfm_track.py line 173 70703] DSF Union-Find: 1.19% of tracks discarded from multiple obs. in a single image.
[2021-04-23 12:35:16,131 DEBUG data_assoc.py line 87 70703] [Data association] input number of tracks: 5397
[2021-04-23 12:35:16,131 DEBUG data_assoc.py line 88 70703] [Data association] input avg. track length: 2.7196590698536225
[2021-04-23 12:35:16,999 INFO gtsfm_data.py line 211 70703] Largest connected component contains 11 of 11 cameras returned by front-end (of 11 total imgs)
[2021-04-23 12:35:17,011 DEBUG data_assoc.py line 133 70703] [Data association] output number of tracks: 1215
[2021-04-23 12:35:17,011 DEBUG data_assoc.py line 134 70703] [Data association] output avg. track length: 2.09
[2021-04-23 12:35:17,102 INFO bundle_adjustment.py line 43 70703] Input: 1215 tracks on 11 cameras
[2021-04-23 12:35:17,636 INFO bundle_adjustment.py line 108 70703] initial error: 2761678.19
[2021-04-23 12:35:17,636 INFO bundle_adjustment.py line 109 70703] final error: 2761678.19
[2021-04-23 12:35:17,690 INFO bundle_adjustment.py line 114 70703] [Result] Number of tracks before filtering 1215
[2021-04-23 12:35:17,773 INFO bundle_adjustment.py line 119 70703] [Result] Number of tracks after filtering: 42
[2021-04-23 12:35:17,774 INFO bundle_adjustment.py line 121 70703] [Result] Mean track length 2.000
[2021-04-23 12:35:17,776 INFO bundle_adjustment.py line 122 70703] [Result] Median track length 2.000
[2021-04-23 12:35:17,778 INFO gtsfm_data.py line 277 70703] Min scene reproj error: 0.007
[2021-04-23 12:35:17,779 INFO gtsfm_data.py line 278 70703] Avg scene reproj error: 1.203
[2021-04-23 12:35:17,780 INFO gtsfm_data.py line 279 70703] Median scene reproj error: 1.062
[2021-04-23 12:35:17,781 INFO gtsfm_data.py line 280 70703] Max scene reproj error: 2.751
(1) Add pandas to requirements
(2) Update wheels: https://github.com/borglab/gtsam-manylinux-build/actions/runs/436145595.
gtsam-4.1.0-cp35-cp35m-manylinux2014_x86_64.whl
gtsam-4.1.0-cp36-cp36m-manylinux2014_x86_64.whl
gtsam-4.1.0-cp37-cp37m-manylinux2014_x86_64.whl
gtsam-4.1.0-cp38-cp38-manylinux2014_x86_64.whl
wheels.zip containing gtsam-4.1.0-py3-none-any.whl
We have yet to push these new wheels to pypi, however. Once we get the updated wheels pushed with a new version number, we can
(3) add a gtsam dependency to the requirements list.
But I think the problem now is that the version number hasn't changed all fall (4.1.0 on August 23, 2020, per https://pypi.org/project/gtsam/#history), but the code has changed a lot, so we should bump the version number now to 4.1.1?
(4) Indeed looks like we have half of our setup in the setup.py file, and the other half in the conda environment. We will clean that up.
TODO: update the install instructions in the README
We should probably move completely to a conda environment, since we will be messing with a bunch of libraries like OpenCV that are better placed in a sequestered environment.
Need to add code to log the number of keypoints from each method (SIFT vs. SuperGlue).
python 3.9 incompatibility with dask requires additional import ordering
The current style is to store the models as private variables of a class in __init__ and reuse them in functions. This leads to trouble, as the scheduler gets locked up frequently and also fails for moderately sized datasets.
The current hack is to read the model from disk every single time, but it's inefficient and makes the program slow.
Dask's recommended strategy is to wrap the model in delayed and pass it to the function each time. Will need to try that and properly benchmark compute.
COLMAP checks the angle of the rays used in triangulation and rejects them if it is too low.
I wrote a unit test which confirms the behavior of failures due to small baseline when there is a 2px error in the measurements (quite common scenario).
We need to add this check in the data association pipeline.
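A minimal sketch of such a check, assuming camera centers and the triangulated point are available as 3-vectors in a common world frame. The function name and the threshold value are illustrative assumptions; they are not taken from COLMAP's or GTSFM's code.

```python
import numpy as np


def triangulation_angle_deg(center1: np.ndarray, center2: np.ndarray, point: np.ndarray) -> float:
    """Angle in degrees between the two rays from the camera centers to the 3D point."""
    ray1 = point - center1
    ray2 = point - center2
    cos_angle = np.dot(ray1, ray2) / (np.linalg.norm(ray1) * np.linalg.norm(ray2))
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


# Assumed, illustrative threshold; the right value would need to be tuned.
MIN_TRI_ANGLE_DEG = 1.5

c1 = np.array([0.0, 0.0, 0.0])
c2 = np.array([1.0, 0.0, 0.0])            # baseline of 1 unit
near_point = np.array([0.5, 0.0, 5.0])    # depth is 5x the baseline
far_point = np.array([0.5, 0.0, 500.0])   # depth is 500x the baseline

near_angle = triangulation_angle_deg(c1, c2, near_point)
far_angle = triangulation_angle_deg(c1, c2, far_point)

accept_near = near_angle >= MIN_TRI_ANGLE_DEG
accept_far = far_angle >= MIN_TRI_ANGLE_DEG
```

The far point subtends a very small angle relative to the baseline, which is the situation where small pixel errors translate into large depth errors.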
Push the scene with log id "273c1883-673a-36bf-b124-88311b1a80be" through the pipeline and debug failures.
We need to enforce in the base loader that image pairs i.e. edges (i1,i2) are ordered as i1 < i2.
We also need to add this in documentation for each loader.
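One possible shape for this check is sketched below. Both helper names are hypothetical and are not part of the existing loader API.

```python
from typing import List, Tuple


def validate_pair_ordering(pairs: List[Tuple[int, int]]) -> None:
    """Raise if any image pair (i1, i2) does not satisfy i1 < i2."""
    for i1, i2 in pairs:
        if not i1 < i2:
            raise ValueError(f"Image pair ({i1}, {i2}) violates the i1 < i2 ordering.")


def canonicalize_pairs(pairs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Reorder each pair so that i1 < i2, dropping self-pairs and duplicates."""
    return sorted({(min(i1, i2), max(i1, i2)) for i1, i2 in pairs if i1 != i2})


raw_pairs = [(1, 0), (0, 2), (2, 1), (0, 1)]
ordered_pairs = canonicalize_pairs(raw_pairs)
validate_pair_ordering(ordered_pairs)  # passes silently
```

Validation fits a base-class check that fails loudly, while canonicalization is an alternative for loaders that would rather fix the ordering than reject it.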
COLMAP uses 3 or 4 px as its threshold. I think this should go in a global config, since it's such an important hyperparameter. COLMAP seems to use the same pixel threshold when fitting all of E, F, and H.
OpenMVG uses "a contrario" RANSAC, so no threshold needs to be specified. This could be an additional verifier we add.
Add option for user to specify whether to use intrinsics for verification (i.e. for comparative evaluation)
"use_intrinsics_for_verification"
make a verify() method on each verifier that switches based on flag and availability?
Clean up plane point generation code
At
a, b, c, d = plane_coefficients
x = pts[:, 0]
pts[:, 2] = (a * x + b * y + d) / c
should it be ax + by + cz + d = 0,
so z = -(ax + by + d) / c?
A unit test could just check that the plane equation is satisfied and that the points are in range.
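A sketch of the corrected generation and the suggested check, assuming the plane is given as coefficients (a, b, c, d) of ax + by + cz + d = 0 with c nonzero. The function name, sampling range, and seed are hypothetical choices for illustration.

```python
import numpy as np


def sample_points_on_plane(plane_coefficients, num_points, xy_range=(-1.0, 1.0), seed=0):
    """Sample x, y uniformly and solve ax + by + cz + d = 0 for z."""
    a, b, c, d = plane_coefficients
    if np.isclose(c, 0.0):
        raise ValueError("c must be nonzero to solve for z.")
    rng = np.random.default_rng(seed)
    pts = np.empty((num_points, 3))
    pts[:, :2] = rng.uniform(xy_range[0], xy_range[1], size=(num_points, 2))
    x, y = pts[:, 0], pts[:, 1]
    pts[:, 2] = -(a * x + b * y + d) / c
    return pts


plane = (1.0, -2.0, 4.0, 3.0)
pts = sample_points_on_plane(plane, num_points=5)

# Residual of the plane equation at every point; should be ~0.
residuals = pts @ np.array(plane[:3]) + plane[3]
```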
Move L2 normalization before L1 normalization in RootSIFT, because L1 normalization followed by a square root already gives unit norm under L2.
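The norm claim can be checked numerically. The sketch below assumes non-negative descriptor entries (as with SIFT histograms) and uses random data in place of real descriptors; `root_sift` is an illustrative helper, not GTSFM's implementation.

```python
import numpy as np


def root_sift(descriptors: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """L1-normalize each row, then take the element-wise square root."""
    l1 = np.sum(np.abs(descriptors), axis=1, keepdims=True)
    return np.sqrt(descriptors / (l1 + eps))


rng = np.random.default_rng(0)
# Stand-in for SIFT descriptors: 4 rows of 128 non-negative values.
sift_like = rng.uniform(0.0, 255.0, size=(4, 128))

rooted = root_sift(sift_like)

# sum_i (sqrt(x_i / S))^2 = sum_i x_i / S = 1, so each row has unit L2 norm.
l2_norms = np.linalg.norm(rooted, axis=1)
```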
Throw an exception if match indices has zero length, or check for None values from verifier output
Currently, the MFAS step is run as a loop, and each iteration is independent. Hence we can parallelize this step.
TypeError: can't pickle gtsam.gtsam.Cal3Bundler objects
Needed for
import dask.multiprocessing
dask.config.set(scheduler='processes')
We should move to angular errors. Comparing matrices for rotations does not make much sense.
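A minimal sketch of an angular error between two rotations, written with plain NumPy 3x3 matrices rather than GTSAM types so that it stays self-contained. The function names are illustrative.

```python
import numpy as np


def rotation_angle_deg(R1: np.ndarray, R2: np.ndarray) -> float:
    """Angle in degrees of the relative rotation R1^T R2."""
    R_rel = R1.T @ R2
    # For a rotation matrix, trace(R) = 1 + 2 cos(theta).
    cos_theta = (np.trace(R_rel) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


def rot_z(theta_deg: float) -> np.ndarray:
    """Rotation about the z-axis by theta_deg degrees."""
    t = np.radians(theta_deg)
    return np.array([
        [np.cos(t), -np.sin(t), 0.0],
        [np.sin(t), np.cos(t), 0.0],
        [0.0, 0.0, 1.0],
    ])


angle = rotation_angle_deg(rot_z(10.0), rot_z(25.0))
```

A single angle in degrees is easier to threshold in a unit test than an element-wise matrix tolerance, and it does not depend on which axis the error is about.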
John has verified that OpenCV's recoverPose function gives correct results. However, my unit test does not work correctly.
We have currently commented the test out, but will need to fix it.
In general runs, the dense point cloud will have about 1e6 points, but sometimes the number will be less than 1e3.
Something like:
from typing import Dict, List, NamedTuple

class SfmDataPython(NamedTuple):
    cameras: Dict[int, SfmCamera]
    tracks: List[SfmTrack]
Can COLMAP format support non-existing indices (non-contiguous indices)?
Thousands of lines printed with:
writeBAL has not been tested for calibration with nonzero (u0,v0)
COLMAP uses a crawler to generate the dataset for its camera sensor database directly as C++ code. We plan to use the same crawler to generate a CSV file.
The crawler is currently broken, and we need to fix it first before integrating it with our codebase.
Our initializer which implements RANSAC is found here: https://github.com/borglab/gtsfm/blob/2ffbfc3f4b089fbcc666947c97f78a310073fa15/gtsfm/data_association/point3d_initializer.py#L45
missing access to
(1) image filenames
(2) unique 2d point indices
(3) camera id vs. image id
in colmap-format dump/ write
Later:
increase max lookahead default to far more than default of 1
We need to create a way to see if a PR improves results on a dataset (or suite of small datasets).
Perhaps via publishing an artifact? And storing the previous best results in a special file location somewhere?
@ProfFan @akshay-krishnan @ayushbaid any thoughts on how we should do this?
[2021-02-16 14:36:10,595 ERROR bundle_adjustment.py line 104 38567] LM Optimization failed
Traceback (most recent call last):
File "/Users/johnlambert/Documents/gtsfm/gtsfm/bundle/bundle_adjustment.py", line 102, in run
result_values = lm.optimize()
RuntimeError: An inference algorithm was called with inconsistent arguments. The
factor graph, ordering, or variable index were inconsistent with each
other, or a full elimination routine was called with an ordering that
does not include all of the variables.
Specify epipolar distances in unnormalized coords, not normalized ones.
I have been using wTc, which is not preferred. This needs to be changed everywhere in GTSFM.
OpenCV 4.5.3 breaks the interface, since the size argument is now required for KeyPoint()
E cv2.error: OpenCV(4.5.3) :-1: error: (-5:Bad argument) in function 'KeyPoint'
E > Overload resolution failed:
E > - KeyPoint() missing required argument 'size' (pos 3)
See CI report in https://github.com/borglab/gtsfm/runs/3182429925?check_suite_focus=true
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <gtsfm.common.keypoints.Keypoints object at 0x7f1d65bfb3d0>
    def cast_to_opencv_keypoints(self) -> List[cv.KeyPoint]:
        """Cast GTSFM's keypoints to list of OpenCV's keypoints.

        Args:
            keypoints: GTSFM's keypoints.

        Returns:
            List of OpenCV's keypoints with the same information as input keypoints.
        """
        # cast input attributed to floating point numpy arrays.
        keypoints = self.cast_to_float()

        opencv_keypoints = []

        if self.responses is None and keypoints.scales is None:
            for idx in range(len(keypoints)):
                opencv_keypoints.append(
>                   cv.KeyPoint(
                        x=keypoints.coordinates[idx, 0],
                        y=keypoints.coordinates[idx, 1],
                        _size=OPENCV_DEFAULT_SIZE,
                    )
E cv2.error: OpenCV(4.5.3) :-1: error: (-5:Bad argument) in function 'KeyPoint'
E > Overload resolution failed:
E > - KeyPoint() missing required argument 'size' (pos 3)
gtsfm/common/keypoints.py:163: error
Hi all,
I learnt about this library from a talk I watched in the TUM AI lecture series. It looks very promising.
However, being fairly new to the field, it would be great if you could point to some references to follow to understand the algorithms and designs of GTSFM and how it compares to other available packages.
Thanks!
Need a better 3d visualizer e.g. Pangolin and make sure we are dumping to disk in a clean way to read from this
1. All implementation code should go under a gtsfm folder so that module imports are clear system-wide (done 12/23/2020)
2. Per https://github.com/borglab/gtsfm/blob/master/common/image.py#L61
img_w_px, img_h_px = self.value_array.shape[:2]
should instead be:
img_h_px, img_w_px = self.value_array.shape[:2]
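The ordering follows from NumPy's row-major convention, where an image array has shape (height, width, channels). A small check, using a dummy array in place of `self.value_array`:

```python
import numpy as np

# Dummy 480-row by 640-column RGB image standing in for self.value_array.
value_array = np.zeros((480, 640, 3), dtype=np.uint8)

# shape[:2] is (rows, columns), i.e. (height, width).
img_h_px, img_w_px = value_array.shape[:2]
```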
@akshay-krishnan do you mind taking a look at this? Potentially, a lack of parallax is causing this issue:
Inputs to 1dsfm:
num_images: 3
wRi_list: [R: [
-0.249122, -0.189531, -0.949745;
0.948813, -0.244351, -0.200115;
-0.194143, -0.950984, 0.240703
]
, R: [
-0.248403, -0.189431, -0.949954;
0.948985, -0.244232, -0.199447;
-0.194228, -0.951034, 0.240435
]
, R: [
-0.248206, -0.189295, -0.950032;
0.948946, -0.244596, -0.199186;
-0.19467, -0.950968, 0.240341
]
]
i2Ui1_dict: {(0, 1): : 0.0770847
-0.00048986
-0.997024
, (0, 2): :-0.023718
0.0052633
-0.999705
, (1, 2): :0.00795643
0.00148208
-0.999967
}
scale_factor: 1.0
Output of 1dsfm:
R: [
-0.249122, -0.189531, -0.949745;
0.948813, -0.244351, -0.200115;
-0.194143, -0.950984, 0.240703
]
[ 0.9280717 0.2721252 -0.25422586]
R: [
-0.248403, -0.189431, -0.949954;
0.948985, -0.244232, -0.199447;
-0.194228, -0.951034, 0.240435
]
[-1.17731570e-09 2.06031814e-08 -2.54346348e-09]
R: [
-0.248206, -0.189295, -0.950032;
0.948946, -0.244596, -0.199186;
-0.19467, -0.950968, 0.240341
]
[-436863.72253421 -87650.33973069 111136.50286774]
See issue I posted here:
ducha-aiki/pydegensac#7