
Comments (2)

Phil26AT avatar Phil26AT commented on May 24, 2024

Hi @guipotje

Sorry for the late reply. The pipeline is: resize to 1600px -> inference -> rescale keypoints to the original image size -> estimate the relative pose. We use the top 2048 keypoints and set the detection threshold to 0. The threshold range is correct; we found the best results at th=1 for SP+NN.

Here is the pose estimation code for OpenCV:

import cv2
import numpy as np


def estimate_relative_pose(
    kpts0, kpts1, K0, K1, thresh, conf=0.99999, solver=cv2.RANSAC
):
    if len(kpts0) < 5:
        return None

    f_mean = np.mean([K0[0, 0], K1[0, 0], K0[1, 1], K1[1, 1]])
    norm_thresh = thresh / f_mean

    kpts0 = (kpts0 - K0[[0, 1], [2, 2]][None]) / K0[[0, 1], [0, 1]][None]
    kpts1 = (kpts1 - K1[[0, 1], [2, 2]][None]) / K1[[0, 1], [0, 1]][None]

    E, mask = cv2.findEssentialMat(
        kpts0, kpts1, np.eye(3), threshold=norm_thresh, prob=conf, method=solver
    )

    if E is None:
        return None

    best_num_inliers = 0
    ret = None
    # findEssentialMat may return several stacked 3x3 candidates;
    # use integer division so np.split receives an int.
    for _E in np.split(E, len(E) // 3):
        n, R, t, _ = cv2.recoverPose(_E, kpts0, kpts1, np.eye(3), 1e9, mask=mask)
        if n > best_num_inliers:
            best_num_inliers = n
            ret = (R, t[:, 0], mask.ravel() > 0)
    return ret

For the best results (LO-RANSAC) we used the excellent PoseLib, which provides Python bindings. There, we tested thresholds in the range [0.5, 3.0].

Here is a small script for PoseLib:

import poselib


def intrinsics_to_camera(K):
    px, py = K[0, 2], K[1, 2]
    fx, fy = K[0, 0], K[1, 1]
    # Approximate the image size from the principal point
    # (assumes it lies at the image centre).
    return {
        "model": "PINHOLE",
        "width": int(2 * px),
        "height": int(2 * py),
        "params": [fx, fy, px, py],
    }


M, info = poselib.estimate_relative_pose(
    kpts0, kpts1,
    intrinsics_to_camera(K0),
    intrinsics_to_camera(K1),
    {"max_epipolar_error": th},
)

R, t, inl = M.R, M.t, info["inliers"]


guipotje avatar guipotje commented on May 24, 2024

Hello @Phil26AT, thank you very much for the detailed answer!

PoseLib indeed provides impressive gains in pose accuracy. Following your suggestions, I was able to reproduce the SuperPoint results by running the suggested pipeline only after using MNN + a ratio test with r = 0.95 (31.4 AUC @ 5), but with MNN alone the results I obtain are worse than those reported in Table 2 (24.3 AUC @ 5). However, I think this is sufficient to validate the baseline. Thanks a lot!
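For reference, the MNN + ratio-test matching mentioned above can be sketched like this. This is a minimal, hypothetical implementation (not the repository's code): mutual nearest neighbours on L2 descriptor distances, combined with a Lowe-style ratio test.

```python
import numpy as np


def mnn_ratio_match(desc0, desc1, ratio=0.95):
    """Mutual nearest neighbours with a ratio test on L2 distances.

    Returns an (M, 2) array of index pairs (i in desc0, j in desc1).
    """
    # Pairwise distance matrix between descriptor sets (N0 x N1).
    d = np.linalg.norm(desc0[:, None] - desc1[None], axis=-1)
    nn01 = d.argmin(1)  # best match in image 1 for each keypoint in image 0
    nn10 = d.argmin(0)  # best match in image 0 for each keypoint in image 1
    ids0 = np.arange(len(desc0))
    mutual = nn10[nn01] == ids0  # keep only mutually consistent pairs
    # Ratio test: best distance must beat ratio * second-best distance.
    sorted_d = np.sort(d, axis=1)
    ratio_ok = sorted_d[:, 0] < ratio * sorted_d[:, 1]
    keep = mutual & ratio_ok
    return np.stack([ids0[keep], nn01[keep]], -1)
```

With r = 1.0 the ratio test is effectively disabled and this reduces to plain MNN matching, which matches the two settings compared above.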

