
Comments (2)

Phil26AT avatar Phil26AT commented on May 24, 2024

Hi @guipotje

Sorry for the late reply. The pipeline is: resize to 1600px -> inference -> rescale keypoints to the original image size -> estimate the relative pose. We use the top 2048 keypoints and set the detection threshold to 0. The threshold range is correct; we found the best results at th=1 for SP+NN.

Here is the pose estimation code for OpenCV:

import cv2
import numpy as np


def estimate_relative_pose(
    kpts0, kpts1, K0, K1, thresh, conf=0.99999, solver=cv2.RANSAC
):
    if len(kpts0) < 5:
        return None

    f_mean = np.mean([K0[0, 0], K1[0, 0], K0[1, 1], K1[1, 1]])
    norm_thresh = thresh / f_mean

    kpts0 = (kpts0 - K0[[0, 1], [2, 2]][None]) / K0[[0, 1], [0, 1]][None]
    kpts1 = (kpts1 - K1[[0, 1], [2, 2]][None]) / K1[[0, 1], [0, 1]][None]

    E, mask = cv2.findEssentialMat(
        kpts0, kpts1, np.eye(3), threshold=norm_thresh, prob=conf, method=solver
    )

    if E is None:
        return None

    best_num_inliers = 0
    ret = None
    # findEssentialMat may return several stacked 3x3 candidates;
    # use integer division so np.split receives an int.
    for _E in np.split(E, len(E) // 3):
        n, R, t, _ = cv2.recoverPose(_E, kpts0, kpts1, np.eye(3), 1e9, mask=mask)
        if n > best_num_inliers:
            best_num_inliers = n
            ret = (R, t[:, 0], mask.ravel() > 0)
    return ret

For the best results (LO-RANSAC) we used the excellent PoseLib, which provides Python bindings. There, we tested thresholds in the range [0.5, 3.0].

Here is a small script for PoseLib:

import poselib


def intrinsics_to_camera(K):
    px, py = K[0, 2], K[1, 2]
    fx, fy = K[0, 0], K[1, 1]
    # Approximate the image size from the principal point
    # (assumes it lies at the image centre).
    return {
        "model": "PINHOLE",
        "width": int(2 * px),
        "height": int(2 * py),
        "params": [fx, fy, px, py],
    }


M, info = poselib.estimate_relative_pose(
    kpts0, kpts1,
    intrinsics_to_camera(K0),
    intrinsics_to_camera(K1),
    {"max_epipolar_error": th},
)

R, t, inl = M.R, M.t, info["inliers"]


guipotje avatar guipotje commented on May 24, 2024

Hello @Phil26AT, thank you very much for the detailed answer!

PoseLib indeed provides impressive gains in pose accuracy. Following your suggestions, I was able to reproduce the SuperPoint results by running the suggested pipeline only after using MNN + a ratio test with r = 0.95 (31.4 AUC @ 5), but with MNN alone the results I obtain are worse than those reported in Table 2 (24.3 AUC @ 5). However, I think this is sufficient to validate the baseline. Thanks a lot!
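For reference, the MNN + ratio-test matching mentioned above can be sketched like this. This is a minimal, hypothetical implementation (not the repository's code): mutual nearest neighbours on L2 descriptor distances, combined with a Lowe-style ratio test.

```python
import numpy as np


def mnn_ratio_match(desc0, desc1, ratio=0.95):
    """Mutual nearest neighbours with a ratio test on L2 distances.

    Returns an (M, 2) array of index pairs (i in desc0, j in desc1).
    """
    # Pairwise distance matrix between descriptor sets (N0 x N1).
    d = np.linalg.norm(desc0[:, None] - desc1[None], axis=-1)
    nn01 = d.argmin(1)  # best match in image 1 for each keypoint in image 0
    nn10 = d.argmin(0)  # best match in image 0 for each keypoint in image 1
    ids0 = np.arange(len(desc0))
    mutual = nn10[nn01] == ids0  # keep only mutually consistent pairs
    # Ratio test: best distance must beat ratio * second-best distance.
    sorted_d = np.sort(d, axis=1)
    ratio_ok = sorted_d[:, 0] < ratio * sorted_d[:, 1]
    keep = mutual & ratio_ok
    return np.stack([ids0[keep], nn01[keep]], -1)
```

With r = 1.0 the ratio test is effectively disabled and this reduces to plain MNN matching, which matches the two settings compared above.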

