
Two-view Geometry Scoring Without Correspondences

This is the reference PyTorch implementation for testing the FSNet fundamental matrix scoring method described in

Two-view Geometry Scoring Without Correspondences

Axel Barroso-Laguna, Eric Brachmann, Victor Adrian Prisacariu, Gabriel Brostow and Daniyar Turmukhambetov

Paper, Supplemental Material

Patent pending. This code is for non-commercial use; please see the license file for terms. If you find any part of this codebase helpful, please cite our paper using the BibTeX below and link this repo. Thank you!

3 minute CVPR presentation video link

Overview

FSNet takes as input two RGB images and a fundamental matrix, and outputs the predicted relative translation and rotation errors. These errors are used as scores to rank the fundamental matrices.
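As a rough illustration of this ranking step (a minimal sketch, not the code in this repository; the error values, and the use of the worst predicted error as the score, are assumptions), a pool of hypotheses could be ranked like this:

```python
import numpy as np

# Hypothetical predicted (translation, rotation) errors from FSNet, in degrees,
# one row per fundamental-matrix hypothesis.
pred_errors = np.array([[4.2, 3.1],   # hypothesis 0
                        [1.0, 0.8],   # hypothesis 1
                        [9.5, 7.3]])  # hypothesis 2

# Score each hypothesis by its worst predicted pose error and keep the best one.
scores = pred_errors.max(axis=1)
best_idx = int(scores.argmin())
print(f"Best hypothesis: {best_idx}, predicted (t, R) errors: {pred_errors[best_idx]}")
```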

Setup

Assuming a fresh Anaconda distribution, you can install dependencies with:

conda env create -f resources/FSNet_environment.yml

We ran our experiments with PyTorch 1.11, CUDA 11.3, Python 3.9.16 and Debian GNU/Linux 10.
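To sanity-check that an installed environment matches these versions, a quick generic check (not part of the repository) is:

```python
import torch

print(torch.__version__)          # expected: 1.11.x
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # should be True on a CUDA-capable machine
```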

Running the FSNet network

demo_inference.py can be used to select the best fundamental matrix in a pool according to FSNet scoring. As an example, we provide two images, im_src.jpg and im_dst.jpg, together with fundamentals.npy, which contains sampled fundamental matrices for this image pair. The images and fundamental matrices are stored in the resources/im_test folder. For a quick test, please run the command shown below:
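The exact flag names below are assumed from the argument list that follows, and the weights option shown is one of the choices listed there; check python demo_inference.py --help for the actual interface.

python demo_inference.py --src_path resources/im_test/im_src.jpg --dst_path resources/im_test/im_dst.jpg --fundamentals_path resources/im_test/fundamentals.npy --weights_path indoor_fundamentals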

Arguments:

  • src_path: Path to source image.
  • dst_path: Path to destination image.
  • weights_path: Path to FSNet weights (options: indoor_fundamentals, indoor_essentials, outdoor_fundamentals, or outdoor_essential). Weights are stored in FSNet/weights.
  • fundamentals_path: Path to the numpy file storing the fundamental matrices (N x 3 x 3).

The demo script returns the top-scoring fundamental matrix and its predicted translation and rotation errors. Optionally, the script also draws the epipolar lines corresponding to the selected fundamental matrix for easy inspection; see the sketch below.
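As an illustrative sketch only (not the repository's plotting code; the file paths, the chosen hypothesis index, and the sampled points are assumptions), epipolar lines for a given fundamental matrix can be drawn with OpenCV like this:

```python
import cv2
import numpy as np

# Load the example pair and one fundamental matrix hypothesis (paths assumed).
im_src = cv2.imread("resources/im_test/im_src.jpg")
im_dst = cv2.imread("resources/im_test/im_dst.jpg")
F = np.load("resources/im_test/fundamentals.npy")[0]  # e.g. the top-scoring hypothesis

# Sample a few points in the source image and compute their epipolar lines
# in the destination image (l' = F x).
h, w = im_src.shape[:2]
pts = np.float32([[0.25 * w, 0.5 * h], [0.5 * w, 0.5 * h], [0.75 * w, 0.5 * h]])
lines = cv2.computeCorrespondEpilines(pts.reshape(-1, 1, 2), 1, F).reshape(-1, 3)

# Draw each (non-vertical) line a*x + b*y + c = 0 across the destination image.
for a, b, c in lines:
    if abs(b) < 1e-9:
        continue
    x0, y0 = 0, int(round(-c / b))
    x1, y1 = im_dst.shape[1] - 1, int(round(-(c + a * (im_dst.shape[1] - 1)) / b))
    cv2.line(im_dst, (x0, y0), (x1, y1), (0, 255, 0), 2)

cv2.imwrite("epipolar_lines_dst.jpg", im_dst)
```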

BibTeX

If you use this code in your research, please consider citing our paper:

@inproceedings{barroso2023fsnet,
  title={Two-view Geometry Scoring Without Correspondences},
  author={Barroso-Laguna, Axel and Brachmann, Eric and Prisacariu, Victor and Brostow, Gabriel and Turmukhambetov, Daniyar},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023}
}

scoring-without-correspondences's People

Contributors

axelbarroso, daniyar-niantic

scoring-without-correspondences's Issues

Question about the epipolar cross-attention

Hello! Thanks for open-sourcing this amazing work!

However, I was confused about the "Epipolar Cross-attention" module proposed in the paper. I wonder how it receives the visual features extracted by a common backbone, conditions them on epipolar-geometry fitness information, and then applies an MLP to output a fitness score for ranking the F/E hypotheses. Could you kindly explain the intuition behind the mechanism of the "Epipolar Cross-attention"?

Looking forward to your reply!

Question about batch generation for training

Hi there,

Thank you for the interesting work. I have a couple of questions regarding batch generation for the purpose of training the pipeline.

In the paper it is mentioned that batches of 56 image pairs are used during training, and that 500 hypotheses are clustered into bins based on their pose error prior to sampling. I'd like to ask about this binning and sampling process:

  1. What is the pose error quantity used for binning? Is it max(e^R, e^t)?
  2. How many bins are used?
  3. For each training batch, which I understand has 56 image pairs, how many hypotheses are sampled per image pair?

Thanks in advance!
Fereidoon

Demo on custom data

Do I need to regenerate the fundamentals.npy file when using this model in custom scenes?
