License: Apache License 2.0

DALF: Deformation-Aware Local Features (CVPR 2023)

DALF registration with challenging deformation + illumination + rotation transformations.

TL;DR: A joint image keypoint detector and descriptor for handling non-rigid deformation. Also works great under large rotations.

Want to quickly try it on your own images? Check this out: Open In Colab

Introduction

This repository contains the official implementation of the paper: Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints, to be presented at CVPR 2023.

Abstract: Local feature extraction is a standard approach in computer vision for tackling important tasks such as image matching and retrieval. The core assumption of most methods is that images undergo affine transformations, disregarding more complicated effects such as non-rigid deformations. Furthermore, incipient works tailored for non-rigid correspondence still rely on keypoint detectors designed for rigid transformations, hindering performance due to the limitations of the detector. We propose DALF (Deformation-Aware Local Features), a novel deformation-aware network for jointly detecting and describing keypoints, to handle the challenging problem of matching deformable surfaces. All network components work cooperatively through a feature fusion approach that enforces the descriptors’ distinctiveness and invariance. Experiments using real deforming objects showcase the superiority of our method, where it delivers 8% improvement in matching scores compared to the previous best results. Our approach also enhances the performance of two real-world applications: deformable object retrieval and non-rigid 3D surface registration.

Overview of the DALF architecture. Our architecture jointly optimizes non-rigid keypoint detection and description, and explicitly models local deformations for descriptor extraction during training. An hourglass CNN computes a dense heat map providing specialized keypoints that are used by the Warper Net to extract deformation-aware matches. A feature fusion layer balances the trade-off between invariance and distinctiveness in the final descriptors. The DALF network is used to produce a detection heatmap and a set of local features for each image. In the detector path, the heatmaps are optimized via the REINFORCE algorithm considering keypoint repeatability under deformations. In the descriptor path, the feature space is learned via the hard triplet loss. A siamese setup using image pairs is employed to optimize the network.
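To make the descriptor-path objective more concrete, here is a minimal NumPy sketch of a hardest-in-batch triplet margin loss. It is not the repository's implementation: the function name, the margin value of 0.5, and the use of Euclidean distance are assumptions chosen for illustration only.

```python
import numpy as np

def hard_triplet_loss(anchors: np.ndarray, positives: np.ndarray, margin: float = 0.5) -> float:
    """Hardest-in-batch triplet margin loss.

    Row i of `anchors` and row i of `positives` describe the same keypoint
    seen in the two images of a pair. The margin is an arbitrary choice.
    """
    # Pairwise Euclidean distances between every anchor and every positive, shape (N, N).
    diff = anchors[:, None, :] - positives[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2) + 1e-8)
    pos = np.diag(dist)  # distance of each true pair
    # Mask the true pairs so they cannot be selected as negatives.
    masked = dist + np.eye(len(dist)) * 1e6
    # Hardest negative: the closest non-matching descriptor in either direction.
    hardest_neg = np.minimum(masked.min(axis=1), masked.min(axis=0))
    return float(np.mean(np.maximum(0.0, margin + pos - hardest_neg)))

# Well-separated descriptors: no triplet violates the margin.
separated = hard_triplet_loss(np.eye(4), np.eye(4))
# Collapsed descriptors (all identical): every triplet violates the margin.
collapsed = hard_triplet_loss(np.ones((4, 3)), np.ones((4, 3)))
```

The loss only penalizes a keypoint when its closest wrong match is not at least `margin` farther away than its true match, which is what pushes the descriptors toward distinctiveness.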

Requirements

  • conda, for the automatic installation.

Installation

Tested on Ubuntu 18, 20, and 22. Clone the repository, and build a fresh conda environment for DALF:

git clone https://github.com/verlab/dalf_cvpr_2023.git
cd dalf_cvpr_2023
conda env create -f environment.yml -n dalf_env
conda activate dalf_env

Manual installation

If you prefer to install the dependencies manually, first install PyTorch (>=1.12.0) and then the rest of the dependencies:

# For GPU (please check your CUDA version)
pip install torch==1.12.0+cu102 torchvision==0.13.0+cu102 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu102
# CPU only
pip install torch==1.12.0+cpu torchvision==0.13.0+cpu torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu

pip install --user numpy scipy opencv-contrib-python kornia

Usage

For your convenience, we provide ready-to-use notebooks for some tasks.

Description Notebook
Matching example Open In Colab
Register a video of deforming object (as shown in the GIF) Open In Colab
Download data and train from scratch Open In Colab

Inference

To run DALF on an image, three lines of code are enough:

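from modules.models.DALF import DALF_extractor as DALF
import torch
import cv2

# Run on the GPU when one is available, otherwise fall back to the CPU
dalf = DALF(dev=torch.device('cuda' if torch.cuda.is_available() else 'cpu'))

# Load the example image shipped with the repository
img = cv2.imread('./assets/kanagawa_1.png')

# Detect keypoints and compute their descriptors in a single call
kps, descs = dalf.detectAndCompute(img)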

Or you can use this script in the root folder:

python3 run_dalf.py
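Once descriptors have been extracted from two images, they still need to be matched. The sketch below shows one common strategy, mutual nearest-neighbor matching, in plain NumPy. It assumes the descriptors can be converted to float arrays of shape (N, D); the exact return types of detectAndCompute are not documented here, so the function name `mutual_nn_match` and the synthetic descriptors are hypothetical stand-ins rather than part of the DALF API.

```python
import numpy as np

def mutual_nn_match(descs1: np.ndarray, descs2: np.ndarray) -> np.ndarray:
    """Return an (M, 2) array of index pairs (i, j) that are mutual nearest neighbors."""
    # Pairwise squared Euclidean distances, shape (N1, N2).
    d2 = (
        np.sum(descs1 ** 2, axis=1)[:, None]
        + np.sum(descs2 ** 2, axis=1)[None, :]
        - 2.0 * descs1 @ descs2.T
    )
    nn12 = np.argmin(d2, axis=1)  # best candidate in image 2 for each descriptor of image 1
    nn21 = np.argmin(d2, axis=0)  # best candidate in image 1 for each descriptor of image 2
    idx1 = np.arange(descs1.shape[0])
    keep = nn21[nn12] == idx1     # keep only pairs that agree in both directions
    return np.stack([idx1[keep], nn12[keep]], axis=1)

# Synthetic stand-in for real descriptors (hypothetical data, not DALF output):
# the second set is a shuffled, slightly noisy copy of the first.
rng = np.random.default_rng(0)
descs_a = rng.normal(size=(50, 128)).astype(np.float32)
perm = rng.permutation(50)
descs_b = descs_a[perm] + 0.01 * rng.normal(size=(50, 128)).astype(np.float32)

matches = mutual_nn_match(descs_a, descs_b)
```

Each row of `matches` indexes into the two keypoint lists, so the matched keypoint coordinates can be looked up directly. The mutual check discards one-sided matches, which tends to trade a few correct matches for fewer false ones.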

Training

DALF can be trained in a self-supervised manner with synthetic warps (see augmentation.py), i.e., a folder of random images is enough for training. In our experiments, we used the raw images (without any annotation) of the 1DSfM datasets, which can be found at this link. To train DALF from scratch on a set of arbitrary images with default parameters, run the following command:

python3 train.py
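The reason unannotated images suffice is that a synthetic warp is known exactly, so it provides ground-truth correspondences for free. The sketch below illustrates that idea with a deliberately simple warp (a rotation plus a sinusoidal displacement). It is not the code in augmentation.py: the function `random_warp`, its parameters, and the 3-pixel threshold are all assumptions made for illustration.

```python
import numpy as np

def random_warp(rng, height, width, max_shift=8.0):
    """Build a random smooth warp: rotation about the image centre plus a sinusoidal shift."""
    angle = rng.uniform(-np.pi, np.pi)
    amp = rng.uniform(0.0, max_shift)
    phase = rng.uniform(0.0, 2 * np.pi)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    centre = np.array([width / 2.0, height / 2.0])

    def warp(points):
        # points: (N, 2) array of (x, y) pixel coordinates in the source image.
        rotated = (points - centre) @ rot.T + centre
        # Non-rigid part: shift x by a sine wave that depends on y.
        shifted = rotated.copy()
        shifted[:, 0] += amp * np.sin(2 * np.pi * rotated[:, 1] / height + phase)
        return shifted

    return warp

rng = np.random.default_rng(0)
warp = random_warp(rng, height=480, width=640)

# Keypoints in the source image and their exact locations after the warp.
kps_src = rng.uniform([0, 0], [640, 480], size=(100, 2))
kps_gt = warp(kps_src)

# Hypothetical detections in the warped image: ground truth plus small localization noise.
kps_dst = kps_gt + rng.normal(scale=0.5, size=kps_gt.shape)

# A keypoint counts as repeatable if it is re-detected within 3 pixels of its true location.
errors = np.linalg.norm(kps_dst - kps_gt, axis=1)
repeatable = errors < 3.0
```

Because `kps_gt` comes from the warp itself, both the repeatability of detections and the identity of matching descriptors can be supervised without any manual labels.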

To train the model, we recommend a machine with a GPU with at least 10 GB of memory and 16 GB of RAM. To train on a GPU with less than 10 GB, you can try reducing the batch size and increasing the number of gradient accumulation steps accordingly. We provide a Colab to demonstrate how to train DALF from scratch: Open In Colab. While it is possible to train the model on Colab, expect it to take more than 48 hours of GPU usage.
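The trade between batch size and gradient accumulation can be checked numerically. The following NumPy sketch uses a toy least-squares model (not the DALF training loop; all names are illustrative) to show that averaging the gradients of several micro-batches reproduces the gradient of one large batch.

```python
import numpy as np

def mse_grad(w, x, y):
    """Gradient of mean((x @ w - y) ** 2) with respect to w."""
    return 2.0 * x.T @ (x @ w - y) / len(x)

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))
y = rng.normal(size=16)
w = np.zeros(3)

# One large batch of 16 samples.
full_grad = mse_grad(w, x, y)

# Four micro-batches of 4 samples each; each gradient is scaled by 1 / accum_steps
# and summed, so memory use is that of a 4-sample batch.
accum_steps = 4
accum_grad = np.zeros_like(w)
for xb, yb in zip(np.split(x, accum_steps), np.split(y, accum_steps)):
    accum_grad += mse_grad(w, xb, yb) / accum_steps

# accum_grad matches full_grad up to floating-point error;
# the optimizer step would be applied once, after the loop.
```

In a PyTorch loop the same pattern corresponds to calling backward on the loss divided by the number of accumulation steps and stepping the optimizer only once per group of micro-batches. The equivalence holds for losses that average over independent samples; terms that depend on the batch contents, such as in-batch negative mining, may behave differently with smaller micro-batches.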

Evaluation

We follow the same protocol and benchmark evaluation of DEAL. You will need to download the non-rigid evaluation benchmark files. Then, run the evaluation script:

sh ./eval/eval_nonrigid.sh

Please update the variables PATH_IMGS and PATH_TPS to point to your downloaded benchmark files before running the evaluation script!

Applications

The image retrieval and non-rigid surface registration applications used in the paper will be released very soon in a new repository focused on application tasks involving local features. Stay tuned!

The video below shows the non-rigid 3D surface registration results from the paper:

Non-rigid 3D registration visual results

Citation

If you find this code useful for your research, please cite the paper:

@INPROCEEDINGS{potje2023cvpr,
  author={Guilherme {Potje} and Felipe {Cadar} and Andre {Araujo} and Renato {Martins} and Erickson R. {Nascimento}},
  booktitle={2023 IEEE / CVF Computer Vision and Pattern Recognition (CVPR)},
  title={Enhancing Deformable Local Features by Jointly Learning to Detect and Describe Keypoints},
  year={2023}}

License

This project is released under the Apache License 2.0.

Acknowledgements

  • We thank Christoph Heindl, and the authors of DISK and HardNet for releasing their code, which inspired our work.
  • We thank the developers of Kornia for developing and releasing the amazing kornia library!
  • We thank the agencies CAPES, CNPq, FAPEMIG, and Google for funding different parts of this work.

VeRLab: Laboratory of Computer Vision and Robotics https://www.verlab.dcc.ufmg.br

dalf_cvpr_2023's Issues

grid plot

Thank you for your great work and for sharing it. I would like to ask about plotting the TPS grid on the source and target images; could you please advise how to do that? I saw a plot_grid() function in the utils.py file, but I can't get it to plot the grid correctly. Many thanks in advance.

3D non rigid registration

Hey team @guipotje, I wanted to compliment you on your excellent work on solving the deformable correspondence problem. I would like to ask about the anticipated release timeline for the 3D non-rigid registration pipeline that uses DALF.
