cvg / lightglue
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
License: Apache License 2.0
Can someone make this run in Google Colab?
Hi, you did a good job, this repo is really helpful! I wonder whether SuperPoint+LightGlue has any licensing issues for commercial use. Does it infringe the copyright of Magic Leap's SuperPoint (https://github.com/magicleap/SuperPointPretrainedNetwork/blob/master/LICENSE)? What about the end-to-end (SuperPoint+LightGlue) ONNX model from https://github.com/fabio-sim/LightGlue-ONNX?
Hi, I modified your code to support batch mode. But when I tested on 40 pairs of images, I found that batch mode is slower (0.57 s) than matching them one by one (0.48 s). Do you know what might cause this?
Hi,
I trained a neural network that provides (x, y) coordinates of objects and descriptors for them (like SuperPoint).
When I use GT points and LightGlue, the matching works excellently. But when I use (x, y) estimates, LightGlue filters matches heavily (in my example, I have 264 detected points, but LightGlue returns 110 matches).
It is worth noting that the objects can move slightly between frames.
I set the parameters:
depth_confidence: -1
width_confidence: -1
filter_threshold: 1e-5
Is it possible to extend the algorithm for this purpose? Could you suggest any changes/improvements for such a task?
Hello, thank you for publishing the great code. Could you please also share the SIFT+LightGlue code?
Hello, this is a very helpful project. I am trying to get the camera pose from image correspondences, but when I feed a whole video as frames, some of the frames get 0 or fewer than 3 matches.
`
from lightglue import LightGlue, SuperPoint
from lightglue.utils import rbd

def match_keypoints(image0, image1):  # wrapper name added here so the `return` is valid
    extractor = SuperPoint(max_num_keypoints=2048)
    matcher = LightGlue(features='superpoint')
    feats0 = extractor.extract(image0)
    feats1 = extractor.extract(image1)
    matches01 = matcher({'image0': feats0, 'image1': feats1})
    feats0, feats1, matches01 = [rbd(x) for x in [feats0, feats1, matches01]]  # remove batch dimension
    kpts0, kpts1, matches = feats0['keypoints'], feats1['keypoints'], matches01['matches']
    # keep only the keypoints that take part in a match
    m_kpts0, m_kpts1 = kpts0[matches[..., 0]], kpts1[matches[..., 1]]
    return m_kpts0.cpu(), m_kpts1.cpu()
`
When will this be released as the ETA was July?
@Phil26AT @skydes @ducha-aiki @yusufaydin0797
Hi 👋🏻 Is there anyone working on HF space for that model? If not, I can contribute.
Thanks for sharing this cool repo! I've been getting great results with image pairs where "up is up." However, I'm curious about cases where there's a 90-degree or even 180-degree rotation. It seems to work when we manually rectify those images before applying the technique. But I'm wondering if there's a way to do this without the manual step.
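One way to automate the manual rectification described above is to try the four 90-degree rotations and keep the one that yields the most matches. This is only a sketch: `count_matches` is a hypothetical callable (not part of LightGlue) that would wrap your extractor and matcher and return the number of matches for an image pair, and images are assumed to be NumPy arrays.

```python
import numpy as np

def best_rotation(image0, image1, count_matches):
    """Try rotating image1 by k * 90 degrees and keep the k with the most matches.

    count_matches is a hypothetical user-supplied function:
    count_matches(image_a, image_b) -> int
    """
    best_k, best_n = 0, -1
    for k in range(4):
        rotated = np.rot90(image1, k)  # rotates the first two (H, W) axes
        n = count_matches(image0, rotated)
        if n > best_n:
            best_k, best_n = k, n
    return best_k, best_n
```

Keypoints found on the rotated image then need to be rotated back into the original frame before any geometry is estimated. This costs four matching passes per pair.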
Great work!
Do you have an idea why the accuracy gets better? Is it because you exclude the irrelevant points so that the remaining ones are not affected? Are there any other potential reasons?
Hello, first of all thank you for the great work and for making the license permissive; it will surely boost research in image matching!
I am trying to reproduce SuperPoint + MNN as a baseline. For that, I follow the protocol of the paper, trying to achieve results as close as possible to values reported in Table 2 of the LightGlue paper. I am doing the following steps:
ransac_thr = 1.5
{'auc@5': 0.251782299270867, 'auc@10': 0.3987322068921645, 'auc@20': 0.5415882032042043}
I also attempted to run LO-RANSAC instead of cv2.RANSAC, since it gives a large boost in AUC in Table 2, but without success. I tested the implementations from both pydegensac and cv2's USAC methods, but the results are far from the reported AUC@5 of 0.51, across several inlier thresholds and flags. Could you kindly provide more details on the SuperPoint parameters, RANSAC implementation, and hyperparameters used to achieve these results, specifically for SuperPoint + MNN matching (Table 2)?
Thank you in advance!
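For anyone comparing numbers like the auc@5/10/20 values above, here is a minimal sketch of one common way to turn per-pair pose errors (in degrees) into AUC values. It is not confirmed to be the evaluation code used for the paper; the function name and the exact integration scheme are assumptions.

```python
import numpy as np

def pose_auc(errors, thresholds):
    """Area under the recall-vs-error curve, normalized by each threshold.

    errors: non-empty list of per-pair pose errors in degrees.
    thresholds: positive thresholds in degrees, e.g. (5, 10, 20).
    """
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    # Start the curve at (0, 0).
    errors = np.concatenate([[0.0], errors])
    recall = np.concatenate([[0.0], recall])
    aucs = []
    for t in thresholds:
        last = np.searchsorted(errors, t)
        # Cut the curve at the threshold, holding the last recall value.
        e = np.concatenate([errors[:last], [t]])
        r = np.concatenate([recall[:last], [recall[last - 1]]])
        # Trapezoidal integration.
        area = np.sum((e[1:] - e[:-1]) * (r[1:] + r[:-1]) / 2.0)
        aucs.append(area / t)
    return aucs
```

Small differences in how the per-pair error is defined (for example, the maximum of the rotation and translation angular errors) can shift the AUC noticeably, so it is worth checking that part of the protocol too.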
It's a very impressive result. It surpasses previous SOTA methods by a lot. Congrats, and thank you for sharing.
The code for m1 may contain a bug: the shape of attn10 is 'b h j i', so the shape of attn10.transpose(-2, -1) is 'b h i j'. Can you help check that?
qk0, qk1 = qk0 * self.scale**0.5, qk1 * self.scale**0.5
sim = torch.einsum('b h i d, b h j d -> b h i j', qk0, qk1)
attn01 = F.softmax(sim, dim=-1)
attn10 = F.softmax(sim.transpose(-2, -1).contiguous(), dim=-1)
m0 = torch.einsum('bhij, bhjd -> bhid', attn01, v1)
m1 = torch.einsum('bhji, bhjd -> bhid', attn10.transpose(-2, -1), v0)
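A quick way to check the m1 line is to compare it numerically against the plain product attn10 @ v0. The sketch below uses NumPy instead of torch (an assumption made to keep it self-contained) with random inputs and made-up sizes. The letters in an einsum string are only labels for axes, so in 'bhji' the letter j names the second-to-last axis of the transposed tensor, whatever it was called when the tensor was created.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
b, h, n0, n1, d = 1, 2, 5, 7, 4  # made-up sizes: n0 / n1 keypoints per image
qk0 = rng.normal(size=(b, h, n0, d))
qk1 = rng.normal(size=(b, h, n1, d))
v0 = rng.normal(size=(b, h, n0, d))

sim = np.einsum('bhid,bhjd->bhij', qk0, qk1)          # (b, h, n0, n1)
attn10 = softmax(np.swapaxes(sim, -2, -1), axis=-1)   # (b, h, n1, n0)

# Same expression as in the issue, with swapaxes in place of transpose.
m1_issue = np.einsum('bhji,bhjd->bhid', np.swapaxes(attn10, -2, -1), v0)
# Plain attention: each image-1 keypoint aggregates the values of image 0.
m1_plain = attn10 @ v0

assert np.allclose(m1_issue, m1_plain)
```

Under these assumptions the two results agree, which suggests the labels are confusing rather than wrong, but the maintainers are the ones to confirm the intent.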
The SuperPoint descriptor is 256-dimensional, which is too big for my application, but the SuperPoint (u, v) keypoints are the best for me. Could you provide another descriptor (maybe 64-dimensional) to train LightGlue with?
Hi,
Thanks for your great work~
I'm using the LightGlue pretrained weights to get keypoints from a template image and an input image, and then I align the input image with the template using those keypoints. I noticed that when the input image is rotated (by 90, 180, or 270 degrees), the aligned output image is distorted, so the alignment seems to fail. However, when I use OpenCV SIFT to get the keypoints, the aligned output image is good.
Keypoints obtained with SIFT:
sift = cv2.SIFT_create(contrastThreshold=0.02)
...
kp2, des2 = sift.detectAndCompute(target_img, None)
Alignment:
# m_kpts1-> template image, m_kpts0 -> input image
M, mask = cv2.findHomography(m_kpts1, m_kpts0, cv2.RANSAC, 5.0)
M_r = np.linalg.inv(M)
aligned_img = cv2.warpPerspective(src_img, M_r, (template_w, template_h))
Do you know whether the problem is that the LightGlue pretrained weights do not work well for rotated images, or whether there is something wrong with my alignment code? Any suggestions are appreciated.
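To rule out the alignment step itself, it can help to fit the homography in the direction you actually warp (input to template), so no inverse is needed, and test it on points with a known transform. The sketch below is a plain DLT fit in NumPy with no RANSAC, so it is only meant for sanity checks on clean correspondences; the function name is made up for this example.

```python
import numpy as np

def fit_homography_dlt(src, dst):
    """Fit H such that dst ~ H @ src (homogeneous), from >= 4 point pairs.

    Plain direct linear transform, no outlier rejection.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

With real matches you would call it as fit_homography_dlt(input_points, template_points) and warp the input image with the result. If this works on SIFT matches but not on the LightGlue matches, the problem is more likely in the matches for rotated inputs than in the warping code.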
As the title asks: can I stitch the keypoints and descriptors of multiple images together as if they were one image and match them against a new image? Will this affect the matching performance?
At present, when I do this, the keypoint_scores are very low, and with filter_threshold=0.1 very few keypoints are left.
Looking forward to your answer.
Hi,
I tested the speed and match results in batch mode and non-batch mode and found that, although batch mode is faster, its accuracy is worse than non-batch mode. I compared the matched points' coordinates on each image between the two modes and found that most are the same, but some differ.
I matched a query image against 50 similar images and printed the number of matched pairs in each mode:
batch matched points num: 179
non-batch matched points num: 179
batch matched points num: 109
non-batch matched points num: 107
batch matched points num: 107
non-batch matched points num: 106
batch matched points num: 124
non-batch matched points num: 117
batch matched points num: 111
non-batch matched points num: 113
batch matched points num: 138
non-batch matched points num: 140
batch matched points num: 125
non-batch matched points num: 125
batch matched points num: 136
non-batch matched points num: 129
batch matched points num: 110
non-batch matched points num: 108
batch matched points num: 126
non-batch matched points num: 127
batch matched points num: 141
non-batch matched points num: 135
batch matched points num: 137
non-batch matched points num: 130
batch matched points num: 157
non-batch matched points num: 157
batch matched points num: 129
non-batch matched points num: 126
batch matched points num: 93
non-batch matched points num: 93
batch matched points num: 115
non-batch matched points num: 113
batch matched points num: 71
non-batch matched points num: 106
batch matched points num: 53
non-batch matched points num: 128
batch matched points num: 70
non-batch matched points num: 132
batch matched points num: 57
non-batch matched points num: 76
batch matched points num: 87
non-batch matched points num: 106
batch matched points num: 68
non-batch matched points num: 119
batch matched points num: 85
non-batch matched points num: 76
batch matched points num: 58
non-batch matched points num: 96
batch matched points num: 87
non-batch matched points num: 75
batch matched points num: 121
non-batch matched points num: 150
batch matched points num: 73
non-batch matched points num: 85
batch matched points num: 89
non-batch matched points num: 128
batch matched points num: 79
non-batch matched points num: 133
batch matched points num: 125
non-batch matched points num: 112
batch matched points num: 67
non-batch matched points num: 118
batch matched points num: 75
non-batch matched points num: 114
batch matched points num: 67
non-batch matched points num: 45
batch matched points num: 83
non-batch matched points num: 97
batch matched points num: 98
non-batch matched points num: 168
batch matched points num: 62
non-batch matched points num: 85
batch matched points num: 94
non-batch matched points num: 101
batch matched points num: 106
non-batch matched points num: 82
batch matched points num: 88
non-batch matched points num: 80
batch matched points num: 32
non-batch matched points num: 42
batch matched points num: 95
non-batch matched points num: 113
batch matched points num: 98
non-batch matched points num: 180
batch matched points num: 88
non-batch matched points num: 101
batch matched points num: 51
non-batch matched points num: 109
batch matched points num: 77
non-batch matched points num: 114
batch matched points num: 85
non-batch matched points num: 99
batch matched points num: 64
non-batch matched points num: 62
I also checked these matched pairs and found that non-batch mode is more accurate. Do you know why this happens? All other parameters are the same in the two tests.
Thank you!
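To quantify how much the two modes disagree beyond the raw counts, the match lists can be compared as sets of index pairs. A minimal sketch in plain Python; the function name is made up, and it assumes each match is a pair of integer keypoint indices (convert tensors with .tolist() first) and that both runs use the same keypoints in the same order.

```python
def match_overlap(matches_a, matches_b):
    """Compare two match lists given as iterables of (idx0, idx1) integer pairs."""
    set_a = {tuple(m) for m in matches_a}
    set_b = {tuple(m) for m in matches_b}
    common = set_a & set_b
    union = set_a | set_b
    return {
        "common": len(common),
        "only_a": len(set_a - set_b),
        "only_b": len(set_b - set_a),
        # Jaccard index: 1.0 means identical match sets.
        "jaccard": len(common) / len(union) if union else 1.0,
    }
```

This does not explain the cause, but it shows whether the later pairs in the log lose matches that non-batch mode finds or get entirely different ones.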
Hello, is there any scoring to the points matched?
pred = match_pair(extractor, matcher, prev_frame_t, frame_t)
'keypoints0'
'keypoint_scores0'
'descriptors0'
'keypoints1'
'keypoint_scores1'
'descriptors1'
'image0'
'image1'
'log_assignment'
'matches0'
'matches1'
I see that pred contains matching_score, but I can't find any info on what that actually is. What I would like to do is filter out potentially bad or low-confidence matches, for example on white walls with a low amount of features.
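If a per-match confidence is available, low-confidence matches can be dropped with a boolean mask. The sketch below assumes you have an (N, 2) array of match indices and an aligned (N,) array of scores as NumPy arrays; which key of pred holds those scores is not shown in the list above, so treat that part as something to verify in your version of the code. The threshold value is arbitrary.

```python
import numpy as np

def filter_matches(matches, scores, min_score=0.5):
    """Keep only the matches whose score is at least min_score.

    matches: (N, 2) integer array of keypoint index pairs.
    scores:  (N,) float array aligned with matches (assumed to exist).
    """
    keep = scores >= min_score
    return matches[keep], scores[keep]
```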
I'd love to be able to generate a heatmap to overlay on each of the images used in a comparison to visualize the distribution of keypoints and highlight significant areas.
I don't have much experience handling torch.Tensor outputs, but I see the data below for an image comparison
kpts0[:3] - tensor([[1012.0620, 455.7500], [1371.6401, 755.7500], [1226.8101, 123.2500]])
kpts1[:3] - tensor([[ 382.5601, 11.8231], [ 412.9731, 11.8231], [ 502.7641, 11.8231]])
matches[:3] - tensor([[ 0, 794], [ 1, 1017], [ 2, 363]])
Can you suggest the best way to return (x,y) pixel values of keypoints for each image?
Thanks! This model & paper are really excellent. :)
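Regarding the (x, y) values and the heatmap: each row of matches is a pair of indices into kpts0 and kpts1, so the matched pixel coordinates come from plain indexing, and a coarse heatmap can be built with a 2D histogram. A sketch in NumPy, assuming the tensors have been converted first (for example with .cpu().numpy()); the function names and bin counts are made up.

```python
import numpy as np

def matched_xy(kpts0, kpts1, matches):
    """Return the (x, y) coordinates of matched keypoints in each image."""
    return kpts0[matches[:, 0]], kpts1[matches[:, 1]]

def keypoint_heatmap(xy, width, height, bins=(32, 32)):
    """Count keypoints per cell; rows follow y and columns follow x."""
    heat, _, _ = np.histogram2d(
        xy[:, 1], xy[:, 0], bins=bins, range=[[0, height], [0, width]]
    )
    return heat
```

The resulting array can be upsampled to the image size and overlaid with any plotting library.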
Hi,
It was unclear to me whether your early exit scheme also supports the possibility of rejecting a pair of frames to match at all. Is this something you've looked at? If not would it be a straightforward extension, or do you see caveats?
Hi, great work, how can I calculate the difference of keypoints of two images and backpropagate it as a loss
Hi,
First of all, thanks for this excellent work.
When I use LightGlue for matching, I found that the input image is resized to 1024 in the default config, and I would like to know the reason for this. I found that changing the image size may offset the feature point positions, and when I tried sparse reconstruction based on the matching results, this may have increased the mean reprojection error.
Thank you
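If keypoints are detected on a resized image, they have to be mapped back to the original resolution before reconstruction. A sketch of that mapping, assuming a pixel-center convention (the center of the first pixel is at 0.0) and a scale defined as resized size divided by original size; whether your pipeline already applies such a correction is something to check in the extractor code.

```python
import numpy as np

def to_original_coords(kpts_resized, scale_xy):
    """Map keypoints from a resized image back to original-image pixels.

    kpts_resized: (N, 2) array of (x, y) in the resized image.
    scale_xy: resized_size / original_size, a scalar or an (x, y) pair.
    """
    scale_xy = np.asarray(scale_xy, dtype=float)
    # Shift to pixel corners, undo the scaling, shift back to pixel centers.
    return (np.asarray(kpts_resized, dtype=float) + 0.5) / scale_xy - 0.5
```

Skipping the half-pixel shift gives an offset that grows with the resize factor, which could look like the position drift described above.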
Hi,
I have created a PR to kornia: kornia/kornia#2436
Please tell me, if you have any feedback or objections.
Best, Dmytro
Hi there
Thanks for your great work!
I am a new student in machine learning and am facing difficulties in retraining the model due to issues with data and loss.
In section 4, "Details that Matter," the paper discusses using the "Revisiting Oxford and Paris" dataset for pre-training the model. However, since this dataset was designed for retrieval purposes, I am uncertain how to incorporate it into the retraining of the model.
How can I calculate the matching error between two images after obtaining SuperPoint feature points?
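When a ground-truth homography between the two images is known (as in planar benchmarks), one simple matching error is the reprojection distance of each match. A sketch in NumPy; it assumes matched points as (N, 2) arrays and a 3x3 matrix H mapping image 0 to image 1, and it is only one of several possible error definitions.

```python
import numpy as np

def reprojection_error(pts0, pts1, H):
    """Pixel distance between H-projected pts0 and their matches pts1."""
    pts0 = np.asarray(pts0, dtype=float)
    pts1 = np.asarray(pts1, dtype=float)
    homog = np.hstack([pts0, np.ones((len(pts0), 1))])  # to homogeneous coords
    proj = homog @ np.asarray(H, dtype=float).T
    proj = proj[:, :2] / proj[:, 2:3]                   # back to pixels
    return np.linalg.norm(proj - pts1, axis=1)
```

A match is then typically counted as correct if its error is below a few pixels. Without a known homography, an epipolar error from the relative pose would be needed instead.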
Hit the subscribe button on the right of this issue if you wish to be notified of the training and evaluation code release in a separate repo. Please do not reply to this issue to not spam other subscribers. Please do not contact us to ask for early-access to the code.
ETA: July 2023
Thank you for your work. Because the training code is not available, I am currently unable to swap in a different extractor myself. I wonder if you are interested in testing ALIKED as a feature extraction module.
In my experiment, LightGlue + SuperPoint is better than LightGlue + DISK, which is not consistent with the LightGlue paper.
Hello,
I was wondering whether LightGlue could be run in batch mode. I have to match a large number of images, and the bottleneck right now is inference time.
Thank you for your work!
Hi! Thank you for making this available. Do you have any plans to create a TensorRT C++ inference example? Or are you aware of one?
Thanks!
The paper uses two datasets. Can you share them, so that I can start training as soon as you release the training code?
The resize_image() function in utils doesn't take a 'grayscale' argument, which explains the error "resize_image() got an unexpected keyword argument 'grayscale'". The only arguments this function takes are "image", "size", "fn", and "interp". If you want to convert the image to grayscale, you'll need to do it either before or after calling resize_image().
/content/LightGlue/lightglue/utils.py in load_image(path, resize, **kwargs)
119 image = read_image(path)
120 if resize is not None:
--> 121 image, _ = resize_image(image, resize, **kwargs)
122 return numpy_image_to_torch(image)
123
TypeError: resize_image() got an unexpected keyword argument 'grayscale'
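As a workaround along the lines suggested above, the image can be converted to grayscale separately from the resize call. A minimal NumPy sketch, assuming an H x W x 3 RGB array; the function name and the luma weights (the common ITU-R BT.601 ones) are choices made for this example, not something taken from the repo.

```python
import numpy as np

def rgb_to_grayscale(image):
    """Convert an (H, W, 3) RGB array to an (H, W) grayscale array."""
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights
    return np.asarray(image, dtype=float)[..., :3] @ weights
```

Then pass the result on without the grayscale keyword, so that resize_image() no longer receives an argument it does not accept.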
I want to know the matching accuracy of each matched point, so I need the matching distances, but I get an error because there are no matching distances in 'matches01'. Do you know how to solve this?
I tested LightGlue and SuperGlue on both CPU and GPU. In both cases SuperGlue takes less time, but in the paper LightGlue is faster. I would like to know why there is such a contradiction. My CPU: Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz, GPU: RTX 2080Ti
Hi! Thanks for your great work! But I don't quite understand some parts of the code:
In C.1. Architecture, Confidence classifier: "... and its gradients are not propagated into the states to avoid impacting the matching accuracy." Why would the classifier's gradients impact the matching accuracy? Have you conducted experiments on this?
Why assert mask is None when using FlashCrossAttention(), which is self-installed? (FlashCrossAttention also supports a mask parameter.)
In CrossBlock, when flash is enabled, you do not actually use bidirectional cross attention, right? If so, what's the reason?
Why not compute scores1 by scores1 = F.log_softmax(sim, 1) instead of the more complicated way (scores1 = F.log_softmax(sim.transpose(-1, -2).contiguous(), 2).transpose(-1, -2))?
Thanks for your time! Looking forward to your reply!
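On the last question, the two scores1 expressions can at least be checked for numerical equivalence. The sketch below does that in NumPy (standing in for torch, with a hand-written log-softmax and made-up sizes) for a (batch, m, n) similarity tensor. It says nothing about why the longer form was chosen; memory layout or speed may be the reason, but that is a guess.

```python
import numpy as np

def log_softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

rng = np.random.default_rng(0)
sim = rng.normal(size=(2, 5, 7))  # (batch, m, n), made-up sizes

# Direct form: normalize over axis 1 (the m keypoints).
direct = log_softmax(sim, 1)

# Longer form from the code: transpose, normalize over the last axis, transpose back.
transposed = np.ascontiguousarray(np.swapaxes(sim, -1, -2))   # (batch, n, m)
roundabout = np.swapaxes(log_softmax(transposed, 2), -1, -2)  # (batch, m, n)

assert np.allclose(direct, roundabout)
```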
Tried running the example script but I get this strange error when trying to import LightGlue
from lightglue import LightGlue
..
.. <stack trace>
TypeError: 'numpy._DTypeMeta' object is not subscriptable
Seems to be using some typed implementation of numpy? Any guidance would be greatly appreciated.
Hi, thanks for posting your awesome code. Did you supervise the loss on the matching scores during the training stage?
Hi,
Thank you and the matching results are very good. :)
Earlier I used SIFT for feature extraction and cv.detail_BestOf2NearestMatcher() for matching. Then I used cv.detail_HomographyBasedEstimator() to estimate K, R and T.
How can I achieve this using LightGlue results? Could you please help me with calculating K, R and T from the LightGlue results?
Thank you
I know that this is probably not the primary focus of code/repository.
But I found LightGlue very efficient in my use case, and I would like to merge multiple flat images (slices of a marble/granite slab).
I don't have homography issues, just a little light correction and blending.
Which is the preferred approach?
Seeing all the points is overwhelming. Can I change it to show only the top 10? A random 10? Only the blue ones?
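One way to thin out a plot is to subsample the matches before drawing them. A sketch in NumPy, assuming an (N, 2) array of match indices and, for the top-k variant, an aligned (N,) array of per-match scores; both function names are made up, and the plotting call itself is left out.

```python
import numpy as np

def top_k_matches(matches, scores, k=10):
    """Keep the k matches with the highest scores."""
    order = np.argsort(-scores)[:k]
    return matches[order], scores[order]

def random_k_matches(matches, k=10, seed=0):
    """Keep k matches chosen uniformly at random, without replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(matches), size=min(k, len(matches)), replace=False)
    return matches[idx]
```

The selected rows can then be used to index the keypoints as usual before passing them to the visualization code.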
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

extractor = SuperPoint(max_num_keypoints=2048).eval().cuda()
matcher = LightGlue(features='superpoint').eval().cuda()
image0 = load_image('path/assets/DSC_0410.jpg').cuda()
image1 = load_image('path/assets/DSC_0411.jpg').cuda()
feats0 = extractor.extract(image0)
feats1 = extractor.extract(image1)
matches01 = matcher({'image0': feats0, 'image1': feats1})
feats0, feats1, matches01 = [rbd(x) for x in [feats0, feats1, matches01]]  # remove batch dimension
kpts0 = feats0['keypoints']
kpts1 = feats1['keypoints']
print("kpts0 = ", kpts0.shape)
print("kpts1 = ", kpts1.shape)
matches = matches01['matches']
print("matches = ", matches.shape)
kpts0 = torch.Size([2048, 2])
kpts1 = torch.Size([2048, 2])
matches = torch.Size([0, 2])
I would appreciate it if you could provide the test code for the HPatches dataset. By the way, I cannot find any test code for public datasets.
I suppose batch support is not yet implemented?
The inference speed is nice but it needs to be batched in order to use 100% GPU
Hi there,
Thanks for this great work.
In class LightGlue(nn.Module), the images are not mentioned as required_data_keys. But later on they are needed in the forward function:
kpts0 = normalize_keypoints(
kpts0_, size=data.get('image_size0'), shape=data['image0'].shape)
kpts1 = normalize_keypoints(
kpts1_, size=data.get('image_size1'), shape=data['image1'].shape)
Thank you very much for your work.
I have some questions about IMC2023. I want to switch my pipeline to LightGlue. My current feature matching uses SuperPoint and SuperGlue, with which I get 0.65 on the heritage_dioscuri scene.
But when I switch to SuperPoint and LightGlue, the score is 0.48. I am very confused by this large decline.
It is strange because the scores improved on other scenes.
Both setups use the same settings: images resized to 1600 and 2048 SuperPoint keypoints.
Thank you
I now have the camera intrinsic matrix, the 3D model of the target object, and an RGB image of the target object. My goal is to estimate the pose matrix of the object in the camera coordinate system. I currently get the target object's bbox from an object detection model and use the PnP method. Can I use this project to obtain the pose of the target object more accurately?
Hello!
First of all, I would like to applaud your work in pushing the envelope on SOTA local feature matching with LightGlue!
I've actually made an ONNX-compatible version at https://github.com/fabio-sim/LightGlue-ONNX. It'd be great if you could kindly add a link to it in your readme :)
With ONNX, however, come some caveats (e.g., difficulty in exporting dynamic control flow). Do let me know if you've got any ideas to support early stopping & adaptive point pruning in ONNX Runtime. Have a good day!
I would like to train / test LightGlue with other feature extractor models (like R2D2 and SIFT). Can you please publish the training code?
Hi!
Your ablation experiments demonstrate the excellent performance of relative position encoding. However, I have two questions:
Looking forward for your reply!