Hi!
I think you are mistaken regarding the number of training pairs in LoFTR. Could you tell me where you got the number 100 from?
From my understanding they are using the pairs in here: https://drive.google.com/drive/folders/1SrIn9WJ1IuG08yh2nEvIsLftXHLrrIwh
And they load those npz files into this dataset loader: https://github.com/zju3dv/LoFTR/blob/master/src/datasets/megadepth.py
Running the following code:
import numpy as np

sampled_pair_files = [f for f in open("trainvaltest_list/train_list.txt", "r").read().split("\n") if len(f) > 0]
num_pairs = 0
for scene_name in sampled_pair_files:
    scene = np.load(f"scene_info_0.1_0.7/{scene_name}.npz", allow_pickle=True)
    scene_pairs = len(scene['pair_infos'])
    num_pairs = num_pairs + scene_pairs
print(num_pairs)
yields 8862673. So they have around 9 million unique pairs.
For ScanNet they use the same procedure as SuperGlue and end up with 240M pairs.
The main reasons that we don't follow this exact procedure are:
- We believe our approach is more modular and easier to modify for someone looking to improve the sampling or put focus on certain overlaps.
- It is more transparent in how the pairs are sampled exactly.
However, we did not find that our sampling procedure produces better results after training than the original version on the benchmarks.
from dkm.
Well, take it easy.
I just want to explore the influence of the number of training pairs on performance.
I think you did not notice the n_samples_per_subset parameter in sampler.py; it is set to 100 for the MegaDepth datasets.
So LoFTR in fact only uses 15300 pairs of images.
Hi again!
I hadn't seen that detail before :)
From reading their implementation:
https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py#L5
it says the following:
Random sampler for ConcatDataset. At each epoch, n_samples_per_subset samples will be draw from each subset in the ConcatDataset.
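That docstring behaviour can be sketched with plain PyTorch (toy subset sizes; LoFTR's actual sampler also handles shuffling, replacement options, and other details):

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

# Two toy "scenes" (subsets) with 5 and 8 samples each.
scenes = [TensorDataset(torch.arange(5)), TensorDataset(torch.arange(8))]
dataset = ConcatDataset(scenes)

n_samples_per_subset = 3
gen = torch.Generator().manual_seed(66)  # seeded generator, LoFTR-style

epoch_indices = []
low = 0
for high in dataset.cumulative_sizes:  # subset boundaries in concatenated index space: [5, 13]
    # draw n_samples_per_subset indices (with replacement) from this subset's range
    epoch_indices += torch.randint(low, high, (n_samples_per_subset,), generator=gen).tolist()
    low = high

print(len(epoch_indices))  # 6: three indices per subset
```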
I'm guessing they run more than 1 epoch? Hence the correct number should be 368 * 100 * num_epochs?
In fact, they will not resample each epoch, as reload_dataloaders_every_epoch=False in train.py.
If reload_dataloaders_every_epoch=True, the sampler will resample each epoch.
Aha, got it. However, they also use 64 GPUs which I guess means that each GPU gets its own sampler?
My general guess is that they found that the exact specifics of the sampling were not very important for the final performance?
I don't know why they do not resample the training data each epoch; it would be better to ask the authors.
As for the sampling: since self.generator = torch.manual_seed(seed) fixes the generator, it fixes the sampling results, and the sampled indices will be uniformly assigned to each GPU.
Even if each GPU samples by itself, since the generator is fixed, they still get the same sample indices.
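That claim, that identically seeded ranks draw identical indices, can be illustrated with a toy sketch (plain torch; the seed value and index pool are made up, and whether LoFTR actually behaves this way is what the rest of the thread works out):

```python
import torch

def rank_sample(seed, n=5, pool=1000):
    # each "GPU" constructs its own generator, but with the same seed
    gen = torch.Generator().manual_seed(seed)
    return torch.randint(0, pool, (n,), generator=gen).tolist()

rank0 = rank_sample(seed=66)
rank1 = rank_sample(seed=66)
assert rank0 == rank1  # identical seed => identical sample indices on every rank
```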
I find it hard to believe that each GPU would sample the exact same indices, but I'm not completely familiar with their exact sampling. I'll run their code to get a better understanding.
I'll get back to you after I have done this so that we can have a more informed discussion.
Yep. Anyway, I think DKM is a good method given its impressive results; I like it.
One more question: could I use DKM on 1920*1080 images when I trained it on another size, like 520*720?
Yes. We don't have a perfectly clean way of doing it, but there are two alternatives:
- Set the internal dimensions (we always resize to a fixed size, so you can change this resolution as you desire; note, however, that the method may become quite slow for large images)
- Keep the internal dimensions at (540, 720) but upsample the prediction by rerunning the final layer
In the model zoo:
DKM/dkm/models/model_zoo/__init__.py
Lines 18 to 26 in 5b28266
you can see some API for changing these variables. However, right now the "upsample_preds" variable is auto-set to use (864, 1152), see:
Line 685 in 5b28266
You can change this hardcoding so that it's settable. If you were to change it, please submit a pull request so I can update the code. There is a lot of mess, so I'd appreciate it.
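For the second alternative, the core idea of predicting densely at the internal resolution and then upsampling the dense outputs to the target resolution can be sketched with plain torch (the shapes are illustrative; this is not DKM's actual API):

```python
import torch
import torch.nn.functional as F

# Dense warp (2 coordinate channels) and certainty predicted at the internal resolution.
warp = torch.rand(1, 2, 540, 720) * 2 - 1   # normalized coordinates in [-1, 1]
certainty = torch.rand(1, 1, 540, 720)

# Upsample both to the desired output resolution, e.g. 1080x1920.
warp_up = F.interpolate(warp, size=(1080, 1920), mode="bilinear", align_corners=False)
certainty_up = F.interpolate(certainty, size=(1080, 1920), mode="bilinear", align_corners=False)

print(tuple(warp_up.shape))  # (1, 2, 1080, 1920)
```

Note that if the warp is stored in normalized [-1, 1] coordinates, its values need no rescaling when upsampling; a warp in pixel coordinates would additionally need multiplying by the size ratio.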
Wow, thank you so much.
My questions are addressed.
Thank you.
🌹
@noone-code
Ok, I ran LoFTR's training code and here's how I think it works:
They use RandomConcatSampler : https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py
When using distributed training they split the scenes over the GPUs; each GPU gets 384//world_size scenes. The GPUs are initialized with a seeded generator. Their iter method, defined here:
samples 100 pairs for each scene and shuffles them. This defines 1 epoch for each worker.
The next epoch this is done again. Note that the module is not reinitialized; hence the state of the generator is different from the first epoch. Therefore, the 100 pairs in the second epoch will be different from those in the first.
I think I got this correct; I went through their code by debugging the train.py function on two GPUs using their standard outdoor training setting. However, I have not actually run a full epoch yet, so I might have misunderstood something. However, if you look at their comment:
https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py#L15
This leads me to believe that they are aware of the potential issue with repeated samples, and therefore make sure not to reinitialize it.
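The epoch-to-epoch behaviour described above can be mimicked in a few lines (toy numbers; assuming the sampler object, and hence its generator, lives across epochs):

```python
import torch

class TinySampler:
    """Mimics the relevant property of LoFTR's RandomConcatSampler:
    one seeded generator, created once, whose state persists across epochs."""
    def __init__(self, n_scenes, n_samples_per_scene, seed=66):
        self.gen = torch.Generator().manual_seed(seed)
        self.n_scenes = n_scenes
        self.n = n_samples_per_scene

    def epoch(self):
        # one "epoch": draw n pair indices per scene; the generator state advances
        return [torch.randint(0, 10_000, (self.n,), generator=self.gen).tolist()
                for _ in range(self.n_scenes)]

sampler = TinySampler(n_scenes=2, n_samples_per_scene=3)
epoch1 = sampler.epoch()
epoch2 = sampler.epoch()  # generator state has advanced, so these are (almost surely) new pairs

# Reinitializing the sampler each epoch (what LoFTR avoids) would repeat epoch 1 exactly:
assert TinySampler(2, 3).epoch() == epoch1
```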
Please let me know if I got anything wrong, it might be the case that I misunderstood something.
I found the code local_npz_names = get_local_split(npz_names, self.world_size, self.rank, self.seed).
I agree that LoFTR assigns different scenes to each GPU, and each GPU samples 100 image pairs each epoch.
So, actually, LoFTR uses up to 384 (scenes) * 100 (samples per scene) * 30 (epochs) = 1,152,000 image pairs?
Is that correct?
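The arithmetic in that estimate (under the thread's assumptions of 384 scenes, 100 samples per scene, and 30 epochs) checks out:

```python
scenes, samples_per_scene, epochs = 384, 100, 30
total = scenes * samples_per_scene * epochs
print(total)  # 1152000
```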
I think so! (but not sure)
There is some potential that our sampling may yield slightly better (or worse) results compared to theirs if used in DKM. Of course, our method has been developed using our sampling and theirs with theirs, so it might be the case that both would degrade using the other's sampling ;)
In conclusion, I would say that since they do sample quite a lot of pairs, they are comparable to us; however, it would of course be interesting to investigate a bit more deeply how to sample good pairs for training feature matchers.
Yes, finally, thank you so much.