
Comments (15)

Parskatt commented on May 30, 2024

Hi!

I think you are mistaken regarding the number of training pairs in LoFTR. Could you tell me where you get the number 100 from?
From my understanding they are using the pairs in here: https://drive.google.com/drive/folders/1SrIn9WJ1IuG08yh2nEvIsLftXHLrrIwh

And they load those npz files into this dataset loader: https://github.com/zju3dv/LoFTR/blob/master/src/datasets/megadepth.py

Running the following code:

import numpy as np

# Count the total number of training pairs across all MegaDepth scenes
sampled_pair_files = [f for f in open("trainvaltest_list/train_list.txt", "r").read().split("\n") if len(f) > 0]
num_pairs = 0
for scene_name in sampled_pair_files:
    scene = np.load(f"scene_info_0.1_0.7/{scene_name}.npz", allow_pickle=True)
    scene_pairs = len(scene['pair_infos'])
    num_pairs = num_pairs + scene_pairs
print(num_pairs)

yields 8862673. So they have around 9 million unique pairs.
For Scannet they use the same procedure as SuperGlue and end up with 240M pairs.

The main reasons we don't follow this exact procedure are:

  1. We believe our approach is more modular and easier to modify for someone looking to improve the sampling, or to put focus on certain overlaps.
  2. It is more transparent in exactly how the pairs are sampled.

However, we did not find that our sampling procedure produces better results after training than the original version on the benchmarks.

from dkm.

noone-code commented on May 30, 2024

Well, take it easy.
I just want to explore the influence of the amount of training data on performance.
I think you may have missed the n_samples_per_subset parameter in sampler.py; it is set to 100 for the MegaDepth dataset.
So LoFTR actually only uses 15300 image pairs.


Parskatt commented on May 30, 2024

Hi again!
I hadn't seen that detail before :)

From reading their implementation:
https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py#L5
it says the following:

Random sampler for ConcatDataset. At each epoch, n_samples_per_subset samples will be draw from each subset
in the ConcatDataset.

I'm guessing they run more than one epoch? Hence the correct number should be 368 * 100 * num_epochs?


noone-code commented on May 30, 2024

In fact, they do not resample each epoch, since reload_dataloaders_every_epoch=False in train.py.
If reload_dataloaders_every_epoch=True, the sampler would resample every epoch.


Parskatt commented on May 30, 2024

Aha, got it. However, they also use 64 GPUs which I guess means that each GPU gets its own sampler?


Parskatt commented on May 30, 2024

My general guess is that they found that the exact specifics of the sampling were not very important for the final performance?


noone-code commented on May 30, 2024

I don't know why they do not resample the training data each epoch; it would be better to ask the authors.
As for the sampling: since self.generator = torch.manual_seed(seed) fixes the generator, it fixes the sampling results, and the sampled indices are assigned uniformly to each GPU.
Even if each GPU samples by itself, the generator is fixed, so they still get the same sample indices.
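As a minimal sketch of this claim (a standalone toy example, not LoFTR's actual sampler code): two processes that seed a generator with the same value draw identical index sequences.

```python
import torch

# Two "GPUs" each build a generator seeded with the same value.
# Note: LoFTR's code uses torch.manual_seed(seed), which returns the
# global default generator; torch.Generator() is used here for isolation.
g0 = torch.Generator().manual_seed(66)
g1 = torch.Generator().manual_seed(66)

idx0 = torch.randperm(1000, generator=g0)
idx1 = torch.randperm(1000, generator=g1)

# With identical seeds, both processes draw exactly the same indices.
print(torch.equal(idx0, idx1))  # True
```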


Parskatt commented on May 30, 2024

I find it hard to believe that each GPU would sample the exact same indices, but I'm not completely familiar with their exact sampling. I'll run their code to get a better understanding.

I'll get back to you after I have done this so that we can have a more informed discussion.


noone-code commented on May 30, 2024

Yep, anyway, I think DKM is a good method given its impressive results; I like it.
One more question: can I use DKM on 1920*1080 images when it was trained on another size, like 520*720?


Parskatt commented on May 30, 2024

Yes. We don't have a perfectly clean way of doing it, but there are two alternatives:

  1. Set the internal dimensions (we always resize to a fixed size, so you can change this resolution as desired; note however that the method may become quite slow for large images).
  2. Keep the internal dimensions at (540, 720) but upsample the prediction by rerunning the final layer.

In the model zoo:

def DKMv3_outdoor():
    """
    Loads DKMv3 outdoor weights, uses internal resolution of (540, 720) by default
    resolution can be changed by setting model.h_resized, model.w_resized later.
    Additionally upsamples preds to fixed resolution of (864, 1152),
    can be turned off by model.upsample_preds = False
    """
    weights = torch.hub.load_state_dict_from_url(weight_urls["DKMv3"]["outdoor"])
    return DKMv3(weights, 540, 720, upsample_preds=True)

you can see some API for changing these variables. However, right now the "upsample_preds" resolution is hardcoded to (864, 1152), see:

hs, ws = 864,1152

You can change this hardcoding so that it's settable. If you do change it, please submit a pull request so I can update the code. There is a lot of mess, so I'd appreciate it.
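The two alternatives could be sketched like this (against a stand-in object, since only the attribute names quoted above are taken from the actual DKM code):

```python
from types import SimpleNamespace

# Stand-in for the DKM model object; in the real code these attributes
# come from DKMv3_outdoor() as quoted above.
model = SimpleNamespace(h_resized=540, w_resized=720, upsample_preds=True)

# Option 1: raise the internal resolution directly
# (may become quite slow for large images).
model.h_resized, model.w_resized = 1080, 1920
model.upsample_preds = False

# Option 2: keep the internal (540, 720) and rely on upsampling the
# prediction by rerunning the final layer.
model.h_resized, model.w_resized = 540, 720
model.upsample_preds = True
```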


noone-code commented on May 30, 2024

Wow, thank you so much.
My questions are addressed.
Thank you.
🌹


Parskatt commented on May 30, 2024

@noone-code
Ok, I ran LoFTR's training code and here's how I think it works:

They use RandomConcatSampler : https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py

When using distributed training they split the scenes over the GPUs; each GPU gets 384//world_size scenes. The GPUs are initialized with a seeded generator. Their iter method, defined here:

https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py#L44

samples 100 pairs for each scene and shuffles them. This defines one epoch for each worker.

The next epoch this is done again. Note that the sampler is not reinitialized, so the state of the generator differs from the first epoch. Therefore, the 100 pairs drawn in the second epoch will differ from those in the first.
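A minimal sketch of why the pairs differ between epochs (a toy generator, not their sampler):

```python
import torch

# The generator is seeded once and NOT reinitialized between epochs,
# so its state advances and each epoch yields different indices.
g = torch.Generator().manual_seed(66)

epoch1 = torch.randint(0, 8_000_000, (100,), generator=g)
epoch2 = torch.randint(0, 8_000_000, (100,), generator=g)

# Equal only with vanishing probability, since the generator state
# has advanced between the two draws.
print(torch.equal(epoch1, epoch2))  # False
```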

I think I got this correct; I went through their code by debugging the train.py function on two GPUs using their standard outdoor training setting. However, I have not actually run a full epoch yet, so I might have misunderstood something. However, if you look at their comment:
https://github.com/zju3dv/LoFTR/blob/4feac496c1eacebc49ce53793039a8162930935e/src/datasets/sampler.py#L15

this leads me to believe that they are aware of the potential issue with repeated samples, and therefore make sure not to reinitialize it.

Please let me know if I got anything wrong, it might be the case that I misunderstood something.


noone-code commented on May 30, 2024

I found the code local_npz_names = get_local_split(npz_names, self.world_size, self.rank, self.seed).
I agree that LoFTR assigns different scenes to each GPU, and each GPU samples 100 image pairs per scene each epoch.
So, in total, LoFTR uses up to 384 (scenes) * 100 (samples per scene) * 30 (epochs) = 1,152,000 image pairs?
Is that correct?
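The arithmetic can be written out directly (the per-scene count is the n_samples_per_subset value discussed above; the epoch count is from the training config):

```python
scenes = 384            # scenes split across GPUs
pairs_per_scene = 100   # n_samples_per_subset
epochs = 30             # outdoor training schedule

print(scenes * pairs_per_scene * epochs)  # 1152000
```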


Parskatt commented on May 30, 2024

I think so! (but not sure)

There is some potential that our sampling may yield slightly better (or worse) results than theirs if used in DKM. Of course, our method has been developed using our sampling and theirs with theirs, so it might be the case that both would degrade using the other's sampling ;)

In conclusion, I would say that since they do sample quite a lot of pairs, they are comparable to us. However, it would of course be interesting to investigate more deeply how to sample good pairs for training feature matchers.


noone-code commented on May 30, 2024

Yes, finally, thank you so much.

