
Generalized Contrastive Loss

License: MIT

Visual place recognition is a challenging task in computer vision and a key component of camera-based localization and navigation systems. Recently, Convolutional Neural Networks (CNNs) have achieved strong results and good generalization capabilities. They are usually trained using pairs or triplets of images labeled in a binary fashion as either similar or dissimilar. In practice, the similarity between two images is not binary but continuous. Furthermore, training these CNNs is computationally complex and involves costly pair and triplet mining strategies. We propose a Generalized Contrastive Loss (GCL) function that relies on image similarity as a continuous measure, and use it to train a siamese CNN. Furthermore, we propose three techniques for the automatic annotation of image pairs with labels indicating their degree of similarity, and deploy them to re-annotate the MSLS, TB-Places, and 7Scenes datasets. We demonstrate that siamese CNNs trained using the GCL function and the improved annotations consistently outperform their binary counterparts. Our models trained on MSLS outperform state-of-the-art methods, including NetVLAD, and generalize well to the Pittsburgh, TokyoTM, and Tokyo 24/7 datasets. Furthermore, training a siamese network using the GCL function requires no pair mining.
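In short, the GCL replaces the binary label y ∈ {0, 1} of the standard contrastive loss with a graded similarity ψ ∈ [0, 1]. With d the Euclidean distance between the two embeddings and τ the margin, the loss takes the form below (a sketch consistent with the reference implementation discussed in the Issues section; the notation is ours, not the paper's):

\mathcal{L}_{\mathrm{GCL}} = \psi \cdot \tfrac{1}{2} d^{2} + (1 - \psi) \cdot \tfrac{1}{2} \max(0, \tau - d)^{2}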

Paper and license

The code is licensed under the MIT License.

If you use our code, please cite our paper:

@inproceedings{leyvavallina2021gcl,
  title={Data-efficient Large Scale Place Recognition with Graded Similarity Supervision},
  author={María Leyva-Vallina and Nicola Strisciuglio and Nicolai Petkov},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}

Network activation

[Figure: network activation maps]

Contact details

If you have any questions, please contact us at:

  1. María Leyva-Vallina: m.leyva.vallina at rug dot nl
  2. Nicola Strisciuglio: n.strisciuglio at utwente dot nl

How to use this library

Download the data

  1. MSLS: The dataset is available on request here. For the new GT annotations, please request them here.
  2. Pittsburgh: The whole dataset is available on request here, and the train/val splits for Pitts30k are available here.
  3. TokyoTM: The dataset is available on request here.
  4. Tokyo 24/7: The dataset is available on request here.
  5. TB-Places: The dataset is available here. For the new GT annotations, please request them here.

Download the models

All our models (and labels) can be downloaded from here.

Our results

MSLS

| Method | PCA_w | Dim | MSLS-val (R@1 / R@5 / R@10) | MSLS-test (R@1 / R@5 / R@10) | Pitts30k (R@1 / R@5 / R@10) | Tokyo 24/7 (R@1 / R@5 / R@10) | RobotCar Seasons v2, all (0.25m/2° / 0.5m/5° / 5.0m/10°) | Extended CMU Seasons, all (0.25m/2° / 0.5m/5° / 5.0m/10°) |
|---|---|---|---|---|---|---|---|---|
| NetVLAD-GCL | N | 32768 | 62.7 / 75.0 / 79.1 | 41.0 / 55.3 / 61.7 | 52.5 / 74.1 / 81.7 | 20.3 / 45.4 / 49.5 | 3.3 / 14.1 / 58.2 | 3.0 / 9.7 / 52.3 |
| NetVLAD-GCL | Y | 4096 | 63.2 / 74.9 / 78.1 | 41.5 / 56.2 / 61.3 | 53.5 / 75.2 / 82.9 | 28.3 / 41.9 / 54.9 | 3.4 / 14.2 / 58.8 | 3.1 / 9.7 / 52.4 |
| VGG-GeM-GCL | N | 512 | 65.9 / 77.8 / 81.4 | 41.7 / 55.7 / 60.6 | 61.6 / 80.0 / 86.0 | 34.0 / 51.1 / 61.3 | 3.7 / 15.8 / 59.7 | 3.6 / 11.2 / 55.8 |
| VGG-GeM-GCL | Y | 512 | 72.0 / 83.1 / 85.8 | 47.0 / 60.8 / 65.5 | 73.3 / 85.9 / 89.9 | 47.6 / 61.0 / 69.2 | 5.4 / 21.9 / 69.2 | 5.7 / 17.1 / 66.3 |
| ResNet50-GeM-GCL | N | 2048 | 66.2 / 78.9 / 81.9 | 43.3 / 59.1 / 65.0 | 72.3 / 87.2 / 91.3 | 44.1 / 61.0 / 66.7 | 2.9 / 14.0 / 58.8 | 3.8 / 11.8 / 61.6 |
| ResNet50-GeM-GCL | Y | 1024 | 74.6 / 84.7 / 88.1 | 52.9 / 65.7 / 71.9 | 79.9 / 90.0 / 92.8 | 58.7 / 71.1 / 76.8 | 4.7 / 20.2 / 70.0 | 5.4 / 16.5 / 69.9 |
| ResNet152-GeM-GCL | N | 2048 | 70.3 / 82.0 / 84.9 | 45.7 / 62.3 / 67.9 | 72.6 / 87.9 / 91.6 | 34.0 / 51.8 / 60.6 | 2.9 / 13.1 / 63.5 | 3.6 / 11.3 / 63.1 |
| ResNet152-GeM-GCL | Y | 2048 | 79.5 / 88.1 / 90.1 | 57.9 / 70.7 / 75.7 | 80.7 / 91.5 / 93.9 | 69.5 / 81.0 / 85.1 | 6.0 / 21.6 / 72.5 | 5.3 / 16.1 / 66.4 |
| ResNeXt-GeM-GCL | N | 2048 | 75.5 / 86.1 / 88.5 | 56.0 / 70.8 / 75.1 | 64.0 / 81.2 / 86.6 | 37.8 / 53.6 / 62.9 | 2.7 / 13.4 / 65.2 | 3.5 / 10.5 / 58.8 |
| ResNeXt-GeM-GCL | Y | 1024 | 80.9 / 90.7 / 92.6 | 62.3 / 76.2 / 81.1 | 79.2 / 90.4 / 93.2 | 58.1 / 74.3 / 78.1 | 4.7 / 21.0 / 74.7 | 6.1 / 18.2 / 74.9 |
To reproduce them

First, clone the mapillary_sls repository and set the MAPILLARY_ROOT environment variable on your machine, substituting "MYDIR" with the path where you cloned it:

export MAPILLARY_ROOT="/MYDIR/mapillary_sls/"

Then create the JSON index files for the MSLS dataset by running src/labeling/create_json_idx.py (replace PATH-TO-DATASET with the directory that contains the MSLS dataset):

python3 src/labeling/create_json_idx.py --dataset msls --root_dir PATH-TO-DATASET/MSLS/

Run the extract_predictions.py script to compute the map and query features and the top-k predictions. For instance:

python3 extract_predictions.py --dataset MSLS --root_dir PATH-TO-DATASET/MSLS/ --subset val --model_file models/MSLS/MSLS_resnet152_GeM_480_GCL.pth --backbone resnet152 --pool GeM --f_length 2048

This will produce the results on the MSLS validation set for this model. If you select --subset test, the file results/MSLS/test/MSLS_resnet152_GeM_480_GCL_predictions.txt will be generated; to evaluate these predictions, submit the file to the MSLS evaluation server.
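For instance, the corresponding test-set run keeps the same model and flags and only changes the subset:

python3 extract_predictions.py --dataset MSLS --root_dir PATH-TO-DATASET/MSLS/ --subset test --model_file models/MSLS/MSLS_resnet152_GeM_480_GCL.pth --backbone resnet152 --pool GeM --f_length 2048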

To apply PCA whitening, run the apply_pca.py script with the appropriate parameters. For instance, for the example above, run:

python3 apply_pca.py --dataset MSLS --root_dir PATH-TO-DATASET/MSLS/ --subset val --name MSLS_resnet152_GeM_480_GCL 

To reproduce our experiments, we include a series of evaluation scripts in the 'scripts' folder for the MSLS, Pittsburgh, Tokyo 24/7, TokyoTM, RobotCar Seasons v2, and Extended CMU Seasons datasets. These scripts need the index files for each dataset, which are available here, and our model files, available here.

Train your own models

To train a model on MSLS using the GCL function, execute train.py with the appropriate parameters. For example:

python3 train.py --root_dir PATH-TO-DATASET/MSLS/ --cities val --backbone vgg16 --use_gpu --pool GeM --last_layer 2 

IMPORTANT NOTES FOR TRAINING

  1. Make sure that the 'train_val' directory of the MSLS dataset contains the graded ground-truth label files (they should sit at the same level as the city folders inside train_val).
  2. Download the ground-truth files from our DataVerse repository.
  3. The password to extract the zip file with the labels is 'gcl2022'.
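For example, assuming the downloaded archive is named msls_gcl_labels.zip (a hypothetical name; use the actual file name from the repository), it can be extracted with:

unzip -P gcl2022 msls_gcl_labels.zip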


Issues

About attention maps

Hello, I'm interested in your work. How can I obtain the attention maps shown in Figure 15? Do you have any related code available?
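For reference, one generic way to produce such maps is sketched below, assuming features is the backbone output before pooling; this is a common visualization recipe, not necessarily how the paper's figures were produced:

import torch
import torch.nn.functional as F

def activation_map(features, image_size):
    # features: (1, C, h, w) feature map from the backbone, before pooling
    amap = features.mean(dim=1, keepdim=True)  # average over channels -> (1, 1, h, w)
    amap = F.interpolate(amap, size=image_size, mode="bilinear", align_corners=False)
    amap = (amap - amap.min()) / (amap.max() - amap.min() + 1e-8)  # normalize to [0, 1]
    return amap.squeeze()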

Could you please share the code for generating the ground-truth labels of MSLS or the other datasets?

Great work! Just from the perspective of the datasets, the results are very attractive. Your code is clearly organized and I have learned a lot while reading it, but I have a question that I hope you can answer.

I am curious about how you determine the ground-truth labels for the MSLS dataset. I looked through your code on GitHub and found nothing relevant. Did you publish this part of the code? If so, please tell me where it is.

Thank you for taking the time to read my issue and I hope you can answer it.

Request password for new GT annotation files

Hi María and Nicola,

Thanks for your impressive work.

However, when I tried to unzip the new GT annotation files downloaded via the link provided in this GitHub repo, a password was required.

May I know the password?

Thank you.

How to get the value of Extended CMU Seasons?

Hello @marialeyvallina, @nicstrisc, I really appreciate your work.
I have a question: Table 6 shows that the Extended CMU Seasons results of ResNeXt-GeM-GCL are:
Urban:11.1 / 28.7 / 87.4;
Suburban:4.2 / 14.6 / 77.4;
Park:3.1 / 11.1 / 57.7

And the mean result is:
6.1 / 18.2 / 74.9

Is it weighted in some way? How is it computed?
I am always looking forward to your kind response.
Best regards.
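For reference, a plain unweighted average of the three conditions gives (11.1 + 4.2 + 3.1) / 3 ≈ 6.1, (28.7 + 14.6 + 11.1) / 3 ≈ 18.1, and (87.4 + 77.4 + 57.7) / 3 ≈ 74.2; the first figure matches but the other two do not exactly, which is consistent with the suspicion that the mean is weighted, for example by the number of queries per condition.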

I can't load pre-trained models on MSLS

Dear Maria,

Thanks for your work and effort. I can't test your models that use GeM pooling; loading fails with the following error:

model.load_state_dict(torch.load(args.resume_model)["model_state_dict"])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BaseNet:
	Unexpected key(s) in state_dict: "pool.p".
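The unexpected key pool.p is the learnable exponent of a GeM pooling layer, which suggests the checkpoint was trained with GeM while the instantiated model uses a parameter-free pooling; passing --pool GeM when building the model is likely the cleaner fix. A minimal workaround sketch, assuming model has already been constructed and this key is the only mismatch:

import torch

state = torch.load("models/MSLS/MSLS_resnet152_GeM_480_GCL.pth", map_location="cpu")["model_state_dict"]
state.pop("pool.p", None)  # drop the GeM exponent the current model does not expect
model.load_state_dict(state, strict=False)  # non-strict load tolerates any remaining mismatch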

GCL reproduce

Hi, I am trying to reproduce the results in the paper using ResNet152 as the backbone and the GCL loss. However, I get a NaN error during training. Here is my implementation of the GCL loss.

import torch
from IPython import embed  # used below as a debugging hook

class GCLoss(torch.nn.Module):

    def __init__(self, margin=0.5):
        super(GCLoss, self).__init__()
        self.margin = margin

    def forward(self, out1, out2, label):  # out1, out2 are the output features; label is the graded similarity (FoV overlap)
        dist_square = torch.sum((out1 - out2) ** 2, dim=1)  # squared Euclidean distance
        dist = torch.sqrt(dist_square)
        loss = label * 0.5 * dist_square + (1 - label) * 0.5 * torch.relu(self.margin - dist) ** 2
        if torch.isnan(loss).sum() > 0:
            embed()  # drop into an interactive shell when a NaN appears
        loss_m = torch.mean(loss)
        return loss_m

Could you please give me some advice? Did I get something wrong?
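For what it's worth, one common source of NaNs in this formulation: the gradient of torch.sqrt is infinite at 0, so a pair of identical embeddings (dist_square == 0) produces NaN gradients. A frequently used stabilization, offered as an assumption rather than a confirmed fix for this case, is to clamp before the square root:

# clamp away from zero so the gradient of sqrt stays finite
dist = torch.sqrt(torch.clamp(dist_square, min=1e-12))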

Siamese Vs. Single network in training

Hi, I have read your paper and the code. The work is cool and fantastic.
However, I am confused about the siamese network here. The paper says that the two networks share weights and have the same structure, so what would be different if the inputs went through a single network only once? For example, what about concatenating the two tensors into one batch and feeding it to the network, as in the sketch below?
Could you please provide an ablation study of siamese vs. single network on MSLS? Thanks a lot.
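A minimal sketch of the single-network variant the question describes, assuming net, x1, and x2 are defined; because the two branches share weights, this is equivalent to two separate forward passes up to BatchNorm batch statistics:

import torch

# run both inputs through the shared-weight network as one concatenated batch
out = net(torch.cat([x1, x2], dim=0))
out1, out2 = out.chunk(2, dim=0)  # split back into the two branches' outputs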

Inquire about the 7Scenes relabeling methods

Hi,
Your work "Data-efficient Large Scale Place Recognition with Graded Similarity Supervision" is fascinating and inspire me a lot. I want to follow but curious about how to relabel the 7 Scenes dataset based on 3D overlap. I'll be really appreciate if you can provide a pseudo or source code for understanding.
Best wishes
Yingxiu Chang

How to get the new GT annotations

How can I get the new GT annotations for the TB-Places dataset? I filled in my email in the form linked by the authors, but I haven't received a reply for a long time. Where should I download them? Thank you.

About Figure 14 in the paper

Hello @marialeyvallina, Figure 14 (CNN activations) in the paper looks great. I want to reproduce it, but I have failed so far. Would it be possible to provide the relevant implementation code? Of course, detailed guidance would be even better.
I am always looking forward to your kind response.
Best regards.

How to reproduce the results on RobotCar Seasons v2?

Hi @marialeyvallina, @nicstrisc,
I really appreciate your great work. I'm trying to reproduce the results on RobotCar Seasons v2. I have obtained MSLS_resnet50_avg_480_GCL_predictions.npy. Could you please tell me how to get from there to the results in the paper? Could you please provide the subsequent code? Of course, detailed guidance would be even better.
I am always looking forward to your kind response.
Best regards.

Where does `create_model` come from?

I see a function create_model being called in the extract_predictions.py file, but I'm not sure where it comes from.

I tried using timm's create_model, but it seems to load an incompatible state dictionary. Is there another create_model I should be looking at? What version of timm do you use?

PCA whitening

Hi,

Interesting work! I'm excited to try the method out in my own research.

But I have a question regarding the descriptor PCA whitening implementation mentioned in the paper: I couldn't find it in the code. Did I just miss it, or is the implementation not included in the release?

Thanks!
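For reference, a minimal sketch of descriptor PCA whitening as it is commonly done for retrieval; map_feats and query_feats are assumed NumPy arrays of database and query descriptors, 1024 is an example output dimension, and this is not necessarily the repository's implementation:

import numpy as np
from sklearn.decomposition import PCA

pca = PCA(n_components=1024, whiten=True)
map_white = pca.fit_transform(map_feats)   # fit the whitening on the map/database descriptors
query_white = pca.transform(query_feats)   # apply the same transform to the queries

# re-L2-normalize, as is common for retrieval descriptors
map_white /= np.linalg.norm(map_white, axis=1, keepdims=True)
query_white /= np.linalg.norm(query_white, axis=1, keepdims=True)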

Release date for the generalized contrastive loss code?

Hello dear authors, thank you for your interesting work. Are you going to release the code for GCL? I am experimenting with different contrastive losses and would like to try your proposed GCL, but I could not find the code. Can you please confirm whether the loss function code will be released in the near future?
