shivangi-aneja / cosmos

[AAAI 2023] COSMOS: Catching Out-of-Context Misinformation using Self Supervised Learning

License: MIT License

Python 90.36% JavaScript 3.27% HTML 6.37%

dataset misinformation cheapfakes fake-news deepfakes deepfake-detection computer-vision deep-learning pytorch machine-learning nlp

cosmos's Introduction

COSMOS: Catching Out-of-Context Misinformation using Self-Supervised Learning (AAAI 2023)

The COSMOS dataset consists of images and captions scraped from news articles and other websites, and is designed for training and evaluating models that detect out-of-context use of images. We refer readers to the paper for more details. To get access to the dataset, please fill out this form. We will provide you with a script to download the dataset. The official documentation for the project can be found here.

Dataset Description

Dataset Statistics

The COSMOS dataset consists of three splits: Training (about 160K images), Validation (about 40K images) and Test (1,700 images). We do not have or use out-of-context annotations for training; these annotations are used only at the end, to evaluate our model. The dataset statistics are listed below.

Table 1: Dataset stats.

Split	# Images	# Captions	Context Annotation
Train	161,752	360,749	No
Valid	41,006	90,036	No
Test	1,700	3,400	Yes
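
As a quick reading of Table 1, the average number of captions per image in each split can be worked out directly from the counts. The sketch below only restates the table's numbers; the names `SPLITS` and `captions_per_image` are illustrative and not part of the repository.

```python
# Counts copied from Table 1: (number of images, number of captions) per split.
SPLITS = {
    "Train": (161_752, 360_749),
    "Valid": (41_006, 90_036),
    "Test": (1_700, 3_400),
}


def captions_per_image(images: int, captions: int) -> float:
    """Average number of captions attached to each image in a split."""
    return captions / images


if __name__ == "__main__":
    for name, (images, captions) in SPLITS.items():
        ratio = captions_per_image(images, captions)
        print(f"{name}: {ratio:.2f} captions per image")
```

The test split has exactly two captions per image, which matches the caption1/caption2 layout of test.json.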

Data Format

The COSMOS training, validation and test sets are provided as JSON (JavaScript Object Notation) text files with the following attributes for every data sample stored as a dictionary:

File Structure for train.json and val.json

{
    "img_local_path": <img_path>,
    "articles": [
        { "caption": <caption1>,
          "article_url": <url1>,
          "caption_modified": <caption_mod1>,
          "entity_list": <entity_list1> },

        { "caption": <caption2>,
          "article_url": <url2>,
          "caption_modified": <caption_mod2>,
          "entity_list": <entity_list2> },

        { "caption": <caption3>,
          "article_url": <url3>,
          "caption_modified": <caption_mod3>,
          "entity_list": <entity_list3> },

        ......
    ],
    "maskrcnn_bboxes": [ [x1,y1,x2,y2], [x1,y1,x2,y2], ... ]
}

Table 2: Attributes in Train/Validation files.

Key	Description
`img_local_path`	Path of the image within the dataset directory
`articles`	List of dicts containing the metadata for every caption associated with the image
`caption`	Original caption scraped from the news website
`article_url`	Link to the website from which the image and caption were scraped
`caption_modified`	Caption modified by applying spaCy NER (we used these captions as input to our model during experiments)
`entity_list`	List mapping each modified named entity in the caption to its corresponding hypernym
`maskrcnn_bboxes`	List of bounding boxes detected in the image. (x1, y1) is the start vertex of the rectangle and (x2, y2) is the end vertex

Note that for detecting bounding boxes, we used the pretrained Detectron2 model linked here. We detect up to 10 bounding boxes per image.
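
To make the train/validation structure concrete, here is a minimal sketch of reading such records in Python. It assumes the file stores one JSON object per line; if the file is instead a single JSON array, `json.load` on the whole file would be the call to use. The helper names and the sample record are hypothetical and only follow the keys listed in Table 2.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line (assumes a one-object-per-line layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def caption_pairs(record: dict) -> List[Tuple[str, str]]:
    """Return (caption_modified, article_url) for every article of an image."""
    return [(a["caption_modified"], a["article_url"]) for a in record["articles"]]


# Hypothetical sample that follows the train.json / val.json keys.
SAMPLE_LINE = json.dumps({
    "img_local_path": "train/0.jpg",
    "articles": [
        {"caption": "Jane Doe speaks in Paris",
         "article_url": "https://example.com/a",
         "caption_modified": "PERSON speaks in GPE",
         "entity_list": []},
        {"caption": "A speech in France",
         "article_url": "https://example.com/b",
         "caption_modified": "A speech in GPE",
         "entity_list": []},
    ],
    "maskrcnn_bboxes": [[10, 20, 110, 220]],
})

if __name__ == "__main__":
    for record in iter_records([SAMPLE_LINE]):
        print(record["img_local_path"], len(record["maskrcnn_bboxes"]), "boxes")
        for caption, url in caption_pairs(record):
            print("  ", caption, "<-", url)
```

With a real file, `iter_records(open(path, encoding="utf-8"))` would stream the records without loading the whole split into memory.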

File Structure for test.json

{
    "img_local_path": <img_path>,
    "caption1": <caption1>,
    "caption1_modified": <caption1_modified>,
    "caption1_entities": <caption1_entities>,
    "caption2": <caption2>,
    "caption2_modified": <caption2_modified>,
    "caption2_entities": <caption2_entities>,
    "article_url": <article_url>,
    "label": "ooc/not-ooc",
    "maskrcnn_bboxes": [ [x1,y1,x2,y2], [x1,y1,x2,y2], ... ]
}

Table 3: Attributes in Test file.

Key	Description
`img_local_path`	Path of the image within the dataset directory
`caption1`	First caption associated with the image
`caption1_modified`	caption1 modified by applying spaCy NER
`caption1_entities`	List mapping each modified named entity in caption1 to its corresponding hypernym
`caption2`	Second caption associated with the image
`caption2_modified`	caption2 modified by applying spaCy NER
`caption2_entities`	List mapping each modified named entity in caption2 to its corresponding hypernym
`article_url`	Link to the website from which the image and caption were scraped
`label`	Class label indicating whether the two captions are out-of-context with respect to the image (1 = Out-of-Context, 0 = Not-Out-of-Context)
`maskrcnn_bboxes`	List of bounding boxes detected in the image. (x1, y1) is the start vertex of the rectangle and (x2, y2) is the end vertex
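
Boxes in the [x1, y1, x2, y2] format are commonly compared with intersection over union (IoU). The sketch below shows that computation under the assumption that (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner. It is a generic illustration, not necessarily how this repository measures overlap, and the function names are hypothetical.

```python
from typing import Sequence


def box_area(box: Sequence[float]) -> float:
    """Area of an [x1, y1, x2, y2] box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # The intersection rectangle is empty when the boxes do not overlap,
    # in which case box_area clamps its width or height to 0.
    intersection = box_area([max(ax1, bx1), max(ay1, by1),
                             min(ax2, bx2), min(ay2, by2)])
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    print(iou([0, 0, 10, 10], [5, 5, 15, 15]))
```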

Getting started

The code is well-documented and should be easy to follow.

Source Code: $ git clone this repo and install the Python dependencies from requirements.txt. The source code is implemented in PyTorch, so familiarity with PyTorch is expected.
Dataset: Download the dataset by filling out the form here.
Visualize Dataset: It is difficult to view the dataset using only the JSON file. Navigate to the dataset_visualizer directory and follow the instructions there to visualize the dataset with a simple Python-Flask based web tool.
Train and Test For Image-Text Matching Task: This code uses Detectron2 to extract features from objects present in the image. Please set up and install detectron2 first if you wish to use our feature detector for images. The minimal changes to the detectron2 source code needed to extract object features are provided in the detectron2_changes directory. Navigate to the detectron2 source code directory and copy these files over, replacing the originals. Consider setting up detectron2 inside this directory; it worked seamlessly for me without many changes.
All the training parameters are configured via utils/config.py. Specify paths, hyperparameters, text embeddings, threshold values, etc. in this config file. Model names are specified in the trainer script itself. Configure these parameters according to your needs and start training.
To train the model, execute the following command: python trainer_scipt.py -m train
Once training is finished, to evaluate the model on Match vs No-Match accuracy, execute the following command: python trainer_scipt.py -m eval
Test For Out-of-Context Detection Accuracy: Once training is over, to evaluate the model on the out-of-context detection task, specify the model name in evaluate_ooc.py and execute:

    python evaluate_ooc.py

Citation

If you find our dataset or paper useful for your research, please include the following citation:


@inproceedings{aneja2021cosmos,
    title={{COSMOS}: Catching {O}ut-of-{C}ontext {M}isinformation with {S}elf-{S}upervised {L}earning}, 
    author={Shivangi Aneja and Chris Bregler and Matthias Nie{\ss}ner},
    booktitle={ArXiv preprint arXiv:2101.06278},
    year={2021}
}

If you have questions regarding the dataset or code, please email us at [email protected]. We will get back to you as soon as possible.

cosmos's Issues

Training error about detectron2

Hi,

During the training, I've encountered the error below:

2021-06-02 19:47:08.050507: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-02 19:47:10.647322: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-06-02 19:47:14.161774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:3b:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.163223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:5e:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.164597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 2 with properties:
pciBusID: 0000:86:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.165981: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 3 with properties:
pciBusID: 0000:af:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.166042: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-02 19:47:14.168723: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-06-02 19:47:14.170526: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-06-02 19:47:14.171492: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-06-02 19:47:14.175216: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-06-02 19:47:14.177022: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-06-02 19:47:14.182611: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-06-02 19:47:14.192325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2, 3
2021-06-02 19:47:14.193076: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-02 19:47:14.205182: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2100000000 Hz
2021-06-02 19:47:14.206884: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xa3fe5b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-02 19:47:14.206923: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-02 19:47:14.748093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:3b:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.749386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties:
pciBusID: 0000:5e:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.750610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 2 with properties:
pciBusID: 0000:86:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.751809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 3 with properties:
pciBusID: 0000:af:00.0 name: Tesla V100-PCIE-32GB computeCapability: 7.0
coreClock: 1.38GHz coreCount: 80 deviceMemorySize: 31.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-02 19:47:14.751872: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-02 19:47:14.751902: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-06-02 19:47:14.751941: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-06-02 19:47:14.751958: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-06-02 19:47:14.751976: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-06-02 19:47:14.751996: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-06-02 19:47:14.752014: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-06-02 19:47:14.761251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2, 3
2021-06-02 19:47:14.761304: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-02 19:47:16.808596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-02 19:47:16.808665: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 1 2 3
2021-06-02 19:47:16.808686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N Y Y Y
2021-06-02 19:47:16.808696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 1:   Y N Y Y
2021-06-02 19:47:16.808706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 2:   Y Y N Y
2021-06-02 19:47:16.808716: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 3:   Y Y Y N
2021-06-02 19:47:16.816101: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-02 19:47:16.816159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 30132 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-32GB, pci bus id: 0000:3b:00.0, compute capability: 7.0)
2021-06-02 19:47:16.819248: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-02 19:47:16.819287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 30132 MB memory) -> physical GPU (device: 1, name: Tesla V100-PCIE-32GB, pci bus id: 0000:5e:00.0, compute capability: 7.0)
2021-06-02 19:47:16.822083: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-02 19:47:16.822118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 30132 MB memory) -> physical GPU (device: 2, name: Tesla V100-PCIE-32GB, pci bus id: 0000:86:00.0, compute capability: 7.0)
2021-06-02 19:47:16.824747: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-02 19:47:16.824781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 30132 MB memory) -> physical GPU (device: 3, name: Tesla V100-PCIE-32GB, pci bus id: 0000:af:00.0, compute capability: 7.0)
2021-06-02 19:47:16.827988: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3a36fae0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-02 19:47:16.828018: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla V100-PCIE-32GB, Compute Capability 7.0
2021-06-02 19:47:16.828030: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): Tesla V100-PCIE-32GB, Compute Capability 7.0
2021-06-02 19:47:16.828041: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (2): Tesla V100-PCIE-32GB, Compute Capability 7.0
2021-06-02 19:47:16.828058: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (3): Tesla V100-PCIE-32GB, Compute Capability 7.0
Total Params 2559576
Img Model 2405676
Text Model 153900
Loading Saved Model
  0%|                                                                                                                                                                                                 | 0/2528 [00:00<?, ?it/s]2021-06-02 19:47:48.137599: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
  2%|███▍                                                                                                                                                                                    | 48/2528 [01:08<48:47,  1.18s/it]Traceback (most recent call last):
  File "trainer_scipt.py", line 232, in <module>
    train_joint_model()
  File "trainer_scipt.py", line 156, in train_joint_model
    train_model(epoch)
  File "trainer_scipt.py", line 85, in train_model
    z_img, z_t_match, z_t_diff = combined_model(img, text_match, text_diff, batch, seq_len_match, seq_len_diff,
  File "/home/engine210/MMFinal2/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/engine210/MMFinal2/COSMOS/model_archs/models.py", line 51, in forward
    img = self.maskrcnn_extractor(img, bboxes, bbox_classes)
  File "/home/engine210/MMFinal2/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/engine210/MMFinal2/COSMOS/model_archs/image/image_models.py", line 43, in forward
    targets = [annotations_to_instances(bbox.cpu().numpy(), bbox_class.cpu().numpy(), img_shape) for
  File "/home/engine210/MMFinal2/COSMOS/model_archs/image/image_models.py", line 43, in <listcomp>
    targets = [annotations_to_instances(bbox.cpu().numpy(), bbox_class.cpu().numpy(), img_shape) for
  File "/home/engine210/MMFinal2/COSMOS/utils/img_model_utils.py", line 23, in annotations_to_instances
    target.classes = classes
  File "/home/engine210/MMFinal2/detectron2/detectron2/structures/instances.py", line 61, in __setattr__
    self.set(name, val)
  File "/home/engine210/MMFinal2/detectron2/detectron2/structures/instances.py", line 76, in set
    assert (
AssertionError: Adding a field of length 11 to a Instances of length 1

My environment is

CentOS 7
CUDA 11.0/10.1
Python 3.8.1

I installed detectron2 v0.3 (commit 4841e70) with the modified code provided in this repo. I think the problem is with the detectron2 version.
May I ask which version (or, more specifically, which commit) of detectron2 we should use in this project?

TypeError: 'NoneType' object is not subscriptable

Hi, during training I got the error below.

2021-06-09 14:10:21.451241: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-09 14:10:22.281671: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-06-09 14:10:22.294018: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.294396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2021-06-09 14:10:22.294418: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-09 14:10:22.295529: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-06-09 14:10:22.296210: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-06-09 14:10:22.296353: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-06-09 14:10:22.297998: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-06-09 14:10:22.298682: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-06-09 14:10:22.301049: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-06-09 14:10:22.301142: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.301557: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.301903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-06-09 14:10:22.302099: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-09 14:10:22.306142: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 3999980000 Hz
2021-06-09 14:10:22.306447: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xa167020 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-09 14:10:22.306458: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-09 14:10:22.469818: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.470282: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xa1d2ae0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-09 14:10:22.470300: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2021-06-09 14:10:22.470446: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.470815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2021-06-09 14:10:22.470844: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-09 14:10:22.470862: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-06-09 14:10:22.470874: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-06-09 14:10:22.470886: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-06-09 14:10:22.470897: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-06-09 14:10:22.470909: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-06-09 14:10:22.470921: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-06-09 14:10:22.470962: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.471343: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.471692: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-06-09 14:10:22.471718: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-06-09 14:10:22.786787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-09 14:10:22.786817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0
2021-06-09 14:10:22.786824: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N
2021-06-09 14:10:22.786984: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.787437: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-09 14:10:22.787818: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2021-06-09 14:10:22.787842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11210 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
Total Params 2559576
Img Model 2405676
Text Model 153900
Loading Saved Model
  0%|                                                                                                                          | 0/2527 [00:00<?, ?it/s]2021-06-09 14:10:38.988999: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
  5%|█████                                                                                                         | 116/2527 [07:03<2:11:06,  3.26s/it]
Error in  train/104298.jpg
Traceback (most recent call last):
  File "trainer_scipt.py", line 232, in <module>
    train_joint_model()
  File "trainer_scipt.py", line 156, in train_joint_model
    train_model(epoch)
  File "trainer_scipt.py", line 80, in train_model
    for batch_idx, (img, text_match, text_diff, seq_len_match, seq_len_diff, bboxes, bbox_classes) in enumerate(
  File "/home/mimiliaogo/MMFinal/COSMOS/env/lib/python3.8/site-packages/tqdm/_tqdm.py", line 1000, in __iter__
    for obj in iterable:
  File "/home/mimiliaogo/MMFinal/COSMOS/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/mimiliaogo/MMFinal/COSMOS/env/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 403, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/mimiliaogo/MMFinal/COSMOS/env/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/mimiliaogo/MMFinal/COSMOS/utils/dataset_utils.py", line 96, in __call__
    return self.pad_collate(batch)
  File "/home/mimiliaogo/MMFinal/COSMOS/utils/dataset_utils.py", line 73, in pad_collate
    t1 = list(map(lambda x: x[self.embed_dim1], batch))
  File "/home/mimiliaogo/MMFinal/COSMOS/utils/dataset_utils.py", line 73, in <lambda>
    t1 = list(map(lambda x: x[self.embed_dim1], batch))
TypeError: 'NoneType' object is not subscriptable

It shows that there is an error in the training image.
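
The traceback suggests that `pad_collate` received a sample that was `None`, presumably because loading train/104298.jpg failed. One common workaround, sketched below, is to drop failed samples before collating. This is an assumption about the cause and not the maintainers' fix; `skip_none` is a hypothetical helper.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Sample = TypeVar("Sample")
Batch = TypeVar("Batch")


def skip_none(
    collate_fn: Callable[[List[Sample]], Batch],
) -> Callable[[Sequence[Optional[Sample]]], Optional[Batch]]:
    """Wrap a collate function so that samples equal to None are dropped."""

    def wrapped(batch: Sequence[Optional[Sample]]) -> Optional[Batch]:
        kept = [sample for sample in batch if sample is not None]
        if not kept:
            # Every sample in the batch failed; the training loop must skip it.
            return None
        return collate_fn(kept)

    return wrapped


if __name__ == "__main__":
    # Toy collate function standing in for the real one.
    collate = skip_none(lambda samples: sum(samples))
    print(collate([1, None, 2]))
```

With PyTorch, the wrapped function could be passed as `collate_fn=skip_none(original_collate)` to the `DataLoader`, and the training loop would then need to skip batches that come back as `None`.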

How did you choose maskrcnn_bboxes?

Usually, Mask R-CNN outputs many bounding boxes for an image.
In your paper, the model uses only ten bounding boxes per image.
How did you choose these ten bounding boxes (by confidence, by bbox size, ...)?

maskrcnn_bboxes	List of detected bounding boxes corresponding to the image. (x1,y1) refers to start vertex of the rectangle and (x2, y2) refers to end vertex of the rectangle

Thanks

Can not reproduce the reported result

Hi,
We reran your code and trained for 10~40 epochs, but the highest evaluation accuracy is still 0.75, much lower than the reported 0.85 and even lower than the language model baseline.

asking for .pt files

/home/pradeep/COSMOS/script/annotations/test_data.json
Traceback (most recent call last):
  File "/home/pradeep/COSMOS/evaluate_ooc.py", line 124, in <module>
    pred_context = evaluate_context_with_bbox_overlap(v_data)
  File "/home/pradeep/COSMOS/evaluate_ooc.py", line 95, in evaluate_context_with_bbox_overlap
    score_c1, score_c2 = get_scores(v_data)
  File "/home/pradeep/COSMOS/evaluate_ooc.py", line 37, in get_scores
    checkpoint = torch.load(BASE_DIR + 'models_final/' + model_name + '.pt')
  File "/home/pradeep/.local/lib/python3.10/site-packages/torch/serialization.py", line 986, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/pradeep/.local/lib/python3.10/site-packages/torch/serialization.py", line 435, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/pradeep/.local/lib/python3.10/site-packages/torch/serialization.py", line 416, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/pradeep/COSMOS/models_final/img_use_rcnn_margin_10boxes_jitter_rotate_aug_ner.pt'

evaluate error

When I tried to install and load en_core_web_sm from spaCy, it did not work, and when I ran evaluate_ooc.py with Python I got the error below.

2023-11-20 12:22:45.003935: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-11-20 12:22:45.005021: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2211] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
  File "/home/pradeep/COSMOS/evaluate_ooc.py", line 5, in <module>
    from utils.config import *
  File "/home/pradeep/COSMOS/utils/__init__.py", line 3, in <module>
    from .dataset import *
  File "/home/pradeep/COSMOS/utils/dataset.py", line 5, in <module>
    from utils.dataset_utils import modify_caption_replace_entities
  File "/home/pradeep/COSMOS/utils/dataset_utils.py", line 8, in <module>
    nlp = spacy.load("en")
  File "/home/pradeep/.local/lib/python3.10/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/home/pradeep/.local/lib/python3.10/site-packages/spacy/util.py", line 471, in load_model
    raise IOError(Errors.E941.format(name=name, full=OLD_MODEL_SHORTCUTS[name])) # type: ignore[index]
OSError: [E941] Can't find model 'en'. It looks like you're trying to load a model from a shortcut, which is obsolete as of spaCy v3.0. To load the model, use its full name instead:

nlp = spacy.load("en_core_web_sm")

For more details on the available models, see the models directory: https://spacy.io/models and if you want to create a blank model, use spacy.blank: nlp = spacy.blank("en")
Exception ignored in: <function AtomicFunction.__del__ at 0x7fd2c7181120>
Traceback (most recent call last):
  File "/home/pradeep/.local/lib/python3.10/site-packages/tensorflow/python/eager/polymorphic_function/atomic_function.py", line 283, in __del__
TypeError: 'NoneType' object is not subscriptable
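
The spaCy error message itself names the remedy: the "en" shortcut is obsolete as of spaCy v3.0, and the full model name should be used instead. A minimal sketch of that substitution follows. `LEGACY_SHORTCUTS`, `resolve_model_name` and `load_pipeline` are hypothetical helpers, and the mapping contains only the one pair quoted in the error message.

```python
# Only the shortcut quoted in the error message above is mapped here.
LEGACY_SHORTCUTS = {"en": "en_core_web_sm"}


def resolve_model_name(name: str) -> str:
    """Translate an obsolete spaCy shortcut into the full model name."""
    return LEGACY_SHORTCUTS.get(name, name)


def load_pipeline(name: str = "en"):
    """Load a spaCy pipeline by name, accepting the old shortcut."""
    import spacy  # imported lazily so the name mapping can be used without spaCy

    return spacy.load(resolve_model_name(name))


if __name__ == "__main__":
    print(resolve_model_name("en"))
```

The named model still has to be installed, for example with `python -m spacy download en_core_web_sm`. Why the installation failed in this report is not clear from the log.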

Training and testing code released day?

Hi, thanks for your paper and your contributions.
I am very interested in this problem. Could you please tell me when the code will be released, or send me a raw version? I would very much appreciate it, thanks.

Trained model release day?

Hi,
I've successfully rerun your code, but I found that, because the dataset is so large, a training epoch takes nearly 8 hours and a validation epoch nearly 2 hours. Completing 500 epochs will take a long time.
So may I ask whether you plan to release the trained model? It would save us a lot of time.
Thanks!

AttributeError: CONV_DIMS

Hi, I have replaced default.py in detectron2's config directory with the one from detectron2_changes. But it seems that the Detectron2 model requires a setting for _C.MODEL.RPN.CONV_DIMS, and looking at your detectron2_changes/config/default.py, I see that you do not set this parameter. Have I made a mistake somewhere, or how should I set it?
Sincerely.