
openvqa's People

Contributors

catcherjin, chaibyte, cuiyuhao1996, hyperdenton, mil-vlg, nbgao, paradoxzw, yuzcccc


openvqa's Issues

Attention Grounding for GQA

Hello, I am interested in getting the grounding results for GQA, but it doesn't seem to be supported at the moment. Is there a plan to support this in the future, or could you give a pointer on how to extend the current implementation to support it? (I am particularly interested in results for the MCAN model.)

Thank you.
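
For what it's worth, a generic way to pull attention maps out of a PyTorch model without modifying its code is a forward hook. The sketch below demonstrates the pattern on torch.nn.MultiheadAttention; the hook would need to be attached to the corresponding MCAN attention modules in this repo, whose exact attribute names I have not verified:

```python
import torch
import torch.nn as nn

attn_store = []

def grab_attention(module, inputs, output):
    # nn.MultiheadAttention returns (attn_output, attn_weights);
    # keep a detached copy of the weights for later visualization.
    attn_store.append(output[1].detach().cpu())

# Toy stand-in for one of MCAN's question-guided image attention layers.
mha = nn.MultiheadAttention(embed_dim=512, num_heads=8, batch_first=True)
mha.register_forward_hook(grab_attention)

q = torch.randn(1, 14, 512)    # e.g. 14 question tokens
kv = torch.randn(1, 100, 512)  # e.g. 100 image region features
_ = mha(q, kv, kv, need_weights=True)

print(attn_store[0].shape)  # torch.Size([1, 14, 100]): token-over-region attention
```

The per-region weights, combined with the bbox features, are what a grounding visualization for GQA would be built from.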

Setup Issues #2

Hi,

I tried setting up the environment described in the doc and then ran

python3 run.py --RUN='train' --MODEL='mcan_small' --DATASET='vqa'

in the repo directory, but I didn't receive any output or error message, let alone any checkpoint files or training logs.

I also tried running other commands, such as

python3 run.py --RUN='train' --MODEL='mcan_large' --DATASET='vqa'

and it still showed nothing.

What could be the problem?

Thank you.
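
When a run exits with no output at all, it is worth ruling out the environment before the code; a minimal sanity check (not specific to openvqa):

```python
import sys
import torch

print(sys.version)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
# If even this prints nothing, the interpreter or the way output is being
# redirected is the problem, not the training script.
```

Running the training command with `python3 -u run.py ...` also rules out stdout buffering hiding the logs.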

Invalid gradient error when training on GQA with mcan_small

Hi,

Thanks for this great project and the detailed doc. I really appreciate it.

I was trying to run some experiments on GQA with the default mcan_small model. I followed the instructions to prepare the GQA data, and everything seemed to work. However, when I launched training with the following command

python3 run.py --RUN='train' --MODEL='mcan_small' --DATASET='gqa'

I got the following error:

[early log is omitted]
Loading validation set for per-epoch evaluation........
 ========== Dataset size: 12578
 ========== Question token vocab size: 2933
Max token length: 29 Trimmed to: 29
 ========== Answer token vocab size: 1843
Finished!

Initializing log file........
Finished!

[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Traceback (most recent call last):
  File "run.py", line 160, in <module>
    execution.run(__C.RUN_MODE)
  File "/local/xiaojianm/workspace/openvqa/utils/exec.py", line 33, in run
    train_engine(self.__C, self.dataset, self.dataset_eval)
  File "/local/xiaojianm/workspace/openvqa/utils/train_engine.py", line 192, in train_engine
    loss.backward()
  File "/local/xiaojianm/anaconda3/envs/default/lib/python3.8/site-packages/torch/_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/local/xiaojianm/anaconda3/envs/default/lib/python3.8/site-packages/torch/autograd/__init__.py", line 147, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Function MmBackward returned an invalid gradient at index 1 - got [6, 2048] but expected shape compatible with [5, 2048]

I'm wondering how this issue can happen. Do you have any suggestions for debugging it?

I'm using torch==1.9.0 with spacy==2.3.7. Please feel free to ping me directly in this thread if more information is needed.
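
A shape mismatch between forward and backward like this often points at a tensor whose shape changed between the two passes (for example an in-place resize, or state cached across batches of different sizes). One generic way to localize it is PyTorch's anomaly detection, which makes a failing backward report the forward-pass stack trace of the offending op. A minimal sketch of the usage; the real debugging step would be wrapping the loss.backward() call in utils/train_engine.py the same way:

```python
import torch

x = torch.randn(5, 2048, requires_grad=True)
w = torch.randn(2048, 512, requires_grad=True)

# Debug-only (it slows training down): inside this context, a backward
# error is reported together with the forward stack trace that created
# the offending tensor.
with torch.autograd.set_detect_anomaly(True):
    loss = (x @ w).sum()
    loss.backward()
```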

train_choices

There are no train_choices or val_choices.json files in the GQA dataset release, yet the config references:
'train_choices': self.DATA_PATH['gqa'] + '/raw' + '/eval/train_choices'
'val_choices': self.DATA_PATH['gqa'] + '/raw' + '/eval/val_choices.json'

Custom training data where the answer is sentence

Hey @MIL-VLG,
I want to get your advice if possible. I was able to generate the .npz files and run the project, but I am facing a weird issue. I have some medical data consisting of Q and A pairs, each of which is a full sentence, whereas in the original VQA most answers are a single word or number.
I am running mcan_small and always get an accuracy of 0.0. After inspecting the predictions, I see that they are irrelevant and 99% of them are identical. At the same time, the training loss is very small (0.0006), which makes me wonder how the loss can be that low while the answers are that irrelevant.

Raw data sample:
synpic45783|what is abnormal in the x-ray?|hydroxyapatite crystal deposition disease

Preprocessed data sample in the required format (same sample):
In the questions:
{ "question_id": 183, "question": "what plane is demonstrated?", "image_id": 45783 }
In the annotations:
{ "question_id": 130, "answers": [ { "answer_id": 1, "answer": "hydroxyapatite crystal deposition disease", "answer_confidence": "yes" } ], "answer_type": "other", "image_id": 45783, "question_type": "what is" }

Example of a predicted answer (the same prediction is returned for the whole validation set):
{'image_id': 28204, 'answer_type': 'other', 'question_id': 0, 'question_type': 'was', 'answer': 'model'}
What do you think I can do to overcome this issue?
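
A near-constant prediction with a near-zero loss usually indicates a degenerate answer vocabulary (for example, one dominant answer class, or sentence answers that were not kept intact as single classes). A quick sanity check, assuming annotations in the VQA-style JSON format shown above; the file path is hypothetical:

```python
import json
from collections import Counter

with open("train_annotations.json") as f:  # hypothetical path to your annotations
    anno = json.load(f)

# Count how often each full answer string occurs in the training set.
counts = Counter(a["answer"]
                 for q in anno["annotations"]
                 for a in q["answers"])
total = sum(counts.values())
for ans, n in counts.most_common(10):
    print(f"{n / total:6.2%}  {ans}")
# If one answer dominates, a classifier can reach a tiny loss by always
# predicting it, which would match the behaviour described above.
```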

About VQA-CP dataset

Very nice framework for VQA research. Do you plan to add the VQA-CP datasets to it?

Reason for using Mask?

Firstly, the code is well written.
I am new to this field. I know how masked_fill() works; however, I want to know the exact reason why the features are masked here. From what I understand, wherever the absolute sum of the features along the last dimension is 0, you avoid considering that position when computing the attention scores (in MHAtt).
What is the exact reason for doing so, and how is it beneficial?

Thanks in advance.
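
To make the mechanics concrete: image feature sequences are zero-padded to a fixed length (e.g. up to 100 regions), and a position whose feature vector is all zeros is treated as padding. Filling its attention score with a large negative number before the softmax drives its weight to ~0, so padded positions can neither receive attention nor dilute the weights of real regions. A minimal sketch of the pattern:

```python
import torch

feats = torch.randn(2, 100, 2048)
feats[0, 60:] = 0  # pretend image 0 has only 60 real regions; rest is padding

# True where a feature row is all zeros, i.e. a padded position.
mask = (feats.abs().sum(dim=-1) == 0)      # shape (2, 100)

scores = torch.randn(2, 100)               # toy attention logits over regions
scores = scores.masked_fill(mask, -1e9)    # padding -> ~0 after softmax
attn = torch.softmax(scores, dim=-1)
print(attn[0, 60:].sum())                  # ~0: padded regions are ignored
```

Without the mask, the softmax would assign nonzero weight to the padded rows, so the model's output would depend on how much padding each image happened to need.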

Dataset setup for custom data

Hey,
In the wiki, the Dataset Setup page says:

We store the features for each image in a .npz file. You can prepare the visual features by yourself or download the extracted features

My question is how to train on custom data and how to

prepare the visual features by yourself

  1. So, is it the original image converted to numpy format?
  2. About "each image being represented as a dynamic number (from 10 to 100) of 2048-D" features: is there a helper function that does this transformation? And why 10 to 100? Why 2048-D? (See the sketch after this list.)
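
For context on the numbers: the 2048-D comes from the bottom-up-attention Faster R-CNN (ResNet-101) region features, and 10 to 100 is that model's adaptive number of detected regions per image, so the features are per-region CNN embeddings rather than raw pixels. Below is a sketch of writing one image's features in an .npz layout; the key names ('x', 'bbox', image sizes) are my assumptions and should be checked against the dataset loader in this repo:

```python
import numpy as np

n_regions = 36  # anywhere in the 10..100 range for adaptive features
feats = np.random.randn(n_regions, 2048).astype(np.float32)  # region features
boxes = np.random.rand(n_regions, 4).astype(np.float32)      # x1, y1, x2, y2

# Key names are assumptions -- verify against the repo's *_loader.py,
# which defines exactly which arrays it reads from each .npz file.
np.savez("123456.npz", x=feats, bbox=boxes, image_w=640, image_h=480)
```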

Downloading mscoco bottom-up features

Hello,
While downloading the bottom-up features for the mscoco images, either from OneDrive or BaiduYun, the download fails after a few minutes. I have tried several times from different browsers and OSes, but it didn't work.
Could you please give direct links to the files so that one can wget them, or provide a script to download them?

Thanks in advance

KeyError issue

Hi, I am trying to train the MCAN model on the GQA dataset, and every time I run it I get the following error:

Traceback (most recent call last):
File "run.py", line 162, in <module>
execution.run(__C.RUN_MODE)
File "D:\VQA_docs\openvqa-master\utils\exec.py", line 33, in run
train_engine(self.__C, self.dataset, self.dataset_eval)
File "D:\VQA_docs\openvqa-master\utils\train_engine.py", line 137, in train_engine
for (step, frcn_feat_iter, grid_feat_iter, bbox_feat_iter, ques_ix_iter, ans_iter) in enumerate(dataloader):
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
data = self._next_data()
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
data.reraise()
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\_utils.py", line 434, in reraise
raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "C:\Users\PC\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\VQA_docs\openvqa-master\openvqa\core\base_dataset.py", line 36, in __getitem__
frcn_feat_iter, grid_feat_iter, bbox_feat_iter = self.load_img_feats(idx, iid)
File "D:\VQA_docs\openvqa-master\openvqa\datasets\gqa\gqa_loader.py", line 204, in load_img_feats
frcn_feat = np.load(self.iid_to_frcn_feat_path[iid])
KeyError: '2363772'

I was getting some other errors before this one and managed to solve them; this is the only issue I am still struggling with. Does anyone know how I can fix it?
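
The KeyError means that image id '2363772' from the question file has no matching feature file, so the iid-to-path dict built at startup has no entry for it; typically the feature archive was not fully extracted. A quick way to list every missing id before training, with the paths taken from the default config and the iid-as-filename convention being my assumption:

```python
import glob
import json
import os

# Image ids that actually have an extracted .npz feature file.
feat_iids = {os.path.splitext(os.path.basename(p))[0]
             for p in glob.glob("./data/gqa/feats/gqa-frcn/*.npz")}

with open("./data/gqa/raw/questions1.2/train_balanced_questions.json") as f:
    questions = json.load(f)  # GQA format: {question_id: {"imageId": ..., ...}}

missing = {q["imageId"] for q in questions.values()} - feat_iids
print(len(missing), "image ids have no feature file")
```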

Any way to use the original images

Hi,
I was implementing some models for VQA and found your repo really useful. However, it seems I can only feed FRCN_FEAT and BBOX_FEAT as image inputs to the model. Is there any way to use the original images as inputs instead of the extracted features?
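
Since the repo consumes pre-extracted features, raw images would have to be converted into one of the supported layouts first. For the grid-feature path (e.g. GRID_FEAT_SIZE (49, 2048) on GQA), one sketch is pooling a ResNet-101 feature map down to 7x7. This is only illustrative; whether the extractor below matches the one used for the officially distributed features is an assumption:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

resnet = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
backbone = torch.nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool/fc
backbone.eval()

tf = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = tf(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    fmap = backbone(img)                                     # (1, 2048, 14, 14)
    fmap = torch.nn.functional.adaptive_avg_pool2d(fmap, 7)  # (1, 2048, 7, 7)
grid_feat = fmap.flatten(2).squeeze(0).t()                   # (49, 2048)
```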

The GQA results are lower than the reported performance.

We followed the default settings provided by the repo but got lower performance on both the online and offline leaderboards. The log file is shown below. By the way, I have tested the official mcan_small model and achieved 58.3 on the online leaderboard. That's strange. Can anyone help to fix this?


{ BATCH_SIZE        }->64
{ BBOXFEAT_EMB_SIZE }->2048
{ CACHE_PATH        }->./results/cache
{ CKPTS_PATH        }->./ckpts
{ CKPT_EPOCH        }->0
{ CKPT_PATH         }->None
{ CKPT_VERSION      }->2134787
{ DATASET           }->gqa
{ DATA_PATH         }->{'vqa': './data/vqa', 'gqa': './data/gqa', 'clevr': './data/clevr'}
{ DATA_ROOT         }->./data
{ DEVICES           }->[0]
{ DROPOUT_R         }->0.1
{ EVAL_BATCH_SIZE   }->32
{ EVAL_EVERY_EPOCH  }->True
{ FEATS_PATH        }->{'vqa': {'train': './data/vqa/feats/train2014', 'val': './data/vqa/feats/val2014', 'test': './data/vqa/feats/test2015'}, 'gqa': {'default-frcn': './data/gqa/feats/gqa-frcn', 'default-grid': './data/gqa/feats/gqa-grid'}, 'clevr': {'train': './data/clevr/feats/train', 'val': './data/clevr/feats/val', 'test': './data/clevr/feats/test'}}
{ FEAT_SIZE         }->{'vqa': {'FRCN_FEAT_SIZE': (100, 2048), 'BBOX_FEAT_SIZE': (100, 5)}, 'gqa': {'FRCN_FEAT_SIZE': (100, 2048), 'GRID_FEAT_SIZE': (49, 2048), 'BBOX_FEAT_SIZE': (100, 5)}, 'clevr': {'GRID_FEAT_SIZE': (196, 1024)}}
{ FF_SIZE           }->2048
{ FLAT_GLIMPSES     }->1
{ FLAT_MLP_SIZE     }->512
{ FLAT_OUT_SIZE     }->1024
{ GPU               }->2
{ GRAD_ACCU_STEPS   }->1
{ GRAD_NORM_CLIP    }->-1
{ HIDDEN_SIZE       }->512
{ LAYER             }->6
{ LOG_PATH          }->./results/log
{ LOSS_FUNC         }->ce
{ LOSS_FUNC_NAME_DICT }->{'ce': 'CrossEntropyLoss', 'bce': 'BCEWithLogitsLoss', 'kld': 'KLDivLoss', 'mse': 'MSELoss'}
{ LOSS_FUNC_NONLINEAR }->{'ce': [None, 'flat'], 'bce': [None, None], 'kld': ['log_softmax', None], 'mse': [None, None]}
{ LOSS_REDUCTION    }->sum
{ LR_BASE           }->0.0001
{ LR_DECAY_LIST     }->[8, 10]
{ LR_DECAY_R        }->0.2
{ MAX_EPOCH         }->11
{ MODEL             }->mcan_small
{ MODEL_USE         }->mcan
{ MULTI_HEAD        }->8
{ NUM_WORKERS       }->8
{ N_GPU             }->1
{ OPT               }->Adam
{ OPT_PARAMS        }->{'betas': (0.9, 0.98), 'eps': 1e-09, 'weight_decay': 0, 'amsgrad': False}
{ PIN_MEM           }->True
{ PRED_PATH         }->./results/pred
{ RAW_PATH          }->{'vqa': {'train': './data/vqa/raw/v2_OpenEnded_mscoco_train2014_questions.json', 'train-anno': './data/vqa/raw/v2_mscoco_train2014_annotations.json', 'val': './data/vqa/raw/v2_OpenEnded_mscoco_val2014_questions.json', 'val-anno': './data/vqa/raw/v2_mscoco_val2014_annotations.json', 'vg': './data/vqa/raw/VG_questions.json', 'vg-anno': './data/vqa/raw/VG_annotations.json', 'test': './data/vqa/raw/v2_OpenEnded_mscoco_test2015_questions.json'}, 'gqa': {'train': './data/gqa/raw/questions1.2/train_balanced_questions.json', 'val': './data/gqa/raw/questions1.2/val_balanced_questions.json', 'testdev': './data/gqa/raw/questions1.2/testdev_balanced_questions.json', 'test': './data/gqa/raw/questions1.2/submission_all_questions.json', 'val_all': './data/gqa/raw/questions1.2/val_all_questions.json', 'testdev_all': './data/gqa/raw/questions1.2/testdev_all_questions.json', 'train_choices': './data/gqa/raw/eval/train_choices', 'val_choices': './data/gqa/raw/eval/val_choices.json'}, 'clevr': {'train': './data/clevr/raw/questions/CLEVR_train_questions.json', 'val': './data/clevr/raw/questions/CLEVR_val_questions.json', 'test': './data/clevr/raw/questions/CLEVR_test_questions.json'}}
{ RESULT_PATH       }->./results/result_test
{ RESUME            }->False
{ RUN_MODE          }->train
{ SEED              }->2134787
{ SPLIT             }->{'train': 'train+val', 'val': 'testdev', 'test': 'test'}
{ SPLITS            }->{'vqa': {'train': '', 'val': 'val', 'test': 'test'}, 'gqa': {'train': 'train+val', 'val': 'testdev', 'test': 'test'}, 'clevr': {'train': '', 'val': 'val', 'test': 'test'}}
{ SUB_BATCH_SIZE    }->64
{ TASK_LOSS_CHECK   }->{'vqa': ['bce', 'kld'], 'gqa': ['ce'], 'clevr': ['ce']}
{ TEST_SAVE_PRED    }->False
{ TRAIN_SPLIT       }->train+val
{ USE_AUX_FEAT      }->True
{ USE_BBOX_FEAT     }->True
{ USE_GLOVE         }->True
{ VERBOSE           }->True
{ VERSION           }->2134787
{ WARMUP_EPOCH      }->2
{ WORD_EMBED_SIZE   }->300
=====================================
nowTime: 2020-01-19 14:05:12
Epoch: 1, Loss: 1.7046293609006327, Lr: 6.666666666666667e-05
Elapsed time: 5132, Speed(s/batch): 0.3055514312835215

Binary: 58.18%
Open: 36.41%
Accuracy: 46.41%
Distribution: 3.07 (lower is better)
Accuracy / structural type:
  choose: 60.05% (1129 questions)
  compare: 54.33% (589 questions)
  logical: 56.63% (1803 questions)
  query: 36.41% (6805 questions)
  verify: 59.50% (2252 questions)
Accuracy / semantic type:
  attr: 50.46% (5186 questions)
  cat: 42.04% (1149 questions)
  global: 45.22% (157 questions)
  obj: 64.14% (778 questions)
  rel: 40.83% (5308 questions)
Accuracy / steps number:
  1: 60.76% (237 questions)
  2: 43.16% (6395 questions)
  3: 47.63% (4266 questions)
  4: 45.02% (793 questions)
  5: 59.37% (822 questions)
  6: 78.05% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 32.45% (151 questions)
  4: 45.56% (630 questions)
  5: 38.53% (1290 questions)
  6: 43.30% (2074 questions)
  7: 43.97% (1642 questions)
  8: 47.85% (1185 questions)
  9: 50.12% (1281 questions)
  10: 51.88% (1249 questions)
  11: 45.47% (994 questions)
  12: 51.10% (638 questions)
  13: 50.43% (462 questions)
  14: 50.72% (345 questions)
  15: 57.81% (237 questions)
  16: 49.57% (117 questions)
  17: 44.68% (94 questions)
  18: 52.63% (76 questions)
  19: 60.47% (43 questions)
  20: 53.12% (32 questions)
  21: 57.89% (19 questions)
  22: 50.00% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 15:31:07
Epoch: 2, Loss: 1.3994251725464903, Lr: 0.0001
Elapsed time: 5100, Speed(s/batch): 0.3036757551451245

Binary: 63.94%
Open: 37.90%
Accuracy: 49.85%
Distribution: 2.36 (lower is better)
Accuracy / structural type:
  choose: 65.46% (1129 questions)
  compare: 55.86% (589 questions)
  logical: 61.40% (1803 questions)
  query: 37.90% (6805 questions)
  verify: 67.32% (2252 questions)
Accuracy / semantic type:
  attr: 52.97% (5186 questions)
  cat: 40.91% (1149 questions)
  global: 46.50% (157 questions)
  obj: 79.05% (778 questions)
  rel: 44.56% (5308 questions)
Accuracy / steps number:
  1: 64.56% (237 questions)
  2: 45.54% (6395 questions)
  3: 52.93% (4266 questions)
  4: 48.17% (793 questions)
  5: 62.53% (822 questions)
  6: 65.85% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 31.13% (151 questions)
  4: 44.44% (630 questions)
  5: 38.76% (1290 questions)
  6: 45.90% (2074 questions)
  7: 47.69% (1642 questions)
  8: 53.84% (1185 questions)
  9: 55.35% (1281 questions)
  10: 55.40% (1249 questions)
  11: 51.21% (994 questions)
  12: 54.86% (638 questions)
  13: 53.68% (462 questions)
  14: 59.13% (345 questions)
  15: 56.96% (237 questions)
  16: 57.26% (117 questions)
  17: 51.06% (94 questions)
  18: 53.95% (76 questions)
  19: 69.77% (43 questions)
  20: 56.25% (32 questions)
  21: 57.89% (19 questions)
  22: 50.00% (12 questions)
  23: 0.00% (4 questions)
  24: 50.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 16:56:30
Epoch: 3, Loss: 1.2595117600660048, Lr: 0.0001
Elapsed time: 5073, Speed(s/batch): 0.3020486486869416

Binary: 68.66%
Open: 37.74%
Accuracy: 51.93%
Distribution: 2.90 (lower is better)
Accuracy / structural type:
  choose: 68.47% (1129 questions)
  compare: 58.23% (589 questions)
  logical: 66.89% (1803 questions)
  query: 37.74% (6805 questions)
  verify: 72.91% (2252 questions)
Accuracy / semantic type:
  attr: 56.85% (5186 questions)
  cat: 40.91% (1149 questions)
  global: 53.50% (157 questions)
  obj: 82.52% (778 questions)
  rel: 44.99% (5308 questions)
Accuracy / steps number:
  1: 67.09% (237 questions)
  2: 46.25% (6395 questions)
  3: 55.53% (4266 questions)
  4: 55.11% (793 questions)
  5: 67.76% (822 questions)
  6: 70.73% (41 questions)
  7: 95.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 34.44% (151 questions)
  4: 46.19% (630 questions)
  5: 38.99% (1290 questions)
  6: 47.73% (2074 questions)
  7: 51.34% (1642 questions)
  8: 53.76% (1185 questions)
  9: 58.24% (1281 questions)
  10: 58.29% (1249 questions)
  11: 52.21% (994 questions)
  12: 58.78% (638 questions)
  13: 54.76% (462 questions)
  14: 60.00% (345 questions)
  15: 63.71% (237 questions)
  16: 63.25% (117 questions)
  17: 52.13% (94 questions)
  18: 53.95% (76 questions)
  19: 79.07% (43 questions)
  20: 50.00% (32 questions)
  21: 52.63% (19 questions)
  22: 66.67% (12 questions)
  23: 50.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 18:21:26
Epoch: 4, Loss: 1.1537336046805873, Lr: 0.0001
Elapsed time: 5056, Speed(s/batch): 0.3010697898679644

Binary: 70.19%
Open: 38.28%
Accuracy: 52.93%
Distribution: 2.06 (lower is better)
Accuracy / structural type:
  choose: 69.26% (1129 questions)
  compare: 46.01% (589 questions)
  logical: 71.05% (1803 questions)
  query: 38.28% (6805 questions)
  verify: 76.29% (2252 questions)
Accuracy / semantic type:
  attr: 58.79% (5186 questions)
  cat: 40.30% (1149 questions)
  global: 56.69% (157 questions)
  obj: 81.36% (778 questions)
  rel: 45.65% (5308 questions)
Accuracy / steps number:
  1: 64.98% (237 questions)
  2: 47.58% (6395 questions)
  3: 54.78% (4266 questions)
  4: 62.80% (793 questions)
  5: 68.86% (822 questions)
  6: 87.80% (41 questions)
  7: 95.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 25.83% (151 questions)
  4: 48.25% (630 questions)
  5: 39.38% (1290 questions)
  6: 50.00% (2074 questions)
  7: 52.19% (1642 questions)
  8: 55.70% (1185 questions)
  9: 59.80% (1281 questions)
  10: 57.49% (1249 questions)
  11: 55.23% (994 questions)
  12: 55.96% (638 questions)
  13: 56.71% (462 questions)
  14: 59.42% (345 questions)
  15: 62.45% (237 questions)
  16: 66.67% (117 questions)
  17: 54.26% (94 questions)
  18: 59.21% (76 questions)
  19: 79.07% (43 questions)
  20: 53.12% (32 questions)
  21: 52.63% (19 questions)
  22: 58.33% (12 questions)
  23: 50.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 19:46:06
Epoch: 5, Loss: 1.0821526790137108, Lr: 0.0001
Elapsed time: 5034, Speed(s/batch): 0.2997583388487309

Binary: 72.18%
Open: 38.87%
Accuracy: 54.16%
Distribution: 2.16 (lower is better)
Accuracy / structural type:
  choose: 72.54% (1129 questions)
  compare: 60.95% (589 questions)
  logical: 70.66% (1803 questions)
  query: 38.87% (6805 questions)
  verify: 76.15% (2252 questions)
Accuracy / semantic type:
  attr: 61.16% (5186 questions)
  cat: 44.21% (1149 questions)
  global: 57.96% (157 questions)
  obj: 82.52% (778 questions)
  rel: 45.20% (5308 questions)
Accuracy / steps number:
  1: 71.31% (237 questions)
  2: 47.82% (6395 questions)
  3: 57.59% (4266 questions)
  4: 60.40% (793 questions)
  5: 72.02% (822 questions)
  6: 80.49% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 27.81% (151 questions)
  4: 47.30% (630 questions)
  5: 39.38% (1290 questions)
  6: 50.29% (2074 questions)
  7: 53.17% (1642 questions)
  8: 56.20% (1185 questions)
  9: 61.83% (1281 questions)
  10: 59.01% (1249 questions)
  11: 57.44% (994 questions)
  12: 60.97% (638 questions)
  13: 59.31% (462 questions)
  14: 61.16% (345 questions)
  15: 64.56% (237 questions)
  16: 64.10% (117 questions)
  17: 58.51% (94 questions)
  18: 64.47% (76 questions)
  19: 74.42% (43 questions)
  20: 59.38% (32 questions)
  21: 63.16% (19 questions)
  22: 66.67% (12 questions)
  23: 50.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 21:10:24
Epoch: 6, Loss: 1.0242240601529657, Lr: 0.0001
Elapsed time: 5283, Speed(s/batch): 0.3145479110343935

Binary: 72.37%
Open: 39.44%
Accuracy: 54.56%
Distribution: 2.30 (lower is better)
Accuracy / structural type:
  choose: 69.97% (1129 questions)
  compare: 64.86% (589 questions)
  logical: 69.94% (1803 questions)
  query: 39.44% (6805 questions)
  verify: 77.49% (2252 questions)
Accuracy / semantic type:
  attr: 60.70% (5186 questions)
  cat: 43.08% (1149 questions)
  global: 54.78% (157 questions)
  obj: 84.32% (778 questions)
  rel: 46.67% (5308 questions)
Accuracy / steps number:
  1: 70.46% (237 questions)
  2: 48.54% (6395 questions)
  3: 58.49% (4266 questions)
  4: 59.02% (793 questions)
  5: 69.34% (822 questions)
  6: 82.93% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 29.80% (151 questions)
  4: 45.56% (630 questions)
  5: 40.47% (1290 questions)
  6: 51.74% (2074 questions)
  7: 54.57% (1642 questions)
  8: 55.70% (1185 questions)
  9: 59.88% (1281 questions)
  10: 61.81% (1249 questions)
  11: 56.34% (994 questions)
  12: 61.91% (638 questions)
  13: 56.06% (462 questions)
  14: 63.77% (345 questions)
  15: 60.76% (237 questions)
  16: 63.25% (117 questions)
  17: 69.15% (94 questions)
  18: 65.79% (76 questions)
  19: 72.09% (43 questions)
  20: 56.25% (32 questions)
  21: 63.16% (19 questions)
  22: 66.67% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-19 22:38:50
Epoch: 7, Loss: 0.975194708264801, Lr: 0.0001
Elapsed time: 5540, Speed(s/batch): 0.32989268883207523

Binary: 72.91%
Open: 38.75%
Accuracy: 54.43%
Distribution: 1.97 (lower is better)
Accuracy / structural type:
  choose: 72.45% (1129 questions)
  compare: 62.48% (589 questions)
  logical: 70.99% (1803 questions)
  query: 38.75% (6805 questions)
  verify: 77.40% (2252 questions)
Accuracy / semantic type:
  attr: 60.89% (5186 questions)
  cat: 42.91% (1149 questions)
  global: 59.87% (157 questions)
  obj: 84.96% (778 questions)
  rel: 45.97% (5308 questions)
Accuracy / steps number:
  1: 71.73% (237 questions)
  2: 48.18% (6395 questions)
  3: 58.06% (4266 questions)
  4: 62.04% (793 questions)
  5: 69.46% (822 questions)
  6: 78.05% (41 questions)
  7: 95.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 30.46% (151 questions)
  4: 46.51% (630 questions)
  5: 40.47% (1290 questions)
  6: 50.72% (2074 questions)
  7: 54.93% (1642 questions)
  8: 57.13% (1185 questions)
  9: 60.73% (1281 questions)
  10: 59.89% (1249 questions)
  11: 57.34% (994 questions)
  12: 61.13% (638 questions)
  13: 51.73% (462 questions)
  14: 64.06% (345 questions)
  15: 63.29% (237 questions)
  16: 66.67% (117 questions)
  17: 58.51% (94 questions)
  18: 64.47% (76 questions)
  19: 69.77% (43 questions)
  20: 62.50% (32 questions)
  21: 63.16% (19 questions)
  22: 75.00% (12 questions)
  23: 50.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-20 00:11:56
Epoch: 8, Loss: 0.9346485017632614, Lr: 0.0001
Elapsed time: 5532, Speed(s/batch): 0.3294190824187974

Binary: 72.63%
Open: 38.85%
Accuracy: 54.36%
Distribution: 2.27 (lower is better)
Accuracy / structural type:
  choose: 69.97% (1129 questions)
  compare: 62.48% (589 questions)
  logical: 71.77% (1803 questions)
  query: 38.85% (6805 questions)
  verify: 77.31% (2252 questions)
Accuracy / semantic type:
  attr: 60.20% (5186 questions)
  cat: 44.73% (1149 questions)
  global: 57.32% (157 questions)
  obj: 84.19% (778 questions)
  rel: 46.27% (5308 questions)
Accuracy / steps number:
  1: 69.20% (237 questions)
  2: 48.71% (6395 questions)
  3: 56.92% (4266 questions)
  4: 61.41% (793 questions)
  5: 71.53% (822 questions)
  6: 82.93% (41 questions)
  7: 90.00% (20 questions)
  8: 66.67% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 29.14% (151 questions)
  4: 47.46% (630 questions)
  5: 41.55% (1290 questions)
  6: 51.25% (2074 questions)
  7: 53.78% (1642 questions)
  8: 55.86% (1185 questions)
  9: 59.48% (1281 questions)
  10: 61.33% (1249 questions)
  11: 56.54% (994 questions)
  12: 61.91% (638 questions)
  13: 56.93% (462 questions)
  14: 58.84% (345 questions)
  15: 62.03% (237 questions)
  16: 63.25% (117 questions)
  17: 62.77% (94 questions)
  18: 65.79% (76 questions)
  19: 60.47% (43 questions)
  20: 56.25% (32 questions)
  21: 63.16% (19 questions)
  22: 75.00% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-20 01:44:43
Epoch: 9, Loss: 0.6903910459351245, Lr: 2e-05
Elapsed time: 5527, Speed(s/batch): 0.32908785358784626

Binary: 75.78%
Open: 41.45%
Accuracy: 57.21%
Distribution: 1.63 (lower is better)
Accuracy / structural type:
  choose: 74.58% (1129 questions)
  compare: 66.89% (589 questions)
  logical: 74.27% (1803 questions)
  query: 41.45% (6805 questions)
  verify: 79.93% (2252 questions)
Accuracy / semantic type:
  attr: 64.02% (5186 questions)
  cat: 46.30% (1149 questions)
  global: 59.87% (157 questions)
  obj: 86.50% (778 questions)
  rel: 48.55% (5308 questions)
Accuracy / steps number:
  1: 72.15% (237 questions)
  2: 51.34% (6395 questions)
  3: 60.34% (4266 questions)
  4: 63.68% (793 questions)
  5: 73.60% (822 questions)
  6: 82.93% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 29.80% (151 questions)
  4: 48.73% (630 questions)
  5: 44.65% (1290 questions)
  6: 54.44% (2074 questions)
  7: 55.54% (1642 questions)
  8: 60.00% (1185 questions)
  9: 64.09% (1281 questions)
  10: 63.33% (1249 questions)
  11: 59.66% (994 questions)
  12: 64.26% (638 questions)
  13: 59.31% (462 questions)
  14: 61.45% (345 questions)
  15: 64.14% (237 questions)
  16: 64.96% (117 questions)
  17: 62.77% (94 questions)
  18: 68.42% (76 questions)
  19: 76.74% (43 questions)
  20: 53.12% (32 questions)
  21: 73.68% (19 questions)
  22: 66.67% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-20 03:17:26
Epoch: 10, Loss: 0.5974722676920379, Lr: 2e-05
Elapsed time: 5601, Speed(s/batch): 0.3334813243748319

Binary: 75.92%
Open: 41.72%
Accuracy: 57.42%
Distribution: 1.49 (lower is better)
Accuracy / structural type:
  choose: 75.29% (1129 questions)
  compare: 66.38% (589 questions)
  logical: 74.32% (1803 questions)
  query: 41.72% (6805 questions)
  verify: 80.02% (2252 questions)
Accuracy / semantic type:
  attr: 64.02% (5186 questions)
  cat: 45.95% (1149 questions)
  global: 57.32% (157 questions)
  obj: 87.15% (778 questions)
  rel: 49.10% (5308 questions)
Accuracy / steps number:
  1: 75.11% (237 questions)
  2: 51.49% (6395 questions)
  3: 60.45% (4266 questions)
  4: 63.43% (793 questions)
  5: 74.21% (822 questions)
  6: 85.37% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 35.76% (151 questions)
  4: 51.27% (630 questions)
  5: 44.88% (1290 questions)
  6: 54.05% (2074 questions)
  7: 56.52% (1642 questions)
  8: 58.48% (1185 questions)
  9: 63.39% (1281 questions)
  10: 63.73% (1249 questions)
  11: 58.75% (994 questions)
  12: 64.11% (638 questions)
  13: 60.39% (462 questions)
  14: 64.06% (345 questions)
  15: 64.56% (237 questions)
  16: 65.81% (117 questions)
  17: 62.77% (94 questions)
  18: 72.37% (76 questions)
  19: 76.74% (43 questions)
  20: 65.62% (32 questions)
  21: 63.16% (19 questions)
  22: 75.00% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)

=====================================
nowTime: 2020-01-20 04:51:12
Epoch: 11, Loss: 0.5036989731823418, Lr: 4.000000000000001e-06
Elapsed time: 5452, Speed(s/batch): 0.3246474958766384

Binary: 75.91%
Open: 41.63%
Accuracy: 57.36%
Distribution: 1.54 (lower is better)
Accuracy / structural type:
  choose: 75.02% (1129 questions)
  compare: 67.91% (589 questions)
  logical: 74.32% (1803 questions)
  query: 41.63% (6805 questions)
  verify: 79.71% (2252 questions)
Accuracy / semantic type:
  attr: 64.29% (5186 questions)
  cat: 46.65% (1149 questions)
  global: 58.60% (157 questions)
  obj: 85.86% (778 questions)
  rel: 48.70% (5308 questions)
Accuracy / steps number:
  1: 75.95% (237 questions)
  2: 51.07% (6395 questions)
  3: 60.81% (4266 questions)
  4: 64.06% (793 questions)
  5: 73.97% (822 questions)
  6: 85.37% (41 questions)
  7: 100.00% (20 questions)
  8: 100.00% (3 questions)
  9: 100.00% (1 questions)
Accuracy / words number:
  3: 32.45% (151 questions)
  4: 49.68% (630 questions)
  5: 44.42% (1290 questions)
  6: 54.39% (2074 questions)
  7: 57.06% (1642 questions)
  8: 59.16% (1185 questions)
  9: 63.08% (1281 questions)
  10: 63.49% (1249 questions)
  11: 59.36% (994 questions)
  12: 63.79% (638 questions)
  13: 58.87% (462 questions)
  14: 64.06% (345 questions)
  15: 63.71% (237 questions)
  16: 66.67% (117 questions)
  17: 63.83% (94 questions)
  18: 73.68% (76 questions)
  19: 76.74% (43 questions)
  20: 65.62% (32 questions)
  21: 63.16% (19 questions)
  22: 66.67% (12 questions)
  23: 25.00% (4 questions)
  24: 100.00% (2 questions)
  25: 100.00% (1 questions)


Why the result is 0.01 when I validate

I ran your MFB model. For training, every step was done according to your requirements; the command was "python run.py --RUN='train' --MODEL='mfb' --DATASET='vqa'", and the resulting accuracy is about 65%.
When I validate, the command is "python run.py --RUN='val' --MODEL='mfb' --DATASET='vqa' --CKPT_V=7153401 --CKPT_E=13", where 7153401 is the version of the loaded model. But the result is 0.01. Why is it 0.01? I don't know what's wrong.

Setup Issues

Hello,

I have experienced some issues getting openvqa set up. First, the "setup.sh" file is not available in this repository, so I used the "setup.sh" from the mcan-vqa repository. Is this suitable?

This may be related to the first issue: when I try the first run command "python3 run.py --RUN='train' --MODEL='mcan_small' --DATASET='vqa'", I receive the error "./data/vqa/feats/train2014 NOT EXIST".

Thank you very much for your help.
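
On the "NOT EXIST" error: the message just reflects the directory layout the default config expects, so the downloaded feature archives need to end up under those exact paths. A quick check against the paths from the default vqa config:

```python
import os

# Default layout expected by the vqa config (FEATS_PATH / RAW_PATH).
expected = [
    "./data/vqa/feats/train2014",
    "./data/vqa/feats/val2014",
    "./data/vqa/feats/test2015",
    "./data/vqa/raw/v2_OpenEnded_mscoco_train2014_questions.json",
]
for p in expected:
    print(("OK      " if os.path.exists(p) else "MISSING ") + p)
# The feats directories must contain the .npz files extracted from the
# downloaded archives; a setup script from another repo may create a
# different layout, which would produce exactly this error.
```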

Question about the learning rates of butd, ban in gqa dataset.

Hello, according to the model zoo information for GQA, the learning rates of BUTD and BAN-4 are 2e-4. However, in configs/gqa/butd and configs/gqa/ban-4, the learning rate is set to 2e-3. I want to know which learning rate is the right one for the models (butd, ban-4, ban-8) in the model zoo.
Thank you!

Bbox problem

Thanks for this repo!

I'm trying to train BUTD on GQA but am running into several issues (fixing one causes the next, etc.). It seems there is an issue with the bounding-box handling: could you explain why the expected size of the bbox features is 5 values and not 4? This is defined in openvqa/core/base_cfgs.py.
I'm getting the following error when running as is:

File "openvqa/openvqa/core/base_dataset.py", line 87, in forward
return self.gqa_forward(feat_dict)
File "openvqa/openvqa/models/butd/adapter.py", line 55, in gqa_forward
bbox_feat = self.bbox_linear(bbox_feat)
File "lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "lib/python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (51200x4 and 5x1024)
(Note: batch size is 512)

The bbox layer expects 5 input values, but the data preprocessing code only produces 4 (which makes perfect sense to me, but the 5 is hard-coded, which seems odd).
If I change the 5 to 4, I get a gradient/backprop error and it crashes as well.

Any ideas here? Thanks!
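
On the 4-vs-5 question: a common convention for 5-dimensional bbox features is the four corner coordinates normalized by image size plus a fifth relative-area component, which would explain the hard-coded 5. A sketch of that preprocessing; whether it is exactly what this repo's loaders compute is an assumption to verify:

```python
import numpy as np

def proc_bbox(bbox, img_h, img_w):
    """bbox: (n, 4) array of pixel coords x1, y1, x2, y2 -> (n, 5) features."""
    feat = np.zeros((bbox.shape[0], 5), dtype=np.float32)
    feat[:, 0] = bbox[:, 0] / img_w   # normalized x1
    feat[:, 1] = bbox[:, 1] / img_h   # normalized y1
    feat[:, 2] = bbox[:, 2] / img_w   # normalized x2
    feat[:, 3] = bbox[:, 3] / img_h   # normalized y2
    feat[:, 4] = ((bbox[:, 2] - bbox[:, 0]) * (bbox[:, 3] - bbox[:, 1])
                  / (img_w * img_h))  # relative box area
    return feat
```

If the GQA feature files lack the image sizes needed for this step, a loader could fall back to 4-D boxes, which would produce exactly the 4-vs-5 mismatch in the error above.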

Reproducing model zoo results on GQA

Hello,
I am trying to reproduce the results reported in the model zoo for the GQA dataset. I am doing Train+val -> Test-dev with mcan_large on GQA using the following command:

python3 run.py --RUN='train' --SPLIT='train+val' --MODEL='mcan_large' --DATASET='gqa' --GPU='7' --VERSION='default_frcn+bbox+grid'

The accuracy from local evaluation is 56.23% and from the GQA evaluation server is 56.57% (within reasonable limits, I guess). But the reported accuracy in the model zoo for MCAN-large (frcn+bbox+grid) is 58.10%, which I think is a significant difference. Could you please tell me if I am doing something wrong? I used all the provided features as is and did not modify the code.

From 72.80% to 70.82% accuracy for single model (VQA-v2)

Hello.

First of all, thank you for this repo.

For the VQA Challenge 2018, you managed to reach 72.80% on the test-dev split. In the pretrained section, only a 70.82% accuracy model is available.

What do you think differs between the two models that accounts for the 2% drop?

Thank you very much in advance

how to evaluate the result in test split with clevr dataset

I have trained the CLEVR model with train+val and got result.txt by evaluating on the test split. I want to know how to use this .txt file to get the final result for the test split. In the Model Zoo, I found that you only do "train -> val" on CLEVR.

CLEVR problem

Why is this project's code reporting errors on the CLEVR dataset?
The error is:
Traceback (most recent call last):
File "/mnt/public/home/s-xuk/mcan-gqa/run.py", line 160, in <module>
execution.run(__C.RUN_MODE)
File "/mnt/public/home/s-xuk/mcan-gqa/utils/exec.py", line 33, in run
train_engine(self.__C, self.dataset, self.dataset_eval)
File "/mnt/public/home/s-xuk/mcan-gqa/utils/train_engine.py", line 143, in train_engine
) in enumerate(dataloader):
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/public/home/s-xuk/anaconda3/envs/python37/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/mnt/public/home/s-xuk/mcan-gqa/openvqa/core/base_dataset.py", line 36, in __getitem__
frcn_feat_iter, grid_feat_iter, bbox_feat_iter = self.load_img_feats(idx, iid)
File "/mnt/public/home/s-xuk/mcan-gqa/openvqa/datasets/clevr/clevr_loader.py", line 163, in load_img_feats
grid_feat = np.load(self.iid_to_grid_feat_path[iid], allow_pickle=True)
KeyError: '44448'

Error in Downloading Image Features

I've tried several times, but the error consistently occurs near the end (the last ~0.05%) of the download for train2014.tar.gz and test2015.tar.gz. val2014.tar.gz is fine. I suspect the reason might be the size of the files (that seems to be the only difference).

I was downloading directly in Chrome, because I don't know how to use the "wget" command with the provided shareable links.
Any solutions? Thanks!
