yuweijiang / hgl-pytorch

46.0 46.0 13.0 237 KB

Code for the model "Heterogeneous Graph Learning for Visual Commonsense Reasoning (NeurIPS 2019)"

License: MIT License

Python 100.00%

hgl-pytorch's People

Forkers

ffzhang1231 ammieqi jaeyun95 wyuedgg wanboyang hanyu-liang autogyro hlhqbzd tzonglin66 daniel00008 hawksilent gaohuan2015 bhavya0324

hgl-pytorch's Issues

OSError: [Errno 5] Input/output error

Thanks for your nice code!
The error below occurred during the ninth or tenth training epoch.
I hope you can help!
Traceback (most recent call last):
  File "train.py", line 143, in <module>
    for b, (time_per_batch, batch) in enumerate(time_batch(train_loader if args.no_tqdm else tqdm(train_loader), reset_every=ARGS_RESET_EVERY)):
  File "/home/songzijie/project/HGL-pytorch-master/utils/pytorch_misc.py", line 29, in time_batch
    for i, item in enumerate(gen):
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/tqdm/std.py", line 1130, in __iter__
    for obj in iterable:
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
OSError: Traceback (most recent call last):
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/songzijie/project/HGL-pytorch-master/dataloaders/vcr.py", line 269, in __getitem__
    image = load_image(os.path.join(VCR_IMAGES_DIR, item['img_fn']))
  File "/home/songzijie/project/HGL-pytorch-master/dataloaders/box_utils.py", line 15, in load_image
    return default_loader(img_fn)
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torchvision/datasets/folder.py", line 147, in default_loader
    return pil_loader(path)
  File "/home/songzijie/.conda/envs/hgl/lib/python3.6/site-packages/torchvision/datasets/folder.py", line 128, in pil_loader
    with open(path, 'rb') as f:
OSError: [Errno 5] Input/output error: '/home/songzijie/project/HGL-pytorch-master/data/vcr1images/movieclips_The_Dark_Tower/[email protected]'
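Errno 5 (EIO) is reported by the operating system when a read fails at the device or filesystem level, so it typically points to the storage holding the images rather than to the training code. A minimal sketch, assuming the images live under a single directory, that reads every file once and lists the ones that cannot be read. The name `find_unreadable_files` is hypothetical and not part of the repository.

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Return (path, error) pairs for files under `root` that fail to read.

    Hypothetical helper for illustration; not part of hgl-pytorch.
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read the whole file in chunks so that a bad block
                    # anywhere in the file surfaces as an OSError.
                    while f.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, exc))
    return failures
```

Running it on the image directory before training (for example `find_unreadable_files("data/vcr1images")`, assuming that relative path) would show whether the failure is tied to specific files or to the disk as a whole.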

a question

Hello, this is wonderful work. But I don't see the CVM model. I want to know how to extract features from the images.

Where is allennlp-requirements.txt

Nice work at NIPS. Where is the allennlp-requirements.txt file?

I have a problem restoring a checkpoint!

Hi!
I have a problem restoring a checkpoint.
Training stopped, so I tried to restore it but got an error.
Help! T^T

restore is True
Found folder! restoring
Traceback (most recent call last):
  File "train.py", line 122, in <module>
    learning_rate_scheduler=scheduler)
  File "/home/ailab/HGL-pytorch/utils/pytorch_misc.py", line 226, in restore_checkpoint
    training_state = torch.load(training_state_path, map_location=device_mapping(-1))
  File "/home/ailab/anaconda3/envs/r2c/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
    return _load(f, map_location, pickle_module)
  File "/home/ailab/anaconda3/envs/r2c/lib/python3.6/site-packages/torch/serialization.py", line 549, in _load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 4859355 more bytes. The file might be corrupted.
terminate called after throwing an instance of 'c10::Error'
  what():  owning_ptr == NullType::singleton() || owning_ptr->refcount_.load() > 0 ASSERT FAILED at /opt/conda/conda-bld/pytorch_1549628766161/work/c10/util/intrusive_ptr.h:350, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /opt/conda/conda-bld/pytorch_1549628766161/work/c10/util/intrusive_ptr.h:350)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f3920592cf5 in /home/ailab/anaconda3/envs/r2c/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: THStorage_free + 0xca (0x7f38d72a68ea in /home/ailab/anaconda3/envs/r2c/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #2: <unknown function> + 0x12c11d (0x7f39208d011d in /home/ailab/anaconda3/envs/r2c/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #17: __libc_start_main + 0xf0 (0x7f39266a8830 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted (core dumped)
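The `unexpected EOF` message from `torch.load` indicates that the checkpoint file is shorter than expected, which typically happens when a save is interrupted (for example by a crash or a full disk). A truncated file cannot be repaired, but the risk can be reduced by writing each checkpoint to a temporary file and renaming it into place only after the write completes. A minimal sketch of that pattern using only the standard library; `atomic_save` and `write_fn` are hypothetical names, not functions from the repository.

```python
import os
import tempfile


def atomic_save(path, write_fn):
    """Write via `write_fn(file)` to a temp file, then rename it over `path`.

    Hypothetical helper for illustration; not part of hgl-pytorch.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            write_fn(f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Leave any previous checkpoint at `path` untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With PyTorch, the callback could be something like `lambda f: torch.save(training_state, f)`, since `torch.save` accepts a file-like object; how this would be wired into the training script is an assumption.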

the graph reasoning module issue

Thanks for sharing. Great work! But when I read the code alongside the paper, there are some parts that I cannot understand, especially the graph reasoning module. In the henG.py file, I cannot find the implementation of formula (6) in your paper.
My first question:
The e_obj variable corresponds to X_middle in your paper, right? But it is not the combination of Y_o and X_m with softmax. Perhaps it should be like this:
e_obj = self.fc_o_(torch.cat([s_obj, o_a_view], -1)) #49th line

My second question:
In formula (6), Y_v is the combination of X_middle and Y_o. But in your implementation, it seems that Y_v is the combination of X_middle and answer_view. So in my opinion, it should be like this:
A_obj = F.softmax(self.w_g_o(F.relu(self.w_s_o(o_a_view) + self.w_s_o_(e_obj))), dim=-2)

Am I right? I look forward to your reply! Thanks a lot!

How much GPU memory do you use?

Hi, thank you for your good work. I am trying to run your code, but I always get a CUDA out-of-memory error. I have three RTX 2080 Ti GPUs.

Issue of the pre-trained checkpoints

Dear author, thanks for sharing your work. I tried to reproduce your results on the validation set. However, the checkpoint you provided (https://drive.google.com/drive/folders/1ux9YG3sRmUVvsCt1nHwlB5Egw43NDsK1?usp=sharing) only achieves 63.7% accuracy on the validation set, which is the same as the original R2C model. I wonder if you mistakenly uploaded the wrong checkpoint files.

Answer acc: 0.637
Rationale acc: 0.705
Joint acc: 0.452
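For reference, the joint figure counts an example as correct only when both the answer and the rationale are chosen correctly, which is why it sits below either individual accuracy. A small sketch of how the three numbers relate, using made-up labels and predictions rather than real model output; `vcr_accuracies` is a hypothetical helper, not the repository's evaluation script.

```python
def vcr_accuracies(answer_preds, answer_labels, rationale_preds, rationale_labels):
    """Compute answer, rationale, and joint accuracy over the same examples.

    Hypothetical helper for illustration; inputs are lists of class indices.
    """
    n = len(answer_labels)
    lengths = {len(answer_preds), len(rationale_preds), len(rationale_labels), n}
    if n == 0 or len(lengths) != 1:
        raise ValueError("all inputs must be non-empty and of equal length")

    answer_hits = [p == y for p, y in zip(answer_preds, answer_labels)]
    rationale_hits = [p == y for p, y in zip(rationale_preds, rationale_labels)]
    # An example counts toward the joint score only if both parts are right.
    joint_hits = [a and r for a, r in zip(answer_hits, rationale_hits)]

    return {
        "answer": sum(answer_hits) / n,
        "rationale": sum(rationale_hits) / n,
        "joint": sum(joint_hits) / n,
    }


# Toy data (invented for this example): four questions, four choices each.
toy = vcr_accuracies(
    answer_preds=[0, 1, 2, 3],
    answer_labels=[0, 1, 2, 0],
    rationale_preds=[1, 1, 0, 2],
    rationale_labels=[1, 0, 0, 2],
)
```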

VCR - Dataset

Is there a way to obtain the dataset?
