snap-stanford / greaselm Goto Github PK


[ICLR 2022 spotlight]GreaseLM: Graph REASoning Enhanced Language Models for Question Answering

License: MIT License

Shell 2.17% Python 89.46% Jupyter Notebook 8.37%

knowledge-graph language-model question-answering graph-neural-networks commonsense-reasoning biomedical-ques

greaselm's Issues

How to know other node's text information?

Hello,
Thank you for providing such an excellent paper.
I am a student with a strong interest in this field. I understand that the OpenBookQA dataset has 4 answer choices per question and that, accordingly, 4 subgraphs are built per question.
Each subgraph has 200 nodes.
There are four types of nodes: context, question, answer, and other nodes. How can I find the text of the other nodes? Thanks for reading this long question. Have a good day.
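
One way to recover node text, sketched under assumptions: if each subgraph stores ConceptNet concept ids, and a vocabulary file lists one concept per line so that the line index is the concept id, the ids can be mapped back to strings. The file layout and the helper names below are assumptions for illustration, not confirmed details of this repository.

```python
from typing import Iterable, List


def load_concept_vocab(lines: Iterable[str]) -> List[str]:
    """Build an id -> text list, assuming line i holds the concept with id i."""
    return [line.strip() for line in lines]


def node_texts(concept_ids: Iterable[int], id2concept: List[str]) -> List[str]:
    """Map concept ids to readable text; underscores become spaces."""
    return [id2concept[cid].replace("_", " ") for cid in concept_ids]


# Toy vocabulary standing in for a real concept file (hypothetical content).
vocab = load_concept_vocab(["ab_intra\n", "sunlight\n", "solar_energy\n"])
print(node_texts([2, 1], vocab))
```

With a real vocabulary file, `load_concept_vocab(open(path, encoding="utf8"))` would play the same role, where `path` is whatever file holds the concept list.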

RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB

When we run greaselm.py, we get this error even with the batch size reduced to 8.

We tried batch sizes from 128 down to 8. Each time, the error occurs after some epochs, with a different amount of free memory reported. Can you help us resolve this issue and run the code?

logits, _ = model(*[x[a:b] for x in input_data])
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 85, in forward
  logits, attn = self.lmgnn(lm_inputs, concept_ids,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 217, in forward
  outputs, gnn_output = self.mp(input_ids, token_type_ids, attention_mask, output_mask, gnn_input, adj, node_type_ids, node_scores, special_nodes_mask, output_hidden_$
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 411, in forward
  encoder_outputs, _X = self.encoder(embedding_output,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 815, in forward
  _X = self.gnn_layers[gnn_layer_index](_X, edge_index, edge_type, _node_type, _node_feature_extra)
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_gnn.py", line 91, in forward
  aggr_out = self.propagate(edge_index, x=x, edge_attr=edge_embeddings) #[N, emb_dim]
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 261, in propagate
  coll_dict = self.__collect__(self.__user_args__, edge_index, size,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 171, in __collect__
  data = self.__lift__(data, edge_index,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 141, in __lift__
  return src.index_select(self.node_dim, index)
RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 15.78 GiB total capacity; 14.28 GiB already allocated; 133.50 MiB free; 14.39 GiB reserved in tot
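
The first frame of the traceback slices every input as `x[a:b]`, which suggests that each batch is fed to the model in smaller chunks. A minimal sketch of that chunking, assuming a hypothetical `mini_batch_size` setting: peak activation memory in the forward pass then depends on the chunk size, not on the full batch size, so lowering the chunk size (and not only the batch size) may be worth trying.

```python
from typing import Iterator, Tuple


def mini_batch_ranges(batch_size: int, mini_batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (a, b) slice bounds covering a batch in chunks of at most mini_batch_size."""
    if mini_batch_size <= 0:
        raise ValueError("mini_batch_size must be positive")
    for a in range(0, batch_size, mini_batch_size):
        yield a, min(a + mini_batch_size, batch_size)


# A batch of 8 examples processed 3 at a time (hypothetical sizes).
print(list(mini_batch_ranges(8, 3)))
```

Each `(a, b)` pair would be used as `x[a:b]` on every input tensor, with gradients accumulated across chunks before the optimizer step.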

RoBERTa Baseline

Hello,
Thank you very much for providing the implementation of your model.
I have a question regarding the RoBERTa baseline.
Unfortunately, I could not find the implementation of the RoBERTa baseline fine-tuning on the QA tasks in the repository.
Is it present in the repo, and have I overlooked it?
If not, what parameters were used for fine-tuning, and how was the classification layer implemented?
Many thanks in advance!

Cannot reshape array of size 0 into shape (0)

Hi Xikun @XikunZhang ,

Thanks for your great work. When I preprocessed csqa, I hit this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 337, in concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3
    adj, concepts = concepts2adj(schema_graph)
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 128, in concepts2adj
    adj = coo_matrix(adj.reshape(-1, n_node))
ValueError: cannot reshape array of size 0 into shape (0)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocess.py", line 131, in <module>
    main()
  File "preprocess.py", line 125, in main
    rt_dic['func'](*rt_dic['args'])
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 512, in generate_adj_data_from_grounded_concepts__use_LM
    res3 = list(tqdm(p.imap(concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3, res2), total=len(res2)))
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
ValueError: cannot reshape array of size 0 into shape (0)

I have tried to fix it by editing the line https://github.com/snap-stanford/GreaseLM/blob/803946bba3273556c1ff2be6ad8b02850fe5972d/preprocess_utils/graph.py#L128 to just ignore the reshape method if the array has size 0:

try:
        adj = coo_matrix(adj.reshape(-1, n_node))
except:
        print("FAIL concepts2adj")

I think my edit is incorrect, because evaluation then fails with this error:

points/csqa/csqa_model.pt
***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 74920
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1ziiml5l
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
loading adj matrices: 100%|███████████████████████████████████████████████████████████████████████| 48705/48705 [00:22<00:00, 2158.86it/s]
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate： 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 121, in __init__
    assert all(len(self.train_qids) == len(self.train_adj_data[0]) == x.size(0) for x in [self.train_labels] + self.train_encoder_data + self.train_decoder_data)
AssertionError

Could you give me some advice on how to fix the first error?

Thank you & BR,
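
The first error above most likely arises because `reshape(-1, n_node)` cannot infer the missing dimension when `n_node` is 0. A minimal sketch of one way around it, assuming the adjacency tensor has shape `(n_rel, n_node, n_node)`: compute the flattened shape explicitly, so that an empty concept list yields a valid empty matrix instead of an exception. The function name and the edge-list format are hypothetical, and the sketch uses plain Python lists instead of the repository's `coo_matrix` code.

```python
from typing import Iterable, List, Sequence, Tuple

Edge = Tuple[int, int, int]  # (source concept id, relation id, target concept id)


def flat_adjacency(node_ids: Sequence[int], edges: Iterable[Edge], n_rel: int):
    """Return COO-style (rows, cols, shape) for the (n_rel * n_node, n_node) layout."""
    n_node = len(node_ids)
    shape = (n_rel * n_node, n_node)  # explicit shape: nothing to infer when n_node == 0
    position = {cid: i for i, cid in enumerate(node_ids)}
    rows: List[int] = []
    cols: List[int] = []
    for src, rel, dst in edges:
        if src in position and dst in position and 0 <= rel < n_rel:
            # Entry adj[rel][src][dst] lands in row rel * n_node + src after flattening.
            rows.append(rel * n_node + position[src])
            cols.append(position[dst])
    return rows, cols, shape


print(flat_adjacency([], [], n_rel=3))
print(flat_adjacency([10, 20], [(10, 1, 20)], n_rel=3))
```

With SciPy, the same triplets could be passed as `coo_matrix((data, (rows, cols)), shape=shape)`. Whether the downstream loading code accepts an empty graph is a separate question that this sketch does not address.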

training time

Hi, thanks for your great work!

I'm trying to train this model with limited resources.

May I ask how many GPUs you used (were they V100s?) and how long training took (for CommonsenseQA)?

retrieve graph for single sentence input

Hi, thank you for introducing such an intriguing work.

Your proposed subgraph retrieval process is tailored to question-answer pair input.

Which parts of the code should I modify to retrieve a graph for a single-sentence input?

I ask because I'd like to build a custom pipeline that samples a ConceptNet subgraph for a given custom input string.
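
As a starting point, the grounding step for a single sentence can be sketched as n-gram matching against a concept vocabulary. This is a simplified illustration and not the repository's grounding code: the function name, the toy vocabulary, and the absence of lemmatization are all assumptions.

```python
from typing import List, Set


def ground_sentence(sentence: str, concept_vocab: Set[str], max_len: int = 3) -> List[str]:
    """Return vocabulary concepts that match word n-grams of the sentence."""
    tokens = sentence.lower().split()
    found: List[str] = []
    for n in range(max_len, 0, -1):  # try longer n-grams first
        for i in range(len(tokens) - n + 1):
            # ConceptNet-style concepts join words with underscores.
            candidate = "_".join(tokens[i:i + n])
            if candidate in concept_vocab and candidate not in found:
                found.append(candidate)
    return found


# Toy vocabulary (hypothetical); "plants" does not match "plant" without lemmatization.
vocab = {"solar_energy", "energy", "plant", "grow"}
print(ground_sentence("Plants use solar energy to grow", vocab))
```

The grounded concepts could then serve as the topic nodes for subgraph extraction, in place of the separate question and answer concept sets.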

the model "roberta-large" doesn't exist

Hello, while replicating this project recently, I found that the 'roberta-large' model no longer seems to exist on the Hugging Face website. Can another model be used in this project instead?

can not find the third glove file

Where can I find the file "../data/glove/tp_str_corpus.json"? Thank you.

Inquiry Regarding Experiment with Aristo-RoBERTa Encoder on OBQA Dataset in GreaseLM Paper

I am reaching out to seek assistance regarding my attempts to reproduce the experimental results mentioned in the GreaseLM paper, specifically concerning the utilization of the Aristo-RoBERTa encoder on the OBQA dataset.

Despite multiple attempts, I have been unable to replicate the performance reported in the paper. In order to facilitate my efforts, I would greatly appreciate it if you could provide more comprehensive details regarding the hyperparameters used in this particular experiment.

Your guidance on this matter would be immensely valuable to me, and I am eager to hear from you at your earliest convenience. Thank you very much for your attention to this matter.

FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

Hi Xikun,

Thanks for your great work. May I ask where I can get this inhouse_split_qids.txt file?

***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 493
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1iygcifx
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate： 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Finish loading training data.
Loading sparse adj data...
| ori_adj_len: mu 12.16 sigma 10.18 | adj_len: 13.16 | prune_rate： 0.00 | qc_num: 5.34 | ac_num: 1.54 |
Finish loading dev data.
Loading sparse adj data...
| ori_adj_len: mu 12.02 sigma 9.17 | adj_len: 13.02 | prune_rate： 0.00 | qc_num: 5.48 | ac_num: 1.53 |
Finish loading test data.
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 144, in __init__
    with open(inhouse_train_qids_path, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

BR,

Can we evaluate GreaseLM model on cpu?

Question about freezing LM parameter problem

Hi, I am very interested in this model. I would like to know why the LM parameters are frozen during the early epochs. As far as I know, LM parameters are usually fine-tuned in the early epochs (1~3) and then frozen.
I would really appreciate your help.
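
For reference, a freeze-then-unfreeze schedule can be written as a simple predicate on the epoch number. This is a sketch with hypothetical parameter names and values. It only illustrates the scheduling logic being asked about and does not confirm how the repository implements it.

```python
def lm_is_trainable(epoch: int, unfreeze_epoch: int, refreeze_epoch: int) -> bool:
    """True if the LM weights receive gradient updates in this epoch."""
    return unfreeze_epoch <= epoch < refreeze_epoch


# Hypothetical settings: frozen for epochs 0-1, trainable for 2-4, frozen again from 5.
schedule = [lm_is_trainable(e, unfreeze_epoch=2, refreeze_epoch=5) for e in range(6)]
print(schedule)
```

A common motivation for freezing a pretrained LM at the start is to let newly initialized layers settle before their gradients reach the pretrained weights. Whether that is the reason here is for the authors to confirm.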

Bert_large_model can't solved..

I am trying to solve the error: ` File "D:\GreaseLM\modeling\modeling_greaselm.py", line 583, in from_pretrained
raise EnvironmentError(msg)

'bert-large-uncased' is a correct model identifier listed on 'https://huggingface.co/models'
or 'bert-large-uncased' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.
` .

I tried to download all files from https://huggingface.co/google-bert/bert-large-uncased/tree/main but still, I don't know where to set this file or where the path to it is. Please help me with that.
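
The error message says the model name may also be a path to a directory containing one of the listed weight files, so pointing the model name at the download directory is what the message implies. A small sketch for checking such a directory before launching. The helper name is hypothetical, and the file names are taken from the error message above.

```python
import tempfile
from pathlib import Path
from typing import Optional

# File names quoted in the error message.
WEIGHT_FILES = ("pytorch_model.bin", "tf_model.h5", "model.ckpt")


def find_weight_file(model_dir: str) -> Optional[Path]:
    """Return the first recognised weight file in model_dir, or None if there is none."""
    base = Path(model_dir)
    for name in WEIGHT_FILES:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None


# Demo on a temporary directory standing in for the real download folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pytorch_model.bin").write_bytes(b"")
    found = find_weight_file(tmp)
    print(found.name if found else "no weight file found")
```

If the helper returns `None` for the folder where the files were saved, the download is probably incomplete or the path points one level too high or too low.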

About 2-hop node extraction function: According to code, I think it just extract 1hop nodes. Is there other code to change it?

In this code:

GreaseLM/preprocess_utils/graph.py

Line 315 in 65d7ac8

def concepts_to_adj_matrices_2hop_all_pair__use_LM__Part1(data):

This function is supposed to extract 2-hop nodes from the KG, but it actually seems to extract only 1-hop nodes.
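
The two readings can be reconciled if "2-hop" refers to the length of the path between two topic (question or answer) concepts, not to the distance of the added nodes from a topic concept. A sketch of that interpretation, which is an assumption about the intent and not a copy of `concepts_to_adj_matrices_2hop_all_pair__use_LM__Part1`: every added node is one hop from a topic node, yet it lies on a two-hop path between two topic nodes.

```python
from itertools import permutations
from typing import Dict, Iterable, Set


def two_hop_bridge_nodes(graph: Dict[str, Set[str]], topic_nodes: Iterable[str]) -> Set[str]:
    """Nodes in the middle of a 2-hop path between two distinct topic nodes."""
    topics = [n for n in topic_nodes if n in graph]
    extra: Set[str] = set()
    for src, dst in permutations(topics, 2):
        # A shared neighbour m gives the path src - m - dst (two hops).
        extra |= graph[src] & graph[dst]
    return extra - set(topics)


# Toy undirected graph (hypothetical concepts).
graph = {
    "bird": {"wing", "animal"},
    "fly": {"wing", "airplane"},
    "wing": {"bird", "fly"},
    "animal": {"bird"},
    "airplane": {"fly"},
}
print(two_hop_bridge_nodes(graph, ["bird", "fly"]))
```

Here only `wing` is added: `animal` and `airplane` are also one hop from a topic node but do not connect the two topic nodes.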

The experience on complex questions with semantic nuance

I would also like to run similar experiments on other complex questions with semantic nuance, but I cannot find the method you used to classify the questions (prepositional phrases, negation terms, hedging terms). If you still have the code for processing these questions, could you share it with me? Thanks!

[Discussion] Relevance of GreaseLM results in light of 'GNN is a Counter?..' paper + dataset discussion

Very interesting work on the combination of LM + KG! This is something I am looking into myself as a research project (https://github.com/apoorvumang/transformer-kgc), and I thought this would be a good place to discuss what datasets such models should be used on.

In the very recently released paper "GNN is a Counter? Revisiting GNN for Question Answering" (code at https://github.com/anonymousGSC/graph-soft-counter), the authors show that a 1-dim GNN + LM is able to achieve almost SOTA results on both OpenBookQA and CommonsenseQA. In fact, according to their numbers, it even outperforms GreaseLM on both of these datasets.

I would like to discuss a few things regarding the dataset situation:

1. The CommonsenseQA leaderboard no longer accepts ConceptNet-based submissions, which is quite a bummer, and OpenBookQA is extremely small (only 500 test and 500 dev questions, and around 5k train). Is it worth it (for me and others) to work with these datasets, given the findings of the 'GNN...' paper?
2. If not, could GreaseLM (and similar methods) be applied to regular KGQA datasets such as WebQuestionsSP, ComplexWebQuestions or GrailQA? This would of course be harder since it is no longer MCQ reasoning, but it might be more interesting and could give real evidence of LM + KG based reasoning.
3. Are there any other datasets, apart from the ones I mentioned, that could be relevant in this area? (MedQA-USMLE is of course one, but I feel it is quite new, and having another older, more established dataset would be an advantage.)

Looking forward to a healthy discussion! 😊

[Help] About the hyper-parameters to reproduce the result

@XikunZhang @michiyasunaga @roks

Hi,

Thanks for your great effort!

I've run the code in this repo with the hyper-parameters provided in the script run_greaselm.sh, which are also the same as those reported in the paper. But the results aren't as good as reported in the paper. For example, on csqa, the reported dev_acc and test_acc are 78.5 (±0.5) and 74.2 (±0.4) respectively, but the model I trained only reaches 77.48 and 73.01 respectively.

I've tried several random seeds, but the problem persists. Could you please release the hyper-parameters (in particular the random seeds) that you used when training the model?

Look forward to your response!
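
To make the size of the gap concrete, the numbers quoted above can be compared against the lower end of the reported intervals. The helper name is hypothetical, and the sketch treats "mean ± std" as a simple interval, which is only a rough reading of the reported figures.

```python
def gap_below_interval(observed: float, mean: float, std: float) -> float:
    """Distance by which observed falls below mean - std (0.0 if it does not)."""
    return round(max(0.0, (mean - std) - observed), 2)


dev_gap = gap_below_interval(77.48, mean=78.5, std=0.5)
test_gap = gap_below_interval(73.01, mean=74.2, std=0.4)
print(dev_gap, test_gap)
```

Both observed scores fall below one standard deviation under the reported mean, by 0.52 points on dev and 0.79 points on test.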
