Giter VIP home page Giter VIP logo

aggcn's People

Contributors

cartus avatar yanzhangnlp avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

aggcn's Issues

About "M identical blocks"

Hello, I would like to ask you how to understand “M identical blocks” in the paper? and,What's the specific meaning of this? thank you!

Running AGGCN but came across an error (cuda runtime error (38))

Hello. Thank you for sharing your code.

My environment is Python 3.6.8, PyTorch 0.4.1, and CUDA 9.0.

I got errors as below:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
File "train.py", line 119, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/trainer.py", line 67, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/aggcn.py", line 22, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/aggcn.py", line 47, in init
self.gcn = AGGCN(opt, embeddings)
File "/code/AGGCN/model/aggcn.py", line 129, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/code/AGGCN/model/aggcn.py", line 205, in init
self.weight_list = self.weight_list.cuda()
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in cuda
return self._apply(lambda t: t.cuda(device))
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
module._apply(fn)
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 191, in _apply
param.data = fn(param.data)
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in
return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp:74

My GPU is empty, and all other example codes on GPU is fine.
Have you come across the error? Thank you.

Questions about the updated results

Hi, In the issue 7, you said that
"For the mean and std of F1 score in my experiments, the stats is 68.2% +- 0.5%. Also, we will update this score on our paper, for a fair an concrete comparison to other methods."

However, In the final version of ACL19 and the latest version in arxiv, the F1 score on TACRED are both 69.0. May I ask where the latest results were released?

code discussion

what does the parameters sublayer_first and sublayer_second mean in your code?

RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp:50

(GPU_pytorch) hz071@hamilton:~/AGGCNN/AGGCN$ bash train_aggcn.sh 1
Vocab size 53953 loaded from file
Loading data from dataset/tacred with batch size 50...
1363 batches created for dataset/tacred/train.json
453 batches created for dataset/tacred/dev.json
Config saved to file ./saved_models/01/config.json
Overwriting old vocab file at ./saved_models/01/vocab.pkl

Running with the following configs:
data_dir : dataset/tacred
vocab_dir : dataset/vocab
emb_dim : 300
ner_dim : 30
pos_dim : 30
hidden_dim : 300
num_layers : 2
input_dropout : 0.5
gcn_dropout : 0.5
word_dropout : 0.04
topn : 10000000000.0
lower : False
heads : 3
sublayer_first : 2
sublayer_second : 4
pooling : max
pooling_l2 : 0.002
mlp_layers : 1
no_adj : False
rnn : True
rnn_hidden : 300
rnn_layers : 1
rnn_dropout : 0.5
lr : 0.7
lr_decay : 0.9
decay_epoch : 5
optim : sgd
num_epoch : 100
batch_size : 50
max_grad_norm : 5.0
log_step : 20
log : logs.txt
save_epoch : 100
save_dir : ./saved_models
id : 1
info :
seed : 0
cuda : False
cpu : False
load : False
model_file : None
num_class : 42
vocab_size : 53953
model_save_dir : ./saved_models/01

Finetune all embeddings.
/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
"num_layers={}".format(dropout, num_layers))
THCudaCheck FAIL file=/tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp line=50 error=100 : no CUDA-capable device is detected
Traceback (most recent call last):
File "train.py", line 119, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/trainer.py", line 67, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 22, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 47, in init
self.gcn = AGGCN(opt, embeddings)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 129, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 205, in init
self.weight_list = self.weight_list.cuda()
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 201, in _apply
module._apply(fn)
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 223, in _apply
param_applied = fn(param)
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in
return self._apply(lambda t: t.cuda(device))
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/cuda/init.py", line 197, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp:50
(GPU_pytorch) hz071@hamilton:~/AGGCNN/AGGCN$ python
Python 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

import torch
torch.cuda.is_available()
True

I am using pytorch=1.4.0, python=3.7 and you can see above Cuda is available too. What should i do?

Dependency information of SemEval 2010 Task8

Thank you for your sharing. We are very fortunate to have such a responsible author. Before you release the code for SemEval 2010 Task8, I want to know how you obtain the dependency information for it. By StanfordNLP?

适用于中文数据集吗?

您好,
注意到论文中的实验都是在英文语料上完成的,我现在想应用到中文语料上,对于中文如何构造邻接矩阵,您有什么建议吗?
我的数据集示例如下:
{ "doc_id": "1", "paragraphs": [ { "paragraph_id": "0", "paragraph": "**成人2型糖尿病胰岛素促泌剂应用的专家共识", "sentences": [ { "sentence_id": "0", "sentence": "**成人2型糖尿病胰岛素促泌剂应用的专家共识", "start_idx": 0, "end_idx": 22, "entities": [ { "entity_id": "T0", "entity": "2型糖尿病", "entity_type": "Disease", "start_idx": 4, "end_idx": 9 }, { "entity_id": "T1", "entity": "2型", "entity_type": "Class", "start_idx": 4, "end_idx": 6 }, { "entity_id": "T2", "entity": "胰岛素促泌剂", "entity_type": "Drug", "start_idx": 9, "end_idx": 15 } ], "relations": [ { "relation_type": "Drug_Disease", "relation_id": "R0", "head_entity_id": "T2", "tail_entity_id": "T0" }, { "relation_type": "Class_Disease", "relation_id": "R1", "head_entity_id": "T1", "tail_entity_id": "T0" } ] } ] } }

Dependency type

Hi @Cartus
Thank you very much for your sharing. I would like to ask whether the dependency type information is used or whether only the edge between the nodes is considered and the specific edge type is not considered?

Evaluation score lower than reported

Hi,
I retrain the model with 5 different random seeds on TACRED. However, the average F1 score is 67.116(+-0.121), which is much lower than the reported score in your paper. Is the default model config correct? Also, how large is the std in your experiments?

关于准确率的问题

郭哥您好,在尝试复现论文代码时,发生了一些问题,执行语句为:python3 train.py --id $SAVE_ID --seed 0 --hidden_dim 300 --lr 0.7 --rnn_hidden 300 --num_epoch 100 --pooling max --mlp_layers 1 --num_layers 2 --pooling_l2 0.002

此时准确率均为100%且f1值均为0,详细截图如下:
2
1
希望您在方便时候能不吝赐教,谢谢!

Question on SemEval2010_task8 dataset

May I ask your parameters setting on SemEval2010_task8 dataset?
I can't get the performance on SemEval2010_task8 in the paper.
Thank you so much!

the Final Score

hello, I ran the code you gave and got the Final Score:
Precision (micro): 74.361%
Recall (micro): 30.617%
F1 (micro): 43.375%
Why is there a gap with the original result?

What is the function of tensor "denom" ?

In the Class of GraphConvLayer and MultiGraphConvLayer, there is a tensor called denom. I have tried to understand its meaning,however I failed.So please tell me this tensor's meaning.thanks!

` def forward(self, adj, gcn_inputs):
# gcn layer
denom = adj.sum(2).unsqueeze(2) + 1
outputs = gcn_inputs
cache_list = [outputs]
output_list = []

    for l in range(self.layers):  # 0, 1
        Ax = adj.bmm(outputs)
        AxW = self.weight_list[l](Ax)
        AxW = AxW + self.weight_list[l](outputs)  # self loop
        AxW = AxW / denom 
        gAxW = F.relu(AxW)
        cache_list.append(gAxW)
        outputs = torch.cat(cache_list, dim=2) 
        output_list.append(self.gcn_drop(gAxW))`

train set performance

hi, I would like to ask why the model on train set can only reach about 0.84 f1-value, does it means the model is under-fitting ? thanks!

Why the number of classes on the SemEval2010-Task 8 is only 10?

Hi @Cartus,
Sorry for disturbing you, but as I know, the number of classes in the SemEval2010-Task 8 is 19. (9 directed relations and a special Other class). Why the number of classes in your json semeval folder is only 10 (also in the dict LABEL_TO_ID in the file utils/constant.py)? I wonder that we are ignoring the directionality of relations?
Thank you for your consideration!

About replacing data sets

hello,
Sorry for disturbing you,When I replaced the data set with a Chinese data set, the loss became extremely huge. What is the reason?
Finetune all embeddings.
epoch 1: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_2.pt

epoch 2: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_3.pt

epoch 3: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_4.pt

epoch 4: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_5.pt

epoch 5: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_6.pt

epoch 6: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_7.pt

2020-12-23 08:47:06.380100: step 20/450 (epoch 7/150), loss = 576000088104739014705152.000000 (0.161 sec/batch), lr: 0.010000

请问我怎么在semeval数据集上运行您的代码?

您好,
我发现项目里有关于semeval的数据集和相关代码,但是在README里面并没有提及关于semeval的数据集和相关代码的任何事情。
因此,我想问一下,如果要在semeval数据集上运行您的代码,需要做哪些事情?模型在semeval数据集上的得分情况怎么样?

关于AGGCN模型细节的问题

师兄您好!
看了AGGCN那部分封装的模型代码,有个地方不了解。
在gcn层中定义的MoedlList,其中包含4个子层。分别是两个GraphConvLayer和两个MultiGarphConvLayer。因此我没明白为啥只有后两层才使用注意力机制?我理解的是不是对每条文本都要使用注意力机制生成全连接加权图吗?

how to get standford_head and stanford_deprel for cross-sentence data

hello, when I tried to generate the two fields(stanford_head and stanford_deprel) with the stanfordnlp tool,and converted the result into a recognizable pattern for your code, I found a little difference between my results and yours(pubmed dataset).

for example, the sentence"2. acquired resistance mechanism 1 ) secondary t790m mutation of the egfr gene unfortunately , many of those patient who originally had respond eventually become insensitive to gefitinib or erlotinib therapy through acquire resistance ."

my output:
屏幕快照 2021-04-15 下午5.54.09.png
your output:
屏幕快照 2021-04-15 下午5.56.15.png

we can see that,in your output ,there're many "-1" value and "self" content.

So,I wonder how do you get the output?Is it because our tool versions are different?

environment error

The version of PyTorch in Readme is 0.4.1, but the configuration environment cannot find that version. Can you provide the environment requirements for this project? It would be great if the requirements file was available!
image

some error about:"RuntimeError: cuda runtime error (100) : "

hello,when i run the " bash train_aggcn.sh",there are some errors.
but,i test my torch:
Default GPU Device: /device:GPU:0
Your tensorflow-gpu is available
Default GPU Device: Tesla P4
Your pytorch-gpu is available
and,my environment is
cuda:10.1
python:3.7.5
pytorch:1.3.
could you please help me to solve this error?
thank you very much!

Finetune all embeddings.
Traceback (most recent call last):
File "train.py", line 120, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/trainer.py", line 68, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 26, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 51, in init
self.gcn = AGGCN(opt, embeddings)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 133, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 206, in init
self.weight_list = self.weight_list.cuda()
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
return self._apply(lambda t: t.cuda(device))
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
param_applied = fn(param)
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in
return self._apply(lambda t: t.cuda(device))
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/cuda/init.py", line 193, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50

micro F1 and macro F1

The F1 score in your paper is macro F1, but in your code you calculate the micro score.
Can you share us with the code to calculate macro F1 score?
Thank you so much!

some question about your paper

Hello, I read your thesis recently, and I found that the formula (2) in your paper seems to be different from the algorithm in the code. In your code, calculating the attention guided adjacency matrix, the V matrix is not used, but V appears in your formula (2). If you calculate according to the formula (2) in your paper, some problems may arise in the dimensions of the data.

What method did you use to generate the dependency trees for the semeval dataset

Hi there. Thanks for posting your code!

I'm interested in applying this methodology to other datasets. My reading of it is that you preprocess data to derive the dependency tree before training. What method did you use to do this for the Semeval dataset, as this doesn't have the dependency tree annotated in the original data?

Question about the adjacent matrix

Hi~

In your abstract, you said "In this work, we propose Attention Guided Graph Convolutional Networks (AGGCNs), a novel model which directly takes full dependency trees as inputs", but I don's see how do you make use of the dependency tree.

According to Equation (2), seems you just construct the $\tilde{A}$ by using the input embeddings, so where do you make use of the original adjacency matrix $A$?

In your code, I still cannot confirm how do you make use of $A$, which is adj in your code, right?
https://github.com/Cartus/AGGCN_TACRED/blob/master/model/aggcn.py#L175

Could you please show me the details?
And by the way, that is the matrix $V$ in your Equation (2)? I can not find the definition of $V$ neither.

About case study and attention visualization mentioned in paper

hello, @Cartus ,Thanks for your wonderful work! Do you have the case study and attention visualization mentioned in your paper, I think that it's useful for me to analyse the model.

Case study and attention visualization of this example are provided in the supplementary material

Appreciate that If you could make it available, Thanks in advance!

Other dataset

Hi,
This code is about TACRED dataset, can you provide other code about other datasets mentioned in paper?
Thank!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.