atlop's People

Contributors

wzhouad

atlop's Issues

The results of ATLOP based on the bert-base-cased model on the DocRED dataset

Hello,
I retrained ATLOP with the bert-base-cased model on the DocRED dataset. However, the best F1 and Ign F1 scores on the dev set are 58.81 and 57.09, respectively, which is much lower than the scores reported in your paper (61.09 and 59.22). Is the default model config correct? My environment is as follows:

Python 3.7.8
PyTorch 1.4.0
Transformers 3.3.1
apex 0.1
opt-einsum 3.3.0

Best regards

Any plans to release the codes for CDR?

Hello Zhou,

Thank you for releasing the code for your work.
Your paper reports experimental results on CDR, and I would like to reproduce them with your approach.
Do you have any plans to release the code for CDR?

about the labels

I see the following lines of code before the loss is returned:

    if labels is not None:
        labels = [torch.tensor(label) for label in labels]
        labels = torch.cat(labels, dim=0).to(logits)
        loss = self.loss_fnt(logits.float(), labels.float())
        output = (loss.to(sequence_output),) + output

and i also tried
Why can labels sometimes be None?
Am I getting something wrong?
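
For context, a forward method that accepts labels=None is a common pattern: labels are passed during training so that a loss can be returned, and they are omitted at evaluation or test time when only predictions are needed. Below is a minimal sketch of that pattern in plain Python. The names forward and toy_loss and the list-based "logits" are illustrative stand-ins, not the repository's code.

```python
from typing import List, Optional, Tuple


def toy_loss(logits: List[float], labels: List[float]) -> float:
    """Mean squared error, used here only as a placeholder loss."""
    return sum((x - y) ** 2 for x, y in zip(logits, labels)) / len(logits)


def forward(logits: List[float], labels: Optional[List[float]] = None) -> Tuple:
    """Return (predictions,) without labels and (loss, predictions) with labels."""
    predictions = [1 if x > 0 else 0 for x in logits]
    output = (predictions,)
    if labels is not None:
        # Training: gold labels are available, so prepend the loss.
        loss = toy_loss(logits, labels)
        output = (loss,) + output
    return output


train_out = forward([2.0, -1.0], labels=[1.0, 0.0])  # (loss, predictions)
eval_out = forward([2.0, -1.0])                      # (predictions,)
```

With this shape, the caller decides whether a loss is computed simply by passing or omitting labels.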

About the process_long_input.py

I got the following error. Could you help me? Thank you!

Traceback (most recent call last):
  File "train.py", line 228, in <module>
    main()
  File "train.py", line 216, in main
    train(args, model, train_features, dev_features, test_features)
  File "train.py", line 74, in train
    finetune(train_features, optimizer, args.num_train_epochs, num_steps)
  File "train.py", line 38, in finetune
    outputs = model(**inputs)
  File "D:\Anaconda\envs\pytorch-GPU\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\code\ATLOP\model.py", line 95, in forward
    sequence_output, attention = self.encode(input_ids, attention_mask)
  File "D:\code\ATLOP\model.py", line 32, in encode
    sequence_output, attention = process_long_input(self.model, input_ids, attention_mask, start_tokens, end_tokens)
  File "D:\code\ATLOP\long_seq.py", line 17, in process_long_input
    output_attentions=True,
  File "D:\Anaconda\envs\pytorch-GPU\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'output_attentions'
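
This TypeError says that the forward() being called does not declare an output_attentions parameter. One possible cause, stated here as an assumption because the issue does not list package versions, is an installed library version older than the one the code was written for. A minimal, library-independent sketch for checking whether a callable accepts a given keyword follows; accepts_keyword and the two toy functions are hypothetical names.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with `name=...` as a keyword."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Toy stand-ins for an older and a newer forward() signature.
def old_forward(input_ids, attention_mask=None):
    return input_ids


def new_forward(input_ids, attention_mask=None, output_attentions=None):
    return input_ids


old_ok = accepts_keyword(old_forward, "output_attentions")
new_ok = accepts_keyword(new_forward, "output_attentions")
```

The same check could be applied to the forward method of the loaded encoder to see whether the installed version supports the argument.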

--save_path issue

I edited the script file and added --save_path followed by a directory, but I can't see any saved models after running the script. Could you please explain in detail how to save a model?

About baseline implementation and results in the paper

Hello,
This is really nice work! I have some questions about the results in the paper: 1) What is the difference between BERT_base and BERT-E_base? 2) In the ablation study, what does "-Adaptive-Thresholding Loss" stand for? Does it mean that you use a TH class with random parameters?
Also, will you release your implementation of BERT-E_base?
Thanks!

Can you please release trained model?

Hi. Thank you for releasing the code for your model; it is really helpful.

I tried to retrain ATLOP with the bert-base-cased model on the DocRED dataset, but I can't reach the results reported in the paper. I also can't retrain the roberta-large model because I don't have a powerful enough GPU (the strongest GPU on Google Colab is a V100). Could you please release your trained models? I would be very happy if you could, and I believe it would help many other people too.

Thank you so much.

Query for test score

Could you tell me how to use result.json to get the test score in colab? Thank you very much!

Why do we not recalculate recall for the ignored setting?

Hi. I noticed that when we calculate the Ign F1 score, we only recalculate the precision but not the recall. Is there any reason why this is the case? I would think that we should recalculate both, but if I'm misunderstanding something please let me know.

Thanks!
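
The behaviour described above can be written out as a small worked example. This sketch assumes the metric works as the question states: correct predictions that also appear in the training annotations are removed from both the numerator and the denominator of precision, while recall is reused unchanged. The function name and the counts are made up for illustration.

```python
def f1_and_ign_f1(num_correct, num_predicted, num_gold, num_correct_in_train):
    """Return (F1, Ign F1), recomputing only precision for the ignored setting."""
    precision = num_correct / num_predicted if num_predicted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Ignored setting: drop correct facts already seen in the training data
    # from the precision computation only.
    ign_denominator = num_predicted - num_correct_in_train
    ign_precision = (
        (num_correct - num_correct_in_train) / ign_denominator
        if ign_denominator
        else 0.0
    )
    # Recall is the same value as above.
    ign_f1 = (
        2 * ign_precision * recall / (ign_precision + recall)
        if ign_precision + recall
        else 0.0
    )
    return f1, ign_f1


f1, ign_f1 = f1_and_ign_f1(
    num_correct=60, num_predicted=100, num_gold=120, num_correct_in_train=20
)
```

In this example precision drops from 0.6 to 0.5 in the ignored setting while recall stays at 0.5, which is the asymmetry the question is about.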

Inference

Could you please tell me how to perform inference and get output in the form of (entity, relation, entity) triples?
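
One possible way to obtain such triples is sketched below, under two assumptions: predictions are a list of records with title, h_idx, t_idx and r fields (the DocRED submission layout), and each document has a vertexSet whose entries are lists of mentions carrying a name. The helper to_triples and the sample data are hypothetical.

```python
def to_triples(predictions, docs):
    """Map index-based predictions to (head name, relation, tail name) triples."""
    docs_by_title = {doc["title"]: doc for doc in docs}
    triples = []
    for pred in predictions:
        vertex_set = docs_by_title[pred["title"]]["vertexSet"]
        # Use the first mention of each entity as its display name.
        head = vertex_set[pred["h_idx"]][0]["name"]
        tail = vertex_set[pred["t_idx"]][0]["name"]
        triples.append((head, pred["r"], tail))
    return triples


docs = [
    {
        "title": "Example",
        "vertexSet": [[{"name": "Apple Inc."}], [{"name": "Cupertino"}]],
    }
]
predictions = [{"title": "Example", "h_idx": 0, "t_idx": 1, "r": "P159"}]
triples = to_triples(predictions, docs)
```

The relation stays a code such as P159 here; a code-to-name mapping could be applied afterwards if readable relation names are wanted.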

Question regarding 'max-seq-length'

Thank you for this great work.

I encountered a warning while running prepro.py:

"WARNING:transformers.tokenization_utils:Token indices sequence length is longer than the specified maximum sequence length for this model (559 > 512). Running this sequence through the model will result in indexing errors"

As far as I know, the maximum input sequence length of the BERT model is 512 (as the warning says), while the default max-seq-length in your code is 1024.

My question is: are 'input_ids' longer than 512 truncated before they are passed to the BERT model, or is it fine to use inputs longer than 512?
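
A common way to feed inputs longer than the encoder limit is to encode two overlapping windows and average the embeddings where they overlap. To my understanding this is the approach the ATLOP paper describes for documents over 512 tokens, but the sketch below is only an illustration and not the repository's process_long_input. It handles lengths up to twice the window size, ignores special start and end tokens, and uses a hypothetical encode callable that returns one number per token.

```python
def encode_long(tokens, encode, window=512):
    """Encode up to 2 * window tokens with two overlapping windows."""
    n = len(tokens)
    if n <= window:
        return encode(tokens)
    if n > 2 * window:
        raise ValueError("this sketch only handles up to two windows")

    first = encode(tokens[:window])       # covers positions [0, window)
    second = encode(tokens[n - window:])  # covers positions [n - window, n)
    second_start = n - window

    merged = []
    for i in range(n):
        values = []
        if i < window:
            values.append(first[i])
        if i >= second_start:
            values.append(second[i - second_start])
        # Positions covered by both windows get the average of the two.
        merged.append(sum(values) / len(values))
    return merged


def toy_encode(chunk):
    """Placeholder encoder: one float per token."""
    return [float(token) for token in chunk]


embeddings = encode_long(list(range(559)), toy_encode, window=512)
```

With 559 tokens and a window of 512, positions 47 to 511 are covered by both windows and are averaged, and nothing is truncated.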

Where did the "/meta/rel2id.json" come from?

I only want to use the DocRED dataset, and it contains only "rel_info.json".
Could you please tell me how I can get rel2id.json? I tried renaming rel_info.json to rel2id.json, but then "ValueError: invalid literal for int() with base 10: 'headquarters location'" occurred in:
  File "train.py", line 197, in main
    train_features = read(train_file, tokenizer, max_seq_length=args.max_seq_length)
  File "/home/kw/ATLOP/prepro.py", line 56, in read_docred
    r = int(docred_rel2id[label['r']])
Thanks for your attention. I look forward to your reply.
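
The ValueError arises because rel_info.json maps relation codes to human-readable names (for example "headquarters location"), whereas the loader calls int() on the looked-up value and therefore expects integer ids. A minimal sketch for deriving an id mapping from rel_info.json follows. The id order used here (sorted codes, with a "Na" no-relation entry at 0) is an assumption and may not match the rel2id.json the authors used, so a model trained with one mapping must be evaluated with the same one.

```python
import json


def build_rel2id(rel_info, na_label="Na"):
    """Assign consecutive integer ids to relation codes, reserving 0 for no relation."""
    rel2id = {na_label: 0}
    for code in sorted(rel_info):
        rel2id[code] = len(rel2id)
    return rel2id


# Tiny stand-in for the contents of rel_info.json.
rel_info = {"P159": "headquarters location", "P17": "country"}
rel2id = build_rel2id(rel_info)
rel2id_json = json.dumps(rel2id, indent=2)
```

In practice rel_info would be loaded with json.load from the dataset directory and the result written to meta/rel2id.json with json.dump.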

How should I be running the Enhanced BERT Baseline model?

Hi. I recently tried to run the Enhanced BERT Baseline model (i.e., without adaptive threshold loss and local contextualized pooling) and just wanted to confirm if I'm doing it right.

Basically, in model.py lines 86-111 (i.e., the forward method) I modified the code so that I don't use rs and changed self.head_extractor and self.tail_extractor to have in_features and out_features accordingly. I did this because I'm assuming that within the get_hrt method, rs is what LOP is since we're using attention there. Modifying the extractors also implies that I'm not concatenating hs and ts with rs.

After that I changed loss_fnt to be a simple nn.BCEWithLogitsLoss rather than ATLoss. That means I also changed the get_label method within ATLoss to be a function so that I'm not depending on the class.

Am I doing this right? Or is there another way that I should be implementing it?

The reason why I'm suspicious as to whether I implemented this correctly or not is because I'm currently running the code on the TACRED dataset rather than the DocRED dataset, and while ATLOP itself shows satisfactory performance the performance of the Enhanced BERT Baseline is much lower.

Thanks.

end_tokens for RoBERTa?

Hi, could you please explain why you used two sep_token_id's for RoBERTa but only one sep_token_id for BERT? Thanks!

ATLOP/model.py

Line 31 in 1db77ab

end_tokens = [config.sep_token_id, config.sep_token_id]

How to get test score?

“The program will generate a test file result.json in the official evaluation format. You can compress and submit it to Colab for the official test score.”
After training ATLOP, I did get result.json. I compressed the file and submitted it to the DocRED CodaLab competition, but no score was shown. Is this the right way to get the F1 and Ign F1 on the test set?

The problem of setting environment

Hi @wzhouad ,

Sorry for disturbing you. After creating a new Conda environment with the requirements you listed, I got this error when running the command sh scripts/run_bert.sh:

  File "/home/cl/vanhien-t/anaconda3/envs/atlop/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 74, in __init__
    self.optimizer.step = with_counter(self.optimizer.step)
  File "/home/cl/vanhien-t/anaconda3/envs/atlop/lib/python3.7/site-packages/torch/optim/lr_scheduler.py", line 56, in with_counter
    instance_ref = weakref.ref(method.__self__)
AttributeError: 'function' object has no attribute '__self__'

I have tried to fix this error, but it is still not solved. Could you please give me some suggestions? Thank you very much!

Multi-gpu code

Can you please provide the multi-gpu version of the code?

Mention embedding

Hi there, thanks for your nice work. I'm a bit confused about the function get_hrt(): do you use the embedding of the first subword token as the mention embedding instead of summing over all the wordpieces? And is the offset used here due to the insertion of the special token "*"? Please correct me if I'm wrong, thanks!

Apply model on free text data

Hi, how can I apply the 'atlop-bert-base' model to unannotated free text, given that the input is required to be in the DocRED JSON format? Thank you!

The usage of the ATLoss

Thanks for your amazing work!
I am very interested in ATLoss, but I have a small question.
When using ATLoss, should we add a no-relation label?
For example, there are 26 relation types, and the gold labels may contain multiple relation types but always at least one.
How should no-relation be represented? Should I create a tensor of size 27 and set the first label to 1, or a tensor of size 26 with all labels set to zero?
I look forward to your reply. Many thanks.
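
For illustration, adaptive thresholding as described in the ATLOP paper adds one extra threshold class, so with 26 relation types each label vector would have 27 entries. The sketch below assumes index 0 is that threshold class and marks it for pairs with no relation. make_label is a hypothetical helper, not a function from the repository.

```python
def make_label(relation_ids, num_relations=26):
    """Build a multi-hot label with index 0 reserved for the threshold class."""
    label = [0] * (num_relations + 1)
    if not relation_ids:
        # No gold relation: only the threshold / no-relation slot is set.
        label[0] = 1
    for rel in relation_ids:
        if not 1 <= rel <= num_relations:
            raise ValueError(f"relation id {rel} is outside 1..{num_relations}")
        label[rel] = 1
    return label


no_relation = make_label([])
two_relations = make_label([3, 7])
```

Under this layout real relations occupy indices 1 to 26, so a size-26 all-zero vector would leave no slot for the threshold class.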

The main purpose of the function: get_label

Hi @wzhouad ,

Thanks so much for releasing your source code. I only wonder about the main purpose of the function get_label() in the file losses.py in calculating the final loss. Could you please explain it? Thanks for your help!
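
As background, a decoding step for adaptive thresholding can be sketched as follows: a class is predicted when its logit exceeds the logit of the threshold class, and the threshold class itself is returned when no class does. This is a plain-Python illustration of that idea, with index 0 assumed to be the threshold class. It is not the repository's get_label, which may differ in details such as an optional cap on the number of predicted labels.

```python
def decode_with_threshold(logits):
    """Return a multi-hot prediction using logits[0] as the per-example threshold."""
    threshold = logits[0]
    prediction = [
        1 if index > 0 and value > threshold else 0
        for index, value in enumerate(logits)
    ]
    if sum(prediction) == 0:
        # Nothing beats the threshold: predict the threshold / no-relation class.
        prediction[0] = 1
    return prediction


some_relations = decode_with_threshold([0.5, 1.2, -0.3, 0.9])
no_relation_pred = decode_with_threshold([2.0, 1.0, 0.0])
```

Because the threshold is itself a learned logit, the cut-off differs for each entity pair instead of being a single global value.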

Training on custom dataset

Hi,
I was trying to train ATLOP with the bert-base-cased model on a custom dataset, but the dev F1 is stuck at around 28-29 and does not increase even after 30 epochs of training. I also checked the train loss graph, and it shows normal behaviour (decreasing throughout and reaching the order of 10^-2 at the end).
Can you please suggest how to train the model to achieve a better dev F1 score?

The train_loss and dev-f1 score graphs are attached herewith for reference.

[Figure: train loss vs. steps]

[Figure: dev F1 vs. steps]

Is there a reason why the lengths of each chunk for the CDR dataset is 17 and not 23?

Hi. I'm currently trying to run your model with the BC5CDR dataset. I've noticed that you set the size of each data chunk to 17 rather than 23 on line 157 of prepro.py. Is there a reason for this? I'm under the impression that each sample in the CDR dataset is of length 23, and dividing the total length by 23 gives us an even number (e.g., 138 // 23 == 6). Setting the size to 17 also raises an assertion error in the chunks function.

Just curious if I'm missing anything. Thanks!

model.py

When I run train.py, there is an error in model.py:

  line 45, in get_hrt
    e_att.append(attention[i, :, start + offset])
IndexError: too many indices for tensor of dimension 1

Thanks.
