Comments (11)

pvcastro commented on June 15, 2024

Sorry, missed the results folder

plkmo commented on June 15, 2024

Hi, yes, the training loss curve for SemEval training is in the results folder. Please note that the MTB model has been updated, and hence the old loss curve for MTB pre-training should be ignored.

pvcastro commented on June 15, 2024

Thanks @plkmo !
Are you uploading an updated one?

plkmo commented on June 15, 2024

Yup, will do so once I have the GPU compute available to satisfactorily pre-train it on suitable data.

pvcastro commented on June 15, 2024

[Plots attached: loss_vs_epoch_0, accuracy_vs_epoch_0]
I'm reopening since we're still discussing this πŸ˜…
I got these loss curves. Do you think they look OK?

plkmo commented on June 15, 2024

Looks good; I also got something like this with the CNN dataset. But note that the loss consists of lm_loss + MTB_loss. From what I can see, lm_loss seems to decrease much more than the MTB loss.

If you can, try a larger dataset for MTB pre-training, as the CNN dataset might be too small. E.g., the paper used Wikipedia dump data, which is huge.
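For intuition, here is a minimal sketch of what a combined lm_loss + MTB_loss objective can look like; the names, shapes, and weighting are illustrative assumptions, not the exact code in this repo. Tracking the two terms separately is what lets you see which one is actually decreasing.

```python
import torch.nn.functional as F

def combined_loss(lm_logits, lm_labels, rel_emb_a, rel_emb_b, mtb_labels, alpha=1.0):
    """Hypothetical joint objective: masked-LM loss plus a binary
    'matching the blanks' loss over pairs of relation embeddings."""
    # Masked-LM term: cross-entropy over the vocabulary, ignoring
    # non-masked positions (labelled -100 by convention).
    lm_loss = F.cross_entropy(
        lm_logits.view(-1, lm_logits.size(-1)),
        lm_labels.view(-1),
        ignore_index=-100,
    )
    # MTB term: dot-product similarity between the two relation embeddings,
    # trained with binary cross-entropy against same/different-relation labels.
    scores = (rel_emb_a * rel_emb_b).sum(dim=-1)  # shape: (batch,)
    mtb_loss = F.binary_cross_entropy_with_logits(scores, mtb_labels.float())
    # Return the components too, so each can be logged on its own.
    return lm_loss + alpha * mtb_loss, lm_loss.detach(), mtb_loss.detach()
```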

pvcastro commented on June 15, 2024

@plkmo From what I could see, you weren't able to get good results from MTB using CNN either, right? I did a pre-training run and applied it to the task afterwards, and the results were considerably worse than using BERT alone.

plkmo commented on June 15, 2024

Yeah, no good results pre-training MTB on the CNN dataset so far. Best is to directly fine-tune using pre-trained BERT.
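For reference, a minimal fine-tuning sketch using the Hugging Face transformers API; this is an illustrative assumption, not this repo's training loop, and the entity markers and label count are just the usual SemEval-style setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Entity-marker tokens for SemEval-style relation classification (assumed markers).
tokenizer.add_special_tokens({"additional_special_tokens": ["[E1]", "[/E1]", "[E2]", "[/E2]"]})

# SemEval-2010 Task 8 has 19 classes (9 directed relations x 2 directions + "Other").
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=19)
model.resize_token_embeddings(len(tokenizer))

text = "[E1] The system [/E1] produces a lot of [E2] noise [/E2] during operation."
inputs = tokenizer(text, return_tensors="pt")
labels = torch.tensor([0])  # dummy label for illustration

outputs = model(**inputs, labels=labels)
outputs.loss.backward()  # plug this into any optimizer / training loop
```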

potatoper commented on June 15, 2024

Sorry to bother you, but when I ran the program on the first day, it worked. The next day, however, it failed when I ran it again.

It said IndexError: ('list index out of range', 'occurred at index 47').
Please help; I would really appreciate it if you could take a look. Sorry to take up your time, and thanks a lot.

prog-bar: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8000/8000 [00:01<00:00, 4026.81it/s]
prog-bar: 1%| | 96/8000 [00:00<00:00, 13751.82it/s]
Traceback (most recent call last):
File "C:/article/MTB/main_task.py", line 49, in
net = train_and_fit(args)
File "C:\article\MTB\src\tasks\trainer.py", line 33, in train_and_fit
train_loader, test_loader, train_len, test_len = load_dataloaders(args)
File "C:\article\MTB\src\tasks\preprocessing_funcs.py", line 178, in load_dataloaders
train_set = semeval_dataset(df_train, tokenizer=tokenizer, e1_id=e1_id, e2_id=e2_id)
File "C:\article\MTB\src\tasks\preprocessing_funcs.py", line 133, in init
e1_id=self.e1_id, e2_id=self.e2_id), axis=1)
File "C:\ProgramData\Anaconda3\lib\site-packages\tqdm\std.py", line 767, in inner
return getattr(df, df_function)(wrapper, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py", line 6004, in apply
return op.get_result()
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py", line 142, in get_result
return self.apply_standard()
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py", line 248, in apply_standard
self.apply_series_generator()
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py", line 277, in apply_series_generator
results[i] = self.f(v)
File "C:\ProgramData\Anaconda3\lib\site-packages\tqdm\std.py", line 762, in wrapper
return func(*args, **kwargs)
File "C:\article\MTB\src\tasks\preprocessing_funcs.py", line 133, in
e1_id=self.e1_id, e2_id=self.e2_id), axis=1)
File "C:\article\MTB\src\tasks\preprocessing_funcs.py", line 129, in get_e1e2_start
e1_e2_start = ([i for i, e in enumerate(x) if e == e1_id][0] , [i for i, e in enumerate(x) if e == e2_id][0])
IndexError: ('list index out of range', 'occurred at index 47')
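For anyone hitting this: the error means one of the entity-marker token ids ([E1]/[E2]) was not found in a tokenized row, e.g. because the markers were never added to the tokenizer or the row is malformed/truncated. A defensive variant of get_e1e2_start (a sketch, not the repo's exact code; the column name below is assumed) lets you drop such rows instead of crashing:

```python
def get_e1e2_start(x, e1_id, e2_id):
    """Return the positions of the [E1] and [E2] marker tokens,
    or None if either marker is missing from the token-id sequence x."""
    try:
        e1_start = [i for i, e in enumerate(x) if e == e1_id][0]
        e2_start = [i for i, e in enumerate(x) if e == e2_id][0]
        return (e1_start, e2_start)
    except IndexError:
        return None  # marker token missing; filter these rows out afterwards

# After applying over the dataframe, rows without both markers can be dropped:
# df = df[df["e1_e2_start"].notnull()]
```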

zjucheri commented on June 15, 2024

Yeah, no good results pretraining MTB based on CNN dataset so far. Best is to directly fine-tune using pre-trained BERT.

My result from MTB pre-training on the CNN dataset is bad too, and pre-training takes a long time. I wonder how long it takes to pre-train MTB to get a good result?

drevicko commented on June 15, 2024

@zjucheri: I found that MTB training on the CNN data beyond about 9 epochs degraded performance on FewRel. The key to better performance is probably to use a larger (and perhaps more relevant, or at least more generic) dataset such as Wikipedia.
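Since more pre-training epochs can hurt, one simple safeguard is to save a checkpoint per epoch and keep only the one that scores best on a small downstream dev set before fine-tuning. A hypothetical sketch (not something this repo does; the paths and evaluation function are placeholders):

```python
def select_best_checkpoint(checkpoint_paths, evaluate_fn):
    """evaluate_fn(path) -> downstream validation score (e.g. FewRel dev accuracy)."""
    scores = {path: evaluate_fn(path) for path in checkpoint_paths}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Usage (hypothetical paths; run_fewrel_dev is a user-provided evaluation helper):
# best_ckpt, best_score = select_best_checkpoint(
#     [f"./checkpoints/epoch_{i}.pth" for i in range(1, 10)],
#     evaluate_fn=run_fewrel_dev,
# )
```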

@plkmo: Thanks for sharing your rather nice code :)
