hzfinfdu / plmtuningcompetition
Arena Competition 3 (Large-Scale Pre-trained Model Tuning Competition): example code and baseline implementation
License: Apache License 2.0
INFO:root:-----Yelp-----
INFO:root:13.csv : 0.9035087719298246
INFO:root:50.csv : 0.9039473684210526
INFO:root:60.csv : 0.9070175438596492
INFO:root:42.csv : 0.9105263157894737
INFO:root:8.csv : 0.9035087719298246
INFO:root:Yelp score: 0.9057017543859649
INFO:root:-----MRPC-----
INFO:root:13.csv : 0.6666666666666666
INFO:root:50.csv : 0.6261682242990655
INFO:root:60.csv : 0.6108374384236454
INFO:root:42.csv : 0.6082474226804123
INFO:root:8.csv : 0.5757575757575758
INFO:root:MRPC score: 0.6175354655654731
INFO:root:-----SNLI-----
INFO:root:13.csv : 0.46780597272570346
INFO:root:50.csv : 0.4037631624374245
INFO:root:60.csv : 0.3759709994821336
INFO:root:42.csv : 0.43776972207837045
INFO:root:8.csv : 0.4507163818401519
INFO:root:SNLI score: 0.4272052477127568
INFO:root:-----TREC-----
INFO:root:13.csv : 0.19495835601972836
INFO:root:50.csv : 0.2984302566215465
INFO:root:60.csv : 0.16571630696711015
INFO:root:42.csv : 0.25257346925973584
INFO:root:8.csv : 0.2845905306160343
INFO:root:TREC score: 0.23925378389683102
INFO:root:-----AGNews-----
INFO:root:13.csv : 0.7782894736842105
INFO:root:50.csv : 0.8041666666666667
INFO:root:60.csv : 0.8212719298245614
INFO:root:42.csv : 0.8081140350877193
INFO:root:8.csv : 0.8364035087719298
INFO:root:AGNews score: 0.8096491228070176
INFO:root:-----SST-2-----
INFO:root:13.csv : 0.904296875
INFO:root:50.csv : 0.88671875
INFO:root:60.csv : 0.890625
INFO:root:42.csv : 0.90234375
INFO:root:8.csv : 0.896484375
INFO:root:SST-2 score: 0.89609375
I ran the baseline to completion and compared my output against the submission example, but all three of my submissions scored 0.
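For anyone cross-checking their own numbers against a log like the one above: each task's "score" line appears to be the plain arithmetic mean of the five per-seed values printed before it. The sketch below checks this for SST-2 using only the values from the log; the variable names are illustrative and are not taken from the competition code.

```python
# Per-seed SST-2 values copied from the log (files 13, 50, 60, 42, 8).
sst2_per_seed = [0.904296875, 0.88671875, 0.890625, 0.90234375, 0.896484375]

# Assumption: the task score is the unweighted mean over the five seeds.
sst2_score = sum(sst2_per_seed) / len(sst2_per_seed)

print("SST-2 score:", sst2_score)
```

The same calculation on the other tasks' per-seed values agrees with their logged scores, so a locally reasonable mean alongside a 0 on the leaderboard points toward the submission itself (file names, format) rather than the averaging.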
When I run bbt.py (right-click, Run), it prints some information, as follows:
Example in train set:
{'text': "with outtakes in which most of the characters forget their lines and just utteruhhh , ' which is better than most of the writing in the movie ", 'labels': 0, 'input_text': "Xro target himself turn Europe worked energy scored * soon ball TV annual 2013 race International'd Market conferenceio o changesig officers inside form published phone co legal executive fightings hope summer officer football property@ book parents costsac manager create age email markets main . with outtakes in which most of the characters forget their lines and just utter
uhhh , ' which is better than most of the writing in the movie . It was .", 'target_text': 'bad'}
Example in dev set:
{'text': 'awe and affection -- and a strange urge to get on a board and , uh , shred , dude ', 'labels': 1, 'input_text': "Xro target himself turn Europe worked energy scored * soon ball TV annual 2013 race International'd Market conferenceio o changesig officers inside form published phone co legal executive fightings hope summer officer football property@ book parents costsac manager create age email markets main . awe and affection -- and a strange urge to get on a board and , uh , shred , dude . It was .", 'target_text': 'great'}
#of train data: 32
Example:
+------------------------+------------------------+----------+--------+
| input_ids | attention_mask | mask_pos | labels |
+------------------------+------------------------+----------+--------+
| [0, 1000, 1001, 100... | [1, 1, 1, 1, 1, 1, ... | 88 | 10999 |
+------------------------+------------------------+----------+--------+
#of dev data: 32
Example:
+------------------------+------------------------+----------+--------+
| input_ids | attention_mask | mask_pos | labels |
+------------------------+------------------------+----------+--------+
| [0, 1000, 1001, 100... | [1, 1, 1, 1, 1, 1, ... | 76 | 12338 |
+------------------------+------------------------+----------+--------+
Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-large and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[Embedding] mu: -0.018104109913110733 | std: 0.13582643866539001 [RandProj] mu: 0.0 | std: 0.006074342999950358
Population Size: 20
Serial Evaluation.
But when execution reaches line 515: fitnesses = [model_forward_api.eval(x) for x in solutions]
it raises an error, IndexError: tensors used as indices must be long, byte or bool tensors
The full error message is as follows:
Traceback (most recent call last):
File "D:/00-SourceCode/PyCharm/数模比赛/PLMTuningCompetition-main/bbt.py", line 353, in eval
logits = self.model(
File "D:\ProgramData\Anaconda3\envs\bbt\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\00-SourceCode\PyCharm\数模比赛\PLMTuningCompetition-main\models\modeling_roberta.py", line 991, in forward
'logits': self.lm_head(outputs[torch.arange(outputs.size(0)), mask_pos]),
IndexError: tensors used as indices must be long, byte or bool tensors
python-BaseException
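The error message says the index tensor has the wrong dtype. A minimal sketch of the usual workaround follows, assuming mask_pos reaches the indexing expression as a 32-bit integer tensor (a guess based on the traceback; the tensor shapes and values here are made up for illustration). Casting the index to int64 before indexing is accepted by every PyTorch version.

```python
import torch

# Toy stand-ins: `outputs` mimics hidden states of shape (batch, seq_len, hidden);
# `mask_pos` mimics one mask position per example, deliberately stored as int32.
outputs = torch.randn(2, 5, 3)
mask_pos = torch.tensor([1, 3], dtype=torch.int32)

# Cast the index tensor to int64 ("long") before using it for advanced indexing.
mask_pos = mask_pos.to(torch.int64)

# Select, for each example in the batch, the hidden state at its mask position.
picked = outputs[torch.arange(outputs.size(0)), mask_pos]
print(picked.shape)
```

Where the cast goes in bbt.py or modeling_roberta.py is a design choice; doing it once where mask_pos is built avoids repeating it on every forward call.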
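When running bbt.py, changing the dtype of the tensor (.to(torch.int64)) lets bbt.py run normally.
However, after training finished and produced best.pt, I tried to run test.py (after changing line 49 from for res, _, _ in test_api( to for res, _ in test_api(, since it raises an error otherwise) and got the following error: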
RuntimeError: CUDA out of memory. Tried to allocate 818.00 MiB (GPU 0; 6.00 GiB total capacity; 4.33 GiB already allocated; 0 bytes free; 5.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I was not running any other processes at the time.
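One common way to lower peak GPU memory during inference is to disable gradient tracking and feed the data in smaller chunks. Whether this alone is enough for test.py on a 6 GiB card is an assumption; the helper predict_in_chunks, the toy linear model, and the chunk_size parameter below are hypothetical and not part of the competition code.

```python
import torch


@torch.no_grad()  # no autograd graph is kept, which reduces memory use
def predict_in_chunks(model, inputs, chunk_size=8):
    """Run `model` over `inputs` a few rows at a time and return class ids."""
    model.eval()
    preds = []
    for start in range(0, inputs.size(0), chunk_size):
        chunk = inputs[start:start + chunk_size]
        logits = model(chunk)
        # Move each partial result to the CPU so it does not stay on the GPU.
        preds.append(logits.argmax(dim=-1).cpu())
    return torch.cat(preds)


# Toy usage on CPU: a linear layer stands in for the real model.
toy_model = torch.nn.Linear(4, 3)
toy_inputs = torch.randn(20, 4)
toy_preds = predict_in_chunks(toy_model, toy_inputs, chunk_size=8)
print(toy_preds.shape)
```

A smaller chunk size trades speed for lower peak memory; the error message's own hint about max_split_size_mb addresses fragmentation instead and is a separate knob.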
Running the example test.py raises an error
Traceback (most recent call last):
File "test.py", line 60, in <module>
for res, _ in test_api(
File "test_api.py", line 1066, in test_api
AttributeError: 'Dataset' object has no attribute 'remove_columns'
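An AttributeError like this often means the installed package is older than the code expects. Assuming the Dataset in question comes from the Hugging Face datasets package (a guess based on the class name), a first diagnostic step is to check which version is installed. The helper installed_version below is hypothetical and uses only the standard library.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the failing Dataset class is provided by the `datasets` package.
print("datasets version:", installed_version("datasets"))
```

If the reported version is old, upgrading the package in the same environment that runs test.py is the usual next step.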