hzfinfdu / plmtuningcompetition


Example code and baseline implementation for Arena Competition 3: the large-scale pre-trained model tuning competition

License: Apache License 2.0

Python 100.00%

plmtuningcompetition's People


yuxiangzhang0114 tinmnit lpsunny e0397123 jubgjf

plmtuningcompetition's Issues

[Announcement] Baseline per-task scores (the FNLP team's scores on the leaderboard)

INFO:root:-----Yelp-----
INFO:root:13.csv : 0.9035087719298246
INFO:root:50.csv : 0.9039473684210526
INFO:root:60.csv : 0.9070175438596492
INFO:root:42.csv : 0.9105263157894737
INFO:root:8.csv : 0.9035087719298246
INFO:root:Yelp score: 0.9057017543859649
INFO:root:-----MRPC-----
INFO:root:13.csv : 0.6666666666666666
INFO:root:50.csv : 0.6261682242990655
INFO:root:60.csv : 0.6108374384236454
INFO:root:42.csv : 0.6082474226804123
INFO:root:8.csv : 0.5757575757575758
INFO:root:MRPC score: 0.6175354655654731
INFO:root:-----SNLI-----
INFO:root:13.csv : 0.46780597272570346
INFO:root:50.csv : 0.4037631624374245
INFO:root:60.csv : 0.3759709994821336
INFO:root:42.csv : 0.43776972207837045
INFO:root:8.csv : 0.4507163818401519
INFO:root:SNLI score: 0.4272052477127568
INFO:root:-----TREC-----
INFO:root:13.csv : 0.19495835601972836
INFO:root:50.csv : 0.2984302566215465
INFO:root:60.csv : 0.16571630696711015
INFO:root:42.csv : 0.25257346925973584
INFO:root:8.csv : 0.2845905306160343
INFO:root:TREC score: 0.23925378389683102
INFO:root:-----AGNews-----
INFO:root:13.csv : 0.7782894736842105
INFO:root:50.csv : 0.8041666666666667
INFO:root:60.csv : 0.8212719298245614
INFO:root:42.csv : 0.8081140350877193
INFO:root:8.csv : 0.8364035087719298
INFO:root:AGNews score: 0.8096491228070176
INFO:root:-----SST-2-----
INFO:root:13.csv : 0.904296875
INFO:root:50.csv : 0.88671875
INFO:root:60.csv : 0.890625
INFO:root:42.csv : 0.90234375
INFO:root:8.csv : 0.896484375
INFO:root:SST-2 score: 0.89609375
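
Judging from the numbers in the log above, each task's score appears to be the plain arithmetic mean of the five per-seed results (13, 50, 60, 42, 8). That is an inference from the figures, not something the organizers state here. A minimal sketch that re-derives two of the task scores from the log text follows; the function name task_scores and the LOG excerpt are illustrative only and are not part of the competition code.

```python
import re
from collections import defaultdict
from statistics import mean

# Excerpt of the log above (only two tasks, to keep the example short).
LOG = """\
INFO:root:-----Yelp-----
INFO:root:13.csv : 0.9035087719298246
INFO:root:50.csv : 0.9039473684210526
INFO:root:60.csv : 0.9070175438596492
INFO:root:42.csv : 0.9105263157894737
INFO:root:8.csv : 0.9035087719298246
INFO:root:-----SST-2-----
INFO:root:13.csv : 0.904296875
INFO:root:50.csv : 0.88671875
INFO:root:60.csv : 0.890625
INFO:root:42.csv : 0.90234375
INFO:root:8.csv : 0.896484375
"""


def task_scores(log_text):
    """Average the per-seed results under each '-----Task-----' header.

    Hypothetical helper: assumes the task score is the mean over seeds.
    """
    per_task = defaultdict(list)
    task = None
    for raw in log_text.splitlines():
        line = raw.strip()
        header = re.match(r"INFO:root:-----(.+?)-----$", line)
        if header:
            task = header.group(1)
            continue
        seed = re.match(r"INFO:root:\d+\.csv : ([0-9.]+)$", line)
        if seed and task is not None:
            per_task[task].append(float(seed.group(1)))
    return {name: mean(values) for name, values in per_task.items()}


scores = task_scores(LOG)
for name, value in scores.items():
    print(f"{name} score: {value}")
```

The same check on the other tasks is a quick way to confirm that a locally computed score lines up with the baseline figures before submitting.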

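Score is 0

I ran everything following the baseline and also compared my output against the sample submission. All three of my submissions scored 0.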

Connection Issue

Problem reproducing the baseline

When I run bbt.py (right-click, Run), it prints some information, as follows:

Example in train set:
{'text': "with outtakes in which most of the characters forget their lines and just utter uhhh , ' which is better than most of the writing in the movie ", 'labels': 0, 'input_text': "Xro target himself turn Europe worked energy scored * soon ball TV annual 2013 race International'd Market conferenceio o changesig officers inside form published phone co legal executive fightings hope summer officer football property@ book parents costsac manager create age email markets main . with outtakes in which most of the characters forget their lines and just utter uhhh , ' which is better than most of the writing in the movie . It was .", 'target_text': 'bad'}
Example in dev set:
{'text': 'awe and affection -- and a strange urge to get on a board and , uh , shred , dude ', 'labels': 1, 'input_text': "Xro target himself turn Europe worked energy scored * soon ball TV annual 2013 race International'd Market conferenceio o changesig officers inside form published phone co legal executive fightings hope summer officer football property@ book parents costsac manager create age email markets main . awe and affection -- and a strange urge to get on a board and , uh , shred , dude . It was .", 'target_text': 'great'}
#of train data: 32
Example:
+------------------------+------------------------+----------+--------+
| input_ids | attention_mask | mask_pos | labels |
+------------------------+------------------------+----------+--------+
| [0, 1000, 1001, 100... | [1, 1, 1, 1, 1, 1, ... | 88 | 10999 |
+------------------------+------------------------+----------+--------+
#of dev data: 32
Example:
+------------------------+------------------------+----------+--------+
| input_ids | attention_mask | mask_pos | labels |
+------------------------+------------------------+----------+--------+
| [0, 1000, 1001, 100... | [1, 1, 1, 1, 1, 1, ... | 76 | 12338 |
+------------------------+------------------------+----------+--------+
Some weights of RobertaForMaskedLM were not initialized from the model checkpoint at roberta-large and are newly initialized: ['lm_head.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[Embedding] mu: -0.018104109913110733 | std: 0.13582643866539001 [RandProj] mu: 0.0 | std: 0.006074342999950358
Population Size: 20
Serial Evaluation.

However, when execution reaches line 515: fitnesses = [model_forward_api.eval(x) for x in solutions]
it raises IndexError: tensors used as indices must be long, byte or bool tensors
The full error message is as follows:

Traceback (most recent call last):
File "D:/00-SourceCode/PyCharm/数模比赛/PLMTuningCompetition-main/bbt.py", line 353, in eval
logits = self.model(
File "D:\ProgramData\Anaconda3\envs\bbt\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\00-SourceCode\PyCharm\数模比赛\PLMTuningCompetition-main\models\modeling_roberta.py", line 991, in forward
'logits': self.lm_head(outputs[torch.arange(outputs.size(0)), mask_pos]),
IndexError: tensors used as indices must be long, byte or bool tensors
python-BaseException
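
The failing line indexes outputs with mask_pos, and the error message says index tensors must be long, byte or bool. A minimal sketch of one possible workaround is to cast the index tensor to int64 before indexing. The shapes and values below are made up for illustration, and the assumption that mask_pos arrives as int32 is a guess rather than something the traceback confirms. Some newer PyTorch releases accept int32 indices directly, in which case the cast is harmless.

```python
import torch

# Toy stand-in for the encoder output: (batch, seq_len, hidden).
outputs = torch.randn(2, 5, 4)

# Assumed situation: mask positions arrive as int32 instead of int64.
mask_pos = torch.tensor([3, 1], dtype=torch.int32)

# Cast to int64 ("long") so the tensor is a valid index on any PyTorch version.
mask_pos = mask_pos.to(torch.int64)

# Same indexing pattern as in the traceback: pick one position per batch row.
selected = outputs[torch.arange(outputs.size(0)), mask_pos]
print(selected.shape)
```

Casting at the point where mask_pos is created, rather than inside the model's forward, keeps the change local to the data pipeline.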

Summary of submitted files that scored 0

RuntimeError

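When running bbt.py, I changed the dtype of the tensor (.to(torch.int64)), after which bbt.py ran normally.
After training produced best.pt, I tried to run test.py (changing line 49 from for res, _, _ in test_api( to for res, _ in test_api(, since it raises an error otherwise) and got the following error: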

RuntimeError: CUDA out of memory. Tried to allocate 818.00 MiB (GPU 0; 6.00 GiB total capacity; 4.33 GiB already allocated; 0 bytes free; 5.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

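I was not running any other processes at the time.

Could sentence_fn be explained in more detail?

I don't quite understand what this function does or where it sits in the prediction pipeline. Is there a more detailed explanation available here, or any reference material elsewhere?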

test_api ---- 'Dataset' object has no attribute 'remove_columns'

Running the example test.py raises an error

Traceback (most recent call last):
  File "test.py", line 60, in <module>
    for res, _ in test_api(
  File "test_api.py", line 1066, in test_api
AttributeError: 'Dataset' object has no attribute 'remove_columns'
