PaddlePaddle / Knover
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle
License: Apache License 2.0
Running plato-2/scripts/24L_plato_interact.sh on a GPU works, but on a machine without a GPU, with the CPU version of paddlepaddle installed, it fails with:
E0811 21:28:37.228886 20545 pybind.cc:1277] Cannot use GPU because you have installed CPU version PaddlePaddle.
If you want to use GPU, please try to install GPU version PaddlePaddle by: pip install paddlepaddle-gpu
If you only have CPU, please change CUDAPlace(0) to be CPUPlace().
I have already tried replacing every CUDAPlace(0) with CPUPlace(). Is there anything else that needs to change?
Here is the git diff output:
https://gist.github.com/fancyerii/fa04cea4e94cf9408c5d6091697fd9fa
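Do I need to prepare this sentence_piece_model myself? Also, could the documentation on how to use this model be more detailed, or have I simply not found where the detailed instructions are? The work is great, but getting started is a bit hard for a newcomer. Thanks in advance for your help.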
https://github.com/PaddlePaddle/Knover/blob/master/plato-2/scripts/24L_plato_interact.sh#L18
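The file name and usage at this location have changed, but this line was not updated.
Why not provide a checkpoint download that includes __model__?
Training the Plato model fails with: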
File "/home/zzg/workspace/pycharm/Knover/knover/core/model.py", line 225, in load
self.args.start_step = start_step[0]
AttributeError: 'Plato' object has no attribute 'args'
The relevant code in model.py is:
if is_checkpoint:
    print(f"Load model from checkpoint: {model_path}")
    start_step = get_tensor("@LR_DECAY_COUNTER@")
    if start_step is not None:
        self.args.start_step = start_step[0]
Cause: args is indeed never set in __init__.
Question: should self.args = args be added to __init__? It looks like self.args is not used anywhere else.
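The failure can be reproduced outside Knover with a minimal sketch. The class names `BrokenModel` and `FixedModel` and the `load` signature are hypothetical stand-ins, not Knover's actual code. The only point is that `load` can write to `self.args` only if `__init__` stored it.

```python
from types import SimpleNamespace


class BrokenModel:
    """Hypothetical stand-in: __init__ receives args but never stores them."""

    def __init__(self, args):
        self.hidden_size = args.hidden_size  # args itself is discarded

    def load(self, start_step):
        # Raises AttributeError because self.args was never assigned.
        self.args.start_step = start_step


class FixedModel(BrokenModel):
    """Same model, but keeps a reference to args so load() can update it."""

    def __init__(self, args):
        super().__init__(args)
        self.args = args


args = SimpleNamespace(hidden_size=768, start_step=0)

try:
    BrokenModel(args).load(1000)
    broken_error = None
except AttributeError as err:
    broken_error = str(err)

FixedModel(args).load(1000)
print(broken_error)
print(args.start_step)
```

Whether storing `args` is the right fix for Knover itself depends on how `start_step` is consumed later, which the snippet quoted above does not show.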
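Training usually runs for a long time and is quite likely to be interrupted and then resumed, so resuming training is a real necessity. I suggest documenting how to resume training.
Also, resuming currently requires filling in the checkpoint path and the current start_step in the arguments by hand, which is too cumbersome. I suggest saving this information when a checkpoint is written, so that a resumed run can detect it first and automatically continue from the last step.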
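The suggestion could look roughly like the sketch below. It assumes a layout of one `step_<N>` directory per checkpoint with a small `train_state.json` inside. Both names, and the helpers `save_step_info` and `find_latest_checkpoint`, are hypothetical and not part of Knover.

```python
import json
import os
import tempfile


def save_step_info(save_dir, step):
    """Write the current step next to the checkpoint files (hypothetical layout)."""
    ckpt_dir = os.path.join(save_dir, f"step_{step}")
    os.makedirs(ckpt_dir, exist_ok=True)
    with open(os.path.join(ckpt_dir, "train_state.json"), "w") as f:
        json.dump({"step": step}, f)
    return ckpt_dir


def find_latest_checkpoint(save_dir):
    """Return (checkpoint_dir, step) for the highest saved step, or (None, 0)."""
    best_dir, best_step = None, 0
    if not os.path.isdir(save_dir):
        return best_dir, best_step
    for name in os.listdir(save_dir):
        state_file = os.path.join(save_dir, name, "train_state.json")
        if not os.path.isfile(state_file):
            continue
        with open(state_file) as f:
            step = json.load(f)["step"]
        if step > best_step:
            best_dir, best_step = os.path.join(save_dir, name), step
    return best_dir, best_step


with tempfile.TemporaryDirectory() as tmp:
    for s in (1000, 2000, 3000):
        save_step_info(tmp, s)
    latest_dir, latest_step = find_latest_checkpoint(tmp)
    latest_name = os.path.basename(latest_dir)
    print(latest_name, latest_step)
```

A training script could call `find_latest_checkpoint` at startup and fall back to a fresh run when it returns `(None, 0)`.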
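Thank you for open-sourcing such a good dialogue training toolkit. While studying it recently, I noticed that the example English vocabulary vocab.txt contains many words with a leading underscore-like mark (▁), alongside the same words without it. I don't understand what the marked words are for: they make the vocabulary longer and would also make the model harder to converge, so why are they there? Here are a few examples from the vocabulary: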
▁of 50
▁be 51
be 408
of 2530
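For background, SentencePiece uses "▁" (U+2581) to mark a piece that was preceded by a space, so "▁of" and "of" are different pieces: the first starts a word, the second continues one. The sketch below shows how the marker lets text be rebuilt from pieces. The piece splits are hand-written for illustration and are not the output of the actual spm.model.

```python
WORD_START = "\u2581"  # the "▁" marker SentencePiece uses for a preceding space


def detokenize(pieces):
    """Join subword pieces and turn each word-start marker back into a space."""
    return "".join(pieces).replace(WORD_START, " ").strip()


# "▁of" starts a new word; plain "of" continues the previous piece.
standalone = detokenize(["▁kind", "▁of", "▁nice"])
inside_word = detokenize(["▁pro", "of", "▁read"])
print(standalone)
print(inside_word)
```

Without the marker, the two sequences could not be told apart when converting pieces back to text.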
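The documentation gives --num_samples 20 --topk 5 as the default prediction parameters. I don't quite understand the relationship between num_samples and topk.
With topk 5, tokens are sorted by probability and the output token is sampled from the top 5.
What is num_samples 20 for, then?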
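One plausible reading, stated here as an assumption rather than as Knover's documented behaviour, is that topk controls how each token is drawn, while num_samples is the number of whole candidate responses drawn per context, which are ranked afterwards. The sketch below illustrates that reading on a toy model with fixed logits. The helper names `topk_sample` and `generate_candidates` are hypothetical.

```python
import math
import random


def topk_sample(logits, k, rng):
    """Sample one token id from the k highest-scoring entries of logits."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the top-k ids (shifted by the max for stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


def generate_candidates(step_logits, num_samples, k, seed=0):
    """Draw num_samples independent sequences; each token uses top-k sampling."""
    rng = random.Random(seed)
    return [
        [topk_sample(logits, k, rng) for logits in step_logits]
        for _ in range(num_samples)
    ]


# Toy "model": fixed logits over a 6-token vocabulary for 3 decoding steps.
step_logits = [
    [2.0, 1.5, 0.1, -1.0, -2.0, 0.5],
    [0.0, 0.2, 3.0, 2.5, -1.0, -3.0],
    [0.3, 1.0, 2.0, -5.0, -5.0, -5.0],
]
candidates = generate_candidates(step_logits, num_samples=20, k=2)
print(len(candidates), candidates[0])
```

Under this reading, a larger num_samples gives the ranking step more candidates to choose from, at a proportional cost in compute and memory.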
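Baidu's pretrained model is very strong, and at inference time it can respond according to a persona placed in the context. I want to train on my own data with my own persona settings, following the "your persona:" entries in the example data example/train.tsv. However, dialog_reader.py contains no special handling for the "your persona:" field. How did Baidu handle the "your persona:" information during training?
Following the documentation to run inference produced the error below: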
UnavailableError: Load operator fail to open file output/NSP/infer_model/encoder_layer_0_multi_head_att_key_fc.b_0, please check whether the model file is complete or damaged.
[Hint: Expected static_cast<bool>(fin) == true, but received static_cast<bool>(fin):0 != true:1.] (at /paddle/paddle/fluid/operators/load_op.h:41)
[operator < load > error]
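Is there a problem with the NSP model?
For sentiment classification I have used random masking and n-gram data augmentation. Would these augmentation methods help for dialogue tasks? Are there other effective data augmentation methods?
I have paddle 1.8.2.post107 and paddlehub 1.5.3 installed. The error message is: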
ERROR 2020-07-21 14:44:20,106 utils.py:422] ABORT!!! Out of all 2 trainers, the trainer process with rank=[0] was aborted. Please check its log.
Traceback (most recent call last):
File "/home/li.ma/anaconda3/lib/python3.7/site-packages/paddle/distributed/utils.py", line 406, in watch_local_trainers
terminate_local_procs(procs)
File "/home/li.ma/anaconda3/lib/python3.7/site-packages/paddle/distributed/utils.py", line 257, in terminate_local_procs
p.proc.join(timeout=1)
AttributeError: 'Popen' object has no attribute 'join'
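The last frame calls `join(timeout=1)` on a `subprocess.Popen` object. In the Python standard library, `join` is a method of `multiprocessing.Process`, while `Popen` offers `wait(timeout=...)`. The sketch below only demonstrates that difference with a trivial child process. It is not a patch for Paddle's launcher.

```python
import subprocess
import sys

# Start a trivial child process (a hypothetical stand-in for a trainer process).
proc = subprocess.Popen([sys.executable, "-c", "pass"])

has_join = hasattr(proc, "join")  # join() belongs to multiprocessing.Process

# subprocess.Popen exposes wait(timeout=...) instead.
try:
    return_code = proc.wait(timeout=30)
except subprocess.TimeoutExpired:
    proc.kill()
    return_code = proc.wait()

print(has_join, return_code)
```

Note that this AttributeError is raised while the launcher is cleaning up after rank 0 aborted, so the trainer log it points to is still the place to look for the original failure.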
Is it possible to give a context to the conversation, so that the setting description/persona can be given to the pretrained models beforehand?
If not, is there a possibility of incorporating something like this with the pretrained models?
Hi, first of all, thanks for the really nice work!
I'm facing very low download speeds for the 24L model, close to 20-30 KB/s using wget. Could you please help with an alternative mirror link? Thanks!
74 | fc_out = self._calc_logits(outputs["enc_out"], inputs["tgt_pos"])
75 | lm_loss = layers.softmax_with_cross_entropy(logits=fc_out, label=inputs["tgt_pos"])
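Line 74 of models/nsp_model.py looks wrong.
The arguments are wrong: there should be a checkpoints argument in between.
Also, the forward function of the NSP model should store checkpoints data the same way UnifiedTransformer does.
Could you also provide a more concrete training demo for PLATO-2?
For example, how to first train only UnifiedTransformer, and how to then train NSP and the model with the latent variable.
When calling train.py, batch_size can be set to around 8000 and one step takes about 200 s, whereas when calling infer.py, batch_size can only be very small, such as 4, 12 or less, and anything above 32 may run out of GPU memory. This contradicts the usual intuition that eval mode should be faster and use less memory than train mode. What is the reason?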
Hello,
In dialog_reader I found the following error:
/Knover/readers/dialog_reader.py:374:73: F821 undefined name 'do_test'
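1) The paper says of the NSP model: "To select the most appropriate responses generated by the fine-grained generation model, the evaluation model is trained to estimate the coherence of the responses."
I read this as training a classification model on the candidates generated in stage 2.1 plus labels.
But the mix_negative_sample implementation in the code builds negative examples by randomly replacing tgt, which seems inconsistent.
2) Does the final deployed model first generate candidates with 2.1 and then rank them with 2.2?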
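To make the second approach concrete, here is a minimal sketch of building coherence-classification data by random response replacement. The function `build_nsp_examples` and its parameters are hypothetical and written only for illustration. It is not Knover's mix_negative_sample code.

```python
import random


def build_nsp_examples(pairs, neg_ratio=1, seed=0):
    """Return (context, response, label) triples with random negatives mixed in.

    Each original pair is kept as a positive (label 1). For each positive,
    neg_ratio negatives are built by swapping in a response taken from a
    different pair (label 0). Requires at least two pairs.
    """
    rng = random.Random(seed)
    examples = []
    for idx, (context, response) in enumerate(pairs):
        examples.append((context, response, 1))
        others = [r for j, (_, r) in enumerate(pairs) if j != idx]
        for _ in range(neg_ratio):
            examples.append((context, rng.choice(others), 0))
    rng.shuffle(examples)
    return examples


pairs = [
    ("how are you ?", "fine , thanks ."),
    ("what do you do ?", "i am a teacher ."),
    ("any plans tonight ?", "just watching a movie ."),
]
examples = build_nsp_examples(pairs, neg_ratio=1)
for context, response, label in examples:
    print(label, context, "->", response)
```

Negatives built this way are unrelated responses written by humans, whereas the reading in 1) would use model-generated candidates as inputs. That is the gap the question points at.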
Traceback (most recent call last):
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/perfectworld/gx/Knover/knover/scripts/interact.py", line 83, in <module>
interact(args)
File "/home/perfectworld/gx/Knover/knover/scripts/interact.py", line 70, in interact
pred = task.infer_step(model, data)[0]
File "/home/perfectworld/gx/Knover/knover/core/task.py", line 46, in infer_step
outputs = self._post_process_infer_output(predictions)
File "/home/perfectworld/gx/Knover/knover/tasks/dialog_generation.py", line 162, in _post_process_infer_output
return self._post_process_generation_output(predictions)
File "/home/perfectworld/gx/Knover/knover/tasks/dialog_generation.py", line 91, in _post_process_generation_output
get_nsp_score_batch(self.nsp_predictor, predictions)
File "/home/perfectworld/gx/Knover/knover/tasks/dialog_generation.py", line 404, in get_nsp_score_batch
outputs = nsp_predictor(data)
File "/home/perfectworld/gx/Knover/knover/utils/inference_utils.py", line 44, in predict
return_numpy=True)
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
six.reraise(*sys.exc_info())
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
return_merged=return_merged)
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1238, in _run_impl
use_program_cache=use_program_cache)
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1313, in _run_program
fetch_var_name=fetch_var_name)
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 624, in _add_feed_fetch_ops
if not has_feed_operators(global_block, feed, feed_var_name):
File "/home/perfectworld/anaconda3/envs/dialogue/lib/python3.7/site-packages/paddle/fluid/executor.py", line 280, in has_feed_operators
format(feed_target_name))
Exception: 'feed_targets' does not have label_pos variable
Dear sir, have you run into this problem? How can I fix it?
I'd like to learn the whole workflow through the training script.
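The exception comes from `has_feed_operators` in the traceback and reads as though the loaded NSP inference program declares a feed variable named `label_pos` that is missing from the data being fed. That reading is an assumption. A quick way to check it is to compare the two sets of names, as in the sketch below. The helper `check_feed` and every name other than `label_pos` are hypothetical.

```python
def check_feed(expected_names, feed):
    """Compare the names a program expects with the keys actually being fed."""
    expected, provided = set(expected_names), set(feed)
    return sorted(expected - provided), sorted(provided - expected)


# Hypothetical names for illustration; only label_pos comes from the error above.
expected_names = ["token_ids", "pos_ids", "label_pos"]
feed = {"token_ids": [[1, 2, 3]], "pos_ids": [[0, 1, 2]]}

missing, unexpected = check_feed(expected_names, feed)
print("missing:", missing)
print("unexpected:", unexpected)
```

If a name turns up as missing, the exported NSP model and the code producing the feed were likely built from different versions of the repository.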
aistudio@jupyter-208728-1765888:~/Knover$ git branch -av
develop dcf05a0 Support PaddlePaddle 2.0.
* master 4bad22c Fix checkpoints and add document for continuous training (#31)
remotes/origin/HEAD -> origin/develop
remotes/origin/develop dcf05a0 Support PaddlePaddle 2.0.
remotes/origin/dygraph 5a2fbec Support dygraph in PaddlePaddle 2.0 and add lic2021 baseline
remotes/origin/luge-dialogue 1b03ac1 update score
remotes/origin/master 4bad22c Fix checkpoints and add document for continuous training (#31)
remotes/origin/plato-2 4bad22c Fix checkpoints and add document for continuous training (#31)
aistudio@jupyter-208728-1765888:~/Knover$ python infer.py --model Plato --task DialogGeneration --vocab_path ./projects/lic2021/conf/vocab.txt --spm_model_file ./projects/lic2021/conf/spm.model --infer_file ./data/lic2021/test.txt --data_format numerical --file_format file --config_path ./projects/lic2021/conf/12L_P.json --init_pretraining_params Plato --batch_size 2 --max_src_len 384 --max_tgt_len 128 --max_seq_len 512 --output_name response --decoding_strategy topk_sampling --do_generation True --num_samples 4 --topk 5 --is_cn True --do_generation true --save_path ./projects/lic2021/infer/output --log_step 10
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
{
"is_distributed": false,
"save_path": "./projects/lic2021/infer/output",
"infer_file": "./data/lic2021/test.txt",
"output_name": "response",
"log_steps": 10,
"Model": {
"model": "Plato",
"config_path": "./projects/lic2021/conf/12L_P.json",
"init_checkpoint": "",
"init_pretraining_params": "Plato",
"learning_rate": 1e-05,
"warmup_steps": 0,
"weight_decay": 0.0,
"max_grad_norm": 0.1,
"use_recompute": false,
"use_amp": false,
"amp_loss_scaling": 12800,
"max_seq_len": 512,
"weight_sharing": true,
"mem_efficient": false,
"use_bow": true,
"use_entropy": false,
"pre_encoder_cmd": "d",
"preprocess_cmd": "n",
"postprocess_cmd": "da",
"post_cls_cmd": "n",
"cls_bias": true,
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"max_position_embeddings": 512,
"latent_type_size": 20,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"role_type_size": 32,
"vocab_size": 30004
},
"Generator": {
"min_dec_len": 1,
"max_dec_len": 64,
"decoding_strategy": "topk_sampling",
"temperature": 1.0,
"ignore_unk": true,
"num_samples": 4,
"topk": 5,
"topp": 0.9,
"beam_size": 10,
"length_average": true,
"length_penalty": 0.0
},
"Task": {
"task": "DialogGeneration",
"do_generation": true,
"is_cn": true,
"nsp_inference_model_path": null,
"nsp_attention_style": "bidirectional",
"ranking_score": "decode_score"
},
"Reader": {
"max_src_len": 384,
"max_tgt_len": 128,
"truncate_first_turn": false,
"file_format": "file",
"data_format": "numerical",
"in_tokens": false,
"batch_size": 2,
"continuous_position": true,
"random_seed": 11,
"sort_pool_size": 65536
},
"Tokenizer": {
"tokenizer": "SentencePieceTokenizer",
"vocab_path": "./projects/lic2021/conf/vocab.txt",
"do_lower_case": false,
"spm_model_file": "./projects/lic2021/conf/spm.model"
},
"run_infer": true
}
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/unified_transformer.py:119
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/transformer_block.py:116
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/transformer_block.py:217
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:161
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
return (isinstance(seq, collections.Sequence) and
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:209
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:209
The behavior of expression A / B has been unified with elementwise_div(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_div(X, Y, axis=0) instead of A / B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:239
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:239
The behavior of expression A - B has been unified with elementwise_sub(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_sub(X, Y, axis=0) instead of A - B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
W0412 19:20:59.318835 4704 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0412 19:20:59.322726 4704 device_context.cc:372] device: 0, cuDNN Version: 7.6.
Load pretraining parameters from Plato.
Traceback (most recent call last):
File "infer.py", line 139, in <module>
infer(args)
File "infer.py", line 86, in infer
predictions = task.infer_step(model, data)
File "/home/aistudio/Knover/tasks/task_base.py", line 43, in infer_step
predictions = model.infer_step(inputs)
File "/home/aistudio/Knover/models/plato.py", line 280, in infer_step
return super(Plato, self).infer_step(inputs)
File "/home/aistudio/Knover/models/unified_transformer.py", line 439, in infer_step
predictions = self._run_generation(inputs)
File "/home/aistudio/Knover/models/unified_transformer.py", line 394, in _run_generation
return_numpy=False)
File "/home/aistudio/Knover/models/model_base.py", line 266, in _execute
fetch_vars = self.exe.run(program, feed, fetch_list, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
six.reraise(*sys.exc_info())
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
return_merged=return_merged)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1238, in _run_impl
use_program_cache=use_program_cache)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1328, in _run_program
[fetch_var_name])
ValueError: In user code:
File "infer.py", line 139, in <module>
infer(args)
File "infer.py", line 72, in infer
model = models.create_model(args, place)
File "/home/aistudio/Knover/models/__init__.py", line 49, in create_model
return MODEL_REGISTRY[args.model](args, place)
File "/home/aistudio/Knover/models/plato.py", line 49, in __init__
super(Plato, self).__init__(args, place)
File "/home/aistudio/Knover/models/unified_transformer.py", line 93, in __init__
super(UnifiedTransformer, self).__init__(args, place)
File "/home/aistudio/Knover/models/model_base.py", line 74, in __init__
self._build_programs()
File "/home/aistudio/Knover/models/model_base.py", line 91, in _build_programs
predictions = self.infer(inputs, outputs)
File "/home/aistudio/Knover/models/unified_transformer.py", line 380, in infer
return self.generator.inference(self, inputs, outputs)
File "/home/aistudio/Knover/models/generator.py", line 175, in inference
gather_idx=parent_idx)
File "/home/aistudio/Knover/models/unified_transformer.py", line 178, in _generation_network
gather_idx=gather_idx)
File "/home/aistudio/Knover/models/unified_transformer.py", line 202, in _encode
store=caches is not None
File "/home/aistudio/Knover/models/transformer_block.py", line 376, in encoder
store=store)
File "/home/aistudio/Knover/models/transformer_block.py", line 288, in encoder_layer
store=store)
File "/home/aistudio/Knover/models/transformer_block.py", line 158, in multi_head_attention
dropout_rate)
File "/home/aistudio/Knover/models/transformer_block.py", line 116, in scaled_dot_product_attention
product += attn_bias
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py", line 304, in __impl__
attrs={'axis': axis})
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3023, in append_op
attrs=kwargs.get("attrs", None))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2107, in __init__
for frame in traceback.extract_stack():
InvalidArgumentError: Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [160, 12, 160, 427] and the shape of Y = [160, 12, 1, 268]. Received [427] in X is not equal to [268] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:160)
[operator < elementwise_add > error]
aistudio@jupyter-208728-1765888:~/Knover$
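The hint in the error states the broadcast rule being enforced: for each axis, the two dimensions must be equal, or one of them must be at most 1. The sketch below applies that rule to the two shapes from the message to show which axis fails. The helper `first_broadcast_mismatch` is hypothetical and handles only shapes of equal rank.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return the first axis where two equal-rank shapes cannot broadcast, else None.

    Mirrors the condition in the hint: each pair of dims must be equal,
    or one of them must be <= 1.
    """
    if len(x_shape) != len(y_shape):
        raise ValueError("this sketch only handles shapes of equal rank")
    for i, (x, y) in enumerate(zip(x_shape, y_shape)):
        if not (x == y or x <= 1 or y <= 1):
            return i
    return None


x_shape = [160, 12, 160, 427]  # shape of X reported in the error
y_shape = [160, 12, 1, 268]    # shape of Y reported in the error
axis = first_broadcast_mismatch(x_shape, y_shape)
print(axis, x_shape[axis], y_shape[axis])
```

Axis 2 is fine because Y has size 1 there. Only the last axis (427 in the attention product against 268 in the bias) disagrees, which matches the "at i:3" in the message.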
'msg': "'UnifiedTransformerTokenizer' object has no attribute 'dialogue_encode'", 'results': '', 'status': '101'
Hi, while running the interactive script for both the 24L and 32L models, I faced the following CUBLAS error.
I'm running the script on Ubuntu 18.04 with 4 Tesla T4 16GB GPUs on GCP.
W0715 13:57:42.359799 16069 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 75, Driver API Version: 11.0, Runtime API Version: 10.0
W0715 13:57:42.362675 16069 device_context.cc:260] device: 0, cuDNN Version: 8.0.
Load pretraining parameters from ./24L/Plato.
Enter [EXIT] to quit the interaction, [NEXT] to start a new conversation.
[Human]: hey
/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py:1070: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "./interaction.py", line 83, in <module>
interact(args)
File "./interaction.py", line 72, in interact
pred = task.infer_step(model, data)[0]
File "/mnt/disks/disk-huge/bakht/Knover/tasks/task_base.py", line 46, in infer_step
predictions = model.infer_step(inputs)
File "/mnt/disks/disk-huge/bakht/Knover/models/plato.py", line 243, in infer_step
return super(Plato, self).infer_step(inputs)
File "/mnt/disks/disk-huge/bakht/Knover/models/unified_transformer.py", line 506, in infer_step
return self._run_generation(inputs)
File "/mnt/disks/disk-huge/bakht/Knover/models/unified_transformer.py", line 462, in _run_generation
return_numpy=False)
File "/mnt/disks/disk-huge/bakht/Knover/models/model_base.py", line 258, in _execute
fetch_vars = self.exe.run(program, feed, fetch_list, return_numpy=return_numpy)
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1071, in run
six.reraise(*sys.exc_info())
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1066, in run
return_merged=return_merged)
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1154, in _run_impl
use_program_cache=use_program_cache)
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1229, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:
--------------------------------------------
C++ Call Stacks (More useful to developers):
--------------------------------------------
0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 void paddle::operators::math::Blas<paddle::platform::CUDADeviceContext>::GEMM<float>(CBLAS_TRANSPOSE, CBLAS_TRANSPOSE, int, int, int, float, float const*, float const*, float, float*) const
3 void paddle::operators::math::Blas<paddle::platform::CUDADeviceContext>::MatMul<float>(paddle::framework::Tensor const&, paddle::operators::math::MatDescriptor const&, paddle::framework::Tensor const&, paddle::operators::math::MatDescriptor const&, float, paddle::framework::Tensor*, float) const
4 paddle::operators::MatMulKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::MatMulKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::MatMulKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::MatMulKernel<paddle::platform::CUDADeviceContext, paddle::platform::float16> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
7 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
8 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
9 paddle::framework::Executor::RunPartialPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, long, long, bool, bool, bool)
10 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
11 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)
------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/bakht/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 6414, in matmul
attrs=attrs)
File "/mnt/disks/disk-huge/bakht/Knover/models/plato.py", line 194, in forward
latent_emb = layers.matmul(x=weights, y=latent_embeddings, transpose_y=True)
File "/mnt/disks/disk-huge/bakht/Knover/models/model_base.py", line 90, in _build_programs
outputs = self.forward(inputs, is_infer=True)
File "/mnt/disks/disk-huge/bakht/Knover/models/model_base.py", line 74, in __init__
self._build_programs()
File "/mnt/disks/disk-huge/bakht/Knover/models/unified_transformer.py", line 98, in __init__
super(UnifiedTransformer, self).__init__(args, place)
File "/mnt/disks/disk-huge/bakht/Knover/models/plato.py", line 50, in __init__
super(Plato, self).__init__(args, place)
File "/mnt/disks/disk-huge/bakht/Knover/models/__init__.py", line 49, in create_model
return MODEL_REGISTRY[args.model](args, place)
File "./interaction.py", line 54, in interact
model = models.create_model(args, place)
File "./interaction.py", line 83, in <module>
interact(args)
----------------------
Error Message Summary:
----------------------
ExternalError: Cublas error, CUBLAS_STATUS_EXECUTION_FAILED at (/paddle/paddle/fluid/operators/math/blas_impl.cu.h:34)
[operator < matmul > error]
Hello, and thank you very much for open-sourcing this work.
The paper mentions both Chinese and English models, but I could only find the English open-source model (EN) on GitHub. Will the Chinese pretrained model be released?
I tried to use PLATO to run inference on example data, following the instructions at https://github.com/PaddlePaddle/Knover/tree/develop/projects/PLATO-2.
But I encountered the error below. My code branch is develop and Paddle is 2.0.1.
Could you give me some help with this issue?
aistudio@jupyter-208728-1765888:~/develop/Knover$ git branch
WARNING 2021-04-14 12:08:40,192 launch.py:316] Not found distinct arguments and compiled with cuda. Default use collective mode
launch train in GPU mode
INFO 2021-04-14 12:08:40,193 launch_utils.py:471] Local start 1 processes. First process distributed environment info (Only For Debug):
+=======================================================================================+
| Distributed Envs Value |
+---------------------------------------------------------------------------------------+
| PADDLE_TRAINER_ID 0 |
| PADDLE_CURRENT_ENDPOINT 127.0.0.1:56451 |
| PADDLE_TRAINERS_NUM 1 |
| PADDLE_TRAINER_ENDPOINTS 127.0.0.1:56451 |
| FLAGS_selected_gpus 0 |
+=======================================================================================+
INFO 2021-04-14 12:08:40,193 launch_utils.py:475] details abouts PADDLE_TRAINER_ENDPOINTS can be found in ./log/endpoints.log, and detail running logs maybe found in ./log/workerlog.0
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
{
"is_distributed": true,
"save_path": "./output",
"infer_file": "./data/dailydialog_test_60.tsv",
"output_name": "response",
"log_steps": 1,
"Model": {
"model": "Plato",
"config_path": "./projects/PLATO-2/24L.json",
"init_checkpoint": "",
"init_pretraining_params": "./24L/Plato",
"optimizer": "AdamW",
"learning_rate": 1e-05,
"warmup_steps": 0,
"lr_scheduler": "noam",
"max_training_steps": 2000,
"min_learning_rate": 0,
"weight_decay": 0.0,
"max_grad_norm": 0.1,
"use_recompute": false,
"use_amp": false,
"amp_loss_scaling": 32768.0,
"weight_sharing": true,
"mem_efficient": false,
"use_role": false,
"use_bow": true,
"use_entropy": false,
"pre_encoder_cmd": "d",
"preprocess_cmd": "n",
"postprocess_cmd": "da",
"post_cls_cmd": "n",
"cls_bias": true,
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"max_position_embeddings": 256,
"latent_type_size": 20,
"num_attention_heads": 16,
"num_hidden_layers": 24,
"type_vocab_size": 2,
"vocab_size": 8001
},
"Generator": {
"min_dec_len": 1,
"max_dec_len": 64,
"decoding_strategy": "topk_sampling",
"temperature": 1.0,
"ignore_unk": true,
"num_samples": null,
"topk": 10,
"topp": 0.9,
"beam_size": 10,
"length_average": true,
"length_penalty": 0.0
},
"Task": {
"task": "DialogGeneration",
"do_generation": true,
"is_cn": false,
"filter_cross_repetition": true,
"nsp_inference_model_path": "./24L/NSP",
"ranking_score": "nsp_score"
},
"Reader": {
"max_src_len": 128,
"max_tgt_len": 128,
"max_seq_len": 256,
"max_knowledge_len": 0,
"knowledge_position": "post_src",
"knowledge_style": "original",
"truncate_first_turn": false,
"file_format": "file",
"data_format": "raw",
"in_tokens": false,
"batch_size": 5,
"position_style": "continuous",
"random_seed": 11,
"shuffle_pool_size": 0,
"sort_pool_size": 65536
},
"Tokenizer": {
"tokenizer": "SentencePieceTokenizer",
"vocab_path": "./package/dialog_en/vocab.txt",
"specials_path": "",
"do_lower_case": false,
"spm_model_file": "./package/dialog_en/spm.model"
},
"run_infer": true
}
W0414 12:08:41.338814 1234 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0414 12:08:41.343097 1234 device_context.cc:372] device: 0, cuDNN Version: 7.6.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/models/unified_transformer.py:140
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/transformer_block.py:113
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/transformer_block.py:213
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
return (isinstance(seq, collections.Sequence) and
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/generator.py:225
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/generator.py:225
The behavior of expression A / B has been unified with elementwise_div(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_div(X, Y, axis=0) instead of A / B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/generator.py:255
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/develop/Knover/knover/modules/generator.py:255
The behavior of expression A - B has been unified with elementwise_sub(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_sub(X, Y, axis=0) instead of A - B. This transitional warning will be dropped in the future.
op_type, op_type, EXPRESSION_MAP[method_name]))
Loading model from ./24L/Plato.
Load pretraining parameters from ./24L/Plato
Traceback (most recent call last):
File "./knover/scripts/infer.py", line 140, in
infer(args)
File "./knover/scripts/infer.py", line 81, in infer
predictions = task.infer_step(model, data)
File "/home/aistudio/develop/Knover/knover/core/task.py", line 46, in infer_step
outputs = self._post_process_infer_output(predictions)
File "/home/aistudio/develop/Knover/knover/tasks/dialog_generation.py", line 162, in _post_process_infer_output
return self._post_process_generation_output(predictions)
File "/home/aistudio/develop/Knover/knover/tasks/dialog_generation.py", line 91, in _post_process_generation_output
get_nsp_score_batch(self.nsp_predictor, predictions)
File "/home/aistudio/develop/Knover/knover/tasks/dialog_generation.py", line 404, in get_nsp_score_batch
outputs = nsp_predictor(data)
File "/home/aistudio/develop/Knover/knover/utils/inference_utils.py", line 44, in predict
return_numpy=True)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
six.reraise(*sys.exc_info())
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
return_merged=return_merged)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1238, in _run_impl
use_program_cache=use_program_cache)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1313, in _run_program
fetch_var_name=fetch_var_name)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 624, in _add_feed_fetch_ops
if not has_feed_operators(global_block, feed, feed_var_name):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 280, in has_feed_operators
format(feed_target_name))
Exception: 'feed_targets' does not have label_pos variable
INFO 2021-04-14 12:08:55,230 launch_utils.py:307] terminate all the procs
ERROR 2021-04-14 12:08:55,230 launch_utils.py:545] ABORT!!! Out of all 1 trainers, the trainer process with rank=[0] was aborted. Please check its log.
INFO 2021-04-14 12:08:58,233 launch_utils.py:307] terminate all the procs
Is it possible to give a context to the conversation, so that the setting description/persona can be given to the pretrained models beforehand?
If not, is there a possibility of incorporating something like this with the pretrained models?
I don't quite understand: is the model English or Chinese?
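I generated a vocab file with sentencepiece, using model type unigram. The second column of that file is not an integer index but a floating-point probability, so training fails with: ValueError: invalid literal for int() with base 10:
The failing line is vocab[token] = int(index)
The conversion fails because index is a floating-point number.
Should I change the vocab file or the code?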
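One possible way to handle this on the vocab side is sketched below. It assumes the sentencepiece .vocab file has one "piece<TAB>score" entry per line, written in piece-id order, and that the loader expects "token<TAB>index". The function name convert_spm_vocab and the sample rows are made up for illustration; the sketch does not add any extra special tokens that the Knover example vocab may contain.

```python
def convert_spm_vocab(lines):
    """Turn "piece<TAB>score" lines into "piece<TAB>index" lines.

    Assumption: the line order of the .vocab file matches the piece ids,
    so the line number can be used as the integer index.
    """
    converted = []
    for index, line in enumerate(lines):
        # Keep only the piece; the float score in the second column is dropped.
        piece = line.rstrip("\n").split("\t")[0]
        converted.append(f"{piece}\t{index}")
    return converted


# Hypothetical sample rows in the unigram .vocab layout.
sample = ["<unk>\t0", "<s>\t0", "</s>\t0", "\u2581of\t-3.21", "be\t-5.87"]
for row in convert_spm_vocab(sample):
    print(row)
```

After this conversion every second column parses with int(), so a line such as vocab[token] = int(index) should no longer raise ValueError.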
Hi, thanks for your great work! I explored the plato-2 directory and found only .sh files. May I ask where the .py files are, so I can try the chatbot interaction? Thanks for your help!
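When running interact for multi-turn Q&A, I found that the replies in turns 2, 3, ... still answer the first turn's question instead of the later ones.
Why is that?
The source code concatenates all the history and the current question as the input token_ids, and the type_ids are all 0. I am not sure whether training works the same way.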
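Hello, I am retraining the NSP model on some data and found that the masking strategy can leave tgt_label empty after sampling.
Specifically, in the _pad_batch_records function of nsp_reader.py: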
batch_mask_token_ids, tgt_label, tgt_pos, label_pos = mask(
batch_tokens=batch_token_ids,
vocab_size=self.vocab_size,
bos_id=self.bos_id,
eos_id=self.eos_id,
mask_id=self.mask_id,
sent_b_starts=batch_tgt_start_idx,
labels=batch_label,
is_unidirectional=False)
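With this masking strategy, sometimes every sampled prob is > 0.15, so mask_label and mask_pos both end up empty.
I resample at this point until the result is non-empty, which works around the problem for now.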
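The resampling workaround described above could look roughly like the sketch below. It is a toy stand-in, not Knover's mask function: sample_mask_positions and sample_non_empty_mask are hypothetical names, and the only thing taken from the issue is the 0.15 threshold. A retry limit with a forced fallback position is added so the loop always terminates, even for very short sequences.

```python
import random


def sample_mask_positions(num_tokens, mask_prob=0.15, rng=random):
    """Pick each position independently with probability mask_prob."""
    return [i for i in range(num_tokens) if rng.random() < mask_prob]


def sample_non_empty_mask(num_tokens, mask_prob=0.15, max_tries=100, rng=random):
    """Resample until at least one position is masked, with a fallback."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    for _ in range(max_tries):
        positions = sample_mask_positions(num_tokens, mask_prob, rng)
        if positions:
            return positions
    # Fallback: force a single random position so the result is never empty.
    return [rng.randrange(num_tokens)]


rng = random.Random(0)
print(sample_non_empty_mask(5, rng=rng))
```

Note that resampling slightly raises the effective masking rate on short sequences, which may or may not matter for the NSP objective.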
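I have a question about how to organize training data for multi-turn dialogue.
Say a, b, c, d is one conversation.
Should it produce a single pair a, b, c -> d, or should every prefix be enumerated to generate the next utterance: a -> b, a b -> c, a b c -> d?
The first approach seems to lose some targets, since only the last utterance is learned;
the second seems somewhat redundant.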
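For reference, the second option (enumerating every prefix as context) can be sketched as follows. This only illustrates the data layout being asked about and is not a statement of how the PLATO training data was actually built; enumerate_pairs, max_context_turns, and the " | " separator are hypothetical choices.

```python
def enumerate_pairs(utterances, max_context_turns=None):
    """Build (context, response) pairs from one conversation.

    Every utterance after the first becomes a response, with all earlier
    utterances (optionally only the last max_context_turns) as context.
    """
    if max_context_turns is not None and max_context_turns < 1:
        raise ValueError("max_context_turns must be at least 1")
    pairs = []
    for i in range(1, len(utterances)):
        context = utterances[:i]
        if max_context_turns is not None:
            context = context[-max_context_turns:]
        pairs.append((context, utterances[i]))
    return pairs


for context, response in enumerate_pairs(["a", "b", "c", "d"]):
    print(" | ".join(context), "->", response)
```

The first option corresponds to keeping only the last pair in the returned list.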
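My understanding is that max_src_len is the maximum combined length of three parts: the persona, the dialogue history, and the previous utterance of the current turn; max_tgt_len is the maximum length of the next utterance in the current turn; and max_seq_len is the maximum of max_src_len + max_tgt_len. Is that right? If so, should max_src_len be set larger? In Chinese dialogue training one character is 2 bytes, so I set max_src_len=1600. How does this affect training?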
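I can understand not open-sourcing the Chinese data, but disclosing where the data came from should be fine, right?
The paper only mentions briefly that the Chinese data comes from Chinese social media. Could you be more specific?
Weibo, Douban groups, or Baidu Tieba?
The style and topics of the conversations differ quite a lot between sources, so I hope you can share this.
Thanks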
We are looking for interns and motivated researchers & engineers in dialogue systems.
Send your resume to [email protected] if you are interested.
What does the use_role parameter mark? Is it speakers A and B in the dialogue? How should it be used?
There is no model file in 24L/Plato, so it cannot be converted to ONNX, whereas NSP does have one. Why?
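From the rules in the code, the vocab must contain both [UNK] and <unk>, otherwise an error is raised. Both tokens represent unknown words, right? Is there any difference between them?
Also, in the example English vocab some tokens share the same ids, as shown below. I don't understand why. Won't the duplicated ids be overwritten? When building my own vocab, do I also need to duplicate ids like this?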
<unk> 0
<s> 1
</s> 2
[UNK] 0
[PAD] 0
[CLS] 1
[SEP] 2
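On the overwriting question, a small sketch may help. It assumes the vocab is read line by line into a token-to-id dictionary, as the vocab[token] = int(index) style of loading suggests, and that columns are tab-separated. load_vocab is a hypothetical helper, and the choice to keep the first token per id in the reverse mapping is an assumption made here, not necessarily what Knover does.

```python
def load_vocab(lines):
    """Read "token<TAB>index" lines into forward and reverse mappings."""
    token_to_id = {}
    id_to_token = {}
    for line in lines:
        token, index = line.rstrip("\n").split("\t")
        # The forward mapping is keyed by token, so aliases that share an id
        # do not overwrite each other.
        token_to_id[token] = int(index)
        # Assumption: keep the first token seen for each id when decoding.
        id_to_token.setdefault(int(index), token)
    return token_to_id, id_to_token


sample = ["<unk>\t0", "<s>\t1", "</s>\t2", "[UNK]\t0", "[PAD]\t0", "[CLS]\t1", "[SEP]\t2"]
token_to_id, id_to_token = load_vocab(sample)
print(token_to_id["<unk>"], token_to_id["[UNK]"], token_to_id["[PAD]"])
print(id_to_token[0])
```

In a layout like this, several token strings simply act as aliases for one id; only the id-to-token direction has to pick a single name.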
This line in projects/Plato-2/README.md:
bash ./scripts/local/job.sh ./project/PLATO-2/pretrain/24L_inference.conf
should be:
bash ./scripts/local/job.sh ./projects/PLATO-2/pretrain/24L_infer.conf
Same for all similar lines...
Are the Chinese model parameters currently open-sourced? Thanks!