paddlepaddle / research

novel deep learning research works with PaddlePaddle

License: Apache License 2.0

Python 58.04% Shell 3.42% C++ 0.59% Cuda 0.14% Jupyter Notebook 35.88% Makefile 0.01% Jsonnet 0.26% Perl 1.66% C 0.01%

deep-learning computer-vision nlp knowledge-graph spatial-temporal data-mining

research's Issues

About ACL2020-GraphSum

Hi, thanks for your nice work accepted at ACL 2020. I followed the link given in the paper "Leveraging Graph to Improve Abstractive Multi-Document Summarization", but I could not find the code or results in this repo or in any other branch. How can I access the code and results?

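About the PLATO model: why is a latent variable needed?

According to the paper, the latent variable is there to better handle one-to-many generation. In practice, however, a seq2seq model can already generate by sampling (rather than deterministically with beam search), so in principle a seq2seq model already has one-to-many generation ability. The paper's assertion that a conventional seq2seq model cannot do one-to-many generation well therefore does not seem to hold.

So what is the point of the latent variable? Also, I did not see any regularization term on the latent variable. How do you ensure that its distribution does not degenerate into a one-hot distribution (that is, collapse to a single class, which is equivalent to having no latent variable)?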

Stuck with this error, please help (server not ready, wait 3 sec to retry)

Hello,
When I run the training of ACL2020-GraphSum I get the following error. Could you please let me know how to solve it?

I am using Ubuntu

Many thanks,

I deployed the DuIE_Baseline system as a service, so extraction results can be viewed online

http://www.junphy.com/wordpress/index.php/2020/07/17/duie-baseline/

How to download dataset

Could you upload the dataset to Google Drive? I can't download it from the link on the Baidu server.

Duplicates in the ACL2020_SignOrSymptom_Relationship

Thanks for publishing the KG!

It seems that there are duplicates in the disease-finding relations. For example,

支气管哮喘气喘 Symptom

appeared at least twice in the relations_respiration_all.txt. Would this matter to the results?

Best wishes,
A
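
One way to check how often a line repeats, and to produce a de-duplicated copy, is sketched below. It is only an illustration: the function names `count_duplicates` and `dedupe_lines` are hypothetical, the second sample line is a placeholder, and the sketch assumes each relation occupies exactly one line of the file, which the issue does not confirm.

```python
from collections import Counter


def count_duplicates(lines):
    """Return {line: count} for every non-empty line that occurs more than once."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return {line: n for line, n in counts.items() if n > 1}


def dedupe_lines(lines):
    """Keep the first occurrence of each non-empty line, preserving order."""
    seen = set()
    unique = []
    for line in lines:
        key = line.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy data standing in for relations_respiration_all.txt
# (assumed: one relation per line; the placeholder line is made up).
sample = [
    "支气管哮喘气喘 Symptom",
    "placeholder_disease placeholder_finding Symptom",
    "支气管哮喘气喘 Symptom",
]
duplicates = count_duplicates(sample)
unique = dedupe_lines(sample)
```

Both functions accept any iterable of strings, so an open file handle (opened with `encoding="utf-8"`) can be passed in directly. Whether duplicates affect the reported results depends on how the relations are consumed downstream, which this check cannot answer.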

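Multi-GPU run of NLP/DuReader-Robust-BASELINE fails with: Tensor holds no memory. Call Tensor::mutable_data first.

The training program of NLP/DuReader-Robust-BASELINE runs normally on a single GPU but fails on multiple GPUs. The details are as follows: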

C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::framework::Tensor::check_memory_size() const
3 long const* paddle::framework::Tensor::data() const
4 paddle::operators::LookupTableV2CUDAKernel::Compute(paddle::framework::ExecutionContext const&) const
5 std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::LookupTableV2CUDAKernel, paddle::operators::LookupTableV2CUDAKernel, paddle::operators::LookupTableV2CUDAKernelpaddle::platform::float16 >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const
7 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const
8 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&)
9 paddle::framework::details::ComputationOpHandle::RunImpl()
10 paddle::framework::details::ThreadedSSAGraphExecutor::RunOpSync(paddle::framework::details::OpHandleBase*)
11 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result, std::__future_base::_Result_base::_Deleter>, void> >::_M_invoke(std::_Any_data const&)
12 std::__future_base::_State_base::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>&, bool&)
13 ThreadPool::ThreadPool(unsigned long)::{lambda()#1}::operator()() const

Python Call Stacks (More useful to users):

File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2459, in append_op
attrs=kwargs.get("attrs", None))
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/input.py", line 268, in embedding
'padding_idx': padding_idx
File "/home/zhan/Research-master/NLP/DuReader-Robust-BASELINE/src/model/ernie.py", line 97, in _build_model
name=self._pos_emb_name, initializer=self._param_initializer))
File "/home/zhan/Research-master/NLP/DuReader-Robust-BASELINE/src/model/ernie.py", line 81, in __init__
self._build_model(src_ids, position_ids, sentence_ids, input_mask)
File "", line 39, in create_model
use_fp16=args.use_fp16)
File "", line 6, in <module>
is_training=True)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3254, in run_ast_nodes
if (await self.run_code(code, result, async=asy)):
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3063, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
coro.send(None)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2886, in _run_cell
return runner(coro)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2858, in run_cell
raw_cell, store_history, silent, shell_futures)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/ipkernel.py", line 300, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/kernelbase.py", line 545, in execute_request
user_expressions, allow_stdin,
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 209, in wrapper
yielded = next(result)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/kernelbase.py", line 365, in process_one
yield gen.maybe_future(dispatch(*args))
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 714, in __init__
self.run()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 225, in wrapper
runner = Runner(result, future, yielded)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/kernelbase.py", line 381, in dispatch_queue
yield self.process_one()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/gen.py", line 787, in inner
self.run()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/ioloop.py", line 743, in _run_callback
ret = callback()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/ioloop.py", line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel/kernelapp.py", line 583, in start
self.io_loop.start()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/traitlets/config/application.py", line 664, in launch_instance
app.start()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/site-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/zhan/anaconda3/envs/paddle/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)

Error Message Summary:

PaddleCheckError: holder_ should not be null
Tensor holds no memory. Call Tensor::mutable_data first. at [/paddle/paddle/fluid/framework/tensor.cc:23]
[operator < lookup_table_v2 > error]

Environment:
paddlepaddle-gpu 1.6.1.post107
cuda 10.0
cudnn 7.6.4
nccl 2.6.4

InvalidArgumentError: The shape of input[0] and input[1] is expected to be equal.But received input[0]'s shape = [-1, 0, 1], input[1]'s shape = [-1, 1, 1, 1].

I'm trying to test the GraphSum model with the command ./scripts/predict_graphsum_local_multinews.sh found in the documentation, but I get this error:

Error Message Summary:
----------------------
InvalidArgumentError: The shape of input[0] and input[1] is expected to be equal.But received input[0]'s shape = [-1, 0, 1], input[1]'s shape = [-1, 1, 1, 1].
  [Hint: Expected inputs_dims[i].size() == out_dims.size(), but received inputs_dims[i].size():4 != out_dims.size():3.] at (/paddle/paddle/fluid/operators/concat_op.h:40)
  [operator < concat > error]

I can't really debug this issue in PyCharm because of all the shell scripts involved. Any advice would be greatly appreciated. Thanks.

Research/CV/PaddleReid/process_aicity_data/

2_prepare_real_trainlist.py
Line 37: all_ids.append(vid) looks wrong. A set does not support the append method; it should be add.

2_prepare_syn_trainlist.py
Lines 29 and 30:
color = s.attributes['colorID'].value
cartype = s.attributes['typeID'].value
Not every vehicle has the colorID attribute, however, so the program crashes.

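Can PLATO be trained from scratch on a Chinese corpus?

Hello, I noticed that PLATO does not have a Chinese version yet. I want to use the PLATO model on my own Chinese dataset. Can it be trained from scratch? If so, how should I train it? Directly using the English pretrained checkpoint surely won't work, right?

Thanks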

AttributeError: 'set' object has no attribute 'append'

I am preparing the AICity 2020 data and followed the instructions provided with the process_aicity_data scripts. However, I hit the error below and could not figure out how to solve it. Please help me solve this issue.
PS: I am very new to this, so I would really appreciate it.

Traceback (most recent call last):
File "2_prepare_real_trainlist.py", line 37, in <module>
all_ids.append(vid)
AttributeError: 'set' object has no attribute 'append'
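
The error arises because Python's built-in `set` has no `append` method; elements are added with `add`. A minimal sketch of the fix follows. Only the name `all_ids` comes from the traceback; the vehicle IDs are made up for illustration.

```python
# all_ids is a set in the script (per the traceback), so use add(), not append().
all_ids = set()

# Hypothetical vehicle IDs, standing in for the values read from the label file.
for vid in ["0001", "0002", "0001"]:
    all_ids.add(vid)  # adding an ID that is already present has no effect

sorted_ids = sorted(all_ids)
```

If the order of insertion or repeated IDs matter, making `all_ids` a list and keeping `append` would be the alternative; which of the two the script intended is not clear from the traceback alone.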

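Running the PLATO model: the process is Killed during training

Hello,
I get the following error when running the PLATO model with PaddlePaddle 1.6.0 on a Linux machine with 8 cores and 8 GB of memory. Could you please take a look? Thanks!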

$ bash scripts/DailyDialog/train.sh

SAVE_DIR=outputs/DailyDialog
VOCAB_PATH=model/Bert/vocab.txt
DATA_DIR=data/DailyDialog
INIT_CHECKPOINT=model/PLATO
DATA_TYPE=multi
USE_VISUALDL=false
export CUDA_VISIBLE_DEVICES=
CUDA_VISIBLE_DEVICES=
export FLAGS_fraction_of_gpu_memory_to_use=0.1
FLAGS_fraction_of_gpu_memory_to_use=0.1
export FLAGS_eager_delete_scope=True
FLAGS_eager_delete_scope=True
export FLAGS_eager_delete_tensor_gb=0.0
FLAGS_eager_delete_tensor_gb=0.0
python -u ./preprocess.py --vocab_path model/Bert/vocab.txt --data_dir data/DailyDialog --data_type multi
[[ false = true ]]
python -u ./run.py --do_train true --vocab_path model/Bert/vocab.txt --data_dir data/DailyDialog --data_type multi --batch_size 6 --valid_steps 2000 --num_type_embeddings 2 --use_discriminator true --num_epoch 20 --lr 1e-5 --save_checkpoint false --save_summary false --init_checkpoint model/PLATO --save_dir outputs/DailyDialog
{
"do_train": true,
"do_test": false,
"do_infer": false,
"num_infer_batches": null,
"hparams_file": null,
"BPETextField": {
"vocab_path": "model/Bert/vocab.txt",
"filtered": false,
"max_len": 256,
"min_utt_len": 1,
"max_utt_len": 50,
"min_ctx_turn": 1,
"max_ctx_turn": 16,
"max_knowledge_num": 16,
"max_knowledge_len": 16,
"tokenizer_type": "Bert"
},
"Dataset": {
"data_dir": "data/DailyDialog",
"data_type": "multi"
},
"Trainer": {
"use_data_distributed": false,
"valid_metric_name": "-loss",
"num_epochs": 20,
"save_dir": "outputs/DailyDialog",
"batch_size": 6,
"log_steps": 100,
"valid_steps": 2000,
"save_checkpoint": false,
"save_summary": false,
"shuffle": true,
"sort_pool_size": 0
},
"Model": {
"init_checkpoint": "model/PLATO",
"model": "UnifiedTransformer",
"num_token_embeddings": -1,
"num_pos_embeddings": 512,
"num_type_embeddings": 2,
"num_turn_embeddings": 16,
"num_latent": 20,
"tau": 0.67,
"with_bow": true,
"hidden_dim": 768,
"num_heads": 12,
"num_layers": 12,
"padding_idx": 0,
"dropout": 0.1,
"embed_dropout": 0.0,
"attn_dropout": 0.1,
"ff_dropout": 0.1,
"use_discriminator": true,
"dis_ratio": 1.0,
"weight_sharing": true,
"pos_trainable": true,
"two_layer_predictor": false,
"bidirectional_context": true,
"label_smooth": 0.0,
"initializer_range": 0.02,
"lr": 1e-05,
"weight_decay": 0.0,
"max_grad_norm": null
},
"Generator": {
"generator": "BeamSearch",
"min_gen_len": 1,
"max_gen_len": 30,
"beam_size": 5,
"length_average": false,
"length_penalty": -1.0,
"ignore_unk": true
}
}
Loading parameters from model/PLATO
Loaded parameters from model/PLATO
scripts/DailyDialog/train.sh: line 45: 2041 Killed python -u ./run.py --do_train true --vocab_path $VOCAB_PATH --data_dir $DATA_DIR --data_type $DATA_TYPE --batch_size 6 --valid_steps 2000 --num_type_embeddings 2 --use_discriminator true --num_epoch 20 --lr 1e-5 --save_checkpoint false --save_summary $USE_VISUALDL --init_checkpoint $INIT_CHECKPOINT --save_dir $SAVE_DIR
[[ false = true ]]
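
A bare `Killed` message with no Python traceback commonly means that the Linux kernel's out-of-memory killer stopped the process, although the log above does not confirm this. The sketch below checks how much physical memory the machine has before training is launched. It assumes a Linux system where `os.sysconf` exposes `SC_PHYS_PAGES` and `SC_PAGE_SIZE`, and the helper name `total_memory_gib` is hypothetical.

```python
import os


def total_memory_gib():
    """Total physical memory in GiB, read via sysconf (Linux/Unix only)."""
    pages = os.sysconf("SC_PHYS_PAGES")
    page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size / 1024 ** 3


mem_gib = total_memory_gib()
print(f"Total physical memory: {mem_gib:.1f} GiB")
```

If memory does turn out to be the limit, the kernel log (for example via `dmesg`) normally records the out-of-memory kill, and reducing `--batch_size` is one setting worth trying; this is a suggestion, not a confirmed fix.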

About ACL2019 ARNOR

Hello, regarding the ACL 2019 paper ARNOR: are the F1 results reported on GitHub for data version 2 computed as macro F1 or micro F1, and does the evaluation metric exclude the None label?
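
For reference, the two averaging schemes differ as follows: micro F1 pools true positives, false positives, and false negatives over all relation types before computing a single score, while macro F1 averages the per-relation F1 scores. The sketch below computes both while ignoring a `None` label. It is not the ARNOR evaluation script; the label names, the sample data, and the decision to skip `None` are assumptions made for the example.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(gold, pred, ignore="None"):
    """Return (micro_f1, macro_f1) over all labels except `ignore`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != ignore:
                tp[g] += 1
        else:
            if p != ignore:
                fp[p] += 1  # predicted a relation that is not the gold one
            if g != ignore:
                fn[g] += 1  # missed the gold relation
    labels = sorted(set(tp) | set(fp) | set(fn))
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    per_label = [f1(tp[label], fp[label], fn[label]) for label in labels]
    macro = sum(per_label) / len(per_label) if per_label else 0.0
    return micro, macro


# Hypothetical gold and predicted labels, for illustration only.
gold = ["A", "A", "A", "A", "B", "None"]
pred = ["A", "A", "A", "B", "B", "A"]
micro, macro = micro_macro_f1(gold, pred)
```

On this toy input the two scores differ (micro is 8/11, macro is 17/24), which is why it matters which one a reported number refers to.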

add webvision2018

see title, add webvision2018 code

Question about the availability of the DuConv dataset

Why can this dataset only be found on the English website?

Why the "Generated Summaries" of testset on Multi-News only have 5590 lines?

Hi,

Thanks for releasing the result of test set.
I'm wondering why the test result has only 5590 lines. The original Multi-News dataset contains 5622 document-pairs for testing. Did you exclude the outliers? Did you do the same during training?

Hope to get your reply soon.

Joyce

ACL2019-ARNOR dataset clarification

I have the following questions on data version 2.0.0.

Is the dev.json file used for validation, or is it just another test set? Did you use a part of train.json for validation?
Did you include the instances in dev.json and test.json that are marked as is_noise=true in the F1 score calculation?
How many relations are used for the experiments? Can you please provide a list of them?

Thanks !!!!

NLP/ACL2018_DuReader ModuleNotFoundError: No module named 'bidaf_model'

I tried to run PaddlePaddle/Research/NLP/ACL2018_DuReader. At the "evaluation" stage, an import in run.py fails:

Traceback (most recent call last):
  File "run.py", line 41, in <module>
    import bidaf_model as rc_model
ModuleNotFoundError: No module named 'bidaf_model'

After checking, I could not find any bidaf_model code in the project.
I hope you can help resolve this. Thanks.

dusql-baseline

When running the baseline, no further log output appears after the log line "2020-06-01 15:04:33,010-INFO: value feature is being used". GPU utilization is 11m, and the LSTM encoder is being used.

About ACL2020-GraphSum

Hello! I would like to test the model on my own data. How are data files such as WIKI.test.0.json generated? Is there a corresponding script?

TypeError: _set_attr(): incompatible function arguments. The following argument types are supported

Is there any solution for this, please?

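A question about migrating GraphSum to other data

@ZeyuChen @kahitomi
Hello, and thank you for such excellent open-source work.
I have a question: if this code is migrated to a Chinese dataset, do we need to retrain the sentencepiece vocab file spm9998_3.model ourselves? Or can we use the one you released directly?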

How to parse sql_query to sql (NLP DuSQL-Baseline task)

Hi @AoZhang, I downloaded the original data from this link: https://dataset-bj.cdn.bcebos.com/dusql/DuSQL.tar. But I can't find any code showing how to parse sql_query into sql in the train/dev/test data.

A question about ACL2020 GraphSum paper

Hi,

Thanks for releasing the code. I read the GraphSum paper and have a question.
In Section 3.3 you say, "However, thanks to the graph modeling, our model can process much longer inputs." So how do you process longer inputs? I am very interested in this.

Thank you!

DuIE_Baseline compilation error

It fails when executed. Could you take a look at what the problem is?
Python Call Stacks (More useful to users):

File "/home/yong-group/.local/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2610, in append_op
attrs=kwargs.get("attrs", None))
File "/home/yong-group/.local/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/yong-group/.local/lib/python3.7/site-packages/paddle/fluid/layers/sequence_lod.py", line 1057, in sequence_unpad
outputs={'Out': out})
File "/home/yong-group/XYN/PAP挑战赛/DuIE_Baseline/ernie/finetune/relation_extraction_multi_cls.py", line 84, in create_model
lod_labels = fluid.layers.sequence_unpad(labels, seq_lens)
File "/home/yong-group/XYN/PAP挑战赛/DuIE_Baseline/ernie/run_duie.py", line 161, in main
ernie_config=ernie_config)
File "/home/yong-group/XYN/PAP挑战赛/DuIE_Baseline/ernie/run_duie.py", line 411, in <module>
main(args)
InvalidArgumentError: The shape of Input(Length) should be [batch_size]. But received (2)
[Hint: Expected len_dims.size() == 1, but received len_dims.size():2 != 1:1.] at (/paddle/paddle/fluid/operators/sequence_ops/sequence_unpad_op.cc:52)
[operator < sequence_unpad > error]

Is the pretraining data available?

That is, the data mentioned in the paper: Large-scale conversation datasets – Twitter (Cho et al., 2014) and Reddit (Zhou et al., 2018; Galley et al., 2019) are employed for pretraining, which results in 8.3 million training samples in total.

Might be a bug

Research/NLP/Dialogue-PLATO/plato/metrics/metrics.py

Line 32 in b466c22

bigrams = Counter(zip(seq, seq[1:]))

zip(seq, seq[1:])

is supposed to be

zip(seq[:-1], seq[1:])
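
For what it is worth, Python's `zip` stops at the shortest input, so the two expressions yield the same bigrams; the second form only makes the truncation explicit. A quick check with a made-up token sequence:

```python
from collections import Counter

seq = ["i", "like", "tea", "i", "like", "milk"]  # hypothetical token sequence

bigrams_a = Counter(zip(seq, seq[1:]))       # form used in metrics.py
bigrams_b = Counter(zip(seq[:-1], seq[1:]))  # form proposed in the issue

same = bigrams_a == bigrams_b
```

The same holds for empty and single-token sequences, where both forms produce no bigrams at all.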

errors running DuReader-Robust-BASELINE

W0414 16:45:31.025034 10900 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.0, Runtime API Version: 9.0
W0414 16:45:31.028314 10900 device_context.cc:245] device: 0, cuDNN Version: 7.6.

I0414 16:45:33.431712 10900 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 2 cards are used, so 2 programs are executed in parallel.
W0414 16:45:36.668439 10900 init.cc:209] Warning: PaddlePaddle catches a failure signal, it may not work properly
W0414 16:45:36.668462 10900 init.cc:211] You could check whether you killed PaddlePaddle thread/process accidentally or report the case to PaddlePaddle
W0414 16:45:36.668467 10900 init.cc:214] The detail failure signal is:

W0414 16:45:36.668470 10900 init.cc:217] *** Aborted at 1586853936 (unix time) try "date -d @1586853936" if you are using GNU date ***
W0414 16:45:36.670137 10900 init.cc:217] PC: @ 0x0 (unknown)
W0414 16:45:36.670218 10900 init.cc:217] *** SIGSEGV (@0x0) received by PID 10900 (TID 0x7f7fa99b1700) from PID 0; stack trace: ***
W0414 16:45:36.671615 10900 init.cc:217] @ 0x7f7fa959d390 (unknown)
W0414 16:45:36.673030 10900 init.cc:217] @ 0x0 (unknown)

I think parallel.py in PLATO implementation raises attribute error

Hello,
It might be a silly question because it's my first time using Paddle-based code. Thanks for your understanding!

I'm trying to run the PLATO fine-tuning code, and I hit the error 'Parameter' object has no attribute '_grad_ivar' in the apply_collective_grads function in parallel.py in the PLATO implementation.

I also noticed that this function is implemented in paddle/fluid/dygraph/parallel.py as well, and that it is slightly different from the implementation in PLATO.

Therefore, I changed the function in PLATO to match the Paddle implementation.
As a result, I can run this code, but I still don't know whether this change makes sense. I need your help!

Thank you.


        for param in self._layers.parameters():
            # NOTE(zcd): The grad_ivar maybe no generated.
            #if param.trainable and param._grad_ivar():
            if param.trainable and param._ivar._grad_ivar():
                g_var = param._grad_ivar()
                grad_vars.append(g_var)
                assert g_var not in grad_var_set
                grad_var_set.add(g_var)

in the apply_collective_grads function in Research/NLP/Dialogue-PLATO/plato/modules/parallel.py


        for param in self._layers.parameters():
            # NOTE(zcd): The grad_ivar maybe no generated.
            if param.trainable and param._ivar._grad_ivar():
                g_var = framework.Variable(
                    block=self._helper.main_program.current_block(),
                    name=param._ivar._grad_name(),
                    stop_gradient=True,
                    ivar=param._ivar._grad_ivar())
                grad_vars.append(g_var)
                assert g_var not in grad_var_set
                grad_var_set.add(g_var)

in the apply_collective_grads function in paddle/fluid/dygraph/parallel.py

Can the dataset processed in MMPMS be shared?

MMPMS ("Generating Multiple Diverse Responses with Multi-Mapping and Posterior Mapping Selection") evaluates the proposed model on two public conversation datasets, Weibo [Shang et al., 2015] and Reddit [Zhou et al., 2018], which maintain a large repository of post-response pairs from popular social websites.
The paper mentioned that "After basic data cleaning, we have above 2 million pairs in both datasets."
I would like to train the MMPMS from scratch. Could you please share the cleaned data?

Where is ACL20-DuRecDial?

How should the test1 file provided on the official website be processed?

2_prepare_syn_trainlist

Traceback (most recent call last):
File "2_prepare_syn_trainlist.py", line 29, in <module>
color = s.attributes['colorID'].value
File "/home/eini/anaconda3/lib/python3.7/xml/dom/minidom.py", line 552, in __getitem__
return self._attrs[attname_or_tuple]
KeyError: 'colorID'
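
The `KeyError` comes from indexing `s.attributes` with a name that the element does not carry. One possible guard uses `hasAttribute` and `getAttribute` from `xml.dom.minidom`, as sketched below. The XML snippet, the tag name `Item`, the `imageName` attribute, and the decision to skip incomplete entries are all assumptions for the example, not the dataset's documented layout.

```python
from xml.dom import minidom

# Toy XML: the second item lacks colorID, mimicking the failing case.
xml_text = """<Items>
  <Item imageName="000001.jpg" colorID="3" typeID="1"/>
  <Item imageName="000002.jpg" typeID="2"/>
</Items>"""

doc = minidom.parseString(xml_text)
records = []
for s in doc.getElementsByTagName("Item"):
    # Skip entries that lack either attribute instead of raising KeyError.
    if not (s.hasAttribute("colorID") and s.hasAttribute("typeID")):
        continue
    color = s.getAttribute("colorID")
    cartype = s.getAttribute("typeID")
    records.append((s.getAttribute("imageName"), color, cartype))
```

Skipping is only one policy. Substituting a default value for a missing `colorID` would keep the sample in the training list, and which behavior suits the baseline is for the authors to say.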

RuntimeError: parallel_for failed: no kernel image is available for execution on the device

GraphSum is very intriguing, but I'm unable to test with ./scripts/predict_graphsum_local_multinews.sh.

The log/lanch.log file shows this error: RuntimeError: parallel_for failed: no kernel image is available for execution on the device

+ source ./env_local/env_local.sh
++ set -xe
+++ hostname -i
++ export iplist=127.0.1.1
++ iplist=127.0.1.1
++ unset http_proxy
++ unset https_proxy
+ source ./env_local/utils.sh
++ set -u
+ source ./model_config/graphsum_model_conf_local_multinews
++ task=GraphSum_MDS
++ VOCAB_PATH=./vocab/spm9998_3.model
++ CONFIG_PATH=./model_config/graphsum_config.json
++ TASK_DATA_PATH=/home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle
++ lr_scheduler=noam_decay
++ use_fp16=False
++ use_fuse=True
++ use_hierarchical_allreduce=True
++ nccl_comm_num=3
++ loss_scaling=12800
++ WARMUP_PROP=0.01
++ WARMUP_STEPS=8000
++ beta1=0.9
++ beta2=0.998
++ eps=1e-9
++ LR_RATE=2.0
++ WEIGHT_DECAY=0.01
+ export FLAGS_eager_delete_tensor_gb=1.0
+ FLAGS_eager_delete_tensor_gb=1.0
+ export FLAGS_sync_nccl_allreduce=1
+ FLAGS_sync_nccl_allreduce=1
+ export FLAGS_fraction_of_gpu_memory_to_use=0.98
+ FLAGS_fraction_of_gpu_memory_to_use=0.98
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python -u ./src/run.py --model_name graphsum --use_cuda true --is_distributed false --use_multi_gpu_test False --use_fast_executor true --use_fp16 False --use_dynamic_loss_scaling False --init_loss_scaling 12800 --weight_sharing true --do_train false --do_val false --do_test true --do_dec true --verbose true --batch_size 30000 --in_tokens true --stream_job '' --init_pretraining_params '' --train_set /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/train --dev_set /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/valid --test_set /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/test --vocab_path ./vocab/spm9998_3.model --config_path model_config/graphsum_config.json --checkpoints ./models/graphsum_multinews --init_checkpoint ./models/graphsum_multinews/step_42976 --decode_path ./results/graphsum_multinews --lr_scheduler noam_decay --save_steps 10000 --weight_decay 0.01 --warmup_steps 8000 --validation_steps 20000 --epoch 100 --max_para_num 30 --max_para_len 60 --max_tgt_len 300 --max_out_len 300 --min_out_len 200 --beam_size 5 --graph_type similarity --len_penalty 0.6 --block_trigram True --report_rouge True --learning_rate 2.0 --skip_steps 100 --grad_norm 2.0 --pos_win 2.0 --label_smooth_eps 0.1 --num_iteration_per_drop_scope 10 --log_file log/graphsum_multinews_test.log --random_seed 1

(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ cat log/lanch.log
-----------  Configuration Arguments -----------
batch_size: 30000
beam_size: 5
beta1: 0.9
beta2: 0.998
block_trigram: True
checkpoints: ./models/graphsum_multinews
config_path: model_config/graphsum_config.json
decode_path: ./results/graphsum_multinews
decr_every_n_nan_or_inf: 2
decr_ratio: 0.8
dev_set: /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/valid
do_dec: True
do_lower_case: True
do_test: True
do_train: False
do_val: False
encoder_json_file: roberta_config/encoder.json
epoch: 100
eps: 1e-09
ernie_config_path: ernie_config/ernie_config.json
ernie_vocab_file: ernie_config/vocab.txt
evaluate_blue: False
grad_norm: 2.0
graph_type: similarity
in_tokens: True
incr_every_n_steps: 100
incr_ratio: 2.0
init_checkpoint: ./models/graphsum_multinews/step_42976
init_loss_scaling: 12800.0
init_pretraining_params:
is_distributed: False
label_smooth_eps: 0.1
learning_rate: 2.0
len_penalty: 0.6
log_file: log/graphsum_multinews_test.log
lr_scheduler: noam_decay
max_out_len: 300
max_para_len: 60
max_para_num: 30
max_seq_len: 512
max_tgt_len: 300
metrics: True
min_out_len: 200
model_name: graphsum
num_iteration_per_drop_scope: 10
pos_win: 2.0
random_seed: 1
report_rouge: True
roberta_config_path: roberta_config/roberta_config.json
roberta_vocab_file: roberta_config/vocab.txt
save_steps: 10000
skip_steps: 100
stream_job:
test_set: /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/test
train_set: /home/matt/mr/algos/GraphSum/data/MultiNews_data_tfidf_30_paddle/train
use_cuda: True
use_dynamic_loss_scaling: False
use_fast_executor: True
use_fp16: False
use_interval: False
use_multi_gpu_test: False
validation_steps: 20000
verbose: True
vocab_bpe_file: roberta_config/vocab.bpe
vocab_path: ./vocab/spm9998_3.model
warmup_proportion: 0.1
warmup_steps: 8000
weight_decay: 0.01
weight_sharing: True
------------------------------------------------
attention_probs_dropout_prob: 0.1
dec_graph_layers: 8
dec_word_pos_embedding_name: dec_word_pos_embedding
enc_graph_layers: 2
enc_sen_pos_embedding_name: enc_sen_pos_embedding
enc_word_layers: 6
enc_word_pos_embedding_name: enc_word_pos_embedding
hidden_act: relu
hidden_dropout_prob: 0.1
hidden_size: 256
initializer_range: 0.02
max_position_embeddings: 512
num_attention_heads: 8
postprocess_command: da
preprocess_command: n
word_embedding_name: word_embedding
------------------------------------------------
[2020-10-23 10:20:10,392 INFO] {'BOS': 4, 'EOS': 5, 'PAD': 6, 'EOT': 3, 'EOP': 7, 'EOQ': 8, 'UNK': 0}
[2020-10-23 10:20:10,393 WARNING] paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[2020-10-23 10:20:11,079 INFO] args.is_distributed: False
W1023 10:20:11.511365 3292334 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 86, Driver API Version: 11.1, Runtime API Version: 10.0
W1023 10:20:11.512548 3292334 device_context.cc:244] device: 0, cuDNN Version: 8.0.
W1023 10:20:11.862699 3292334 operator.cc:179] truncated_gaussian_random raises an exception thrust::system::system_error, parallel_for failed: no kernel image is available for execution on the device
/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/paddle/fluid/executor.py:779: UserWarning: The following exception is not an EOF exception.
  "The following exception is not an EOF exception.")
Traceback (most recent call last):
  File "./src/run.py", line 35, in <module>
    run_graphsum(args)
  File "/home/matt/mr/algos/Research/NLP/ACL2020-GraphSum/src/networks/graphsum/run_graphsum.py", line 219, in main
    exe.run(startup_prog)
  File "/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/paddle/fluid/executor.py", line 780, in run
    six.reraise(*sys.exc_info())
  File "/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/paddle/fluid/executor.py", line 775, in run
    use_program_cache=use_program_cache)
  File "/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/paddle/fluid/executor.py", line 822, in _run_impl
    use_program_cache=use_program_cache)
  File "/home/matt/anaconda3/envs/graphsum/lib/python3.6/site-packages/paddle/fluid/executor.py", line 899, in _run_program
    fetch_var_name)
RuntimeError: parallel_for failed: no kernel image is available for execution on the device

(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ nvidia-smi
Fri Oct 23 10:21:38 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:21:00.0 Off |                  N/A |
| 30%   32C    P0    62W / 350W |      0MiB / 24265MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ python -V
Python 3.6.9 :: Anaconda, Inc.
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep nltk
nltk             3.4.5
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep numpy
numpy            1.18.1
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep paddlepaddle
paddlepaddle-gpu 1.6.3.post107
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep pyrouge
pyrouge          0.1.3
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep regex
regex            2020.2.20
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep requests
requests         2.22.0
(graphsum) matt@DeepWhite:~/mr/algos/GraphSum $ pip list | grep sentencepiece
sentencepiece    0.1.85

(graphsum) matt@DeepWhite:~/mr/algos/GraphSum/log $ cat graphsum_multinews_test.log
[2020-10-23 10:16:45,647 INFO] {'BOS': 4, 'EOS': 5, 'PAD': 6, 'EOT': 3, 'EOP': 7, 'EOQ': 8, 'UNK': 0}
[2020-10-23 10:16:45,647 WARNING] paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[2020-10-23 10:16:46,320 INFO] args.is_distributed: False
[2020-10-23 10:19:38,037 INFO] {'BOS': 4, 'EOS': 5, 'PAD': 6, 'EOT': 3, 'EOP': 7, 'EOQ': 8, 'UNK': 0}
[2020-10-23 10:19:38,037 WARNING] paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[2020-10-23 10:19:38,740 INFO] args.is_distributed: False
[2020-10-23 10:20:10,392 INFO] {'BOS': 4, 'EOS': 5, 'PAD': 6, 'EOT': 3, 'EOP': 7, 'EOQ': 8, 'UNK': 0}
[2020-10-23 10:20:10,393 WARNING] paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[2020-10-23 10:20:11,079 INFO] args.is_distributed: False
[2020-10-23 10:43:43,574 INFO] {'BOS': 4, 'EOS': 5, 'PAD': 6, 'EOT': 3, 'EOP': 7, 'EOQ': 8, 'UNK': 0}
[2020-10-23 10:43:43,574 WARNING] paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[2020-10-23 10:43:44,258 INFO] args.is_distributed: False
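
The `nvidia-smi` output above shows a GeForce RTX 3090, while `pip list` shows `paddlepaddle-gpu 1.6.3.post107`. One possible reading of "no kernel image is available for execution on the device" is that the installed wheel was not compiled for this GPU's architecture. Below is a rough sketch for checking that. It assumes the `postXYZ` suffix encodes the CUDA major version followed by a one-digit cuDNN major version (so `post107` would mean CUDA 10, cuDNN 7), and that Ampere cards such as the RTX 3090 need a build made with CUDA 11 or newer. The function names are hypothetical and are not part of PaddlePaddle.

```python
import re

# Assumption: Ampere GPUs (e.g. RTX 3090) need binaries built with CUDA >= 11.
MIN_CUDA_MAJOR_FOR_AMPERE = 11


def parse_post_suffix(version):
    """Decode a suffix such as 'post107' into (cuda_major, cudnn_major).

    Assumes the last digit is the cuDNN major version and the digits before
    it are the CUDA major version. Returns None if there is no such suffix.
    """
    match = re.search(r"\.post(\d+)(\d)$", version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def build_may_lack_ampere_kernels(version):
    """Return True when the wheel appears to be built with CUDA older than 11."""
    parsed = parse_post_suffix(version)
    if parsed is None:
        return False  # unknown build; this sketch cannot tell
    cuda_major, _ = parsed
    return cuda_major < MIN_CUDA_MAJOR_FOR_AMPERE


if __name__ == "__main__":
    # Version string taken from the pip list output in the issue.
    print(build_may_lack_ampere_kernels("1.6.3.post107"))
```

If the check flags the build, a wheel built against a newer CUDA version would be the thing to try, if one is available for the PaddlePaddle release the project needs.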

No such file or directory: 'log/graphsum.log'

When I run 'predict_graphsum_local_multinews.sh' on Google Colab, I get the following error:
Traceback (most recent call last):
  File "/content/drive/MyDrive/DATN/GraphSum/src/run.py", line 31, in <module>
    init_logger(args.log_file)
  File "/content/drive/MyDrive/DATN/GraphSum/src/utils/logging.py", line 33, in init_logger
    file_handler = logging.FileHandler(log_file)
  File "/usr/lib/python3.6/logging/__init__.py", line 1032, in __init__
    StreamHandler.__init__(self, self._open())
  File "/usr/lib/python3.6/logging/__init__.py", line 1061, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
FileNotFoundError: [Errno 2] No such file or directory: '/content/log/graphsum.log'
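
The traceback above ends with `logging.FileHandler` failing to open `/content/log/graphsum.log`, which suggests that the `log` directory does not exist under the working directory on Colab. Here is a minimal sketch of one possible workaround: create the parent directory before attaching the file handler. The function body is a hypothetical stand-in for `init_logger` in `src/utils/logging.py`, not the repository's actual code, and the logger name is made up.

```python
import logging
import os


def init_logger(log_file=None):
    """Hypothetical stand-in for the repo's init_logger; not the original code."""
    logger = logging.getLogger("graphsum_sketch")
    logger.setLevel(logging.INFO)
    if log_file:
        # Create the parent directory (e.g. "log/") if it is missing, so that
        # logging.FileHandler can open the file instead of raising
        # FileNotFoundError.
        log_dir = os.path.dirname(os.path.abspath(log_file))
        os.makedirs(log_dir, exist_ok=True)
        file_handler = logging.FileHandler(log_file)
        logger.addHandler(file_handler)
    return logger
```

Running `mkdir -p log` in the directory the script is launched from should have the same effect without touching the code.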

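Question about the sequence labeling strategy

Why is the label for the character “王” assigned to two classes, and what are these two classes? I would appreciate an explanation, thanks.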

Will pre-trained Chinese PLATO models be released?

How to train the model with 2 GPUs

I have only two GPUs. How do I modify the code?
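
The issue does not say which project is meant, so the following is only a generic sketch and not the repository's method. It relies on the standard CUDA environment variable `CUDA_VISIBLE_DEVICES` to expose two GPUs to a training process. The training scripts may also need other changes (for example batch size or a GPU-count option); the issue does not say.

```python
import os

# Expose only the first two GPUs to this process. This must be set before the
# deep learning framework initializes CUDA (in practice, before importing it).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Derive the number of visible devices from the variable itself.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
num_gpus = len(visible)
print("training would see %d GPU(s): %s" % (num_gpus, visible))
```

The same variable can be set in a shell script with `export CUDA_VISIBLE_DEVICES=0,1` before the training command.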

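Error when running a newly created project on AI Studio

Hello,
After creating a new project on AI Studio, I get an error at runtime:

python=3.7
PaddlePaddle=1.6.0
How can I solve this? Thanks!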

DuEL_Baseline: running predict.sh raises the error "var read_file_0.tmp_3 not in this block"

The detailed log is below.
Is something wrong here? Training works without problems.

Environment: CUDA 10 + cuDNN 7.4 + Paddle 1.7.1

Traceback (most recent call last):
  File "./ernie/infer_type_ranker.py", line 358, in <module>
    main(args)
  File "./ernie/infer_type_ranker.py", line 163, in main
    main_program=predict_prog,
  File "/home/hadoop-aipnlp/cephfs/data/gaojianwei/research/ccks2020/DuEL_Baseline/env2/lib/python2.7/site-packages/paddle/fluid/io.py", line 1221, in save_inference_model
    prepend_feed_ops(main_program, feeded_var_names)
  File "/home/hadoop-aipnlp/cephfs/data/gaojianwei/research/ccks2020/DuEL_Baseline/env2/lib/python2.7/site-packages/paddle/fluid/io.py", line 1031, in prepend_feed_ops
    out = global_block.var(name)
  File "/home/hadoop-aipnlp/cephfs/data/gaojianwei/research/ccks2020/DuEL_Baseline/env2/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2280, in var
    raise ValueError("var %s not in this block" % name)
ValueError: var read_file_0.tmp_3 not in this block

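About the query when Minerva is applied to dialogue in EMNLP2019-AKGCM

Hello,

The Minerva paper addresses QA over triples, so its queries come from the relation.
For dialogue, is the query the first speaker's utterance?
And is the training set built by replacing every relation in the triples with a sentence?

Thanks!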

Why does the test result of Multi-News have only 5590 lines?

Hi,

Joyce

How to provide custom input to ACL2020-GraphSum model

Your work is amazing. Could you point me to a way to give our own paragraphs as input to the GraphSum model?

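On AI Studio, predicting with predict.sh raises ValueError: var read_file_0.tmp_3 not in this block

Environment: AI Studio, CUDA 9.2 + cuDNN 7.6 + Paddle 1.8.4
It works with Paddle 1.5, but not with this version. How can I fix this?

The error log is as follows: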

2020-11-23 14:35:49,888-WARNING: paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
[WARNING] 2020-11-23 14:35:49,888 [       io.py:  712]:	paddle.fluid.layers.py_reader() may be deprecated in the near future. Please use paddle.fluid.io.DataLoader.from_generator() instead.
----------place-----------
CUDAPlace(0)
W1123 14:35:50.949314   187 device_context.cc:252] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 10.1, Runtime API Version: 9.0
W1123 14:35:51.145216   187 device_context.cc:260] device: 0, cuDNN Version: 7.6.
2020-11-23 14:35:57,108-INFO: Load pretraining parameters from ./checkpoints/step_20000.
[INFO] 2020-11-23 14:35:57,108 [     init.py:  101]:	Load pretraining parameters from ./checkpoints/step_20000.
2020-11-23 14:35:57,108-INFO: save inference model to ./checkpoints/inference_model/step_20000_inference_model
[INFO] 2020-11-23 14:35:57,108 [infer_type_ranker.py:  158]:	save inference model to ./checkpoints/inference_model/step_20000_inference_model
Traceback (most recent call last):
  File "./ernie/infer_type_ranker.py", line 358, in <module>
    main(args)
  File "./ernie/infer_type_ranker.py", line 163, in main
    main_program=predict_prog
  File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle/fluid/io.py", line 1247, in save_inference_model
    prepend_feed_ops(main_program, feeded_var_names)
  File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle/fluid/io.py", line 1043, in prepend_feed_ops
    out = global_block.var(name)
  File "/opt/conda/envs/python27-paddle120-env/lib/python2.7/site-packages/paddle/fluid/framework.py", line 2377, in var
    raise ValueError("var %s not in this block" % name)
ValueError: var read_file_0.tmp_3 not in this block

InvalidArgumentError

When running on AI Studio, I get the error InvalidArgumentError: Python object is not type of St10shared_ptrIN6paddle10imperative7VarBaseEE (at /paddle/paddle/fluid/pybind/imperative.cc:216). How can I solve this?

ACL2019_DuConv generative_paddle loss error

Hi,
I want to run the model in generative_paddle, but there is an error when I run run_train.sh:
AssertionError: The loss.shape should be (1L,), but the current loss.shape is (-1L,). Maybe that you should call fluid.layers.mean to process the current loss.
How can I solve it? THANKS!

DatasetA label for fine-tuning

Can you provide the labels for dataset A on another hosting service? I can't download them from Baidu. Thanks so much.

'Parameter' object has no attribute '_grad_ivar'

I encountered this error when running sh scripts/DailyDialog/multi_gpu_train.sh

my environment:
paddlepaddle==1.6.0 via pip install paddlepaddle-gpu==1.6.0
cuda 10
cudnn 7.6

thanks!

PLATO fine-tuning raises error.. Please help!

Hello,
I installed all requirements using pip install -r requirement.txt.
My CUDA and cuDNN versions match paddlepaddle-gpu == 1.6.1.post107 (which is in your requirements.txt).
However, I hit this error when trying to fine-tune the pre-trained model.

Traceback (most recent call last):
  File "./run.py", line 23, in <module>
    import paddle.fluid as fluid
  File "/home/joy/Research/NLP/Dialogue-PLATO/vplato/lib/python3.6/site-packages/paddle/fluid/__init__.py", line 35, in <module>
    from . import framework
  File "/home/joy/Research/NLP/Dialogue-PLATO/vplato/lib/python3.6/site-packages/paddle/fluid/framework.py", line 35, in <module>
    from . import core
  File "/home/joy/Research/NLP/Dialogue-PLATO/vplato/lib/python3.6/site-packages/paddle/fluid/core.py", line 187, in <module>
    raise e
  File "/home/joy/Research/NLP/Dialogue-PLATO/vplato/lib/python3.6/site-packages/paddle/fluid/core.py", line 167, in <module>
    from .core_avx import *
ImportError: /home/joy/Research/NLP/Dialogue-PLATO/vplato/lib/python3.6/site-packages/paddle/fluid/../libs/libmklml_intel.so: symbol __kmpc_omp_task_with_deps, version VERSION not defined in file libiomp5.so with link time reference

[[ false = true ]]

I tried my best to fix it, but it didn't work. What is the solution for this problem?
