When I run sh scripts/train_difficulty1.sh ./low_speed_model
in /PARL/examples/NeurIPS2019-Learn-to-Move-Challenge, training aborts while restoring the pretrained model with "tensor version 3393762800 is not supported" (full log below). Can anyone help? Thanks in advance!
(opensim-rl) luo@idserver:~/PARL/examples/NeurIPS2019-Learn-to-Move-Challenge$ sh scripts/train_difficulty1.sh ./low_speed_model
/home/luo/anaconda3/envs/opensim-rl/bin/python
[12-16 23:08:12 MainThread @logger.py:224] Argv: train.py --actor_num 300 --difficulty 1 --penalty_coeff 3.0 --logdir ./output/difficulty1 --restore_model_path ./low_speed_model
/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/opensim/simbody.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
[12-16 23:08:12 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
[12-16 23:08:13 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
W1216 23:08:14.078102 6084 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.0, Runtime API Version: 8.0
W1216 23:08:14.081565 6084 device_context.cc:267] device: 0, cuDNN Version: 7.5.
[12-16 23:08:16 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/compiler.py:239: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
""")
WARNING:root:
You can try our memory optimize feature to save your memory usage:
# create a build_strategy variable to set memory optimize option
build_strategy = compiler.BuildStrategy()
build_strategy.enable_inplace = True
build_strategy.memory_optimize = True
# pass the build_strategy to with_data_parallel API
compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
loss_name=loss.name, build_strategy=build_strategy)
!!! Memory optimize is our experimental feature !!!
some variables may be removed/reused internal to save memory usage,
in order to fetch the right value of the fetch_list, please set the
persistable property to true for each variable in fetch_list
# Sample
conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None)
# if you need to fetch conv1, then:
conv1.persistable = True
I1216 23:08:16.079864 6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.081748 6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
I1216 23:08:17.209542 6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.324332 6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @machine_info.py:86] nvidia-smi -L found gpu count: 4
share_vars_from is set, scope is ignored.
I1216 23:08:17.525264 6084 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 4. And the Program will be copied 4 copies
I1216 23:08:17.640771 6084 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
[12-16 23:08:17 MainThread @train.py:303] restore model from ./low_speed_model
Traceback (most recent call last):
  File "train.py", line 327, in <module>
    learner = Learner(args)
  File "train.py", line 85, in __init__
    self.restore(args.restore_model_path)
  File "train.py", line 304, in restore
    self.agent.restore(model_path)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/parl/core/fluid/agent.py", line 221, in restore
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 699, in load_params
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 611, in load_vars
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 648, in load_vars
    executor.run(load_prog)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/executor.py", line 651, in run
    use_program_cache=use_program_cache)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/executor.py", line 749, in run
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator load_combine error.
Python Callstacks:
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/framework.py", line 1771, in append_op
    attrs=kwargs.get("attrs", None))
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 647, in load_vars
    attrs={'file_path': os.path.join(load_dirname, filename)})
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 611, in load_vars
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/paddle/fluid/io.py", line 699, in load_params
    filename=filename)
  File "/home/luo/anaconda3/envs/opensim-rl/lib/python3.6/site-packages/parl/core/fluid/agent.py", line 221, in restore
    filename=filename)
  File "train.py", line 304, in restore
    self.agent.restore(model_path)
  File "train.py", line 85, in __init__
    self.restore(args.restore_model_path)
  File "train.py", line 327, in <module>
    learner = Learner(args)
C++ Callstacks:
tensor version 3393762800 is not supported. at [/paddle/paddle/fluid/framework/lod_tensor.cc:256]
PaddlePaddle Call Stacks:
0 0x7efdba6c1f10p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1 0x7efdba6c2289p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7efdbc38c7d4p paddle::framework::DeserializeFromStream(std::istream&, paddle::framework::LoDTensor*, paddle::platform::DeviceContext const&) + 724
3 0x7efdbb35e480p paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>::LoadParamsFromBuffer(paddle::framework::ExecutionContext const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, std::istream*, bool, std::vector<std::string, std::allocator<std::string> > const&) const + 352
4 0x7efdbb35edfep paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 798
5 0x7efdbb35f273p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, float>, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, double>, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, int>, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, signed char>, paddle::operators::LoadCombineOpKernel<paddle::platform::CUDADeviceContext, long> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
6 0x7efdbc7411e7p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 375
7 0x7efdbc7415c1p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 529
8 0x7efdbc73ebbcp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
9 0x7efdba84cd0ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
10 0x7efdba84fdafp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
11 0x7efdba6b359dp
12 0x7efdba6f4826p
13 0x7efe81ea2df2p _PyCFunction_FastCallDict + 258
14 0x7efe81f282bbp
15 0x7efe81f2b15dp _PyEval_EvalFrameDefault + 9981
16 0x7efe81f26a60p
17 0x7efe81f2848ap
18 0x7efe81f2a8ddp _PyEval_EvalFrameDefault + 7805
19 0x7efe81f26a60p
20 0x7efe81f2848ap
21 0x7efe81f2b15dp _PyEval_EvalFrameDefault + 9981
22 0x7efe81f26a60p
23 0x7efe81f2848ap
24 0x7efe81f2a8ddp _PyEval_EvalFrameDefault + 7805
25 0x7efe81f26a60p
26 0x7efe81f2848ap
27 0x7efe81f2a8ddp _PyEval_EvalFrameDefault + 7805
28 0x7efe81f26a60p
29 0x7efe81f2848ap
30 0x7efe81f2a8ddp _PyEval_EvalFrameDefault + 7805
31 0x7efe81f26a60p
32 0x7efe81f2848ap
33 0x7efe81f2b15dp _PyEval_EvalFrameDefault + 9981
34 0x7efe81f25e74p
35 0x7efe81f285e8p
36 0x7efe81f2b15dp _PyEval_EvalFrameDefault + 9981
37 0x7efe81f25e74p
38 0x7efe81f26e75p _PyFunction_FastCallDict + 645
39 0x7efe81e4bba6p _PyObject_FastCallDict + 358
40 0x7efe81e4bdfcp _PyObject_Call_Prepend + 204
41 0x7efe81e4be96p PyObject_Call + 86
42 0x7efe81ec4233p
43 0x7efe81eb9d4cp
44 0x7efe81e4badep _PyObject_FastCallDict + 158
45 0x7efe81f282bbp
46 0x7efe81f2b15dp _PyEval_EvalFrameDefault + 9981
47 0x7efe81f26a60p
48 0x7efe81f26ee3p PyEval_EvalCodeEx + 99
49 0x7efe81f26f2bp PyEval_EvalCode + 59
50 0x7efe81f596c0p PyRun_FileExFlags + 304
51 0x7efe81f5ac83p PyRun_SimpleFileExFlags + 371
52 0x7efe81f760b5p Py_Main + 3621
53 0x400c1dp main + 365
54 0x7efe80f01830p __libc_start_main + 240
55 0x4009e9p
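For what it's worth, the decisive line in the log is "tensor version 3393762800 is not supported", raised by DeserializeFromStream while load_combine reads the saved parameters. As far as I understand, Paddle 1.x writes a small little-endian uint32 version header (0 in these releases) at the front of each serialized tensor, so a huge value usually means the params file is corrupted or is not a Paddle params file at all (for instance a Git LFS pointer or an HTML error page downloaded in its place). Here is a minimal sketch I used to check the header; the helper name and the exact file to point it at are my own assumptions, not part of the repo:

```python
import struct

def peek_tensor_version(path):
    """Read the little-endian uint32 at the start of a saved params file.

    Assumption: Paddle 1.x's tensor serialization writes version 0 first,
    so anything else (e.g. 3393762800) suggests a corrupt or wrong file.
    """
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4:
        return None  # too short to even hold a version header
    return struct.unpack("<I", head)[0]

# Hypothetical usage against whatever file restore_model_path resolves to:
# print(peek_tensor_version("./low_speed_model/<params file>"))
```

If this prints something other than 0, re-downloading the pretrained model (or checking it with git lfs pull, if it was fetched from a Git LFS repo) would be my first step.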