restore model error (PARL issue, closed)

paddlepaddle commented on September 21, 2024

restore model error

Comments (6)

TomorrowIsAnOtherDay commented on September 21, 2024

It seems that you are using the IMPALA algorithm with the customized agent. Could you paste the code of MPEAgent here?

zienn commented on September 21, 2024

Right, I'm trying to use a multi-agent env.

    def __init__(self, algorithm, obs_shape, act_dim,
                 learn_data_provider=None):
        assert isinstance(obs_shape, (list, tuple))
        self.obs_shape = obs_shape
        self.act_dim = act_dim
        # self.place = fluid.CUDAPlace(
        #     0) if machine_info.is_gpu_available() else fluid.CPUPlace()
        # self.fluid_executor = fluid.Executor(self.place)
        super(MPEAgent, self).__init__(algorithm)  
        if learn_data_provider:
            self.learn_reader.decorate_tensor_provider(learn_data_provider)
            self.learn_reader.start()

    def build_program(self):
        self.sample_program = fluid.Program()
        self.predict_program = fluid.Program()
        self.learn_program = fluid.Program()

        # fluid.layers.data() receives input data, similar to a placeholder
        with fluid.program_guard(self.sample_program):
            obs = layers.data(
                name='obs', shape=self.obs_shape, dtype='float32')
            self.sample_actions, self.behaviour_logits = self.alg.sample(obs)  # sample()

        # predict()
        with fluid.program_guard(self.predict_program):
            obs = layers.data(
                name='obs', shape=self.obs_shape, dtype='float32')
            self.predict_actions = self.alg.predict(obs)

        with fluid.program_guard(self.learn_program):
            obs = layers.data(
                name='obs', shape=self.obs_shape, dtype='float32')
            obs_act = layers.data(
                name='obs_act', shape=(-1, 21), dtype='float32')
            actions = layers.data(
                name='actions', shape=[], dtype='int64')
            behaviour_logits = layers.data(
                name='behaviour_logits', shape=[self.act_dim], dtype='float32')
            rewards = layers.data(
                name='rewards', shape=[], dtype='float32')
            dones = layers.data(
                name='dones', shape=[], dtype='float32')
            lr = layers.data(
                name='lr', shape=[1], dtype='float32', append_batch_size=False)
            entropy_coeff = layers.data(
                name='entropy_coeff', shape=[], dtype='float32')

            self.learn_reader = fluid.layers.create_py_reader_by_data(
                capacity=32,
                feed_list=[
                    obs, obs_act, actions, behaviour_logits, rewards, dones, lr, entropy_coeff
                ])

            obs, obs_act, actions, behaviour_logits, rewards, dones, lr, entropy_coeff = fluid.layers.read_file(
                self.learn_reader)
            vtrace_loss, kl = self.alg.learn(obs, obs_act, actions, behaviour_logits,
                                             rewards, dones, lr, entropy_coeff)
            self.learn_outputs = [
                vtrace_loss.total_loss, vtrace_loss.pi_loss,
                vtrace_loss.vf_loss, vtrace_loss.entropy, kl
            ]
        self.learn_program = parl.compile(self.learn_program,
                                          vtrace_loss.total_loss)

    def sample(self, obs_np):
        obs_np = obs_np.astype('float32')
        self.fluid_executor.run(fluid.default_startup_program())

        # FIXME: error
        sample_actions, behaviour_logits = self.fluid_executor.run(
            self.sample_program,
            feed={'obs': obs_np},
            fetch_list=[self.sample_actions, self.behaviour_logits])
        return sample_actions, behaviour_logits

    def predict(self, obs_np):
        obs_np = obs_np.astype('float32')
        # self.fluid_executor.run(fluid.default_startup_program())
        predict_actions = self.fluid_executor.run(
            self.predict_program,
            feed={'obs': obs_np},
            fetch_list=[self.predict_actions])[0]
        return predict_actions

    def learn(self):
        # self.fluid_executor.run(fluid.default_startup_program())
        total_loss, pi_loss, vf_loss, entropy, kl = self.fluid_executor.run(
            self.learn_program, fetch_list=self.learn_outputs)
        return total_loss, pi_loss, vf_loss, entropy, kl

TomorrowIsAnOtherDay commented on September 21, 2024

Thanks for your quick reply! We will try to reproduce the problem in our environment and then fix it.

zienn commented on September 21, 2024

I tried to restore model.ckpt in the default IMPALA algorithm. The same error happened.

    agent.restore('./model.ckpt')
  File "/home/tianqi/anaconda3/lib/python3.6/site-packages/parl/core/fluid/agent.py", line 221, in restore
    filename=filename)
  File "/home/tianqi/anaconda3/lib/python3.6/site-packages/paddle/fluid/io.py", line 798, in load_params
    filename=filename)
  File "/home/tianqi/anaconda3/lib/python3.6/site-packages/paddle/fluid/io.py", line 675, in load_vars
    raise TypeError("program's type should be Program")
TypeError: program's type should be Program

TomorrowIsAnOtherDay commented on September 21, 2024

We found that this line causes the issue:

PARL/examples/IMPALA/atari_agent.py

Line 79 in c5a8c2b

self.learn_program = parl.compile(self.learn_program,

We will fix this problem next week:)

For now, we suggest removing this line from your code. It converts a vanilla program into a new program that runs in parallel on CPUs, so removing it has little negative effect on performance if you have a GPU.
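
To illustrate why that line can trigger the error, here is a minimal sketch in plain Python. It assumes, based on the traceback, that the compile step returns a wrapper object that is not an instance of Program, and that the loader rejects anything else. The classes and the `load_params` function below are hypothetical stand-ins, not Paddle's or PARL's real implementations.

```python
# Stand-in classes for illustration only; they are NOT Paddle's real classes.
class Program:
    """Plays the role of a plain (uncompiled) program."""


class CompiledProgram:
    """Plays the role of the wrapper object returned by a compile step."""

    def __init__(self, program):
        self.wrapped = program  # hypothetical attribute name


def load_params(program):
    # Mirrors the type check reported in the traceback.
    if not isinstance(program, Program):
        raise TypeError("program's type should be Program")
    return "params loaded"


learn_program = Program()
compiled = CompiledProgram(learn_program)

# Passing the wrapper reproduces the reported error message.
try:
    load_params(compiled)
    outcome = "params loaded"
except TypeError as err:
    outcome = str(err)

# Passing the plain program (i.e. skipping the compile step) succeeds.
workaround = load_params(learn_program)
```

Under these assumptions, skipping the compile step keeps `self.learn_program` a plain program, which is what the restore path expects.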

TomorrowIsAnOtherDay commented on September 21, 2024

The issue has been addressed by #192.
Please update parl with the following command:
pip install --upgrade git+https://github.com/PaddlePaddle/PARL.git
Alternatively, download the repository and install it locally with: cd PARL; pip install .

Thanks for your feedback on PARL. It does make PARL a better framework!
