
Error: /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.cu.h:88 Assertion `y_[id] != 0` failed. InvalidArgumentError: Integer division by zero encountered in divide.Please check.

paddlepaddle commented on June 19, 2024
Error: /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.cu.h:88 Assertion `y_[id] != 0` failed. InvalidArgumentError: Integer division by zero encountered in divide.Please check.

from parl.

Comments (5)

zienn commented on June 19, 2024

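Found the problem: with Normal(means, std), std must not be 0 when log_prob is computed. The neural network output has to be processed before it is used as std.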

def log_prob(self, value):
    # var is zero when self.scale is zero, so the division below fails
    var = self.scale * self.scale
    log_scale = nn.log(self.scale)
    return -1. * ((value - self.loc) * (value - self.loc)) / (
        2. * var) - log_scale - math.log(math.sqrt(2. * math.pi))

This code was copied from SAC, and I don't understand why it runs without error there.
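
The formula in log_prob above can be sketched in plain Python to show where the zero std breaks it and how a lower bound on the scale avoids the division by zero. This is only an illustration: the function name normal_log_prob and the min_scale parameter are assumptions, not part of PARL or Paddle.

```python
import math

def normal_log_prob(value, loc, scale, min_scale=1e-6):
    """Log-density of Normal(loc, scale) at value, with scale floored at min_scale."""
    # min_scale is a hypothetical safeguard; the original method has no such floor.
    safe_scale = max(scale, min_scale)
    var = safe_scale * safe_scale
    return (-((value - loc) ** 2) / (2.0 * var)
            - math.log(safe_scale)
            - math.log(math.sqrt(2.0 * math.pi)))

# With scale == 0 the unguarded formula would divide by zero;
# the floored version returns a large negative but finite number.
print(normal_log_prob(0.5, 0.0, 0.0))
print(normal_log_prob(0.0, 0.0, 1.0))
```

The floor changes the density for very small scales, so the value of min_scale is a trade-off that has to be chosen for the task at hand.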


TomorrowIsAnOtherDay commented on June 19, 2024

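Which parts have you changed so far?
I suggest not changing too much at once. Add your code step by step, and if a particular change causes a problem, post that change here.
That makes it easier for us to locate the problem.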


zienn commented on June 19, 2024

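@TomorrowIsAnOtherDay I haven't changed much so far. I want to use IMPALA in a continuous action space, so I modified the learn() part of the algorithm as shown below. There are no other major changes, so the problem should be here. The actor network fits means and std, following the SAC algorithm. Please help check whether this is reasonable.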

        values = self.model.value(obs)
        actions_mu, log_std_mu = self.model.policy(obs)
        std_pi = layers.exp(log_std)
        normal_pi = Normal(actions, std_pi)
        x_t1 = normal_pi.sample([1])[0]
        y_t1 = layers.tanh(x_t1)
        # action1 = y_t1 * self.max_action
        log_prob1 = normal_pi.log_prob(x_t1)
        log_prob1 -= layers.log(self.max_action * (1 - layers.pow(y_t1, 2)) + epsilon)
        log_prob1 = layers.reduce_sum(log_prob1, dim=1, keep_dim=True)
        log_prob_pi = layers.squeeze(log_prob1, axes=[1])

        std_mu = layers.exp(log_std_mu)
        normal_mu = Normal(actions_mu, std_mu)
        x_t2 = normal_mu.sample([1])[0]
        y_t2 = layers.tanh(x_t2)
        # action2 = y_t2 * self.max_action
        log_prob2 = normal_mu.log_prob(x_t2)
        log_prob2 -= layers.log(self.max_action * (1 - layers.pow(y_t2, 2)) + epsilon)
        log_prob2 = layers.reduce_sum(log_prob2, dim=1, keep_dim=True)
        log_prob_mu = layers.squeeze(log_prob2, axes=[1])

        # target_policy_distribution = CategoricalDistribution(target_logits)
        # behaviour_policy_distribution = CategoricalDistribution(
        #     behaviour_logits)

        policy_entropy = normal_pi.entropy()
        target_actions_log_probs = log_prob_mu
        behaviour_actions_log_probs = log_prob_pi

        # Calculating kl for debug
        # kl = target_policy_distribution.kl(behaviour_policy_distribution)
        kl = normal_pi.kl_divergence(normal_mu)
        kl = layers.reduce_mean(kl)


zenghsh3 commented on June 19, 2024

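Hello. Judging from the error message "Integer division by zero encountered in divide", the failure occurs in elementwise_divide (the division op) because the divisor contains a 0. You can check where the numerical computation goes wrong, for example by adding fluid.layers.Print before the division op to print the tensor's values at runtime.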
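
The debugging idea above, inspecting the divisor just before the division runs, can be sketched in plain Python without any framework. The helpers zero_divisor_indices and checked_divide are hypothetical names used only for this illustration; in a Paddle graph, fluid.layers.Print would play the role of the inspection step.

```python
def zero_divisor_indices(divisor):
    """Return the positions in a flat list of numbers whose value is exactly zero."""
    return [i for i, d in enumerate(divisor) if d == 0]

def checked_divide(numerator, divisor):
    """Element-wise division that reports which divisor entries are zero."""
    bad = zero_divisor_indices(divisor)
    if bad:
        raise ZeroDivisionError(f"divisor is zero at indices {bad}")
    return [n / d for n, d in zip(numerator, divisor)]

print(zero_divisor_indices([1.0, 0.0, 2.0, 0.0]))
print(checked_divide([1.0, 4.0], [2.0, 8.0]))
```

Knowing which entries are zero usually points back to the op that produced them, which is easier to fix than the division itself.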


TomorrowIsAnOtherDay commented on June 19, 2024

Thumbs up! We'll go ahead and close this issue then :)
