Comments (11)

yebai commented on June 11, 2024

> Then we need to do a catch-exception which is slow. Alternatively, we can make logpdf_with_trans return -Inf by checking terms inside log before evaluation?

Sounds good to me!

from bijectors.jl.

yebai commented on June 11, 2024

A step size of 10 is actually quite big. Did you look at the adapted epsilon in HMC?

xukai92 commented on June 11, 2024

Just checked - this happens when the step size is tuned to something like 5.75.

yebai commented on June 11, 2024

I think NUTS adaptation needs some further improvement. Before that happens, perhaps we can heuristically upper bound the step size, e.g. < 2?

The Dirichlet distribution is a very nasty one for transforms, and I suspect it'll be hard to make it stable in all cases. If it occasionally steps off the edges of the support, we can try to reject that Hamiltonian simulation.

xukai92 commented on June 11, 2024

> I think NUTS adaptation needs some further improvement. Before that happens, perhaps we can heuristically upper bound the step size, e.g. < 2?

2 still seems to be too high. 1 works.

> The Dirichlet distribution is a very nasty one for transforms, and I suspect it'll be hard to make it stable in all cases. If it occasionally steps off the edges of the support, we can try to reject that Hamiltonian simulation.

Then we need to do a catch-exception, which is slow. Alternatively, we can make logpdf_with_trans return -Inf by checking the terms inside log before evaluation?

mohamed82008 commented on June 11, 2024

We can just do log(max(0, x)) everywhere there is log(x). The output may then be -Inf, or NaN if we divide two Infs by each other for some reason. So, outside, we can check the output with isnan and isinf.
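A minimal sketch of that suggestion, assuming plain Julia with no packages. The names `safelog` and `usable_logp` are hypothetical helpers made up for illustration and are not part of Bijectors.jl:

```julia
# Hypothetical helper: clamp the argument at zero so that log returns -Inf
# for negative inputs instead of throwing a DomainError.
safelog(x::Real) = log(max(zero(x), x))

# Hypothetical check done "outside": a NaN or infinite log-density
# is treated as a signal to reject the proposal.
usable_logp(lp::Real) = !(isnan(lp) || isinf(lp))

safelog(0.5)                   # finite, same as log(0.5)
safelog(-1e-3)                 # -Inf, where log(-1e-3) would throw
safelog(0.0) / safelog(0.0)    # -Inf / -Inf gives NaN
usable_logp(safelog(-1e-3))    # false, so the caller can reject
```

The point of the design is that no exception handling is needed: bad values propagate as -Inf or NaN and are caught by a single cheap check at the end.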

yebai commented on June 11, 2024

This sounds good to me. Perhaps give it a try and see whether it can fix TuringLang/Turing.jl#621?

xukai92 commented on June 11, 2024

@yebai I will give it a try today.

xukai92 commented on June 11, 2024

@mohamed82008 @yebai Great, it works.

mohamed82008 commented on June 11, 2024

Ok, the root cause of this error AFAICT is https://github.com/TuringLang/Bijectors.jl/blob/master/src/Bijectors.jl#L221. The output of StatsFuns.logistic(x) is already tiny for x = -1e2, and is pretty much 0 for x = -1e4 and below. The easiest fix is to offset it by eps. I can try that and see if it replaces the need for the workaround in #13.
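To make the underflow concrete, here is a small sketch. It assumes plain-Julia stand-ins for `logistic` and `logit` rather than the StatsFuns implementations, and `offset_logistic` is a hypothetical name for the "offset by eps" idea:

```julia
# Stand-ins for the logistic/logit pair (assumed to behave like StatsFuns
# for the inputs shown here).
logistic(x) = 1 / (1 + exp(-x))
logit(p) = log(p / (1 - p))

logistic(-1e2)           # tiny but still positive
logistic(-1e4)           # exactly 0.0, because exp(1e4) overflows to Inf
logit(logistic(-1e4))    # -Inf, so the round trip back is lost

# Hypothetical version of the eps offset: nudge the output away from 0.
offset_logistic(x) = logistic(x) + eps(Float64)
logit(offset_logistic(-1e4))   # finite again
```

Note that this sketch only protects the lower edge. Adding eps to an output that is already 1.0 pushes it above 1, so the upper edge would need separate handling.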

mohamed82008 commented on June 11, 2024

Ok, so after trying many things, I came to the realization that we cannot support all of the Euclidean space with the invlink transformation while keeping the inverse relationship between link and invlink. As of #9, we have been using ϵ to help stay away from the ugly -Inf and Inf of logit. This inevitably means that we are mapping all valid points in the support of the Dirichlet distribution to a finite subspace of the Euclidean space.

So then what happens when you try to walk back from a point very far out in the Euclidean space to the support of the distribution? The unthinkable. If that point is far beyond the friendly part of the Euclidean space, the logistic function starts yelling at us by returning 0 and 1. This is alarming, because inverting that with logit will give us the ugly -Inf and Inf back.

In other words, one fix to the error in this issue is to truncate the output of logistic to be between ϵ and 1 - ϵ. And we have to truncate, not recenter, if we want to keep the inverse relationship between link and invlink valid for the friendly part of the Euclidean space. If we are willing to make the inverse property of link and invlink approximate, then we can recenter the output of logistic using z = z * (1 - 2ϵ) + ϵ. A more direct option, which I found works better in practice, is to truncate all x[k]s to be between 0 and 1. This condition is a direct consequence of sticking to the friendly part of the Euclidean space, but it can be violated when using values outside the friendly region. Excuse my layman's terms.

I will make a PR to get your feedback.
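The truncate-versus-recenter trade-off described above can be sketched roughly as follows. This is an illustration only, not the code from the PR. `logistic` and `logit` are plain-Julia stand-ins, `truncated` and `recentered` are hypothetical names, and the value ϵ = 1e-6 is an assumption chosen so the difference is visible (the thread does not state the value used):

```julia
logistic(x) = 1 / (1 + exp(-x))
logit(p) = log(p / (1 - p))

ϵ = 1e-6   # assumed tolerance, for illustration only

# Option 1: truncate. Outputs already inside [ϵ, 1 - ϵ] are left untouched,
# so logit still inverts them exactly (up to rounding).
truncated(y) = clamp(logistic(y), ϵ, 1 - ϵ)

# Option 2: recenter. Every output is shifted slightly via z * (1 - 2ϵ) + ϵ,
# so the inverse relationship only holds approximately.
recentered(y) = logistic(y) * (1 - 2ϵ) + ϵ

y = 3.0                    # a point in the "friendly" region
logit(truncated(y))        # recovers y up to floating-point rounding
logit(recentered(y))       # close to y, but off by a small amount

logit(truncated(-1e4))     # finite, instead of -Inf
logit(truncated(1e4))      # finite, instead of Inf
```

With truncation, points far outside the friendly region all collapse onto the boundary values, which is the price paid for keeping the exact inverse inside it.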
