I've noticed a numeric issue when using this package with random walk algorithm. The c

We can just do log(max(0, x)) everywhere there is <co

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Ok the root cause of this error AFAICT is <a href="https://github.com/TuringLang/Bijec

Numerical issue with random walk for Dirichlet about bijectors.jl HOT 11 CLOSED

turinglang commented on June 11, 2024

Numerical issue with random walk for Dirichlet

from bijectors.jl.

Comments (11)

yebai commented on June 11, 2024 1

Then we need to do a catch-exception which is slow. Alternatively, we can make logpdf_with_trans return -Inf by checking terms inside log before evaluation?

Sounds good to me!

from bijectors.jl.

yebai commented on June 11, 2024

stepsize of 10 is quite big actually, did you look at adapted epsilon in HMC?

from bijectors.jl.

xukai92 commented on June 11, 2024

Just checked - this happens when the step size is tuned to something like 5.75.

from bijectors.jl.

yebai commented on June 11, 2024

I think NUTS adaption needs some further improvement. Before that happens, perhaps we can heuristically upper bound step size, e.g. < 2?

Dirichlet distribution is a very nasty one for transforms, I suspect it'll be hard to make it stable in all cases. If it occasionally throws off the edges, we can try to reject a Hamiltonian simulation.

from bijectors.jl.

xukai92 commented on June 11, 2024

I think NUTS adaption needs some further improvement. Before that happens, perhaps we can heuristically upper bound step size, e.g. < 2?

2 still seems to be too high. 1 works.

Dirichlet distribution is a very nasty one for transforms, I suspect it'll be hard to make it stable in all cases. If it occasionally throws off the edges, we can try to reject a Hamiltonian simulation.

Then we need to do a catch-exception which is slow. Alternatively we can make logpdf_with_trans return -Inf by checking terms inside log before evaluation?

from bijectors.jl.

mohamed82008 commented on June 11, 2024

We can just do log(max(0, x)) everywhere there is log(x). The output may then be -Inf or NaN if we divide 2 Infs by each other for some reason. So outside we can check the output with isnan and isinf.

from bijectors.jl.

yebai commented on June 11, 2024

This sounds good to me. Perhaps give it a try and see whether it can fix TuringLang/Turing.jl#621?

from bijectors.jl.

xukai92 commented on June 11, 2024

@yebai I will take a try today

from bijectors.jl.

xukai92 commented on June 11, 2024

@mohamed82008 @yebai great it works

from bijectors.jl.

mohamed82008 commented on June 11, 2024

Ok the root cause of this error AFAICT is https://github.com/TuringLang/Bijectors.jl/blob/master/src/Bijectors.jl#L221. The StatsFuns.logistic(x) function is way too small even for x = -1e2, and is pretty much 0 for -1e4 and higher. The easiest fix is to offset it by eps. I can try that and see if it replaces the need for the workaround in #13.

from bijectors.jl.

mohamed82008 commented on June 11, 2024

Ok, so after trying so many things, I came to a realization that we cannot support all of the Eucledian space with the invlink transformation while keeping the inverse relationship between link and invlink. As of #9 , we have been using ϵ to help stay away from the ugly -Inf and Inf of logit. This inevitably means that we are mapping all valid points in the support of the Dirichlet distribution to a finite subspace of the Eucledian space. So then what happens when you try to walk back from a point very far in the Eucledian space to the support of the distribution? The unthinkable. If that point is far beyond the friendly part of the Eucledian space, the logistic function starts yelling at us by returning 0 and 1. This is alarming, because inverting that with logit will give us the ugly -Inf and Inf back. In order words, one fix to the error in this issue is to truncate the output of logistic to be between ϵ and 1 - ϵ. And we have to truncate, not recenter, if we want to keep the inverse relationship between link and invlink valid for the friendly part of the Eucledian space. If we are willing to make the inverse property of link and invlink approximate, then we can recenter the output of logistic using z = z * (1 - 2ϵ) + ϵ. More directly, and I found that it works better in practice, is to truncate all x[k]s to be between 0 and 1. This condition is a direct consequence of sticking to the friendly part of the Eucledian space, but can be violated when using values outside the friendly region. Excuse my layman terms.

I will make a PR to get your feedback.

from bijectors.jl.

Numerical issue with random walk for Dirichlet about bijectors.jl HOT 11 CLOSED

Comments (11)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent