Comments (3)
@jsocolar, thanks for bringing this up. I'm putting up a minimal example in C++ to walk through it.
If you pop this into a file called test/unit/math/normal_lcdf_test.cpp, you'll be able to run it from the command line like: python runTests.py test/unit/math/normal_lcdf_test.cpp
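#include <stan/math.hpp>
#include <gtest/gtest.h>
#include <cmath>
TEST(normal_lcdf, inf_y) {
  // y is data (a plain double); mu and sigma are autodiff variables.
  double y = stan::math::INFTY;
  stan::math::var mu = 0.0;
  stan::math::var sigma = 1.0;
  // The log CDF at y = +infinity is log(1) = 0.
  stan::math::var target = stan::math::normal_lcdf(y, mu, sigma);
  EXPECT_FLOAT_EQ(0.0, target.val());
  // Propagate adjoints, then inspect both partials.
  target.grad();
  EXPECT_FLOAT_EQ(0.0, mu.adj());        // d/dmu is 0
  EXPECT_TRUE(std::isnan(sigma.adj()));  // d/dsigma is NaN
  stan::math::recover_memory();
}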
> When y is infinite (and given that mu and sigma are finite, here ensured by declaring them as parameters), then the gradient is zero. However, it seems that Stan yields infinite gradients here.
Just curious... how'd you come to this conclusion? And which gradient did you think was "infinite"? (The gradient has two elements.) It'd be good to know how you're thinking about what the math library does and how it's generating gradients, especially at the boundary conditions.
To get into the example, if you look at the test:
d normal_lcdf(y, mu, sigma) / dmu = 0
d normal_lcdf(y, mu, sigma) / dsigma = NaN
(NaN does not equal infinity, but it doesn't mean that's good.)
I think you're expecting d / dsigma = 0? Is that right?
I didn't trace through to figure out why it's having trouble computing that term, but I'm sure it could be addressed. I'd lean towards throwing an exception at the boundaries, but I think that would change the behavior of Stan in a way that we'd have to have a larger discussion to implement.
Thoughts?
Thank you for opening this issue, @jsocolar.
@syclik Thanks for taking a look!
> Just curious... how'd you come to this conclusion? And which gradient did you think was "infinite"? (The gradient has two elements.) It'd be good to know how you're thinking about what the math library does and how it's generating gradients, especially at the boundary conditions.
I concluded that the gradient should be zero because that's the well-defined limit of the gradient with respect to both mu and sigma as y approaches infinity. I concluded (incorrectly, I think) that Stan's gradient was infinite (correct conclusion: not finite) by running the Stan program from the OP via cmdstanr with
lcdf_mod <- cmdstan_model("/Users/JacobSocolar/Desktop/lcdf.stan")
lcdf_mod$sample(data = list(y = Inf))
and getting back a bunch of
Chain 1 Rejecting initial value:
Chain 1 Gradient evaluated at the initial value is not finite.
Chain 1 Stan can't start sampling from this initial value.
Based on your follow-up, I assume that the problem is the partial wrt sigma, that the NaN is the problem, and that when I said "infinite" I really meant "not finite".
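To spell out the limit I have in mind, here is the calculation written out, using my own notation: z for the standardized value, and \phi and \Phi for the standard normal density and CDF.

```latex
\begin{aligned}
z &= \frac{y - \mu}{\sigma}, \qquad \log F(y \mid \mu, \sigma) = \log \Phi(z), \\
\frac{\partial}{\partial \mu} \log \Phi(z) &= -\frac{1}{\sigma}\,\frac{\phi(z)}{\Phi(z)}, \\
\frac{\partial}{\partial \sigma} \log \Phi(z) &= -\frac{z}{\sigma}\,\frac{\phi(z)}{\Phi(z)}, \\
\lim_{y \to \infty} \Phi(z) &= 1, \qquad \lim_{y \to \infty} \phi(z) = 0, \qquad \lim_{y \to \infty} z\,\phi(z) = 0 .
\end{aligned}
```

Since \phi(z) decays like \exp(-z^2/2), it goes to zero faster than z grows, so both partials tend to zero for finite \mu and \sigma > 0.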
The use case for returning zero here is given in the discourse thread linked from the OP. It shouldn't be particularly burdensome to code around the current behavior, which I agree isn't necessarily wrong, but if there's appetite on the Stan side for returning zero for the partial wrt sigma then that'd also solve the original discourse issue.