Comments (25)
I think we need another level of abstraction on transformation as well, for the purpose of making this package compatible with possible incoming normalzing flows.
struct InvertibleTransformation end
logabsdetjacob(it::InvertibleTransformation, x) # this returns the logabsdetjacob
forward(it::InvertibleTransformation, x) # this returns forward and the corresponding logabsdetjacob
inverse(it::InvertibleTransformation, x) # this returns inverse and the corresponding logabsdetjacob
This is inline with #6 (comment), in which a transformed distribution consists a base distributin and an invertiable transformation (for which the logdetjacob is also computable).
Edit: logdetjacob
-> logabsdetjacob
(suggested by @willtebbutt
#6 (comment))
from bijectors.jl.
Good point - I really like your Inversed
type for those situations in which the inverse isn't really something that really has a name, as you suggest, and we just add the inverse type where the inverse has a well-established name in its own right.
from bijectors.jl.
I was thinking something along the lines of
struct Transformed{D<:Distribution, T} <: Distribution
d::D
t::T
end
logpdf(::Transformed{D, T}, ::Real) = ...
rand(::Transformed{D, T}) = ...
(this would need to be specialised for Univariate, Multivariate etc)
from bijectors.jl.
Hey! I made an attempt at implementing automatic differentiable variational inference (ADVI) for Turing, which requires transforming the distribution to a distribution with real support, i.e. can use Bijectors.jl
for this!
While doing this I was thinking a bit about the interface and implemented the following as a "wrapper" for Bijectors.jl
.
using Distributions, Bijectors
import Distributions: logpdf, rand
# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
d::D
end
# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)
# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)
Important: vectorized logpdf
does not work, and results in the following error
ERROR: MethodError: no method matching iterate(::Transformed{MvNormal{Float64,PDMats.PDMat{Float64,Array{Float64,2}},Array{Float64,1}}})
If we could fix the above issue, maybe this would be a nice "ad-hoc" implementation. Comparing to the above suggestion which to me seems like it would require a more thorough rewrite of Bijectors.jl
(though I might be thoroughly wrong!).
I haven't investigated this to a satisfactory degree, but I was encouraged on #turing
to share anyways to initiate a discussion.
from bijectors.jl.
Can you implement an iterate
function for the multivariate subtype case? Not sure what that would look like, exactly.
from bijectors.jl.
Ah, sorry I ran it on the wrong distribution; meant to use a Univariate
distribution. Result is the same.
p = Transformed(Normal(0.0, 1.0))
logpdf(p, rand(p)) # <= this works
logpdf.(p, rand(p, 10)) # <= results in the following error
ERROR: MethodError: no method matching iterate(::Transformed{Normal{Float64}})
from bijectors.jl.
I also think that it would be fairly easy to transition from the Transformed{D}
I suggested to the more general Transformed{D, T}
suggested by @willtebbutt when you get to that. For simple use-cases the interface wouldn't change, e.g. the below code would work in both cases
p = Transformed(Exponential(1.0))
if we simply implement a constructor similar to
Transformed(d::D <: PositiveDistribution) = Transformed(d, PositiveDistributionTransformation())
behind the scenes. This way you could use the simpler interface while transitioning to the more general approach.
Also, maybe I misunderstood your question @cpfiffer. Did you question how to implement iterate(...)
for a Multivariate
in particular or any Distribution
? I'm honestly a bit confused by documentation for Distributions.jl
. If I follow https://juliastats.github.io/Distributions.jl/latest/extends.html#Create-a-Multivariate-Distribution-1 and try to implement Distributions._rand!
I get errors saying no sampler exists, but at the same time isa(p_transformed, Sampleable)
returns true
. Not sure if I'm missing something here or this is an actual bug in Distributions.jl
.
from bijectors.jl.
Took a second look at this. I thought you needed to overload iterate
, but the fix is simpler than that. Looks like really all you need is two additional overloads of logpdf
and to remove the broadcast operator on logpdf(p, rand(p,10))
:
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::Vector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::Vector{AbstractMatrix{T}} where T <: Real) = logpdf_with_trans(d.d, x, true)
This works now:
p = Transformed(Normal(0.0, 1.0))
logpdf(p, rand(p,10))
10-element Array{Float64,1}:
-1.0758450227893368
-1.0166693096930153
-0.9469445907722853
-2.3735502039370373
-0.924499243891403
-1.2068563756279285
-1.363355793193587
-1.6091214288686937
-1.0012221537005328
-0.9931074723392034
Here's the whole MWE again:
using Distributions, Bijectors
import Distributions: logpdf, rand
# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
d::D
end
# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)
# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::Vector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::Vector{AbstractMatrix{T}} where T <: Real) = logpdf_with_trans(d.d, x, true)
# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)
p = Transformed(Normal(0.0, 1.0))
logpdf(p, rand(p)) # <= this works
logpdf(p, rand(p,10)) # <= this works now
from bijectors.jl.
Nice!
The only potential (future) issue is that I believe logpdf(p, rand(p, 10))
is supposed to be deprecated for Univariate
. At least if I run
p = Normal(0.0, 1.0)
logpdf(p, rand(p, 10))
I get
Warning: `logpdf(d::UnivariateDistribution, X::AbstractArray)` is deprecated, use `logpdf.(d, X)` instead.
which can be verified at https://github.com/JuliaStats/Distributions.jl/blob/master/src/deprecates.jl#L39. That's why I was trying to make it work with vectorization, though I'm still not able to..
Also, do we need logpdf
for Vector{AbstractMatrix{T}}
?
And regarding overloading iterate
; the weird thing is no Distribution
in Distributions.jl
implements iterate
. Might be related to Distributions.jl/issues/579 and Distributions.jl/issues/846. I'll look a bit more into it tomorrow.
from bijectors.jl.
Got it working.
using Distributions, Bijectors
import Distributions: logpdf, rand
# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
d::D
end
# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)
# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)
# makes it possible to vectorize operations on `Transformed` for `UnivariateDistributions`
Broadcast.broadcastable(d::Transformed{D} where D <: UnivariateDistribution) = Ref(d)
The issue was that the Transformed{D}
didn't inherit the implementation of Broadcast.broadcastable
for UnivariateDistribution
from Distributions.jl/common.jl#L100 due to the unspecified VariateForm
in Transformed{D}
. But now the implementation works fine for the case where it failed before;
p = Transformed(Normal(0.0, 1.0))
logpdf(p, rand(p)) # <= this works
logpdf.(p, rand(p, 10)) # <= this also works now
Though it might be a good idea to consider maybe making explicit types for Univariate
and Multivariate
, as this ensures issues like the above does not happen and instead sane behavior will be "inherited" from Distributions.jl
. E.g. vectorization, user-specified rng
, pdf
implementation using logpdf
. It also makes it easier to stay up-to-date with Distributions.jl
.
An example of such an implementation could be:
using Distributions, Bijectors
import Random: AbstractRNG
import Distributions: logpdf, rand, rand!, _rand!, _logpdf
struct UnivariateTransformed <: Distribution{Univariate, Continuous}
d::UnivariateDistribution
end
struct MultivariateTransformed <: Distribution{Multivariate, Continuous}
d::MultivariateDistribution
end
# constructors
transformed(d::UnivariateDistribution) = UnivariateTransformed(d)
transformed(d::MultivariateDistribution) = MultivariateTransformed(d)
# size
Base.length(td::MultivariateTransformed) = length(td.d)
# logp
logpdf(d::UnivariateTransformed, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
_logpdf(d::MultivariateTransformed, x::AbstractVector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
# rand
rand(rng::AbstractRNG, td::UnivariateTransformed) = first(rand(td.d))
_rand!(rng::AbstractRNG, td::MultivariateTransformed, x::AbstractVector{T} where T <: Real) = begin
rand!(rng, td.d, x)
y = link(td.d, x)
copyto!(x, y)
end
From a user-perspective the interface is still "nice":
p = transformed(Normal(0.0, 1.0))
logpdf.(p, rand(p, 10)) # <= this just works now
EDIT: fixed the _rand
function
from bijectors.jl.
Would suggest logdetjacob
be called logabsdetjacob
, but other than that @xukai92 's proposal sounds reasonable to me. Also maybe extend inv
rather than create a new function inverse
?
edit: actually, perhaps inverse
should be reverse
or backward
to mirror the forward
function?
from bijectors.jl.
Yes this should be logabsdetjacob
!
I like the term inverse
as we are evaluating using the inverse function.
from bijectors.jl.
I did some work on such an interface earlier this week. But then I wrote a simple test which I'm having issues with because of (what seems like) numerical inaccuracies leading to wrong results. Should I submit a WIP PR so you can all see what it looks like?
EDIT: I'm using AD to compute the jacbobian from link
, which is probably why I'm not qetting quite the correct results. This is just a "placeholder" implementation instead of implementing each of the different logabsdetjac
functions manually.
from bijectors.jl.
I like the term inverse as we are evaluating using the inverse function.
Yes, this is reasonable, and I could certainly live with it. My issue is with the use of the name inv(t, x)
as a shorthand for what we really mean: evaluate_inverse_function(t, x)
i.e. I would prefer to have inv
be reserved unary methods that return the inverse of the thing that they're applied to.
An alternative interface might be:
abstract type InvertibleTransformation end
logabsdetjacob(it::InvertibleTransformation, x) # this returns the logabsdetjacob
forward(it::InvertibleTransformation, x) # applies the invertible transformation to a data point
inv(it::InvertibleTransformation) # returns another invertible transformation - the inverse of `it`
This way, if we want to compute the inverse, we would write
forward(inv(it), x)
to get the inverse transformation evaluated at x
, and the negation of the logabsdet
of the jacobian of it
at x
.
Maybe I'm being overly pedantic and, as I said, I could definitely live with your proposal above @xukai92
from bijectors.jl.
This looks great! I guess there are two options to implement the inverse transform as well.
One is to implement a one to one corresponding transforms, say for A
we also have InverseA
and define inv(a::A) = InverseA(...)
and inv(ia::InverseA) = A(...)
.
Or we can define a wrapper called Inversed
and define inv(it:: InvertibleTransformation) = Inversed(it)
and inv(iit::Inversed{T})::T = iit. original
. And for each it::IT
we define forward(it::IT, x)
and forward(iit::Inversed{IT}, x)
.
I feel the second option is more neat. How do you think? @willtebbutt
from bijectors.jl.
We could make the AD way as the callback functions for logabsdetjacob
. The main reason we need a designated abstraction is that when the diemension is high, using AD to compute this term is computationally intractable. This is the case if you are working with natrue images where the transfomraiton is parametrized by neural networks.
from bijectors.jl.
@willtebbutt I have a working example for the proposed abstraction, see below.
# Abstraction
abstract type AbstractInvertibleTransformation end
logabsdetjacob(t::T1, x::T2) where {T1<:AbstractInvertibleTransformation,T2} =
error("`logabsdetjacob(t::$T1, x::$T2)` is not implemented.")
forward(t::T1, x::T2) where {T1<:AbstractInvertibleTransformation,T2} =
error("`forward(t::$T1, x::$T2)` is not implemented.")
struct Inversed{T<:AbstractInvertibleTransformation} <: AbstractInvertibleTransformation
original::T
end
inv(t::T) where {T<:AbstractInvertibleTransformation} = Inversed(t)
inv(it::Inversed{T}) where {T<:AbstractInvertibleTransformation} = it.original
logabsdetjacob(it::T1, y::T2) where {T<:AbstractInvertibleTransformation,T1<:Inversed{T},T2} =
error("`logabsdetjacob(it::$T1, y::$T2)` is not implemented.")
forward(it::T1, y::T2) where {T<:AbstractInvertibleTransformation,T1<:Inversed{T},T2} =
error("`forward(it::$T1, y::$T2)` is not implemented.")
# Demo: logit transformation
using StatsFuns: logit, logistic
struct Logit{T<:Real} <: AbstractInvertibleTransformation
a::T
b::T
end
logabsdetjacob(t::Logit{<:Real}, x::Real) = log((x - t.a) * (t.b - x) / (t.b - t.a))
forward(t::Logit, x::Real) = (rv=logit((x - t.a) / (t.b - t.a)), logabsdetjacob=-logabsdetjacob(t, x))
function forward(it::Inversed{Logit{T}}, y::Real) where {T<:Real}
t = it.original
x = (t.b - t.a) * logistic(y) + t.a
return (rv=x, logabsdetjacob=logabsdetjacob(t, x))
end
# Simple demo
using Distributions, Bijectors
a, b = 1, 3
d = Truncated(Normal(0, 1), a, b)
t = Logit(a, b)
y = randn()
it = inv(t)
itres = forward(it, y)
@info itres
@info "This implementation" logpdf(d, itres.rv) + itres.logabsdetjacob
@info "Bijectors.jl" logpdf_with_trans(d, itres.rv, true)
The demo outputs:
┌ Info: (rv = 1.930013312433704, logabsdetjacob = -0.6980573521827962)
└ @ Main In[5]:7
┌ Info: This implementation
│ logpdf(d, itres.rv) + itres.logabsdetjacob = -1.6299051579255193
└ @ Main In[5]:8
┌ Info: Bijectors.jl
│ logpdf_with_trans(d, itres.rv, true) = -1.6299051579255193
└ @ Main In[5]:9
Comments are welcome!
from bijectors.jl.
@xukai92 glad you like the suggestion, the proposed implementation LGTM. One thing I would suggest is that instead of implementing stuff for Inversed{Logit}
, we just implement Logistic <: AbstractInvertibleTransformation
, and make
inv(t::Logit) = Logistic(some_function_of_parameters_of_t)
and make the inverse of Logistic
a Logit
.
from bijectors.jl.
Yes I was thinking of this approach as well. My issues with it is are
- For a lot of the transformations used in the normalzing flow. Only the forward has a name and we would need to make up a name for each inverse. E.g. for
AffineCoupling
we would need to createInversedAffineCoupling
. If this is case I'd rather haveInversed{AffineCoupling}
to be more consistent. - We need to implement two extra
inv
:p
In fact, we can still overload the inv
function for specific transformations to do what you proposed anyway :)
from bijectors.jl.
I've now tried to incorporate the above comments and Flows.jl into my previous proposal #27; any feedback would be nice:)
from bijectors.jl.
Closed by PR #27.
from bijectors.jl.
@JuliaRegistrator register()
from bijectors.jl.
Registration pull request created: JuliaRegistries/General/3061
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if Julia TagBot is installed, or can be done manually through the github interface, or via:
git tag -a v0.4.0 -m "<description of version>" 6ea2c373ac573295332c48557bbae7c3182edbc4
git push origin v0.4.0
from bijectors.jl.
@JuliaRegistrator register()
from bijectors.jl.
Registration pull request updated: JuliaRegistries/General/3061
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if Julia TagBot is installed, or can be done manually through the github interface, or via:
git tag -a v0.4.0 -m "<description of version>" 635a71a868a496f90fe9edea90fdf37f947c792d
git push origin v0.4.0
from bijectors.jl.
Related Issues (20)
- Road to DensityInterface? HOT 11
- Rewrite: what do we want
- Fitting normalizing flow HOT 2
- Matrix Variate Bijectors - Lower/Uppertriangular transforms
- Zygote AD & `logpdf` for transformed multivariate
- `Bijectors.ordered(d)` is incorrect if `bijector(d)` is not `Identity` HOT 1
- Adding support for FunctionChains.jl? HOT 9
- Problem compiling an app with Turing
- MethodError: no method matching bijector(::MixtureModel{Multivariate, Continuous, MvNormal, Float64}) HOT 1
- CorrBijector makes posterior improper HOT 7
- SimplexBijector tests fails on boundary
- Remove heavy usage of `@generated`
- Questions on custom bijectors HOT 1
- Bijector for MatrixNormal HOT 2
- Remove heavy usage of `@generated` HOT 2
- Add `rng` to have the reproducibility in `PlanarLayer` and `RadialLayer`
- `logpdf` of `UnivariateTransformed` produces `StackOverflowError` HOT 7
- Zygote is broken for `Stacked` bijectors HOT 5
- filldist, up1 not defined HOT 6
- Adding bijectors for OrderStatistic and JointOrderStatistics HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bijectors.jl.