logpdf(t::Transform, l::Distribution, x)

I was thinking something along the lines of <div class="highlight highlight-source

Ah, sorry I ran it on the wrong distribution; meant to use a <code class="notranslate"

I also think that it would be fairly easy to transition from the <code class="notransl

Took a second look at this. I thought you needed to overload <code class="notranslate"

Nice! The only potential (future) issue is that I believe <code class="notranslate

Got it working. <div class="highlight highlight-source-julia notranslate position-

New interface,about turinglang/bijectors.jl

Comments (25)

xukai92 commented on June 12, 2024 1

I think we need another level of abstraction on transformation as well, for the purpose of making this package compatible with possible incoming normalzing flows.

struct InvertibleTransformation end

logabsdetjacob(it::InvertibleTransformation, x) # this returns the logabsdetjacob
forward(it::InvertibleTransformation, x) # this returns forward and the corresponding logabsdetjacob
inverse(it::InvertibleTransformation, x) # this returns inverse and the corresponding logabsdetjacob

This is inline with #6 (comment), in which a transformed distribution consists a base distributin and an invertiable transformation (for which the logdetjacob is also computable).

Edit: logdetjacob -> logabsdetjacob (suggested by @willtebbutt
#6 (comment))

from bijectors.jl.

willtebbutt commented on June 12, 2024 1

Good point - I really like your Inversed type for those situations in which the inverse isn't really something that really has a name, as you suggest, and we just add the inverse type where the inverse has a well-established name in its own right.

from bijectors.jl.

willtebbutt commented on June 12, 2024

I was thinking something along the lines of

struct Transformed{D<:Distribution, T} <: Distribution
    d::D
    t::T
end
logpdf(::Transformed{D, T}, ::Real) = ...
rand(::Transformed{D, T}) = ...

(this would need to be specialised for Univariate, Multivariate etc)

from bijectors.jl.

torfjelde commented on June 12, 2024

Hey! I made an attempt at implementing automatic differentiable variational inference (ADVI) for Turing, which requires transforming the distribution to a distribution with real support, i.e. can use Bijectors.jl for this!

While doing this I was thinking a bit about the interface and implemented the following as a "wrapper" for Bijectors.jl.

using Distributions, Bijectors
import Distributions: logpdf, rand

# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
    d::D
end

# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)

# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)

# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)

Important: vectorized logpdf does not work, and results in the following error

ERROR: MethodError: no method matching iterate(::Transformed{MvNormal{Float64,PDMats.PDMat{Float64,Array{Float64,2}},Array{Float64,1}}})

If we could fix the above issue, maybe this would be a nice "ad-hoc" implementation. Comparing to the above suggestion which to me seems like it would require a more thorough rewrite of Bijectors.jl (though I might be thoroughly wrong!).

I haven't investigated this to a satisfactory degree, but I was encouraged on #turing to share anyways to initiate a discussion.

from bijectors.jl.

cpfiffer commented on June 12, 2024

Can you implement an iterate function for the multivariate subtype case? Not sure what that would look like, exactly.

from bijectors.jl.

torfjelde commented on June 12, 2024

Ah, sorry I ran it on the wrong distribution; meant to use a Univariate distribution. Result is the same.

p = Transformed(Normal(0.0, 1.0))

logpdf(p, rand(p))         # <= this works
logpdf.(p, rand(p, 10))    # <= results in the following error

ERROR: MethodError: no method matching iterate(::Transformed{Normal{Float64}})

from bijectors.jl.

torfjelde commented on June 12, 2024

I also think that it would be fairly easy to transition from the Transformed{D} I suggested to the more general Transformed{D, T} suggested by @willtebbutt when you get to that. For simple use-cases the interface wouldn't change, e.g. the below code would work in both cases

p = Transformed(Exponential(1.0))

if we simply implement a constructor similar to

Transformed(d::D <: PositiveDistribution) = Transformed(d,  PositiveDistributionTransformation())

behind the scenes. This way you could use the simpler interface while transitioning to the more general approach.

Also, maybe I misunderstood your question @cpfiffer. Did you question how to implement iterate(...) for a Multivariate in particular or any Distribution? I'm honestly a bit confused by documentation for Distributions.jl. If I follow https://juliastats.github.io/Distributions.jl/latest/extends.html#Create-a-Multivariate-Distribution-1 and try to implement Distributions._rand! I get errors saying no sampler exists, but at the same time isa(p_transformed, Sampleable) returns true. Not sure if I'm missing something here or this is an actual bug in Distributions.jl.

from bijectors.jl.

cpfiffer commented on June 12, 2024

Took a second look at this. I thought you needed to overload iterate, but the fix is simpler than that. Looks like really all you need is two additional overloads of logpdf and to remove the broadcast operator on logpdf(p, rand(p,10)):

logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::Vector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::Vector{AbstractMatrix{T}} where T <: Real) = logpdf_with_trans(d.d, x, true)

This works now:

p = Transformed(Normal(0.0, 1.0))
logpdf(p, rand(p,10))

10-element Array{Float64,1}:
 -1.0758450227893368
 -1.0166693096930153
 -0.9469445907722853
 -2.3735502039370373
 -0.924499243891403 
 -1.2068563756279285
 -1.363355793193587 
 -1.6091214288686937
 -1.0012221537005328
 -0.9931074723392034

Here's the whole MWE again:

using Distributions, Bijectors
import Distributions: logpdf, rand

# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
    d::D
end

# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)

# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::Vector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::Vector{AbstractMatrix{T}} where T <: Real) = logpdf_with_trans(d.d, x, true)

# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)

p = Transformed(Normal(0.0, 1.0))

logpdf(p, rand(p))         # <= this works
logpdf(p, rand(p,10))    # <= this works now

from bijectors.jl.

torfjelde commented on June 12, 2024

Nice!
The only potential (future) issue is that I believe logpdf(p, rand(p, 10)) is supposed to be deprecated for Univariate. At least if I run

p = Normal(0.0, 1.0)
logpdf(p, rand(p, 10))

I get
Warning: `logpdf(d::UnivariateDistribution, X::AbstractArray)` is deprecated, use `logpdf.(d, X)` instead.
which can be verified at https://github.com/JuliaStats/Distributions.jl/blob/master/src/deprecates.jl#L39. That's why I was trying to make it work with vectorization, though I'm still not able to..

Also, do we need logpdf for Vector{AbstractMatrix{T}}?

And regarding overloading iterate; the weird thing is no Distribution in Distributions.jl implements iterate. Might be related to Distributions.jl/issues/579 and Distributions.jl/issues/846. I'll look a bit more into it tomorrow.

from bijectors.jl.

torfjelde commented on June 12, 2024

Got it working.

using Distributions, Bijectors
import Distributions: logpdf, rand

# Issue: this is not enforcing `VariateForm` of `D` and `Transformed{D}` to be the same
struct Transformed{D} <: Distribution{VariateForm, Continuous} where D <: Distribution
    d::D
end

# size
Base.size(td::Transformed) = size(td.d)
Base.length(td::Transformed) = length(td.d)

# logp
logpdf(d::Transformed{D} where D <: Distribution{Univariate}, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
logpdf(d::Transformed{D} where D <: Distribution{Multivariate}, x::AbstractMatrix{T} where T <: Real) = logpdf_with_trans(d.d, x, true)

# rand
rand(td::Transformed, n::Int64) = link(td.d, rand(td.d, n))
rand(d::Transformed{D} where D <: Distribution{Univariate}) = first(rand(d, 1))
rand(d::Transformed{D} where D <: Distribution{Multivariate}) = rand(d, 1)

# makes it possible to vectorize operations on `Transformed` for `UnivariateDistributions`
Broadcast.broadcastable(d::Transformed{D} where D <: UnivariateDistribution) = Ref(d)

The issue was that the Transformed{D} didn't inherit the implementation of Broadcast.broadcastable for UnivariateDistribution from Distributions.jl/common.jl#L100 due to the unspecified VariateForm in Transformed{D}. But now the implementation works fine for the case where it failed before;

p = Transformed(Normal(0.0, 1.0))

logpdf(p, rand(p))         # <= this works
logpdf.(p, rand(p, 10))    # <= this also works now

Though it might be a good idea to consider maybe making explicit types for Univariate and Multivariate, as this ensures issues like the above does not happen and instead sane behavior will be "inherited" from Distributions.jl. E.g. vectorization, user-specified rng, pdf implementation using logpdf. It also makes it easier to stay up-to-date with Distributions.jl.
An example of such an implementation could be:

using Distributions, Bijectors
import Random: AbstractRNG
import Distributions: logpdf, rand, rand!, _rand!, _logpdf

struct UnivariateTransformed <: Distribution{Univariate, Continuous}
    d::UnivariateDistribution
end

struct MultivariateTransformed <: Distribution{Multivariate, Continuous}
    d::MultivariateDistribution
end

# constructors
transformed(d::UnivariateDistribution) = UnivariateTransformed(d)
transformed(d::MultivariateDistribution) = MultivariateTransformed(d)

# size
Base.length(td::MultivariateTransformed) = length(td.d)

# logp
logpdf(d::UnivariateTransformed, x::T where T <: Real) = logpdf_with_trans(d.d, x, true)
_logpdf(d::MultivariateTransformed, x::AbstractVector{T} where T <: Real) = logpdf_with_trans(d.d, x, true)

# rand
rand(rng::AbstractRNG, td::UnivariateTransformed) = first(rand(td.d))
_rand!(rng::AbstractRNG, td::MultivariateTransformed, x::AbstractVector{T} where T <: Real) = begin
    rand!(rng, td.d, x)
     y = link(td.d, x)
    copyto!(x, y)
end

From a user-perspective the interface is still "nice":

p = transformed(Normal(0.0, 1.0))
logpdf.(p, rand(p, 10)) # <= this just works now

EDIT: fixed the _rand function

from bijectors.jl.

willtebbutt commented on June 12, 2024

Would suggest logdetjacob be called logabsdetjacob, but other than that @xukai92 's proposal sounds reasonable to me. Also maybe extend inv rather than create a new function inverse?

edit: actually, perhaps inverse should be reverse or backward to mirror the forward function?

from bijectors.jl.

xukai92 commented on June 12, 2024

Yes this should be logabsdetjacob!

I like the term inverse as we are evaluating using the inverse function.

from bijectors.jl.

torfjelde commented on June 12, 2024

I did some work on such an interface earlier this week. But then I wrote a simple test which I'm having issues with because of (what seems like) numerical inaccuracies leading to wrong results. Should I submit a WIP PR so you can all see what it looks like?

EDIT: I'm using AD to compute the jacbobian from link, which is probably why I'm not qetting quite the correct results. This is just a "placeholder" implementation instead of implementing each of the different logabsdetjac functions manually.

from bijectors.jl.

willtebbutt commented on June 12, 2024

I like the term inverse as we are evaluating using the inverse function.

Yes, this is reasonable, and I could certainly live with it. My issue is with the use of the name inv(t, x) as a shorthand for what we really mean: evaluate_inverse_function(t, x) i.e. I would prefer to have inv be reserved unary methods that return the inverse of the thing that they're applied to.

An alternative interface might be:

abstract type InvertibleTransformation end

logabsdetjacob(it::InvertibleTransformation, x) # this returns the logabsdetjacob
forward(it::InvertibleTransformation, x) # applies the invertible transformation to a data point
inv(it::InvertibleTransformation) # returns another invertible transformation - the inverse of `it`

This way, if we want to compute the inverse, we would write

forward(inv(it), x)

to get the inverse transformation evaluated at x, and the negation of the logabsdet of the jacobian of it at x.

Maybe I'm being overly pedantic and, as I said, I could definitely live with your proposal above @xukai92

from bijectors.jl.

xukai92 commented on June 12, 2024

This looks great! I guess there are two options to implement the inverse transform as well.

One is to implement a one to one corresponding transforms, say for A we also have InverseA and define inv(a::A) = InverseA(...) and inv(ia::InverseA) = A(...).

Or we can define a wrapper called Inversed and define inv(it:: InvertibleTransformation) = Inversed(it) and inv(iit::Inversed{T})::T = iit. original. And for each it::IT we define forward(it::IT, x) and forward(iit::Inversed{IT}, x).

I feel the second option is more neat. How do you think? @willtebbutt

from bijectors.jl.

xukai92 commented on June 12, 2024

@torfjelde Re: #6 (comment)

We could make the AD way as the callback functions for logabsdetjacob. The main reason we need a designated abstraction is that when the diemension is high, using AD to compute this term is computationally intractable. This is the case if you are working with natrue images where the transfomraiton is parametrized by neural networks.

from bijectors.jl.

xukai92 commented on June 12, 2024

@willtebbutt I have a working example for the proposed abstraction, see below.

# Abstraction
abstract type AbstractInvertibleTransformation end

logabsdetjacob(t::T1, x::T2) where {T1<:AbstractInvertibleTransformation,T2} = 
    error("`logabsdetjacob(t::$T1, x::$T2)` is not implemented.")
forward(t::T1, x::T2) where {T1<:AbstractInvertibleTransformation,T2} = 
    error("`forward(t::$T1, x::$T2)` is not implemented.")

struct Inversed{T<:AbstractInvertibleTransformation} <: AbstractInvertibleTransformation
    original::T
end

inv(t::T) where {T<:AbstractInvertibleTransformation} = Inversed(t)
inv(it::Inversed{T}) where {T<:AbstractInvertibleTransformation} = it.original

logabsdetjacob(it::T1, y::T2) where {T<:AbstractInvertibleTransformation,T1<:Inversed{T},T2} = 
    error("`logabsdetjacob(it::$T1, y::$T2)` is not implemented.")
forward(it::T1, y::T2) where {T<:AbstractInvertibleTransformation,T1<:Inversed{T},T2} = 
    error("`forward(it::$T1, y::$T2)` is not implemented.")

# Demo: logit transformation
using StatsFuns: logit, logistic

struct Logit{T<:Real} <: AbstractInvertibleTransformation
    a::T
    b::T
end

logabsdetjacob(t::Logit{<:Real}, x::Real) = log((x - t.a) * (t.b - x) / (t.b - t.a))

forward(t::Logit, x::Real) = (rv=logit((x - t.a) / (t.b - t.a)), logabsdetjacob=-logabsdetjacob(t, x))

function forward(it::Inversed{Logit{T}}, y::Real) where {T<:Real}
    t = it.original
    x = (t.b - t.a) * logistic(y) + t.a
    return (rv=x, logabsdetjacob=logabsdetjacob(t, x))
end

# Simple demo
using Distributions, Bijectors

a, b = 1, 3
d = Truncated(Normal(0, 1), a, b)
t = Logit(a, b)
y = randn()
it = inv(t)
itres = forward(it, y)
@info itres
@info "This implementation" logpdf(d, itres.rv) + itres.logabsdetjacob
@info "Bijectors.jl" logpdf_with_trans(d, itres.rv, true)

The demo outputs:

┌ Info: (rv = 1.930013312433704, logabsdetjacob = -0.6980573521827962)
└ @ Main In[5]:7
┌ Info: This implementation
│   logpdf(d, itres.rv) + itres.logabsdetjacob = -1.6299051579255193
└ @ Main In[5]:8
┌ Info: Bijectors.jl
│   logpdf_with_trans(d, itres.rv, true) = -1.6299051579255193
└ @ Main In[5]:9

Comments are welcome!

from bijectors.jl.

willtebbutt commented on June 12, 2024

@xukai92 glad you like the suggestion, the proposed implementation LGTM. One thing I would suggest is that instead of implementing stuff for Inversed{Logit}, we just implement Logistic <: AbstractInvertibleTransformation, and make

inv(t::Logit) = Logistic(some_function_of_parameters_of_t)

and make the inverse of Logistic a Logit.

from bijectors.jl.

xukai92 commented on June 12, 2024

Yes I was thinking of this approach as well. My issues with it is are

For a lot of the transformations used in the normalzing flow. Only the forward has a name and we would need to make up a name for each inverse. E.g. for AffineCoupling we would need to create InversedAffineCoupling. If this is case I'd rather have Inversed{AffineCoupling} to be more consistent.
We need to implement two extra inv :p

In fact, we can still overload the inv function for specific transformations to do what you proposed anyway :)

from bijectors.jl.

torfjelde commented on June 12, 2024

I've now tried to incorporate the above comments and Flows.jl into my previous proposal #27; any feedback would be nice:)

from bijectors.jl.

xukai92 commented on June 12, 2024

Closed by PR #27.

from bijectors.jl.

xukai92 commented on June 12, 2024

@JuliaRegistrator register()

from bijectors.jl.

JuliaRegistrator commented on June 12, 2024

Registration pull request created: JuliaRegistries/General/3061

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if Julia TagBot is installed, or can be done manually through the github interface, or via:

git tag -a v0.4.0 -m "<description of version>" 6ea2c373ac573295332c48557bbae7c3182edbc4
git push origin v0.4.0

from bijectors.jl.

yebai commented on June 12, 2024

@JuliaRegistrator register()

from bijectors.jl.

JuliaRegistrator commented on June 12, 2024

Registration pull request updated: JuliaRegistries/General/3061

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if Julia TagBot is installed, or can be done manually through the github interface, or via:

git tag -a v0.4.0 -m "<description of version>" 635a71a868a496f90fe9edea90fdf37f947c792d
git push origin v0.4.0

from bijectors.jl.

New interface about bijectors.jl HOT 25 CLOSED

Comments (25)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent