
Comments (39)

bob-carpenter commented on May 18, 2024

from math.

bob-carpenter commented on May 18, 2024

I'm going ahead and adding this now that we have complex vectors and can code it the natural way.

syclik commented on May 18, 2024

From @betanalpha on August 16, 2013 14:40

Whoops, forgot about the imaginary components, so I guess something like

void fft(vector x, vector r, vector i);

syclik commented on May 18, 2024

@betanalpha, think we should still have this as part of the language?

syclik commented on May 18, 2024

From @betanalpha on May 22, 2014 20:47

There are lots of physics use cases, if only for preprocessing in
transformed data. The holdup was whether we wanted to (a)
find a way to limit functions to transformed data or (b) figure
out the analytic gradient for the FFT.

On May 22, 2014, at 9:44 PM, Daniel Lee [email protected] wrote:

@betanalpha, think we should still have this as part of the language?


Reply to this email directly or view it on GitHub.

syclik commented on May 18, 2024

From @bob-carpenter on May 23, 2014 13:35

And a bigger issue is data-only arguments. Because in some
cases, we might only want to implement some gradients for a function.

Definitely not impossible, but a fair bit of work in the parser.

On May 22, 2014, at 10:47 PM, Michael Betancourt [email protected] wrote:

There are lots of physics use cases, if only for preprocessing in
transformed data. The holdup was whether we wanted to (a)
find a way to limit functions to transformed data or (b) figure
out the analytic gradient for the FFT.

On May 22, 2014, at 9:44 PM, Daniel Lee [email protected] wrote:

@betanalpha, think we should still have this as part of the language?



syclik commented on May 18, 2024

From @mbrubake on May 23, 2014 15:35

So I totally think we should add FFT at some point as it's very useful for some models. The gradients are easy (the FFT and Inverse FFT are both linear operators) so we don't need to worry about data-only functions.

The problem with adding FFT is that we need some way to handle complex numbers. There are basically two options that aren't complete hacks:

  1. Actually implement a complex type which works just like real does.
  2. Add support for functions which return more than one parameter. Then, the FFT would simply take and return the real and imaginary coefficients separately.

Both of these options could potentially turn into a fair amount of work. That said, I vote for 2. For one, I don't think complex numbers are going to be widely enough used to be worth the pain of implementing and supporting them. On the other hand, allowing functions to return multiple values seems much more useful and may not be that hard if we make suitable use of boost tuples [http://www.boost.org/doc/libs/1_55_0/libs/tuple/doc/tuple_users_guide.html]

syclik commented on May 18, 2024

From @bob-carpenter on May 23, 2014 16:51

Yes, I really want to add multiple outputs. What they
call lists in the R world, I think, though of course, I'd want
to type the ones in Stan.

I don't know what the size limits are on Boost's tuples, but we probably don't
need huge lists.

I also really like the Python syntax for literals and assignments,
which lets you do things like (a,b,c) to create a list out of a, b,
and c. It's all kosher for us, because we can infer the type.

vector a;
vector b;
(a,b) <- fft(...);

We probably want to have the right co(ntra)variance with the type, so
that they follow assignment rules pointwise. That is, I can use
an (int,real) argument anywhere I can use a (real,real) argument and
I can use an (int,real) result anywhere I could use a (real,real) result.

  • Bob

On May 23, 2014, at 5:35 PM, Marcus Brubaker [email protected] wrote:

So I totally think we should add FFT at some point as it's very useful for some models. The gradients are easy (the FFT and Inverse FFT are both linear operators) so we don't need to worry about data-only functions.

The problem with adding FFT is that we need some way to handle complex numbers. There are basically two options that aren't complete hacks:

• Actually implement a complex type which works just like real does.
• Add support for functions which return more than one parameter. Then, the FFT would simply take and return the real and imaginary coefficients separately.
Both of these options could potentially turn into a fair amount of work. That said, I vote for 2. For one, I don't think complex numbers are going to be widely enough used to be worth the pain of implementing and supporting them. On the other hand, allowing functions to return multiple values seems much more useful and may not be that hard if we make suitable use of boost tuples [http://www.boost.org/doc/libs/1_55_0/libs/tuple/doc/tuple_users_guide.html]



syclik commented on May 18, 2024

From @mbrubake on May 23, 2014 20:08

I'm also a pretty big fan of the python syntax in this case, so that sounds
perfect to me. I'm pretty sure boost::tuple can handle as many options as
we will ever reasonably need. I'm hard pressed to imagine a function
returning more than 5 things.

One challenge we might run into is if we need to have different functions
for different numbers of returns. E.g., a function which does one thing if
it returns one versus two objects. Because we can't deduce template
parameters based on return type in C++, how these functions return values
would need to be different from a simple return value. Probably not hugely
important to begin with but maybe worth thinking about.

syclik commented on May 18, 2024

From @betanalpha on May 24, 2014 16:53

And a bigger issue is data-only arguments. Because in some
cases, we might only want to implement some gradients for a function.

Definitely not impossible, but a fair bit of work in the parser.

I was thinking of something simpler — functions that can only be
called in the transformed data block.

syclik commented on May 18, 2024

From @bob-carpenter on May 24, 2014 23:16

On May 23, 2014, at 10:08 PM, Marcus Brubaker [email protected] wrote:

I'm also a pretty big fan of the python syntax in this case, so that sounds
perfect to me. I'm pretty sure boost::tuple can handle as many options as
we will ever reasonably need. I'm hard pressed to imagine a function
returning more than 5 things.

No, but I can easily imagine an R programmer stuffing 45 things
into a list.

One challenge we might run into is if we need to have different functions
for different numbers of returns. E.G., a function which does one thing if
it returns one versus two objects. Because we can't deduce template
parameters based on return type in C++, how these functions return values
would need to be different from a simple return value. Probably not hugely
important to begin with but maybe worth thinking about.

We do the same thing as C++ in not allowing two different
functions with the same inputs and different return types.
I don't think that's usually a problem though --- when it's
happened to me it's always been a mistake.

  • Bob

syclik commented on May 18, 2024

From @bob-carpenter on May 24, 2014 23:17

On May 24, 2014, at 6:53 PM, Michael Betancourt [email protected] wrote:

And a bigger issue is data-only arguments. Because in some
cases, we might only want to implement some gradients for a function.

Definitely not impossible, but a fair bit of work in the parser.

I was thinking of something simpler — functions that can only be
called in the transformed data block.

That would definitely be easier to implement.

  • Bob

syclik commented on May 18, 2024

From @mbrubake on May 24, 2014 23:26

On Sat, May 24, 2014 at 7:17 PM, Bob Carpenter [email protected]:

On May 24, 2014, at 6:53 PM, Michael Betancourt [email protected]
wrote:

And a bigger issue is data-only arguments. Because in some
cases, we might only want to implement some gradients for a function.

Definitely not impossible, but a fair bit of work in the parser.

I was thinking of something simpler — functions that can only be
called in the transformed data block.

That would definitely be easier to implement.

I don't understand why we need to though, at least for FFT. The analytic
gradients are pretty easy. Heck, I'll volunteer to do them if someone else
goes through the trouble of implementing everything else but is having
trouble with that.

Cheers,
Marcus

syclik commented on May 18, 2024

From @bob-carpenter on May 24, 2014 23:53

It sounds like we don't need to do this for FFT.

But there are things like cumulative distribution functions
and some crazy densities that are easy to differentiate for
a few but not all parameters.

But let's cross that bridge when we come to it.

  • Bob

On May 25, 2014, at 1:26 AM, Marcus Brubaker [email protected] wrote:

On Sat, May 24, 2014 at 7:17 PM, Bob Carpenter [email protected]:

On May 24, 2014, at 6:53 PM, Michael Betancourt [email protected]
wrote:

And a bigger issue is data-only arguments. Because in some
cases, we might only want to implement some gradients for a function.

Definitely not impossible, but a fair bit of work in the parser.

I was thinking of something simpler — functions that can only be
called in the transformed data block.

That would definitely be easier to implement.

I don't understand why we need to though, at least for FFT. The analytic
gradients are pretty easy. Heck, I'll volunteer to do them if someone else
goes through the trouble of implementing everything else but is having
trouble with that.

Cheers,
Marcus


ryanpdwyer commented on May 18, 2024

As another physical science user, I'd also be very interested in fft / ifft.

syclik commented on May 18, 2024

We can worry about the language details once this gets into the math library. This should be an easy enough project for someone wanting to contribute to the math library.

bob-carpenter commented on May 18, 2024

Is there a pointer to the analytic gradient definitions somewhere?

mbrubake commented on May 18, 2024

The analytic gradients are the easy part of this.

d fft(y) = fft( dy )

The hard part is dealing with complex values and/or multiple return values.
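Since the DFT is a linear map, a perturbation in the input passes straight through to the output; a quick numpy sketch of the claim (numpy standing in for whatever FFT backend the math library would use):

```python
import numpy as np

# d fft(y) = fft(dy): the FFT of a perturbed input differs from the FFT of
# the original input by exactly the FFT of the perturbation.
rng = np.random.default_rng(0)
y = rng.standard_normal(16)
dy = 1e-6 * rng.standard_normal(16)

lhs = np.fft.fft(y + dy) - np.fft.fft(y)  # change in the output
rhs = np.fft.fft(dy)                      # FFT of the input change

assert np.allclose(lhs, rhs)              # equal up to float rounding
```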

thelseraphim commented on May 18, 2024

Regarding the analytic gradients, we should be a little bit careful here. The FFT is certainly linear when its target is complex numbers, but the amplitudes and phases are not at all linear in the input. The analytic gradient for amplitude squared certainly isn't bad, but the analytic gradient for phase is slightly annoying to write down without being able to manipulate complex numbers. I'd like to vote for doing this with a complex type.
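A small numpy check of this point: the transform itself is linear, but the amplitudes are not linear in the input.

```python
import numpy as np

# Amplitudes are not additive: amp(fft(x + y)) != amp(fft(x)) + amp(fft(y)).
x = np.array([1.0, 0.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0, 0.0])

amp = lambda v: np.abs(np.fft.fft(v))
assert not np.allclose(amp(x + y), amp(x) + amp(y))

# Phase needs atan2(imag, real), whose gradient blows up at zero amplitude.
phase = np.angle(np.fft.fft(x + y))
```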

bob-carpenter commented on May 18, 2024

Will the complex type only need double real and imaginary
components?

Does Eigen have the complex support for the operations you
need? It's not complete, but it's getting better.

If both are yes, then I don't see a problem in terms of how all
the autodiff works. I have no idea about efficiency, accuracy
and robustness.

I've been told before that when I grow up I'll learn to think of
matrix operations in terms of complex numbers.

On Oct 12, 2016, at 4:16 PM, thelseraphim [email protected] wrote:

Regarding the analytic gradients, we should be a little bit careful here. the FFT is certainly linear when its target is complex numbers, but the amplitudes and phases are not at all linear in the input. The analytic gradient for amplitude squared certainly isn't bad, but the analytic gradient for phase is slightly annoying to write down without being able to manipulate complex numbers. I'd like to vote for doing this with a complex type.



thelseraphim commented on May 18, 2024

Planning to bring this up in the meeting tomorrow, but I had a chat with @bob-carpenter and @betanalpha and we settled on implementing a family of fft functions without creating a complex type. fft_amp2 will give the squared amplitudes, fft_phase will give the phases, and an as-yet-unnamed function will return the amplitudes and phases packaged as a 2 x n Eigen matrix. We'll probably also give the real and imaginary components and the corresponding 2 x n matrix, for convenience of computing the inverse Fourier transform in the future.

bob-carpenter commented on May 18, 2024

@thelseraphim Could you write out the Stan language signatures for the functions you're proposing just to make sure we're on the same page before you build them? Thanks.

bob-carpenter commented on May 18, 2024

From @wds15 on the stan-dev list:

One of the very cool applications of an FFT (and its inverse) is the possibility of calculating a convolution as a simple product of two functions in the transformed space. This is for example handy if one wants to get the difference distribution of two quantities; one simply FFT-transforms the functions, multiplies them, and then takes the inverse of that product.
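The convolution theorem in a minimal numpy sketch (zero-padding to length len(a) + len(b) - 1 so the circular convolution matches the ordinary linear one):

```python
import numpy as np

# Pointwise product in frequency space equals convolution in signal space.
a = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, -1.0])

n = len(a) + len(b) - 1  # pad so circular convolution = linear convolution
conv_fft = np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

assert np.allclose(conv_fft, np.convolve(a, b))
```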

bob-carpenter commented on May 18, 2024

@wds15: This is one of the reasons I reopened the issue---other people were asking for the same kind of convolutions.

thelseraphim commented on May 18, 2024

Just wanted to record the result of conversations offline and on Discourse in the thread here. The plan initially is to implement the following function signatures, as @betanalpha spelled out on Discourse:

real[] fft_amps(real[] x);
real[] fft_phases(real[] x);
real[,] fft_amps_phases(real[] x);

We'll also want the matrix versions of these functions:

vector fft_amps(vector x);
vector fft_phases(vector x);
matrix fft_amps_phases(vector x);

adding real[,] fft_real_imag(real[] x) and its matrix version should be only infinitesimally more work and is much friendlier to future messing around with convolutions, ifft, etc. So that may or may not go in on the first pass.
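For concreteness, a hypothetical numpy sketch of the semantics these signatures would have (fft_amps, fft_phases, fft_amps_phases, and fft_real_imag are the proposed Stan names from this thread, not existing functions anywhere):

```python
import numpy as np

def fft_amps(x):
    """Amplitudes of the DFT of x."""
    return np.abs(np.fft.fft(x))

def fft_phases(x):
    """Phases of the DFT of x."""
    return np.angle(np.fft.fft(x))

def fft_amps_phases(x):
    """Amplitudes and phases stacked as a 2 x n matrix, per the proposal."""
    f = np.fft.fft(x)
    return np.vstack([np.abs(f), np.angle(f)])

def fft_real_imag(x):
    """Real and imaginary components stacked as a 2 x n matrix."""
    f = np.fft.fft(x)
    return np.vstack([f.real, f.imag])
```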

bob-carpenter commented on May 18, 2024

mbrubake commented on May 18, 2024

Feel free to ignore me since I'm not active at the moment, but I figured I'd thrown in my $0.02.

  • The choice of amplitudes and phases seems like a bad idea. In particular, it introduces a significant non-linearity (with a singularity no less) into a transformation which is otherwise linear. Doing it in terms of real and imaginary components retains this linearity, would generally be faster and is much more standard. It is then easy to convert to amplitudes and phases if you need them with calls to basic functions.
  • Having just a single function which returns an array of vectors/matrices is definitely better than separate functions and I would question whether it's worth actually having the separate functions.
  • As for vector[] vs. matrix, I think vector[] makes a little more sense. Plus, it would make it possible to consistently extend to a 2D FFT, i.e., to have a matrix[] fft2(matrix x); function.

bob-carpenter commented on May 18, 2024

ryanpdwyer commented on May 18, 2024

As another interested user, I agree with @mbrubake that real and imaginary parts are what would be useful.
Here is a brief example of how I would use it:

My measured experimental data points are vector[N] y. The measured y is a discrete convolution of a known response function vector[H] h and a sample variable vector[N+H-1] x.
I have a model for x which I fit using the data y. To do this, I need to compute the discrete convolution for the modeled x:

for (i in 1:N) {
  # Discrete convolution, using only the region where kernel
  # and model overlap fully (convolve mode='valid' in scipy or Matlab)
  y_modeled[i] <- dot_product(h, segment(x_modeled, i, H));
}
y ~ normal(y_modeled, sigma);

This is very slow if H is large. The discrete convolution can be calculated much faster by taking the fft of x_model and h, multiplying the Fourier transforms, and inverse transforming.
With N = 10,000 and H=1,000, I get a factor of 3-5 speedup using the FFT to compute the convolution (timings from scipy).
My model code, if Stan implemented fft and ifft, would look like:

x_fft <- fft(x_model); # Defined to return an array of 2 vectors
x_re <- x_fft[1];
x_im <- x_fft[2];
y_re <- h_re .* x_re - h_im .* x_im;  # Assuming h_im, h_re passed in as data
y_im <- h_re .* x_im + h_im .* x_re;
y_model <- ifft(y_re, y_im);

y ~ normal(y_model, sigma);
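As a sanity check of the equivalence (a numpy sketch with made-up sizes, not Stan code): the direct loop and the FFT route agree once the kernel is reversed, since the dot-product form above is a correlation rather than a convolution.

```python
import numpy as np

# y[i] = dot(h, x[i : i+H]) computed two ways: directly, and via the FFT.
rng = np.random.default_rng(1)
N, H = 200, 32
x = rng.standard_normal(N + H - 1)
h = rng.standard_normal(H)

# Direct loop, as in the Stan snippet above
y_direct = np.array([h @ x[i:i + H] for i in range(N)])

# FFT route: full linear convolution with the reversed kernel,
# then keep the 'valid' region of length N.
n = len(x) + H - 1
y_full = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h[::-1], n), n)
y_fft = y_full[H - 1:H - 1 + N]

assert np.allclose(y_direct, y_fft)
```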

bob-carpenter commented on May 18, 2024

rainwoodman commented on May 18, 2024

FFT is how we compute gravity forces, with which we can build a gravity-solver model in Stan.
We have been exploring this using some customized tools for a while, but would like to explore porting the heavy lifting to Stan.

Gradients of complex numbers have been worked out. Some explanations were given by the autograd authors at HIPS/autograd#179. Real-complex transformations are more useful in our cases -- slightly more complicated due to the Hermitian constraints, but not that much more difficult.

bob-carpenter commented on May 18, 2024

@rainwoodman We're working on adding some basic FFT functionality. Can you be more specific about what function you'd like to see in Stan?

rainwoodman commented on May 18, 2024

I have to be honest that I have not built anything with Stan yet -- I just got very inspired by chapter 19 of the manual, because our model is a PDE solver.

FFT (real-to-complex and complex-to-real) is half of what we need. We also need a few 'convolution' operators, which I have named paint/readout.

Here is a write up of the operators:

https://bids.berkeley.edu/news/automatic-differentiation-and-cosmology-simulation

Jump to "Two Useful Operators in Particle-Mesh Solvers"

The 'Psi' operators (vector-Jacobian products in autograd) are useful for back-propagation of the gradients. I was using the naive gradient convention for complex-number derivatives -- it makes the real-complex gradients easier to write, but it is mathematically uglier than the correct Wirtinger derivatives. If Stan has never dealt with complex gradients, then it is important to make a decision here.

Once the FFT lands we can worry about the paint/readout operators. They are currently implemented in C and packaged in a Python package. [1] It wouldn't be hard to lift them out into Stan, I imagine. I can contribute a plugin if you point me to where to look.

Eventually we will worry about parallelism -- we deal with millions of parameters and want to push to billions. Our current tools use MPI, since it is not fault tolerant and thus easier to program (at least for us, with an HPC background). Is there a desire along these lines within Stan development?

[1] https://github.com/rainwoodman/pmesh/blob/master/pmesh/_window.pyx

betanalpha commented on May 18, 2024

rainwoodman commented on May 18, 2024

Thanks a lot!

I hadn't thought about evolving the gradient of the likelihood as an additional set of variables.

Is it similar to page 12 of http://wwwu.uni-klu.ac.at/bkaltenb/AD12_kaltenbacher.pdf ?

Could you share a reference or a keyword for me to look up the work on the likelihood with N-body solvers? That is exactly the problem we are facing -- we currently use back-propagation to compute the gradients. Our parameter is the initial condition (O(10^7)), so a joint forward modelling of the sensitivity may not work.

betanalpha commented on May 18, 2024

rainwoodman commented on May 18, 2024

Ah, I see. Page 40, chapter 13.4: the coupled size N * (N + 1) is what drove us away from the coupled-variable approach. N is O(10^7) in our case. We may want to reformulate our problem.

betanalpha commented on May 18, 2024

rainwoodman commented on May 18, 2024

I somehow had the wrong impression that the adjoint method was identical to back-propagation, and missed the entire literature. We shall look into it more carefully. Thanks!
