differentiableuniverseinitiative / dhod
Differentiable Halo Occupation Distributions
License: MIT License
Right now, GitHub has to install TensorFlow and halotools on every run of the CI script. Someone should read the manual here: https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows and figure out how to cache these pip packages.
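For reference, a minimal sketch of what the caching step could look like in the workflow file, assuming a standard pip setup; the exact paths, cache key, and requirements file are assumptions that would need to match our actual CI config:

```yaml
# Hypothetical caching step; path/key values are assumptions, not our actual config.
- uses: actions/cache@v2
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
- name: Install dependencies
  run: pip install tensorflow halotools
```

With a cache hit, pip reuses the downloaded wheels instead of fetching TensorFlow from scratch on every run.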
This issue is to track the development and testing of a gradient-based Bayesian inference method to estimate HOD parameters
Currently our defaults for the Zheng07 model differ from the halotools values. We're not sure why, but we should find out, decide on a set of defaults, and document where they come from in the documentation of these functions.
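For reference while we settle the defaults, here is a plain-NumPy sketch of the standard Zheng et al. (2007) occupation functions; the parameter values below are illustrative placeholders, not a proposal for the defaults themselves:

```python
import numpy as np
from scipy.special import erf

def ncen_zheng07(log_mhalo, log_mmin=12.0, sigma_logm=0.2):
    """Mean central occupation (Zheng et al. 2007).
    Default parameter values are placeholders for illustration."""
    return 0.5 * (1.0 + erf((log_mhalo - log_mmin) / sigma_logm))

def nsat_zheng07(log_mhalo, log_m0=12.0, log_m1=13.5, alpha=1.0,
                 log_mmin=12.0, sigma_logm=0.2):
    """Mean satellite occupation, modulated by the central occupation."""
    mhalo = 10.0 ** log_mhalo
    frac = np.clip((mhalo - 10.0 ** log_m0) / 10.0 ** log_m1, 0.0, None)
    return ncen_zheng07(log_mhalo, log_mmin, sigma_logm) * frac ** alpha

# At log_mhalo == log_mmin the central occupation is exactly 0.5.
print(ncen_zheng07(12.0))  # 0.5
```

Whatever defaults we pick, pinning them in one place like this (and citing where each number comes from) would make the halotools comparison easy to automate.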
I'm not really convinced we are doing the right thing by running an HMC over the stochastically sampled power spectrum. The only correct way to do this would be to draw several power spectra and use their mean, i.e. an estimator of the theoretical mean over N samples, where most likely N should be larger than 1. This is essentially the same age-old problem: you can't sample cosmological parameters without sampling all of the latent variables as well.
So... why don't we bite the bullet and just sample all of the latent variables of the model, i.e. the "activation" (whether the galaxy is on or off) of every single galaxy in the mock, at the same time as we sample the HOD parameters? It turns out this has little extra cost compared to what we were doing before, because the forward and backward passes are strictly the same; the only potential cost is storing the latent variables in the MCMC trace, but we can circumvent that.
Here is a proof of concept sampling only the centrals:
https://colab.research.google.com/drive/1jsYwqxvw05LmG6jmYzHa13t1kjhH8q0F?usp=sharing
And it works nicely:
The super nice thing about this approach is that I think you might not need a covariance matrix, only diagonal measurement errors on your power spectrum, because we track all the latent variables.
Anyways, this looks tractable to me, curious to hear what other people think.
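To make the proposal above concrete, here is a toy NumPy sketch (not the notebook's implementation) of the joint log-posterior over the HOD parameter and the per-galaxy activation latents; with relaxed, continuous activations an HMC sampler could target this density directly. All names and the stand-in summary statistic are illustrative:

```python
import numpy as np
from scipy.special import erf

def joint_log_prob(log_mmin, z, log_mhalo, data_pk, sigma_pk):
    """Toy joint density: log p(z | log_mmin) + log p(data | z).
    z are per-halo activation latents in [0, 1]; in the real model they
    would be relaxed Bernoulli samples. All names are illustrative."""
    # Prior on activations: Bernoulli probabilities from the central occupation
    p = 0.5 * (1.0 + erf((log_mhalo - log_mmin) / 0.2))
    p = np.clip(p, 1e-6, 1.0 - 1e-6)
    log_prior = np.sum(z * np.log(p) + (1.0 - z) * np.log(1.0 - p))
    # Likelihood: diagonal Gaussian on a stand-in summary statistic
    # (a real version would paint the galaxies and measure the power spectrum)
    model_pk = np.array([z.sum()])
    log_like = -0.5 * np.sum(((data_pk - model_pk) / sigma_pk) ** 2)
    return log_prior + log_like
```

The point of the sketch: because the likelihood is conditioned on the full set of latents, only diagonal measurement errors `sigma_pk` enter, which is the covariance-free property claimed above.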
Many thanks to @aphearin who pointed us to this reference:
This issue is to track comments during the implementation of this distribution.
I've started implementing it as a proper TensorFlow Probability distribution in u/EiffL/NFW. The good news is that approximately a month ago, the Lambert W function was added to TensorFlow Probability, so most of the work is already done for us :-D : https://github.com/tensorflow/probability/blob/d3dc1d657bc2386a86c69c445a8ae087e212cd05/tensorflow_probability/python/math/special.py#L141
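As a sanity check of why Lambert W is the key ingredient: radial inverse-CDF sampling from an NFW profile requires inverting the dimensionless enclosed-mass profile mu(x) = ln(1+x) - x/(1+x), which has a closed form in terms of W. A SciPy sketch (using scipy.special.lambertw as a stand-in for the TFP implementation):

```python
import numpy as np
from scipy.special import lambertw

def mu(x):
    """Dimensionless NFW enclosed-mass profile."""
    return np.log1p(x) - x / (1.0 + x)

def mu_inverse(p):
    """Invert mu(x) = p via the principal branch of Lambert W:
    with y = 1/(1+x), the equation rearranges to y * exp(-y) = exp(-(p+1))."""
    y = -np.real(lambertw(-np.exp(-(p + 1.0))))
    return 1.0 / y - 1.0

print(mu_inverse(mu(2.0)))  # recovers 2.0
```

Mapping a uniform variate u to p = u * mu(c) and applying `mu_inverse` then gives exact radial samples inside concentration c, with no rejection step.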
@all-contributors please add @EiffL for infrastructure and code
For now, the sampling functions for the HOD components are not batch-capable. We should change that so that we can sample several batches at once, for VI purposes ;-)
We have been slacking off on properly recognizing contributions
@all-contributors please add @modichirag for content
This issue is to track testing of the gradients by just doing gradient descent to fit the power spectrum (or even better the correlation function)
We need to add an example of how to use these things in the readme
This would be the holy grail: if we had a TF estimator that was relatively fast and differentiable, we could directly work at the level of correlation functions.
This issue is to add functionality to compute a 3D CIC mesh from a galaxy catalog, from which we can then try to compute the power spectrum.
@modichirag Do you want to take care of that? Essentially, this would involve using FlowPM to CIC paint a galaxy density field from a galaxy catalog
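For reference, a plain-NumPy sketch of first-order (cloud-in-cell) mass assignment on a periodic mesh; FlowPM's painting op would be the differentiable TF equivalent of this:

```python
import numpy as np

def cic_paint(positions, nmesh):
    """Deposit unit-mass particles on a periodic nmesh^3 grid with CIC weights.
    positions: (N, 3) array in grid units, i.e. values in [0, nmesh)."""
    mesh = np.zeros((nmesh, nmesh, nmesh))
    base = np.floor(positions).astype(int)
    frac = positions - base
    # Each particle contributes to the 8 surrounding cells with trilinear weights
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = (np.where(dx, frac[:, 0], 1 - frac[:, 0]) *
                     np.where(dy, frac[:, 1], 1 - frac[:, 1]) *
                     np.where(dz, frac[:, 2], 1 - frac[:, 2]))
                idx = (base + [dx, dy, dz]) % nmesh
                np.add.at(mesh, (idx[:, 0], idx[:, 1], idx[:, 2]), w)
    return mesh
```

Total deposited mass equals the number of particles, which makes a handy unit test for the TF version.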
Samples from tensorflow_probability.distributions.RelaxedBernoulli are not differentiable even when the temperature is increased to 10^6. The gradients from the snippet below come out as NaNs.
```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

Mhalo = tf.convert_to_tensor(np.random.uniform(10., 15., 1000), dtype=tf.float32)
siglogm = tf.convert_to_tensor(0.2, dtype=tf.float32)
_Mmin = tf.convert_to_tensor(12.5, dtype=tf.float32)  # value chosen for illustration

def Ncen(Mmin):
    # mean occupation of centrals
    return 0.5 * (1 + tf.math.erf((Mhalo - Mmin) / siglogm))

for temp in np.linspace(0.1, 1e6, 10).astype(np.float32):
    def _hod(Mmin):
        bern = tfp.distributions.RelaxedBernoulli(temp, probs=Ncen(Mmin))
        return bern.sample(seed=0)

    loss = lambda mm: _hod(mm)
    val, grad = tfp.math.value_and_gradient(loss, [_Mmin])
    print(grad)
```
@modichirag The folder models is too heavy to be stored directly in git; it makes cloning the project way too slow. I'm going to go ahead and erase it from history; we can always add it back as a git-lfs folder afterwards.
I couldn't reproduce one of the figures of the paper (from this notebook: https://github.com/DifferentiableUniverseInitiative/DHOD/blob/master/nb/Zheng2007_demo.ipynb), specifically the figure comparing gradients.
Turns out it's due to the RelaxedBernoulli being quite sensitive near 0 and 1 in newer versions of TFP. I'm trying to pin down exactly what changed, but I checked that with previous versions (0.9) we can get nice gradients even close to the edges: https://colab.research.google.com/drive/1xkXYtC3ER1z0r25fkUa2pwV_b7OlQBnP?usp=sharing
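One numerically safer formulation worth testing while we pin this down: sample the relaxed Bernoulli in logit space (the binary Gumbel-softmax / Concrete trick), so that log(p) and log(1-p) near probabilities 0 and 1 are never evaluated. A generic NumPy sketch of the reparameterization, not the TFP internals:

```python
import numpy as np

def relaxed_bernoulli_sample(logits, temperature, rng):
    """Binary Concrete sample, parameterized in logit space.
    Staying in logits avoids log(p) / log(1 - p) blowing up near 0 and 1."""
    u = rng.uniform(1e-7, 1 - 1e-7, size=np.shape(logits))
    logistic_noise = np.log(u) - np.log1p(-u)
    x = (logits + logistic_noise) / temperature
    # numerically stable sigmoid via tanh
    return 0.5 * (1.0 + np.tanh(0.5 * x))

rng = np.random.default_rng(0)
s = relaxed_bernoulli_sample(np.full(1000, 5.0), 0.5, rng)
```

Since the sample is a smooth function of `logits`, gradients stay finite even for very confident occupation probabilities.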
This issue is tracking the definition of a tensor-based structure for a halo catalog.
Halotools relies on astropy tables; we can't use those here, and most likely a dictionary of tensors will be just as good.
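A minimal sketch of what the dictionary-of-tensors structure could look like; the field names below are assumptions to be settled in this issue (mirroring halotools column names would let existing code map over easily):

```python
import numpy as np  # in the real code these would be wrapped by tf.convert_to_tensor

def make_halo_catalog(n):
    """Toy catalog: a plain dict of same-length arrays, one per halo property.
    Field names are illustrative, not a final schema."""
    rng = np.random.default_rng(42)
    return {
        "halo_mvir": 10.0 ** rng.uniform(10.0, 15.0, n),  # Msun/h
        "halo_x": rng.uniform(0.0, 100.0, n),             # Mpc/h
        "halo_y": rng.uniform(0.0, 100.0, n),
        "halo_z": rng.uniform(0.0, 100.0, n),
    }

cat = make_halo_catalog(5)
```

A dict of tensors keeps column access cheap, batches naturally, and needs no astropy dependency.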