
naomi-aghq

Code for the manuscript Howes, Stringer, Flaxman and Eaton "Fast approximate Bayesian inference of HIV indicators using PCA adaptive Gauss-Hermite quadrature" (in preparation).

Naomi (Eaton et al, 2021) is a spatial evidence synthesis model used to produce district-level HIV epidemic indicators in sub-Saharan Africa. Multiple outcomes of interest, including HIV prevalence, HIV incidence and treatment coverage, are jointly modelled using both household survey data and routinely reported health system data. The model is provided as a tool with which countries input their data and generate estimates during a yearly process supported by UNAIDS. Currently, inference is conducted using empirical Bayes and a Gaussian approximation via the TMB R package. We propose a new inference method extending adaptive Gauss-Hermite quadrature to deal with more than 20 hyperparameters, enabling fast and accurate inference for Naomi and other extended latent Gaussian models. Using data from Malawi, our method improves the accuracy of inferences across a range of metrics, while being substantially faster to run than Hamiltonian Monte Carlo with the No-U-Turn sampler. Our implementation is based on the existing TMB C++ template for the model's log-posterior, and is compatible with any model with such a template.

Top: Illustration of Gauss-Hermite quadrature, adaption, and the principal components analysis approach we use. Bottom: Example district-level Naomi model outputs for adults aged 15-49.
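To illustrate the quadrature building block described above, here is a minimal one-dimensional sketch in base R: Gauss-Hermite nodes and weights obtained via the Golub-Welsch eigenvalue method, then adapted using the mode and standard deviation of a Gaussian approximation to the integrand. This is a toy illustration only, not the PCA-AGHQ implementation used in the paper.

```r
# One-dimensional adapted Gauss-Hermite quadrature, sketched in base R.
# Illustration only: not the PCA-AGHQ implementation from the paper.

gauss_hermite <- function(k) {
  # Golub-Welsch: nodes are eigenvalues of the Jacobi matrix for the
  # (physicists') Hermite polynomials; weights come from the eigenvectors
  if (k == 1) return(list(nodes = 0, weights = sqrt(pi)))
  b <- sqrt(seq_len(k - 1) / 2)
  J <- matrix(0, k, k)
  J[cbind(seq_len(k - 1), seq_len(k - 1) + 1)] <- b
  J[cbind(seq_len(k - 1) + 1, seq_len(k - 1))] <- b
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = sqrt(pi) * e$vectors[1, ]^2)
}

adapted_quadrature <- function(logf, mode, sd, k = 3) {
  # Adaption: centre and scale the nodes using a Gaussian approximation
  # N(mode, sd^2), and undo the exp(-z^2) Gauss-Hermite weight function
  q <- gauss_hermite(k)
  x <- mode + sqrt(2) * sd * q$nodes
  sqrt(2) * sd * sum(q$weights * exp(q$nodes^2 + logf(x)))
}

# A Gaussian integrand is integrated exactly when adapted at its own mode and sd
adapted_quadrature(function(x) dnorm(x, 3, 2, log = TRUE), mode = 3, sd = 2)
```

When the integrand is a normalised Gaussian density and the adaption matches its mode and standard deviation, the rule returns 1 up to floating-point error, which makes a convenient sanity check.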

C++ template for the log-posterior

The TMB template for the simplified Naomi model is available here.

File structure

The directories of this repository are:

  • docs: Hosted files at https://athowes.github.io/naomi-aghq/
  • make: Scripts used to run the reports; _make.R runs everything in order
  • misc: Ideas for further work and other documents that don't need to be in src
  • src: All orderly reports
  • utils: Helper scripts for common development tasks

orderly

We use the orderly package (RESIDE, 2020) to simplify the process of doing reproducible research. After installing orderly (from either CRAN or GitHub), a report such as example may be run with:

orderly::orderly_run(name = "example")

The results of this run will appear in the draft/ folder (ignored on GitHub). To commit the draft (with associated id) to the archive/ folder (also ignored on GitHub, and treated as "read only") use:

orderly::orderly_commit(id)

Any outputs of this report will then be available to use as dependencies within other reports. Reports can be pushed to the HIV inference group SharePoint (the remote) using:

orderly::orderly_push_archive("example")

Or can be pulled (alongside any dependencies) from the remote using:

orderly::orderly_pull_archive("example")

Alternatively, just the dependencies can be pulled using orderly::orderly_pull_dependencies("example").

R package dependencies

This repository is supported by the inf.utils package, which can be installed from GitHub via:

devtools::install_github("athowes/inf.utils")

We also use the aghq package for inference, which is available from CRAN, though the latest development version can be installed from GitHub via:

devtools::install_github("athowes/aghq", ref = "adam-dev")

Session information

The sessionInfo() used to run this analysis is:

R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 13.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] patchwork_1.1.2 stringr_1.5.0   purrr_1.0.1     readr_2.1.3     tidyr_1.2.1     tibble_3.2.1    tidyverse_1.3.1 forcats_0.5.2   ggplot2_3.4.0  
[10] dplyr_1.0.10    rmarkdown_2.18 

Reports

The reports within src are as follows:

  • astro: Analysis of an astronomy example
  • check_hyper-marginals: Visualise hyperparameter posteriors fitted with NUTS for simplified Naomi, as well as the positions of PCA-AGHQ nodes
  • check_pca-aghq: Unit tests that the PCA-AGHQ procedure works as expected
  • check_sd-estimation: Analysis of odd behaviour for simplified Naomi whereby the joint Gaussian approximation estimates the standard deviation better than the conditionals do
  • check_tmb-aghq-k1: Assessment that TMB and aghq run with k = 1 produce essentially the same inferences for simplified Naomi, in contrast to inferences from aghq with any grid
  • check_tmb-output: Check that the TMB output for simplified Naomi looks Gaussian, as it should
  • dev_hyper-sampling: Method development for sampling from the hyperparameters within TMB or aghq
  • dev_scale-grid: Method development for hyperparameter grids over many-dimensional spaces
  • dev_sinla: Method development for Laplace and reduced-cost Laplace marginals
  • docs_01-04-20-mini: Presentation for the StatML CDT
  • docs_01-07-21-stats-epi-group: Presentation for the Flaxman and Bhatt groups
  • docs_15-11-22-seminar: Presentation for the Waterloo SAS
  • docs_18-04-23-lab-group: Presentation for the HIV Inference Group
  • docs_19-05-23-turing: Presentation for the Alan Turing Institute
  • docs_21-06-23-mlgh: Presentation for the Machine Learning and Global Health Network
  • docs_bayescomp-poster: Poster for BayesComp 2023
  • docs_bioinference-poster: Poster for BioInference 2023
  • docs_model-structure: Analysis of inputs to simplified Naomi outputs
  • docs_paper: Paper and appendix
  • epil: Analysis of the epilepsy example
  • example_inla-grid: Demonstration of the grid construction method used in R-INLA
  • example_inla-replication: Step-by-step version of the INLA method in R and TMB
  • example_naomi: Follow the naomi package vignette
  • explore_aghq: Walkthrough of various aghq package functions
  • explore_posterior-comparison: Explore methods for comparing the accuracy of computed posterior distributions
  • naomi-simple_contraction: Comparison of prior and posterior standard deviations across inference methods
  • naomi-simple_exceedance: Case study on computing exceedance probabilities for the second 90 target and high incidence
  • naomi-simple_fit: Fit the simplified Naomi model using a range of inference methods
  • naomi-simple_increase-s-k: Is the log normalising constant better estimated as $s$ and $k$ are increased in PCA-AGHQ?
  • naomi-simple_ks: Comparison of inference methods using Kolmogorov-Smirnov tests on marginals
  • naomi-simple_mcmc: Diagnostics for MCMC convergence
  • naomi-simple_mmd: Comparison of inference methods using maximum mean discrepancy on joint distributions
  • naomi-simple_model-checks: Checks of the model fit to data
  • naomi-simple_point-estimates: Comparison of inference methods for estimating the mean and standard deviation
  • naomi-simple_psis: Comparison of inference methods using Pareto-smoothed importance sampling on joint distributions
  • plot-tikz_algorithm-flowchart: TikZ diagram of the proposed algorithm
  • plot-tikz_simplified-naomi: TikZ diagram of the simplified Naomi directed acyclic graph
  • prev-anc-art_model0: Fit prevalence, ANC, ART model version 0
  • prev-anc-art_model1: Fit prevalence, ANC, ART model version 1
  • prev-anc-art_model2: Fit prevalence, ANC, ART model version 2
  • prev-anc-art_model3: Fit prevalence, ANC, ART model version 3
  • prev-anc-art_model4: Fit prevalence, ANC, ART model version 4
  • prev-anc-art_results: Analyse results of the prevalence, ANC, ART models
  • prev-anc-art_sim: Simulate prevalence, ANC, ART data

Contributors

athowes, jeffeaton


Issues

Faster matrix algebra for Laplace marginals

  • Scope out the possibility of more control over the evaluation of MakeADFun, especially the matrix algebra. This can be started by first understanding the internals of MakeADFun. Could the Laplace approximation be computed outside of TMB, with parts replaced to speed it up as required?
    • Put MakeADFun into a development script
    • Click through MakeADFun and write comments on which part does what. Try to find the part which is doing the matrix algebra.

Scale up the number of grid points

  • Test how much longer AGHQ hyperparameter quadrature takes as a function of number of grid points
  • Test how much longer a given Laplace marginal takes as a function of number of grid points
  • Test whether accuracy of latent field Laplace approximations increases
  • Test whether accuracy of mixture of Gaussians increases

Test random mixture model

Jeff points to a repo of Kasper's looking for "Simple examples for which the Laplace approximation is inaccurate":

It links to this git repo, which has an example of a false convergence error and unpacks it a bit: https://github.com/kaskr/laplace_accuracy

The model that Kasper suggests is causing trouble is:

u1 ~ N(mu1, sd1)
u2 ~ N(mu2, sd2)
N ~ Pois(exp(u1) + exp(u2))

We should try this model out with aghq and develop intuition as to why it might do better than TMB in this case.
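A minimal base-R simulation of this model may help when experimenting; the parameter values below are illustrative choices, not taken from the linked issue.

```r
# Simulate from the two-component log-Gaussian Poisson model above.
# Parameter values here are illustrative, not from Kasper's example.
simulate_model <- function(n, mu1, sd1, mu2, sd2) {
  u1 <- rnorm(n, mu1, sd1)
  u2 <- rnorm(n, mu2, sd2)
  rpois(n, lambda = exp(u1) + exp(u2))
}

set.seed(1)
y <- simulate_model(n = 100, mu1 = 0, sd1 = 1, mu2 = 1, sd2 = 0.5)
```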

Complete full Laplace implementation and understanding

  • Add KS statistic for all indices i in 1:30
  • Plot KS of the Gaussian against KS of the Laplace with an x = y line for reference
  • Check that tmbstan isn't sampling from h even though it's passed h not f
  • Resolve bug causing notebook i = 1 to be different
  • Move clean version to script
  • Extend away from empirical Bayes to integration with aghq over the hyperparameters
  • Make distributions plot larger in notebook
  • Write-up maths into paper.Rmd

Run `k = 5` options for `aghq`

e.g.

  • k = 5 and s = 5 would be 3125 points and take ~0.5 hours
  • k = 5 and s = 6 would be 15625 points and take ~1.5 hours

How do these settings perform on the KS tests?
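Assuming s counts the dimensions kept in a dense product grid, with k quadrature nodes in each, the number of points grows as k^s:

```r
# Points in a dense product grid: k quadrature nodes in each of s dimensions.
# This exponential growth is why scaling up k or s quickly becomes expensive.
grid_points <- function(k, s) k^s

grid_points(5, 5)  # 3125
grid_points(5, 6)  # 15625
```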

Posterior contraction assessment

  • Which parameters are poorly identified?
  • Are the posteriors from TMB systematically wider than those from aghq or tmbstan? Perhaps, because less uncertainty in the hyperparameters is taken into account

Better ways to compare the posterior output

From Alex:

is there anything that stands out to you

I think looking at the mean and SD is a bit too simplistic. With large data and under "standard" conditions, the mean and the mode will be similar (think Bernstein-von Mises, convergence of the posterior to a Gaussian), so I would argue that any approximation method really should be able to pick up the mean well. And being interested at all in the second moment sort of implies you believe the posterior is Gaussian anyway, in which case aghq would be exact. So it's definitely encouraging that the two approximations (aghq and MCMC) are close in the first two moments, but it doesn't really tell the whole story.

I think it would be more instructive to measure something to do with the tails. I usually compare the KS distance (the maximal difference between empirical CDFs) between the samples returned by tmbstan and those from aghq::sample_marginal, marginally for each parameter. You can't fool the KS: if the tails aren't captured, it will be large. You can use ks.test() in R, which also gives a p-value; irrespective of philosophy and distributional assumptions, this does give you a rough measure of what KS values are "large". It's also directly interpretable: KS = 0.05 means that no matter where you compute a tail probability, you're never more than 5 percentage points away from what you'd get using tmbstan. I do this in the software vignette, https://arxiv.org/pdf/2101.04468.pdf, Table 3 for example.
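The comparison Alex describes can be sketched in base R; the rnorm draws below stand in for tmbstan and aghq::sample_marginal output for a single parameter.

```r
# KS comparison of two sets of samples for one parameter; the rnorm
# draws are stand-ins for NUTS and approximate-method samples
set.seed(1)
gold <- rnorm(1000)                  # stand-in for "gold standard" samples
approx_draws <- rnorm(1000, sd = 1.2)  # stand-in for approximate samples
ks <- ks.test(gold, approx_draws)
ks$statistic  # maximal difference between the two empirical CDFs
```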

The KS statistics only capture the marginal distributions. For the joint, I haven't yet explored this too much, but one potentially compelling option is Pareto-smoothed importance sampling (PSIS, see for example https://arxiv.org/pdf/1802.02538.pdf). You can use the loo::psis function for this; I have gotten it to work before but I can't seem to find the code (sorry about that). It gives you an interpretable, scalar summary of the agreement between the approximate and true joint distributions, which is a very useful thing, especially in high dimensions.
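The raw ingredient of PSIS can be sketched in base R without the loo package: compute importance ratios between the target and approximate log-densities, then summarise them. This is plain (unsmoothed) importance sampling; loo::psis would additionally fit a Pareto distribution to the ratio tail. The densities below are illustrative stand-ins.

```r
# Plain importance-sampling ratios between a target density and an
# approximation; loo::psis additionally Pareto-smooths the ratio tail
set.seed(1)
draws <- rnorm(2000, sd = 1.2)                    # draws from the approximation
log_ratio <- dnorm(draws, log = TRUE) -           # target log-density
  dnorm(draws, sd = 1.2, log = TRUE)              # approximation log-density
w <- exp(log_ratio - max(log_ratio))              # stabilised weights in (0, 1]
ess <- sum(w)^2 / sum(w^2)  # effective sample size of the importance weights
```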

Add ESS for Naomi output

The ESS is just a function of the chains, so it should be possible to calculate it externally to tmbstan.
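As a sketch of what "externally to tmbstan" could look like, here is a rough single-chain ESS estimate in base R, using the initial positive autocorrelations; it is a simplified stand-in for the estimators in rstan or the posterior package, not the estimator they actually use.

```r
# Rough effective sample size for one chain from its autocorrelations,
# truncating the sum at the first non-positive autocorrelation
ess_chain <- function(x) {
  n <- length(x)
  rho <- acf(x, lag.max = n - 1, plot = FALSE)$acf[-1]
  nonpos <- which(rho <= 0)
  cut <- if (length(nonpos) > 0) nonpos[1] - 1 else length(rho)
  n / (1 + 2 * sum(rho[seq_len(cut)]))
}

set.seed(1)
iid <- rnorm(1000)                                             # independent draws
ar1 <- as.numeric(filter(rnorm(1000), 0.9, method = "recursive"))  # correlated chain
```

For independent draws the estimate is close to the chain length, while the strongly autocorrelated AR(1) chain gives a much smaller ESS.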

Add MCMC diagnostics

If we are going to use tmbstan as the gold standard, we need to check that the chains look alright (and eventually use longer chains).
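One such check can be computed directly from the chains: a basic potential scale reduction factor (R-hat). This is a simplified version of the diagnostic reported by rstan's summary, not the split-chain estimator it actually uses.

```r
# Basic potential scale reduction factor (R-hat) across chains;
# values near 1 suggest the chains agree, values well above 1 do not
rhat <- function(chains) {
  # chains: iterations x chains matrix
  n <- nrow(chains)
  B <- n * var(colMeans(chains))      # between-chain variance
  W <- mean(apply(chains, 2, var))    # mean within-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(1)
mixed <- matrix(rnorm(4000), nrow = 1000, ncol = 4)  # well-mixed chains
stuck <- mixed; stuck[, 1] <- stuck[, 1] + 5         # one chain off target
```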

Laplace marginals development

  • Expose internals of fit_aghq and run through
  • Create TMB template and objective function for one named latent field parameter and index
  • Attain Laplace marginal for one named latent field parameter with empirical Bayes hypers
  • Attain Laplace marginal for one named latent field parameter with grid hypers
  • Generalise code to sweep over all indices of one named latent field parameter
    • I decided this doesn't need to be done, and we can just sweep over all indices once all the named latent field parameters are together
  • Generate marginals for beta_rho[1] to compare to -- just use those from naomi-simple_compare
  • Understand what's going wrong when the laplace_marginal errors due to the negative weighted posterior outweighing the positive weighted posterior in some location
    • Is there any way to sample from the mixture of Gaussians with a sparse grid and possibly negative weights?
  • Start the optimisation within the for(z in ...) loop for each evaluation of obj$fn as close to the eventual optimum as possible -- likely at the mode of the Gaussian approximation -- by using the random.start option
    • Benchmark exactly how much cheaper that has made things
  • Use the suggested cheaper code to calculate the diagonal of the inverse Hessian
  • Generalise code to sweep over all latent field parameters
    • Strategy to do this is to concatenate all the names into x_i and x_minus_i, then reconstruct x, then from x generate all the named parameters based on their position
  • Create function fit_name_here which brings this all together
  • Fit k = 1 with Laplace marginals
  • Fit k = 2 or k = 3 sparse with Laplace marginals
    • I think the sparsity causes a problem for the weights
  • A way to sample from the spline interpolated Laplace marginals
  • Create sample_adam function to sample from i = 1, ... spline interpolated Laplace marginals
  • Compare output from sample_adam to the other inference methods e.g. using KS tests
  • #23
  • Expose internals of sample_aghq to get a feel again for how the latent field and hyperparameter marginals are being used and run through
    • Blocker: can't fit product grid aghq with k > 1.
      • Possible solution: lower area_level to 3 rather than 4. But this won't fix exponential in number of hypers problem
      • Possible solution: create version of naomi_simple.cpp with fewer hypers. But this would be annoying to do
      • Possible solution: find a way to fix some of the hypers, probably with the map option
  • See if it's possible to use DATA_UPDATE macros to avoid recreating the TMB template for each value of i
    • This looks to be impossible: TMB doesn't like it when I if-else branch after updating the data
  • #22
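The concatenate-and-reconstruct strategy mentioned in the list above can be sketched in base R; the function names here are hypothetical, chosen only to illustrate the index bookkeeping.

```r
# Hypothetical sketch of the index-splitting step: pull out x_i and
# x_minus_i, then reconstruct the full latent field x by position
split_x <- function(x, i) list(x_i = x[i], x_minus_i = x[-i])

reconstruct_x <- function(x_i, x_minus_i, i, d) {
  x <- numeric(d)
  x[i] <- x_i
  x[-i] <- x_minus_i
  x
}

parts <- split_x(c(0.3, -1.2, 0.8, 2.0), i = 2)
reconstruct_x(parts$x_i, parts$x_minus_i, i = 2, d = 4)
```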

Simulate from prior using `tmbstan`

Howes, Adam T
Does someone know if it's possible to generate data from the prior with tmbstan? Something like an objective function without any contribution from the likelihood (but data can't be NA)?

Wolock, Tim
I don't know of a built-in way in TMB or Stan. You really just want y, not X, to be NA, right? Could you add a use_likelihood object to your data list? You would have something like `if (use_likelihood == 1) nll -= dnorm(y, mu, sigma, true);`. If you "fit" the model, you should just get prior samples of the distributional parameters, right? If you're generating posterior predictive samples in a SIMULATE block, you could also get prior predictive samples from TMB with obj$simulate(x). (Taking this opportunity to plug numpyro, which makes conditioning optional.)

Howes, Adam T
Thanks Tim! Yeah, I think adding a flag like that could work. Then whatever values of y you input won't matter; they count for nothing. I also didn't know about the SIMULATE block, so that's helpful to know. For some reason I thought it was easy to generate data from the prior with Stan, but I think I was mistaken about this.
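Tim's use_likelihood flag can be sketched in plain R rather than a TMB C++ template; the prior and likelihood below are arbitrary illustrative choices.

```r
# The use_likelihood flag idea, sketched in plain R rather than in a TMB
# template: with the flag off, the objective is the negative log-prior only,
# so the value of y is irrelevant and "fitting" recovers the prior
nll <- function(mu, y, use_likelihood = 1) {
  out <- -dnorm(mu, 0, 10, log = TRUE)                 # prior contribution
  if (use_likelihood == 1) {
    out <- out - sum(dnorm(y, mu, 1, log = TRUE))      # likelihood contribution
  }
  out
}

y <- c(1.2, -0.5, 0.3)
nll(0, y, use_likelihood = 0)  # prior only: the value of y does not matter
```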
