bayesplot's People

Contributors

avehtari, billdenney, bnicenboim, cbemben, charlesm93, ecoronado92, fweber144, hadley, heavywatal, helske, hhau, jgabry, lindeloev, malcolmbarrett, martinmodrak, mcol, mitzimorris, paul-buerkner, rok-cesnovar, silberzwiebel, teemusailynoja, teunbrand, tjmahr, tony-stone, yimingli

bayesplot's Issues

pandoc/pandoc-citeproc dependencies

I would suggest changing README.md to:

If you are not using RStudio and you get an error related to "pandoc", you will either need to install pandoc (e.g. brew install pandoc) and pandoc-citeproc (e.g. brew install pandoc-citeproc), or remove the argument build_vignettes=TRUE to avoid building the package vignettes.

or something similar that makes clear you also need pandoc-citeproc; otherwise R CMD build fails at the command line. The pandoc documentation was unclear that pandoc-citeproc requires a separate installation (i.e. it's not bundled with brew install pandoc). It's probably not a big deal if you don't mention it, since I'm guessing most people build packages via RStudio, but at least someone dealing with this issue might find their way here.

Finalize documentation

Documentation to-do list:

  • make sure all functions and arguments are sufficiently documented
  • proofread all help pages
  • finish/improve/replace vignettes

Histograms (y-label and grouping)

1 - For histograms (e.g., in ppc_stat and ppc_hist) it would be good to have a y-label that shows the counts (I'm not sure what is plotted currently, counts or density). This is especially useful in ppc_stat_group, because it gives you extra information about the size of each group.

2 - It would be good to have a ppc_hist_group function for comparing the distributions at a finer level.

Is GGally worth it?

Is GGally's pair plot useful enough to warrant using it for mcmc_pairs and keeping GGally in Suggests? Currently this is what mcmc_pairs uses, but it might be more trouble than it's worth. Working with ggpairs is pretty cumbersome, and although the plot looks ok, it isn't flexible enough to be easily extended to include functionality similar to rstan::pairs.stanfit, which is what really makes pairs plots very useful.

Residuals by time plots

It would be very useful to have residuals by time plots, similar to ppc_ts_grouped plots.

allow for additional plotting options

It would be great to let users customize the plots from bayesplot further, to make them ready for presentations or other non-screen usage.

Something like rstan_gg_options would be great to have: a way to specify options that go beyond ggplot theme options and are passed directly to the geoms being used.

Allow to plot residuals in ppc_scatter*

Right now, the x-axis of the ppc_scatter* plots displays y, but users may also want to see the residuals (y - yrep), since this resembles the standard residual plots generated, for instance, by plot(<lm object>) (except that y is plotted on the y-axis there). Do you think it makes sense to add an option to show residuals instead of y itself?
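For reference, the residuals in question can be computed from y and yrep in a line or two of base R. This is only a sketch with simulated inputs; the object names are illustrative and nothing here is taken from the package internals.

```r
set.seed(123)
y <- rnorm(20)                            # observed outcomes
yrep <- matrix(rnorm(5 * 20), nrow = 5)   # 5 simulated data sets, one per row

# sweep() subtracts y from every row of yrep (yrep - y), so negate the result
# to get y - yrep. Row s of 'resid' is y - yrep[s, ].
resid <- -sweep(yrep, 2, y)

# Per-observation averages, e.g. for a scatterplot of average residual
# against average predicted value.
yrep_mean <- colMeans(yrep)
resid_mean <- colMeans(resid)
```

Plotting resid_mean against yrep_mean would give something close to the usual residuals-versus-fitted display.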

Default ggplot theme

Decide what the default ggplot theme should be. See if Andrew has any comments / strong preferences.

Recycle transformations in mcmc_intervals

I was trying out the new stan_betareg and plotted my coefficients with mcmc_intervals.

I'd like to be able to apply a single transformation (e.g. arm::invlogit) to all parameters. It would be great if one could do:

mcmc_intervals(fit, regex_pars = "search_for_coefs_here",
               transformations = arm::invlogit)

without the need of putting together a named list.
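In the meantime, the named list can be put together in one line. The sketch below uses made-up parameter names and base R's plogis() (the inverse logit) in place of arm::invlogit; whether the result is passed on unchanged depends on how the transformations argument is handled, which I am not asserting here.

```r
# Hypothetical parameter names, standing in for whatever the regex would match.
pars <- c("(Intercept)", "x1", "x2")

# Repeat the same function once per parameter and name the list accordingly.
transformations <- setNames(rep(list(plogis), length(pars)), pars)

# Each entry is an ordinary function that can be applied to a parameter's draws.
transformations[["x1"]](0)
```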

Dependence on dplyr?

Currently the use of dplyr functions in the bayesplot source code is inconsistent. I think it's best to either use it as much as possible or not at all. If the former, then this can wait until after the initial release, but if dplyr is to be removed I'd rather not include it as a dependency in the initial release.

mcmc_recover_intervals extensions - bias + coverage

Hi!

I just discovered the intervals function, which looks great! I know that I should put all of my parameters on the unit scale, but in practice I sometimes don't do that (even though I should, I know). For these circumstances it would be nice to plot things as bias. So instead of showing the true values along with the intervals, I would like an option that lets me plot the bias.

Of course, the concept of bias is shaky in a Bayesian world, but as long as I can be sure that my prior is weakly-informative, I would like to be able to do that.

Another very useful extension (I am happy to open another issue) would be a plot of the coverage when I replicate things many times.
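To make the request concrete, here is a rough base R sketch of what I mean by "bias" and "coverage". The draws are simulated and all names are illustrative; it is not meant as a proposal for the actual implementation.

```r
set.seed(42)
true_values <- c(alpha = 1, beta = -0.5, sigma = 2)

# Fake posterior draws (rows) for each parameter (columns), centred near the truth.
draws <- sapply(true_values, function(v) rnorm(1000, mean = v, sd = 0.3))

# Bias scale: subtract the true value from every draw of that parameter.
bias_draws <- sweep(draws, 2, true_values)

# 90% central intervals and medians on the bias scale; zero means "no bias".
bias_summary <- t(apply(bias_draws, 2, quantile, probs = c(0.05, 0.5, 0.95)))

# Coverage indicator for this one replication: does the interval contain zero?
covered <- bias_summary[, 1] <= 0 & bias_summary[, 3] >= 0
```

Averaging the covered indicator over many simulated data sets would give the empirical coverage of the 90% interval for each parameter.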

BTW, these tools look awesome to me!

function for easily juxtaposing plots and enforcing common axis limits

When comparing plots (e.g. PPCs for two different models for y) it can often be important to make sure the plots use the same axis limits. It would be nice to have a function that takes plot objects and a single x/y axis limit specification and then displays all the plots using that same x/y axis specification.
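The idea can be sketched with base graphics. The helper name common_limits is hypothetical and the data are simulated; a ggplot-based version would apply the same limits to each plot object instead.

```r
# Two hypothetical sets of values that would be shown in separate plots.
set.seed(1)
values_model1 <- rnorm(100, mean = 0, sd = 1)
values_model2 <- rnorm(100, mean = 1, sd = 2)

# Shared limits: the range over everything that will be drawn.
common_limits <- function(...) range(unlist(list(...)), finite = TRUE)
xlim <- common_limits(values_model1, values_model2)

# Draw the plots side by side with identical x-axis limits.
op <- par(mfrow = c(1, 2))
hist(values_model1, xlim = xlim, main = "Model 1")
hist(values_model2, xlim = xlim, main = "Model 2")
par(op)
```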

Visual Predictive Check

It would be great to have a so-called visual predictive check. To illustrate it, I include a simple R script.

This plot is very useful for models with continuous regressors that the design of the experiment fixes at the same values for all subjects in a data set. For example, imagine a clinical study with many subjects in which the quantity of interest is measured at pre-defined time points for all patients. The plot then compares the raw quantiles of the data at each time point with what the model predicts for these quantiles (including its uncertainty).

vpc_example_R.txt

vpc_mtcars.pdf
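Independently of the attached script, the computation behind such a plot can be sketched in base R as follows. Everything here is simulated and the object names are illustrative: y holds one row per subject and one column per time point, and yrep is a draws x subjects x time points array of model predictions.

```r
set.seed(2024)
times <- c(0, 1, 2, 4, 8)            # pre-defined measurement times
n_subj <- 30
n_draws <- 200
mean_curve <- 10 * exp(-0.3 * times)

# Observed data: subjects x time points.
y <- sapply(mean_curve, function(m) rnorm(n_subj, mean = m, sd = 1))

# Simulated replications: draws x subjects x time points.
yrep <- array(
  rnorm(n_draws * n_subj * length(times),
        mean = rep(mean_curve, each = n_draws * n_subj), sd = 1),
  dim = c(n_draws, n_subj, length(times))
)

probs <- c(0.1, 0.5, 0.9)
obs_q <- apply(y, 2, quantile, probs = probs)            # quantile x time
rep_q <- apply(yrep, c(1, 3), quantile, probs = probs)   # quantile x draw x time
# 90% band across draws for every quantile at every time point.
band <- apply(rep_q, c(1, 3), quantile, probs = c(0.05, 0.95))  # bound x quantile x time

# Observed quantiles as solid lines, predicted bands as dashed lines.
matplot(times, t(obs_q), type = "b", lty = 1, pch = 16, col = 1:3,
        ylim = range(obs_q, band), xlab = "time", ylab = "y")
for (q in seq_along(probs)) {
  lines(times, band[1, q, ], lty = 2, col = q)
  lines(times, band[2, q, ], lty = 2, col = q)
}
```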

object of type 'closure' is not subsettable

I get the error below when I try to plot posterior draws from an HMM.

posterior_array <- as.array(my_fit)
dimnames(posterior_array)
mcmc_areas(posterior_array, pars = c("mu[1]", "mu[2]"))
$iterations
NULL

$chains
 [1] "chain:1"  "chain:2"  "chain:3"  "chain:4"  "chain:5"  "chain:6"  "chain:7" 
 [8] "chain:8"  "chain:9"  "chain:10"

$parameters
 [1] "mu[1]"      "mu[2]"      "mu[3]"      "mu[4]"      "mu[5]"      "mu[6]"     
 [7] "mu[7]"      "mu[8]"      "mu[9]"      "sigma[1]"   "sigma[2]"   "sigma[3]"

Error in theme_get()[newitem_names] : object of type 'closure' is not subsettable

I tried extracting the Stan fit object as a data frame but still get the error. Any idea what's going on?

Allow a column to separate chains in mcmc plots

It could make sense to add another argument naming a column in the data.frame / matrix (defaulting to NULL or so) that -- if present -- would allow separating chains. This way one would not be forced to use 3D arrays.
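Internally this could amount to reshaping the data frame into the usual iterations x chains x parameters array. A minimal base R sketch, with a hypothetical chain_col argument and simulated draws:

```r
set.seed(5)
draws_df <- data.frame(
  chain = rep(1:2, each = 4),
  alpha = rnorm(8),
  beta  = rnorm(8)
)
chain_col <- "chain"   # hypothetical argument naming the chain column

pars <- setdiff(names(draws_df), chain_col)
by_chain <- split(draws_df[pars], draws_df[[chain_col]])
n_iter <- unique(vapply(by_chain, nrow, integer(1)))
stopifnot(length(n_iter) == 1)   # every chain needs the same number of draws

# Fill an iterations x chains x parameters array, one chain at a time.
draws_array <- array(
  NA_real_,
  dim = c(n_iter, length(by_chain), length(pars)),
  dimnames = list(iterations = NULL,
                  chains = paste0("chain:", names(by_chain)),
                  parameters = pars)
)
for (k in seq_along(by_chain)) {
  draws_array[, k, ] <- as.matrix(by_chain[[k]])
}
```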

Improve input checking for arguments 'y' and 'yrep'

Right now, only vectors are accepted for y and matrices for yrep. From my point of view this may be too restrictive for a user-facing function. For instance, one-dimensional arrays are not allowed as input for y, although they look the same as vectors on the surface, and users will wonder why their input is invalid.
Maybe one could try to internally coerce y and yrep to a vector / matrix respectively, thus allowing more flexible input. However, I also understand if you want to keep the input more restrictive.

Also, it may be good to check that both y and yrep are really numeric before passing them to ggplot2. For instance, when I call ppc_resid with a character vector for y, I get the error message
Error in FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) :
non-numeric argument to binary operator
which points in the right direction but isn't optimal from my point of view.
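Something along these lines is what I have in mind. The helper names coerce_y and coerce_yrep are made up for this sketch and are not the package's actual validation functions.

```r
# Hypothetical helpers; names and error messages are illustrative only.
coerce_y <- function(y) {
  # Treat a one-dimensional array like a plain vector.
  if (is.array(y) && length(dim(y)) == 1L) y <- as.vector(y)
  if (!is.numeric(y)) stop("'y' must be numeric.", call. = FALSE)
  if (!is.null(dim(y))) stop("'y' must be a vector or a 1-D array.", call. = FALSE)
  unname(y)
}

coerce_yrep <- function(yrep, y) {
  # Accept data frames by converting them to a matrix first.
  if (is.data.frame(yrep)) yrep <- as.matrix(yrep)
  if (!is.matrix(yrep) || !is.numeric(yrep)) {
    stop("'yrep' must be a numeric matrix.", call. = FALSE)
  }
  if (ncol(yrep) != length(y)) {
    stop("ncol(yrep) must equal length(y).", call. = FALSE)
  }
  unname(yrep)
}

y_ok <- coerce_y(array(c(1.5, 2, 3), dim = 3))        # 1-D array becomes a vector
yrep_ok <- coerce_yrep(matrix(rnorm(6), nrow = 2), y_ok)
```

With checks like these, a character y fails early with "'y' must be numeric." instead of the error from sweep() shown above.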

PPCs by group

Allow stratification by group, e.g. to do PPCs for each group in a multilevel model.
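As a rough illustration of the computation involved (simulated data, illustrative names, the mean as an example statistic), a test statistic can be evaluated within each group for y and for every row of yrep:

```r
set.seed(7)
group <- factor(rep(c("a", "b", "c"), times = c(10, 15, 5)))
y <- rnorm(length(group))
yrep <- matrix(rnorm(50 * length(group)), nrow = 50)   # 50 draws x 30 observations

# Observed statistic within each group.
stat_y <- tapply(y, group, mean)

# Same statistic for every simulated data set: one row per draw, one column per group.
stat_yrep <- t(apply(yrep, 1, function(row) tapply(row, group, mean)))

# Share of draws whose group mean is at least as large as the observed one.
p_value <- colMeans(sweep(stat_yrep, 2, stat_y, ">="))
```

Each column of stat_yrep could then be shown as a histogram with the matching entry of stat_y marked on it.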

ppc_vs_x_grouped cannot handle argument 'time'

When you accidentally pass the time argument to ppc_vs_x_grouped, it is passed on to ppc_ts_grouped, although x is already used as the time argument there. This assigns the argument twice and causes an error. Is it possible to just ignore the time argument in ppc_vs_x_grouped instead of passing it on?
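One way to do this is to drop the offending element from the dots before forwarding them. The sketch below uses two hypothetical stand-in functions, outer_plot and inner_plot, rather than the real plotting functions.

```r
# Hypothetical stand-ins for the two plotting functions.
inner_plot <- function(y, yrep, time, group, ...) {
  list(time = time, extra = names(list(...)))
}

outer_plot <- function(y, yrep, x, group, ...) {
  dots <- list(...)
  if ("time" %in% names(dots)) {
    warning("Argument 'time' is ignored; 'x' is used instead.", call. = FALSE)
    dots[["time"]] <- NULL   # drop it so 'time' is only supplied once
  }
  do.call(inner_plot, c(list(y = y, yrep = yrep, time = x, group = group), dots))
}

# Passing 'time' by accident no longer causes a duplicated-argument error.
res <- suppressWarnings(
  outer_plot(y = 1:3, yrep = NULL, x = c(10, 20, 30), group = 1, time = 1:3)
)
```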

More functionality for residual plots

1 - It would be useful to have grouped residuals. For instance, ppc_resid_hist_group. This can tell us about the distribution of residuals at a finer level which can provide some insight into the fit of the model.

2 - Currently, there is no way to change the x-axis in residual plots. For discrete variables on the x-axis, we might want to have a box-plot for the residuals for each point.

boxplots for posterior predictive checks

I think we need an option for pp_check that does boxplots in addition to the options for histograms and overlaid densities. So, a boxplot of the data on the left and then to its right ten or so boxplots of the posterior predictive realizations.
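A base graphics mock-up of the layout, with simulated data and illustrative names, might look like this:

```r
set.seed(11)
y <- rexp(40)                               # observed data
yrep <- matrix(rexp(10 * 40), nrow = 10)    # ten posterior predictive realizations

# One list element per box: the data first, then each row of yrep.
boxes <- c(list(y = y), split(yrep, row(yrep)))
names(boxes)[-1] <- paste0("yrep", seq_len(nrow(yrep)))

# Draw y in a darker shade so it stands out from the replications.
boxplot(boxes, col = c("grey40", rep("grey85", nrow(yrep))), las = 2)
```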

intervals plot

Add mcmc_intervals() function (see rstan::stan_plot) with options like

  • probability mass included in interval
  • point estimate (none, mean, median)
  • show densities
  • color by R-hat or effective sample size
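The numbers behind the first two options can be sketched in base R. The function summarise_intervals and its argument names are hypothetical, chosen only for this example, and the draws are simulated.

```r
set.seed(3)
draws <- cbind(alpha = rnorm(1000, 0, 1), beta = rnorm(1000, 2, 0.5))

# Hypothetical helper: inner and outer central intervals plus a point estimate.
summarise_intervals <- function(draws, prob = 0.5, prob_outer = 0.9,
                                point_est = c("median", "mean", "none")) {
  point_est <- match.arg(point_est)
  tail_in <- (1 - prob) / 2
  tail_out <- (1 - prob_outer) / 2
  probs <- c(tail_out, tail_in, 1 - tail_in, 1 - tail_out)
  out <- as.data.frame(t(apply(draws, 2, quantile, probs = probs)))
  names(out) <- c("outer_lo", "inner_lo", "inner_hi", "outer_hi")
  out$point <- switch(point_est,
                      median = apply(draws, 2, median),
                      mean = colMeans(draws),
                      none = NA_real_)
  out
}

interval_table <- summarise_intervals(draws, prob = 0.5, prob_outer = 0.9)
```

Each row of interval_table holds what is needed to draw one parameter: a thin outer segment, a thick inner segment, and a point.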

Add a generic 'ppc' (or similar named) function

Developers of other packages will likely want to introduce a convenience function that generates the yreps and then calls the ppc_* functions. As this will probably happen via S3 methods, it might be a good idea to put the corresponding generic in the ppcheck package, so that all packages use the same method name (to minimize user confusion) and to avoid unnecessary function masking.
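A minimal sketch of how this could work with S3. The class myfit, its simulate element, and the method body are all invented for illustration; only the generic's name follows the suggestion in the title.

```r
# Generic that the plotting package could export.
ppc <- function(object, ...) UseMethod("ppc")

# Hypothetical method that a modelling package would write for its own fit class.
ppc.myfit <- function(object, ...) {
  yrep <- object$simulate(nsim = 5)    # the package generates its own yrep
  # A real method would pass y and yrep on to one of the ppc_* plotting functions.
  list(y = object$y, yrep = yrep)
}

# Toy fit object carrying the data and a way to simulate replications.
fit <- structure(
  list(y = c(1, 2, 3),
       simulate = function(nsim) matrix(rnorm(nsim * 3), nrow = nsim)),
  class = "myfit"
)
out <- ppc(fit)
```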

LOO predictive checks

Summary:

Add LOO predictive checks

  • LOO probability integral transformation (PIT) predictive check
  • plot of LOO predictive intervals vs. observations

Description:

Calibration of the marginal predictions can be checked with probability integral transformation (PIT) checks. LOO improves the check by avoiding the double use of data. See Marginal predictive checks in BDA3 p. 152-153. In addition, visual predictive checking can be done by plotting LOO predictive intervals against the observations.

Example code for LOO-PIT (I assume visual predictive checking is close to current ppc plot)

library(rstanarm)     # stan_lmer(), log_lik(), posterior_predict(), radon data
library(loo)          # psislw()
library(matrixStats)  # logSumExp()

data(radon)
y <- radon$log_radon
# Fit the first model
modelA <- stan_lmer(
    log_radon ~ floor + log_uranium + floor:log_uranium + (1 + floor | county),
    data = radon,
    cores = 4,
    iter = 2000,
    chains = 4)

# probability integral transformation (PIT)
# this would be more accurate using conditional cdf's N(y_i|mu^{(s)},sigma^{(s)})
log_likA <- log_lik(modelA, parameter_name = "log_lik")
psisA <- psislw(-log_likA)
predsA <- posterior_predict(modelA)
pitA <- numeric(ncol(predsA))
for (i in seq_len(ncol(predsA))) {
    # combine the smoothed log weights of the draws whose prediction is <= y[i]
    pitA[i] <- exp(logSumExp(psisA$lw_smooth[(predsA[,i] <= y[i]), i]))
}

# LOO-PITs should have uniform distribution
par(mfrow=c(1,2))
qqplot(pitA, runif(10000))

Exchange axes in ppc_scatter* plots

This is of course just a matter of style, but personally I prefer / intuitively expect y to be on the y-axis. Feel free to close this issue, if you prefer the current use of the axes.
