Package 'adjustr' reference manual

Title:	Stan Model Adjustments and Sensitivity Analyses using Importance Sampling
Description:	Functions to help assess the sensitivity of a Bayesian model (fitted using the rstan pakcage) to the specification of its likelihood and priors. Users provide a series of alternate sampling specifications, and the package uses Pareto-smoothed importance sampling to estimate posterior quantities of interest under each specification.
Authors:	Cory McCartan [aut, cre]
Maintainer:	Cory McCartan <[email protected]>
License:	BSD_3_clause + file LICENSE
Version:	0.1.2
Built:	2025-03-24 06:09:49 UTC
Source:	https://github.com/CoryMcCartan/adjustr

Compute Pareto-smoothed Importance Weights for Alternative Model Specifications

Description

Given a set of new sampling statements, which can be parametrized by a data frame or list, compute Pareto-smoothed importance weights and attach them to the specification object, for further calculation and plotting.

Usage

adjust_weights(spec, object, data = NULL, keep_bad = FALSE, incl_orig = TRUE)
adjust_weights(spec, object, data = NULL, keep_bad = FALSE, incl_orig = TRUE)

Arguments

`spec`	An object of class `adjustr_spec`, probably produced by `make_spec`, containing the new sampling sampling statements to replace their counterparts in the original Stan model, and the data, if any, by which these sampling statements are parametrized.
`object`	A model object, either of type `stanfit`, `stanreg`, `brmsfit`, or a list with two elements: `model` containing a `CmdStanModel`, and `fit` containing a `CmdStanMCMC` object.
`data`	The data that was used to fit the model in `object`. Required only if one of the new sampling specifications involves Stan data variables.
`keep_bad`	When `FALSE` (the strongly recommended default), alternate specifications which deviate too much from the original posterior, and which as a result cannot be reliably estimated using importance sampling (i.e., if the Pareto shape parameter is larger than 0.7), have their weights discarded—weights are set to `NA_real_`.
`incl_orig`	When `TRUE`, include a row for the original model specification, with all weights equal. Can facilitate comaprison and plotting later.

Details

This function does the bulk of the sensitivity analysis work. It operates by parsing the model code from the provided Stan object, extracting the parameters and their sampling statements. It then uses R metaprogramming/tidy evaluation tools to flexibly evaluate the log density for each draw and each sampling statement, under the original and alternative specifications. From these, the function computes the overall importance weight for each draw and performs Pareto-smoothed importance sampling. All of the work is performed in R, without recompiling or refitting the Stan model.

Value

A tibble, produced by converting the provided specs to a tibble (see as.data.frame.adjustr_spec), and adding columns .weights, containing vectors of weights for each draw, and .pareto_k, containing the diagnostic Pareto shape parameters. Values greater than 0.7 indicate that importance sampling is not reliable. If incl_orig is TRUE, a row is added for the original model specification. Weights can be extracted with the pull.adjustr_weighted method. The returned object also includes the model sample draws, in the draws attribute.

References

Vehtari, A., Simpson, D., Gelman, A., Yao, Y., & Gabry, J. (2015). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646.

Examples

## Not run: 
model_data = list(
    J = 8,
    y = c(28, 8, -3, 7, -1, 1, 18, 12),
    sigma = c(15, 10, 16, 11, 9, 11, 10, 18)
)

spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjust_weights(spec, eightschools_m)
adjust_weights(spec, eightschools_m, keep_bad=TRUE)

spec = make_spec(y ~ student_t(df, theta, sigma), df=1:10)
adjust_weights(spec, eightschools_m, data=model_data)
# will throw an error because `y` and `sigma` aren't provided
adjust_weights(spec, eightschools_m)

## End(Not run)

## Not run: 
model_data = list(
    J = 8,
    y = c(28, 8, -3, 7, -1, 1, 18, 12),
    sigma = c(15, 10, 16, 11, 9, 11, 10, 18)
)

spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjust_weights(spec, eightschools_m)
adjust_weights(spec, eightschools_m, keep_bad=TRUE)

spec = make_spec(y ~ student_t(df, theta, sigma), df=1:10)
adjust_weights(spec, eightschools_m, data=model_data)
# will throw an error because `y` and `sigma` aren't provided
adjust_weights(spec, eightschools_m)

## End(Not run)

adjustr: Stan Model Adjustments and Sensitivity Analyses using Importance Sampling

Description

Functions to help assess the sensitivity of a Bayesian model to the specification of its likelihood and priors, estimated using the rstan package. Users provide a series of alternate sampling specifications, and the package uses Pareto-smoothed importance sampling to estimate posterior quantities of interest under each specification.

Details

See the list of key functions and the example below. Full package documentation available at https://corymccartan.github.io/adjustr/.

Key Functions

make_spec
adjust_weights
summarize
plot

Convert an `adjustr_spec` Object Into a Data Frame

Description

Returns the data frame of specification parameters, with added columns of the form .samp_1, .samp_2, ... for each sampling statement (or just .samp if there is only one sampling statement).

Usage

## S3 method for class 'adjustr_spec'
as.data.frame(x, ...)
## S3 method for class 'adjustr_spec'
as.data.frame(x, ...)

Arguments

`x`	the `adjustr_spec` object
`...`	additional arguments to underlying method

`dplyr` Methods for `adjustr_spec` Objects

Description

Core dplyr verbs which don't involve grouping (filter, arrange, mutate, select, rename, and slice) are implemented and operate on the underlying table of specification parameters.

Usage

## S3 method for class 'adjustr_spec'
filter(.data, ..., .preserve = FALSE)

## S3 method for class 'adjustr_spec'
arrange(.data, ...)

## S3 method for class 'adjustr_spec'
rename(.data, ...)

## S3 method for class 'adjustr_spec'
select(.data, ...)

## S3 method for class 'adjustr_spec'
slice(.data, ..., .preserve = FALSE)
## S3 method for class 'adjustr_spec'
filter(.data, ..., .preserve = FALSE)

## S3 method for class 'adjustr_spec'
arrange(.data, ...)

## S3 method for class 'adjustr_spec'
rename(.data, ...)

## S3 method for class 'adjustr_spec'
select(.data, ...)

## S3 method for class 'adjustr_spec'
slice(.data, ..., .preserve = FALSE)

Arguments

`.data`	the `adjustr_spec` object
`...`	additional arguments to underlying method
`.preserve`	as in `filter` and `slice`

Examples

## Not run: 
spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)

arrange(spec, desc(df))
slice(spec, 4:7)
filter(spec, df == 2)

## End(Not run)
## Not run: 
spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)

arrange(spec, desc(df))
slice(spec, 4:7)
filter(spec, df == 2)

## End(Not run)

Extract Model Sampling Statements From a Stan Model.

Description

Prints a list of sampling statements extracted from the model block of a Stan program, with each labelled "parameter" or "data" depending on the type of variable being sampled.

Usage

extract_samp_stmts(object)
extract_samp_stmts(object)

Arguments

object

A stanfit model object.

Value

Invisbly returns a list of sampling formulas.

Examples

## Not run: 
extract_samp_stmts(eightschools_m)
#> Sampling statements for model 2c8d1d8a30137533422c438f23b83428:
#>   parameter   eta ~ std_normal()
#>   data        y ~ normal(theta, sigma)

## End(Not run)
## Not run: 
extract_samp_stmts(eightschools_m)
#> Sampling statements for model 2c8d1d8a30137533422c438f23b83428:
#>   parameter   eta ~ std_normal()
#>   data        y ~ normal(theta, sigma)

## End(Not run)

Get Importance Resampling Indices From Weights

Description

Takes a vector of weights, or data frame or list containing sets of weights, and resamples indices for use in later computation.

Usage

get_resampling_idxs(x, frac = 1, replace = T)
get_resampling_idxs(x, frac = 1, replace = T)

Arguments

`x`	A vector of weights, a list of weight vectors, or a data frame of type `adjustr_weighted` containing a `.weights` list-column of weights.
`frac`	A real number giving the fraction of draws to resample; the default, 1, resamples all draws. Smaller values should be used when `replace=FALSE`.
`replace`	Whether sampling should be with replacement. When weights are extreme it may make sense to use `replace=FALSE`, but accuracy is not guaranteed in these cases.

Value

A vector, list, or data frame, depending of the type of x, containing the sampled indices. If any weights are NA, the indices will also be NA.

Examples

## Not run: 
spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjusted = adjust_weights(spec, eightschools_m)

get_resampling_idxs(adjusted)
get_resampling_idxs(adjusted, frac=0.1, replace=FALSE)

## End(Not run)

## Not run: 
spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjusted = adjust_weights(spec, eightschools_m)

get_resampling_idxs(adjusted)
get_resampling_idxs(adjusted, frac=0.1, replace=FALSE)

## End(Not run)

Set Up Model Adjustment Specifications

Description

Takes a set of new sampling statements, which can be parametrized by other arguments, data frames, or lists, and creates an adjustr_spec object suitable for use in adjust_weights.

Usage

make_spec(...)
make_spec(...)

Arguments

...

Model specification. Each argument can either be a formula, a named vector, data frames, or lists.

Formula arguments provide new sampling statements to replace their counterparts in the original Stan model. All such formulas must be of the form variable ~ distribution(parameters), where variable and parameters are Stan data variables or parameters, or are provided by other arguments to this function (see below), and where distribution matches one of the univariate Stan distributions. Arithmetic expressions of parameters are also allowed, but care must be taken with multivariate parameter arguments. Since specifications are passed as formulas, R's arithmetic operators are used, not Stan's. As a result, matrix and elementwise multiplcation in Stan sampling statments may not be interpreted correctly. Moving these computations out of sampling statements and into a local variables will ensure correct results.

For named vector arguments, each entry of the vector will be substituted into the corresponding parameter in the sampling statements. For data frames, each entry in each column will be substituted into the corresponding parameter in the sampling statements.

List arguments are coerced to data frames. They can either be lists of named vectors, or lists of lists of single-element named vectors.

The lengths of all parameter arguments must be consistent. Named vectors can have length 1 or must have length equal to the number of rows in all data frame arguments and the length of list arguments.

Value

An object of class adjustr_spec, which is essentially a list with two elements: samp, which is a list of sampling formulas, and params, which is a list of lists of parameters. Core dplyr verbs which don't involve grouping (filter, arrange, mutate, select, rename, and slice) are supported and operate on the underlying table of specification parameters.

Examples

make_spec(eta ~ cauchy(0, 1))

make_spec(eta ~ student_t(df, 0, 1), df=1:10)

params = tidyr::crossing(df=1:10, infl=c(1, 1.5, 2))
make_spec(eta ~ student_t(df, 0, 1),
          y ~ normal(theta, infl*sigma),
          params)

make_spec(eta ~ cauchy(0, 1))

make_spec(eta ~ student_t(df, 0, 1), df=1:10)

params = tidyr::crossing(df=1:10, infl=c(1, 1.5, 2))
make_spec(eta ~ student_t(df, 0, 1),
          y ~ normal(theta, infl*sigma),
          params)

Extract Weights From an `adjustr_weighted` Object

Description

This function modifies the default behavior of dplyr::pull to extract the .weights column.

Usage

## S3 method for class 'adjustr_weighted'
pull(.data, var = ".weights", name = NULL, ...)
## S3 method for class 'adjustr_weighted'
pull(.data, var = ".weights", name = NULL, ...)

Arguments

`.data`	A table of data
`var`	A variable, as in `pull`. The default returns the `.weights` column, and if there is only one row, it returns the first element of that column
`name`	Ignored
`...`	Ignored

Plot Posterior Quantities of Interest Under Alternative Model Specifications

Description

Uses weights computed in adjust_weights to plot posterior quantities of interest versus specification parameters

Usage

spec_plot(
  x,
  by,
  post,
  only_mean = FALSE,
  ci_level = 0.8,
  outer_level = 0.95,
  ...
)
spec_plot(
  x,
  by,
  post,
  only_mean = FALSE,
  ci_level = 0.8,
  outer_level = 0.95,
  ...
)

Arguments

`x`	An `adjustr_weighted` object.
`by`	The x-axis variable, which is usually one of the specification parameters. Can be set to `1` if there is only one specification. Automatically quoted and evaluated in the context of `x`.
`post`	The posterior quantity of interest, to be computed for each resampled draw of each specificaiton. Should evaluate to a single number for each draw. Automatically quoted and evaluated in the context of `x`.
`only_mean`	Whether to only plot the posterior mean. May be more stable.
`ci_level`	The inner credible interval to plot. Central 100*ci_level posterior draws.
`outer_level`	The outer credible interval to plot.
`...`	Ignored.

Value

A ggplot object which can be further customized with the ggplot2 package.

Examples

## Not run: 
spec = make_spec(eta ~ student_t(df, 0, scale),
                 df=1:10, scale=seq(2, 1, -1/9))
adjusted = adjust_weights(spec, eightschools_m)

spec_plot(adjusted, df, theta[1])
spec_plot(adjusted, df, mu, only_mean=TRUE)
spec_plot(adjusted, scale, tau)

## End(Not run)

## Not run: 
spec = make_spec(eta ~ student_t(df, 0, scale),
                 df=1:10, scale=seq(2, 1, -1/9))
adjusted = adjust_weights(spec, eightschools_m)

spec_plot(adjusted, df, theta[1])
spec_plot(adjusted, df, mu, only_mean=TRUE)
spec_plot(adjusted, scale, tau)

## End(Not run)

Summarize Posterior Distributions Under Alternative Model Specifications

Description

Uses weights computed in adjust_weights to compute posterior summary statistics. These statistics can be compared against their reference values to quantify the sensitivity of the model to aspects of its specification.

Usage

## S3 method for class 'adjustr_weighted'
summarise(.data, ..., .resampling = FALSE, .model_data = NULL)

## S3 method for class 'adjustr_weighted'
summarize(.data, ..., .resampling = FALSE, .model_data = NULL)
## S3 method for class 'adjustr_weighted'
summarise(.data, ..., .resampling = FALSE, .model_data = NULL)

## S3 method for class 'adjustr_weighted'
summarize(.data, ..., .resampling = FALSE, .model_data = NULL)

Arguments

`.data`	An `adjustr_weighted` object.
`...`	Name-value pairs of expressions. The name of each argument will be the name of a new variable, and the value will be computed for the posterior distribution of eight alternative specification. For example, a value of `mean(theta)` will compute the posterior mean of `theta` for each alternative specification. Also supported is the custom function `wasserstein`, which computes the Wasserstein-p distance between the posterior distribution of the provided expression under the new model and under the original model, with `p=1` the default. Lower the `spacing` parameter from the default of 0.005 to compute a finer (but slower) approximation. The arguments in `...` are automatically quoted and evaluated in the context of `.data`. They support unquoting and splicing.
`.resampling`	Whether to compute summary statistics by first resampling the data according to the weights. Defaults to `FALSE`, but will be used for any summary statistic that is not `mean`, `var` or `sd`.
`.model_data`	Stan model data, if not provided in the earlier call to `adjust_weights`.

Value

An adjustr_weighted object, wth the new columns specified in ... added.

Examples

## Not run: 
model_data = list(
    J = 8,
    y = c(28, 8, -3, 7, -1, 1, 18, 12),
    sigma = c(15, 10, 16, 11, 9, 11, 10, 18)
)

spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjusted = adjust_weights(spec, eightschools_m)

summarize(adjusted, mean(mu), var(mu))
summarize(adjusted, wasserstein(mu, p=2))
summarize(adjusted, diff_1 = mean(y[1] - theta[1]), .model_data=model_data)
summarize(adjusted, quantile(tau, probs=c(0.05, 0.5, 0.95)))

## End(Not run)

## Not run: 
model_data = list(
    J = 8,
    y = c(28, 8, -3, 7, -1, 1, 18, 12),
    sigma = c(15, 10, 16, 11, 9, 11, 10, 18)
)

spec = make_spec(eta ~ student_t(df, 0, 1), df=1:10)
adjusted = adjust_weights(spec, eightschools_m)

summarize(adjusted, mean(mu), var(mu))
summarize(adjusted, wasserstein(mu, p=2))
summarize(adjusted, diff_1 = mean(y[1] - theta[1]), .model_data=model_data)
summarize(adjusted, quantile(tau, probs=c(0.05, 0.5, 0.95)))

## End(Not run)

Package 'adjustr'

Help Index

Compute Pareto-smoothed Importance Weights for Alternative Model Specifications

Description

Usage

Arguments

Details

Value

References

See Also

Examples

adjustr: Stan Model Adjustments and Sensitivity Analyses using Importance Sampling

Description

Details

Key Functions

Convert an adjustr_spec Object Into a Data Frame

Description

Usage

Arguments

dplyr Methods for adjustr_spec Objects

Description

Usage

Arguments

Examples

Extract Model Sampling Statements From a Stan Model.

Description

Usage

Arguments

Value

Examples

Get Importance Resampling Indices From Weights

Description

Usage

Arguments

Value

Examples

Set Up Model Adjustment Specifications

Description

Usage

Arguments

Value

See Also

Examples

Extract Weights From an adjustr_weighted Object

Description

Usage

Arguments

Plot Posterior Quantities of Interest Under Alternative Model Specifications

Description

Usage

Arguments

Value

See Also

Examples

Summarize Posterior Distributions Under Alternative Model Specifications

Description

Usage

Arguments

Value

See Also

Examples

Convert an `adjustr_spec` Object Into a Data Frame

`dplyr` Methods for `adjustr_spec` Objects

Extract Weights From an `adjustr_weighted` Object