Title: | Neighborhood Modeling and Analysis |
---|---|
Description: | Functionality for fitting neighborhood models of McCartan, Brown, and Imai <arXiv:2110.14014>. The core methodology is described in the paper and can be implemented with any tool that can fit generalized linear mixed models (GLMMs). However, some of the preprocessing necessary to set up the GLMM can be onerous. In addition to providing a specialized GLMM routine, this package provides several preprocessing functions that, while not completely general, should be useful for others performing these kinds of analyses. |
Authors: | Cory McCartan [aut, cre], Jacob Brown [aut], Kosuke Imai [aut] |
Maintainer: | Cory McCartan <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0.9000 |
Built: | 2024-10-26 04:41:41 UTC |
Source: | https://github.com/CoryMcCartan/nbhdmodel |
Binned Residual Plot
binned_resid(model, bins = 16)
model |
the model object, which should have |
bins |
the number of bins |
A ggplot
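The idea behind a binned residual plot can be sketched in base R. This is not the package's implementation: the simulated data, the variable names, and the equal-count binning rule are all assumptions made for illustration, and only the default of 16 bins is taken from the usage above.

```r
# Simulated data standing in for a fitted model (all names are illustrative).
set.seed(1)
n <- 2000
x <- rnorm(n)
p_hat <- plogis(-0.5 + x)   # "fitted" probabilities
y <- rbinom(n, 1, p_hat)    # observed binary outcomes
resid <- y - p_hat          # response-scale residuals

# Cut the fitted values into roughly equal-count bins (16, as in the default).
bins <- 16
bin_id <- cut(rank(p_hat, ties.method = "first"), breaks = bins, labels = FALSE)

# Average the fitted values and residuals within each bin.
binned <- data.frame(
  fitted = as.vector(tapply(p_hat, bin_id, mean)),
  resid  = as.vector(tapply(resid, bin_id, mean)),
  n      = as.vector(table(bin_id))
)

# Approximate +/- 2 standard error band for each bin's mean residual.
binned$se2 <- 2 * sqrt(binned$fitted * (1 - binned$fitted) / binned$n)
print(binned)
```

For a well-calibrated model, most of the binned mean residuals should fall inside the band given by `se2`.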
This function combines individual, neighborhood, and geographic information to produce a data frame suitable for use in fitting a neighborhood_model().
Can be applied in a loop over respondents to generate a model frame for an entire sample.
calc_indiv_frame(row, nbhd, block_d, block_gr)
row |
A single data frame row containing relevant individual covariates for a respondent. |
nbhd |
The respondent's neighborhood as a vector of indices indexing the
blocks in |
block_d |
A data frame of census blocks, including a column |
block_gr |
An adjacency graph object: a list containing an element
|
A tibble that can be used inside a modeling function. Will contain the entries in row, plus the relevant entries in block_d for each block, plus columns:
ring, the "ring" indicator around the residence: 0 indicates the respondent's block, 1 for blocks touching the residential block, 2 for blocks touching those, etc.
incl, a binary indicator for whether the block is in the neighborhood.
dist, the distance to the respondent's block.
frac_con, the fraction of nearer blocks in the neighborhood this block is connected to.
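The ring and incl columns described above can be illustrated with a breadth-first search over an adjacency list. This is a minimal sketch, not the package's code: the plain list-of-neighbors format, the helper name `calc_rings`, and the toy neighborhood are all hypothetical, and the actual structure of block_gr may differ.

```r
# Hypothetical adjacency list for 6 blocks: element i holds the (1-based)
# indices of the blocks touching block i. This format is an assumption.
adj <- list(c(2, 3), c(1, 4), c(1, 4), c(2, 3, 5), c(4, 6), c(5))

# Breadth-first search from `start`; returns the ring number of every block
# (NA for blocks that cannot be reached from `start`).
calc_rings <- function(adj, start) {
  ring <- rep(NA_integer_, length(adj))
  ring[start] <- 0L
  frontier <- start
  k <- 0L
  while (length(frontier) > 0) {
    k <- k + 1L
    nbrs <- unique(unlist(adj[frontier]))
    nbrs <- nbrs[is.na(ring[nbrs])]  # keep only blocks not yet assigned a ring
    ring[nbrs] <- k
    frontier <- nbrs
  }
  ring
}

ring <- calc_rings(adj, start = 1)
nbhd <- c(1, 2, 4)                            # hypothetical drawn neighborhood
incl <- as.integer(seq_along(adj) %in% nbhd)  # 1 if the block is in the neighborhood
data.frame(block = seq_along(adj), ring = ring, incl = incl)
```

In this toy graph, block 1 is the residence (ring 0), blocks 2 and 3 touch it (ring 1), and the remaining blocks fall in rings 2 through 4.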
Not exported. Assumes block_d has column fips, which is used inside the neighborhood column of new_resp.
eff_dist(fit, new_resp, block_d, proc_fn = function(x) x)
fit |
the model fit, from |
new_resp |
a single-row respondent data frame to make predictions from |
block_d |
the block data frame |
proc_fn |
a processing function that is used to prepare raw model data for fitting |
a numeric vector of effective distances
Calculate AUC
fastAUC(x, y)
x |
the predictor |
y |
a binary indicator |
the scalar AUC
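For reference, the AUC of a predictor against a binary indicator can be computed from ranks using the Mann-Whitney identity. The sketch below is a base R illustration of that quantity; the helper name `auc_rank` is hypothetical, and nothing here is claimed about how fastAUC is implemented internally.

```r
# AUC via the Mann-Whitney rank-sum identity: the probability that a randomly
# chosen positive case has a larger predictor value than a randomly chosen
# negative case, with ties counted as one half.
auc_rank <- function(x, y) {
  stopifnot(length(x) == length(y), all(y %in% c(0, 1)))
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  r <- rank(x)  # average ranks handle ties
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_rank(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))  # 0.75
```

Three of the four positive-negative pairs are ordered correctly in this example, which gives 0.75.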
Filter block data to a radius of a FIPS code
local_area(fips, block_d, geom_d, dist = 0.5)
fips |
the FIPS code to center the area at |
block_d |
the block data, with a |
geom_d |
the block geometry data |
dist |
the radius of the area, in miles |
a filtered block_d
Functions for working with neighborhood fits
## S3 method for class 'nbhd_fit'
summary(object, ...)
## S3 method for class 'nbhd_fit'
coef(object, ...)
## S3 method for class 'nbhd_fit'
fixef(object, ...)
## S3 method for class 'nbhd_fit'
ranef(object, ...)
## S3 method for class 'nbhd_fit'
fitted(object, ...)
## S3 method for class 'nbhd_fit'
residuals(object, ...)
## S3 method for class 'nbhd_fit'
as.matrix(x, ...)
## S3 method for class 'nbhd_fit'
as.data.frame(x, ...)
object, x |
a |
... |
Ignored. |
Fits the neighborhood GLMM using a provided formula and using a Bernoulli outcome with cloglog link.
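As a reminder of what the cloglog link implies, the sketch below maps a linear predictor to a Bernoulli probability in base R. The helper names and the example values of `eta` are illustrative. The last step relies on the value section of this page, which lists the outcome as 1 - incl; reading the inclusion probability as the complement is an assumption about how the pieces fit together, not a statement of the package's internals.

```r
# Inverse complementary log-log link: maps a linear predictor eta to P(y = 1).
inv_cloglog <- function(eta) 1 - exp(-exp(eta))
# The link itself, for completeness.
cloglog <- function(p) log(-log(1 - p))

eta <- c(-3, -1, 0, 1)
p_y1 <- inv_cloglog(eta)   # probability that the outcome equals 1
p_incl <- 1 - p_y1         # if y = 1 - incl, this equals exp(-exp(eta))
round(cbind(eta, p_y1, p_incl), 3)

# Base R exposes the same link through binomial(link = "cloglog").
fam <- binomial(link = "cloglog")
all.equal(fam$linkinv(eta), p_y1)
```

Unlike the logit, the cloglog link is asymmetric: the probability approaches 1 much faster than it approaches 0 as eta moves away from zero.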
neighborhood_model( formula, data, prior_coef_scale = 2.5, draws = 1000, imp_samp = TRUE, init = 0, ..., hessian = TRUE, verbose = FALSE )
formula |
a one-sided model formula. The actual formula used to fit the
GLMM will be generated from this one: it will contain a term for log(dist)
(where |
data |
the model data frame.
Should have an |
prior_coef_scale |
the scale of the prior on the standardized predictors. |
draws |
the number of approximate posterior draws to generate |
imp_samp |
whether to perform importance resampling on the approximate draws. |
init |
initial values for the model fitting function |
... |
other arguments to the model fitting function |
hessian |
whether to compute the Hessian. Required for full inference. |
verbose |
if |
a fitted model object of class nbhd_fit, which is a list that includes the following elements:
map, the MAP estimates for the parameters.
vcov, the covariance matrix for the MAP estimates, calculated from the Hessian of the log posterior.
raw_ids, the vector of ids.
X, the design matrix.
y, the outcome value (1 - incl).
post, the approximate posterior samples, from posterior::draws_rvars.
lp, the log posterior probability of each sample.
lp_norm, the log probability of the Normal approximation to the posterior for each sample.
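The lp and lp_norm elements are the two quantities needed for importance resampling of draws taken from a Normal approximation. The following base R sketch shows the general technique on a toy one-dimensional target. The Gamma target, the variable names, and the number of draws are assumptions chosen for illustration; this is not the package's fitting code.

```r
set.seed(42)
# Toy 1-D "posterior": a skewed Gamma(3, 2) target (an assumed example).
log_post <- function(theta) dgamma(theta, shape = 3, rate = 2, log = TRUE)

# Normal approximation centred at the mode, with variance from the curvature there.
theta_hat <- (3 - 1) / 2                   # mode of Gamma(3, 2)
sd_approx <- sqrt(theta_hat^2 / (3 - 1))   # sqrt(-1 / second derivative of log density)

draws <- rnorm(4000, theta_hat, sd_approx)
lp <- log_post(draws)                                   # log target at each draw
lp_norm <- dnorm(draws, theta_hat, sd_approx, log = TRUE)  # log approximation

# Importance weights are proportional to target / approximation.
log_w <- lp - lp_norm
w <- exp(log_w - max(log_w))   # subtract the max for numerical stability
w <- w / sum(w)

# Resample with replacement in proportion to the weights.
resampled <- sample(draws, size = length(draws), replace = TRUE, prob = w)
c(approx_mean = mean(draws), resampled_mean = mean(resampled), true_mean = 3 / 2)
```

The raw Normal draws are centred on the mode, while the resampled draws shift toward the mean of the skewed target. Draws with zero target density (here, negative values) receive zero weight and are never selected.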
50% and 90% credible intervals plotted by default.
## S3 method for class 'nbhd_fit'
plot(x, y = NULL, intercept = FALSE, inner_prob = 0.5, outer_prob = 0.9, ...)
x |
a |
y |
ignored |
intercept |
if |
inner_prob |
the inner credible interval probability |
outer_prob |
the outer credible interval probability |
... |
Ignored. |
A ggplot.
Get Posterior Block Inclusion Probabilities
post_incl( fit, new_resp, resp_id = NULL, block_d, proc_fn = function(x) x, use_distance = TRUE )
fit |
the model fit, from |
new_resp |
a single-row respondent data frame to make predictions from |
resp_id |
the respondent ID, used to get the random effects. If |
block_d |
the block data frame |
proc_fn |
a processing function that is used to prepare raw model data for fitting |
use_distance |
if |
a numeric vector of inclusion probabilities
Simulate a neighborhood for a respondent
simulate_neighborhood( fit, new_resp, draws = 1, resp_id = NULL, block_d, block_gr, max_ring = 10L, proc_fn = function(x) x )
fit |
the model fit, from |
new_resp |
a single-row respondent data frame to make predictions from.
Should have a |
draws |
the number of simulated neighborhoods per respondent |
resp_id |
the respondent ID, used to get the random effects. If |
block_d |
the block data frame. Should have a |
block_gr |
An adjacency graph object: a list containing an element
|
max_ring |
the maximum graph distance from the starting block to allow. |
proc_fn |
a processing function that is used to prepare raw model data for fitting |
A list of length draws, each element an integer vector containing the indices of the blocks (in block_d) making up one simulated neighborhood.