Package 'nbhdmodel'

Title: Neighborhood Modeling and Analysis
Description: Functionality for fitting neighborhood models of McCartan, Brown, and Imai <arXiv:2110.14014>. The core methodology is described in the paper and can be implemented with any tool that can fit generalized linear mixed models (GLMMs). However, some of the preprocessing necessary to set up the GLMM can be onerous. In addition to providing a specialized GLMM routine, this package provides several preprocessing functions that, while not completely general, should be useful for others performing these kinds of analyses.
Authors: Cory McCartan [aut, cre], Jacob Brown [aut], Kosuke Imai [aut]
Maintainer: Cory McCartan <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0.9000
Built: 2024-10-26 04:41:41 UTC
Source: https://github.com/CoryMcCartan/nbhdmodel

Help Index


Binned Residual Plot

Description

Binned Residual Plot

Usage

binned_resid(model, bins = 16)

Arguments

model

the model object, which should have fitted(), resid(), and similar methods.

bins

the number of bins

Value

A ggplot
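
As a rough illustration of what a binned residual plot summarizes, the following base R sketch bins observations by fitted value and averages the residuals within each bin. The toy data, the logistic model, and the equal-count binning rule are assumptions made for this example; the binning used by binned_resid() itself is not documented here.

```r
# Toy data and a logistic fit; any model with fitted() and resid() methods would do.
set.seed(1)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 * x))
model <- glm(y ~ x, family = binomial())

bins <- 16
fit_val <- fitted(model)
res <- resid(model, type = "response")

# Assumption: equal-count bins formed from the ranks of the fitted values.
bin_id <- cut(rank(fit_val, ties.method = "first"), breaks = bins, labels = FALSE)

# One row per bin: mean fitted value, mean residual, and bin size.
binned <- data.frame(
  fitted = as.vector(tapply(fit_val, bin_id, mean)),
  resid  = as.vector(tapply(res, bin_id, mean)),
  n      = as.vector(table(bin_id))
)
```

Plotting binned$resid against binned$fitted gives the usual diagnostic: bin means far from zero suggest a region where the model fits poorly.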


Create a model data frame for an individual respondent

Description

This function combines individual, neighborhood, and geographic information to produce a data frame suitable for fitting a neighborhood_model(). It can be applied in a loop over respondents to generate a model frame for an entire sample.

Usage

calc_indiv_frame(row, nbhd, block_d, block_gr)

Arguments

row

A single data frame row containing relevant individual covariates for a respondent.

nbhd

The respondent's neighborhood as a vector of indices indexing the blocks in block_gr.

block_d

A data frame of census blocks, including a column centroid with an s2 point geography of each block's centroid.

block_gr

An adjacency graph object: a list containing an element graph with the adjacency list, and an element blocks, a character vector of GEOIDs or codes corresponding to the indices in graph.

Value

A tibble that can be used inside a modeling function. Will contain the entries in row, plus the relevant entries in block_d for each block, plus columns:

  • ring containing the "ring" indicator around the residence: 0 indicates the respondent's block, 1 for blocks touching the residential block, 2 for blocks touching those, etc.

  • incl a binary indicator for whether the block is in the neighborhood

  • dist the distance to the respondent's block

  • frac_con the fraction of nearer blocks in the neighborhood this block is connected to.
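
The ring and incl columns described above can be sketched with a breadth-first search over an adjacency list. The toy graph, the helper name ring_from(), and the use of 1-based indices are assumptions for illustration only; the index convention of the package's own adjacency lists is not stated here.

```r
# Hypothetical adjacency list for 6 blocks (1-based indices, an assumption).
graph <- list(c(2L), c(1L, 3L), c(2L, 4L, 5L), c(3L), c(3L, 6L), c(5L))

# Breadth-first search giving the graph distance ("ring") from a home block.
ring_from <- function(graph, home) {
  ring <- rep(NA_integer_, length(graph))
  ring[home] <- 0L
  frontier <- home
  r <- 0L
  while (length(frontier) > 0) {
    r <- r + 1L
    nbrs <- unique(unlist(graph[frontier]))
    nxt <- nbrs[is.na(ring[nbrs])]  # blocks not yet assigned a ring
    ring[nxt] <- r
    frontier <- nxt
  }
  ring
}

ring <- ring_from(graph, home = 3L)

# Binary inclusion indicator for a hypothetical drawn neighborhood.
nbhd <- c(2L, 3L, 5L)
incl <- as.integer(seq_along(graph) %in% nbhd)
```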


Get Posterior Mean of Effective Block Distance

Description

Not exported. Assumes block_d has a column fips, which is used inside the neighborhood column of new_resp.

Usage

eff_dist(fit, new_resp, block_d, proc_fn = function(x) x)

Arguments

fit

the model fit, from neighborhood_model()

new_resp

a single-row respondent data frame to make predictions from

block_d

the block data frame

proc_fn

a processing function that is used to prepare raw model data for fitting

Value

a numeric vector of effective distances


Calculate AUC

Description

Calculate AUC

Usage

fastAUC(x, y)

Arguments

x

the predictor

y

a binary indicator

Value

the scalar AUC
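
For reference, the AUC can be computed from ranks through its relationship to the Mann-Whitney U statistic. The sketch below is a generic base R version under that definition; the helper name auc_rank() is hypothetical, and this is not necessarily how fastAUC() is implemented.

```r
# Rank-based AUC: the probability that a randomly chosen positive case
# has a larger predictor value than a randomly chosen negative case.
auc_rank <- function(x, y) {
  y <- as.logical(y)
  n1 <- sum(y)   # number of positives
  n0 <- sum(!y)  # number of negatives
  r <- rank(x)   # average ranks, so ties count as one half
  (sum(r[y]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc <- auc_rank(x = c(0.1, 0.4, 0.35, 0.8), y = c(0, 0, 1, 1))
# auc is 0.75: three of the four positive-negative pairs are ordered correctly.
```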


Filter block data to a radius of a FIPS code

Description

Filter block data to a radius of a FIPS code

Usage

local_area(fips, block_d, geom_d, dist = 0.5)

Arguments

fips

the FIPS code to center the area at

block_d

the block data, with a fips column matching the fips argument.

geom_d

the block geometry data

dist

the radius of the area, in miles

Value

a filtered block_d
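
The idea of filtering blocks to a radius in miles can be sketched in base R with great-circle distances. This example assumes hypothetical lat and lon columns and a helper haversine_miles(); local_area() itself works from the geometry in geom_d, so this only illustrates the concept.

```r
# Great-circle distance in miles between points given in decimal degrees.
haversine_miles <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  3958.8 * 2 * asin(pmin(1, sqrt(a)))  # 3958.8 = mean Earth radius in miles
}

# Hypothetical block data with centroid coordinates (not the package's format).
block_d <- data.frame(
  fips = c("A", "B", "C"),
  lat  = c(40.000, 40.005, 40.020),
  lon  = c(-75, -75, -75)
)

# Keep blocks whose centroid lies within 0.5 miles of block "A".
center <- block_d[block_d$fips == "A", ]
d <- haversine_miles(center$lat, center$lon, block_d$lat, block_d$lon)
local <- block_d[d <= 0.5, ]
```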


Functions for working with neighborhood fits

Description

Functions for working with neighborhood fits

Usage

## S3 method for class 'nbhd_fit'
summary(object, ...)

## S3 method for class 'nbhd_fit'
coef(object, ...)

## S3 method for class 'nbhd_fit'
fixef(object, ...)

## S3 method for class 'nbhd_fit'
ranef(object, ...)

## S3 method for class 'nbhd_fit'
fitted(object, ...)

## S3 method for class 'nbhd_fit'
residuals(object, ...)

## S3 method for class 'nbhd_fit'
as.matrix(x, ...)

## S3 method for class 'nbhd_fit'
as.data.frame(x, ...)

Arguments

object, x

a nbhd_fit object

...

Ignored.


Fit the Neighborhood Model

Description

Fits the neighborhood GLMM from a provided formula, using a Bernoulli outcome with a complementary log-log (cloglog) link.
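
For readers unfamiliar with the cloglog link, this short base R sketch shows the link and its inverse and checks them against the link object from stats::binomial(). The helper names cloglog() and cloglog_inv() are defined here for illustration and are not part of this package.

```r
# Complementary log-log link and its inverse.
cloglog     <- function(p) log(-log(1 - p))
cloglog_inv <- function(eta) 1 - exp(-exp(eta))

eta <- c(-2, 0, 1)     # example linear predictor values
p <- cloglog_inv(eta)  # corresponding Bernoulli probabilities

# The same mapping as provided by base R's binomial family.
p_ref <- binomial(link = "cloglog")$linkinv(eta)
```

Unlike the logit, the cloglog link is asymmetric: probabilities approach 1 much faster than they approach 0 as the linear predictor moves away from zero.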

Usage

neighborhood_model(
  formula,
  data,
  prior_coef_scale = 2.5,
  draws = 1000,
  imp_samp = TRUE,
  init = 0,
  ...,
  hessian = TRUE,
  verbose = FALSE
)

Arguments

formula

a one-sided model formula. The actual formula used to fit the GLMM will be generated from this one: it will contain a term for log(dist) (where dist must be the column that encodes distance), and will have random effects based on the id column. The column incl will be used as the left-hand-side variable indicating that a block is in the neighborhood.

data

the model data frame. Should have an incl column for block inclusion, an id column with respondent IDs, and a dist column with distances.

prior_coef_scale

the scale of the prior on the standardized predictors.

draws

the number of approximate posterior draws to generate

imp_samp

whether to perform importance resampling on the approximate draws.

init

initial values for the model fitting function rstan::optimizing().

...

other arguments to the model fitting function rstan::optimizing().

hessian

whether to compute the Hessian. Required for full inference.

verbose

if TRUE, show verbose optimization output.

Value

a fitted model object of class nbhd_fit, which is a list which includes the following elements:

  • map, the MAP estimates for the parameters.

  • vcov, the covariance matrix for the MAP estimates, calculated from the Hessian of the log posterior.

  • raw_ids, the vector of ids

  • X, the design matrix

  • y, the outcome value (1 - incl)

  • post, the approximate posterior samples, from posterior::draws_rvars

  • lp, the log posterior probability of each sample

  • lp_norm, the log density of the Normal approximation to the posterior at each sample
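
The roles of lp and lp_norm in importance resampling can be illustrated with a one-parameter toy problem in base R. The target density, the Normal approximation at the mode, and the plain multinomial resampling step below are all assumptions chosen for this sketch; the exact resampling scheme used by neighborhood_model() is not described in this manual.

```r
set.seed(42)

# Toy unnormalized log posterior (skewed), with its mode at log(3).
log_post <- function(theta) 3 * theta - exp(theta)
map <- log(3)
sd_approx <- sqrt(1 / 3)  # from the curvature at the mode: -exp(log(3)) = -3

# Approximate draws from the Normal approximation centered at the mode.
draws <- 1000
theta <- rnorm(draws, map, sd_approx)

lp      <- log_post(theta)                            # log posterior at each draw
lp_norm <- dnorm(theta, map, sd_approx, log = TRUE)   # log approximation density

# Importance weights, stabilized on the log scale and then normalized.
log_w <- lp - lp_norm
w <- exp(log_w - max(log_w))
w <- w / sum(w)

# Resample the draws in proportion to their weights.
idx <- sample.int(draws, size = draws, replace = TRUE, prob = w)
theta_resampled <- theta[idx]
```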


Plot coefficient estimates

Description

Plots coefficient estimates with 50% and 90% credible intervals by default.

Usage

## S3 method for class 'nbhd_fit'
plot(x, y = NULL, intercept = FALSE, inner_prob = 0.5, outer_prob = 0.9, ...)

Arguments

x

a nbhd_fit object from neighborhood_model().

y

ignored

intercept

if FALSE, don't plot the intercept estimate.

inner_prob

the inner credible interval probability

outer_prob

the outer credible interval probability

...

Ignored.

Value

A ggplot.
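
To make the inner_prob and outer_prob arguments concrete, the following base R sketch computes equal-tailed intervals from a matrix of simulated draws. The draws, the coefficient names, and the choice of equal-tailed (quantile) intervals are assumptions for illustration; the plot method may compute its intervals differently.

```r
set.seed(7)

# Hypothetical posterior draws for two coefficients (1000 draws each).
draws <- matrix(
  rnorm(2000, mean = rep(c(-0.5, 1.2), each = 1000), sd = 0.3),
  ncol = 2,
  dimnames = list(NULL, c("coef_a", "coef_b"))
)

# Equal-tailed interval with the given coverage probability.
interval <- function(x, prob) {
  quantile(x, c((1 - prob) / 2, 1 - (1 - prob) / 2), names = FALSE)
}

# One row per coefficient: lower and upper interval bounds.
inner <- t(apply(draws, 2, interval, prob = 0.5))
outer <- t(apply(draws, 2, interval, prob = 0.9))
```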


Get Posterior Block Inclusion Probabilities

Description

Get Posterior Block Inclusion Probabilities

Usage

post_incl(
  fit,
  new_resp,
  resp_id = NULL,
  block_d,
  proc_fn = function(x) x,
  use_distance = TRUE
)

Arguments

fit

the model fit, from neighborhood_model()

new_resp

a single-row respondent data frame to make predictions from

resp_id

the respondent ID, used to get the random effects. If NULL, a new random effect will be simulated for each draw. If NA, the random effect will be set to zero.

block_d

the block data frame

proc_fn

a processing function that is used to prepare raw model data for fitting

use_distance

if FALSE, remove block-to-home distance from the linear predictor.

Value

a numeric vector of inclusion probabilities
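
The three cases for resp_id described above can be summarized in a small helper. This is a hypothetical sketch, not the package's code: pick_ranef(), the named vector of fitted effects, and sd_ranef are invented for illustration, and a real fit would carry a separate value of the respondent's effect for each posterior draw rather than a single number.

```r
# Choose the random effect used for prediction, one value per draw.
pick_ranef <- function(resp_id, ranef, sd_ranef, draws) {
  if (is.null(resp_id)) {
    rnorm(draws, 0, sd_ranef)                    # NULL: simulate a new effect per draw
  } else if (is.na(resp_id)) {
    rep(0, draws)                                # NA: fix the effect at zero
  } else {
    rep(ranef[[as.character(resp_id)]], draws)   # known ID: look up the fitted effect
  }
}

set.seed(11)
ranef <- c(r1 = 0.3, r2 = -0.1)  # hypothetical fitted random effects

new_effect   <- pick_ranef(NULL, ranef, sd_ranef = 1, draws = 3)
zero_effect  <- pick_ranef(NA, ranef, sd_ranef = 1, draws = 3)
known_effect <- pick_ranef("r1", ranef, sd_ranef = 1, draws = 3)
```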


Simulate a neighborhood for a respondent

Description

Simulate a neighborhood for a respondent

Usage

simulate_neighborhood(
  fit,
  new_resp,
  draws = 1,
  resp_id = NULL,
  block_d,
  block_gr,
  max_ring = 10L,
  proc_fn = function(x) x
)

Arguments

fit

the model fit, from neighborhood_model()

new_resp

a single-row respondent data frame to make predictions from. Should have a neighborhood column with codes matching block_gr$blocks.

draws

the number of simulated neighborhoods per respondent

resp_id

the respondent ID, used to get the random effects. If NULL, a new random effect will be simulated for each draw. If NA, the random effect will be set to zero. If a numeric, the random effect will be set to this value.

block_d

the block data frame. Should have a centroid geography column.

block_gr

An adjacency graph object: a list containing an element graph with the adjacency list, and an element blocks, a character vector of GEOIDs or codes corresponding to the indices in graph.

max_ring

the maximum graph distance from the starting block to allow.

proc_fn

a processing function that is used to prepare raw model data for fitting

Value

A list of draws integer vectors containing the indices of the blocks (in block_d) making up each simulated neighborhood.
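
One way to picture how a neighborhood could be simulated outward from a home block, with max_ring bounding the graph distance, is the ring-by-ring sketch below. It is an assumption-laden illustration rather than the package's algorithm: the toy graph, the fixed per-block probabilities in p_incl, the rule that only included blocks extend the frontier, and the helper grow_neighborhood() are all hypothetical.

```r
set.seed(3)

# Hypothetical adjacency list (1-based) and per-block inclusion probabilities.
graph  <- list(c(2L), c(1L, 3L), c(2L, 4L, 5L), c(3L), c(3L, 6L), c(5L))
p_incl <- c(0.2, 0.7, 1, 0.6, 0.8, 0.3)

grow_neighborhood <- function(graph, home, p_incl, max_ring = 10L) {
  included <- home
  frontier <- home
  visited  <- home
  for (r in seq_len(max_ring)) {
    # Candidate blocks: unvisited neighbors of the current frontier.
    cand <- setdiff(unique(unlist(graph[frontier])), visited)
    if (length(cand) == 0) break
    visited <- c(visited, cand)
    # Include each candidate independently with its probability.
    keep <- cand[runif(length(cand)) < p_incl[cand]]
    included <- c(included, keep)
    frontier <- keep  # assumption: only included blocks extend the neighborhood
  }
  sort(included)
}

# Five simulated neighborhoods, each a vector of block indices.
sims <- lapply(seq_len(5), function(i) grow_neighborhood(graph, 3L, p_incl))
```

Because growth only continues through included blocks, every simulated neighborhood in this sketch is connected to the home block.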