Package 'easycensus' reference manual

Title:	Quickly Find, Extract, and Marginalize U.S. Census Tables
Description:	Extracting desired data using the proper Census variable names can be time-consuming. This package takes the pain out of that process by providing functions to quickly locate variables and download labeled tables from the Census APIs (<https://www.census.gov/data/developers/data-sets.html>).
Authors:	Cory McCartan [aut, cre]
Maintainer:	Cory McCartan
License:	MIT + file LICENSE
Version:	1.1.1
Built:	2025-03-30 04:25:30 UTC
Source:	https://github.com/CoryMcCartan/easycensus

Authorize use of the Census API

Description

Looks for an API key in the environment variables CENSUS_API_KEY and CENSUS_KEY, in that order. If neither is set and R is running in interactive mode, prompts the user for a key.

Usage

cens_auth()

Value

a Census API key
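
The lookup order described above can be illustrated with a small base R sketch. The helper `find_census_key()` is hypothetical and is not the package's implementation; it only shows what checking the two environment variables in order looks like.

```r
# Hypothetical helper (not part of easycensus): return the first non-empty
# value among the given environment variables, or NULL if none is set.
find_census_key <- function(vars = c("CENSUS_API_KEY", "CENSUS_KEY")) {
    for (v in vars) {
        key <- Sys.getenv(v, unset = "")
        if (nzchar(key)) return(key)
    }
    NULL  # nothing found; an interactive session could prompt the user here
}

find_census_key()
```

Setting one of the variables in `~/.Renviron` avoids the interactive prompt in scripts.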

Find a decennial or ACS census table with variables of interest

Description

This function uses fuzzy matching to help identify tables from the census which contain variables of interest. Matched table codes are printed out, along with the Census-provided table description, the parsed variable names, and example table cells. The website https://censusreporter.org/ may also be useful in finding variables.

Usage

cens_find(tables, ..., show = 4)

cens_find_dec(..., show = 2)

cens_find_acs(..., show = 4)

Arguments

`tables`	A list of `cens_table` objects, such as is produced by `cens_parse_tables()`.
`...`	Variables to look for. These can be length-1 character vectors, or, for convenience, can be left unquoted (see examples).
`show`	How many matching tables to show. Increase this to show more possible matches, at the cost of more output. Negative values will be converted to positive but will suppress any printing.

Value

The codes for the top `show` tables, returned invisibly if `show` is positive.

Examples

cens_find_dec("sex", "age")
cens_find(tables_sf1, "sex", "age") # same as above
cens_find_dec(tenure, race)
cens_find_acs("income", "sex", show=3)
cens_find_acs("heath care", show=-1)

Construct a Geography Specification for Census Data

Description

Currently used mostly internally. Builds a Census API-formatted specification of which geographies to download data for. State and county names (or postal abbreviations) are partially matched to existing tables, for ease of use. Other geographies should be specified with Census GEOIDs. The usgazetteer package, available with remotes::install_github("bhaskarvk/usgazetteer"), may be useful in finding GEOIDs for other geographies. Consult the "geography" sections of each API at https://www.census.gov/data/developers/data-sets.html for information on which geographic specifiers may be provided in combination with others.

Usage

cens_geo(geo = NULL, ..., check = TRUE, api = "acs/acs5", year = 2019)

Arguments

`geo`	The geographic level to return. One of the machine-readable or human-readable names listed in the "Details" section. Will return all matching geographies of this level, as filtered by the further arguments to `...`. For example, setting `geo="tract"` is equivalent to setting `tract="all"`.
`...`	Geographies to return, as supported by the Census API. Order matters here—the first argument will be the geographic level to return (i.e., it corresponds to the `geo` argument) and additional arguments will filter the results. Use `"all"`, `"*"`, `NA`, or `TRUE` to return all units of a particular geography. See the examples for details.
`check`	If `TRUE`, validate the provided geographies against the available geographies from the relevant Census API. Requires the `api` and `year` arguments to be specified.
`api`	A Census API programmatic name such as `"acs/acs5"`.
`year`	The year of the data.

Details

Supported geography arguments:

us
region
division
state
county
county_subdiv (County Subdivision)
subminor_civil_division (Subminor Civil Division)
place_remainder (Place/Remainder (Or Part))
tract_part (Tract (Or Part))
urban_rural (Urban Rural)
block_group_part (Block Group (Or Part))
block
tract
aian_area_part (American Indian Area/Alaska Native Area/Hawaiian Home Land (Or Part))
block_group (Block Group)
county_part (County (Or Part))
place_part (Place (Or Part))
place
consolidated_city (Consolidated City)
alaska_native_regional_corporation (Alaska Native Regional Corporation)
aian_area (American Indian Area/Alaska Native Area/Hawaiian Home Land)
tribal_subdiv (Tribal Subdivision/Remainder)
aian_reserve_stat (American Indian Area/Alaska Native Area (Reservation Or Statistical Entity Only))
ai_tribal_subdiv_part (American Indian Tribal Subdivision (Or Part))
ai_off_reserve_trust (American Indian Area (Off-Reservation Trust Land Only)/Hawaiian Home Land)
tribal_census_tract (Tribal Census Tract)
tribal_census_tract_part (Tribal Census Tract (Or Part))
tribal_block_group (Tribal Block Group)
state_part (State (Or Part))
county_subdiv_part (County Subdivision (Or Part))
tribal_subdiv_part (Tribal Subdivision/Remainder (Or Part))
aian_reserve_stat_part (American Indian Area/Alaska Native Area (Reservation Or Statistical Entity Only) (Or Part))
ai_off_reserve_trust_part (American Indian Area (Off-Reservation Trust Land Only)/Hawaiian Home Land (Or Part))
tribal_block_group_part (Tribal Block Group (Or Part))
msa (Metropolitan Statistical Area/Micropolitan Statistical Area)
principal_city_part (Principal City (Or Part))
metro_division (Metropolitan Division)
msa_part (Metropolitan Statistical Area/Micropolitan Statistical Area (Or Part))
metro_division_part (Metropolitan Division (Or Part))
combined_statistical_area (Combined Statistical Area)
combined_necta (Combined New England City And Town Area)
necta (New England City And Town Area)
combined_statistical_area_part (Combined Statistical Area (Or Part))
combined_necta_part (Combined New England City And Town Area (Or Part))
necta_part (New England City And Town Area (Or Part))
principal_city (Principal City)
necta_division (Necta Division)
necta_division_part (Necta Division (Or Part))
urban_area (Urban Area)
urban_area_part (Urban Area (Or Part))
consolidated_city_part (Consolidated City (Or Part))
cd (Congressional District)
sld_upper (State Legislative District (Upper Chamber))
sld_lower (State Legislative District (Lower Chamber))
alaska_native_regional_corporation_part (Alaska Native Regional Corporation (Or Part))
zcta (Zip Code Tabulation Area)
zcta_part (Zip Code Tabulation Area (Or Part))
school_district_elementary (School District (Elementary))
school_district_secondary (School District (Secondary))
school_district_unified (School District (Unified))
congressional_district_part (Congressional District (Or Part))
school_district_elementary_part (School District (Elementary) (Or Part))
school_district_secondary_part (School District (Secondary) (Or Part))
school_district_unified_part (School District (Unified) (Or Part))
voting_district_part (Voting District (Or Part))
subminor_civil_division_part (Subminor Civil Division (Or Part))
state_legislative_district_upper_chamber_part (State Legislative District (Upper Chamber) (Or Part))
state_legislative_district_lower_chamber_part (State Legislative District (Lower Chamber) (Or Part))
vtd (Voting District)
ai_tribal_subdiv (American Indian Tribal Subdivision)
puma (Public Use Microdata Area)

Value

A list with two elements, region and regionin, which together specify a valid Census API geography argument.

Examples

cens_geo(state="WA")
cens_geo("county", state="WA") # equivalent to `cens_geo(county="all", state="WA")`
cens_geo(county="King", state="Wash")
cens_geo(zcta="02138", check=FALSE)
cens_geo(zcta=NA, state="WA", check=FALSE)
cens_geo("zcta", state="WA", check=FALSE)
cens_geo(cd="09", state="WA", check=FALSE)
cens_geo("county_part", state="WA", cd="09", check=FALSE)
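
As a rough illustration of the two-element specification described under "Value", the sketch below builds `region` and `regionin` strings in the `level:id` form accepted by the Census API and by the arguments of the same names in censusapi::getCensus(). The helper `make_geo_spec()` is hypothetical, and the exact strings that cens_geo() produces are an assumption here.

```r
# Hypothetical helper (not part of easycensus): build a geography
# specification from a level, its unit IDs, and optional enclosing geographies.
make_geo_spec <- function(level, ids = "*", within = NULL) {
    region <- paste0(level, ":", paste(ids, collapse = ","))
    regionin <- NULL
    if (length(within) > 0) {
        # c(state = "53") becomes "state:53"; several filters are joined by "+"
        regionin <- paste0(names(within), ":", within, collapse = "+")
    }
    list(region = region, regionin = regionin)
}

# All counties in Washington (state FIPS code 53)
make_geo_spec("county", within = c(state = "53"))

# King County, Washington (county FIPS code 033)
make_geo_spec("county", "033", within = c(state = "53"))
```

Unlike cens_geo(), this sketch does no name matching, so FIPS codes must be supplied directly.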

Download data from a decennial census or ACS table

Description

Leverages censusapi::getCensus() to download tables of census data. Tables are returned in tidy format, with variables given tidy, human-readable names.

Usage

cens_get_dec(
  table,
  geo = NULL,
  ...,
  sumfile = "sf1",
  pop_group = NULL,
  check_geo = FALSE,
  drop_total = FALSE,
  show_call = FALSE
)

cens_get_acs(
  table,
  geo = NULL,
  ...,
  year = 2019,
  survey = c("acs5", "acs1"),
  check_geo = FALSE,
  drop_total = FALSE,
  show_call = FALSE
)

cens_get_raw(
  table,
  geo = NULL,
  ...,
  year = 2010,
  api = NULL,
  check_geo = FALSE,
  show_call = TRUE
)

Arguments

`table`	The table to download, either as a character vector or a table object as produced by `cens_find_dec()`, `cens_find_acs()` or `cens_parse_tables()`, or as included in `tables_dec` and `tables_acs`. Note: some tables are split into A/B/C/etc. versions by race; this function unifies all of these tables under one code. So, for example, use `P012`, not `P012A`.
`geo`	The geographic level to return. One of the machine-readable or human-readable names listed in the "Details" section of `cens_geo()`. Will return all matching geographies of this level, as filtered by the further arguments to `...`. For example, setting `geo="tract"` is equivalent to setting `tract="all"`.
`...`	Geographies to return, as supported by the Census API. Order matters here—the first argument will be the geographic level to return (i.e., it corresponds to the `geo` argument) and additional arguments will filter the results. Use `"all"`, `"*"`, `NA`, or `TRUE` to return all units of a particular geography. See the examples of `cens_geo()` for details.
`sumfile`	For decennial data, the summary file to use. SF2 contains more detailed race and household info.
`pop_group`	For decennial data using summary file SF2, the population group to filter to. See https://www2.census.gov/programs-surveys/decennial/2010/technical-documentation/complete-tech-docs/summary-file/sf2.pdf#page=347.
`check_geo`	If `TRUE`, validate the provided geographies against the available geographies from the relevant Census API.
`drop_total`	Whether to filter out variables which are totals across another variable. Recommended only after inspection of the underlying table.
`show_call`	Whether to show the actual call to the Census API. May be useful for debugging.
`year`	For ACS data, the survey year to get data for.
`survey`	For ACS data, whether to use the one-year or five-year survey (the default). Make sure to check availability using `cens_find_acs()`.
`api`	A Census API programmatic name such as `"acs/acs5"`.

Value

A tibble of census data in tidy format, with columns GEOID, NAME, variable (containing the Census variable code), value (or estimate, in the case of ACS tables), and additional factor columns specific to the table.

Functions

cens_get_dec(): Get decennial census data.
cens_get_acs(): Get American Community Survey (ACS) data.
cens_get_raw(): Get raw data from another Census Bureau API. Output will be minimally tidied but will likely require further manipulation.

Examples

## Not run: 
cens_get_dec("P3", "state")
cens_get_dec(tables_sf1$H2, "state")
cens_get_dec("H2", "county", state="WA", drop_total=TRUE)

cens_get_acs("B09001", county="King", state="WA")

## End(Not run)

Helper function to sum over nuisance variables

Description

For ACS data, margins of error will be updated appropriately, using the functionality in estimate().

Usage

cens_margin_to(data, ...)

Arguments

`data`	The output of `cens_get_dec()` or `cens_get_acs()`
`...`	The variables of interest, which will be kept. Remaining variables will be marginalized out.

Value

A new data frame that has had group_by() and summarize() applied.

Examples

## Not run: 
d_cens = cens_get_acs("B25042", "state")
cens_margin_to(d_cens, bedrooms)

## End(Not run)
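
To make the idea of marginalizing concrete without an API key, here is a minimal base R sketch on made-up data. It uses aggregate() as a stand-in for the group_by() and summarize() steps mentioned under "Value"; the column names and counts are invented for illustration, and the margin-of-error handling that applies to ACS data is not shown.

```r
# Toy long-format table; every value here is made up.
d <- data.frame(
    GEOID    = c("01", "01", "01", "01"),
    tenure   = c("owner", "owner", "renter", "renter"),
    bedrooms = c("1", "2", "1", "2"),
    value    = c(10, 20, 30, 40)
)

# Keep `bedrooms` (and the geography) and sum over the nuisance variable `tenure`
marginal <- aggregate(value ~ GEOID + bedrooms, data = d, FUN = sum)
marginal
```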

Attempt to Parse Tables from a Census API

Description

Uses the same parsing code as that which generates tables_sf1 and tables_acs. See https://www.census.gov/data/developers/data-sets.html for a list of APIs and corresponding years, or use censusapi::listCensusApis().

Usage

cens_parse_tables(api, year)

Arguments

`api`	A Census API programmatic name such as `"acs/acs5"`.
`year`	The year of the data.

Value

A list of cens_table objects, which are just lists with five elements:

concept, a human-readable name
tables, the constituent table codes
surveys, the supported surveys
dims, the parsed names of the dimensions of the tables
vars, a tibble with all of the parsed variable values

Examples

## Not run: 
cens_parse_tables("dec/pl", 2020)

## End(Not run)

Specialized margin-of-error calculations

Description

Proportions and percent-change-over-time calculations require different standard error calculations.

Usage

est_prop(x, y)

est_pct_chg(x, y)

Arguments

`x`, `y`	An estimate vector. For `est_pct_chg()`, calculates the percent change from `x` to `y`, i.e., $(y-x)/x$.

Value

An estimate vector.

Examples

x = estimate(1, 0.1)
y = estimate(1.5, 0.1)
est_prop(x, y)
est_pct_chg(x, y)
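
For reference, the sketch below implements the standard-error approximations that the Census Bureau publishes for derived ACS proportions and percent changes, in plain base R. Whether est_prop() and est_pct_chg() use exactly these formulas, and whether est_prop(x, y) means x/y, are assumptions; the function names `se_prop()` and `se_pct_chg()` are hypothetical.

```r
# SE of a proportion p = x / y, where x is a subset of y.
# If the quantity under the square root is negative, the Census Bureau
# guidance is to fall back to the ratio formula (plus instead of minus).
se_prop <- function(x, se_x, y, se_y) {
    p <- x / y
    radicand <- se_x^2 - p^2 * se_y^2
    radicand <- ifelse(radicand < 0, se_x^2 + p^2 * se_y^2, radicand)
    sqrt(radicand) / y
}

# SE of the percent change (y - x) / x = y / x - 1, via the ratio formula
se_pct_chg <- function(x, se_x, y, se_y) {
    r <- y / x
    sqrt(se_y^2 + r^2 * se_x^2) / x
}

se_prop(1, 0.1, 1.5, 0.1)
se_pct_chg(1, 0.1, 1.5, 0.1)
```

Both formulas differ from the one for a simple difference, which is why these quantities need dedicated functions.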

Estimate class

Description

A numeric vector that stores margin-of-error information along with it. The margin of error will update through basic arithmetic operations, using a first-order Taylor series approximation. The implicit assumption is that the errors in each value are uncorrelated. If in fact there is correlation, the margins of error could be wildly under- or over-estimated.

Usage

estimate(x, se = NULL, moe = NULL, conf = 0.9)

is_estimate(x)

as_estimate(x)

Arguments

`x`	A numeric vector containing the estimate(s).
`se`	A numeric vector containing the standard error(s) for the estimate(s). Users should supply either `se` or `moe` and `conf`.
`moe`	A numeric vector containing the margin(s) of error. Users should supply either `se` or `moe` and `conf`.
`conf`	The confidence level to use in converting the margin of error to a standard error. Defaults to 90%, which is what the Census Bureau uses for ACS estimates.

Value

An estimate vector.

Examples

estimate(5, 2) # 5 with std. error 2
estimate(15, moe=3) - estimate(5, moe=4)
estimate(1:4, 0.1) * estimate(1, 0.1)
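
The arithmetic behind the description can be sketched in base R. The helpers below are hypothetical and only illustrate the general technique: converting between a margin of error and a standard error with a Normal quantile, and first-order (delta-method) propagation for uncorrelated errors. The package's internals may differ in detail.

```r
# Convert between margin of error and standard error at a confidence level
moe_to_se <- function(moe, conf = 0.9) moe / qnorm(1 - (1 - conf) / 2)
se_to_moe <- function(se, conf = 0.9) se * qnorm(1 - (1 - conf) / 2)

# First-order propagation, assuming uncorrelated errors
se_sum_or_diff <- function(se_a, se_b) sqrt(se_a^2 + se_b^2)
se_product <- function(a, se_a, b, se_b) sqrt(b^2 * se_a^2 + a^2 * se_b^2)

# Margin of error of a difference whose terms have MOEs 3 and 4:
# sqrt(3^2 + 4^2) = 5, up to floating-point error
se_to_moe(se_sum_or_diff(moe_to_se(3), moe_to_se(4)))
```

Because the variances simply add, a shared error component between the two terms would make this result too large for a difference and too small for a sum, which is the correlation caveat noted in the description.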

Format an estimate

Description

Format an estimate for pretty printing.

Usage

## S3 method for class 'estimate'
format(x, conf = 0.9, digits = 2, trim = FALSE, ..., formatter = fmt_plain)

Arguments

`x`	An estimate vector
`conf`	The confidence level to use in converting the margin of error to a standard error. Defaults to 90%, which is what the Census Bureau uses for ACS estimates.
`digits`	The number of digits to display.
`trim`	logical; if `FALSE`, logical, numeric and complex values are right-justified to a common width: if `TRUE` the leading blanks for justification are suppressed.
`...`	Ignored.
`formatter`	the formatting function to use internally

Extract estimates, standard errors, and margins of error

Description

Getter functions for estimate() vectors.

to_rvar() converts an estimate vector to a posterior::rvar vector, a class that may be useful in handling standard errors for more complicated mathematical expressions. The conversion assumes a Normal distribution centered on the estimate, with standard deviation equal to the standard error of the estimate. The posterior package is required for this function.

Usage

get_est(x)

get_se(x)

get_moe(x, conf = 0.9)

to_rvar(x, n = 500)

Arguments

`x`	An estimate vector.
`conf`	The confidence level to use in constructing the margin of error.
`n`	How many samples to draw.

Value

An estimate vector.

For to_rvar(), a posterior::rvar vector.

Examples

x = estimate(1, 0.1)
get_est(x)
get_moe(x)

x = estimate(1, 0.1)
if (requireNamespace("posterior", quietly=TRUE)) {
    rv_x = to_rvar(x)
    (rv_x^2 / rv_x) - rv_x # std. errors zero (correct)
    x^2 / x - x # std. errors not zero
}
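
The reason a sample-based representation gets the example above right can be seen with plain Normal draws in base R. This is a sketch of the idea only, not of how posterior::rvar or to_rvar() is implemented; the seed and the variable names are arbitrary.

```r
set.seed(1)
est <- 1; se <- 0.1; n <- 500

# One set of draws stands in for the uncertain quantity
draws <- rnorm(n, mean = est, sd = se)

# The same draws are reused in every term, so the dependence between the
# terms is respected and the result is zero up to floating-point rounding.
result <- draws^2 / draws - draws
max(abs(result))

# Treating the three terms as independent quantities gives nonzero spread,
# which mirrors what first-order propagation on `x^2 / x - x` reports.
independent <- rnorm(n, est, se)^2 / rnorm(n, est, se) - rnorm(n, est, se)
sd(independent)
```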

Parsed Census SF1 and ACS Tables

Description

Contains parsed table information for the 2010 Decennial Summary File 1 and 2019 ACS 5-year and 1-year tables. This parsed information is used internally in cens_find_dec(), cens_find_acs(), cens_get_dec(), and cens_get_acs(). For other sets of tables, try using cens_parse_tables().

Usage

tables_sf1

tables_acs

Format

A list of cens_table objects, which are just lists with five elements:

concept, a human-readable name
tables, the constituent table codes
surveys, the supported surveys
dims, the parsed names of the dimensions of the tables
vars, a tibble with all of the parsed variable values

tables_sf1: an object of class list of length 83.

tables_acs: an object of class list of length 848.

Tidy labels in census tables

Description

Some table labels are quite verbose, and users will often want to shorten them. These functions make tidying common types of labels easy. Most produce straightforward output, but there are several more generic tidiers:

tidy_simplify() attempts to simplify labels by removing words common to all labels.
tidy_parens() attempts to simplify labels by removing all terms in parentheses.
tidy_race_detailed() creates logical columns for each of the six racial categories.

Usage

tidy_race(x)

tidy_race_detailed(x, x2, x3)

tidy_ethnicity(x)

tidy_age(x)

tidy_age_bins(x, as_factor = FALSE)

tidy_income_bins(x, as_factor = FALSE)

tidy_simplify(x)

tidy_parens(x)

Arguments

`x`	A factor, which will be re-leveled. Character vectors will be converted to factors.
`x2`, `x3`	Additional character columns containing detailed information for certain variables (e.g. detailed race)
`as_factor`	If `TRUE`, return a factor with levels of the form `[35,40]`.

Value

A re-leveled factor, except for tidy_age_bins(), which by default returns a data frame with columns age_from and age_to (inclusive).

Examples

ex_race_long = c("american indian and alaska native alone", "asian alone",
    "black or african american alone", "hispanic or latino",
    "native hawaiian and other pacific islander alone",
    "some other race alone", "total", "two or more races",
    "white alone", "white alone, not hispanic or latino")
tidy_race(ex_race_long)

tidy_age_bins(c("10 to 14 years", "21 years", "85 years and over"))

tidy_parens(c("label one (fake)", "label two (fake)"))
tidy_simplify(c("label one (fake)", "label two (fake)"))

## Not run:  # requires API key
d = cens_get_acs("B02003", "us", year=2019, survey="acs1")
dplyr::mutate(d, tidy_race_detailed(dtldr_1, dtldr_2, dtldr_3))

## End(Not run)
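
As an illustration of the kind of cleanup tidy_parens() is described as doing, the base R sketch below removes parenthesized terms with a regular expression. The helper `strip_parens()` is hypothetical; the package function re-levels a factor and may treat edge cases differently.

```r
# Hypothetical helper (not part of easycensus): drop every "(...)" group,
# together with any whitespace directly before it.
strip_parens <- function(x) {
    x <- gsub("\\s*\\([^)]*\\)", "", x)
    trimws(x)
}

strip_parens(c("label one (fake)", "label two (fake)"))
# returns "label one" "label two"
```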
