output |
---|
github_document |
You can install the development version of dbmm from GitHub with:
# install.packages("devtools")
devtools::install_github("devincaughey/dbmm")
You will also need to install cmdstanr (for instructions, see here).
The R package dbmm fits dynamic Bayesian measurement models using the programming language Stan via the R package cmdstanr. Currently, the only supported model is a dynamic factor (DF) model for indicators of mixed type (binary, ordinal, or metric). In the future, however, the package will incorporate other models, including the dynamic group-level item response theory (DGIRT) model currently implemented by the R package dgo.
Using dbmm involves the following steps:
- Shape the data into the list format required by cmdstanr.
- Fit the required model in Stan using.
- Extract parameter draws from the fitted model.
- If needed, identify the model the model by rotating and/or sign-flipping the draws.
- Check the convergence diagnostics of the fitted model.
- Label the draws with informative parameter names.
- Summarize and plot the posterior distributions.
## Load data on societal attributes of U.S. states in 2020 and 2021
data("social_outcomes_2020_2021")
## Drop observations with missing values
social_outcomes_2020_2021 <- na.omit(social_outcomes_2020_2021)
## Shape the data into list form
shaped_data <- shape_data(
long_data = social_outcomes_2020_2021,
unit_var = "st",
time_var = "year",
item_var = "outcome",
value_var = "value",
periods_to_estimate = 2020:2021,
ordinal_items = NA,
binary_items = NA,
max_cats = 10,
standardize = TRUE,
make_indicator_for_zeros = TRUE
)
You can specify additional arguments for cmdstanr::sample()
. For details, see here.
options(mc.cores = parallel::detectCores()) # for parallizing across chains
fitted <- fit(
data = shaped_data,
n_dim = 2,
chains = 2,
parallelize_within_chains = FALSE,
constant_alpha = FALSE,
separate_eta = TRUE,
init_kappa = FALSE,
force_recompile = FALSE,
iter_warmup = 500,
iter_sampling = 500,
adapt_delta = .9,
refresh = 10,
seed = 123
)
fitted_draws <- extract_draws(fitted)
head(fitted_draws)
identified_draws <- identify_draws(fitted_draws, rotate = TRUE)
## (To apply varimax rotation, set `rotate = TRUE`.)
## Basic check
check_convergence(identified_draws$id_draws)
## Traceplot of selected parameters
bayesplot::mcmc_trace(identified_draws$id_draws, pars = "lambda_metric[23,2]")
## More details
summarized_draws <- summary(identified_draws$id_draws)
summary(summarized_draws)
labeled_draws <- label_draws(identified_draws)
head(labeled_draws$eta)
head(labeled_draws$lambda_metric)
library(tidyverse)
## Posterior mean and standard deviation of the factor scores and item loadings
eta_summ <- labeled_draws$eta %>%
summarise(
est = mean(value),
err = sd(value),
.by = c(TIME, UNIT, dim)
)
head(eta_summ)
lambda_metric_summ <- labeled_draws$lambda_metric %>%
summarise(
est = mean(value),
err = sd(value),
.by = c(ITEM, dim)
)
head(lambda_metric_summ)
## Plot item loadings
lambda_metric_summ %>%
pivot_wider(
id_cols = "ITEM",
names_from = "dim",
values_from = c("est", "err")
) %>%
ggplot() +
aes(x = est_1, y = est_2, label = ITEM) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_hline(yintercept = 0, linetype = "dotted") +
geom_point() +
geom_linerange(
aes(xmin = est_1 - 1.96*err_1, xmax = est_1 + 1.96*err_1),
alpha = 1/4, linewidth = 2
) +
geom_linerange(
aes(ymin = est_2 - 1.96*err_2, ymax = est_2 + 1.96*err_2),
alpha = 1/4, linewidth = 2
) +
ggrepel::geom_text_repel() +
labs(title = "Item Loadings") +
coord_fixed()