Audit a grid of multisite design cells against quality-gate thresholds
Source: R/scenario_audit.R
scenario_audit.Rd
Run M simulation replicates of each row of a design_grid
and summarize whether the cell clears the public quality-gate
thresholds. This is the top-of-funnel diagnostic of Dr. Chen's four-question
rubric: run it before committing a scenario grid to a long
simulation, so designs that cannot meet your minimum requirements
(feasibility, dependence target, distributional fit) are caught early.
Usage
scenario_audit(
  grid,
  M = 200L,
  thresholds = default_thresholds(),
  parallel = FALSE
)
Arguments
- grid
  A multisitedgp_design_grid object from design_grid.
- M
  Integer (\(\ge 1\)). Number of simulation replicates per cell. Default 200L. Use M = 50L for a fast pre-check; raise to 200L+ for stable across-replicate medians.
- thresholds
  Named list of quality-gate thresholds. Missing entries are filled from default_thresholds.
- parallel
  Logical. If TRUE, use furrr::future_map_dfr() with the caller's active future plan. Default FALSE.
Value
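A tibble with one row per design-grid cell. Columns include the cell's design parameters, diagnostics aggregated across replicates (medians and 5%/95% quantiles, as in the med_*, q05_*, and q95_* columns of the example output), and pass/fail flags such as status, pass, and n_violations.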
Details
For each cell of grid, scenario_audit() runs M replicates of
sim_multisite (or sim_meta, depending on
the cell's paradigm), computes the standard Group A/B/C/D
diagnostics on each replicate, and aggregates pass/fail status against
thresholds (defaulting to default_thresholds).
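The per-cell loop described above can be pictured with a minimal base-R sketch. It is not the package's implementation: toy_replicate(), the single diagnostic it returns, and the 0.5 gate are hypothetical stand-ins for sim_multisite, the Group A/B/C/D diagnostics, and default_thresholds.

```r
# Hypothetical stand-in for one simulation replicate of a cell. It returns a
# single toy diagnostic: true-effect variance over the variance of the
# noisy site estimates.
toy_replicate <- function(J, sigma_tau, se = 0.15) {
  tau     <- rnorm(J, mean = 0, sd = sigma_tau)  # true site effects
  tau_hat <- tau + rnorm(J, mean = 0, sd = se)   # noisy site estimates
  var(tau) / var(tau_hat)
}

# Run M replicates for one cell, then aggregate across replicates.
audit_cell <- function(J, sigma_tau, M, min_median = 0.5) {
  d <- vapply(seq_len(M), function(m) toy_replicate(J, sigma_tau), numeric(1))
  data.frame(
    J = J, sigma_tau = sigma_tau, M = M,
    med_diag = median(d),
    q05_diag = unname(quantile(d, 0.05)),
    pass     = median(d) >= min_median  # assumed gate, for illustration only
  )
}

set.seed(1)
cells <- expand.grid(J = 10L, sigma_tau = c(0.10, 0.20))
audit <- do.call(rbind, Map(audit_cell, cells$J, cells$sigma_tau, M = 50L))
audit
```

The result has one row per cell, mirroring the shape of the returned tibble. The real function aggregates many diagnostics per replicate, not one.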
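Parallel evaluation. Set parallel = TRUE to dispatch cells
through furrr::future_map_dfr() under the caller's active future
plan, so set the plan (for example future::plan(future::multisession))
before calling. The furrr package is a Suggests dependency and must be
installed for this path.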
Recommended workflow. (1) Build the grid with
design_grid; (2) run scenario_audit(grid, M = 50L) for
a fast pre-check; (3) drop or revise cells that fail the gates; (4)
run the long-form simulation only on the surviving cells.
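Step (3) of the workflow can be sketched in base R. The data frame below is a mock that borrows column names from the Examples output (cell_id, status, pass, n_violations). Its values are invented for illustration, and the "PASS" status label is an assumption, since only "FAIL" appears in the documented output.

```r
# Mock audit result using column names from the Examples output.
# Values are invented for illustration, not produced by scenario_audit().
audit <- data.frame(
  cell_id      = 1:3,
  J            = c(10L, 25L, 50L),
  sigma_tau    = c(0.10, 0.20, 0.20),
  status       = c("FAIL", "PASS", "PASS"),  # "PASS" label is assumed
  pass         = c(FALSE, TRUE, TRUE),
  n_violations = c(4L, 0L, 0L)
)

# Keep only the cells that cleared every gate.
surviving <- audit[audit$pass, c("cell_id", "J", "sigma_tau")]

# Inspect the failures, worst first, before dropping or revising them.
failed <- audit[!audit$pass, ]
failed[order(-failed$n_violations), c("cell_id", "status", "n_violations")]
```

How the surviving design parameters are passed back into design_grid for the long-form run is not shown here, since that depends on the grid's own arguments.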
For a worked example see the Case study — multisite trial vignette.
References
Lee, J., Che, J., Rabe-Hesketh, S., Feller, A., & Miratrix, L. (2025). Improving the estimation of site-specific effects and their distribution in multisite trials. Journal of Educational and Behavioral Statistics, 50(5), 731–764. doi:10.3102/10769986241254286
See also
design_grid for building the input scenario grid;
feasibility_index for the per-replicate Group A scalar;
default_thresholds for the default quality gates;
sim_multisite and sim_meta for the
per-replicate simulators called inside;
the A3 Diagnostics in practice vignette.
Other family-diagnostics:
bhattacharyya_coef(),
compute_I(),
compute_kappa(),
compute_shrinkage(),
default_thresholds(),
feasibility_index(),
heterogeneity_ratio(),
informativeness(),
ks_distance(),
mean_shrinkage(),
realized_rank_corr(),
realized_rank_corr_marginal()
Examples
# Fast pre-check on a small grid.
grid <- design_grid(J = 10L, sigma_tau = c(0.10, 0.20), seed_root = 1L)
scenario_audit(grid, M = 1L)
#> # A tibble: 2 × 25
#> cell_id J sigma_tau M seed_root design_hash status pass n_violations
#> <int> <int> <dbl> <int> <int> <chr> <chr> <lgl> <int>
#> 1 1 10 0.1 1 1140350788 8e962b43f3… FAIL FALSE 4
#> 2 2 10 0.2 1 312928385 18d7bb27f3… FAIL FALSE 3
#> # ℹ 16 more variables: fail_reasons <chr>, warn_reasons <chr>, med_I_hat <dbl>,
#> # q05_I_hat <dbl>, q95_I_hat <dbl>, med_R_hat <dbl>, q95_R_hat <dbl>,
#> # med_mean_shrinkage <dbl>, q05_mean_shrinkage <dbl>,
#> # med_feasibility_efron <dbl>, q05_feasibility_efron <dbl>,
#> # med_feasibility_morris <dbl>, med_bhattacharyya <dbl>,
#> # q05_bhattacharyya <dbl>, med_ks <dbl>, q95_ks <dbl>
if (FALSE) { # \dontrun{
# Production audit — many replicates, parallel.
future::plan(future::multisession, workers = 4)
scenario_audit(grid, M = 200L, parallel = TRUE)
} # }