Skip to contents

Compute the across-site mean of the empirical-Bayes shrinkage, \(\bar{S} = J^{-1} \sum_{j=1}^{J} \sigma_\tau^2 / (\sigma_\tau^2 + \widehat{se}_j^2)\), from a simulated dataset, a bare \(\widehat{se}_j^2\) vector, or a closed-form approximation that uses only design parameters. Group D diagnostic (downstream shrinkage — Dr. Chen's question 4 on average pooling toward the overall mean).

Usage

mean_shrinkage(x = NULL, ...)

Arguments

x

Optional. A multisitedgp_data object (data method), a positive numeric se2_j vector (numeric method), or NULL to take the closed-form path. Default NULL.

...

Method-specific named arguments. The numeric method and the closed-form path require sigma_tau (numeric \(\ge 0\), between-site SD; auto-read for the data method). The closed-form path also accepts nj_mean (numeric \(>\) 0, mean per-site sample size used in \(\widehat{se}^2 = \kappa / \bar{n}\)); varY (numeric \(>\) 0, outcome variance fed into compute_kappa; default 1); p (numeric in (0, 1), treatment-assignment proportion; default 0.5); and R2 (numeric in [0, 1), site-level covariate-explained variance share; default 0). All other ... are forwarded to the chosen method.

Value

A scalar double in [0, 1) for the data and numeric methods; a numeric vector (recycled over the closed-form arguments) for the closed-form path.

Details

\(\bar{S}\) is the natural single-number summary of how much partial-pooling work a hierarchical model will do across the design. Values near 0 indicate the prior dominates almost everywhere; values near 1 indicate near-no pooling. The canonical quality gate is mean_shrinkage_min = 0.30 (see default_thresholds).

Method dispatch. mean_shrinkage() resolves to one of three paths:

  • Data method (multisitedgp_data input): reads sigma_tau from attr(x, "design")$sigma_tau and returns mean(compute_shrinkage(x$se2_j, sigma_tau)).

  • Numeric method (numeric se2_j vector): requires sigma_tau explicitly and returns mean(compute_shrinkage(x, sigma_tau)).

  • Closed-form path (x = NULL, named arguments only): approximates \(\widehat{se}^2 \approx \kappa(p, R^2, \mathrm{Var}(Y)) / \bar{n}\) and returns \(\sigma_\tau^2 / (\sigma_\tau^2 + \widehat{se}^2)\). Use this at the design stage to plan \(\bar{S}\) before any simulation has been drawn — it answers "at this heterogeneity scale and average site size, what shrinkage should I expect?"

The closed-form path vectorizes over its arguments via R recycling, so passing a vector nj_mean produces a vector \(\bar{S}\). The two empirical paths return scalars.

For the design-stage planning workflow see the A3 · Diagnostics in practice vignette.

References

Lee, J., Che, J., Rabe-Hesketh, S., Feller, A., & Miratrix, L. (2025). Improving the estimation of site-specific effects and their distribution in multisite trials. Journal of Educational and Behavioral Statistics, 50(5), 731–764. doi:10.3102/10769986241254286 .

See also

compute_shrinkage for the per-site \(S_j\) kernel; compute_kappa for the closed-form precision constant \(\kappa\); informativeness and compute_I for the Group A counterpart; feasibility_index for the additive Efron / Morris forms; default_thresholds for the mean_shrinkage_min gate; scenario_audit for the audit pipeline; the A3 · Diagnostics in practice vignette.

Other family-diagnostics: bhattacharyya_coef(), compute_I(), compute_kappa(), compute_shrinkage(), default_thresholds(), feasibility_index(), heterogeneity_ratio(), informativeness(), ks_distance(), realized_rank_corr(), realized_rank_corr_marginal(), scenario_audit()

Examples

# Data method: sigma_tau is read from the attached design.
dat <- sim_multisite(J = 10L, seed = 1L)
mean_shrinkage(dat)
#> [1] 0.3453587

# Numeric method: explicit sigma_tau on a bare se2_j vector.
mean_shrinkage(dat$se2_j, sigma_tau = 0.20)
#> [1] 0.3453587

# Closed-form (design-stage) path: no simulated data needed.
mean_shrinkage(nj_mean = 50, sigma_tau = 0.20)
#> [1] 0.3333333

# Closed-form sweep: how does mean shrinkage vary with average site size?
mean_shrinkage(nj_mean = c(25, 50, 100, 200), sigma_tau = 0.20)
#> [1] 0.2000000 0.3333333 0.5000000 0.6666667