Generate two-component Gaussian mixture latent site effects
Source: R/layer1-gen_effects_mixture.R
gen_effects_mixture.Rd
Draw J standardized site effects from a two-component Gaussian mixture (the legacy JEBS parameterization) and apply the shared Layer 1 location-scale wrapper to produce \(\tau_j = \tau + X_j\boldsymbol{\beta} + \sigma_\tau\,z_j\). Reach for the mixture when you expect bimodal effects, contamination, or a subgroup of "outlier" sites whose effect distribution differs from the bulk.
Usage
gen_effects_mixture(
J,
tau = 0,
sigma_tau = 0.2,
delta,
eps,
ups,
formula = NULL,
beta = NULL,
data = NULL
)
Arguments
- J
Integer. Number of sites.
- tau
Numeric. Grand mean on the response scale. Default 0.
- sigma_tau
Numeric (\(\ge 0\)). Between-site standard deviation on the response scale. Default 0.20.
- delta
Numeric (> 0). Component separation. Required; no default. Larger values produce more bimodal mixtures. Typical applied values: delta = 2 (mild bimodality), delta = 5 (clearly bimodal; the JEBS fixture).
- eps
Numeric in (0, 1). Component-2 mixing weight. Required. eps = 0.3 puts 30% of sites in component 2; eps = 0.5 is balanced.
- ups
Numeric (> 0). SD ratio \(\sigma_2 / \sigma_1\). Required. ups = 1 gives equal-spread components; ups = 2 gives a wider second component (the JEBS fixture).
- formula
One-sided formula for site-level covariates, or NULL.
- beta
Numeric coefficient vector matching formula, or NULL.
- data
A data.frame with the predictors named in formula, or NULL.
Value
A tibble with one row per site and columns site_index (integer
1:J), z_j (unit-variance mixture residual), tau_j (response-scale
effect), latent_component (integer 1 or 2 — which component each draw
came from), plus any covariate columns from data.
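The relationship between z_j and tau_j in the returned tibble can be sketched in base R. This is a minimal illustration of the location-scale formula only, not the package's internal wrapper: the helper name apply_location_scale is hypothetical, and dropping the intercept column from the design matrix is an assumption about how formula and beta line up.

```r
# Hypothetical helper illustrating tau_j = tau + X_j beta + sigma_tau * z_j.
# Assumption: covariates enter without an intercept column; the package's
# own wrapper may handle the design matrix differently.
apply_location_scale <- function(z, tau = 0, sigma_tau = 0.2,
                                 formula = NULL, beta = NULL, data = NULL) {
  shift <- 0
  if (!is.null(formula)) {
    X <- model.matrix(formula, data = data)
    X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
    stopifnot(length(beta) == ncol(X))
    shift <- drop(X %*% beta)
  }
  tau + shift + sigma_tau * z
}

# No covariates and tau = 0: tau_j is simply sigma_tau * z_j.
apply_location_scale(z = c(-0.0811, 1.84), sigma_tau = 0.15)

# One covariate shifting the site means (illustrative values).
sites <- data.frame(x = c(0, 1, 2))
apply_location_scale(z = c(0, 0, 0), tau = 0.1,
                     formula = ~ x, beta = 0.05, data = sites)
```

With no covariates, the first call reproduces, up to rounding, the z_j to tau_j mapping visible in the first two rows of the second example under Examples.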
Details
The mixture model is parameterized so that, before standardization,
component 1 has mean \(-\epsilon\delta\) and SD 1 and component 2 has
mean \((1 - \epsilon)\delta\) and SD ups, with mixing weight
\((1 - \epsilon)\) on component 1 and \(\epsilon\) on component 2.
This guarantees that the mixture's overall expectation is zero. The total variance
before standardization is
\((1 - \epsilon) + \epsilon\,\mathrm{ups}^2 + \epsilon(1 - \epsilon)\delta^2\);
the package divides each draw by the square root of that variance to
produce unit-variance standardized residuals \(z_j\).
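The zero-mean and total-variance claims follow from the laws of total expectation and total variance. As a short check, write \(u_j\) for the pre-standardization draw and \(c_j\) for its component label (both symbols are introduced here for illustration only):

```latex
\begin{aligned}
\mathbb{E}[u_j]
  &= (1-\epsilon)(-\epsilon\delta) + \epsilon\,(1-\epsilon)\delta = 0, \\
\operatorname{Var}(u_j)
  &= \mathbb{E}\!\left[\operatorname{Var}(u_j \mid c_j)\right]
   + \operatorname{Var}\!\left(\mathbb{E}[u_j \mid c_j]\right) \\
  &= (1-\epsilon)\cdot 1 + \epsilon\,\mathrm{ups}^2
   + (1-\epsilon)(\epsilon\delta)^2 + \epsilon\,\bigl((1-\epsilon)\delta\bigr)^2 \\
  &= (1-\epsilon) + \epsilon\,\mathrm{ups}^2 + \epsilon(1-\epsilon)\delta^2 .
\end{aligned}
```

For the JEBS fixture values (delta = 5, eps = 0.3, ups = 2) this gives \(0.7 + 1.2 + 5.25 = 7.15\), so each draw is divided by \(\sqrt{7.15} \approx 2.674\).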
This is the parameterization used in the JEBS paper's mixture-shape
fixtures; the parameter names (delta, eps, ups) match the JEBS
notation. Because the function is locked to that parameterization, the
returned tibble carries an extra column latent_component (integer 1 or 2)
recording which component each draw came from. The column is useful for
diagnostics and for matching realized draws against intended group memberships.
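The sampling scheme and the kind of diagnostic that component labels allow can be sketched in base R. This is a minimal sketch under the parameterization described above, not the package's implementation; the helper name draw_mixture_z is hypothetical, and it returns a plain data.frame rather than a tibble.

```r
# Hypothetical helper, not the package's code: draw J standardized residuals
# from the two-component mixture described in Details, keeping the labels.
draw_mixture_z <- function(J, delta, eps, ups) {
  stopifnot(J >= 1, delta > 0, eps > 0, eps < 1, ups > 0)
  # Component membership: 2 with probability eps, otherwise 1.
  comp <- ifelse(runif(J) < eps, 2L, 1L)
  # Pre-standardization component means and SDs.
  mu <- c(-eps * delta, (1 - eps) * delta)
  sdv <- c(1, ups)
  raw <- rnorm(J, mean = mu[comp], sd = sdv[comp])
  # Total variance before standardization.
  total_var <- (1 - eps) + eps * ups^2 + eps * (1 - eps) * delta^2
  data.frame(
    site_index = seq_len(J),
    z_j = raw / sqrt(total_var),
    latent_component = comp
  )
}

set.seed(1)
sim <- draw_mixture_z(J = 1e5, delta = 5, eps = 0.3, ups = 2)

# Diagnostics that rely on the component labels.
prop.table(table(sim$latent_component))      # shares should be near 1 - eps and eps
tapply(sim$z_j, sim$latent_component, mean)  # standardized component means
c(mean = mean(sim$z_j), var = var(sim$z_j))  # should be near 0 and 1
```

A large J is used here only so that the realized shares and moments sit close to their theoretical values; at J = 50 the component counts vary noticeably from draw to draw.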
For the broader catalog and decision rubric, see the G-distribution catalog and standardization vignette.
References
Lee, J., Che, J., Rabe-Hesketh, S., Feller, A., & Miratrix, L. (2025). Improving the estimation of site-specific effects and their distribution in multisite trials. Journal of Educational and Behavioral Statistics, 50(5), 731–764. doi:10.3102/10769986241254286.
See also
gen_effects for the dispatcher and the full eight-shape catalog;
gen_effects_pmslab for a related null-spike-plus-slab shape;
gen_effects_gaussian for the unimodal baseline;
the M2 G-distribution catalog vignette.
Other family-effects:
gen_effects(),
gen_effects_ald(),
gen_effects_dpm(),
gen_effects_gaussian(),
gen_effects_pmslab(),
gen_effects_skewn(),
gen_effects_studentt(),
gen_effects_user()
Examples
# JEBS fixture: clearly bimodal, 30% in the wider second component.
mix <- gen_effects_mixture(J = 50L, delta = 5, eps = 0.3, ups = 2)
table(mix$latent_component) # ~ 35 / 15 split
#>
#> 1 2
#> 33 17
# Mild bimodality with equal-spread components.
gen_effects_mixture(J = 50L, delta = 2, eps = 0.5, ups = 1, sigma_tau = 0.15)
#> # A tibble: 50 × 4
#> site_index z_j tau_j latent_component
#> <int> <dbl> <dbl> <int>
#> 1 1 -0.0811 -0.0122 2
#> 2 2 1.84 0.276 2
#> 3 3 -0.768 -0.115 1
#> 4 4 1.81 0.272 2
#> 5 5 -0.318 -0.0476 2
#> 6 6 -0.132 -0.0198 1
#> 7 7 0.350 0.0526 2
#> 8 8 0.00846 0.00127 1
#> 9 9 1.39 0.209 2
#> 10 10 -0.414 -0.0620 1
#> # ℹ 40 more rows