Align precision ranks using hybrid copula initialization and rank polishing

align_hybrid_corr() is the recommended default Layer 3 dependence injector. It combines a fast empirical-rank Gaussian-copula initialization with an optional hill-climb polish, pairing the smooth joint distribution of the copula path with the tight agreement between target and realized correlation that the hill-climb provides. The result is applied as an explicit permutation, so the empirical precision multiset is preserved exactly.

Usage

align_hybrid_corr(
  upstream,
  rank_corr = 0,
  init = c("copula", "rank"),
  polish = c("hill_climb", "none"),
  max_iter = 20000L,
  tol = 0.02,
  dependence_fn = NULL,
  ...
)

Arguments

upstream: A Layer 2 data frame with canonical columns site_index, z_j, tau_j, se_j, se2_j, n_j.
rank_corr: Numeric in [-1, 1]. Target residual Spearman correlation. Default 0. Typical applied values: 0.3 (moderate positive), -0.5 (selection-on-precision in meta-analysis).
init: Character. Initialization mode — "copula" (default, recommended) or "rank" (partial-sort start).
polish: Character. Polishing mode — "hill_climb" (default, recommended) or "none" (copula-only).
max_iter: Integer (\(\ge 100\)). Maximum swap proposals during the polish stage. Default 20000L.
tol: Numeric (> 0). Absolute tolerance for the realized residual Spearman. Default 0.02.
dependence_fn: Optional callback. See align_rank_corr for the contract.
...: Additional arguments forwarded to dependence_fn.

Value

The upstream tibble with paired precision columns permuted, plus attributes permutation_perm and dependence_diagnostics (with stage-specific entries: init_method, polish_method, init_realized, polish_realized).

Details

Two-stage algorithm. Stage 1 (initialization): by default, run align_copula_corr with pearson_corr chosen so that the implied Spearman correlation matches rank_corr. Alternatively, init = "rank" starts from a partial-sort permutation seeded by ordered indexing. Stage 2 (polish): by default, run a hill-climb on the Stage 1 output until the realized Spearman correlation lands within tol of rank_corr or max_iter swaps have been proposed. Setting polish = "none" skips Stage 2 and returns the Stage 1 result unchanged (a copula-only result under the default init).
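The Stage 1 idea can be sketched in base R. This is a minimal illustration, not the package's implementation: spearman_to_pearson() and copula_init_sketch() are hypothetical names, and the sketch assumes the standard bivariate-Gaussian relationship between Pearson and Spearman correlation, \(\rho_S = (6/\pi)\arcsin(\rho/2)\), inverted to \(\rho = 2\sin(\pi\rho_S/6)\).

```r
# Hypothetical helper (not part of the package): the Gaussian-copula Pearson
# correlation whose implied Spearman correlation equals rank_corr.
spearman_to_pearson <- function(rank_corr) 2 * sin(pi * rank_corr / 6)

# Hypothetical Stage 1 sketch: build a permutation of `se2` whose ranks
# follow a latent Gaussian score correlated with the ranks of `z`.
copula_init_sketch <- function(z, se2, rank_corr) {
  J <- length(z)
  rho <- spearman_to_pearson(rank_corr)
  # Normal scores of z computed from its empirical ranks.
  u <- qnorm((rank(z, ties.method = "first") - 0.5) / J)
  # One rnorm() draw per site gives the correlated latent score.
  v <- rho * u + sqrt(1 - rho^2) * rnorm(J)
  # The site with the k-th smallest latent score receives the k-th smallest
  # se2 value, so the se2 multiset is preserved by construction.
  order(se2)[rank(v, ties.method = "first")]
}

set.seed(1)
z <- rnorm(200)
se2 <- rexp(200)
perm <- copula_init_sketch(z, se2, rank_corr = 0.3)
cor(z, se2[perm], method = "spearman")  # near 0.3, up to finite-J slack
```

Because the output is a permutation index rather than a resampled vector, sort(se2[perm]) equals sort(se2) exactly; only the pairing with z changes.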

Why it's the default. The pure copula path (Stage 1 only) is fast but leaves finite-J slack in the realized Spearman correlation; the pure hill-climb is exact but slow to converge from an arbitrary starting permutation, especially as |rank_corr| approaches 1. The hybrid uses the copula to land in the right neighborhood and the hill-climb to close the remaining gap, so it typically converges in far fewer swaps than the pure rank (hill-climb only) path, with comparable target accuracy.
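The clean-up role of the hill-climb can be sketched as a greedy pair-swap search in base R. This is an illustration under stated assumptions, not the package's implementation: hill_climb_sketch() and its perm argument are hypothetical, and the acceptance rule (keep a swap only if it strictly shrinks the gap to the target) is one simple deterministic choice.

```r
# Hypothetical Stage 2 sketch: propose random pair swaps with sample.int()
# and keep a swap only when it moves the realized Spearman correlation
# strictly closer to rank_corr.
hill_climb_sketch <- function(z, se2, rank_corr, perm = seq_along(z),
                              tol = 0.02, max_iter = 20000L) {
  J <- length(z)
  gap <- function(p) abs(cor(z, se2[p], method = "spearman") - rank_corr)
  best <- gap(perm)
  proposals <- 0L
  while (best > tol && proposals < max_iter) {
    ij <- sample.int(J, 2L)      # two distinct sites
    cand <- perm
    cand[ij] <- perm[rev(ij)]    # swap their assigned se2 values
    g <- gap(cand)
    if (g < best) {              # deterministic acceptance
      perm <- cand
      best <- g
    }
    proposals <- proposals + 1L
  }
  list(perm = perm,
       realized = cor(z, se2[perm], method = "spearman"),
       proposals = proposals)
}

set.seed(42)
z <- rnorm(50)
se2 <- rexp(50)
out <- hill_climb_sketch(z, se2, rank_corr = 0.3)
out$proposals  # swaps proposed before reaching tol or max_iter
```

Passing a Stage 1 permutation as perm instead of the identity starts the search closer to the target, which is the reason the hybrid usually needs fewer proposals.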

Custom dependence_fn extensibility. A custom dependence_fn follows the same contract documented in align_rank_corr.

For the formal contrast among the three injection methods and a decision rubric, see the Precision dependence — three injection methods vignette.

RNG policy

Stage 1 (copula init) consumes one rnorm() draw per site at finite pearson_corr. Stage 2 (hill-climb polish) proposes swaps via sample.int() and accepts or rejects them deterministically, so its only RNG consumption is the sample.int() draws from the active RNG stream. Both stages run under the caller's seed when invoked through sim_multisite / sim_meta.
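The practical consequence is that a single seed fixes both stages, because rnorm() and sample.int() draw from the same R RNG stream. The base-R sketch below illustrates this with stand-ins for the two stages; run_once() is a hypothetical helper and does not call the package.

```r
# Hypothetical stand-in for one call: Stage 1 draws come from rnorm(),
# Stage 2 swap proposals come from sample.int(), both on the same stream.
run_once <- function(seed, J = 10L) {
  set.seed(seed)  # note: this overwrites the global RNG state
  latent <- rnorm(J)                            # one draw per site
  swaps <- replicate(5L, sample.int(J, 2L))     # five proposed pair swaps
  list(latent = latent, swaps = swaps)
}

# The same seed reproduces both the latent draws and the swap proposals.
identical(run_once(2024L), run_once(2024L))
#> [1] TRUE
```

When calling align_hybrid_corr() directly rather than through sim_multisite / sim_meta, setting the seed immediately before the call should give the same kind of reproducibility.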

References

Lee, J., Che, J., Rabe-Hesketh, S., Feller, A., & Miratrix, L. (2025). Improving the estimation of site-specific effects and their distribution in multisite trials. Journal of Educational and Behavioral Statistics, 50(5), 731–764. doi:10.3102/10769986241254286.

Examples

# Recommended default: hybrid copula + hill-climb polish.
effects <- gen_effects_gaussian(J = 12L)
margins <- gen_site_sizes(effects, J = 12L, nj_mean = 40, cv = 0.2)
aligned <- align_hybrid_corr(margins, rank_corr = 0.3, max_iter = 1000L)
cor(aligned$z_j, aligned$se2_j, method = "spearman")
#> [1] 0.2982474

# Copula-only (skip the polish stage).
copula_only <- align_hybrid_corr(margins, rank_corr = 0.3, polish = "none")