Align precision ranks to a target Spearman correlation
Source: R/layer3-align_rank_corr.R
align_rank_corr.Rd
Permute the paired precision columns (se_j, se2_j, n_j) of a Layer 2
frame so the realized Spearman correlation between the standardized
residual effect \(z_j\) and the sampling variance \(\widehat{se}_j^2\)
approximates a target value rank_corr. The permutation preserves both
marginals exactly: only the assignment of precision values to sites
changes. The function also attaches an attr(., "permutation_perm")
integer vector recording the realized assignment.
Usage
align_rank_corr(
upstream,
rank_corr = 0,
max_iter = 20000L,
tol = 0.02,
dependence_fn = NULL,
...
)
Arguments
- upstream
A Layer 2 data frame with canonical columns site_index, z_j, tau_j, se_j, se2_j, n_j (typically the output of gen_site_sizes or gen_se_direct).
- rank_corr
Numeric in [-1, 1]. Target Spearman correlation between z_j and se2_j. Default 0 (independence). Typical applied values: 0.3 (moderate positive: small sites tend to have larger effects), -0.5 (moderate negative: selection-on-effect-size in meta-analysis). |rank_corr| > 0.95 triggers a near-boundary warning.
- max_iter
Integer (\(\ge 100\)). Maximum number of swap proposals. Default 20000L.
- tol
Numeric (> 0). Absolute tolerance for the realized residual Spearman correlation. Default 0.02.
- dependence_fn
Optional callback. See Details for the contract.
- ...
Additional arguments forwarded to dependence_fn.
Value
The upstream tibble with the paired precision columns
(se_j, se2_j, n_j) permuted to approximate the target. Two
attributes are attached: permutation_perm (length-J integer
vector — the realized site-to-grid permutation) and
dependence_diagnostics (a named list with method, target,
realized_residual, realized_marginal, iterations, converged).
Details
Hill-climb algorithm. The function starts from the upstream
(Layer 2) ordering of (se2_j, se_j, n_j) and proposes random pair
swaps. A swap is accepted if it brings the realized residual Spearman
correlation closer to rank_corr; otherwise it is rejected. The search
continues until the realized correlation lands within tol of the
target or until max_iter swaps have been proposed. The result is the
best-effort permutation found within budget. For moderate targets
(|rank_corr| <= 0.7) and the default max_iter = 20000L, the
algorithm converges reliably; as |rank_corr| approaches 1, the
search may stall before convergence.
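The swap-and-accept loop described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation: hill_climb_sketch and its return fields are hypothetical names, and it uses the plain Spearman correlation between z and se2 as the quantity being steered.

```r
# Hypothetical helper illustrating the hill-climb idea; not the package code.
hill_climb_sketch <- function(z, se2, target, max_iter = 20000L, tol = 0.02) {
  J <- length(z)
  perm <- seq_len(J)                          # start from the upstream ordering
  current <- cor(z, se2[perm], method = "spearman")
  iter <- 0L
  while (abs(current - target) > tol && iter < max_iter) {
    iter <- iter + 1L
    ij <- sample.int(J, 2L)                   # propose a random pair swap
    cand <- perm
    cand[ij] <- perm[rev(ij)]                 # exchange the two assignments
    proposed <- cor(z, se2[cand], method = "spearman")
    # Accept only if the swap moves the realized correlation closer to target.
    if (abs(proposed - target) < abs(current - target)) {
      perm <- cand
      current <- proposed
    }
  }
  list(perm = perm, realized = current, iterations = iter,
       converged = abs(current - target) <= tol)
}

set.seed(1)
z <- rnorm(50)      # stand-in for z_j
se2 <- rexp(50)     # stand-in for se2_j
fit <- hill_climb_sketch(z, se2, target = 0.3)
fit$realized
fit$converged
```

Because rejected swaps leave the state untouched, the distance to the target never increases, which is why the result is a best-effort permutation when the budget runs out.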
Why "exact marginal preservation" matters. Because the
algorithm only reorders the existing values, the marginals of se2_j
and \(z_j\) are bit-identical to their Layer 2 inputs. Diagnostics
computed on the marginals (informativeness, heterogeneity ratio,
shrinkage) are unchanged by Layer 3; only the joint distribution and
rank correlation move. If you need to introduce dependence by
shifting marginals, use align_copula_corr (Gaussian
copula) or align_hybrid_corr (the recommended default,
which combines both).
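A small base R check makes the marginal-preservation point concrete. It is a sketch on simulated stand-ins for z_j and se2_j, not on package output: reordering a vector leaves its sorted values (and so its order-based summaries such as quantiles) bit-identical, while the rank correlation with another vector generally changes.

```r
set.seed(42)
z <- rnorm(12)             # stand-in for z_j
se2 <- rexp(12)            # stand-in for se2_j
perm <- sample.int(12L)    # any permutation of the sites
se2_perm <- se2[perm]

# The marginal is untouched: same multiset of values, bit for bit.
same_values <- identical(sort(se2_perm), sort(se2))
same_quantiles <- identical(quantile(se2_perm), quantile(se2))

# Only the pairing with z, and hence the rank correlation, moves.
rho_before <- cor(z, se2, method = "spearman")
rho_after <- cor(z, se2_perm, method = "spearman")
c(before = rho_before, after = rho_after)
```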
Custom dependence_fn extensibility. Pass a dependence_fn
callback to bypass the built-in hill-climb. The callback receives
z_j, se2_j, target, and ..., and must return
list(se2_j = <length-J numeric>, perm = <length-J integer>). The
returned se2_j must be a permutation of the upstream se2_j
multiset. The callback owns its own RNG.
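As an illustration of the contract, here is a minimal sketch of a callback with the documented return shape. The function name, the seed argument, and the noisy-score heuristic are all hypothetical; the sketch also assumes that perm indexes the upstream se2_j so that the returned se2_j equals se2_j[perm], which the text above does not spell out.

```r
# Hypothetical callback: ranks a noisy score built from z_j and hands out the
# sorted se2_j values in that order. It only roughly tracks `target`.
noisy_rank_dependence_fn <- function(z_j, se2_j, target, seed = NULL, ...) {
  if (!is.null(seed)) set.seed(seed)   # the callback manages its own RNG
  J <- length(z_j)
  score <- target * as.numeric(scale(z_j)) + sqrt(1 - target^2) * rnorm(J)
  # Site with the r-th smallest score receives the r-th smallest se2_j.
  perm <- order(se2_j)[rank(score, ties.method = "first")]
  list(se2_j = se2_j[perm], perm = as.integer(perm))
}

set.seed(7)
z <- rnorm(30)
se2 <- rexp(30)
out <- noisy_rank_dependence_fn(z, se2, target = 0.5, seed = 1)
cor(z, out$se2_j, method = "spearman")

# Assumed usage, given the signature documented above:
# align_rank_corr(margins, rank_corr = 0.5,
#                 dependence_fn = noisy_rank_dependence_fn, seed = 1)
```

Note that calling set.seed() inside the callback resets the global RNG stream as a side effect; a callback that must not disturb the caller's stream would need to save and restore the RNG state itself.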
For the formal contrast among the three injection methods (rank, copula, hybrid) and a decision rubric on when to choose each, see the Precision dependence — three injection methods vignette.
RNG policy
The built-in hill-climb proposes swaps via sample.int() and accepts or
rejects each one deterministically based on the resulting correlation.
It draws from R's active global RNG stream, so when a seed is supplied
through a wrapper (sim_multisite() / sim_meta() with seed), repeated
runs are bit-identical. Custom dependence_fn callbacks own their own RNG.
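The reproducibility claim follows from how sample.int() consumes the global RNG stream. The sketch below uses a hypothetical propose_swaps helper (not a package function) to show that the same seed yields the same sequence of proposed pairs.

```r
# Hypothetical helper: draw n swap proposals (one pair per row) for J sites.
propose_swaps <- function(J, n) {
  t(replicate(n, sample.int(J, 2L)))
}

set.seed(2025)
a <- propose_swaps(12L, 5L)
set.seed(2025)
b <- propose_swaps(12L, 5L)
identical(a, b)   # same seed, same proposals

# Without re-seeding, the next call continues the stream instead of repeating it.
next_draws <- propose_swaps(12L, 5L)
```

Since acceptance depends only on the proposed pair and the current state, identical proposals imply an identical final permutation.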
References
Lee, J., Che, J., Rabe-Hesketh, S., Feller, A., & Miratrix, L. (2025). Improving the estimation of site-specific effects and their distribution in multisite trials. Journal of Educational and Behavioral Statistics, 50(5), 731–764. doi:10.3102/10769986241254286
See also
align_copula_corr for the Gaussian-copula alternative;
align_hybrid_corr for the recommended default that
combines copula initialization with hill-climb polish;
realized_rank_corr for the realized Spearman after
alignment;
sim_multisite for the wrapper that calls this in the
four-layer pipeline;
the M4
Precision dependence theory vignette.
Other family-dependence:
align_copula_corr(),
align_hybrid_corr()
Examples
# Compose Layer 1 + 2 + 3 manually with a moderate positive target.
effects <- gen_effects_gaussian(J = 12L)
margins <- gen_site_sizes(effects, J = 12L, nj_mean = 40, cv = 0.2)
aligned <- align_rank_corr(margins, rank_corr = 0.3, max_iter = 1000L)
cor(aligned$z_j, aligned$se2_j, method = "spearman")
#> [1] 0.3017562
# Negative target — common in meta-analytic selection-on-precision scenarios.
aligned_neg <- align_rank_corr(margins, rank_corr = -0.5)
attr(aligned_neg, "dependence_diagnostics")
#> $method
#> [1] "rank"
#>
#> $target_type
#> [1] "residual_spearman"
#>
#> $target
#> [1] -0.5
#>
#> $achieved
#> [1] -0.5193014
#>
#> $residual
#> [1] -0.01930144
#>
#> $converged
#> [1] TRUE
#>
#> $iterations
#> [1] 1
#>
#> $tol
#> [1] 0.02
#>