eqc_calibrate() implements Algorithm 1 (Empirical / Stochastic
Quadrature Calibration, EQC/SQC) for reliability-targeted IRT simulation.
Given a target marginal reliability \(\rho^*\), a latent distribution
generator sim_latentG() (for \(G\)) and an item parameter generator
sim_item_params() (for \(H\)), the function searches for a global
discrimination scale \(c^* > 0\) such that the population reliability
\(\rho(c)\) of the Rasch/2PL model is approximately equal to \(\rho^*\).
The key idea is to:
1. Draw a large fixed "quadrature" sample \(\{\theta_m\}_{m=1}^M \sim G\) and item parameters \(\{(\beta_i, \lambda_{i,0})\}_{i=1}^I \sim H\) once.
2. For any scale \(c\), form \(\lambda_i(c) = c \cdot \lambda_{i,0}\) and compute the empirical approximation to population reliability \(\hat\rho_M(c)\) from the test information function.
3. Solve the scalar equation \(\hat\rho_M(c^*) = \rho^*\) using deterministic root-finding (Brent's method via uniroot()).
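The three steps can be sketched in base R. This is a minimal illustration, not the package's internal code: the standard normal choice for \(G\), the normal difficulties and log-normal baseline discriminations for \(H\), and the helper name rho_hat() are all assumptions made for the example, and the average-information form of reliability is used for simplicity.

```r
# Minimal base-R sketch of the three EQC steps (illustrative assumptions only).
set.seed(1)

M <- 5000          # quadrature sample size
I <- 25            # number of items
target_rho <- 0.80 # target reliability rho*

# Step 1: draw the quadrature sample and baseline item parameters once.
theta   <- rnorm(M)                            # assumed G: standard normal
beta    <- rnorm(I)                            # assumed difficulties
lambda0 <- rlnorm(I, meanlog = 0, sdlog = 0.3) # assumed baseline discriminations

# Step 2: empirical reliability at scale c (average-information form).
# rho_hat() is a hypothetical helper, not a package function.
rho_hat <- function(c) {
  lambda <- c * lambda0
  # M x I matrix of 2PL response probabilities
  p <- plogis(sweep(outer(theta, beta, "-"), 2, lambda, "*"))
  # test information J(theta; c) = sum_i lambda_i^2 * p * (1 - p)
  info <- as.vector((p * (1 - p)) %*% lambda^2)
  s2 <- var(theta)
  s2 * mean(info) / (s2 * mean(info) + 1)
}

# Step 3: solve rho_hat(c) = target_rho with Brent's method.
fit <- uniroot(function(c) rho_hat(c) - target_rho,
               interval = c(0.3, 3), tol = 1e-4)
c_star   <- fit$root
achieved <- rho_hat(c_star)
```

Because theta, beta and lambda0 are drawn once and then held fixed, rho_hat() is a deterministic function of c, which is what allows a deterministic root-finder to be used.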
Arguments
- target_rho
Numeric in (0, 1). Target marginal reliability \(\rho^*\).
- n_items
Integer. Number of items in the test form.
- model
Character. Measurement model: "rasch" or "2pl". For "rasch", all baseline discriminations are set to 1 before scaling.
- latent_shape
Character. Shape argument passed to sim_latentG() (e.g. "normal", "bimodal", "heavy_tail", ...).
- item_source
Character. Source argument passed to sim_item_params() (e.g. "irw", "parametric", "hierarchical", "custom").
- latent_params
List. Additional arguments passed to sim_latentG().
- item_params
List. Additional arguments passed to sim_item_params().
- reliability_metric
Character. Reliability definition used inside EQC:
"msem": MSEM-based marginal reliability (default, theoretically exact).
"info": Average-information reliability (faster, more stable).
Synonyms: "bar" for "msem", "tilde" for "info".
- M
Integer. Size of the empirical quadrature sample (default: 10000).
- c_bounds
Numeric length-2 vector. Search bounds for \(c\). Default: c(0.3, 3).
- tol
Numeric. Tolerance for uniroot(). Default: 1e-4.
- seed
Optional integer for reproducibility.
- verbose
Logical. If TRUE, print progress messages.
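Since uniroot() needs the function values at the two ends of its interval to have opposite signs, the target reliability has to lie between the reliabilities attained at the two ends of c_bounds. The sketch below illustrates this with a toy reliability curve. Both rho_toy() and brackets_target() are hypothetical names introduced for the example, and how eqc_calibrate() itself handles an unreachable target is not stated here.

```r
# Toy stand-in for the empirical reliability curve (hypothetical):
# increasing in c and bounded in (0, 1).
rho_toy <- function(c, k = 4) (k * c^2) / (k * c^2 + 1)

# Hypothetical helper: TRUE when the target lies strictly between the
# reliabilities at the two search bounds.
brackets_target <- function(rho_fn, target, bounds) {
  g <- rho_fn(bounds) - target
  g[1] * g[2] < 0
}

c_bounds <- c(0.3, 3)
ok_080 <- brackets_target(rho_toy, 0.80, c_bounds) # reachable target
ok_099 <- brackets_target(rho_toy, 0.99, c_bounds) # above the curve at c = 3

# With an unreachable target, uniroot() stops with an error.
res <- tryCatch(
  uniroot(function(c) rho_toy(c) - 0.99, interval = c_bounds, tol = 1e-4),
  error = function(e) NULL
)

# Widening the upper bound restores the bracket for this toy curve.
wide <- uniroot(function(c) rho_toy(c) - 0.99,
                interval = c(0.3, 10), tol = 1e-4)
```

In this toy setting a wider upper bound is enough. With a real reliability curve, a very high target may remain out of reach whatever the bounds.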
Value
An object of class "eqc_result" (a list) with elements:
- c_star
Calibrated discrimination scale \(c^*\).
- target_rho
Target reliability \(\rho^*\).
- achieved_rho
Empirical quadrature estimate \(\hat\rho_M(c^*)\).
- metric
Reliability metric used.
- theta_quad
Length-M vector of quadrature abilities.
- theta_var
Sample variance of theta_quad.
- items_base
item_params object with scale = 1 (baseline).
- items_calib
item_params object with discriminations scaled by c_star.
Details
Reliability Metrics
The function supports two reliability definitions:
- MSEM-based ("msem" / "bar"): Uses the harmonic mean of test information, \(\bar{w}(c) = \sigma^2_\theta / (\sigma^2_\theta + E[1/\mathcal{J}(\theta;c)])\). This is theoretically exact, but its maximum attainable reliability may be lower, which matters when targeting high reliability.
- Average-information ("info" / "tilde"): Uses the arithmetic mean of test information, \(\tilde{\rho}(c) = \sigma^2_\theta \bar{\mathcal{J}}(c) / (\sigma^2_\theta \bar{\mathcal{J}}(c) + 1)\), where \(\bar{\mathcal{J}}(c) = E[\mathcal{J}(\theta;c)]\). By Jensen's inequality, \(\tilde{\rho} \geq \bar{w}\), so this metric typically yields higher reliability values.
WLE vs EAP Reliability Interpretation
When validating with TAM, note that EAP reliability is systematically higher than WLE reliability. This is not a bug but a mathematical property of TAM's definitions. EAP reliability more directly corresponds to the MSEM-based population reliability targeted by EQC. For conservative inference, treat WLE as a lower bound and EAP as an upper bound for true measurement precision.
See also
spc_calibrate for the stochastic approximation alternative,
compute_rho_bar and compute_rho_tilde for
reliability computation utilities,
compute_reliability_tam for TAM validation.
Examples
if (FALSE) { # \dontrun{
# Basic EQC calibration
eqc_result <- eqc_calibrate(
  target_rho = 0.80,
  n_items = 25,
  model = "rasch",
  latent_shape = "normal",
  item_source = "irw",
  seed = 42,
  verbose = TRUE
)
print(eqc_result)
} # }