sim_item_params() generates item parameters (difficulty \(\beta\) and
discrimination \(\lambda\)) for Item Response Theory (IRT) simulation studies.
It wraps the IRW irw_simu_diff() function for realistic difficulty distributions
and provides multiple methods for generating correlated discriminations.
The function is designed with four key principles:
Realistic difficulties: Integration with Item Response Warehouse (IRW) for empirically-grounded difficulty distributions.
Correlated parameters: Support for the empirically observed negative correlation between difficulty and discrimination (Sweeney et al., 2022).
Marginal preservation: Copula method preserves exact marginal distributions while achieving target correlation.
Reliability targeting: A global discrimination scale factor (scale) that can be tuned in a subsequent calibration step.
Usage
sim_item_params(
n_items,
model = c("rasch", "2pl"),
source = c("irw", "parametric", "hierarchical", "custom"),
method = c("copula", "conditional", "independent"),
n_forms = 1L,
difficulty_params = list(),
discrimination_params = list(),
hierarchical_params = list(),
custom_params = list(),
scale = 1,
center_difficulties = TRUE,
seed = NULL
)
Arguments
- n_items
Integer. Number of items to generate per form.
- model
Character. The data-generating model: "rasch" or "2pl".
- source
Character. Source for generating difficulties:
"irw": Use IRW difficulty pool (realistic, empirical)
"parametric": Generate from parametric distribution
"hierarchical": Joint MVN for both parameters (Glas & van der Linden)
"custom": User-supplied parameters or function
- method
Character. Method for generating discriminations (when model = "2pl"):
"copula": Gaussian copula - preserves marginals exactly (RECOMMENDED)
"conditional": Conditional normal regression on difficulty
"independent": Independent generation (no correlation)
- n_forms
Integer. Number of test forms to generate. Default is 1. When > 1, the returned data frame identifies each form in its form_id column.
- difficulty_params
List. Parameters for difficulty generation:
- For source = "irw": pool - difficulty pool data frame
- For source = "parametric": mu, sigma, distribution
- discrimination_params
List. Parameters for discrimination generation:
mu_log: Mean of log-discrimination (default: 0)
sigma_log: SD of log-discrimination (default: 0.3)
rho: Target correlation between \(\beta\) and \(\log(\lambda)\) (default: -0.3)
- hierarchical_params
List. For source = "hierarchical":
mu: 2-vector of means of \((\log\lambda, \beta)\)
tau: 2-vector of SDs of \((\log\lambda, \beta)\)
rho: Correlation between \(\log\lambda\) and \(\beta\)
- custom_params
List. For source = "custom":
beta: Vector or function returning difficulties
lambda: Vector or function returning discriminations
- scale
Numeric. Global discrimination scaling factor for reliability targeting. Final discriminations are \(\lambda_i^* = c \cdot \lambda_i\), where \(c\) is the value of scale. Default is 1.
- center_difficulties
Logical. If TRUE, center difficulties to sum to zero for identification. Default is TRUE.
- seed
Integer. Random seed for reproducibility.
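The effect of scale and center_difficulties can be illustrated with a few lines of base R. This is only a sketch of the arithmetic described above, not the package's internal code; the variable names (beta, lambda_unscaled, scale_factor) are assumptions chosen for illustration.

```r
# Hypothetical post-processing sketch; names are illustrative only.
set.seed(42)
beta <- rnorm(10, mean = 0.5)                              # raw difficulties
lambda_unscaled <- rlnorm(10, meanlog = 0, sdlog = 0.3)    # raw discriminations

# Centering: subtract the mean so the difficulties sum to zero.
beta_centered <- beta - mean(beta)

# Scaling: multiply every discrimination by the same constant c.
scale_factor <- 1.5
lambda <- scale_factor * lambda_unscaled

sum(beta_centered)   # zero up to floating-point error
```

Because the scale factor is a single multiplicative constant, the ratio lambda / lambda_unscaled is the same for every item, which is why both columns can be kept side by side in the returned data.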
Value
An object of class "item_params" containing:
data: Data frame with columns form_id, item_id, beta, lambda, lambda_unscaled
model: Model type used
source: Source used for generation
method: Method used for discrimination generation
n_items: Number of items per form
n_forms: Number of forms generated
scale: Scale factor applied
centered: Whether difficulties were centered
params: Parameters used for generation
achieved: Achieved statistics (correlations, moments)
Details
Why Copula Method is Recommended
When difficulties come from the IRW pool (which has realistic, often non-normal marginal distributions), the conditional normal method can distort the achieved correlation because it assumes linearity. The Gaussian copula method:
Transforms difficulties to uniform scale via empirical CDF
Generates correlated uniforms through Gaussian copula
Transforms back to desired marginals (log-normal for discrimination)
This guarantees:
Exact preservation of difficulty marginal (whatever IRW provides)
Exact log-normal marginal for discriminations
Spearman correlation \(\approx \rho\) (rank-based, robust to non-normality)
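The three copula steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by sim_item_params(): the helper name copula_discriminations() is hypothetical, the empirical CDF is approximated by ranks divided by n + 1, and the difficulty vector is a skewed stand-in rather than a real IRW pool.

```r
# Hypothetical helper, not the package's internal code.
copula_discriminations <- function(beta, rho = -0.3, mu_log = 0, sigma_log = 0.3) {
  n <- length(beta)
  # Step 1: difficulties -> uniform scale via rescaled empirical CDF ranks.
  # Dividing by n + 1 keeps every value strictly inside (0, 1).
  u_beta <- rank(beta, ties.method = "average") / (n + 1)
  # Step 2: map to normal scores, then draw a correlated normal score
  # and push it back to the uniform scale (the Gaussian copula step).
  z_beta <- qnorm(u_beta)
  z_lambda <- rho * z_beta + sqrt(1 - rho^2) * rnorm(n)
  u_lambda <- pnorm(z_lambda)
  # Step 3: back-transform the uniforms to a log-normal marginal.
  qlnorm(u_lambda, meanlog = mu_log, sdlog = sigma_log)
}

set.seed(1)
beta <- rexp(5000) - 1          # deliberately skewed, non-normal difficulties
lambda <- copula_discriminations(beta, rho = -0.3)

# The difficulties are never modified; only their ranks are used.
cor(beta, lambda, method = "spearman")
```

Since only the ranks of beta enter the construction, the difficulty marginal is left untouched, and the rank correlation should land close to the requested rho even though beta is far from normal.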
References
Glas, C. A. W., & van der Linden, W. J. (2003). Computerized adaptive testing with item cloning. Applied Psychological Measurement, 27(4), 247-261.
Sweeney, S. M., et al. (2022). An investigation of the nature and consequence of the relationship between IRT difficulty and discrimination. Educational Measurement: Issues and Practice, 41(4), 50-67.
Zhang, L., et al. (2025). Realistic simulation of item difficulties. PsyArXiv.
Examples
# Example 1: Rasch with IRW difficulties
items1 <- sim_item_params(n_items = 25, model = "rasch", source = "irw")
# Example 2: 2PL with copula method (recommended)
items2 <- sim_item_params(
n_items = 30, model = "2pl", source = "irw",
method = "copula",
discrimination_params = list(rho = -0.3)
)
# Example 3: Multiple forms
items3 <- sim_item_params(
n_items = 20, model = "2pl", n_forms = 5,
source = "irw", method = "copula"
)
#> Warning: collapsing to unique 'x' values
# Example 4: Hierarchical 2PL
items4 <- sim_item_params(
n_items = 25, model = "2pl", source = "hierarchical",
hierarchical_params = list(mu = c(0, 0), tau = c(0.25, 1), rho = -0.3)
)
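For intuition about what Example 4 requests, a joint normal draw for \((\log\lambda, \beta)\) with means mu, SDs tau, and correlation rho can be sketched in base R as follows. The helper draw_hierarchical() is hypothetical and only illustrates the bivariate normal construction; how sim_item_params() implements source = "hierarchical" internally is not shown here.

```r
# Hypothetical sketch of a bivariate normal draw for (log lambda, beta).
draw_hierarchical <- function(n_items, mu = c(0, 0), tau = c(0.25, 1), rho = -0.3) {
  # Covariance matrix implied by the two SDs and the correlation.
  Sigma <- matrix(c(tau[1]^2,              rho * tau[1] * tau[2],
                    rho * tau[1] * tau[2], tau[2]^2), nrow = 2)
  # chol() returns an upper-triangular R with t(R) %*% R == Sigma,
  # so the rows of z %*% R have covariance Sigma.
  z <- matrix(rnorm(2 * n_items), ncol = 2)
  x <- sweep(z %*% chol(Sigma), 2, mu, "+")
  # First column is log-discrimination, second is difficulty.
  data.frame(lambda = exp(x[, 1]), beta = x[, 2])
}

set.seed(123)
items <- draw_hierarchical(5000)
cor(log(items$lambda), items$beta)   # should be near the requested rho
```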