Fits the BHF model using Hamiltonian Monte Carlo via Stan.
Usage
bhf_fit(
  data,
  model = NULL,
  chains = 4,
  iter = 2000,
  warmup = floor(iter/2),
  seed = 1234,
  cores = NULL,
  adapt_delta = 0.9,
  max_treedepth = 12,
  refresh = 200,
  ...
)
Arguments
- data
An object of class bhf_data from prepare_bhf_data().
- model
A compiled Stan model from compile_bhf_model(). If NULL, the function will attempt to compile the model (with a warning).
- chains
Integer. Number of MCMC chains. Default is 4.
- iter
Integer. Total number of iterations per chain (including warmup). Default is 2000.
- warmup
Integer. Number of warmup iterations per chain. Default is half of iter.
- seed
Integer. Random seed for reproducibility.
- cores
Integer. Number of cores for parallel chains. Default is min(chains, parallel::detectCores()).
- adapt_delta
Numeric between 0 and 1. Target acceptance probability. Higher values (e.g., 0.95) reduce divergent transitions but slow sampling. Default is 0.90 (see the tuning sketch after this argument list).
- max_treedepth
Integer. Maximum tree depth for NUTS. Default is 12.
- refresh
Integer. How often to print progress (iterations). Set to 0 to suppress output.
- ...
Additional arguments passed to
rstan::sampling().
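When the sampler reports divergent transitions or saturated tree depth, the adapt_delta and max_treedepth arguments can be raised at the cost of slower sampling. A minimal sketch, assuming prepared and model were created as in the Examples section (the settings shown are illustrative, not recommendations):
# Stricter sampler settings: smaller step sizes (adapt_delta) and
# longer NUTS trajectories (max_treedepth) to reduce divergences
fit_strict <- bhf_fit(
  prepared,
  model = model,
  chains = 4,
  iter = 4000,
  warmup = 2000,
  adapt_delta = 0.99,
  max_treedepth = 15,
  cores = 4,
  refresh = 0  # suppress progress output
)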
Value
An object of class bhf_fit containing:
- stanfit
The stanfit object from rstan
- data
The bhf_data object used for fitting
- model
The compiled Stan model
- call
The function call
- diagnostics
MCMC diagnostics summary
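A sketch of how these components might be accessed, assuming the returned object behaves like a named list (component names as documented above):
fit$stanfit      # underlying rstan stanfit object
fit$data         # the bhf_data object used for fitting
fit$model        # the compiled Stan model
fit$diagnostics  # stored MCMC diagnostics summary
fit$call         # the original function call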
Details
The model uses a non-centered parameterization for random effects, which typically improves sampling efficiency. Key parameters are:
- alpha
Global intercept on logit scale
- sigma_state
Between-domain standard deviation
- sigma_psu
Between-PSU standard deviation (within stratum)
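Posterior summaries for these parameters can be read off the underlying stanfit object. A minimal sketch, assuming fit$stanfit holds the rstan fit as described under Value:
library(rstan)
key_pars <- c("alpha", "sigma_state", "sigma_psu")
# Posterior means, credible intervals, n_eff, and Rhat for the key parameters
summary(fit$stanfit, pars = key_pars, probs = c(0.025, 0.5, 0.975))$summary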
Convergence Diagnostics
The function automatically checks for:
- Divergent transitions (should be 0)
- Low Rhat values (should be < 1.05)
- Adequate effective sample size (should be > 100)
- Tree depth saturation
Warnings are issued if any diagnostics suggest convergence problems.
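The same checks can be repeated by hand with rstan's diagnostic helpers. A hedged sketch, again assuming fit$stanfit as described under Value:
library(rstan)
check_hmc_diagnostics(fit$stanfit)    # divergences, tree depth, E-BFMI
get_num_divergent(fit$stanfit)        # should be 0
get_num_max_treedepth(fit$stanfit)    # tree depth saturation count
fit_summary <- summary(fit$stanfit)$summary
max(fit_summary[, "Rhat"], na.rm = TRUE)   # should be < 1.05
min(fit_summary[, "n_eff"], na.rm = TRUE)  # should be > 100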
Typical Runtime
With default settings (4 chains, 2000 iterations):
- Small data (N < 500): 1-2 minutes
- Medium data (N ~ 3000): 5-10 minutes
- Large data (N > 10000): 20+ minutes
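Running chains in parallel is the simplest way to reduce wall-clock time. A minimal sketch of a quick exploratory fit (settings are illustrative and too short for final inference; prepared and model as in the Examples section):
# Short parallel run for a first look at the model
fit_quick <- bhf_fit(
  prepared,
  model = model,
  chains = 4,
  iter = 1000,
  warmup = 500,
  cores = parallel::detectCores(),
  refresh = 0
)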
Examples
if (FALSE) { # \dontrun{
# Step 1: Compile model (once per session)
model <- compile_bhf_model()
# Step 2: Prepare data
data(bhf_synthetic_data)
prepared <- prepare_bhf_data(
  bhf_synthetic_data,
  outcome = "has_subsidy",
  domain = "state",
  strata = "stratum",
  psu = "psu",
  weights = "weight"
)
# Step 3: Fit model
fit <- bhf_fit(prepared, model = model, chains = 4, iter = 2000)
# Step 4: Examine results
print(fit)
summary(fit)
} # }