Log-PMF of K Given Alpha (Antoniak Distribution)

Computes $\log P(K_J = k \mid \alpha)$ for $k = 0, 1, \ldots, J$.

Usage

log_pmf_K_given_alpha(J, alpha, logS)

Arguments

J: Integer; sample size (number of observations, must be >= 1).
alpha: Numeric; DP concentration parameter (must be a positive scalar).
logS: Matrix; pre-computed log-Stirling matrix from compute_log_stirling.

Value

Numeric vector of length $J+1$ containing $\log P(K_J = k \mid \alpha)$ for $k = 0, 1, \ldots, J$. Note that entry [1] corresponds to $k=0$ and always equals -Inf (since $P(K_J = 0) = 0$).

Details

Evaluates the Antoniak distribution in log space: $$\log P(K_J = k \mid \alpha) = \log|s(J,k)| + k\log\alpha - \log\left[(\alpha)_J\right]$$

where $|s(J,k)|$ is the unsigned Stirling number of the first kind and $(\alpha)_J = \alpha(\alpha+1)\cdots(\alpha+J-1) = \Gamma(\alpha+J)/\Gamma(\alpha)$ is the rising factorial.

Working in log space keeps the computation numerically stable for large $J$, where evaluating $|s(J,k)|$ and $(\alpha)_J$ directly would overflow double precision.
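The formula above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper names `log_stirling_sketch` and `log_pmf_sketch` are hypothetical, and the layout in which entry `[n + 1, k + 1]` holds $\log|s(n,k)|$ is an assumption that may differ from what compute_log_stirling returns.

```r
# Hypothetical helper: log|s(n, k)| for 0 <= k <= n <= J, built from the
# recurrence |s(n, k)| = (n - 1) * |s(n - 1, k)| + |s(n - 1, k - 1)|.
# Assumed layout: entry [n + 1, k + 1] holds log|s(n, k)|.
log_stirling_sketch <- function(J) {
  logS <- matrix(-Inf, nrow = J + 1, ncol = J + 1)
  logS[1, 1] <- 0  # |s(0, 0)| = 1
  for (n in seq_len(J)) {
    for (k in seq_len(n)) {
      a <- log(n - 1) + logS[n, k + 1]  # log of (n - 1) * |s(n - 1, k)|
      b <- logS[n, k]                   # log of |s(n - 1, k - 1)|
      m <- max(a, b)
      # log-sum-exp of the two terms; both are -Inf only if the sum is zero
      logS[n + 1, k + 1] <- if (is.finite(m)) {
        m + log(exp(a - m) + exp(b - m))
      } else {
        -Inf
      }
    }
  }
  logS
}

# Hypothetical helper: log P(K_J = k | alpha) for k = 0, ..., J.
log_pmf_sketch <- function(J, alpha, logS) {
  k <- 0:J
  # log of the rising factorial (alpha)_J via log-gamma
  log_rising <- lgamma(alpha + J) - lgamma(alpha)
  logS[J + 1, k + 1] + k * log(alpha) - log_rising
}

logS_small <- log_stirling_sketch(3)
exp(log_pmf_sketch(3, 2, logS_small))
```

For $J = 3$ and $\alpha = 2$ the rising factorial is $2 \cdot 3 \cdot 4 = 24$, so the probabilities for $k = 0, 1, 2, 3$ are $0$, $4/24$, $12/24$ and $8/24$, which sum to 1.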

Examples

logS <- compute_log_stirling(50)
log_pmf <- log_pmf_K_given_alpha(50, 2.0, logS)

# Convert to probabilities
pmf <- softmax(log_pmf)
sum(pmf)  # Should be 1
#> [1] 1
