Computes \(\log P(K_J = k \mid \alpha)\) for \(k = 0, 1, \ldots, J\).

Usage

log_pmf_K_given_alpha(J, alpha, logS)

Arguments

J

Integer; sample size (number of observations, must be >= 1).

alpha

Numeric; DP concentration parameter (must be a positive scalar).

logS

Matrix; pre-computed log-Stirling matrix from compute_log_stirling.

Value

Numeric vector of length \(J+1\) containing \(\log P(K_J = k \mid \alpha)\) for \(k = 0, 1, \ldots, J\). Because R indexing starts at 1, entry [k + 1] holds the value for \(k\); in particular, entry [1] corresponds to \(k=0\) and always equals -Inf (since \(P(K_J = 0) = 0\)).

Details

Uses the Antoniak distribution formula in log-space: $$\log P(K_J = k \mid \alpha) = \log|s(J,k)| + k\log\alpha - \log(\alpha)_J$$

where \(|s(J,k)|\) is the unsigned Stirling number of the first kind and \((\alpha)_J = \alpha(\alpha+1)\cdots(\alpha+J-1) = \Gamma(\alpha+J)/\Gamma(\alpha)\) is the rising factorial.
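
As a small worked check of the formula (the values \(J = 3\) and \(\alpha = 2\) are chosen here purely for illustration), the unsigned Stirling numbers are \(|s(3,1)| = 2\), \(|s(3,2)| = 3\), \(|s(3,3)| = 1\), and the rising factorial is \((2)_3 = 2 \cdot 3 \cdot 4 = 24\). Exponentiating the log-space expression gives:

$$P(K_3 = 1) = \frac{2 \cdot 2^1}{24} = \frac{1}{6}, \qquad P(K_3 = 2) = \frac{3 \cdot 2^2}{24} = \frac{1}{2}, \qquad P(K_3 = 3) = \frac{1 \cdot 2^3}{24} = \frac{1}{3}$$

These three probabilities sum to 1, and \(P(K_3 = 0) = 0\) because \(|s(3,0)| = 0\).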

Working in log-space keeps the computation numerically stable for large \(J\), where \(|s(J,k)|\) and \((\alpha)_J\) grow factorially and would overflow double precision if computed directly.
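
The formula can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: `log_unsigned_stirling_row` and `log_pmf_sketch` are hypothetical helper names, the Stirling numbers are rebuilt from the standard recurrence \(|s(n+1,k)| = n\,|s(n,k)| + |s(n,k-1)|\) instead of being read from `logS`, and the rising factorial is assumed to be evaluated with `lgamma`.

```r
# Hypothetical helper (not part of the package): returns log|s(J, k)| for
# k = 0..J, using |s(n+1, k)| = n * |s(n, k)| + |s(n, k-1)| in log-space.
log_unsigned_stirling_row <- function(J) {
  row <- c(0, rep(-Inf, J))            # n = 0: |s(0, 0)| = 1, all others 0
  for (n in seq_len(J) - 1L) {         # n = 0, ..., J - 1
    a <- log(n) + row                  # log(n * |s(n, k)|)
    b <- c(-Inf, row[-length(row)])    # log|s(n, k - 1)|
    m <- pmax(a, b)
    # log-sum-exp of a and b; entries where both terms are zero stay -Inf
    row <- ifelse(is.infinite(m), -Inf, m + log(exp(a - m) + exp(b - m)))
  }
  row
}

# Hypothetical sketch of the log-PMF: log|s(J,k)| + k*log(alpha) - log (alpha)_J
log_pmf_sketch <- function(J, alpha) {
  k <- 0:J
  log_rising <- lgamma(alpha + J) - lgamma(alpha)  # log of rising factorial
  log_unsigned_stirling_row(J) + k * log(alpha) - log_rising
}

log_pmf <- log_pmf_sketch(50, 2.0)
log_pmf[1]          # entry for k = 0, which is -Inf
sum(exp(log_pmf))   # should be 1 up to floating-point error
```

Because every intermediate quantity is kept as a logarithm, the sketch never forms the factorially large Stirling numbers themselves; only the final `exp()` call leaves log-space.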

See also

pmf_K_given_alpha for the normalized PMF; compute_log_stirling for computing the log-Stirling matrix.

Examples

logS <- compute_log_stirling(50)
log_pmf <- log_pmf_K_given_alpha(50, 2.0, logS)

# Convert to probabilities
pmf <- softmax(log_pmf)
sum(pmf)  # Should be 1
#> [1] 1