Poissonization Error Bound (Chen-Stein / Le Cam Bound)
Source: R/13_error_bounds.R
compute_poissonization_bound.Rd
Computes an upper bound on the total variation distance, conditional on \(\alpha\), between the shifted cluster count \(S_J = K_J - 1\) and a Poisson distribution with the same mean (the Poissonization error).
Details
Under the Chinese restaurant process (CRP) representation, \(S_J = \sum_{i=2}^J I_i\), where, conditional on \(\alpha\), the \(I_i \sim \text{Bernoulli}(p_i)\) are independent with \(p_i = \alpha / (\alpha + i - 1)\).
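As a quick sanity check on this representation, the following minimal sketch simulates \(S_J\) as a sum of independent Bernoulli indicators and compares the simulated mean with \(\sum_{i=2}^J p_i\). It uses only base R; the variable names (`p`, `lambda`, `n_sim`, `S`) and the choice of seed and simulation size are illustrative assumptions, not part of the package.

```r
# Illustrative sketch only: simulate S_J = sum_{i=2}^J I_i under the CRP
# representation, with independent I_i ~ Bernoulli(p_i).
set.seed(1)            # arbitrary seed, for reproducibility of the sketch
J     <- 50
alpha <- 1
i     <- 2:J
p     <- alpha / (alpha + i - 1)   # p_i = alpha / (alpha + i - 1)
lambda <- sum(p)                   # E[S_J | alpha]

n_sim <- 10000                     # assumed simulation size
S <- replicate(n_sim, sum(rbinom(length(p), size = 1, prob = p)))

# The simulated mean should be close to lambda (Monte Carlo error aside).
c(simulated_mean = mean(S), lambda = lambda)
```

With these settings the Monte Carlo standard error of the mean is roughly 0.02, so the two numbers should agree to about one decimal place.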
A standard Chen-Stein/Le Cam bound gives: $$d_{TV}(S_J, \text{Poisson}(\lambda)) \le \frac{1 - e^{-\lambda}}{\lambda} \sum_{i=2}^J p_i^2$$ where \(\lambda = \sum_{i=2}^J p_i = E[S_J | \alpha]\).
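The prefactor \((1 - e^{-\lambda})/\lambda\) lies strictly in (0, 1) for every \(\lambda > 0\), tends to 1 as \(\lambda \to 0\), and behaves like \(1/\lambda\) for large \(\lambda\). The full bound is therefore never looser than the raw bound \(\sum p_i^2\), and it is substantially tighter when \(\lambda\) is large.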
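To illustrate the effect of the prefactor, here is the arithmetic for \(J = 50\), \(\alpha = 1\), where \(p_i = 1/i\) (values rounded to four decimals; this is a hand calculation from the formula above, not output of the package):

$$\lambda = \sum_{i=2}^{50} \frac{1}{i} \approx 3.4992, \qquad \sum_{i=2}^{50} \frac{1}{i^2} \approx 0.6251, \qquad \frac{1 - e^{-\lambda}}{\lambda} \approx 0.2771$$

$$d_{TV}(S_J, \text{Poisson}(\lambda)) \le 0.2771 \times 0.6251 \approx 0.1733$$

The prefactor reduces the raw bound of about 0.6251 to about 0.1733, in line with the values shown in the Examples section.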
The returned value is capped at 1 (since total variation is always between 0 and 1).
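The computation described above can be sketched in a few lines of base R. This is a minimal illustration written from the formula in this section, not the package's actual source; the function name `poissonization_bound_sketch` is hypothetical, and the argument names simply mirror those used in the Examples.

```r
# Hypothetical re-implementation for illustration; NOT the package source.
poissonization_bound_sketch <- function(J, alpha, raw = FALSE) {
  stopifnot(length(J) == 1, J >= 2, all(alpha > 0))
  i <- 2:J
  # vapply gives one bound per element of alpha (vectorized over alpha).
  vapply(alpha, function(a) {
    p      <- a / (a + i - 1)      # p_i = alpha / (alpha + i - 1)
    lambda <- sum(p)               # lambda = E[S_J | alpha]
    sum_p2 <- sum(p^2)             # raw bound
    # -expm1(-lambda) equals 1 - exp(-lambda), computed stably for small lambda.
    bound  <- if (raw) sum_p2 else (-expm1(-lambda) / lambda) * sum_p2
    min(bound, 1)                  # total variation never exceeds 1
  }, numeric(1))
}

poissonization_bound_sketch(J = 50, alpha = 1)
poissonization_bound_sketch(J = 50, alpha = 1, raw = TRUE)
poissonization_bound_sketch(J = 50, alpha = c(0.5, 1, 2, 5))
```

If the sketch matches the formula, the first two calls should agree with the corresponding values in the Examples section (about 0.1733 and 0.6251). Using `expm1` avoids cancellation in \(1 - e^{-\lambda}\) when \(\lambda\) is very small, which is a design choice of this sketch rather than something the documentation states about the package.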
References
Le Cam, L. (1960). An approximation theorem for the Poisson binomial distribution. Pacific Journal of Mathematics, 10(4), 1181-1197.
Chen, L. H. Y. (1975). Poisson approximation for dependent trials. The Annals of Probability, 3(3), 534-545.
RN-05 (Theorem 1) and references therein.
Examples
# Full Chen-Stein bound
compute_poissonization_bound(J = 50, alpha = 1)
#> [1] 0.1732509
# Raw bound (sum of p_i^2)
compute_poissonization_bound(J = 50, alpha = 1, raw = TRUE)
#> [1] 0.6251327
# Vectorized
compute_poissonization_bound(J = 50, alpha = c(0.5, 1, 2, 5))
#> [1] 0.1010243 0.1732509 0.2481908 0.3555110