The AstroStat Slog

tests of fit for the Poisson distribution

Apr 29th, 2008| 02:24 am | Posted by hlee

Skimming arXiv:astro-ph abstracts for almost a year, I have never come across a case where the fit of the Poisson distribution was tested in different ways; instead, it is taken for granted by plugging the data and the (source) model into a (modified) χ² function. If any doubts about the Poisson distribution arise, the following paper might be useful:

J.J.Spinelli and M.A.Stephens (1997)
Cramer-von Mises tests of fit for the Poisson distribution
Canadian J. Stat. Vol. 25(2), pp. 257-267
Abstract: Goodness-of-fit tests based on the Cramer-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A², give good overall tests of fit. The statistic A² will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.

In addition to the Cramer-von Mises statistics (A² and W²), the paper introduces several other statistics and compares the powers of all these tests: the dispersion test D (known in astronomy as a χ² statistic for testing goodness of fit; D is treated as a two-sided test and is approximately distributed as χ² with n-1 degrees of freedom), the Neyman-Barton k-component smooth test S_k, P and T (statistics based on the probability generating function), and the Pearson X² statistic (the number of cells K is chosen to avoid small expected values and the statistic is compared to χ² with K-1 degrees of freedom; I think astronomers call it the modified χ² test). The strategy for the power study relies on the negative binomial distribution, which has a parameter γ that is zero under the null hypothesis (Poisson distribution). Setting γ=δ/sqrt(n), the value of δ is chosen so that, for a two-sided 0.05 level test, the best test has a power of 0.5^[1]. Based on this simulation study, A² was, among the Cramer-von Mises statistics, empirically as powerful as the best test.
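As a rough illustration of the dispersion test D described above, here is a minimal Python sketch. It assumes D is the usual index-of-dispersion statistic, i.e. the sum of squared deviations from the sample mean divided by the sample mean. To stay within the standard library, the χ² tail probability with n-1 degrees of freedom is approximated by the Wilson-Hilferty normal approximation rather than an exact χ² routine. The function name `dispersion_test` and the example counts are made up for illustration and do not come from the paper.

```python
import math


def dispersion_test(counts):
    """Return (D, approximate two-sided p-value) for a list of counts.

    D = sum((x - mean)^2) / mean, referred to chi-square with n-1 d.o.f.
    The tail probability uses the Wilson-Hilferty approximation, which is
    an assumption of this sketch, not something taken from the paper.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; D is undefined")
    d = sum((x - mean) ** 2 for x in counts) / mean
    df = n - 1
    # Wilson-Hilferty: (D/df)^(1/3) is roughly normal with
    # mean 1 - 2/(9 df) and variance 2/(9 df).
    z = ((d / df) ** (1 / 3) - (1 - 2 / (9 * df))) / math.sqrt(2 / (9 * df))
    lower_tail = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    # Two-sided: both too little and too much dispersion count as evidence.
    p_two_sided = 2 * min(lower_tail, 1 - lower_tail)
    return d, p_two_sided


# Hypothetical photon counts, for illustration only.
d_stat, p_value = dispersion_test([2, 3, 1, 4, 2, 0, 3, 2, 1, 2])
print(d_stat, p_value)
```

A small D relative to n-1 points toward underdispersion and a large D toward overdispersion, which is why the two-sided p-value doubles the smaller tail.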

Under the Poisson null hypothesis, the alternatives considered are overdispersed, underdispersed, and equally dispersed distributions. For the equally dispersed alternatives, the Cramer-von Mises statistics have the best power compared to the other statistics. Overall, the Cramer-von Mises statistics have good power against all classes of alternative distributions, whereas the Pearson X² statistic performed very poorly for the overdispersed alternatives.

Instead of binning for the modified χ² tests^[2], we could adopt A² or W² for goodness-of-fit tests. They are probably already implemented in software but have gone unrecognized.
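To make the suggestion concrete, here is a minimal Python sketch of how A² and W² might be computed for unbinned count data. It assumes the discrete Cramer-von Mises definitions as I understand them: with o_j the observed number of counts equal to j, p_j the Poisson probability at the sample mean, Z_j the running sum of (o_i - n p_i) up to j, and H_j the running sum of p_i, W² = (1/n) Σ Z_j² p_j and A² = (1/n) Σ Z_j² p_j / (H_j (1 - H_j)). Please check these against the paper before relying on them. The paper's critical values are not reproduced here; the p-values come from a parametric bootstrap, which is my substitution and not the authors' procedure. All function names are hypothetical.

```python
import math
import random


def poisson_pmf(j, lam):
    """Poisson probability P(X = j) for mean lam > 0."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))


def cvm_poisson(counts):
    """Return (W2, A2) for a Poisson fit with the mean estimated from counts."""
    n = len(counts)
    lam = sum(counts) / n
    if lam == 0:
        raise ValueError("all counts are zero; the fit is degenerate")
    observed = {}
    for x in counts:
        observed[x] = observed.get(x, 0) + 1
    w2 = a2 = 0.0
    z = 0.0  # running sum of (observed - expected)
    h = 0.0  # running fitted Poisson CDF
    j, j_max = 0, max(counts)
    while True:
        p = poisson_pmf(j, lam)
        z += observed.get(j, 0) - n * p
        h += p
        w2 += z * z * p
        if 0.0 < h < 1.0:  # guard against round-off in the far tail
            a2 += z * z * p / (h * (1.0 - h))
        j += 1
        # The sums are infinite in principle; stop once past the data
        # and the remaining probability mass is negligible.
        if j > j_max and p < 1e-12:
            break
    return w2 / n, a2 / n


def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for modest means only."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def bootstrap_pvalues(counts, n_boot=999, seed=0):
    """Parametric-bootstrap p-values (p_W2, p_A2); an assumption of this sketch."""
    rng = random.Random(seed)
    n = len(counts)
    lam = sum(counts) / n
    w2_obs, a2_obs = cvm_poisson(counts)
    w_hits = a_hits = 0
    for _ in range(n_boot):
        sample = [poisson_draw(lam, rng) for _ in range(n)]
        if sum(sample) == 0:
            continue  # degenerate resample; treated as not exceeding
        w2, a2 = cvm_poisson(sample)
        w_hits += w2 >= w2_obs
        a_hits += a2 >= a2_obs
    return (w_hits + 1) / (n_boot + 1), (a_hits + 1) / (n_boot + 1)
```

Calling `bootstrap_pvalues(counts)` on a list of integer counts returns the two p-values; no binning choice is involved, which is the practical appeal over the Pearson X² approach.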

[1] The locally most powerful unbiased test is the statistic D (Potthoff and Whittinghill, 1966).
[2] The authors' examples show high significance levels for the χ² statistics compared to other tests; in other words, the χ² statistics (the dispersion test statistic D and the Pearson X²) are insensitive to evidence that the source model is not a good fit to the Poisson photon count data.

Tags: Cramer-von Mises test, Goodness of fit, most powerful test, Poisson, Power
Category: Methods, Misc

One Comment

hlee:

Best, D. J. and Rayner, J. C. W. (2005)
Improved Testing for the Poisson Distribution Using Chisquared Components with Data Dependent Cells (subscription required)
Communications in Statistics: Simulation and Computation, Volume 34, Number 1, pp. 85-96(12)

Abstract: A power study suggests that a good test of fit analysis for the Poisson distribution is provided by a data dependent Chernoff-Lehmann X² test with class expectations greater than unity, and its components. These data dependent statistics involve arithmetically simple parameter estimation, convenient approximate distributions, and provide a comprehensive assessment of how well the data agree with a Poisson distribution. We suggest that a well-performed single test of fit statistic is the Anderson-Darling statistic. Three examples are discussed.
——————————————————————————————————————————————————————

Haschenburger, J.K. and Spinelli, J.J. (2005)
Assessing the Goodness-of-Fit of Statistical Distributions When Data Are Grouped (subscription required)
Mathematical Geology, Volume 37, Number 3, pp. 261-276

Abstract: Modeling statistical distributions of phenomena can be compromised by the choice of goodness-of-fit statistics. The Pearson chi-square test is the most commonly used test in the geosciences, but the lesser known empirical distribution function (EDF) statistics should be preferred in many test situations. Using a data set from geomorphology, the Anderson-Darling test for grouped exponential distributions is employed to illustrate ease of use and statistical advantages of this EDF test. Attention to the issues discussed will result in more informed statistic selection and increased rigor in the identification of distribution functions that describe random variables.
——————————————————————————————————————————————————————
The Chernoff-Lehmann χ² in these papers reminds me of Lucy (2000), which recommends improved χ² statistics instead of Pearson's χ².

Lucy, L.B. (2000)
Hypothesis testing for meagre data sets
MNRAS, Volume 318, Issue 1, pp. 92-100.

Abstract: Improved χ² statistics are defined for both Poisson and multinomial data sets. Numerical experiments with the Nousek-Shue test problem from X-ray spectroscopy indicate that, in contrast to Pearson's statistic X², these modified statistics remain approximately valid at the 2σ level of significance even when the mean counts per bin is <<1 provided that the total counts N>~30. The simplest of the proposed statistics is formally equivalent to carrying out a goodness-of-fit test with Pearson's statistic but with modified confidence limits, and this procedure is recommended.
07-02-2008, 7:42 pm
