
Hypothesis Tests
Aneta Siemiginowska, Harvard-Smithsonian Center for Astrophysics

CHASC
April 8, 2013

This is one of many statistical topics that were important to Alanna.

AAS HEAD 2013



Outline
· Hypothesis testing: motivation and basic framework
· Methods:
  · P-values
  · Bayesian posterior predictive p-values
  · Bayes Factors
· Conclusions
· Further Reading

Based on David van Dyk's talk at the 2008 HEAD meeting in Los Angeles.


Motivation and Basic Framework
· Last step in data modeling:
  · Choose between a simpler and a more complex model, e.g. the addition of an emission or absorption line to the continuum
  · Discriminate between models, e.g. power law vs. thermal emission

· Framework:
  · The Null Hypothesis, H0: the line does not exist in the data
  · The Alternative Hypothesis, H1: the line exists in the data
  The Null is a special case of the Alternative => the line intensity equals zero.


Methods: p-values
· Assuming the null hypothesis is true, how likely are we to see a test statistic, T, as extreme as or more extreme than the observed value, t_obs? That is, p = Pr(T >= t_obs | H0).

· Although p-values are very popular for model selection, they raise important challenges:
  · They are often based on asymptotic results that may not be appropriate
  · They can bias inference in the direction of false detection

T: test statistic

Methods: Test Statistics
· Test statistic: the Likelihood Ratio

  $R = \frac{L(\hat\theta_0 \mid Y)}{L(\hat\theta_1 \mid Y)}$

  where $L$ is the likelihood, $\hat\theta_0$ is the fitted MLE of the null model parameters, $\hat\theta_1$ is the fitted MLE of the alternative model parameters, and $Y$ is the data.

  Important! Under H0, the distribution of $-2\log R$ approaches $\chi^2_{d-d_0}$ (degrees of freedom equal to the difference in the number of free parameters) as the data sample size increases, under certain assumptions.
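
As a concrete illustration, the sketch below computes this likelihood-ratio statistic for a toy binned Poisson spectrum, with a power law as the null model and a power law plus a narrow Gaussian line at a fixed location as the alternative. The models, parameter values, energy grid, and function names (powerlaw, gauss_line, fit) are assumptions for the example, not taken from the talk.

```python
# Minimal sketch (assumed toy setup, not from the talk): likelihood-ratio
# statistic for a binned Poisson spectrum.
# Null model: power law.  Alternative: power law + narrow Gaussian line
# at a fixed location (1 keV) and fixed width.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
energy = np.linspace(0.5, 8.0, 200)      # keV bin centers (illustrative grid)

def powerlaw(e, norm, gamma):
    return norm * e ** (-gamma)

def gauss_line(e, amp, center=1.0, sigma=0.05):
    return amp * np.exp(-0.5 * ((e - center) / sigma) ** 2)

def neg_loglike(counts, mu):
    # Poisson negative log-likelihood, dropping the constant log(counts!) term
    return np.sum(mu - counts * np.log(mu))

def fit(counts, model, x0, bounds):
    # Return the minimized negative log-likelihood, i.e. -log L at the MLE
    res = minimize(lambda p: neg_loglike(counts, model(p)), x0,
                   bounds=bounds, method="L-BFGS-B")
    return res.fun

# Simulated "observed" spectrum, for illustration only
counts = rng.poisson(powerlaw(energy, 50.0, 1.7))

# Null: power law only (parameters: norm, gamma)
nll0 = fit(counts, lambda p: powerlaw(energy, p[0], p[1]),
           x0=[40.0, 1.5], bounds=[(1e-3, None), (0.0, 5.0)])

# Alternative: power law + line (extra parameter: non-negative amplitude)
nll1 = fit(counts, lambda p: powerlaw(energy, p[0], p[1]) + gauss_line(energy, p[2]),
           x0=[40.0, 1.5, 1.0], bounds=[(1e-3, None), (0.0, 5.0), (0.0, None)])

# -2 log R; note its null distribution is not chi-square here because the
# line amplitude is constrained to be non-negative (see the LRT assumptions slide)
lrt_stat = 2.0 * (nll0 - nll1)
print(f"observed LRT statistic: {lrt_stat:.2f}")
```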

Methods: LRT
· Assumptions of the Likelihood Ratio Test statistic:
  · The null hypothesis must be a special case of the alternative
  · The parameter values of the null must lie in the interior of the alternative parameter space

· The second assumption fails when testing for a spectral emission line:
  · When there is no line, the line intensity is zero; it may not be negative, so the null value lies on the boundary, not in the interior, of the parameter space.
  · The location and width of the line do not exist when there is no line; they have no values.
  See Protassov et al. (2002) for details.

Methods: LRT - distribution
IMPORTANT! We do not know the true distribution of the test statistic.
[Figure (Protassov et al. 2002): histograms of the simulated LRT distribution for three test cases (a narrow line at a fixed location; a line with unknown location and fixed width; an absorption line), compared with the nominal $\chi^2$ distribution (solid line). Simulated false positive rates are shown against the nominal 5% rate.]

· Results of the three tests compared to the nominal $\chi^2$ distribution


Methods: Monte Carlo Calibration
· Instead of using the fixed best-fit parameter values, run Monte Carlo simulations to assess the sampling distribution of the LRT (or another test statistic). This calibrates the value of the statistic computed on the data and determines a p-value.
· Recipe (see the sketch below):
  · Simulate N data sets assuming H0
  · Compute the test statistic (LRT) for each data set
  · Make a histogram of the simulated test statistics
  · The histogram approximates the sampling distribution of the test statistic
  · Calculate the p-value => the proportion of simulated test statistics larger than t_obs
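
A minimal sketch of this recipe, under assumed toy models that are not the talk's: the null is a flat Poisson continuum, the alternative adds a narrow line of fixed location and width, and for simplicity the continuum level is held at its null fit when the line amplitude is fitted. All names (nll, lrt, line_shape) and parameter values are illustrative.

```python
# Minimal sketch of the Monte Carlo calibration recipe (toy models assumed:
# null = flat Poisson continuum, alternative = continuum + narrow line of
# fixed location/width; the continuum is held at its null fit for simplicity).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
energy = np.linspace(0.5, 8.0, 200)
line_shape = np.exp(-0.5 * ((energy - 1.0) / 0.05) ** 2)   # fixed line profile

def nll(counts, mu):
    # Poisson negative log-likelihood, dropping the constant log(counts!) term
    return np.sum(mu - counts * np.log(mu))

def lrt(counts):
    # Null MLE of a flat continuum is the mean count per bin (analytic)
    bkg = counts.mean()
    nll0 = nll(counts, np.full_like(energy, bkg))
    # Alternative: fit only the non-negative line amplitude on top of bkg
    res = minimize_scalar(lambda a: nll(counts, bkg + a * line_shape),
                          bounds=(0.0, 10.0 * bkg + 1.0), method="bounded")
    return 2.0 * (nll0 - res.fun)

# "Observed" spectrum (simulated here for illustration) and its statistic
obs_counts = rng.poisson(20.0, size=energy.size)
t_obs = lrt(obs_counts)

# Recipe: simulate N data sets under H0 (flat continuum at its fitted level),
# compute the LRT for each, and approximate its sampling distribution
N = 1000
fitted_bkg = obs_counts.mean()
sims = np.array([lrt(rng.poisson(fitted_bkg, size=energy.size))
                 for _ in range(N)])

# p-value = proportion of simulated statistics at least as large as t_obs
p_value = np.mean(sims >= t_obs)
print(f"t_obs = {t_obs:.2f}, calibrated p-value = {p_value:.3f}")
```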


Methods: Posterior Predictive Sampling
· If there are unknown parameters in the null, we cannot simply simulate data from it.
· Solutions:
  · Fit the real data under the null model and compute the fitted parameters and their error bars.
  · Parametric Bootstrap: resample the data, varying the unknown parameters to account for the error bars.
  · Bayesian Posterior Predictive modeling: simulate the unknown parameters from their posterior distribution and use them to simulate the data sets (see the sketch below).
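
A sketch of the posterior predictive version of the calibration, for the same assumed toy problem as above (flat Poisson continuum vs. continuum plus a fixed-shape line). The only change from the Monte Carlo recipe is that the unknown null parameter, the continuum level, is drawn from its posterior for each fake data set rather than fixed at the best fit; with a flat prior on the Poisson mean that posterior is a Gamma distribution. The setup is illustrative, not the talk's code.

```python
# Minimal sketch of a posterior predictive p-value for the toy problem above
# (assumed models: flat Poisson continuum vs continuum + fixed-shape line).
# The unknown continuum level is drawn from its posterior for each fake data
# set instead of being fixed at the best-fit value.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
energy = np.linspace(0.5, 8.0, 200)
line_shape = np.exp(-0.5 * ((energy - 1.0) / 0.05) ** 2)

def nll(counts, mu):
    # Poisson negative log-likelihood, dropping the constant log(counts!) term
    return np.sum(mu - counts * np.log(mu))

def lrt(counts):
    # Same simplified LRT as in the Monte Carlo sketch above
    bkg = counts.mean()
    nll0 = nll(counts, np.full_like(energy, bkg))
    res = minimize_scalar(lambda a: nll(counts, bkg + a * line_shape),
                          bounds=(0.0, 10.0 * bkg + 1.0), method="bounded")
    return 2.0 * (nll0 - res.fun)

obs_counts = rng.poisson(20.0, size=energy.size)
t_obs = lrt(obs_counts)

# Posterior of the flat continuum level under a flat prior on the Poisson
# mean per bin: Gamma(shape = total counts + 1, scale = 1 / number of bins)
n_bins = energy.size
total = obs_counts.sum()

N = 1000
sims = np.empty(N)
for i in range(N):
    mu_draw = rng.gamma(total + 1, 1.0 / n_bins)   # draw the null parameter
    fake = rng.poisson(mu_draw, size=n_bins)       # posterior predictive data
    sims[i] = lrt(fake)

ppp = np.mean(sims >= t_obs)                        # posterior predictive p-value
print(f"posterior predictive p-value = {ppp:.3f}")
```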

Methods: Bayes Factors
· Motivation:
  · P-values are based on Pr(Data | H0)
  · We are interested in Pr(H0 | Data)


If we compare these two calculations, p-values can vastly overstate the evidence for H1.
Solution: quantify Pr(H0 | Data) directly. Bayes factors give a method to do this.

Posterior odds = Bayes factor $\times$ prior odds:

$\frac{p(H_0 \mid Y)}{p(H_1 \mid Y)} = \frac{p(Y \mid H_0)}{p(Y \mid H_1)} \times \frac{p(H_0)}{p(H_1)}$

H: model, $\theta$: model parameters, Y: data
Methods: Bayes Factors
· Challenges:
  · Computing the BF can be very challenging
  · The BF assumes that one of the models, H0 or H1, is true (the true model could be something completely different)
  · Priors are much more important/influential when computing the BF than in parameter inference; we have to be very careful about prior specification


· We are working on methods that address these challenges in practice.


Summary
· Bayesian Posterior Predictive p-values (PPP)
· Independent of the prior; need a careful evaluation of the evidence.

· Bayes Factors
· Challenging issues in computing, sensitive to priors.

· Conclusion:
· Use PPP, Monte Carlo sampling, or the bootstrap to calibrate test statistics
· But there are enough issues with the PPP that we need to look at BF more seriously


Further Reading
· CHASC web page: https://hea-www.harvard.edu/astrostat/
· Protassov, R., van Dyk, D.A., Connors, A., Kashyap, V.L., and Siemiginowska, A. (2002), "Statistics: Handle with Care - Detecting Multiple Model Components with the Likelihood Ratio Test", ApJ, 571, 545
· van Dyk, D.A., Connors, A., Kashyap, V.L., and Siemiginowska, A. (2001), "Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation", ApJ, 548, 224
· Park, T., van Dyk, D.A., and Siemiginowska, A. (2008), "Searching for Narrow Emission Lines in X-ray Spectra: Computation and Methods", ApJ, 688, 807


Methods: Bayes Factors
· Bayesian Evidence:
  The average likelihood over the prior distribution,
  $p(Y \mid M) = \int p(Y \mid \theta, M)\, p(\theta \mid M)\, d\theta$
  where M is the model, $\theta$ the model parameters, Y the data, and $p(\theta \mid M)$ the prior.

· Bayes Factor:
  The ratio of the Bayesian Evidence of the two models,
  $\mathrm{BF} = p(Y \mid M_0)\, /\, p(Y \mid M_1)$
  Large BF values (> 100) are decisive.

· BF and posterior probability ratios:
  Can be used to "compare" non-nested models, such as a power law vs. thermal emission.
  Note that the evidence depends explicitly on the prior!
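
As a sketch of what this definition means computationally (not the talk's method, and far from the most efficient estimator), the evidence for each toy model below is estimated by simple Monte Carlo: draw parameters from an assumed prior and average the likelihood, working in log space to avoid underflow. The models, priors, and parameter ranges are assumptions for the example.

```python
# Minimal sketch (toy illustration): estimate the Bayesian evidence p(Y|M)
# by averaging the likelihood over draws from the prior, in log space.
import numpy as np
from scipy.special import gammaln, logsumexp

rng = np.random.default_rng(3)
energy = np.linspace(0.5, 8.0, 200)
line_shape = np.exp(-0.5 * ((energy - 1.0) / 0.05) ** 2)

def log_like(counts, mu):
    # Poisson log-likelihood, including the log(counts!) term
    return np.sum(counts * np.log(mu) - mu - gammaln(counts + 1))

# "Observed" spectrum, simulated for illustration
counts = rng.poisson(20.0 + 15.0 * line_shape)

n_draws = 20000
# M0: flat continuum; assumed prior on the level b ~ Uniform(1, 100)
b0 = rng.uniform(1.0, 100.0, n_draws)
logL0 = np.array([log_like(counts, np.full_like(energy, b)) for b in b0])
log_evidence0 = logsumexp(logL0) - np.log(n_draws)   # log of the mean likelihood

# M1: continuum + line; priors b ~ Uniform(1, 100), amplitude a ~ Uniform(0, 50)
b1 = rng.uniform(1.0, 100.0, n_draws)
a1 = rng.uniform(0.0, 50.0, n_draws)
logL1 = np.array([log_like(counts, b + a * line_shape) for b, a in zip(b1, a1)])
log_evidence1 = logsumexp(logL1) - np.log(n_draws)

log_bf = log_evidence1 - log_evidence0               # BF in favour of the line model
print(f"log10 Bayes factor (M1 vs M0): {log_bf / np.log(10):.2f}")
```

In practice this naive estimator is noisy, which is one reason computing Bayes factors is challenging; the resulting value can be read against the Jeffreys scale on the next slide.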


Methods: Bayes Factors
· Interpretation of the BF against the Jeffreys scale:

    BF          Strength of evidence (toward M0)
    1 ~ 3       Barely worth mentioning
    3 ~ 10      Substantial
    10 ~ 30     Strong
    30 ~ 100    Very strong
    > 100       Decisive
