Äîêóìåíò âçÿò èç êýøà ïîèñêîâîé ìàøèíû. Àäðåñ îðèãèíàëüíîãî äîêóìåíòà : http://hea-www.harvard.edu/AstroStat/Stat310_fMMIV/vandyk_17may2005.pdf
Äàòà èçìåíåíèÿ: Wed Jun 8 00:16:43 2005
Äàòà èíäåêñèðîâàíèÿ: Tue Oct 2 05:58:48 2012
Êîäèðîâêà: koi8-r

Ïîèñêîâûå ñëîâà: ï ï ï ï ï ï ï ï ï ï ï
High Energy Astrophysics:
What do Statistical Methods Have to Offer?

David van Dyk Department of Statistics University of California, Irvine Joint work with The California-Harvard Astrostatistics Collaboration

1


Why should Astronomers and Statisticians Collab orate?
An Astronomer's Statistical Worldview circa 1975

"If I need statistics, it's not real"
· Astronomers were used to assuming--without thinking about it--that everything was Gauss-Normal ( · Their standard solution was ­ to "Solve" for "solutions" and to "propagate errors" or ­ to "filter" and "cut" · The same 3 or 4 tools were used for 2 . 2 fo r g o o d n e s s -o f-fi t, 3. Kolmogorov-Smirnov tests, and 4. fast Fourier transformation

and with same "sigma")

everything:

1. Simple parameter estimation via likelihood "forward fitting",

Anything b eyond these few simple tools "requires some thought". And astronomers aren't usually sufficiently trained. At least not yet!
2


What Statisticians bring to the Table
A Statistician's Worldview Includes · An insistence that statistics analysis needs to consider each stage of the measurement process. · A more robust understanding of likelihood, parameter estimation, timing analyses, "deconvolution", etc. (e.g., multiscale image analysis,...). · A toolbox of practical computational methods. · A more robust understanding of when standard tools do or e.g., Protassov et al.

do not work

In the grand sweep of history Astronomers and Statisticians often made breakthroughs together. But in the 1970's the communication apparently broke down: We are now playing catch-up.

The result: Now there are many interesting statistical problems in astronomy, b oth small and large and of interest to all!
3


California Harvard Astro-Statistics Collab oration
The History of CHASC Siemiginowska et ( t h e n A X A F ) in a direct result of were designed to CHASC Today CHASC provides a forum for discussion of outstanding statistical problems in Astrophysics. Goals include developing papers on the interface of statistics and astronomy and incorporating state-of-the-art statistical methods into CIAO. Participants include faculty, researchers, graduate and undergraduate students. CHASC Funding CHASC has been funded by two NSF Grants (DMS 01-04129 and DMS 04-38240) and by NASA Contract NAS8-39073 (CXC). Until this year, the primary source of graduate students funding has been CXC. al. outlined a number of a July 1996 presentation this presentation and ma solve problems described su at ny in ch problems pertaining to Chandra SCMA II. CHASC was initiated as of the methods that I will describe Siemiginowska et al.

CHASC is now one of several such Collab orations in the country.
4


Outline of Presentation
This talk has two components: A. Highly Structured Mo dels in High-Energy Astrophysics · Astrostatistics: Complex Sources, Complex Instruments, and Complex Questions Key: All three are the domain of Astrostatistics · M o d e l-B a s e d S t a t is t ic a l S o lu t io n s · Monte Carlo-Based Bayesian Analysis B . E x a m p le s 1. The EMC2 package for Image Analysis (A detailed example.) 2. The BLoCXS package for Spectral Analysis 3. The BEHR package for computing Hardness Ratios 4. The BRoaDEM package for DEM Reconstruction

5


Primary Collab orators
EM C2 Alanna Connors, David Esch, Margarita Karovska, and David van Dyk B LoC X S Alanna Connors, Peter Freeman, Christopher Hans, Vinay L. Kashyap, Taeyoung Park, Rostislav Protassov, Aneta Siemiginowska, David van Dyk, Yaming Yu, and Andreas Zezas, BEHR Vinay L. Kashyap, Taeyoung Park, David van Dyk, and Andreas Zezas, B R oaD E M Alanna Connors, Hosung Kang, Vinay L. Kashyap, and David van Dyk

6


Complex Astronomical Sources

0 10

photon count 400 800

12

14

16 18 wavelength (A)

20

22

Images may exhibit Spectral, Temporal, and Spatial Characteristics.
7


Astrostatistics: Complex Data Collection

· A very small sample of instruments · Earth-based, survey, interferometry, etc. · X-ray alone: at least fo u r p la n n e d m is s io n s · Instruments have different data-collection mechanism

8


Astrostatistics: Complex Questions

0 6

Photon Counts 200 400 600 800 1000

8

10

12 14 Wavelength (A)

16

18

20

· What is the composition and temperature structure?

9


Astrostatistics: Complex Questions

· Are the loops of hot gas real?

10


Scientific Context
The Chandra X-Ray Observatory · Chandra produces images at least thirty times sharper then any previous X-ray telescope. · X-rays are produced by multi-millions degree matter, e.g., by high magnetic fields, extreme gravity, or explosive forces. · Images provide understand into the hot and turbulent regions of the universe.

Unlocking this information requires subtle analysis:
The California Harvard AstroStatistics Collab oration (CHASC) · van Dyk, et al. (The Astrophysical Journal, 2001) · Protassov, et al. (The Astrophysical Journal, 2002) · van Dyk and Kang (Statistical Science, 2004) · Esch, Connors, van Dyk, and Karovska (The Astrophysical Journal, 2004) · van Dyk et al. (Bayesian Analysis, 2005)
11


Data Collection
Data is collected for each arriving photon: · the (two-dimensional) sky coordinates, · the energy, and · the time of arrival. All variables are discrete: · High resolution - finer discretization. e.g., 4096 â 4096 spatial and 1024 spectral bins The four-way table of photon counts: · Spectral analysis models the one-way energy table; · Spatial analysis models the two-way table of sky coordinates; and · Timing analysis models the one-way arrival time table

The Image: A moving `colored' picture
12


NGC 6240

Image Credits. X-ray: NASA/CXC/MPE/, Komossa et al. (2003, ApJL, 582, L15); Optical: NASA/STScI/R.P.van der Marel & J.Gerssen.

13


Highly Structured Models
Modelling the Chandra data collection mechanism.
counts per unit
+ + + + +

·
absorbtion and submaximal effective area

250

counts per unit

+ + + + ++ + ++ +++ +++ + ++ ++++++++++++++++++++++++ +++++++++++++++++++++++ ++++++++++++++++++++++++ 1 2 3 4 5 6

+ + + +

energy (keV)

counts per unit

100 150

+ + ++ + +

·

0

instrument response

50

150

·

+ +++ + ++ +++++ +++ ++ +++++++++ + ++ + +++++ +++ ++++++++++++ ++ +++ ++ +++++++++++++++++++++++++ +++ + ++ ++ 1 2 3 4 5 6

· T h e m eth o d of D ata Augmentation: EM algorithms and Gibbs samplers. · We can separate a plex problem into quence of problems of which is easy to com a s e, each solve.

0 50

150

energy (keV)

50

120

counts per unit

+++ + ++ + + + ++ + +++ +++ +++ + + + + +++++++ +++++ +++++ ++++ ++ + ++ + + + ++ ++ ++++ +++ ++++++++++++++++++++ ++ + 1 2 3 4 5 6

pile-up
+ + + +

energy (keV)

80

·

0

counts per unit

êæ æ òï ÿ ðüâ

120

+ + + +

80

ñâ

background

+ + + + + + ++++ +++++++ +++++++++++ + ++ ++++ +++++++ +++++++++++++++++ ++ ++++++ +++++++++++++ ++ + ++ ++ ++ + 1 2 3 4 5 6

0

40

+ + + + ++ ++ + + ++ ++ + + + ++ ++ +++++ ++++++++++++++++ + +++++ + ++++++++++++++++ ++++ +++++++++++++++ + ++ ++ + 1 2 3 4 5 6

energy (keV)

0

40

energy (keV)

We wish to directly model the sources and data col lection mechanism and use statistical procedures to fit the resulting highly-structured models and address the substantive scientific questions.

14


A Model-Based Statistical Paradigm
1 . M o d e l B u ild in g · Model source spectra, image, and/or time series ·M ­ ­ ­ ­ odel the data collection process background contamination instrument response effective area and absorption p ile u p What are Prior distributions? 1. Priors can be used · to incorporate information from outside the data, or · to impose structure. 2. Priors offer a principled compromise between "fixing" a parameter & letting it "float free". 3. Setting min and max limits in XSPEC amounts to using a flat prior over a specified range.

· Results in a highly structured hierarchical model 2. Model-Based Statistical Inference · Bayesian posterior distribution

· Maximum likelihood estimation 3. Sophisticated Statistical Computation Methods Are Required · Goals: computational stability and easy implementation · Emphasize natural link with models: The Method of Data Augmentation
15


Bayesian Inference Using Monte Carlo
The Building Block of Bayesian Analysis 1. The sampling distribution: p(Y | ). 2. The prior distribution: p( ). 3. Bayes theorem and the posterior distribution: p( |Y ) p(Y | )p( ) Inference Using a Monte Carlo Sample:
prior posterior joint posterior with flat prior
. . .. . . .. .. . . . . .. . . . .. . .. . . . . .. .. . . ... ... . .. . .. . . . . . .. . .. .... . .................. . .......... . . . . . . . . .... . .................. . . . . .... ... ..... . .... .. ...... ...... . .......................... ... ... .... .. . .. . . .. ..... ... ..... ...... ... .. . .. . ............................... ........ ... . . . . . . ...... . . . ............ ...... . ... . .. .. ... .... .. .. .. . ............. .... ......... .... . . . . .......................... ... ............. ....... .. . ..... . . . . . . . . .................................. . .... .. .. . .. .... ......... ..... ..... .. .. . . ........................................ .... ....... .... .. . . . ......... .. .. . . ......................... ..... . . .. . ... .. . . . ......... .... .. ...... .. . ..................................... .......... .. .... .. . . . . . .. . . . . .. ... ..... ........... .. .. . . . . . ... .. ..................... . ..... .. . . . ....... .. . ... . .. .............................................. .. .. . . .. ........... ........ .. . . .. . .. .. ...... . . . ....................... . .. . .. ... ... ..... . ..... .. . .. . . . . .. . . . . . .......... . .. . . . . . . ... .. . .... . .. . . ... ....... .. . . . . . ... .. .. . .. .. . . . . .. .

0.6

0.4

0.4

lamB 0 2 4 6 8 10

0.2

0.2

0.0

0

2

4 lamS

6

8

0.0

1.5

2.0

2.5

0

2

4 lamS

6

8

lamS

We use

MCMC (e.g., the Gibbs Sampler) to obtain the Monte Carlo sample.
16


Bayesian Deconvolution
· The Data Collection Mechanism
Blurring Matrix * known from calibration Source Model

Non-Homogeneous

Expected Photon Count

Stochastic Censoring * known from calibration

Background Contamination * often fit using background observation

The observed counts are modeled as indep endent Poisson variables with means given by .
17


Parameterized finite mixture models (source models w/ several components)

The Source Models

Smoothing prior distributions ( M u lt is c a le m o d e ls fo r d iff u s e e m is s io n )

Compound deconvolution models (simultaneous instrumental & physical "deconvolution" of complex sources)
Photon Counts 200 400 600 800 1000 0 6

8

10

12 14 Wavelength (A)

16

18

20

= P1 A1 (P2 A2 µT ) +

18


Outline of Presentation
This talk has two components: A. Highly Structured Models in High-Energy Astrophysics · Astrostatistics: Complex Sources, Complex Instruments, and Complex Q u e s t io n s · M o d e l-B a s e d S t a t is t ic a l S o lu t io n s · Monte Carlo-Based Bayesian Analysis B. Examples 1. The EMC2 package for Image Analysis (A detailed example.) 2. The BLoCXS package for Spectral Analysis 3. The BEHR package for computing Hardness Ratios 4. The BRoaDEM for DEM Reconstruction

19


Example 1: The EMC2 package for Image Analysis
The Source Model · A Poisson Process for the missing ideal counts. Zi Poisson(µi ) · A useful source model must allow for 1. extended diffuse nebula with irregular and unpredictable structure 2. one or more concentrated X-ray emitters.
K

µi =

µES i

+
k=1

µPS pi k

k

The point sources can be modeled as delta functions, Gaussians or Lorentzians.

20


Additional Model Comp onents
We can add additional model components

A jet can be modeled as a string of elongated Gaussian distributions.

21


A Smoothing Prior for the Extended Source
The Nowak-Kolaczyk Multiscale Model:

Low Resolution
z z
·· 1·

High Resolution
z
z 2·
11

z z z z

12

z z z z

21

z z z z

22

- z


- z


z z z

13

14

23

24

31

32

41

42

33

34

43

44

z·· Poisson(µ) µ Gamma{(0 , 1 )}

z ,· |z·· Multinomial(p1 ) p1 Dirich.{(1 , 1 , 1 , 1 )}

z i, |zi· Multinomial(p2i ) p2i Dirich.{(2 , 2 , 2 , 2 )}

Wavelet like model in a fully Bayesian analysis.

22


Setting the Smoothing Parameters
The Multiscale prior distribution is specified in terms of a number of Dirichlet smoothing parameters (1 , 2 , . . .). · There is one parameter at each level of resolution. · Larger values of each encourage more smoothing
5
alpha = 0.1

6

With binary s p lit s

alpha = 20

2

3

4

0

1

alpha = 2

0.0

0.2

0.4 p

0.6

0.8

1.0

· Some researchers suggest parameterizing the j , e.g., setting j = ak j . · Based on statistical properties of the model, e.g., correlation functions and posterior concavity (Nowak and Kolaczyk; Bouman, Dukic, and Meng).

Instead, we propose a strategy that fits the smoothing parameters to the data.
23


Fitting the Smoothing Parameters
We may fit the smoothing parameters (k ) if we regularize their values.
The exact shap e of the prior matters less than

· We use a common prior ­ Too much mass near zero leads to numerical instability. (Priors that put all mass in 1 quadrant.) ­ Too much mass far from zero results in too much s m o o t h in g . · A compromise: k exp(- 3 /3)
0.0 0.5 0.0 0.5

exp(- )

its general features.

exp(- 2

)

1.0

1.5

2.0

2.5

3.0

0.0

0.5

1.0

1.5

2.0

2.5

3.0

exp(- )

exp(- 3

)

1.0

1.5

2.0

2.5

3.0

0.0

0.5

1.0

1.5

2.0

2.5

3.0

These priors can be viewed as a smoother way of setting the "range" of the smoothing parameters, with specifying the range.
24


Prior Correlation Structure

A mixture prior distribution · Our prior specification depends on the choice of coordinates. · For each choice there is a corresponding multiscale prior distributions. · We propose using an equally weighted mixture of each of these priors. · Removes the checker-board pattern in the results. This "cycle-spinning" strategy is analogous to what is done with wavelets.
25


Sensitivity Analysis

simulated data and PSF

· A data set was simulated using a binary source. · Fit using two priors. · Contours are at levels 3 and 10.

results under

p() exp(-

1000 3

3 )

p() exp(-

10 3

3 )

26


Statistical Computation
· We use a three-step Gibbs sampler to construct a Markov chain with stationary distribution equal to the target posterior distribution: S S S
TEP TEP TEP

1: Sample Z given µ, , and Y 2: Sample µ given Z , , and Y . 3: Sample given Z , µ, and Y .

Here, Z is the ideal counts, µ is the image Y is the data, and is the smoothing parameters.

1000 draws of a smoothing parameter using two starting values.

Poor Mixing!
27


The Advantage of Blocking
· Original Sampler: S S S S S S
TEP TEP TEP

1: Sample Z given µ, , and Y 2: Sample µ given Z , , and Y . 3: Sample given Z , µ, and Y . 1: Sample Z given µ, , and Y 3: Sample given Z and Y . 2: Sample µ given Z , , and Y .

· A simple change:
TEP TEP TEP

1000 draws of a smoothing parameter using two starting values.

Much Better Mixing!
28


NGC 6240

29


The Effect of Nowak-Kolaczyk Multiscale Smoothing Prior

30


Evaluating the Fit

31


The Significance Maps

32


Mira: The Wonderful Star
An EMC2 image on "Astronomy Picture of the Day" (May 5, 2005)

Credit: X-ray Image: M. Karovska (Harvard-Smithsonian CfA) et al., CXC / NASA
33


Examples 2 and 3: Sp ectral Analysis and Hardness Ratios
High-Resolution Spectra · High resolution detectors such as those aboard Chandra herald a quantum leap forward for empirical high-energy astronomy · Unfortunately, standard methods (e.g., 2 fitting) rely on Gaussian assumptions and thus require a minimum count per bin. · Ad-hoc procedures that group bins are wasteful and sacrifice the desirable high-resolution inherent in the data. Hardness-Ratios · A rough summary of a spectrum is a comparison of the expected hard and expected soft counts. · This is the lowest resolution spectral analysis, but can be useful for classifying faint sources. · Again, the validity of standard methods depends on Gaussian assumptions. · For faint sources either the hard or soft counts can be very small.
34


Solution: Poisson Statistics
· Rather than basing statistical techniques on Gaussian assumptions, we can use the Poisson Distribution as a statistical model for low-count data. · Specifically, we replace the Gaussian likelihood with a Poisson likelihood: Gaussian Likelihood: Poisson Likelihood: -
bins

i -
bins

(xi - µi ) 2 i xi log µi

2

-
bins

µi +
bins

· Bayesian Methods combine the likelihood with a prior distribution that can ­ Model the dist'n of spectral characteristics in a population of sources. ­ Include information from outside the data as to the spectral shape. ­ Smooth the reconstructed spectrum.

Requires Sophisticated Statistical Computing.
35


BLoCXS Bayesian fitting of High Resolution X-ray Spectra.
BLoCXS Functionality · Uses Poisson models and no Gaussian assumptions. Thus, BLoCXS has no trouble with low count data. · Corrects for instrument response as quantified by .rmf or .rsp files. · Corrects for effective area using .arf files. · Uses a Poisson model-based strategy to correct for background contamination. There is no background subtraction and no negative counts. · Can fit absorption due to the ISM or IGM. · Allows for (broken) powerlaw, bremsstrahlung, and blackbody continuums. · Can include Gaussian, Lorentzian, and delta function emission lines. · Can compute principled p-values to test for emission lines. · An extension that will allow for pile-up correction is under development. BLoCXS Availability: Scheduled for release in the next version of CIAO.
36


Principled P-values to Test for a Model Comp onent
Fallible F-tests The F-test commonly used by Astronomers is a special case (under a assumption!) of the Likelihood Ratio Test.

Gaussian

· The LRT is valid for comparing nested models. But the smaller model's parameter must be in the interior of the larger model's parameter space. · This is not the case when testing for a model component in a spectral model. The F-test is not properly calibrated for this problem. · We conducted a survey of papers in ApJ, ApJL, and ApJS (1995-2001)
Type of Test Null Space on Boundary Comparing Non-Nested Models Other Questionable Cases Seemingly Appropriate Use of Test Number of Papers 106 17 4 56

Protassov et al. develops a method based on posterior predictive p-values to properly calibrate a test. This pap er has already b een cited 44 times.
37


BEHR Bayesian Estimation of Hardness Ratios.
BEHR Functionality · BEHR uses Poisson models background contaminated soft and hard counts. Thus, BEHR has no trouble with low count data. · BHER computes hardness ratio estimates and intervals with reliable frequency properties. (See simulation study.) BEHR Availability · BEHR will soon be available on the CXC contributed software page (cxc.harvard.edu/cont-soft/soft-exchange.html). BEHR Examples and References
van Dyk, D. A. et al. (2005). Deconvolution in High-Energy Astrophysics: Science, Instrumentation, and Methods. Bayesian Analysis, to app ear. Park, T., van Dyk, D. A., Kashyap, V. L., & Zezas, A. (2004). Computing Hardness Ratios with Poissonian Errors. CHASC Technical Report.
38


Verifying BEHR
Simulation Study · S = H = 3; each with expected background contamination = 0.1. · Background exposure is 100 times longer. · R = S/ H , H R = (H - S )/ (H + S ) , C = log10 (R)

Table 1: Coverage of Bayesian and Standard Methods.
Method Hardness Ratio R BEHR HR C Standard Method R HR C True Value 1 0 0 1 0 0 Coverage Rate 95.0% 91.5% 98.0% 96.5% 99.5% 100.0% Mean Length 7.30 1.23 1.53 138.29 3.44 7.26 Mean Square Error by mode 0.59 0.53 0.42 73.58 0.63 5.58 by mean 12.34 0.42 0.46

39


Example 4: BRoaDEM package for DEM Reconstruction
· The relative sizes of spectral emission lines are informative as to the physical environment (e.g., the temperature and composition) of a solar coronae. · Quantum physical calculations can predict the relative sizes given the temperature and and composition of a hot plasma. · We aim to invert these calculations to reconstruct the temperature distribution and composition of a stellar corona given the relative size of the spectral lines.
Photon Counts 200 400 600 800 1000 0 6

8

10

12 14 Wavelength (A)

16

18

20

40


BRoaDEM Bayesian Reconstruction of a DEM.
Functionality BRoaDEM is a Bayesian Model-Based Method that can · Simultaneously account for tens of thousands of spectral emission lines. · Fit the elemental abundances along with their statistical error bars. · Reconstruct the stellar DEM, along with measures of uncertainty. · Allow us to account for uncertainly in the relevant quantum physical calculations (i.e., the emissivity matrix tabulated in the Atomic Database (ATOMDB)). · Include diagnostic methods that can identify inconsistencies in ATOMDB.

41


Selected References
The Astrophysics Spectral and Image Models van Dyk, D. A., Connors, A., Esch, D. N., Freeman, P., Kang, H., Karovska, M., and Kashyap, V. (2005). Deconvolution in High-Energy Astrophysics: Science, Instrumentation, and Methods (With Discussion). Bayesian Analysis, to appear. Esch, D. N., Connors, A., Karovska, M., and van Dyk, D. A. (2004). A Image Restoration Technique with Error Estimates. The Astrophysical Journal, vol. 610, 1213­1227. van Dyk, D. A. and Kang, H. (2004). Highly Structured Hierarchical Models for Spectral Analysis in High Energy Astrophysics. Statist. Science, 19, 275­293. Protassov, R., van Dyk, D. A., Connors, A., Kashyap, V. L., & Siemiginowska, A. (2002). Statistics: Handle with Care, Detecting Multiple Model Components with the Likelihood Ratio Test, The Astrophysical Journal, vol. 571, 545­559. van Dyk, D. A., Connors, A., Kashyap, V. L., & Siemiginowska, A. (2001). Analysis of Energy Spectrum with Low Photon Counts, The Astrophysical Journal, vol. 548, 224­243.
42


More on Example 2: The Basic Sp ectral Models
· Photon counts modeled with Poisson process. · The Poisson parameter is a function of energy, with two basic components: 1. The continuum, a GLM for the baseline spectrum, 2. Several emission lines, a mixture of Gaussians added to the continuum. 3. Several absorption lines multiply by the continuum. 4. The continuum indicates the temperature of the source while the emission and absorption lines gives clues as to the relative abundances of elements

LAMBDA

0.0 0

0.0003

1

2 ENERGY

3

4

5

43


A Bayesian Sp ectral Analysis
Quasars · Among the most distant distinct detectable ob jects. · Believed to be super massive black holes with mass a million times that of th e s u n . · Give glimpse into the very distant past, perhaps 90% of the way to Big Bang. High Red-Shift Quasar PG1637+706 · Red-shift: wavelengths elongated as ob ject moves away: energy appears lower · By measuring the change in energy, we can recover the recession velocity, and in a uniformly expanding universe, the velocity is a direct measure of d is t a n c e .

44


The Sp ectral Model
Mo del · Power Law Continuum: f ( , Ej ) = E
C C - j
C

· Absorption model of Morrison and McCammon (1983) to account for the IS M a n d IG M . · Power Law for Background counts: f ( , Ej ) = E · Narrow Gaussian Emission Line ( = 0.125 keV) Three Mo dels for the Emission Line: M M M
ODEL ODEL B B - j
B

0: There is no emission line.

1: There in an emission line with fixed location in the spectrum but unknown intensity. 2: There is an emission line with unknown location and intensity.

ODEL

45


Finding the Sp ectral Line
EM Algorithm · Refit with 51 starting values for the line location between 1.0 and 6.0 keV · ML estimate agrees with scientific expectation (between 2.74 and 2.87 keV) Results mode (keV) 1 .0 5 9 1 .7 7 6 2 .3 6 9 2 .8 0 7 * 4 .2 1 6 5 .0 3 1 5 .7 1 5 domain of convergence (keV) 1 .0 -1 .3 1 .4 -2 .0 2 .1 -2 .3 2 .4 -3 .7 3 .8 -4 .7 4 .8 -5 .2 5 .3 -5 .9 loglikelihood 2 5 8 9 .3 1 2 5 9 0 .3 7 2 5 9 0 .1 9 2 5 9 4 .9 4 2 5 8 9 .5 7 2 5 8 9 .3 1 2 5 8 9 .7 4

Maximum loglikelihood for the model with no line: 2589.31.
46


Sampling the Ma jor Mode
· A Gibbs sampler can sample the ma jor posterior mode. · Compute estimates, error bars, and correlations. 2 .5 %
C

m e d ia n 3 .8 9 0 e -0 4 1 .3 4 8 5 4 -0 .7 2 1 1 7 -0 .2 5 7 9 3 -0 .9 2 7 2 1 1 0 4 .1 2 7 2 .7 9 4 8

9 7 .5 % 4 .3 1 7 e -0 4 1 .5 3 9 5 1 -0 .3 0 5 9 4 0 .1 4 2 9 2 -0 .5 2 5 1 5 2 0 5 .5 2 5 2 .9 4 2 2

m ean 3 .8 9 5 e -0 4 1 .3 4 8 1 9 -0 .7 2 1 3 -0 .2 6 6 1 6 -0 .9 2 5 6 1 1 0 7 .8 3 1 1 5 8 2 .7 9 5 5 1

3 .4 9 9 e -0 4 1 .1 5 6 8 3 -1 .1 3 6 1 8 -0 .7 2 3 9 5 -1 .3 2 0 9 6 3 3 .9 0 3 6 2 .6 5 6 5 7

C

A B

B

L 1, L 1,µ

47


G O T O N E X T S L ID E !

48


No Emission Line

Emission Line Fixed at 2.835 keV
·· · ·· ·· · ··· ··· ····· · ·· · · ·· ·· · · · ··· · ······· · ·· ·· ··· · · ·· · ··· · · · ··· ·· ·· ··· ·· · ····· · ····· · ·· · ·· ··· ···· · · · ···· ··· ········ · · · · · · · · ············································································· · ·· ··· ········ ···· · ··· ······ ····· · · · · ·· · · · ·· · · · · ·· ······························································· · 0 2 4 Energy (keV) 6 8 30

Model Diagnostics
Residual Plots · Gaussian Errors · Posterior Predictive Errors

· ·· ·· ·· · ··· ··· ····· · ··· · ·· · · · · ······· ·· · ········ · ·· · ·· · · · ···· ·· ··· ·· ··· ·· · ····· · ····· · ·· · ·· · ··· · · · ···· ··· ······· · · · · · · ·· ·· ·· · ·· ·· ········································································ · · ··· ········ ···· · ······· ······· · · · · ·· · ··· ································································ · · · ·· · ·· 0 2 4 Energy (keV) 6 8

Counts Gaussian Errors 10 20

30

Counts Gaussian Errors

0

0

10

20

-10

0

2

4 Energy (keV)

6

8

-10 0

··· ·· ·· · · · ··· ·· · · · · · · · · · ·· · · · · · · · · ·· · ·· · ··· ·· ·· ···· · · · · ·· · · · ·· ·· ·· ···· ·· · ·· ·· ······· ··· · · ·· · · ·· ······ ··· · ············ ········ ·· ······· ·· · ······················· ·· ·· ··· · ··· ····· · ··············································· ·· · · · ·· · · · ··· · ··· ··· ········ ···· ······ · ·· ·· · ·· ······ · · · ·· ·· · · ·· · · ·· · · · · · ··· ·· · · · · · · ·· ·· ···· · · ·· · · ·

··· ·· ·· · · ······ · · · ·· · · ·· · · · · · ·· ···· · · · · · · ·· · · · · · ··· ·· ·· · ·· ·· · ·· · ··· · ·· · · ···· ···· ·· · ·········· ··· ······· ···················· ··· ··············· · · ·· · · · · · · · · ·· · ···· ···· ········· · ·· ···· · ·· ················ ··················································· · · · ····· · ··· ········· ··· ··· ··· · ··· · · ·· · · · · · · · · ·· · · · ·· · · · ·· · · ·· · ·· · ·· · · · · · 2 4 Energy (keV) 6 8

Residuals Gaussian Errors -5 0 5

.

Residuals Posterior Predictive Errors -10 -5 0 5

0

2

4 Energy(keV)

6

8

Residuals Posterior Predictive Errors -10 -5 0 5 0

··· ·· ·· · · · ··· ·· · · · · ··· · ·· ·· · · · ·· · · · · ·· · ··· ·· ·· ····· · · · · ··· ···· · · · ···· ·· ···· ···· ·· · ·· ·············· ··· · · ·· · · ·· · · ··· · · ········ ········ ·· ··· ·· · ······················· ·· · · · · · ·· · ········ ··········· · · · ·· · ·· ······· ··· ····· ················································· ·· · ·· · ·· · ·· · ·· ······ · · ·· · · ·· · · · · · ·· ·· · · · · · · ·· ·· ···· · · ·· · · ·

Residuals Gaussian Errors -5 0 5

··· ·· ·· · · ······ · · · ·· · · ·· · · · · · ·· ···· · · · · · ·· ·· · ·· · · · · ·· · · ····· ······ · · ··· ···· ··· ···· ·· · ················· ··· · · ·· · ·· · · · · · ···· · · · · · · ·· · · · · · · · · ············ ································ · ·· ···· · · ··· ··········· ··················································· · · ·· · ·· · · · ···· ··· · · ·· ·· · ·· · ···· · · · · ··· · · · ·· · · ··· · · · ·· · · ·· · ·· · ·· · · · · · 2 4 Energy(keV) 6 8

49


Model Checking
Posterior Predictive Checks · The Likelihood Ratio Test: T (yrep ) = log · Sample yr
ep supi L( |yrep ) sup0 L( |yrep )

, i = 1, 2,
ODEL

from posterior predictive distribution under model M
Model 0 vs. Model 1 Model 0 vs. Model 2
400

0.

800

600

p=0 T(y) = 5.62

300

p = 0.013

200

400

T(y)= 5.63

0

200

0

1

2

3

4

5

6

7

0 0

100

2

4

6

8

10

T (y rep ) T (y rep ) Given the prior belief that the line is near 2.81 keV, it is legitimate to use the first ppp-value. Without such prior information, one should use the second ppp-value.
50


Conclusions
Motivation for Mo del Based Metho ds: · Asymptotic approximations may not be justified ­ 2 fitting is not appropriate for low counts · Accounting for background contamination · Accounting for pile up Motivation for Bayesian Metho ds: · Likelihood methods also require asymptotic approximations (e.g., to compute error bars) which may not be reliable · Testing for spectral or spatial features · Computation for mode finders may be intractable The Future of Data Analysis: · Problem specific modeling and computing · Less reliance on statistical black boxes and multi-purpose solutions
51