Detection: Overlapping Sources
David Jones
Harvard University Statistics Department

April 16, 2013


Introduction
X-ray data: coordinates of photon detections
PSFs of close sources overlap
Aim: inference for the number of sources and their intensities, positions and spectral distributions


Contamination approach (Kashyap et al. 1994)

Circle sources and solve a set of linear equations describing the intensities and contamination of each source circle from background and other sources
Issues:
  Not clear how the circles should be drawn
  Gaussian PSFs
  Only works with small overlap
  Only works with few sources

There are also kernel approaches, but these lack the advantage of dealing with the allocation of photons exactly


Clustering Approach: Basic Model and Notation
Data = $y_{ij}$
$n_i$ = # photons detected from source $i$
$\mu_i$ = centre of source $i$
$k$ = # sources (components)

$y_{ij} \mid \mu_i, n_i, k \sim \text{PSF centred at } \mu_i, \quad j = 1, \ldots, n_i, \quad i = 0, \ldots, k$
$(n_0, n_1, \ldots, n_k) \mid w, k \sim \mathrm{Mult}(n; (w_0, w_1, \ldots, w_k))$
$(w_0, w_1, \ldots, w_k) \mid k \sim \mathrm{Dirichlet}(\lambda, \lambda, \ldots, \lambda)$
$\mu_i \mid k \sim \text{Uniform over the image}, \quad i = 1, 2, \ldots, k$
$k \sim \mathrm{Pois}(\kappa)$

Component with label 0 is background and its "PSF" is uniform over the image (so its "centre" is irrelevant)
Reasonably insensitive to $\kappa$, the prior mean number of sources
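To make the hierarchy concrete, here is a minimal sketch that draws one data set from a model of this shape. Several pieces are assumptions not stated on the slide: the PSF is taken to be a circular Gaussian with standard deviation `psf_sd`, the image is the square `[-half_width, half_width]^2`, and the function and argument names are illustrative only. Allocating each photon independently with probabilities `w` is equivalent to drawing the counts from the multinomial.

```python
import math
import random

def sample_poisson(rng, mean):
    """Draw from Pois(mean) by Knuth's method (adequate for small means)."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_dirichlet(rng, conc, size):
    """Symmetric Dirichlet draw via normalised Gamma(conc, 1) variates."""
    draws = [rng.gammavariate(conc, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def simulate(rng, n, prior_mean_k=3.0, conc=1.0, half_width=5.0, psf_sd=1.0):
    """Draw k, weights, centres and n labelled photons from the prior model."""
    k = sample_poisson(rng, prior_mean_k)
    w = sample_dirichlet(rng, conc, k + 1)          # w[0] is the background
    centres = [(rng.uniform(-half_width, half_width),
                rng.uniform(-half_width, half_width)) for _ in range(k)]
    photons = []
    for _ in range(n):
        label = rng.choices(range(k + 1), weights=w)[0]
        if label == 0:                               # background: uniform over image
            xy = (rng.uniform(-half_width, half_width),
                  rng.uniform(-half_width, half_width))
        else:                                        # source: assumed Gaussian PSF
            cx, cy = centres[label - 1]
            xy = (rng.gauss(cx, psf_sd), rng.gauss(cy, psf_sd))
        photons.append((label, xy))
    return k, w, centres, photons
```

The sketch does not truncate source photons at the image boundary, which a real detector model would need to handle.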


3rd Dimension: Spectral Data

Can we distinguish the background and sources more accurately if we model the energy of the photons as well?

$e_{ij} \mid \alpha, \beta \sim \mathrm{Gamma}(\alpha, \beta)$ for $i = 1, \ldots, k$
$e_{0j} \sim$ Uniform to some maximum
$\alpha \sim \mathrm{Gamma}(a_\alpha, b_\alpha)$
$\beta \sim \mathrm{Gamma}(a_\beta, b_\beta)$

Using a (correctly) "informative" prior on $\alpha$ and $\beta$ versus a diffuse prior made very little difference to results.
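The energy part of the likelihood can be sketched as follows. This assumes the Gamma distribution is parameterised by shape and rate, which the slide does not state, and `e_max` is a hypothetical name for the "some maximum" of the background energies.

```python
import math

def log_gamma_pdf(e, shape, rate):
    """Log density of Gamma(shape, rate) at e > 0 (rate parameterisation assumed)."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(e) - rate * e)

def log_energy_density(e, is_background, shape, rate, e_max):
    """Log density of one photon energy under the source or background model."""
    if is_background:
        # Background energies are uniform on [0, e_max].
        return -math.log(e_max) if 0.0 <= e <= e_max else -math.inf
    return log_gamma_pdf(e, shape, rate) if e > 0.0 else -math.inf
```

Adding this term to the spatial log-likelihood of each photon is what lets the energy help separate background from sources.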


RJMCMC
Similar to Richardson & Green 1997
Knowledge of the PSF makes things easier
Insensitive to the prior mean number of sources $\kappa$, e.g. posterior for ten sources with $\kappa = 3$:

Number of Components  7      8      9      10     11     12     13
Mean                  0.029  0.058  0.141  0.222  0.220  0.157  0.082
SD                    0.018  0.019  0.022  0.029  0.027  0.021  0.014
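A short calculation illustrates what insensitivity to the prior means here. It takes the mean posterior probabilities reported above and compares the posterior mode with the prior probability at that value, assuming (as the slide suggests) a Poisson prior on the number of components with mean 3. Variable names are illustrative.

```python
import math

# Mean posterior probabilities for 7 to 13 components, as reported on the slide.
posterior_mean = {7: 0.029, 8: 0.058, 9: 0.141, 10: 0.222,
                  11: 0.220, 12: 0.157, 13: 0.082}

def poisson_pmf(k, mean):
    """Probability of k under a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mode = max(posterior_mean, key=posterior_mean.get)   # most probable count
mass_shown = sum(posterior_mean.values())            # mass on the tabulated counts
prior_at_mode = poisson_pmf(mode, 3.0)               # prior probability at the mode
```

The posterior concentrates around the true count of ten even though the assumed prior puts less than 0.1% of its mass there.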


Simulated Data

Source region (2 SD) is about 28% of the area and contains about 41% of the observations
Positions (-2, 0), (0, 1), (1.5, 0) with intensities 50, 100, 150 respectively
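The area figure can be checked by Monte Carlo under two assumptions that the slide does not state: a PSF standard deviation of 1 and a 10 by 10 image centred on the origin. With those choices the union of the three 2 SD discs covers roughly the quoted share of the image.

```python
import random

CENTRES = [(-2.0, 0.0), (0.0, 1.0), (1.5, 0.0)]   # source positions from the slide

def source_region_fraction(n_draws=100_000, half_width=5.0, psf_sd=1.0, seed=0):
    """Estimate the share of the image lying within 2 SD of any source centre."""
    rng = random.Random(seed)
    radius_sq = (2.0 * psf_sd) ** 2
    hits = 0
    for _ in range(n_draws):
        x = rng.uniform(-half_width, half_width)
        y = rng.uniform(-half_width, half_width)
        if any((x - cx) ** 2 + (y - cy) ** 2 <= radius_sq for cx, cy in CENTRES):
            hits += 1
    return hits / n_draws
```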


Joint Log Posterior


Posterior of k

Aggregation over 10 chains of the posterior probabilities (for each k the SD over the 10 chains is small)
When not using the energy information we usually can't find the faintest source


Chain 1: Posterior of k Trace


Gelman-Rubin: Posterior of k

Gelman-Rubin statistics were 1.00 (C.I. 1.01) and 1.01 (C.I. 1.01) respectively
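For reference, a basic version of the Gelman-Rubin statistic for a scalar trace can be computed as below. This is a sketch of the textbook point estimate only; it omits the degrees-of-freedom correction and the upper confidence limit that standard software reports, so it is not necessarily the exact calculation behind the numbers above.

```python
import math
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length scalar chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

Values close to 1 indicate that the chains agree. The function divides by the within-chain variance, so it fails for chains in which the trace never moves.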


Allocation of Photons
Table: Allocation breakdown: (a) ignoring energy information

                    Average      Average Allocation Breakdown
Source (intensity)  No. Photons  Background  Left   Middle  Right
Background (10/sq)  1015         0.876       0.035  0.040   0.049
Left (50)           38           0.798       0.121  0.067   0.014
Middle (100)        97           0.502       0.168  0.189   0.141
Right (150)         152          0.481       0.043  0.159   0.317

Table: Allocation breakdown: (b) using energy information

                    Average      Average Allocation Breakdown
Source (intensity)  No. Photons  Background  Left   Middle  Right
Background (10/sq)  1015         0.894       0.024  0.038   0.045
Left (50)           38           0.531       0.278  0.165   0.026
Middle (100)        97           0.293       0.122  0.346   0.239
Right (150)         152          0.305       0.028  0.141   0.526

Background is more easily distinguished from the sources when we include the energy information
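The allocation probabilities for a single photon follow from Bayes' rule: each component's weight times its density at the photon, normalised over components. The sketch below assumes a circular Gaussian PSF and takes the parameters as known, whereas the tables average over posterior draws. `energy_dens` is a hypothetical optional list holding each component's energy density at the photon's energy.

```python
import math

def gaussian_psf(x, y, cx, cy, sd):
    """Circular 2-D Gaussian density centred at (cx, cy); an assumed PSF."""
    r_sq = (x - cx) ** 2 + (y - cy) ** 2
    return math.exp(-0.5 * r_sq / sd ** 2) / (2.0 * math.pi * sd ** 2)

def allocation_probs(x, y, weights, centres, sd, area, energy_dens=None):
    """Probability that a photon at (x, y) belongs to each component.

    Index 0 is the background (uniform over an image of the given area).
    """
    dens = [1.0 / area] + [gaussian_psf(x, y, cx, cy, sd) for cx, cy in centres]
    if energy_dens is not None:
        # Multiply in the per-component energy densities when available.
        dens = [d * e for d, e in zip(dens, energy_dens)]
    joint = [w * d for w, d in zip(weights, dens)]
    total = sum(joint)
    return [j / total for j in joint]
```

When the energy densities differ strongly between background and sources, the extra factor moves probability away from the background for photons near a source, which is the effect seen between tables (a) and (b).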


Parameter Inference

Table: Parameter estimation (a) no energy information (b) with energy information

           (a) no energy information      (b) with energy information
Parameter  Mean    SD     MSE    SD/Mean  Mean    SD     MSE    SD/Mean
µ11        -1.266  0.069  0.543           -1.790  0.037  0.045
µ12        0.839   0.125  0.718           -0.101  0.064  0.014
µ21        0.401   0.067  0.165           -0.234  0.033  0.056
µ22        0.549   0.068  0.207           1.042   0.026  0.002
µ31        1.798   0.030  0.090           1.584   0.019  0.007
µ32        -0.054  0.046  0.005           -0.044  0.022  0.002
w1         0.049   0.002         0.050    0.040   0.001         0.036
w2         0.067   0.002         0.027    0.077   0.001         0.018
w3         0.086   0.003         0.032    0.115   0.002         0.014
wb         0.798   0.001         0.001    0.768   0.000         0.000
α          NA      NA     NA     NA       2.827   0.013  0.030  0.004
β          NA      NA     NA     NA       0.459   0.003  0.002  0.006

The effects are obviously less pronounced when the sources are more easily distinguished from the background
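The MSE entries for the positions appear consistent with the usual decomposition into squared bias plus variance, taking the simulated positions as the truth. That reading is an inference from the numbers, not something stated in the talk, so the helper below is only a sanity check with illustrative names.

```python
def mse_from_mean_sd(mean, sd, truth):
    """Mean squared error as squared bias plus variance."""
    return (mean - truth) ** 2 + sd ** 2

def relative_sd(mean, sd):
    """SD divided by the absolute mean, as in the SD/Mean column."""
    return sd / abs(mean)

# First coordinate of the left source (true value -2), without and with energy.
mse_without_energy = mse_from_mean_sd(-1.266, 0.069, -2.0)
mse_with_energy = mse_from_mean_sd(-1.790, 0.037, -2.0)
```

Both values agree with the table to within the rounding of the reported mean and SD.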


Real Data

Additional question: can we distinguish the spectral distributions of the sources?


What is the PSF?

Ideally a fairly accurate PSF can be obtained by training on non-overlapping sources
In the absence of an accurate PSF:
1. Approximate the number of sources (2 in this case)
2. Obtain an EM estimate of the covariance of the PSF

The presence of some clearly separated sources will obviously improve the accuracy of step 2 and generally reduce sensitivity to step 1


EM Estimate of the Covariance
We obtained

$\hat{\Sigma}_{EM} = \begin{pmatrix} 0.562 & -0.020 \\ -0.020 & 0.479 \end{pmatrix}$

A slice through the middle of the brighter source suggests the diagonal terms are not unreasonable
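One way such an estimate could be obtained is EM for a mixture of a uniform background and Gaussian sources that share a single covariance matrix. The sketch below is a generic version of that idea, not necessarily the procedure used for the real data: the number of sources is fixed by the length of `centres`, edge truncation is ignored, and all names are illustrative.

```python
import math
import random

def gauss2d(dx, dy, sxx, sxy, syy):
    """Bivariate normal density at offset (dx, dy) with covariance entries given."""
    det = sxx * syy - sxy * sxy
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def em_shared_cov(points, centres, area, n_iter=50):
    """EM for a uniform background plus Gaussians with one shared covariance."""
    k = len(centres)
    weights = [1.0 / (k + 1)] * (k + 1)     # index 0 is the background
    cov = (1.0, 0.0, 1.0)                   # starting (s_xx, s_xy, s_yy)
    centres = list(centres)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each photon.
        resp = []
        for x, y in points:
            joint = [weights[0] / area]
            joint += [weights[i + 1] * gauss2d(x - cx, y - cy, *cov)
                      for i, (cx, cy) in enumerate(centres)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: update weights, centres, then the pooled covariance.
        mass = [sum(r[i] for r in resp) for i in range(k + 1)]
        weights = [m / len(points) for m in mass]
        centres = [
            (sum(r[i + 1] * x for r, (x, _) in zip(resp, points)) / mass[i + 1],
             sum(r[i + 1] * y for r, (_, y) in zip(resp, points)) / mass[i + 1])
            for i in range(k)
        ]
        sxx = sxy = syy = 0.0
        for r, (x, y) in zip(resp, points):
            for i, (cx, cy) in enumerate(centres):
                dx, dy = x - cx, y - cy
                sxx += r[i + 1] * dx * dx
                sxy += r[i + 1] * dx * dy
                syy += r[i + 1] * dy * dy
        source_mass = sum(mass[1:])
        cov = (sxx / source_mass, sxy / source_mass, syy / source_mass)
    return weights, centres, cov
```

Pooling the covariance across sources reflects the fact that every source is blurred by the same PSF, so well separated sources help pin down the estimate.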


Problem!
The method behaves badly on the real data, possibly because the background is not uniform


Solutions?

The covariance matrix doesn't seem to be the issue: scaling the EM estimate by a range of values made very little difference
Ignoring the energy information also doesn't help
Current solution: $(w_0, w_1, \ldots, w_k) \mid k \sim \mathrm{Dirichlet}(\lambda, \lambda, \ldots, \lambda)$; previously $\lambda = 1$ but now we set $\lambda = 50$ to eliminate very weak sources
Other ideas?
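The effect of the larger Dirichlet parameter on the prior can be seen by simulation. The sketch below compares the average smallest weight under a symmetric Dirichlet with parameter 1 and with parameter 50; the choice of four components and the function names are illustrative assumptions.

```python
import random

def sample_dirichlet(rng, conc, size):
    """Symmetric Dirichlet draw via normalised Gamma(conc, 1) variates."""
    draws = [rng.gammavariate(conc, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def mean_smallest_weight(conc, size=4, n_draws=5000, seed=0):
    """Average of the smallest component weight over prior draws."""
    rng = random.Random(seed)
    smallest = [min(sample_dirichlet(rng, conc, size)) for _ in range(n_draws)]
    return sum(smallest) / n_draws

weak_prior = mean_smallest_weight(1.0)     # flat prior over the simplex
strong_prior = mean_smallest_weight(50.0)  # weights pulled towards equality
```

With the larger parameter the prior rarely produces a component with a very small weight, which is how it discourages very weak sources. The same pull towards equal weights also makes the fitted sources more similar in brightness.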


Posterior of k


Three? Potential Binaries?

Probably just an artifact of making the sources more similar in brightness through the Dirichlet parameter $\lambda$ (but could be useful with prior knowledge); a moderate choice of $\lambda$ is needed
More careful treatment of label switching is needed for inference for the parameters of potential binaries


Parameter Inference

Table: Parameter estimation for FK Aqr and FL Aqr

Parameter  Mean     SD     MSE    SD/Mean
µ11        120.980  0.017  0.000
µ12        124.846  0.017  0.000
µ21        121.415  0.036  0.001
µ22        127.400  0.036  0.001
w1         0.673    0.007         0.010
w2         0.181    0.005         0.030
wb         0.146    0.005         0.034
α          3.112    0.062  0.004  0.020
β          0.005    0.000  0.000  0.023


Extensions to Spectral Modeling

The background spectral distribution doesn't appear to be uniform at all
Model the spectral distributions of background and sources to all be different Gammas
Will allow us to look at the question of whether the two sources have different spectral distributions


Background is Not Uniform


Comparing Spectral Distribution Parameters

The 95% posterior intervals for the spectral distribution parameter of source 1 and of source 2 are nearly disjoint
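A comparison of this kind can be made directly from MCMC draws. The sketch below computes equal-tailed intervals with a simple order-statistic rule and checks whether two intervals are disjoint; the function names are illustrative and the quantile rule is one of several reasonable choices.

```python
def central_interval(samples, level=0.95):
    """Equal-tailed interval taken from the sorted posterior draws."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int((1.0 - tail) * last)]

def intervals_disjoint(first, second):
    """True when two (low, high) intervals do not overlap."""
    return first[1] < second[0] or second[1] < first[0]
```

Given draws of the same parameter for each source, `intervals_disjoint(central_interval(draws_1), central_interval(draws_2))` gives a quick indication of whether the two spectral distributions differ.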


Should the dim source be similar to background?


Summary

Works very well for simulated data
Spectral model and possibly the background spatial model need some revisions to be realistic
Need to investigate exactly why saturation occurs for the real data but not the simulated data
Potential to separate spectral distributions of different sources