Document retrieved from a search engine cache. Original document address: http://hea-www.harvard.edu/AstroStat/astro193/VLK_slides_mar23.pdf
Last modified: Tue Mar 24 02:36:37 2015
Indexed: Sun Apr 10 11:58:51 2016
Astro 193 : 2015 Mar 23
· Follow-up
    · p-values: what is the opposite of "reject the null"?
        · not "accept the null"
        · also not "reject the alternate"
        · Halsey et al. 2015, The fickle P value generates irreproducible results, Nature Methods 12, 179
          http://www.nature.com/nmeth/journal/v12/n3/full/nmeth.3288.html
· Source Detection
· Survival Analysis
· Upper Limits


Source Detection

· The process of source detection is an application of p-value based null hypothesis rejection
· Background is usually estimated locally, and multiple tests are made on a scan statistic
· The null is that the observed statistic is obtained from the background
· Subject to both false positive (Type I) and false negative (Type II) errors
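As a minimal sketch of the idea above, the following assumes the scan statistic is simply the number of counts in a cell and that the background-only null is Poisson with a known expected rate. The cell counts, the background value, and the function name `poisson_pvalue` are hypothetical, chosen only for illustration.

```python
import math

def poisson_pvalue(n_obs, background):
    """p-value of the null: Pr(N >= n_obs) for N ~ Poisson(background)."""
    if n_obs <= 0:
        return 1.0
    term = math.exp(-background)   # Pr(N = 0)
    cdf = term
    for n in range(1, n_obs):      # accumulate Pr(N <= n_obs - 1)
        term *= background / n
        cdf += term
    return max(0.0, 1.0 - cdf)

# Hypothetical scan over four cells with the same local background estimate.
counts = [2, 1, 9, 3]
background = 2.0
alpha = 0.003                      # allowed false positive (Type I) probability per test

# Reject the null (claim a detection) wherever the p-value falls below alpha.
detections = [i for i, n in enumerate(counts)
              if poisson_pvalue(n, background) < alpha]
print(detections)
```

Because the test is repeated in every cell, the expected number of false positives grows with the number of cells scanned, which is one reason the threshold is usually set conservatively.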


Survival Analysis

· False negatives → non-detections → censored data
· Left-censoring vs Right-censoring
· cumulative distribution function, $F(x) = \int_0^x ds\, f(s)$
· Survival Function $S(x) = \mathrm{Prob}(X > x) \equiv 1 - F(x)$
· Hazard Function $H(x) = f(x)/S(x) = \mathrm{Prob}(X \in [x, x+\Delta x] \mid X > x)/\Delta x$ as $\Delta x \to 0$
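To make the three functions concrete, here is a small sketch that evaluates them for an exponential distribution, chosen purely as an illustrative assumption because its hazard is constant and equal to the rate. The function names are hypothetical.

```python
import math

def pdf(x, rate):
    """Density f(x) of an exponential distribution."""
    return rate * math.exp(-rate * x)

def cdf(x, rate):
    """F(x): integral of f(s) from 0 to x."""
    return 1.0 - math.exp(-rate * x)

def survival(x, rate):
    """S(x) = Prob(X > x) = 1 - F(x)."""
    return 1.0 - cdf(x, rate)

def hazard(x, rate):
    """H(x) = f(x) / S(x)."""
    return pdf(x, rate) / survival(x, rate)

def hazard_numeric(x, rate, dx=1e-6):
    """Prob(X in [x, x+dx] | X > x) / dx, evaluated for a small dx."""
    return (cdf(x + dx, rate) - cdf(x, rate)) / survival(x, rate) / dx

print(hazard(2.0, 0.5), hazard_numeric(2.0, 0.5))
```

The two hazard evaluations should agree closely, which checks that the ratio f(x)/S(x) and the conditional-probability definition describe the same quantity.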


Survival Analysis

· Feigelson & Nelson 1985, Statistical methods for astronomical data with upper limits. I - Univariate distributions, ApJ, 293, 192
· Schmitt, J.H.M.M. 1985, Statistical analysis of astronomical data containing upper bounds - General methods and examples drawn from X-ray astronomy, ApJ, 293, 178
· asurv: astronomical survival analysis package
  http://www2.astro.psu.edu/statcodes/asurv
  superseded by R package survival


Survival Analysis: Kaplan-Meier estimator

· What is the survival function for a Normal?
· Likelihood for case of many observations of same object when it may be undetected in some cases
    $L = \prod_i N(x_i;\theta)^{\delta_i}\,\left(1 - S_N(x_i;\theta)\right)^{(1-\delta_i)}$, with $\delta_i = 1\,(0)$ for (un)detected
· Non-parametric maximum-likelihood product-limit estimator of the Survival Function in the asymptotic limit when censoring is random
    $S_{KM}(x) = \prod_{x_i \le x} (1 - d_i/N_i)$
    product of probabilities of "surviving" each interval $[x_k, x_{k+1}]$
    $\mathrm{var}(S_{KM}(x)) = S_{KM}^2(x) \sum_{x_i \le x} d_i/[N_i (N_i - d_i)]$
    $N_i$ are the total number of objects $\ge x_i$, $d_i$ are the number of detections at $x_i$
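The censored Normal likelihood above can be sketched as follows, assuming the Normal is parameterized by a mean `mu` and standard deviation `sigma` (an assumption made here for illustration). A detection contributes the density, while a non-detection contributes 1 - S_N, the probability that the true value lies below the quoted limit. All function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a Normal with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_survival(x, mu, sigma):
    """S_N(x) = Prob(X > x) for a Normal, via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def log_likelihood(x, detected, mu, sigma):
    """log L = sum_i [ delta_i log N(x_i) + (1 - delta_i) log(1 - S_N(x_i)) ]."""
    total = 0.0
    for xi, delta in zip(x, detected):
        if delta:   # detected: use the density at the measured value
            total += math.log(normal_pdf(xi, mu, sigma))
        else:       # undetected: x_i is an upper limit, use Prob(X <= x_i)
            total += math.log(1.0 - normal_survival(xi, mu, sigma))
    return total

# Same values as the slide's worked example; mu and sigma are illustrative.
x = [11.0, 12.0, 13.0, 14.0, 15.0]
detected = [1, 0, 1, 0, 1]
print(log_likelihood(x, detected, mu=13.0, sigma=2.0))
```

In practice one would maximize this function over `mu` and `sigma`; that step is left out to keep the sketch short.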


Survival Analysis: Kaplan-Meier estimator

· Example: x = {11, <12, 13, <14, 15}, d = {1, 0, 1, 0, 1}
    · x=11: N=5, d=1, $S_{KM}(x=11^+) = (1-1/5) = 0.8$
    · x=13: N=3, d=1, $S_{KM}(x=13^+) = 0.8 \times (1-1/3) = 0.533$
    · x=15: N=1, d=1, $S_{KM}(x=15^+) = 0.8 \times 0.67 \times (1-1/1) = 0$
· what if x = {<11, 12, 13, <14, 15}?
    · x=12: N=4, d=1, $S_{KM}(x=12^+) = (1-1/4) = 0.75$
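The worked example can be checked with a short sketch of the product-limit estimator. It follows the convention used on the slide (N counts all objects with value at or above x, d counts detections at x) and is not a substitute for asurv or the R survival package. The function name and the choice to report an undefined variance as NaN when N equals d are assumptions made for this illustration.

```python
def kaplan_meier(x, detected):
    """Return a list of (x_i, N_i, d_i, S_KM, var) at each detected value."""
    steps = []
    s = 1.0
    greenwood = 0.0   # running sum of d / [N (N - d)]
    for xi in sorted({v for v, det in zip(x, detected) if det}):
        n = sum(1 for v in x if v >= xi)                             # objects >= x_i
        d = sum(1 for v, det in zip(x, detected) if det and v == xi) # detections at x_i
        s *= 1.0 - d / n
        if n > d:
            greenwood += d / (n * (n - d))
            var = s * s * greenwood
        else:
            var = float("nan")   # variance term is undefined when N == d
        steps.append((xi, n, d, s, var))
    return steps

# x = {11, <12, 13, <14, 15}
first = kaplan_meier([11, 12, 13, 14, 15], [1, 0, 1, 0, 1])
# x = {<11, 12, 13, <14, 15}
second = kaplan_meier([11, 12, 13, 14, 15], [0, 1, 1, 0, 1])

for row in first:
    print(row)
```

Censored values never create a step of their own; they only change N for the detections around them, which is why moving the limit from 12 to 11 changes the first step from 0.8 to 0.75.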


Upper Limits

· Reading: Kashyap et al. 2010 (ApJ 719, 900)
· A confidence interval or a credible range gives a range of values that a parameter can have for a specified significance.
· The interval has two ends: a lower bound and an upper bound. The true value is likely higher than the lower bound and lower than the upper bound. Why is this not an upper limit?
· Because what we understand to be upper limits are:
    The largest intensity a source can have without being detected
    The smallest intensity a source should have to be detected


Upper Limits

· We define an upper limit in the context of detection
· Something is detected when some measurable statistic that is a function of the observed data exceeds a pre-set threshold
· e.g., test statistic $T = n_S$, and threshold $T^* = 5$ counts. If more than 5 counts are seen, claim detection. If fewer are seen, the source must be less bright than some value, aka Upper Limit
· Need both Type I and Type II errors to define Upper Limits


Upper Limits

· Suppose the threshold $T^*$ is defined by a false positive probability of $\alpha$ (e.g., the probability that a background fluctuation results in test statistic value $T > T^*$)
· A source with intensity $S$ will produce a signal that falls below the threshold $T^*$ with false negative probability $1-\beta$
· $U(\alpha,\beta)$ is the upper limit on $S$ such that $p(T > T^*(\alpha) \mid S, B) \ge \beta$


Upper Limits


Upper Limits : Properties

· Depends on the detection process
· Does not depend on the number of counts in source region
· Does depend on the background and exposure


Upper Limits : Recipe
1. Define a test statistic $T$ for measuring the strength of a source signal
2. Set the max probability of a false detection, $\alpha$ (e.g., $\alpha = 0.003$ for a "3$\sigma$" detection)
3. Compute the corresponding detection threshold $T^*(\alpha)$
4. Compute the probability of detection $\beta(S)$ for $T^*$
5. Define the min probability of detection $\beta_{\min}$ (e.g., $\beta_{\min} = 0.5$)
6. Compute upper limit as value of $S$ such that $\beta(S) \ge \beta_{\min}$
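The recipe can be sketched for the simplest case, assuming the test statistic T is the number of counts in the source region, that counts are Poisson, and that the expected background B in that region is known exactly. These are illustrative assumptions, and the function names are hypothetical. The threshold search and the bisection are one straightforward way to carry out steps 3 and 6, not the only one.

```python
import math

def poisson_sf(k, mu):
    """Pr(N > k) for N ~ Poisson(mu)."""
    term = math.exp(-mu)
    cdf = term
    for n in range(1, k + 1):
        term *= mu / n
        cdf += term
    return max(0.0, 1.0 - cdf)

def detection_threshold(background, alpha):
    """Step 3: smallest integer T* with Pr(T > T* | background only) <= alpha."""
    t_star = 0
    while poisson_sf(t_star, background) > alpha:
        t_star += 1
    return t_star

def detection_probability(source, background, t_star):
    """Step 4: beta(S) = Pr(T > T* | S, B)."""
    return poisson_sf(t_star, source + background)

def upper_limit(background, alpha=0.003, beta_min=0.5, tol=1e-6):
    """Steps 2-6: smallest S (to within tol) such that beta(S) >= beta_min."""
    t_star = detection_threshold(background, alpha)
    lo, hi = 0.0, 1.0
    while detection_probability(hi, background, t_star) < beta_min:
        hi *= 2.0                  # expand until the bracket contains the limit
    while hi - lo > tol:           # bisect; beta(S) increases with S
        mid = 0.5 * (lo + hi)
        if detection_probability(mid, background, t_star) >= beta_min:
            hi = mid
        else:
            lo = mid
    return t_star, hi

t_star, limit = upper_limit(background=1.0)
print(t_star, limit)
```

Note that `upper_limit` takes the background and the two probabilities as inputs but never the observed source counts, in line with the property that the upper limit depends on the detection process and the background rather than on the counts in the source region.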