Astrostatistics and High Energy Astrophysics

Eric Feigelson
Center for Astrostatistics, Penn State University

HEAD 2008


What is astrostatistics?
What is astronomy?
The properties of planets, stars, galaxies and the Universe, and the processes that govern them

What is statistics?
- "The first task of a statistician is cross-examination of data" (R. A. Fisher)
- "[Statistics is] the study of algorithms for data analysis" (R. Beran)
- "A statistical inference carries us from observations to conclusions about the populations sampled" (D. R. Cox)
- "Some statistical models are helpful in a given context, and some are not" (T. Speed, addressing astronomers)
- "There is no need for these hypotheses to be true, or even to be at all like the truth; rather ... they should yield calculations which agree with observations" (Osiander's Preface to Copernicus' De Revolutionibus, quoted by C. R. Rao)


"The goal of science is to unlock nature's secrets. ... Our understanding comes through the development of theoretical models which are capable of explaining the existing observations as well as making testable predictions. ... Fortunately, a variety of sophisticated mathematical and computational approaches have been developed to help us through this interface, these go under the general heading of statistical inference." (P. C. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, 2005)

My conclusion: The application of statistics to high-energy astronomical data is not a straightforward, mechanical enterprise. It requires careful statement of the problem, model formulation, choice of statistical method(s), and judicious evaluation of the result.


We are making mistakes!
· The likelihood ratio test for comparing two nested parametric models cannot be applied when the null value of a parameter lies on the boundary of its allowed range, e.g. a line strength of zero
(Protassov, van Dyk et al. 2002)

· Probabilities from the 1-sample Kolmogorov-Smirnov test comparing a univariate dataset to its best-fit model are incorrect, because the model parameters were estimated from the same data
(Lilliefors 1969; Babu & Feigelson ADASS XV 2006)

· The Anderson-Darling test is often more sensitive than the K-S test, and there is no valid 2-dimensional K-S test
(Stephens 1974; Simpson 1951)

· Power-law models should not be fit to binned data; use the MLE on the original events (Crawford et al. 1970)
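
The K-S point above can be checked with a small simulation. The sketch below is illustrative only: it assumes Gaussian data, a Gaussian model, a sample size of 50, and the asymptotic 5% critical value 1.358/sqrt(n). The function names are hypothetical and do not come from the talk. It compares how often the test rejects a true model when the parameters are known in advance and when they are fitted to the same data.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, mu, sigma):
    """One-sample K-S distance between the empirical CDF and N(mu, sigma)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rejection_rates(n=50, trials=2000, seed=1):
    """Fraction of true-model datasets rejected at the nominal 5% level.

    Returns (rate with known parameters, rate with fitted parameters).
    """
    rng = random.Random(seed)
    # Asymptotic 5% critical value; it assumes a fully specified model.
    crit = 1.358 / math.sqrt(n)
    known = 0
    fitted = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Case 1: model parameters fixed in advance.
        if ks_statistic(sample, 0.0, 1.0) > crit:
            known += 1
        # Case 2: model parameters estimated from the same sample.
        mu_hat = statistics.fmean(sample)
        sigma_hat = statistics.stdev(sample)
        if ks_statistic(sample, mu_hat, sigma_hat) > crit:
            fitted += 1
    return known / trials, fitted / trials


if __name__ == "__main__":
    known_rate, fitted_rate = rejection_rates()
    print("rejection rate, known parameters: ", known_rate)
    print("rejection rate, fitted parameters:", fitted_rate)
```

If the tabulated probabilities were valid in both cases, both rates would be close to 5%. With fitted parameters the rate falls far below that, so the probabilities have to be recalibrated, for instance by simulation.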


We use an unnecessarily narrow suite of statistical methods
Modern statistics is vast. HEA encounters problems in: image analysis, time series analysis, model selection, regression, nonparametrics, spatial point processes, multivariate analysis, survival analysis, ...
Dozens of monographs are published each year in these fields.
The software situation is much improved: R has emerged as the premier public-domain statistical software package. Similar to IDL in style, but with a huge range of built-in statistical functionalities. See http://r-project.org and tutorials at http://astrostatistics.psu.edu


We are making progress!
· Growth of research collaborations in astrostatistics: California-Harvard Astrostatistics Collaboration (van Dyk et al.) specifically oriented towards HEA. Also groups at CMU, Berkeley, Michigan, Penn State, Cornell, SAMSI.
· Growth of conference series (SCMA, ADA, PhysStat, SAMSI) and monographs (Starck/Murtagh, Gregory, Lupton, Wall/Jenkins) for advanced statistical treatment of astronomical data
· Week-long Summer School in Statistics for Astronomers held at Penn State since 2005. In steady state, we are training ~10% of the world's astronomy graduate students.


Some contemporary issues
· Treatment of measurement errors. See Bayesian approach to regression by Kelly (ApJ 2007)
· Hardness ratios at low count rates. Surprisingly tricky! See review by Brown et al. (Stat Sci 2001) and Bayesian solution by Park et al. (ApJ 2006)
· Upper limits in Poisson background. Surprisingly tricky! See review by Cowan (SCMA IV 2007)
· Increased use of numerical bootstrap and MCMC numerical confidence intervals (Babu 1984; Ptak, this session)
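
The bootstrap confidence intervals mentioned in the last point can be sketched as follows. This is a minimal illustration, not the method of any cited paper: it assumes a pure power law above a known threshold x_min, uses the standard unbinned maximum-likelihood estimator of the index, and forms a percentile interval from nonparametric resamples of the events. The function names, sample size, true index, and seeds are hypothetical.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values with density proportional to x**(-alpha) for x >= x_min.

    Uses inverse-transform sampling; requires alpha > 1.
    """
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


def mle_index(events, x_min):
    """Unbinned maximum-likelihood estimate of the power-law index alpha."""
    return 1.0 + len(events) / sum(math.log(x / x_min) for x in events)


def bootstrap_interval(events, x_min, n_boot=2000, level=0.90, seed=2):
    """Percentile bootstrap confidence interval for the power-law index."""
    rng = random.Random(seed)
    n = len(events)
    # Resample the events with replacement and re-estimate the index each time.
    estimates = sorted(
        mle_index(rng.choices(events, k=n), x_min) for _ in range(n_boot)
    )
    lower = estimates[int((1.0 - level) / 2.0 * n_boot)]
    upper = estimates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Simulated event list standing in for real data (assumed true index 2.5).
    events = sample_power_law(500, 2.5, 1.0, random.Random(1))
    alpha_hat = mle_index(events, 1.0)
    lower, upper = bootstrap_interval(events, 1.0)
    print("MLE index:", alpha_hat)
    print("90% bootstrap interval:", lower, upper)
```

The percentile interval is the simplest bootstrap variant. It works directly on the original events and needs no binning.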