This document was retrieved from a search engine cache. Original document address: http://hea-www.harvard.edu/AstroStat/Stat310_0809/hl_20080909.pdf
Last modified: Tue Sep 9 22:29:18 2008
Indexed: Tue Oct 2 04:16:26 2012
Introduction

Assessing Model Preference
H. Lee
Harvard-Smithsonian Center for Astrophysics

September 9, 2008

Hyunsook Lee



Motivation

- Vinay called me to get a third opinion about what is known as Vuong's test.
- Not knowing how their conversation began, I started thinking about it in my own way. What is wrong with information criteria such as AIC and BIC, which we use so often? Why do astronomers not care to rank candidate models by well-known model selection criteria? Inference comes after one chooses a model, and as a consequence the results of inference depend on the model choice.
- The shortcoming of information criteria (IC) for astronomers is that they do not provide a level of significance for choosing one model over another. Astronomers want to know Prob(choosing model A over model B), not AIC(model A) < AIC(model B) < AIC(model C) ...


Vinay and Andrea's example

Power law vs. blackbody (see Vinay's slog post):

PL: $f(x) = A_p \left( \frac{x}{x_{\min}} \right)^{-\alpha}$

BB: $f(x) = A_b \, \frac{x^2}{e^{x/kT} - 1}$

- The normalization constants $A_p$ and $A_b$ have nothing in common except their physical meaning.
- Both models have two parameters to be estimated. The parameter spaces of the two models are non-nested.
- For convenience, we assume independence within each of the pairs $(A_p, \alpha)$ and $(A_b, T)$. (In fact, I keep hearing that $A_p$ and $\alpha$ are correlated.)
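The two candidate models can be written down as plain functions. This is only a minimal sketch, assuming the power-law form A_p (x / x_min)^(-alpha) and the photon blackbody form A_b x^2 / (exp(x / kT) - 1); the function names, the symbol alpha for the power-law index, and all numerical values are illustrative assumptions, not taken from the slides.

```python
import math


def power_law(x, a_p, alpha, x_min=1.0):
    """PL: f(x) = A_p * (x / x_min) ** (-alpha)."""
    return a_p * (x / x_min) ** (-alpha)


def blackbody(x, a_b, kt):
    """BB: f(x) = A_b * x**2 / (exp(x / kT) - 1)."""
    # expm1 computes exp(y) - 1 accurately for small y
    return a_b * x ** 2 / math.expm1(x / kt)


# Hypothetical energy grid and parameter values, for illustration only
energies = [0.5, 1.0, 2.0, 4.0]
pl = [power_law(x, a_p=10.0, alpha=2.0, x_min=0.5) for x in energies]
bb = [blackbody(x, a_b=10.0, kt=1.0) for x in energies]
```

Each model has two free parameters, but neither function can be obtained from the other by fixing a parameter, which is what "non-nested" means here.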


Astronomer's Model Selection I. Reduced $\chi^2$

- First, they bin (group) the data to increase the S/N for $\chi^2$ fitting. Then they compare the reduced $\chi^2$ values to find the better-fitting model (this does not produce a significance level).
- $(D_i^b - M_i^b)/\sqrt{M_i^b} \sim N(0, 1)$, where $D_i^b$ and $M_i^b$ denote the binned data based on the observed Poisson counts ($D_i$) and the model (predicted) counts ($M_i$). This approximation is correct if the specified model is the right one. The reduced $\chi^2$ indicates how well the chosen model fits the data. If both models produce reasonable reduced $\chi^2$ values, we could devise a test similar to the F test, in which the ratio of the two reduced $\chi^2$ values is F-distributed under the null hypothesis ($H_0$: $\chi^2_{\rm red}(A)/\chi^2_{\rm red}(B) = 1$).
- Instrument effects and pile-up are simplified, ignored, or assumed to be taken care of.
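The binning and reduced chi-square computation described above might look like the following. It is a rough sketch under stated assumptions: the counts and predicted counts are made up, the bin size of 2 is arbitrary, and `bin_counts` and `reduced_chi2` are hypothetical helper names.

```python
def bin_counts(values, bin_size):
    """Group consecutive channels by summing `bin_size` of them at a time."""
    return [sum(values[i:i + bin_size]) for i in range(0, len(values), bin_size)]


def reduced_chi2(data_b, model_b, n_params):
    """Sum of squared standardized residuals over (number of bins - n_params)."""
    chi2 = sum((d - m) ** 2 / m for d, m in zip(data_b, model_b))
    dof = len(data_b) - n_params
    return chi2 / dof


# Made-up Poisson counts per channel and made-up model predictions
counts = [3, 5, 4, 6, 2, 4, 1, 3, 2, 1, 0, 2]
model = [4.0, 4.0, 4.0, 4.0, 3.0, 3.0, 3.0, 3.0, 1.5, 1.5, 1.5, 1.5]

data_b = bin_counts(counts, 2)    # binned data D^b
model_b = bin_counts(model, 2)    # binned model counts M^b
red_chi2 = reduced_chi2(data_b, model_b, n_params=2)
```

Note that only the number of fitted parameters enters through the degrees of freedom; nothing else about the model survives in the statistic.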


- Note that this $\chi^2$ fitting process transforms a complex parameter space into a simple Gaussian error space. The degrees of freedom are the only information about the models that is carried over. The transformed space cannot tell whether the models are nested or not. The residuals do not reveal the model properties (PL or BB). It could be viewed as dimension reduction in a degrading fashion, because information is lost. Furthermore, we must be sure that both models are relatively good fits before comparing them.
- If $A_p$ and $\alpha$ are dependent, the reduced $\chi^2$ with $(n - p)$ degrees of freedom, where $p = 2$, does not seem proper. How do we get the effective degrees of freedom? Will $(D_i^b - M_i^b)/\sqrt{M_i^b} \overset{\rm iid}{\sim} N(0, 1)$ still be valid?


Astronomer's Model Selection II. LRT

When comparing nested models, with or without binning, astronomers use the likelihood ratio test (LRT). Unfortunately, the LRT has its own limits (see papers and slog posts) and is not suitable for comparing PL and BB to produce a significance for model preference. The notion of model misspecification (if one model is right, the other is misspecified) in the general sense is not considered with the LRT. The term "model misspecification" is not discussed in the astronomical community.
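For reference, the LRT statistic in its usual textbook form is given below. This is the standard statement rather than anything taken from the slides, and the symbols $\Theta_0$, $\Theta_1$, and $k$ are notation introduced here for illustration.

```latex
\Lambda = -2 \log
  \frac{\sup_{\theta \in \Theta_0} L(\theta; D)}
       {\sup_{\theta \in \Theta_1} L(\theta; D)}
  \;\xrightarrow{d}\; \chi^2_k ,
\qquad \Theta_0 \subset \Theta_1 , \quad
k = \dim \Theta_1 - \dim \Theta_0 .
```

The limiting $\chi^2_k$ distribution relies on the smaller model being nested inside the larger one, along with regularity conditions. For PL versus BB neither parameter space contains the other, so there is no such $k$ and this reference distribution is not available.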


Model Selection - Statisticians' viewpoint
See books (there are not many):
- Burnham, K.P. and Anderson, D.R. (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd Ed.), Springer-Verlag, NY
- Claeskens and Hjort (2008) Model Selection and Model Averaging, Cambridge Univ. Press, UK
- Lahiri, P. (2001) Model Selection, IMS Lecture Notes 38, IMS
- Linhart, H. and Zucchini, W. (1986) Model Selection, John Wiley & Sons, NY
- McQuarrie, A.D.R. and Tsai, C.-L. (1998) Regression and Time Series Model Selection, World Scientific Publishing Co. Inc., NJ

Papers from the slog on testing non-nested models:
http://groundtruth.info/AstroStat/slog/2008/non-nested-hypothesis-tests/

More Frequentist and Bayesian Model Selection and Model Averaging papers (heavy on variable selection).


The problem is that the main ideas of these papers cannot be readily mapped onto the astronomical problem of choosing the right model among candidates. Statistical work on non-nested or semi-nested model selection relies heavily on regression settings and focuses on variable selection.


P(choosing model A over model B)

We assume the best-fit estimators are unbiased. Devise some likelihood-type measure $L$ so that the goal becomes assessing $P(L(A) > L(B))$:
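$$
\begin{aligned}
P(\text{choosing model } A \text{ over } B) &= P(L(A) > L(B)) \\
&= E[P(L(A) > L(B) \mid D)] \\
&= \frac{1}{N} \sum_{i=1}^{N} P(L(A; D_i) > L(B; D_i)) + o(n^{-\delta}), \quad \delta > 0
\end{aligned}
$$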

We know that $D_i \sim \mathrm{Poisson}(M_i)$, where $M_i$ depends on the model choice, and we have the best-fit values of $A_p$, $\alpha$, $A_b$, $T$ given $D$.
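One way to make P(L(A) > L(B)) concrete is a small simulation. The sketch below assumes that L is the Poisson log-likelihood and that the predicted counts of both models are held fixed (no refitting on each simulated dataset), which is a simplification; the predicted counts, the choice of model A as the truth, and all function names are made up for illustration.

```python
import math
import random


def poisson_loglike(data, model):
    """Poisson log-likelihood of observed counts given predicted counts."""
    return sum(d * math.log(m) - m - math.lgamma(d + 1)
               for d, m in zip(data, model))


def draw_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def preference_frequency(truth, model_a, model_b, n_sims=500, seed=0):
    """Fraction of simulated datasets with L(A; D) > L(B; D)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        data = [draw_poisson(m, rng) for m in truth]
        if poisson_loglike(data, model_a) > poisson_loglike(data, model_b):
            wins += 1
    return wins / n_sims


# Hypothetical predicted counts per channel under each model
model_a = [20.0, 10.0, 5.0, 2.5]
model_b = [6.0, 12.0, 8.0, 3.0]

# Simulate data from model A and count how often A is preferred
freq = preference_frequency(model_a, model_a, model_b)
```

The returned frequency is a number between 0 and 1 rather than a ranking, which is the kind of statement the slides argue astronomers are after. A fuller treatment would refit $A_p$, $\alpha$, $A_b$, $T$ on every simulated dataset.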
