Original document: http://hea-www.harvard.edu/AstroStat/etc/Schafer_AAS215_Jan2010_231.06.pdf
Improved Astronomical Inferences via Nonparametric Density Estimation

Chad M. Schafer, InCA Group
www.incagroup.org
Department of Statistics, Carnegie Mellon University
Work supported by NSF, NASA-AISR Grant
January 2010


The Core Collaborators
Susan M. Buchman, Peter E. Freeman, Ann B. Lee, Joseph W. Richards


Motivation
Theory predicts the distribution of observables as a function of cosmological parameters.


For example, the parameters

Ω_m   = total matter density                   0.40
Ω_b   = baryonic matter density                0.056
Ω_Λ   = dark energy density                    0.60
H_0   = the Hubble parameter                   64.6 km/s/Mpc
τ     = the optical depth                      0.075
n_s   = spectral index of initial spectrum     0.99
A     = amplitude of initial spectrum          0.79

parameterize the power spectrum of the CMB anisotropy.




[Figure: CMB angular power spectrum, ℓ(ℓ+1)C_ℓ/2π (µK²), plotted against angular frequency ℓ from 0 to 900.]


[Figure: image courtesy of the WMAP Science Team.]


Density estimation, i.e., estimating the distribution from which a sample of data was drawn, plays a key role. Assuming a parametric form is convenient, but often difficult to justify. Nonparametric density estimation drops these restrictions.


Nonparametric Density Estimation
[Figure: histogram of 1,425 galaxy redshifts; density (0 to 12) versus redshift (0.0 to 0.5).]


[Figure: the redshift histogram compared with the best-fitting gamma distribution.]
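As a rough illustration of the parametric route, the sketch below fits a gamma distribution to a sample. How the best-fitting gamma in the figure was obtained is not stated; the method of moments is used here only as a simple stand-in, and the data are simulated with hypothetical shape and scale values, not the 1,425 redshifts.

```python
import math
import random
import statistics


def fit_gamma_moments(sample):
    """Method-of-moments gamma fit: returns (shape, scale)."""
    mean = statistics.fmean(sample)
    var = statistics.variance(sample)
    return mean * mean / var, var / mean


def gamma_pdf(x, shape, scale):
    """Gamma density at x for the given shape and scale (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


# Simulated stand-in data (hypothetical shape 4.0 and scale 0.03).
rng = random.Random(0)
sample = [rng.gammavariate(4.0, 0.03) for _ in range(1425)]

shape_hat, scale_hat = fit_gamma_moments(sample)
print(f"shape = {shape_hat:.2f}, scale = {scale_hat:.4f}")
```

Whatever the fitting method, the fitted curve can only take shapes that the gamma family allows, which is the restriction the nonparametric approach removes.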


[Diagram: data and assumptions combine to produce the estimate.]
Parametric case: the contribution of the assumptions is fixed.


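[Diagram: data and assumptions combine to produce the estimate, with the weight on the assumptions set by h_n.]
Nonparametric case: the contribution of the assumptions is controlled by a smoothing parameter (bandwidth) h_n. Optimally, h_n is of order n^(-1/(4+d)), where d is the dimension of the data.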
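To make the rate n^(-1/(4+d)) concrete, the following minimal sketch tabulates it for a few sample sizes and dimensions. The function name and the constant `c` are hypothetical; the rate fixes only how the bandwidth scales with n, not its absolute size.

```python
def rate_bandwidth(n, d, c=1.0):
    """Bandwidth c * n**(-1/(4+d)); the constant c is a placeholder."""
    if n <= 0 or d <= 0:
        raise ValueError("n and d must be positive")
    return c * n ** (-1.0 / (4 + d))


# Print the rate for a few sample sizes in 1, 2, and 3 dimensions.
for d in (1, 2, 3):
    print(d, [round(rate_bandwidth(n, d), 3) for n in (100, 1000, 10000)])
```

The bandwidth shrinks slowly as n grows, and more slowly still as the dimension d increases, which is one reason density estimation becomes harder in higher dimensions.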


Nonparametric Density Estimation

-10

-5

0

5

10

Kernel density estimation puts a smooth mass at each data point. n controls the width of the "bumps."
13
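The construction can be sketched in a few lines. This version assumes a Gaussian kernel (the talk does not specify one) and uses made-up data points; `h` plays the role of the bandwidth h_n.

```python
import math


def gaussian_kde(data, h):
    """Return the estimate x -> (1/(n h)) * sum_i K((x - x_i)/h), Gaussian K."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # One Gaussian bump of width h per data point, averaged.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Hypothetical data points, chosen only for illustration.
data = [-6.0, -2.5, -1.0, 0.5, 1.5, 4.0, 7.0]
f_hat = gaussian_kde(data, h=1.0)

# Sanity check: a Riemann sum over a wide grid should be close to 1.
step = 0.01
total = sum(f_hat(-20.0 + i * step) for i in range(4001)) * step
print(f"integral of the estimate is approximately {total:.4f}")
```

Because each bump integrates to 1/n, the estimate is itself a density for any positive h.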


[Figure: parametric versus nonparametric (kernel density) estimate of the redshift density.]


[Figure: redshift density estimate with the bandwidth h_n chosen too small, i.e., too much weight on the data.]


[Figure: redshift density estimate with the bandwidth h_n chosen too large, i.e., too much weight on the assumptions.]
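One way to see both failure modes numerically is to count the local maxima of the estimate as the bandwidth varies. The sketch below assumes a Gaussian kernel and a simulated two-component sample (hypothetical, not the redshift data); the three bandwidth values are arbitrary choices for illustration.

```python
import math
import random


def kde_on_grid(data, h, grid):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in grid]


def count_modes(values):
    """Number of strict interior local maxima in a list of values."""
    return sum(1 for i in range(1, len(values) - 1)
               if values[i - 1] < values[i] > values[i + 1])


# Simulated two-component sample (hypothetical; not the redshift data).
rng = random.Random(1)
data = ([rng.gauss(-2.0, 1.0) for _ in range(100)]
        + [rng.gauss(2.0, 1.0) for _ in range(100)])
grid = [-8.0 + 0.02 * i for i in range(801)]

# Small, moderate, and large bandwidths.
modes = {h: count_modes(kde_on_grid(data, h, grid)) for h in (0.05, 0.6, 5.0)}
print(modes)
```

A very small bandwidth produces many spurious bumps that follow individual data points, while a very large one smooths the two components into a single peak.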


[Figure: true density f(x) and the closest Gaussian, for x from -4 to 4.]
Truth is not quite a Gaussian distribution.


[Figure: error (mean integrated squared error) versus sample size n, from 0 to 1000, for the estimator that assumes a Gaussian and for the nonparametric estimator.]
Even at moderate sample sizes, the nonparametric estimator is superior.
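A comparison of this kind can be sketched with a small simulation. Everything specific here is an assumption: the true density is an equal mixture of two Gaussians (deliberately further from Gaussian than the example in the figure, so that a single sample shows the effect), the kernel is Gaussian, and the bandwidth comes from the normal-reference rule of thumb. The code computes the integrated squared error for one sample; averaging over many simulated samples would approximate the mean integrated squared error.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def true_density(x):
    """Assumed truth: an equal mixture of N(-1.5, 1) and N(1.5, 1)."""
    return 0.5 * normal_pdf(x, -1.5, 1.0) + 0.5 * normal_pdf(x, 1.5, 1.0)


def integrated_squared_error(estimate, grid, step):
    """Riemann-sum approximation of the integral of (estimate - truth)^2."""
    return sum((estimate(x) - true_density(x)) ** 2 for x in grid) * step


rng = random.Random(2)
n = 500
sample = [rng.gauss(rng.choice((-1.5, 1.5)), 1.0) for _ in range(n)]

mu_hat = statistics.fmean(sample)
sigma_hat = statistics.stdev(sample)
h = 1.06 * sigma_hat * n ** (-1.0 / 5.0)  # normal-reference rule of thumb


def gaussian_fit(x):
    """Parametric estimate: a single Gaussian with fitted mean and spread."""
    return normal_pdf(x, mu_hat, sigma_hat)


def kde(x):
    """Nonparametric estimate: Gaussian kernel density estimate."""
    return sum(normal_pdf(x, xi, h) for xi in sample) / n


step = 0.05
grid = [-8.0 + step * i for i in range(321)]
ise_gauss = integrated_squared_error(gaussian_fit, grid, step)
ise_kde = integrated_squared_error(kde, grid, step)
print(f"ISE, Gaussian fit: {ise_gauss:.5f}; ISE, kernel estimate: {ise_kde:.5f}")
```

The error of the Gaussian fit does not vanish as n grows, because the wrong model keeps a fixed bias, whereas the error of the kernel estimate keeps shrinking.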


Bivariate Density Estimation

[Figure: sample of 15,057 SDSS quasars (Richards et al. 2006); absolute magnitude (-24 to -30) versus redshift (0 to 5).]
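For orientation, a generic bivariate kernel estimate can be sketched as follows. This is not the estimator of Schafer (2007): it is a plain Gaussian product kernel applied to simulated (redshift, absolute magnitude) pairs with hypothetical values, and it ignores selection effects such as a flux limit that a real quasar sample would carry. The bandwidths use the rate n^(-1/(4+d)) with d = 2 and arbitrary per-axis constants.

```python
import math
import random


def product_kernel_kde(points, hx, hy):
    """Bivariate estimate with a Gaussian product kernel and bandwidths hx, hy."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * hx * hy)

    def density(x, y):
        return norm * sum(
            math.exp(-0.5 * (((x - px) / hx) ** 2 + ((y - py) / hy) ** 2))
            for px, py in points)

    return density


# Simulated (redshift, absolute magnitude) pairs; hypothetical values only.
rng = random.Random(3)
points = [(rng.gauss(2.0, 0.7), rng.gauss(-26.0, 1.0)) for _ in range(200)]

n, d = len(points), 2
scale = n ** (-1.0 / (4 + d))  # rate n^(-1/(4+d)) with d = 2
f_hat = product_kernel_kde(points, hx=0.7 * scale, hy=1.0 * scale)

print(f"estimate near the bulk of the sample: {f_hat(2.0, -26.0):.4f}")
```

Using separate bandwidths per axis matters here because redshift and magnitude are measured on different scales.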


[Figure: bivariate luminosity function estimate (Schafer 2007); absolute magnitude (-24 to -30) versus redshift (0 to 5).]


[Figure: cross-sections of the luminosity function estimate as a function of absolute magnitude M_i at z = 0.49, 1.25, 2.01, 2.8, 3.75, and 4.75, compared with the "standard" approach.]


Working in Higher Dimensions
[Figure: SDSS galaxy spectrum; line flux (10^-17 erg/cm^2/s/Angstrom) versus wavelength (4000 to 9000 Angstroms).]


[Figure: 3,846 galaxy spectra plotted against three coordinates, colored by redshift (Richards, Freeman, Lee, and Schafer 2009a).]


[Figure: examples of galaxy image data.]


[Figure: 200 galaxies plotted against the first, second, and third coordinates, colored by eccentricity.]


The Big Picture
[Diagram: distribution space, encoding space (components 1, 2, and 3), and data space, showing the physically possible distributions and a confidence/credible region.]
Once the data are represented in a low-dimensional encoding space, nonparametric density estimation is useful for comparing observations and theory.
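The two-step idea (encode, then estimate a density in the encoding space) can be sketched with a linear stand-in. The code below uses principal component analysis via power iteration as the encoding, which is an assumption made for simplicity and not necessarily the encoding used in the talk; the 10-dimensional data are simulated around a hypothetical one-dimensional structure.

```python
import math
import random
import statistics


def leading_direction(rows, iterations=50):
    """Leading principal direction of centered rows via power iteration."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    return v


# Simulated 10-dimensional data lying near a line (hypothetical structure).
rng = random.Random(4)
dim = 10
raw = [float(j + 1) for j in range(dim)]
raw_norm = math.sqrt(sum(x * x for x in raw))
true_dir = [x / raw_norm for x in raw]
rows = []
for _ in range(300):
    t = rng.gauss(0.0, 3.0)
    rows.append([t * u + rng.gauss(0.0, 0.1) for u in true_dir])

# Step 1: encode each observation as a single coordinate.
means = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
v = leading_direction(centered)
codes = [sum(r[j] * v[j] for j in range(dim)) for r in centered]
alignment = abs(sum(a * b for a, b in zip(v, true_dir)))

# Step 2: kernel density estimate in the one-dimensional encoding space.
h = 1.06 * statistics.stdev(codes) * len(codes) ** (-1.0 / 5.0)


def density_in_encoding_space(x):
    """Gaussian kernel density estimate of the encoded coordinates."""
    norm = 1.0 / (len(codes) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - c) / h) ** 2) for c in codes)


print(f"alignment with the true direction: {alignment:.4f}")
```

Estimating the density of one encoded coordinate needs far fewer observations than estimating a density in the original 10 dimensions, which is the practical payoff of the encoding step.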


References
Buchman, Lee, and Schafer (2009). To appear in Statistical Methodology. arXiv:0907.0199.
Richards et al. (2006). AJ 131, 2766.
Richards, Freeman, Lee, and Schafer (2009a). ApJ 691, 32-42.
Schafer (2007). ApJ 661, 703-713.
Schafer and Stark (2009). J. Amer. Stat. Assoc.