Документ взят из кэша поисковой машины. Адрес оригинального документа : http://hea-www.harvard.edu/AstroStat/pyBLoCXS/_sources/index.txt
Дата изменения: Fri Oct 29 20:04:11 2010
Дата индексирования: Tue Oct 2 07:57:37 2012
Кодировка:
Поисковые слова: п п п п п п п п п п п п п п п п

.. pyBLoCXS documentation master file, created by
sphinx-quickstart on Tue Aug 17 10:11:57 2010.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.

Welcome to pyBLoCXS documentation!
==================================

pyBLoCXS is a sophisticated Markov chain Monte Carlo (MCMC) based algorithm
designed to carry out Bayesian Low-Count X-ray Spectral (BLoCXS) analysis in the
Sherpa environment. The code is a Python extension to Sherpa that explores
parameter space at a suspected minimum using a predefined Sherpa model to
high-energy X-ray spectral data. pyBLoCXS includes a flexible definition of
priors and allows for variations in the calibration information. It can be used
to compute posterior predictive p-values for the likelihood ratio test (see
Protassov et al., 2002, ApJ, 571, 545). Future versions will allow for the
incorporation of calibration uncertainty (Lee et al., 2010, submitted to ApJ).

MCMC is a complex computational technique that requires some sophistication on
the part of its users to ensure that it both converges and explores the
posterior distribution properly. The pyBLoCXS code has been tested with a
number of simple single-component spectral models. It should be used with
great care in more complex settings. Readers interested in Bayesian low-count
spectral analysis should consult van Dyk et al. (2001, ApJ, 548, 224). pyBLoCXS
is based on the methods in van Dyk et al. (2001) but employs a different MCMC
sampler than is described in that article. In particular, pyBLoCXS has two
sampling modules. The first uses a Metropolis-Hastings jumping rule that is a
multivariate t-distribution with user specified degrees of freedom centered on
the best spectral fit and with multivariate scale determined by the Sherpa
function, covar(), applied to the best fit. The second module mixes this
Metropolis Hastings jumping rule with a Metropolis jumping rule centered at the
current draw, also sampling according to a t-distribution with user specified
degrees of freedom and multivariate scale determined by a user specified scalar
multiple of covar() applied to the best fit.

A general description of the MCMC techniques we employ along with their
convergence diagnostics can be found in Appendices A.2 - A 4 of van Dyk et
al. (2001) and in more detail in Chapter 11 of Gelman, Carlin, Stern, and Rubin
(Bayesian Data Analysis, 2nd Edition, 2004, Chapman & Hall/CRC).

Download
========

September 15, 2010 pyblocxs-0.0.3.tar.gz_

.. _pyblocxs-0.0.3.tar.gz: http://hea-www.harvard.edu/AstroStat/pyBLoCXS/pyblocxs-0.0.3.tar.gz

Dependencies
============

pyBLoCXS requires

:Sherpa 4.2.1 or later [sherpa]_: A 1D/2D modeling and fitting application written
as a set of Python modules.

.. [sherpa] http://cxc.harvard.edu/contrib/sherpa

Documentation
=============

The code is developed as Chandra contributed software.

Setup Instructions
------------------

To run it within *CIAO/Sherpa*, *cd* to *.../pyblocxs-0.0.3*, start *Sherpa*,
and type *import pyblocxs*.

Or

To run it directly in Python, *cd* to *.../pyblocxs-0.0.3*, and
and then type::

python setup.py install --prefix=
setenv PYTHONPATH ${PYTHONPATH}:/lib/python2.6/site-packages
python

and then from the Python prompt::

from pyblocxs import *

Alternatively, the tarball can be downloaded and unpacked to a location
*/path/to/pyblocxs*. Python scripts should include the following lines before
calling pyBLoCXS functions::

import sys
sys.path.append('/path/to/pyblocxs/pyblocxs-0.0.3')
import pyblocxs

Usage
-----

The primary way to run pyBLoCXS within *Sherpa* is to call the function
*pyblocxs.get_draws()*

First read in the spectrum::

load_pha("pha.fits")

and define the model::

set_model(xsphabs.abs1*powlaw1d.p1)

and carry out a regular fit to define the covariance matrix::

set_stat("cash")
fit()
covar()

then invoke pyBLoCXS with MetropolisMH as follows::

import pyblocxs
pyblocxs.set_sampler("MetropolisMH")
stats, accept, params = pyblocxs.get_draws(niter=1e4)

to change to MH::

pyblocxs.set_sampler("MH")
stats, accept, params = pyblocxs.get_draws(niter=1e4)

Listing of Control Parameters
-----------------------------

Metropolis-Hastings Jumping Rule

* *defaultprior* -- Boolean to indicate that all parameters have the default flat
prior.

* *inv* -- Boolean or array of booleans indicating which parameters are on the
inverse scale.

* *log* -- Boolean or array of booleans indicating which parameters are on the
logarithm scale (natural log).

* *originalscale* -- Array of booleans indicating which parameters are on the
original scale.

* *priorshape* -- Array of booleans indicating which parameters have associated
user-defined prior functions.

* *scale* -- A scalar multiple of the output of covar() used in the scale of the
t-distribution.

Mixture of Metropolis and Metropolis-Hastings Jumping Rules

* *defaultprior* -- Boolean to indicate that all parameters have the default flat
prior.

* *inv* -- Boolean or array of booleans indicating which parameters are on the
inverse scale.

* *log* -- Boolean or array of booleans indicating which parameters are on the
logarithm scale (natural log).

* *originalscale* -- Array of booleans indicating which parameters are on the
original scale.

* *p_M* -- The proportion of jumps generated by the Metropolis jumping rule.

* *priorshape* -- Array of booleans indicating which parameters have associated
user-defined prior functions.

* *scale* -- A scalar multiple of the output of covar() used in the scale of the
t-distribution

Passing Parameters
------------------

Users can set control parameters at a high level using *set_sampler_opt()*::

set_sampler("MH")
set_sampler_opt("scale", 0.5)

set_sampler("MetropolisMH")
set_sampler_opt("p_M", 0.33)

Displaying the Results
----------------------

The best-fit parameter values can be extracted as either::

bestfit = params[::,stats.argmin()].T

or::

npars = len(get_model().thawedpars)
bestfit = [params[ii][stats.argmin()] for ii in range(npars)]

.. image:: _static/nh_trace.png

A simple trace of the parameter draws with burn-in::

import matplotlib.pyplot as pyplot
NHarr = params[0][nburn:niter]
pyplot.plot(range(nburn,niter), NHarr)
pyplot.xlabel("iteration")
pyplot.ylabel("NH")
pyplot.title("NH trace")

.. image:: _static/mh_histogram.png

Histograms of the parameter draws can be made using *matplotlib*, as follows::

import matplotlib.pyplot as pyplot
pdf,bins,patches = pyplot.hist(params[0], bins=N, normed=True, histtype='step')

Or, using the plotting backend in Sherpa with::

pyblocxs.plot_pdf(params[0], name="abs1.nH", bins=N)

The draws can be filtered using NumPy array slicing::

nh_draws = params[0]

# only use (100,600], meaning 101st to 600th.
nh_draws_filtered = nh_draws[100:600]

pyblocxs.plot_pdf(nh_draws_filtered, "abs1.nh")

.. image:: _static/mh_quantile.png

Quantiles of the parameter draws can be plotted using *matplotlib*::

import matplotlib
import matplotlib.pyplot as pyplot
import numpy

name = "abs1.nh"
x = numpy.sort(params[0])
median, lval, hval = pyblocxs.get_error_estimates(x, True)
y = (numpy.arange(x.size) + 1) * 1.0 / x.size

pyplot.plot(x, y, color='red')
pyplot.axvline(median, linestyle='--', color='orange')
pyplot.axvline(lval, linestyle='--', color='blue')
pyplot.axvline(hval, linestyle='--', color='blue')
pyplot.xlabel(name)
pyplot.ylabel(r"$p(\leq x)$") # p(<= x)
pyplot.title("CDF of %s" % name)

.. image:: _static/mh_3dscatter.png

3D scatter plots of parameter space can be made using *matplotlib*::

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as pyplot

fig = pyplot.figure()
ax = Axes3D(fig)
ax.scatter(params[0], params[1], stats, c="red")
ax.set_xlabel("abs1.nh")
ax.set_ylabel("p1.gamma")
ax.set_zlabel("statistic")
pyplot.draw()

Priors
------

By default, pyBLoCXS uses a flat prior defined between the
hardcoded parameter minima and maxima. But more informative
priors can also be defined using the *set_prior()* function.
if a model has been defined as, say, *xsapec.therm*, a Gaussian
prior can be imposed on the temperature parameter by defining::

normgauss1d.g1
g1.pos=2.5; g1.fwhm=0.5
set_prior(therm.kT,g1)
set_sampler_opt('defaultprior', False)
set_sampler_opt('priorshape', [True, False, False])
set_sampler_opt('originalscale', [True, True, True])

Arbitrary functions can be defined by the user for use as priors::

def lognorm(x, sigma=0.5, norm=1.0, x0=20.):
xl=numpy.log10(x)+22.
return (norm/numpy.sqrt(2*numpy.pi)/sigma)*numpy.exp(-0.5*(xl-x0)*(xl-x0)/sigma/sigma)

and which can be invoked as, e.g.::

set_prior(abs1.NH,lognorm)
set_sampler_opt('defaultprior', False)
set_sampler_opt('priorshape', [True, False, False])
set_sampler_opt('originalscale', [True, True, True])

Transformations
---------------

Because the jumping rules employed by pyBLoCXS are all based on symmetric
bell-shaped t-distributions, the MCMC code will be far more efficient if the
posterior distribution is roughly bell-shaped. When this is not the case under
the canonical parameterization of the fitted model, it can still often be true
under some transformation of the canonical parameterization. For example we may
replace a strictly positive parameter that has a right skewed posterior
distribution with the log of that same parameter. pyBLoCXS allows for
uni-variate transformations of any or all of the model parameters onto the log
and inverse scales.

Given a set of five thawed parameter values where the first and third parameter
are to be on the log scale::

set_sampler_opt('log', [True, False, True, False, False])

Similarly, if the last parameter is to be inverted::

set_sampler_opt('inv', [False, False, False, False, True])

API User Interface functions
----------------------------

.. autofunction:: pyblocxs.mh.dmvt

.. autoclass:: pyblocxs.mh.Walk
:members:

.. autoclass:: pyblocxs.mh.Sampler
:members:

.. autoclass:: pyblocxs.mh.MH
:members:

.. autoclass:: pyblocxs.mh.MetropolisMH
:members:

High-Level User Interface functions
-----------------------------------

.. function:: list_priors()

List the dictionary of currently set prior functions for the set
of thawed Sherpa model parameters

.. function:: get_prior(par)

Return the prior function set for the input Sherpa parameter *par*::

func = get_prior(abs1.nh)

.. function:: set_prior(par, prior)

Set a prior function *prior* for the the input Sherpa parameter *par*.
The function signature for *prior* is of the form lambda x.

.. function:: set_sampler(sampler)

Set a sampler type *sampler* as the default sampler for use with pyblocxs.
*sampler* should be of type str. Native samplers available include *"MH"*
and *"MetropolisMH"*. The default sampler is *"MetropolisMH"*. For
example::

set_sampler("MetropolisMH")

.. function:: get_sampler_name()

Return the name of the currently set sampler type. For example::

print get_sampler_name()
"MH"

.. function:: get_sampler()

Return the current sampler's Python dictionary of configuration options.
For example::

print get_sampler()
"{'priorshape': False, 'scale': 1, 'log': False, 'defaultprior': True,
'inv': False, 'sigma_m': False, 'priors': (), 'originalscale': True,
'verbose': False}"

.. function:: set_sampler_opt(opt, value)

Set a configuration option for the current sampler type. A collection
of configuration options is found by calling get_sampler() and
examining the Python dictionary. For example::

# Set all parameters to log scale
set_sampler_opt('log', True)

# Set only the first parameter to log scale
set_sampler_opt('log', [True, False, False])

.. function:: get_sampler_opt(opt)

Get a configuration option for the current sampler type. A collection
of configuration options is found by calling get_sampler() and
examining the Python dictionary. For example::

get_sampler_opt('log')
False

.. function:: get_draws(id=None, otherids=(), niter=1e3)

Run pyblocxs using current sampler and current sampler configuration
options for *niter* number of iterations. The results are returned
as a 3-tuple of Numpy ndarrays. The tuple specifys an array of statistic
values, an array acceptance booleans, and a 2-D array of associated
parameter values. The arguments *id* and *otherids* are used to access the
Sherpa fit object to be used in the run by Sherpa data id. Note, before
running *get_draws* a Sherpa fit must be complete and the covariance matrix
should be calculated at the resultant fit minimum. For example::

stats, accept, params = get_draws(1, niter=1e4)

.. function:: get_error_estimates(x, sorted=False)

Compute the quantiles and return the median, -1 sigma value, and +1 sigma
value for the array *x*. The *sorted* argument indicates whether *x* has
been sorted. For example::

median, low_val, hi_val = get_error_estimates(x, sorted=True)

.. function:: plot_pdf(x, name='x', bins=12, overplot=False)

Compute a histogram and plot the probability density function of *x*.
For example::

import numpy as np
mu, sigma = 100, 15
x = mu + sigma*np.random.randn(10000)
plot_pdf(x, bins=50)

.. toctree::
:maxdepth: 2

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`