This document was retrieved from a search engine cache. Address of the original document: http://www.mrao.cam.ac.uk/yerac/juvela/juvela.ps
Last modified: Wed Feb 22 22:11:37 1995

The use of Positive Matrix Factorization in
the analysis of molecular line spectra from
the Thumbprint Nebula
By
Mika Juvela†, Kimmo Lehtinen† AND Pentti Paatero‡
email: juvela@cc.helsinki.fi
† Helsinki University Observatory, Tähtitorninmäki, 00014 Helsinki, FINLAND
‡ University of Helsinki, Department of Physics, Siltavuorenpenger 20 D, SF-00170 Helsinki, FINLAND
We present the first results of the application of Positive Matrix Factorization (PMF) to the
analysis of molecular line spectra. PMF is a new computational method that seeks the underlying
basic spectral components whose sum explains the measured spectra. The principles of the
method are discussed and some results from the analysis of $^{12}$CO spectra from the
Thumbprint Nebula are shown.
1. Introduction
The analysis of molecular line spectra often concentrates on extracting a few parameters
from each spectrum separately. Usually this is done by making Gaussian fits. In
the case of several velocity components this is feasible only when the signal to noise ratio
is good and the components are clearly separated. The general velocity structure can be
studied more easily either by drawing channel maps or by constructing space-velocity
charts. These give, however, only a qualitative picture, and velocity charts are moreover
limited to one spatial dimension. In the case of multiple components positive matrix
factorization (Paatero & Tapper 1994) may extract more valuable information from
the data.
2. The Positive Matrix Factorization
Positive matrix factorization (PMF) analyzes the matrix containing the measured spectra
by calculating a small number of basic spectral profiles and the weights with which
each of these basic components or factors is present in each individual spectrum.
Factor analysis (FA) and principal component analysis (PCA) are older methods, which
are in principle also capable of doing this, when applied to a matrix of spectra. The results
of these methods are, however, often ambiguous and difficult to interpret, since the basic
profiles may include many negative values. The physically meaningful representation can
be found only after a series of transformations which are called rotations.
By requiring nonnegativity for both the weights and the spectral profiles, PMF is able
to produce results which are far easier to interpret. Another new aspect of PMF is the
optimal use of error estimates. PMF computes the solution by minimizing the least
squares error of the fit, with each residual weighted by its error estimate.
2.1. The Principles of PMF
Figure 1. a) The first 20 singular values of a matrix containing the 92 $^{12}$CO spectra used in
the analysis. b) An example of a factorization of rank one:
$$\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 3 & 6 & 3 \\ 2 & 4 & 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 2 \end{pmatrix} \times \begin{pmatrix} 1 & 2 & 1 \end{pmatrix}$$
For the analysis the measured spectra are placed as rows in a matrix X, so that the
columns of the matrix correspond to different channels of the spectrum analyzer. Let
the dimensions of X be $n \times m$, i.e. there are n rows and m columns. In addition to the
emission peaks there is also a random noise component present. Error estimates for individual
channels can be calculated from the amount of noise in channels around the peak and
placed in another matrix S.
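As a rough illustration of how the error matrix S might be built, the following Python sketch estimates one rms noise value per spectrum from channels outside the emission line and repeats it over all channels. The function name `error_matrix`, the line window, and the choice of a single noise value per spectrum are assumptions made for this example; the text does not specify how the authors computed S.

```python
import numpy as np

def error_matrix(X, line_mask):
    """Build S from the rms noise of line-free channels (one value per spectrum).

    Assumption: noise is constant across the channels of a given spectrum.
    """
    X = np.asarray(X, dtype=float)
    off = ~np.asarray(line_mask, dtype=bool)      # channels assumed free of emission
    sigma = X[:, off].std(axis=1, ddof=1)         # sample standard deviation per row
    return np.repeat(sigma[:, None], X.shape[1], axis=1)

# Synthetic data: 5 spectra, 64 channels, noise of 0.3 K, a flat "line" in channels 24..39.
rng = np.random.default_rng(0)
n, m = 5, 64
line_mask = np.zeros(m, dtype=bool)
line_mask[24:40] = True                           # hypothetical line window
X = rng.normal(0.0, 0.3, size=(n, m))
X[:, 24:40] += 2.0
S = error_matrix(X, line_mask)
print(S[:, 0])                                    # one noise estimate per spectrum
```

The resulting S has the same shape as X, so it can be used element by element as the weight of each channel in the fit.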
Given the matrices X and S and the so-called rank of the factorization r, PMF solves
a bilinear factorization problem
$$X \approx GF \qquad (2.1)$$
by calculating factor matrices G and F of dimensions $n \times r$ and $r \times m$. This is a least
squares solution which minimizes the error
$$\sum_{i,j} \left( \frac{(X - GF)_{i,j}}{s_{i,j}} \right)^2 . \qquad (2.2)$$
An additional restriction on the solution is provided by a positivity constraint. That is,
every element of the matrices G and F is required to be nonnegative. This reflects the
desire to find positive basic components from which the spectra can be reconstructed by
addition.
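The weighted, nonnegative least squares problem described above can be illustrated with a short Python sketch. It is not the algorithm of Paatero & Tapper: it substitutes simple weighted multiplicative updates, which keep G and F nonnegative and aim to reduce the weighted error sum. The function names, the synthetic test data, and the clipping of negative data values (which this particular update rule requires and which slightly biases the fit) are assumptions of the example.

```python
import numpy as np

def weighted_nonneg_factorization(X, S, r, n_iter=500, seed=0, eps=1e-12):
    """Approximate X by G @ F with G, F >= 0, weighting residuals by 1 / S**2.

    Weighted multiplicative updates; a stand-in, not the PMF algorithm itself.
    """
    rng = np.random.default_rng(seed)
    X = np.clip(np.asarray(X, dtype=float), 0.0, None)   # update rule needs X >= 0
    W = 1.0 / np.asarray(S, dtype=float) ** 2            # weights 1 / s_ij^2
    n, m = X.shape
    G = rng.uniform(0.1, 1.0, size=(n, r))               # positive starting values
    F = rng.uniform(0.1, 1.0, size=(r, m))
    for _ in range(n_iter):
        G *= ((W * X) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * X)) / (G.T @ (W * (G @ F)) + eps)
    return G, F

def weighted_error(X, S, G, F):
    """Sum over all elements of ((X - GF) / S)**2."""
    return float(np.sum(((X - G @ F) / S) ** 2))

# Synthetic test: 30 spectra built from two Gaussian profiles plus noise.
v = np.linspace(-4.0, 10.0, 60)
F_true = np.vstack([np.exp(-0.5 * ((v - 1.0) / 0.8) ** 2),
                    np.exp(-0.5 * ((v - 4.0) / 1.0) ** 2)])
rng = np.random.default_rng(1)
G_true = rng.uniform(0.0, 3.0, size=(30, 2))
S = np.full((30, 60), 0.05)
X = G_true @ F_true + rng.normal(0.0, 0.05, size=(30, 60))

G, F = weighted_nonneg_factorization(X, S, r=2)
Q = weighted_error(X, S, G, F)
print(Q)
```

Each row of F is then a candidate basic profile and each column of G the corresponding weights, in the sense of the factorization $X \approx GF$.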
2.2. The rank of the factorization
An approximate rank of the factorization can be determined with the help of singular
value decomposition (SVD), $X = USV^T$, where S here denotes the diagonal matrix of
singular values and not the error matrix of Section 2.1. By plotting the singular values of
matrix X one gets a graph like that in Figure 1. The number of linearly independent basic
components can be seen from the first singular values. The majority of the singular values
fall on a sloping line. If the first r singular values lie above this line, we may assume that
there might be r independent components. In Figure 1 we have r = 4. The final test for
r is still the comparison of results obtained from factorizations of different rank.
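The visual test described above can be imitated numerically. The sketch below computes the singular values with NumPy, fits a straight line to the tail of the spectrum, and counts how many leading singular values lie clearly above that line. The function name `estimate_rank`, the parameters `first_tail` and `factor`, and the synthetic data are assumptions of this example; in practice the judgement is made by eye from the plot, and the threshold used here is arbitrary.

```python
import numpy as np

def estimate_rank(X, first_tail=10, factor=2.0):
    """Count the leading singular values that lie clearly above the sloping tail."""
    sv = np.linalg.svd(np.asarray(X, dtype=float), compute_uv=False)  # descending order
    k = np.arange(len(sv))
    # Straight line through the tail, which is assumed to be dominated by noise.
    slope, intercept = np.polyfit(k[first_tail:], sv[first_tail:], 1)
    line = slope * k + intercept
    r = 0
    while r < first_tail and sv[r] > factor * line[r]:
        r += 1
    return r, sv

# Synthetic data: 92 spectra, 60 channels, four well separated Gaussian components.
v = np.linspace(-4.0, 10.0, 60)
centres = [0.0, 2.5, 5.0, 7.5]
F_true = np.vstack([np.exp(-0.5 * ((v - c) / 0.6) ** 2) for c in centres])
rng = np.random.default_rng(2)
G_true = rng.uniform(0.0, 3.0, size=(92, 4))
X = G_true @ F_true + rng.normal(0.0, 0.05, size=(92, 60))

r, sv = estimate_rank(X)
print(r, sv[:6])
```

As in the text, such an estimate is only a starting point; comparing factorizations of neighbouring ranks remains the final test.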
Let us consider the simplest case, a factorization of rank 1. In this case G is a column
vector with as many elements as there are measured spectra and F is a row vector
with as many elements as there are channels in the measurements. PMF
approximates the matrix X with the product of these, so that every row of X should be
approximately equal to F (giving the shape of the peak) multiplied by the element of G
corresponding to the individual measurement. In this case the result of PMF is simple:
the resulting peak shape in F is the weighted average of all spectra and G gives the
relative intensities. Usually the spectra are more complicated, containing several basic
components, each of which requires an increase in the rank of the factorization. Every
factorization can be thought of as a sum of factorizations of rank one, i.e. one simply
adds more rows in F to represent new basic components and more columns in G to give
the corresponding weights for these.
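The statement that every factorization is a sum of rank-one factorizations can be checked directly. In the Python sketch below, the first part builds a rank-one matrix as the outer product of a weight vector and a profile (the numbers follow the rank-one example of Figure 1b), and the second part verifies that a rank-two product GF equals the sum of its two rank-one terms. The rank-two matrices are made-up illustration values.

```python
import numpy as np

# Rank one: a column of weights times a single profile.
g = np.array([1.0, 2.0, 3.0, 2.0])        # one weight per spectrum
f = np.array([1.0, 2.0, 1.0])             # one profile value per channel
X1 = np.outer(g, f)                       # every row is f scaled by an element of g
print(X1)

# Rank two (illustrative values): each column of G pairs with one row of F.
G = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 2.0],
              [2.0, 3.0]])
F = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])
terms = [np.outer(G[:, k], F[k, :]) for k in range(G.shape[1])]
X2 = sum(terms)                           # sum of rank-one factorizations
print(np.allclose(X2, G @ F))
```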
2.3. Some complications
Given any factor matrices G and F , new factor matrices can be formed by mathematical
operations called rotations. The possibility of rotations is one of the main reasons why
PCA and FA are not readily applicable to the analysis of molecular spectra and why
the positivity constraint of PMF turns out to be useful. All rotations can be performed
as a sequence of elementary operations, each of which contains one subtraction between
columns of G or rows of F . Since subtraction tends to produce negative elements, the
positivity constraint decisively reduces the number of possible rotations or prevents them
altogether. The results of PMF are thus generally less ambiguous and, since profiles
consist of positive peaks, easier to interpret.
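A rotation replaces G and F by $GT$ and $T^{-1}F$ for some invertible matrix T, which leaves the product GF unchanged. The following Python sketch applies one elementary rotation, $T = I + a\,e_p e_q^T$, to made-up nonnegative factors and shows that the fit is unchanged while a negative element appears in F. The function name and the numerical values are assumptions chosen for illustration.

```python
import numpy as np

def elementary_rotation(G, F, p, q, a):
    """Return (G T, T^-1 F) for T = I + a * e_p e_q^T, with p != q."""
    r = G.shape[1]
    T = np.eye(r)
    T[p, q] = a
    T_inv = np.eye(r)
    T_inv[p, q] = -a                      # exact inverse because p != q
    return G @ T, T_inv @ F

G = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [3.0, 2.0],
              [2.0, 3.0]])
F = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 2.0, 1.0]])

# Adds 0.5 * column 0 to column 1 of G and subtracts 0.5 * row 1 from row 0 of F.
G_rot, F_rot = elementary_rotation(G, F, p=0, q=1, a=0.5)
print(np.allclose(G_rot @ F_rot, G @ F))  # the product is unchanged
print(F_rot.min())                        # the subtraction has produced a negative element
```

With these values the rotated profile matrix violates the positivity constraint, so this particular rotation would not be an admissible alternative solution.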
Rotations are not always totally eliminated, however, and it may be possible to get
different solutions with different rotations. PMF offers several methods for eliminating
rotations by imposing additional restrictions, thus ensuring a physically meaningful
solution. In some cases one must settle for a range of possible solutions. Rotations are
not a deficiency of the program, but indicate only that the data do not contain enough
information to derive a single solution using the given model.
Velocity gradients make peaks appear in different locations in X. Since a single component
cannot represent such a set of spectra, PMF represents them with several basic
components. Spectra with one shifted peak would be approximated with several closely
spaced peaks in matrix F, the relative intensities of which change according to the gradient.
The gradient can in such cases be deduced from this gradual change of the weights, and
even the central velocities in each spectrum can be calculated from the factorization. On
the other hand, the inadequacy of the model makes it necessary to be very careful in the
interpretation of the results.
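One possible way to obtain central velocities from a factorization is to take the intensity-weighted mean velocity of each modelled spectrum, i.e. of each row of GF. The Python sketch below does this for two closely spaced Gaussian profiles whose weights change linearly from one position to the next. The function name, the velocity grid, and all numerical values are assumptions of the example; the text does not state which estimator the authors used.

```python
import numpy as np

def central_velocities(G, F, v):
    """Intensity-weighted mean velocity of each modelled spectrum (row of G @ F)."""
    model = G @ F
    return (model @ v) / model.sum(axis=1)

# Two closely spaced profiles at 3.0 and 3.5 km/s on a 0.1 km/s grid.
v = np.linspace(-4.0, 10.0, 141)
F = np.vstack([np.exp(-0.5 * ((v - 3.0) / 0.5) ** 2),
               np.exp(-0.5 * ((v - 3.5) / 0.5) ** 2)])

# Weights shifting gradually from the first profile to the second across five positions.
t = np.linspace(0.0, 1.0, 5)
G = np.column_stack([1.0 - t, t])

v_c = central_velocities(G, F, v)
print(v_c)    # rises smoothly from about 3.0 to about 3.5 km/s
```

The gradual change of the weights thus translates into a smooth shift of the central velocity, which is how a gradient would show up in the factorization.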
3. Thumbprint Nebula
The Thumbprint Nebula is an isolated and highly symmetrical Bok globule situated
in the Chamaeleon III region. The globule is in or near virial equilibrium
and shows no signs of star formation. Some $^{12}$CO measurements from the nebula were
analyzed using PMF. The spectra and the channel maps are shown in an article by
Lehtinen et al. (1995).
Based on SVD the maximum feasible rank of factorization was found to be 4. Here we
present two factorizations, of rank 3 and rank 4, to illustrate some features of PMF. In
each figure the first frame shows the profiles of the basic components and the other
frames show maps of the weights. The diameter of each dot is proportional to the weight,
with the maximum weight scaled to one for each factor separately.
Figure 2 shows the results of a factorization of rank 3. Two of the calculated profiles
are nearly Gaussian, while the component with the lowest radial velocity shows a blue
wing. Owing to limited rotational freedom there are some wrinkles in all profiles. These
could at least partly be smoothed away by suitable rotations, giving a more truthful
representation.
A factorization of rank 4 is given in Figure 3. PMF uses the new component to represent
the higher velocities more accurately, while the low velocity component is unaltered.
A clear change in velocity is seen from the maps of the weights of the first three factors,
and in the western part of the cloud the strongest component has a well-defined intensity
peak. The two velocity components at the highest velocities originate from the dense
material in the globule itself, while the two components at lower velocity originate from
the diffuse material around the globule. The component from the globule is split into
two components due to a velocity gradient (rotation) over the globule. The rotation is
also evident from Gaussian fits.
Figure 2. The results of a factorization of rank 3. The first frame shows the three basic
profiles (axes: V (km/s), T (K)) and the other frames the maps of the weights of these.
Figure 3. The results of a factorization of rank 4. The first frame shows the four basic
profiles (axes: V (km/s), T (K)) and the other frames maps of the weights in order of
increasing radial velocity.
4. Conclusions
PMF can be a valuable tool when studying complex regions where several
velocity components may be present. If these are blended together, channel maps do not
necessarily reveal all components. PMF not only reveals the existence of hidden
components, but also gives their spectral shapes and spatial distributions. When
individual spectra are too noisy to allow separate fitting of Gaussian components, PMF
can help by making a fit to a number of spectra simultaneously. The possibility of
rotations can sometimes make the interpretation of the results difficult. In extreme cases
PMF can be taken as a data compression tool without assuming any deeper meaning for
the results.
REFERENCES
Lehtinen K., Mattila K., Schnur G.F.O., Prusti T., 1995, A&A, accepted.
Paatero P., Tapper U., 1994, Environmetrics, 5, 111.