. : http://lnfm1.sai.msu.ru/~marat/articles/eng/Err_Estimated_Par_eng.pdf
: Mon Jan 21 18:18:38 2008
: Mon Oct 1 23:29:30 2012
: IBM-866

: massive stars
ISSN 1063-7729, Astronomy Reports, 2008, Vol. 52, No. 2, pp. 99126. c Pleiades Publishing, Ltd., 2008. Original Russian Text c M.K. Abubekerov, N.Yu. Gostev, A.M. Cherepashchuk, 2008, published in Astronomicheski Zhurnal, 2008, Vol. 85, No. 2, pp. 121150. i

Error Estimates for Parameters in Inverse Parametrical Problems. Light-Curve Analysis for Classical Eclipsing Binaries
M. K. Abubekerov, N. Yu. Gostev, and A. M. Cherepashchuk
Sternberg Astronomical Institute, Universitetski i pr. 13, Moscow, 119899 Russia
Received May 22, 2007; in final form, June 22, 2007

Abstract--Parameters and their error intervals numerically estimated for the observed light curve of the binary eclipsing system YZ Cas as well as for one-, two-, three-, and four-parameter functions, and the associated parameters and their error intervals numerically estimated. The error intervals are calculated using differential corrections method, Monte-Carlo simulations, and confidence areas. We study the error intervals and the reliability of the techniques used. PACS numbers: 97.80.Hn, 95.75.Pq DOI: 10.1134/S1063772908020030

1. INTRODUCTION It is often necessary to solve inverse parametrical problems in astronomy, when model parameters describing a process and, as may be even more important, reliable estimates for the errors of these parameters, must be determined from the observed consequences of the process. One particualarly important example is fitting the light curves of eclipsing binaries to derive information about the mass, radii, and luminosities of stars in different stages of their evolution. The determination of the basic parameters of the stellar components of eclipsing binaries along with their errors enables verification of the theory of the structure and evolution of stars, searches for relativistic objects in binary systems, and studies of the evolution of close binaries. Since subtle evolutionary effects in close binaries can often be revealed only close to the limit of the observational accuracy, reliable estimates of the errors of model parameters are crucial in the analysis of eclipsing-binary light curves. The estimation of errors of eclipsing-binary parameters has been subject to various confusing factors. As a rule, attention is given primarily to the central values of parameters, while their errors are usually determined only for formality's sake. Several methods for estimating error intervals for parameters of binary systems, as well as in other parametrical problems, are currently used in astrophysics; the most common are the Monte-Carlo method, the differential-correction method [1], and the confidence-area method [24]. Various groups have adhered to various of these techniques. Here, we numerically compare these methods to estimate the
99

error intervals for the parameters of binaries in order to reveal advantages and drawbacks of the various approaches. The construction of confidence areas in parametrical problems has been well developed theoretically (see, for example, [5]). Here, we use the high efficiency of modern computers to illustrate numerically different techniques that can be used to construct the confidence areas, and provide real examples to justify practical recommendations concerning the application of various methods for parameter estimation. 2. A MODEL OF A BINARY Since our study is methodological, we will adopt a simple model with two spherical stars with thin atmospheres in a circular orbit about each other, without taking into account effects associated with the mutual closeness of the components. This model is easy to develop with modern computers and can be used calculate numerous solution versions for an inverse problem with a comparatively short computation time. We considered the motion of the stellar disks projected onto the plane of the sky, i.e., the plane perpendicular to the line of sight. Figure 1 presents the geometry of the stellar disks during an eclipse. Here, r1 and r2 are the radii of the first and second star, the distance between the centers of the stellar disks, and and the polar coordinates of an arbitrary point on the surface of the first stellar disk (the coordinate origin is at the geometrical center of the disk). The distance between the centers of the stellar disks is specified by the expression (see, for example, [6]) 2 = cos2 i +sin2 i sin2 , (1)


100

ABUBEKEROV et al.

r1

3. The brightness loss in the secondary minimum due to the eclipse of the larger primary by the smaller secondary:
r2

L

full

-L

(2)

( ) =
S ()

I

(1)

()dS.

(6)

d

Equations (1), (4), (5), and (6) completely describe the observed light curve and contain the seven desired (1) (2) parameters: r1 , r2 , i, I0 , I0 , x1 ,and x2 . Substituting the brightness distribution approximated by the linear limb-darkening law (2), (3) in the integral and then integrating, we obtain a system of non-linear algebraic equations for these seven parameters. 3. DERIVATION OF FORMULAS USED IN THE CALCULATIONS Here, in contrast to the generally adopted approach (see, for example, [1]), we use another technique to calculate the area of disk overlap, making it possible to obtain an expression for the light loss independent of the relation between the radii r1 , r2 and the distance between the component centers . Thus, we were able to unify the formulas used in the calculations and avoid considering each case of the eclipse individually. This approach results in a reduction in the number of equations used, and, accordingly, a reduction in the calculation time, by a factor of seven. We describe our technique in more detail below. Let the brightness at some point of the stellar disk be I () = I0 1-x+x 1- 2 r2 ,

Fig. 1. A model of two eclipsing spherical stars, projected onto the plane of the sky.

where i is the orbital inclination of the binary and the current orbital phase angle. A linear limb-darkening law was used for the brightness distribution across the stellar disks: I
(1)

() = I0

(1)

1 - x1 + x1

1-

2 2 r1 2 2 r2

,

(2)

I

(2)

( ) = I0

(2)

1 - x2 + x2

1-

.

(3)

Here, I0 and I0 are the brightnesses at the disk centers of the first and second star, x1 and x2 the limb-darkening coefficients for the first and second stars, and and the polar distances from the disk centers of the first and second stars, respectively. The desired model parameters are r1 , r2 , i, I0 , I0 , x1 , and x2 . The unit of length is the component separation a = 1. There is no "third light" in the model. The light curve of the binary is specified by the following three equations: 1. The combined luminosity of the components, which describes the brightness outside eclipse:
r
1

(1)

(2)

(1)

(2)

where is the distance from the point to the center of the stellar disk, I0 the brightness at the disk center, x the limb-darkening coefficient, and r the radius of the star. Introducing the new parameters X0 = I0 (1 - x) and X1 = I0 x, I () = X0 + X1 1- 2 r2 . (7)

r

2

2
0

I

(1)

()d +2
0

I

(2)

( )d = L

full

.

(4)

2. The light loss in the primary minimum due to the eclipse of the smaller secondary by the larger primary: L
full

Component 1 eclipses component 2 at orbital phase = 0. The total brightness of the star is
r

-L

(1)

( ) =
S ()

I

(2)

( )dS,

(5)

L

(s)

= 2
0

I

(s)

2 (s) 2 (s) 2 ()d = X0 rs + X1 rs , 3 s = 1, 2,

where S () is the disk overlap area.

ASTRONOMY REPORTS Vol. 52 No. 2

2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

101

and the total brightness of the binary outside eclipse is
2 Lfull = L(1) + L(2) = X0 r1 2 (1) 2 2 (2) 2 (2) 2 + X1 r1 + X0 r2 + X1 r2 . 3 3 (1)

(8)

Let us integrate (9) over with for which the 2 2 +2 - rn inequality 1 is satisfied: 2
2 2 +2 - rn cos 2 2 2 +2 - rn || arccos 2

In order to make the formulas for the light curve minima universal and to reduce the number of equations, we assign the subscript n to the eclipsing component (closer to the observer), and f to the eclipsed component (further from the observer). When calculating the light curve minima in the orbital-phase interval -/2 < < /2 (or cos > 0), the variable rn must be replaced by r1 ,and rf by r2 . In the orbitalphase interval cos < 0, the opposite substitution must be made--rf must be replaced by r1 , and rn by r2 . In the new notation, the light loss during an eclipse is L
dec

.

With the same , for which inequality

2 2 +2 - rn < -1, the 2

2 2 +2 - rn < cos 2

(,rf ,rn ,X0 ,X1 ) =
S ()

(f )

(f )

I

(f )

()dS,

(9)

where is the distance between the disk centers and S () the overlapping area of the disks. Before calculating the integral (9), we introduce the functions , x < -1, Ax arccos x, -1 x 1, (10) 0, x>1 and Qx Then dAx =Q dx 1 1 - x2 . (11) x, x 0, 0, x < 0.

is satisfied for any . In this case, the integration over is from - to . For values of for which 2 2 +2 - rn > 1, the last of inequalities (12) is not 2 satisfied for any (i.e., no points with such values are in the integration domain). For such values, we specify both limits of the integration over to be zero. Using the notation (10), the integral in (9) can be rewritten L
r
f

dec

(,rf ,rn ) dI
(f )

(13) ()

(,,rn )

=
0

d
-(,,rn ) r
f

=2
0

(,,rn )I

(f )

()d,

where (,x,y ) A x2 +2 - y 2x
2

.

Substituting (7) into (13), we obtain L =
(f ) X0 dec

(,rf ,rn ,X0 ,X1 )
(f ) X1

(f )

(f )

(14)

Let us introduce a set of polar coordinates with their origin at the center of the disk of the eclipsed star and the polar angle measured from the direction of disk center of the eclipsed component ("f") towards the center of the disk of the eclipsing component ("n") (Fig. 1). In these coordinates, the domain of integration S () is specified as < rf , (12) S () = -2 < , 2 +2 - rn cos . 2
ASTRONOMY REPORTS Vol. 52 No. 2 2008

L

dec 0

(,rf ,rn )+
r

L

dec 1

(,rf ,rn ).

Using (11), we obtain
f

L

dec 0

(,rf ,rn ) = 2
0 2 ,rf ,rn )rf

(,,rn )d
2 +(,rn ,rf )rn

(15)

= (

1 - Q (,rf ,rn ) , 2 Q (,rf ,rn ) Q
2 rf - ( - rn ) 2


102
2 ( + rn )2 - rf

ABUBEKEROV et al.

+ X0 L

(2)

(2) 0

(, r1 ,r2 ,i)+ X1 L

(2)

(2) 1

(, r1 ,r2 ,i).

and
r
f

L

dec 1

(,rf ,rn ) = 2
0 r
2 f

(,,rn )

(16)

The partial derivatives of the total brightness function are expressed in terms of partial derivatives of Ldec and Ldec with respect to , rf , and rn and the 0 1 partial derivatives of (, i) with respect to and i. We can use (11) to find the partial derivatives (15). Simplifying, we finally obtain L L
dec 0 dec 0



1-

2 2 d = rf

0

(, , rn ) 1 - 2 d rf

Q(,rf ,rn ) (,rf ,rn ) =- ,

(21) (22)

(the last equality is obtained via the substitution = 2 ; further, to avoid excessive new notation, is replaced by ). The integral (16) is calculated numerically. For a circular orbit, the distance between the centers of the stellar disks depends on the phase and the orbital inclination as (, i) = cos2 i +sin2 i sin2 . (17)

(,rf ,rn ) = 2(,rf ,rn )rf , rf (,rf ,rn ) = 2(,rf ,rn )rn . rn

L

dec 0

(23)

The brightness of the binary depends on the phase as follows: L(, i, r1 ,r2 ,X0 ,X1 ,X0 ,X1 ) = L - L L
dec dec (1) (1) (2) (2) full

(18)

((, i),r1 ,r2 ,X0 ,X1 ), (2) (2) ((, i),r2 ,r1 ,X0 ,X1 ),
(1) 0

(1)

(1)

Calculating the derivatives of Ldec , we introduce the 1 functions (24) K1 (,rf ,rn ) 2 rf 2 rf - d, Q - ( - rn )2 ( + rn )2 - 0
r
2 f

cos < 0, cos > 0. K2 (,rf ,rn ) (19) Q We obtain
2 rf

Introducing the functions L -
2 (, r1 ,r2 ,i) = r1

1
0

2 rn -

(25)

Ldec ((, i),r1 ,r2 ), cos < 0, 0 0, cos > 0, 22 (1) L1 (, r1 ,r2 ,i) = r1 3 L 0,
dec 1

- ( + rn ) -
2

- ( - rn )

2

d.

-

((, i),r1 ,r2 ), cos < 0, cos > 0,
(2) 0 2 (, r1 ,r2 ,i) = r2

L -

(,rf ,rn ) K1 (,rf ,rn )+ K2 (,rf ,rn ) , =- rf L L
dec 1

dec 1

(26)

Ldec ((, i),r2 ,r1 ), cos > 0, 0 0, cos < 0, 22 (2) L1 (, r1 ,r2 ,i) = r2 3 L 0,
dec 1

(,rf ,rn ) 2rn = K1 (,rf ,rn ), rn rf
r
2 f

(27)

-

((, i),r2 ,r1 ), cos > 0, cos < 0,

L

dec 1

(,rf ,rn ) 1 =2 rf rf


0

(, , rn )
2 rf -

d.

(28)

we can write the total brightness as the linear combination
(1) (1) (2) (2) L(, r1 ,r2 ,i,X0 ,X1 ,X0 ,X1 ) (1) (1) (1) (1) X0 L0 (, r1 ,r2 ,i)+ X1 L1 (, r1 ,r2

Integrating (28) by parts yields
r

(20) ,i)

L

dec 1

2 f

=

(,rf ,rn ) 2 = rf rf

(, , rn )

(29)

0

ASTRONOMY REPORTS Vol. 52 No. 2

2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS
r
2 f

103

(,rn ,rf ) 2 2 - rn -

Q

1-

2 d + rf

= d
0

2 rf - ( - rn )2 - 4rn sin2

x dx, 2

0

2 rf -

(,rn ,rf )

- ( - rn )2 = 2 L rf
dec 1

( + rn )2 - (,rf ,rn )

K2 (,rf ,rn ) =
0

(2rn cos x - )
2

(32)

+

2 2 - 2rn K1 (,rf ,rn )+ K2 (,rf ,rn ) . 2 rf



2 rf - ( - rn )2 - 4rn sin

x dx. 2

The numerical determination of the integrals in (24) and (25) is complicated by the singularities in the functions in the integrand. Therefore, we convert them to a more appropriate form. In these functions, the integrand differs from zero in the interval [( - rn )2 , ( + rn )2 ]. Therefore, K1 (,rf ,rn ) = K2 (,rf ,rn ) = 0 for | - rn | rf , (30)

and if | - rn | < rf , it is sufficient to integrate 2 over the segment [( - rn )2 , min{( + rn )2 ,rf }]. 2 2 + rn - In this segment, the function arccos 2rn is real and monotonic with respect to ; consequently, 2 2 + rn - . T he we can substitute x = arccos 2rn integration over x will then be done over the segment 2 2 2 + rn - min ( + rn )2 ,rf . 0, arccos 2rn Taking into account that d arccos d =
2 2 + rn - 2rn 1 2

Expressions (31) and (32) were obtained assuming that | - rn | < rf . In the case | - rn | rf , the function (,rn ,rf ) = 0, so that the integrals in the indicated expressions vanish, consistent with (30). Thus, (31) and (32) are valid for any positive ,rn and rf . A fact that is important for the numerical integration is that the integrand in (16) is only piecewise smooth, while the efficient application of numerical integration techniques, such as Gauss' quadrature method, requires the integrated function to be smooth over the entire integration interval. Therefore, during 2 the integration, the interval [0,rf ] must be divided into sections according to discontinuous points of the integrand's derivatives, with the integration be carried out separately for each section. Let us present the general expression for the sequence of these points (depending on the values of rn , rf , and , some of these points coincide):
2 {max(0, ( - rn )2 ), min(rf , ( - rn )2 ),

(33)

min(rf , + rn )

2

2 ,rf

}.

This expression is convenient for the construction of a general numerical-integration procedure. Now let us change to the new variable z = cos2 i and new subscripts: L1 (, r1 ,r2 ,z ) = L
(1) 0 (1) 1 (2) 0 (2) 1

(, r1 ,r2 ,i), (, r1 ,r2 ,i), (, r1 ,r2 ,i), (, r1 ,r2 ,i),
(1) (2)

(34)

- ( - rn ) and arccos

( + rn ) -

2

L2 (, r1 ,r2 ,z ) = L L3 (, r1 ,r2 ,z ) = L L4 (, r1 ,r2 ,z ) = L X1 = X3 =
(1) X0 (2) X0

2 2 2 + rn - min ( + rn )2 ,rf

, ,

X2 = X1 , X4 = X1 .
4

2rn
2 2 2 + rn - rf

=A we obtain:

In the new notation, (20) is written = (,rn ,rf ), L(, r1 ,r2 ,z , X1 ... X4 ) =
k =1

2rn

Xk Lk (, r1 ,r2 ,z ). (35)

K1 (,rf ,rn )
ASTRONOMY REPORTS

(31)

Vol. 52 No. 2 2008


104

ABUBEKEROV et al.
P

3.1. Least-Squares Method for the Linear Model The central values for the parameters of a bi(1) (1) (2) (2) nary model X0 , X1 , X0 , and X1 (with respect to which the problem is linear) and r1 , r2 , and i (with respect to which it is non-linear) are obtained by obtaining a least-squares fit to the "observed" light curve. We summarize briefly here elements of the least-squares technique for linear and non-linear models (see the following Section). Consider the linear model specified by the function f (, 1 ... P ), which depends linearly on the parameters 1 ... P :
P

gq (m )wm = 2Bq - 2
p=1

Aqp p = 0,

q = 1 ... P, where
M

Aqp =
m=1 M

gq (m )gp (m )wm , (m - g0 )gq (m )wm .

(39)

Bq =

m=1

The solution for the system (38) is
P

f

lin

(, 1 ... P ) = g0 +
p=1

gp ( )p .

(36)

c = p
q =1 P

A-1 Bq = qp

M

(m - g0 )wm
m=1

(40)

We denote the true values of the physical parameters 1 ... P , and the points at which observations are made 1 ... M (here, M is the total number of observational points). We also specify the set of random, statistically independent observed values 1 ... M , which have a normal distribution with the expectation values M(k ) = f lin (k , 1 ... P ) and disper2 sions 2 (m ) = m ; from now on, 2 (...) will denote the calculated dispersion. We define the residual functional via the expression R
M lin


q =1

A-1 gq (m ), qp

p = 1 ... P.

(1 ... P ,1 ... M )
lin 2

(37)

Thus, the central values for the parameters c ... c are expressed in terms of a linear combi1 P nation of 1 ... M , and so are normally distributed. Their expectation values are 1 ... P , while the dispersions are expressed in terms of the dispersions 1 ... M , in accordance with the addition of the dispersions of independent random values in quadrature: 2 (c ) p
2 2 (m )wm P q =1 P q =1

=
m=1 M

m - f

(, 1 ... P ) wm
P p=1

M

A-1 gq (m ) qp
2

(41)
2

2 gp (m )p wm

=
m=1

=
m=1

m - g0 -

M

2 wm 0

=
m=1

with the weights wm being inversely proportional to 2 the dispersions: wm = 2 /m ,where 0 (1 ... P ,1 ... M )) M the dispersion for weights of unity. In the modeling, is specified as an initial parameter; i.e. the value of is specified a priori in the model calculations. Let us find the values for c ... c that yield 1 P minimum in the residual functional for the fixed ... M . The minimum condition (37) is equivalent the system of P linear equations 2 = 0 R
M lin

A-1 gq (m ) . qp

2 (R

lin

is 0 0 a 1 to

Given the dispersion of the central value of a parameter, we can construct an interval within which the true value of the parameter falls with a specified probability. To this end, it is sufficient to notice that, as follows from the normal distribution of the central parameter, P c - p (c ) , p p (42)



(1 ... P ) q
P p=1

(38)

where P denotes the probability of satisfying the condition and depends on the chosen probability (confidence level) and is determined as the root of the equation


=2
m=1

m - g0 -

gp (m )wm p

2
0

exp -

t2 2

dt = .

ASTRONOMY REPORTS Vol. 52 No. 2

2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

105

For example, the confidence level is 0.6827, 0.9545, and 0.9973 for = 1, 2, and 3, respectively (corresponding to 1, 2, and 3 ). When 0 is unknown (e.g., in the case of a real observed light curve), instead of using (41), the errors are determined using a formula derived from (41) by replacing 0 with 0 , called the rms estimate for the unity-weight dispersion [7]:
2 0 =

expand the function f at the point 1,0 ... P,0 into a Taylor series, retaining up to the first term: f (, 1 ... P ) f (, 1,0 ... P,0 )
P

+
p=1

(p - p,0 )

f (, 1,0 ... P,0 ). p

R

lin

(c 1

... ,1 ... M ) . M -P

c P

(43)

The resulting rms estimates for the dispersions are specified by the formula 2
M

Further, we adopt in (36), and then in (39) and (40) (48) p = p - p,0 , g0 = f (, 1,0 ... P,0 ), f (, 1,0 ... P,0 ), gp ( ) = p where p = 1 ... P .The p,0 + p will then be the next c c approximations for the central values of 1 ... P , while the dispersions calculated using (41) and (48) c for p,0 = p ,
M c 2 (p ) =



2 est

(c ) = p
m=1

2 0 wm

P

A-1 gq (m ) . qp

(44)

q =1

Here, 0 is random, and (c - )/est obeys a Student's distribution with M - P degrees of freedom. However, for sufficiently large M - P 10, this becomes close to a normal distribution, and it can be assumed that P c - p (c ) p p c - (c ) . p P p est p

2 wm 0
m=1

(41 )



P q =1

A-1 qp

2 f (m ,1,0 ... P,0 ) q

3.2. Differential-Correction Method
Let us now consider a model specified by an arbitrary, in general, non-linear function f (, 1 ... P ) that is differentiatable with respect to 1 ... P . 1 ... P , 1 ... M , 1 ... M , and the residual functional are specified in the same way as for the linear model: 1 ... M display normal distributions and M(k ) = f (k , 1 ... P ),
2 2 (m ) = m ,

are adopted as approximate values for the dispersions c c 1 ... P . In the case of a real observed light curve, similar to the linear case, we use the rms estimate for the unityweight dispersion instead of 0 : R(c ... c ,1 ... M ) 1 P (43 ) M -P and the corresponding rms estimates for the dispersions of the parameters:
2 0 = M

(45)

where M(k ) denotes the expectation value for k ,and 2 (m ) the dispersion of k . The residual functional has the form (46) R(1 ... P ,1 ... M )
M


P q =1

2 est

c (p ) = m=1

2 0 wm

(44 ) 2

A-1 qp

f (m ,1,0 ... P,0 ) . q

=
m=1

(m - f (, 1 ... P ))2 wm ,

2 where wm = 2 /m , and we obtain for the unity0 weight dispersion

2 (R( 1 ... P ,1 ... M )) . (47) M In the modeling, 0 is specified as an initial parameter, i.e., it is known a priori in the model calculations. c c We denote 1 ... P to be the central values that bring R( 1 ... P ,1 ... M ) to a minimum for the fixed 1 ... M .Let 1,0 ... P,0 be some values for the parameters that are close to their central values. We 2 = 0
ASTRONOMY REPORTS Vol. 52 No. 2 2008

It is obvious that, if f (, 1 ... P ) is a linear function in 1 ... P , then (41 )(44 ) concide with (41)(44). Note that, in this method, the model is assumed to provide an adequate desription of the observational data.

3.3. Monte-Carlo Method In our study, we have also estimated the disperc c sions (1 ) ... (p ) using the Monte-Carlo method. For the specified 1 ... P , we calculated the val ues of 1 ... M at the phases 1 ... M . We then used the specified 0 to randomly generate for N


106

ABUBEKEROV et al.
(n) (n)

times a sequence of normally distributed 1 ... M (n = 1 ... N ) with mathematical expectations equal to 1 ... M . The central values for 1 for each sequence ... dispersions estimated as
2 mc c (p ) = (n) 1 (n) M c(n)

... P

c(n)

were found

(n = 1 ... N ), and their - p )2 .

1 N

N c (p n=1 (n)

(49)

Naturally, this technique assumes that the true values for 1 ... P are known, which is possible in model problems aimed at determining errors for comparison with those found using other techniques. During the reduction of a real light curve, this method can be applied using the central values for the parameters 1 ... P obtained from the least-squares solution of the light curve, instead of their true values. In doing this, we assume that a small variation of 1 ... P will result in a relatively small variation of the error. Like the differential-correction method, this method assumes that the model provides an adequate description of the observational data.

true values (1 ... P ) D. The set D is the confi dence region for (1 ... P ). Note that, in this case, 0 cannot be replaced by its rms estimate 0 specified by (43 ), since this would appreciably violate the distribution law (50). In particular, a null confidence region would not be obtained for the quantile 0 > M - P (for M = 101, this corresponds to > 0.35); i.e., the model would always be adequate to the observations. Therefore, either the exact known value for 0 must be taken, as in model problems, or a value estimated with high accuracy from independent considerations, as in the case of real observations. Since, in practice, 2 is not always known, a 0 criterion based on the Fisher distribution is often used. Let M be the number of "normal" points unifying the measurements in the group for their subsequent averaging. Let m in the point m be measured Nm times (m = 1 ... M ); i.e., the mth group contains Nm points. We denote the total number of j measurements as N : N = M=1 Nm . Let m be the m j th measurement of the curve (j = 1, 2 ... Nm ) at j the point m ; i.e., m is a random value obeying a j normal distribution, and M (m ) = f (k , 1 ... P ). 1 Nm j m ; then the true dispersions Let m = Nm j =1 j obs 2 (m ) = 2 (m )/Nm , while (m )2 = 1 j Nm (m - m )2 denotes the estiNm (Nm - 1) j =1 mates of the dispersions of the observations at these points. Then, R(1 ... P ,1 ... M ) N - M F M M obs )2 (Nm - 1)wm (m
M,N -M

3.4. Confidence-Area Method
Let us now consider the residual R for the general model specified by (46). Then, in accordance with the 2 distribution, R(1 ... P ,1 ... M ) 2 , (50) M 2 0 where "" means "distributed as." The integral distribution 2 is specified by the expression M (t) =
2 m

, (53)



m 2

, 0,
m 2

t 2

m=1



,

(51)

t where m , 0, 2 is the incomplete generalized 2 gamma function. Consequently, if 2 (0 ) = , i.e., M 0 is the quantile of the 2 distribution for some M confidence level < 1, then the corresponding probability is R(1 ... P ,1 ... M ) 0 = . (52) P 2 0

where FM,N -M is the Fisher distribution [5]. Its integral distribution function is B F
n,m

(t) =

nt m+nt

B

nm 2, 2 nm 2, 2

,

(54)

where Bz n , m is theincompletebetafunction. The 22 procedure used to construct the confidence region for (1 ... P ) is similar to the previous case. Let us now consider the case when the function f depends linearly on K parameters 1 ... P (K P ), ~ for example, on 1 ... K . Let 1 (K +1 ... P ) ... ~ K (K +1 ... P ) be the values that bring yield the minimum residual R(1 ... P ,1 ... M ) for the fixed K +1 ... P , which, in this case, is quadratic in 1 ... K . Then, according to [5],
ASTRONOMY REPORTS Vol. 52 No. 2 2008

Let DP be the P -dimensional set of values for the vector 1 ... P satisfying the condition R(1 ... P ,1 ... M ) 0 . 2 0 (52 )

Then, (52) is equivalent to the statement that the set DP is not empty with probability , and the


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

107

~ R(1 (K and ~ R(1 (K
+1

+1

~ ... P ) ... K (K

+1 ... 2 0

P ), K

+1

... P ,1 ... M )



2 M -K

(55)

~ ... P ) ... K (K
M m=1

+1

... P ), K
2 1)wm om

+1

(Nm -

... P ,1 ... M ) N - M F M -K

M -K,N -M

.

(56)

These statistics can be used to construct confidence areas for K +1 ... P . In particular, we obtain for some k(1 k P ) ~ ~ R(1 (k ) ... k ... P (k ),1 ... M ) (57) 2 0 and ~ ~ R(1 (k ) ... k ... P (k ),1 ... M )
M m=1 2 M -P +1

statistics is never empty. We obtain from (55) and (59) with K = P - 1 ~ ~ R(1 (k ) ... k ... P (k ),1 ... M ) - Rmin 2 . 1 2 0 (61) The statistics (61) also suggest a priori adequacy of the model, and one-dimensional confidence sets (intervals) obtained using these statistics are never empty. If the dependence on 1 ... K is non-linear, the above statements concerning the distributions (55), (56), and the formulas (57)(61) following from them are fulfilled asymptotically, as the number of measurements tends to infinity. One of the problems solved here is the numerical verification of the acceptability of these asymptotic approximations. Note that, if the confidence set is not empty (the model used is adequate to the observed curve), it c c contains the point 1 ... P . Let us also present the minimum residuals for the parameters in geometrical terms. If we consider a P -dimensional set DP = {(1 ... P ) : R(1 ... P ) C }, then, the P - K -dimensional set ~ DP -K = {(K +1 ... P ) : R(1 (K +1 ... P ) ... ~ K (K +1 ... P ),K +1 ... P ,1 ... M ) C } is the projection of DP onto the P - K -dimensional plane K +1 ... P . Since the projections of the confidence set are calculated using the corresponding residuals minimized with respect to all parameters except one, we can find the probability that the true values lie within the corresponding projections of the confidence region constructed for the specified confidence level ,given the distributions of these minimum residuals. If 2 (0 ) = and DP is the P -dimensional conM fidence set with the confidence level , obtained using

(58)

(Nm -

2 1)wm om M -P +1,N -M



N -M F M - P +1

,

~ where p (k ) and 1 p P , p = k, minimize the residual for the fixed k . The relations (57) and (58) can be used to construct one-dimensional confidence regions (intervals) for k with a specified confidence level. We will now find the distribution of the difference between the residual for 2 statistics obtained with M the true parameter values and with their central values. Let us first consider the case of a linear dependence on all parameters, so that, adopting K = P in (55), we will derive the distribution for the residual obtained for the central parameter values1 :
c c R(1 ... P ,1 ... M ) 2 0 2 M -P

.

(59)

c c We denote Rmin R(1 ... P ,1 ... M ). Using the well known statement (a 2 and b 2 ) a + a b b 2+b , we can obtain a

R(1 ... P ,1 ... M ) - Rmin 2 . P 2 0

(60)

The statistics (60) assume the a priori adequacy of the model, and the confidence set obtained with these
1

~ Of course, the vectors c and also depend on the observed ; we omit this dependence for the sake of brevity. ASTRONOMY REPORTS Vol. 52 No. 2 2008


108

ABUBEKEROV et al.

Table 1. for = 0.6827 for two-, three-, and fourparameter problems Statistics
2 P

P =2 P =3 P =4 0.8703 0.9396 0.9702 0.7072 0.7309 0.7536

2 ,M = 101 M FM,
N -M

,M = 101,N = 1212 0.7046 0.7258 0.7464

the quantile 0 for the statistics (p) jections D1 onto the p axes, p dimensional confidence sets for with the confidence level = 2 -P +1 ( M

(50), then its pro= 1 ... P are onethe statistics (57)
0

).

(62)

(Recall that 2 (t) is specified by (51).) Further, if M 2 (0 ) = and DP is the P -dimensional confidence P set with the confidence level obtained using 0 for (p) the statistics (60), its projections D1 onto the p axes, p = 1 ... P are one-dimensional confidence sets for the statistics (61) with the confidence level = 2 (0 ). 1 (63)

= 0.6827. These factors are 1.515173, 1.87796, and 2.17244 for the two-, three-, and four-parameter areas, respectively. Jumping ahead for a moment, we note that the dispersions and projections of the confidence region obtained in the simulation for the 2 P statistics are in good consistency with this rule. In addition, in the case of multi-parameter problems, the increase of the overlap probability for the projections is the same for all parameters (Tables 2 and 3). In Table 4 we present the quantile 0 for the 2 distribution for one-, two-, three-, and fourP parameter problems for the confidence level = 0.6827. Thus, for example, in one-dimensional problems, to ensure that the confidence region overlaps with the exact solution with the probability = 0.6827, the values R(1 ,1 ... M )/2 must be cut off at a 0 number exceeding the minimum value by unity.

Finally, if FM,N -M (0 ) = and DP is the P -dimensional confidence set with the confidence level obtained using 0 for the statistics (53), then its pro(p) jections D1 onto the p axes, p = 1 ... P are onedimensional confidence sets for the statistics (56) with the confidence level M 0 , (64) = FM -P +1,N -M M - P +1 where the function Fn,m (t) is specified by (54). This simulation demonstrates that (62)(64) yield satisfactory results, even in the non-linear case. Table 1 presents the probability of thetruevalue lying within the projection of the region, given the probability of its lying in the region itself, = 0.6827, for one-, two-, three-, and four-parameter problems, calculated using (62)(64). Jumping ahead for a moment, we note that, as the simulation shows, the number of cases when the true parameter values lie within the projection of the region on the parameter axis is perfectly consistent with the theoretical values for presented in Table 1, in both the linear and nonlinear cases. We present here the factors , by which the standard deviation should be multiplied [see (42)] in order to obtain the projection of the confidence region on the parameter axis, which is found for 2 statisP tics under the condition that the confidence region contains the true parameter value with probability

3.5. Application of Error Determination Methods to a Light Curve Let us now consider a model of a binary light curve. We used (35) to calculate the brightness from the known r1 , r2 , z , and X1 ... X4 . The brightness outside of eclipse Lfull was assumed to be known. The total brightness of the binary was taken to be Lfull = 1. Using this condition, we can express X4 in terms of the other parameters via (8) [taking into account the new notation of (34)]: 2 2 3 3X1 r1 X2 r1 3 (65) X4 = 2 - 2r 2 - r 2 - 2 X3 . 2r2 2 2
Having substituted in P = 6 and 1 4 = X1 , f (, 1 ... 6 ) = (46) = r1 , 2 = r2 , 3 = z, (66) 5 = X2 , 6 = X3 , L(m ,r1 ,r2 ,z , X1 ... X4 ), (67)

we write the resulting residual functional R(r1 ,r2 ,z , X1 , X2 , X3 ,1 ... M )
M

=
m=1

[m -L(m ,r1 ,r2 ,z , X1 ... X4 )]2 wm ,

where X4 is specified by (65). The central values of a parameter will be those that yield the absolute minimum. We denote these c c c c c r1 , r2 , z c , and X1 , X2 , X3 . The dispersions of the central values and their rms estimates can be found by substituting (66) into (41 ) and (44 ). ~ The values Xp (r1 ,r2 ,z ,1 ... M ) that yield the minimum residual for fixed r1 ,r2 ,z can be found by adopting P = 3 and 3 g0 = 2 L4 (m ,r1 ,r2 ,z ), 2r2
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

109

Table 2. Half-widths of the projections of the confidence region of parameters of the linear functions (74)(77) for the 2 statistics and the number of agreements of these projections with the theoretical parameter values P Function Formul (74) Formul (75) Formul (76) Formul (77) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00119; (681) 0.00282; (878) 0.00353; (939) 0.0910; (977)

-686 685 681

0.00225; (868) 0.00768; (944) 0.0343; (969)

0.00149; (940) 0.0119; (972)

0.0796; (976)

Table 3. Half-widths of the projections of the confidence region of parameters of the non-linear functions (78)(81) for the 2 statistics and the number of agreements of these projections with the theoretical parameter values P Function Formul (78) Formul (79) Formul (80) Formul (81) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00113; (682) 0.0418; (870) 0.0155; (943) 0.00921; (962)

-668 697 702

0.00169; (866) 0.00603; (937) 0.0126; (976)

0.00132; (942) 0.00322; (969)

0.0132; (973)

g1 (k ) = L1 (r1 ,r2 ,z ) -

2 3r1 2 L4 (m ,r1 ,r2 ,z ), 2r2

Also, similar to (60), we obtain asymptotically ~ R(r1 ,r2 ,z ,1 ... M ) - Rmin 2 . 3 0 (71)

r2 g2 (k ) = L2 (r1 ,r2 ,z ) - 1 L4 (m ,r1 ,r2 ,z ), 2 r2 3 g3 (k ) = L3 (r1 ,r2 ,z ) - L4 (m ,r1 ,r2 ,z ) 2 in (36), (39), and (40). Then ~ X (r1 ,r2 ,z ,1 ... M ) =A
-1

(68)

(r1 ,r2 ,z )B (r1 ,r2 ,z ,1 ... M ).

Let us denote the minimum residuals with respect to X1 , X2 , X3 as ~ ~ R(r1 ,r2 ,z ,1 ... M ) = R(r1 ,r2 ,z , X1 (r1 ,r2 ,z ), ~ ~ X2 (r1 ,r2 ,z ), X3 (r1 ,r2 ,z ),1 ... M ) ~ (the dependence of X (r1 ,r2 ,z ,1 ... M ) on 1 ... M is omitted for the sake of brevity). According to (55) and (56), ~ R(r1 ,r2 ,z ,1 ... M ) 2 - 3 (69) M 0 and ~ R(r1 ,r2 ,z ,1 ... M ) N - M M M -3 obs (Nm - 1)wm (m )2 F
M -3,N -M

The minimization with respect to r1 , r2 ,z is carried out using a minimization technique for non-linear functions, such as the gradient-descent technique. To this end, it is convenient to have expressions for the ~ derivatives of X (r1 ,r2 ,z ) with respect to r1 , r2 , z . We denote these 1 , 2 , 3 , respectively, and they are equal to ~ k X = k A-
1

B + A-1 k B,

k = 1, 2, 3.

Table 4. 0 quantile of the 2 distribution for one-, two-, P three-, and four-parameter problems for the confidence level = 0.6827 Number of parameters P 1 0 1.0000 2.2957 3.5267 4.7195

(70)

2 3 4

m=1

.

ASTRONOMY REPORTS

Vol. 52 No. 2 2008


110

ABUBEKEROV et al.

Using the relation2 k A-1 = -A-1 (k A)A- and (68), we obtain ~ k X = A-
1 1

~ -(k A)X + k B ,

(72)

k = 1, 2, 3. The residual functional can also be easily differentiated using this last relation. We now present a geometrical interpretation of the minimum residuals with respect to some of the parameters. If the set D = {(r1 ,r2 ,z , X1 , X2 , X3 ) : R(r1 ,r2 ,z , X1 , X2 , X3 ) 0 } (where ":" means "such that"), then the set Dr
1

where the true values for 1 = 2, 2 = 3, and 3 = 1, and f (, 1 ,2 ,3 ,4 ) (77) 1 + 2 sin + 3 e + 4 cos , = 1+ 2 where the true values for 1 = 1, 2 = 1, 3 = 1, and 4 = 1. We used the non-linear functions differentiable with respect to the parameters (78) f (, 1 ) = cos( + 1 )+ sin 1 , where the true value for 1 = 2, 1 +sin 2 , 1 + 2 where the true values for 1 = 2 and 2 = 3, f (, 1 ,2 ,3 ) 1 +sin(2 + 3 )+ e3 , = 1 2 + 2 where the true values for 1 = 2, 2 = 3, and 3 and 1 f (, 1 ,2 ,3 ,4 ) = 2+ 1 2 f (, 1 ,2 ) =
2

(79)

(80)

r2 z

~ = {(r1 ,r2 ,z ) : R(r1 ,r2 ,z ) 0 }

is the projection of D on the three-dimensional r1 r2 z hyperplane, and ~ ~ ~ Dr1 = {r1 : R(r1 , r2 (r1 ), z (r1 )) 0 } (73)

= 1, (81)

~ (where r2 (r1 ), z (r1 ) are the values of r2 and z at which ~ ~ the function R(r1 ,r2 ,z ) reaches its minimum for a fixed r1 )--this is the projection of Dr1 r2 z (and D )onto the r1 axis. 4. SIMULATION AIMED AT ESTIMATING THE RELIABILITY OF THE METHODS OF DIFFERENTIAL CORRECTIONS AND CONFIDENCE AREAS Before find the solution of the observed light curve, we tested the error-estimation methods using a set of simple functions. We used one-, two-, three-, and four-parameter functions with both linear (74)(77) and non-linear (78)(81) dependences on the parameters. The simulations, used the linear functions f (, 1 ) = cos + 1 sin , where the true value for 1 = 2, 1 + 2 sin , f (, 1 ,2 ) = 1+ 2 where the true values for 1 = 2 and 2 = 3, f (, 1 ,2 ,3 ) =
2

+sin(2 + 1 - 1) + e

3



+cos(4 + 3 - 1),

(74)

(75)

1 + 2 sin + 3 e , 1+ 2
-1

(76)

Obtained by differentiating the identity AA the unit matrix.

= I,where I is

where the true values for 1 = 1, 2 = 1, 3 = 1, and 4 = 1. In the numerical simulations, the true values of the functions 1 ... 101 were derived at points 1 ... 101 uniformly distributed in the interval [0, 2] of the x axis for specified true values of 1 , 2 , 3 , and 4 . A sample of normally distributed random values m (m = 1 ... 101) was generated, so that (m ) = m = 0.03/ 12 = 0.008660 and M (m ) = m . The generation of the 1 ... 101 was carried out as follows. A sample of normally distributed j j random values 1 ... 101 was first generated for j j j = 1 ... 12, so that M (m ) = m and (m ) = 0.03, 1 j 12 m m = 1 ... 101. Further, the values m = 12 j =1 obs were calculated. In addition, we calculated (m )2 = 1 j 12 (m - m )2 , for use in determining 12(12 - 1) j =1 the Fisher statistics. Accordingly, M = 101, wm = Nm = 12, and 0 = 0.03 were adopted in (45), (46), and (53) (in case, the standard deviation (m ) = this m = 0.03/ 12 = 0.008660). We call this procedure j j "perturbing" a function, and the sample 1 ... 101 obs used to obtain m and m 2 the "perturbed" curve. This technique for generating the perturbed function is equivalent having M = 101 perturbed points
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

111

with standard deviation m = 0.008660, where m = 1 ... 101. The above method for obtaining the perturbed light curve is used when applying Fisher statistics, where the points must be divided into groups. The method for generating the perturbed curve makes it possible to search for the parameter values by applying Fisher and 2 statistics to the same realization of the perturbed curve, which is essential for the purity of the simulation conditions. With this generation method, the dimension of the 2 statistics was M = 101, while that for the Fisher statistics was F101,1212-101 . Further, we calculated the central parameter values and the error intervals. We also calculated the number of cases when the interval encompassed the true parameter values, as an estimate of the reliability of the methods used to calculate the error intervals. Since the central parameter values of (74)(77) essentially coincide with the true values, we do not present these central values here, to avoid diverting attention from the main results of the calculations (the error intervals). The error intervals are given at the = 68.2% confidence level, unless otherwise indicated.

4.1. Differential-Correction Method
c c c We obtained the central values for 1 , 2 , 3 , and c 4 and, based on (41 ), estimates for the standard c c c c deviations (rms errors) (1 ), (2 ), (3 ),and (4 ) for both the linear functions (74)(77) and the nonlinear functions (78)(81). Tables 5 and 6 present these results. In addition, we tested the reliability of the error c c c c c c intervals for 1 (1 ), 2 (2 ), 3 (3 ), and c ( c ), estimated as follows. The above pertur4 4 bation was carried out a thousand times for each function with 0 = 0.03. We obtained the solution for c each perturbed function--the central values for 1 , c , c , and c . Further, we calculated the number 2 3 4 of cases (N1 ), (N2 ), (N3 ), and (N4 ) when the true values of 1 , 2 , 3 , and 4 fell in the error intervals c ( c ), c ( c ), c ( c ), and c ( c ). 1 1 2 2 3 3 4 4 The number of simultaneous coincidences of the error intervals and parameters Nall was also calculated.

As in the previous case, we calculated the numc c ber of cases when the error intervals 1 mc (1 ), c mc ( c ), c mc ( c ), and c mc ( c ) en2 2 3 4 3 4 compassed random realizations of the central values c c c c of 1 , 2 , 3 ,and 4 . In addition, we checked how well the distributions c c c c of the central values of 1 , 2 , 3 , and 4 coincided with a normal distribution for the non-linear relations (74)(77) (in the case of a linear dependence, this coincidence is exact). To this end, in addition to the number of cases when the true parameter values fell within the mc half-width interval, we also calculated the number of cases when the true values fell within the 1.5mc , 2mc , and 3mc half-width intervals, for which the probabilities of coincidence for a normal distribution are 0.8664, 0.9545, and 0.9973, respectively. To avoid overburdening the article with excessive tables, we do not present here these numbers of coincides; however, they were as close to the theoretical probabilities as for mc (see Table 8 below). This provides evidence that the distribution of the central parameter values is also close to normal in the nonlinear case (due to the small "observational errors"). Tables 7 and 8 present the results for the numerical simulations of the estimated reliability of the error intervals for analytical functions with linear and nonlinear dependences on the parameters, respectively.

4.3. Confidence-Area Method
We carried out a similar analysis using the confidence-area method. The confidence areas were constructed based on 2 statistics (where P is the P number of unknown parameters of the problem), 2 statistics (where M is the total number of M points in the curve) and Fisher statistics FM,N -M . In contrast to the differential-correction technique, the confidence-area method not only can be used to calculate confidence intervals for the parameters, but, in some cases, also enables verification of the adequacy of the model to the observational data.
j j The perturbed curves 1 ... 101 for (74)(81) were obtained in the x-axis interval [0, 2]. The error inmi max tervals [p n ,p ](p = 1 ... 4) were then sought for using the confidence-area method. The intervals were calculated as projections of the confidence area obtained using 2 , 2 , or FM,N -M statistics via the P M solution of the corresponding inequality in 1 , 2 , 3 , and 4 , on the coordinate axes 1 , 2 , 3 ,and 4 . To underscore these results, we present here the half-widths of the projections of the confidence area max mi p /2 = (p - p n )/2 rather than the limits of the mi max interval p n and p .

4.2. Monte-Carlo Method
A similar numerical simulation for (74)(81) was carried out using the Monte-Carlo method. N = 1000 perturbations of the function were made. For c(n) each perturbed function, the central values of 1 , 2 , 3 , and 4 were found (n = 1 ... N ) and their dispersions estimated using (49).
ASTRONOMY REPORTS Vol. 52 No. 2 2008
c(n) c(n) c(n)


112

ABUBEKEROV et al.

Table 5. Standard deviations (rms errors) for the parameters of the linear functions (74)(77) and the number of cases when they encompass the true values, obtained with the differential-correction method Function Formul (74) Formul (75) Formul (76) Formul (77)
c (1 ); (N1 ) c (2 ); (N2 ) c (3 ); (N3 ) c (4 ); (N4 )

N

all

0.00111; (696) 0.00186; (686) 0.00188; (666) 0.0418; (669)

0.00148; (675) 0.00409; (691) 0.0158; (659)

0.000793; (684) 0.00548; (665)

0.0366; (670)

527 485 606

Table 6. Standard deviations (rms errors) for the parameters of the non-linear functions (78)(81) and the number of cases when they encompass the true values, obtained with the differential-correction method Function Formul (78) Formul (79) Formul (80) Formul (81)
c (1 ); (N1 ) c (2 ); (N2 ) c (3 ); (N3 ) c (4 ); (N4 )

N

all

0.00115; (697) 0.0276; (683) 0.0837; (695) 0.00412; (676)

0.00112; (676) 0.00313; (673) 0.00555; (679)

0.00068; (677) 0.00143; (679)

0.00596; (687)

-469 587 534

Table 7. Standard deviations (rms errors) for the parameters of the linear functions (74)(77) and the number of cases when they encompass the true values, obtained with the Monte-Carlo method Function Formul (74) Formul (75) Formul (76) Formul (77)
c mc (1 ); (N1 ) c mc (2 ); (N2 ) c mc (3 ); (N3 ) c mc (4 ); (N4 )

0.00114; (691) 0.00154; (689) 0.00182; (674) 0.0430; (684)

0.00128; (683) 0.00394; (675) 0.0161; (692)

0.000769; (672) 0.00563; (692)

0.0376; (682)

Table 8. Standard deviations (rms errors) for the parameters of the non-linear functions (78)(81) and the number of cases when they encompass the true values, obtained with the Monte-Carlo method Function Formul (78) Formul (79) Formul (80) Formul (81)
c mc (1 ); (N1 ) c mc (2 ); (N2 ) c mc (3 ); (N3 ) c mc (4 ); (N4 )

0.00113; (670) 0.0277; (684) 0.0818; (680) 0.00415; (681)

0.00108; (700) 0.00317; (689) 0.00563; (677)

0.000689; (673) 0.00146; (691)

0.00605; (687)

As in the previous cases, we estimated the reliability of the half-width of the projection of the confidence area 1 /2, 2 /2, 3 /2,and 4 /2 by calculating N1 , N2 , N3 ,and N4 --the numbers of cases mi max when the obtained intervals [p n ,p ] (p = 1 ... 4) encompassed the true value p . In total, one thousand trials were made. In addition to calculating the number of coincidences between the true values and

the projection area, we also calculated the numbers of coincides of the confidence area itself, N0 . The results are presented below.

4.4. 2 Statistics P
The error intervals were calculated as projections of the confidence area obtained using (60) via the
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

113

solution of the corresponding inequality for 1 , 2 , 3 , and 4 along the 1 , 2 , 3 ,and 4 axes. For example, for a two-parameter function, the confidence area is elliptical (Fig. 2). The "projection of the confidence area" is the projection of this ellipse onto the corresponding parameter axes--in this case, the 1 and 2 axes. Note that the adequacy of the model to the observational data is not verified when calculating the confidence area based on 2 statistics; the model is P initially assumed to be adequate. Since the quantile 0 is plotted from the minimum residual Rmin [see (60)], the half-width of the confidence-interval projection is stable (in contrast to the case of 2 M statistics, where M is the total number of points on the curve). Tables 2 and 3 present the confidence-area projections for the parameters p for (74)(77) and (78)(81), based on 2 statistics, and the number P of cases when the resulting confidence intervals encompassed the true parameter values. If we require that the probability that the value for the exact solution lies within the projected confidence area be = 68.2%, the probability that the exact solution is encompassed by the entire confidence area will be < 68.2% ( 50%). At the same time, the projections of the confidence intervals indicated in Tables 2 and 3 guarantee that the confidence area will encompass the exact solution with a probability of = 68.2%. This situation is similar to that for the differential-correction method. In practice, the cross section for an unknown parameter is often calculated, rather than the projection of the confidence area onto the parameter axis. To determine the reliability of this parameter-estimation method, we also calculated these cross sections and the number of cases when a cross section encompassed the true parameter value based on 2 statisP tics for both the linear function (75) and a more correlated function of the parameters 1 and 2 : f (x, 1 ,2 ) = 1 x + 2 x2 . (82)

2/2 0.004

0.002

0.10

0.05

0.05

0.10 1/2

0.002

0.004
Fig. 2. Confidence area for the parameters 1 and 2 1 of the function f (, 1 ,2 ) = 1 2 +2 +sin 2 constructed based on 2 statistics (the smaller ellipse) and P 2 statistics (the larger ellipse). The dotted squares M show parallelepipeds containing the confidence areas; their size is specified by projections of the confidence area onto the 1 and 2 axes. The solid square shows the parallelepiped corresponding to the differential correction method and Monte Carlo method.

while the projection of the confidence area onto the parameter axes and the numbers of coincidences with the corresponding true values are independent of the degree of correlation between the parameters. The meaning of the projections of the confidence areas presented in Tables 2 and 3 (and in the other tables considered below) is as follows. The true parameter values are guaranteed to simultaneously all lie within the projected confidence area with a probability of = 68.2%, and to lie within the parallelepiped corresonding to the confidence area itself with some probability > 68.2%.

For (75), the half-widths of the cross sections of 1 and 2 were S1 /2 = 0.003289 (739) and S2 /2 = 0.002627 (750) (the number of coincidences of the true values and the cross section is given in parentheses). For (82), the half-width of the cross sections for 1 and 2 were S1 /2 = 0.001747 (307) and S2 /2 = 0.001122 (311); the projection and the number of coincidences with the true values were 1 /2 = 0.006989 (874) and 2 /2 = 0.0044896 (879). It is obvious that the size of the cross section and the number of coincidences with the true values depend on the degree of correlation between the parameters,
ASTRONOMY REPORTS Vol. 52 No. 2 2008

4.5.

2 M

Statistics

A similar numerical experiment was also carried out based on 2 statistics. The intervals were calcuM lated as projections of the confidence areas obtained using the statistics (50), which is the solution of the inequality (52 )for 1 , 2 , 3 ,and 4 on the 1 , 2 , 3 , and 4 axes. Tables 9 and 10 present the half-widths of the projected confidence areas 1 /2, 2 /2, 3 /2,and 4 /2. Recall that the selected confidence level is = 68.2%.


114

ABUBEKEROV et al.

N 3500 3000 2500 2000 1500 1000 500 0 0.001 0.003 0.005 0.007 0.009 0.011 1/2

Fig. 3. Experimentally obtained histogram for the density of the distribution of the half-width of the confidence interval 1 /2 of the function f (, 1 ) = cos( + 1 )+ sin 1 . The intervals were obtained for 2 statistics as a result of 104 trials; N is M the number of values of the confidence-interval half-widths 1 /2 falling on the corresponding step of the histogram.

In contrast to the 2 distribution and differentialP correction method, the confidence area depends on j j the specific realization of the perturbed 1 ... 101 curve. Figure 3 presents the experimentally obtained density histogram for the distribution of the halfwidth of the confidence interval 1 /2 for the oneparameter function obtained as a result of 104 trials using (74). The histogram displays two pronounced maxima. The first maximum is associated with errors of the first type, when the correct solution is rejected with the probability = 1 - ; i.e., there is no error interval. The second maximum of the histogram displays the density distribution for the error intervals. The confidence intervals vary by a factor of seven to eight. Therefore, we present the half-width of the interval in the vicinity of the second maximum of its density distribution. Tables 9 and 10 present the projected confidence areas for the parameters p of (74)(77) and (78) (81) for the 2 statistics, together with the numbers M of coincidences with the true values. These areas are approximately twice those for the 2 statistics. P

obtained for the FM,N -M statistics, together with the number of coincidences with the true parameter values.

4.7. Analysis of the Results
First and foremost, we note that, even in the case of the linear (74) and non-linear (78) one-parameter problem, the error-interval half-width obtained using the differential-correction and confidence-area methods with 2 and FM,N -M statistics do not M coincide. The error-interval half-width 1 for 2 M and FM,N -M statistics can exceed the error interval c (1 ) for 1 obtained using the differential-correction method by up to a factor of 7-10. In the case of the one-parameter problems (74) and (78), the error-interval half-widths for p obtained using the confidence-area method with 2 statistics are very P close to those obtained for the differential-correction method and the Monte-Carlo technique. However, in the case of analyses with two or more parameters, the projections of the confidence areas onto the parameter axes p differ by factors of 1.5-2. The probability of encompassing the true value exceeds 68%. As the numerical simulations showed, the halfwidths of the error intervals for unknown parameters depend on the selected statisticsand theapriori information about the model used. The methods perform similarly in the sense that the number of cases when the exact solution falls within the confidence area corresponds to the specified probability . The half-width of the error intervals in the c differential-correction method ( (p )) and Montec )) are very close. In the Carlo method (mc (p
ASTRONOMY REPORTS Vol. 52 No. 2 2008

4.6. Fisher Statistics F

M,N -M

We also carried out analogous numerical simulations for the Fisher statistics FM,N -M . The error intervals were calculated as projections of the confidence areas obtained using (53), which represents the solutions for the corresponding inequalities in 1 , 2 , 3 ,and 4 on the 1 , 2 , 3 ,and 4 coordinate axes. Tables 11 and 12 present the projected confidence area for the parameters p of (74)(77) and (78)(81)


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

115

Table 9. Half-widths of the projected confidence areas for the parameters of the linear functions (74)(77) for 2 M statistics and the number of coincidences with the theoretical parameter values Function Formul (74) Formul (75) Formul (76) Formul (77) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00344; (674) 0.00915; (688) 0.00508; (740) 0.2174; (726)

673 684 674

0.00702; (694) 0.0110; (738) 0.0820; (728)

0.00215; (739) 0.0284; (727)

0.1900; (725)

Table 10. Half-widths of the projected confidence areas for the parameters of the non-linear functions (78)(81) for 2 M statistics, and the number of coincidences with the theoretical parameter values Function Formul (78) Formul (79) Formul (80) Formul (81) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00354; (663) 0.116; (707) 0.437; (718) 0.0160; (746)

-680 676 680

0.00436; (700) 0.00159; (729) 0.0220; (745)

0.00343; (723) 0.00560; (746)

0.0229; (747)

Table 11. Half-widths of the projected confidence areas for the parameters of the linear functions (74)(77) for Fisher distribution FM,N -M , and the number of coincidences with the theoretical parameter values Function Formul (74) Formul (75) Formul (76) Formul (77) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00294; (692) 0.00746; (705) 0.00420; (728) 0.1733; (753)

-685 687 688

0.00578; (706) 0.0093; (729) 0.0677; (752)

0.00189; (730) 0.0200; (752)

0.134; (753)

Table 12. Half-widths of the projected confidence areas for the parameters of the non-linear functions (78)(81) for the Fisher distribution FM,N -M , and the number of coincidences with the theoretical parameter values Function Formul (78) Formul (79) Formul (80) Formul (81) 1 /2; (N

1

)

2 /2; (N



2

)

3 /2; (N



3

)

4 /2; (N



4

)

N

0

0.00315; (693) 0.099; (700) 0.391; (723) 0.0114; (749)

-678 679 680

0.00432; (700) 0.00140; (725) 0.0192; (750)

0.00301; (723) 0.00485; (749)

0.0193; (750)

confidence-area method, the half-width of the error interval increases with the uncertainties in the data. For example, when 2 statistics are used (where P P is the number of parameters), the model is initially assumed to be adequate to the observational data, and the dispersion of the data to be known a priori. In general, the adequacy of a model is taken to be demonstrated if the reduced 2 for the minimum M
ASTRONOMY REPORTS Vol. 52 No. 2 2008

residual--2 /(M - P ), where M is the number of M data points and P the number of parameters--is close to unity (see, for example, [24]). If the reduced 2 M appreciably exceeds unity, this means that the minimum residual is too high; i.e., there are substantial systematic deviations between the observed and best-fit theoretical light curves, providing grounds to consider the model inadequate. In this case, it is


116

ABUBEKEROV et al.

f 3.6 3.5 3.4 3.3 3.2 1.0 1.2 1.4 1.6 1.8 2.0

f 3.6 3.5 3.4 3.3 3.2 1.0 1.2 1.4 1.6 1.8 2.0

Fig. 4. Use of the differential-corrections technique. The linear function f (, 1 ,2 ) = 1+1 2 + 2 sin is con structed with the parameters 1 - and 2 - , 1 and 2 , and 1 + and 2 + . The random scatter of the points conforms to Gaussian statistics with teor = 0.03. The values 1 = 2.0000 0.001600 and 2 = 2.9991 0.001278 were adopted. The "error corridor" corresponding to the boundaries of the confidence intervals for the parameters is shown (the curves nearly merge).

Fig. 5. Use of the confidence-area method with 2 M statistics. The linear function f (, 1 ,2 ) = 1+1 2 + 2 sin , constructed with the parameters 1 - and 2 - , 1 and 2 , and 1 + and 2 + , is presented. The random scatter of the points conforms to Gaussian statistics with teor = 0.03. The values 1 = 2.0000 0.01083 and 2 = 2.9991 0.008798 were adopted. The "error corridor" corresponding to the boundaries of the confidence intervals for the parameters is shown.

not correct to estimate the confidence intervals for the parameters using 2 statistics (the same is also P true for the differential-correction and Monte-Carlo methods). The assumption a priori of a model's adequacy unifies the calculation of the errors in the parameters p using 2 statistics and the differential-correction P and Monte-Carlo methods. In one-parameter problems, the error-interval half-width 1 /2 obtained using 2 statistics is very close to the errors for P the differential-correction and Monte-Carlo methods (Tables 28). In multi-parameter problems, the projection of the confidence area onto the corresponding p axes differs from the error-interval half-width calculated using the differential-correction and MonteCarlo methods by a factor of 1.5-2. The probability of encompassing the exact solution increases above that specified by . As was already noted, if we require that the probability of the projections encompassing the exact solution be = 68.2%, the projected confidence area roughly coincide with the standard rms deviations; however, there is no guarantee that the exact solution lies within the confidence area with the specified probability = 68.2% (the real probability for this is < 68.2%). When 2 statistics (where M is the number of M values of the "perturbed" function) and Fisher statistics FM,N -M are used, then the confidence areas for the parameters are calculated and the adequacy of the model is also checked. One distinguishing

feature of this method for calculating the error interval (confidence area) is that the confidence areas depend on the specific realization of the perj j turbed curve 1 ... 101 . The confidence areas ob2 and F tained for M M,N -M statistics varies (in contrast to the differential-correction, Monte-Carlo, and confidence-area methods with 2 statistics). For exP ample, Fig. 3 presents the distribution of confidence areas in the one-parameter case. The area corresponding to the maximum of the histogram exceeds the error intervals for the differential-correction and Monte-Carlo methods by, on average, a factor of 5, and the confidence intervals for 2 statistics by a P factor of 3 (Fig. 2). Figure 4 presents graphs for the two-parameter linear function (75) constructed for the parameter c c c c c c c c 1 - (1 ) and 2 - (2 ), 1 and 2 ,and 1 + (1 ) c + ( c ), which form the so-called "error and 2 2 corridor." When the differential-correction method j j is used, about half of the 1 ... 101 points of the "perturbed" function fall away from the error corridor. At the same time, in the error corridor formed by (75) for the confidence-area method with 2 M c c statistics and the parameter values 1 - (1 )/2 c c c c c c and 2 - (2 )/2, 1 and 2 , and 1 +(1 )/2 c +( c )/2 (where c is the value of the and 2 p 2 mi max middle of the interval [p n ; p ]), the vast majority of the points lie within the error corridor (Fig. 5). This explains why model-fitting results obtained by
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

117

different authors who have estimated the errors using the differential-correction method are often inconsistent. The confidence-area method yields more reliable estimates for the parameter errors. Returning to the reliability of the error-interval half-widths, we note that, in the differentialcorrection method, the number of simultaneous coincides of all the error intervals by their parameters Nall is appreciably lower (by 20%) than the probability (Tables 5 and 6). On the other hand, the number of coincidences with the individual error intervals c c p (p ) corresponds to the probability in this method. In the confidence-area method, the number of coincidences with the indicated projections exceeds the probability ; however, it is guaranteed that all the true parameter values will simultaneously lie in the confidence area with the probability , and lie within the encompassing parallelepiped with a probability exceeding . Thus, in the confidence-area method, the number of coincidences N0 of the true parameter values with the confidence area always corresponds to the suggested probability (Tables 212), whereas the number of coincidences with the projected confidence area appreciably exceeds the probability . It is clear from the simulation results that the error-interval halfwidths obtained for the confidence-area method are more reliable than those obtained for the differentialcorrection and Monte-Carlo methods. Note that non-linearity in the parameters results in asymmetry of the confidence area about the central value, whereas, in the linear case, the area is strictly symmetrical about the p axes. 5. FITTING OF THE OBSERVED LIGHT CURVE OF THE BINARY SYSTEM YZ Cas

L 1.00 0.95 0.90 0.85 0.80 0.75 0 2 4 6 8 10 12 14

Fig. 6. Primary minimum of the normalized light curve of the detached binary YZ Cas. The points indicate the observed brightnesses from [8]; the solid curve represents the best-fit theoretical light curve obtained using our procedure. The corresponding parameters are presented in Table 13.

L 1.00 0.98 0.96 0.94 0.92 0.90 180 182 184 186 188 190 192 194

Fig. 7. Same as Fig. 6 for the secondary minimum.

5.1. Six-Parameter Problem
Here, we analyze the normalized light curve of the eclipsing binary YZ Cas in the red filter ( = 6700 A) taken from the study of Kron [8]. The observed light curve 1 ... 42 consisted of M = 42 points. The dispersion for each point in the observed light curve was obs taken to be (m )2 = 1.77 10-6 [3] on a magnitude scale; in our intensity scale, taking into account obs the fact that Lfull = 1, (m )2 = 1.5015 10-6 .The central value for each point of the light curve was also obtained by averaging Nm = 12 points for m = 1 ... 42 [6, 8]. We obtained a least-squares fit of the total light curve for all six parameters, yielding the central values for the parameters r1 = 0.14408, r2 = 0.07556, i = 88.27 , X1 = 3.927, X2 = 2.696, and X3 = 10.703. We obtained the rms estimates for the dispersions using the differential-correction method with (44 ). The
ASTRONOMY REPORTS Vol. 52 No. 2 2008

rms dispersion of theunity-weight 0 = 0.003422904 was estimated using (43 ). The confidence-area method with Fisher statistics minimized with respect to the linear parameters (70) yielded the onedimension projections of the confidence areas at the confidence level = 0.6827. For example, this projection for r1 is specified by (73); i.e., it is the solution of this inequality. Accordingly, these projections were calculated as follows: the residual (70) was minimized with respect to two non-linear parameters, and the resulting value, which depends on the one remaining non-linear parameter (alternately taken to be r1 , r2 , and i), was set equal to the quantile (i.e., two roots of the corresponding equation were determined). These results are given in Table 13. Since an analysis using 2 and 2 statistics reP M quires the unity-weight dispersion 0 , which is not known exactly beforehand, we studied a model binary system in which the phases 1 ...42 coincided with those of the observed light curve of the YZ Cas.


118

ABUBEKEROV et al.

Table 13. Photometric elements of YZ Cas derived here from the light curve at = 6700 A Method Differential corrections (est ) Confidence intervals, statistics FM -3,N -M ( = 68.2%) r1 0.14408 0.00023 0.1441 0.0021 r2 0.07556 0.00038 0.07547 0.0012


i 88.27 0.090


88.36 0.5225

Table 14. Photometric elements of the model binary YZ Cas Method Differential corrections ( ) Monte-Carlo ( ) Confidence intervals, statistics 2 ( = 68.27%) 3 Confidence intervals, statistics 2 -3 ( = 68.27%) M r1 0.14422 0.00023 0.14422 0.00023 0.14450 0.00043 0.14448 0.0012 r2 0.07557 0.00038 0.07557 0.00038 0.07579 0.00073 0.07564 0.0019 i 88.28 0.091 88.28 0.085 88.17 0.17 88.19 0.46


Table 15. Photometric elements of YZ Cas derived from the light curve at = 6700 A by other authors Reference [8] [10] [11] [12] [13] [14] [15] [16] [17] r1 0.1443 0.00046 0.1428 0.14478 0.00021 0.1454 0.0009 0.151 0.002 0.144 0.1466 0.145 0.005 0.1442 1% r2 0.0756 0.00015 0.0763 0.07580 0.00042 0.0753 0.0003 0.0779 0.006 0.0780 0.080 0.076 0.001 0.0767 1% i 88.18 0.057 88.11 88.617 0.038 88.47 0.01 87.1 0.3 89.22 88.4 88.25 0.1 88.3 0.2


The true values of the binary parameters were taken to be equal to the central values obtained from the fit of the observed light curve: r1 = 0.14408, r2 = 0.07556,and = 88.27 . The unity-weight dispersion i was taken to be the rms estimate 0 obtained from the fit of the observed light curve, 0 = 0.003422904. The weight coefficients wm and the number of the measurements at each phase N1 ... N42 were set equal to 12. We obtained the dispersions for the model system for the differential-correction method using (41 ). Further, we obtained the one-dimension confidencearea projections at the confidence level = 0.6827 using the confidence-area method and 2 statistics M minimized with respect to the linear parameters (69), and using 2 statistics (71). The results for the model 3 light curve of YZ Cas are given in Table 14. The nu-

merical experiment consisted of a thousand realizations of the model light curve 1 ... 42 ,eachofwhich was analyzed and the number of coincidences of the true values of r1 , r2 , and i with the error intervals obtained for each method calculated. Table 15 presents the results of fitting of the YZ Cas light curve by other authors. The central values for the component radii r1 , r2 and the orbital inclination i we have obtained are in good consistency with the previous results (Tables 13, 15). Thus, our procedure for fitting the light curve of two spherical stars can be considered reliable. Figures 6 and 7 present the light curve obtained from the fit of the observational data carried out in [8].
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

119

r2/2 0.0004

0.0002

0.0002

0.0001

0.0001

0.0002 r1/2

0.0002

0.0004
Fig. 8. Confidence area of r1 and r2 obtained from the fit of the observed light curve of YZ Cas taken from [8], obtained using 2 statistics (small ellipse) and 2 statistics (large ellipse). The dotted squares correspond to parallelepipeds that encompass P M the confidence areas, whose sizes are specified by the projections of the confidence area onto the r1 and r2 axes. The solid square corresponds to parallelepiped obtained by differential correction method and Monte Carlo method.

i/2 0.10 0.05

0.0003 0.0002

0.0001 0.05 0.10

0.0001

0.0002 r2/2

Fig. 9. Confidence area of r2 and i obtained from the fit of the observed light curve of YZ Cas taken from [8], obtained for 2 P statistics (small ellipse) and 2 statistics (large ellipse). The dotted rectangles correspond to parallelepipeds that encompass M the confidence areas, whose sizes are specified by the projections of the confidence area onto the r2 and i axes. The solid rectangle corresponds to parallelepiped obtained by differential correction method and Monte Carlo method.

5.2. Two-Parameter Problem
Based on our procedure, we analyzed the observed light curve of YZ Cas using the differentialcorrection, Monte-Carlo, and confidence-area methods with 2 , 2 ,and FM,N -M statistics. In addition, P M we checked the reliability of the error intervals calculated for each method. To this end, we introduced a thousand perturbations of the light curve with the "true" parameter values r1 = 0.14408, r2 = 0.07556,
ASTRONOMY REPORTS Vol. 52 No. 2 2008

and = 88.27 , and the unit-weight dispersion 0 = i 0.003422904. We also calculated the numbers of coincidences of the corresponding confidence areas and their projections with the true parameter values. For convenience and simplicity when comparing the results obtained for each of the methods, we reduced the fit of the observed light curve of YZ Cas to a two-parameter form. In the first case, the component radii r1 and r2 were assumed to be unknown, and, in the second case, the radius of the secondary r2 and


120

ABUBEKEROV et al.

Table 16. Fit of the light curve of YZ Cas from [8] for component radii Method Differential corrections Monte-Carlo Confidence intervals, statistics Confidence intervals,
2 M 2 P

r1 /2; (N 0.6687 10 0.6639 10 1.013 10 2.284 10 2.294 10

r

1

)

r2 /2; (N 0.1089 10 0.1082 10 0.1649 10 0.3720 10 0.3737 10

r

2

)

N 543 535 687 681 685

-4 -4

; (687) ; (683) ; (868) ; (714) ; (717)

-3 -3 -3 -3 -3

; (693) ; (687) ; (870) ; (719) ; (714)

-4 -4 -4

statistics
N -M

Confidence intervals, FM,

statistics

Table 17. Fit of the light curve of YZ Cas from [8] for the radius of the secondary r2 and the orbital inclination i Method Differential corrections Monte-Carlo Confidence intervals, statistics Confidence intervals,
2 M 2 P

r2 /2; (N 0.7920 10 0.8021 10 1.203 10 2.226 10 2.063 10

r

2

)


i;(Ni ) 88.2836 0.0305 ; (696) 88.2700 0.02851; (682) 88.2821 0.04663; (874) 88.2915 0.0862; (717) 88.2911 0.07988; (716)


N 472 491 676 683 687

-4 -4

; (675) ; (686)

-4 -4 -4

; (868) ; (711) ; (720)

statistics
N -M

Confidence intervals, FM,

statistics

the orbital inclination i. The other parameters were assumed to be known, and equal to the central values presented above. Tables 16 and 17 present the results of our lightcurve analysis in this two-parameter form. Figure 8 presents the confidence area for r1 and r2 obtained when the observed light curve of YZ Cas was fitusing 2 and 2 statistics. Figure 9 presents the confiP M dence area for r2 and i obtained when the observed light curve was fit using 2 and 2 statistics. The P M confidence areas for the 2 statistics are, on average, M about twice those for the 2 statistics. According P to the simulations, the numbers of coincidences of the true values with confidence areas for the 2 and P 2 statistics correspond to the specified probability M = 68.2%. 6. RESULTS OF ANALYZING THE LIGHT CURVE OF YZ Cas The results based on the observed light curve of YZ Cas reproduce the previous simulation results obtained using (74)(77) both quantitatively and qualitatively. The parameter errors calculated using the differential-correction and Monte-Carlo methods are similar (Tables 16 and 17). The confidence intervals calculated using 2 statistics in multi-parameter P problems exceed the parameter errors calculated

using the differential-corrections and Monte-Carlo methods by factors of 1.5-2. For one-dimensional problems, the confidence intervals obtained using 2 P statistics are close to the errors calculated using the differential-correction and Monte-Carlo methods. Projections of the confidence areas for r1 , r2 , and i calculated using 2 and FM,N -M statistics exM ceed the error intervals for the same parameters obtained using the differential-correction and MonteCarlo methods by factors of 3-5. Recall that the confidence areas calculated using 2 and FM,N -M M statistics vary. Therefore, we present the confidence area in the vicinity of the maximum of the area distribution. The results of the numerical experiment to estimate the reliability of the error-calculation methods are similar to those for (74)(81). Tables 16 and 17 show that the number of simultaneous coincidences for the error intervals obtained using the differential-correction and Monte-Carlo methods are substantially different from the specified probability (15-20% lower), though the number of coincidences of the individual intervals corresponds to the specified probability = 68.2%. In the confidence-area method, the number of coincidences of the projected confidence areas with true values of r1 , r2 ,and i exceeds the specified probability = 68.2%, while the number of coincidences of the confidence area itself by the true values corresponds to the specified probability (Table 16 and 17). The
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

121

non-linearity of the problem does not affect the deviation of the number of coincidences substantially, and the approximation of using (41 ) in the differentialcorrection method can be considered reasonable. As was noted above, the non-linearity of the problem is manifest in the symmetry of the confidence areas about their central values (Figs. 8 and 9). 7. DISCUSSION Let us summarize the main conclusions of our study comparing various methods for estimating parameter errors in inverse parametrical problems. These methods are classified according to the selected statistics and the a priori information about the dispersion of the observational data and the adequacy of the model used.

7.2. Monte-Carlo Method
This method is similar to the differentialcorrection method and yields similar results. It uses the statistics for the normal distribution of the measurements of a single point of the light curve. The method assumes that the model is adequate to the data and that the dispersion of the observational data is known independently. Like those obtained for the differential-correction method, the parameter errors obtained in the Monte-Carlo method are very small (according to Popper [9], the "outer" errors can exceed the "inner" errors by factors of three to five). Thus, both the differential-correction and MonteCarlo methods use strict a priori assumptions about the model's adequacy and apply the simplest statistics for the normal distribution of the observational measurements with a previously known dispersion. It is precisely due to the use of these idealized model assumptions that these methods yield such small parameter errors. In reality, it is not known in advance whether the model provides an adequate description of the data. In addition, in reality, model parameters are searched for using statistics generated by a normal distribution (2 , Fisher, etc.) rather then statistics of a normal M distribution. The dispersion of the observational data is often not known in advance. Therefore, it is natural to estimate parameter errors using the same statistics used for the search for the model parameters themselves (2 , Fisher, etc.), rather then using the M statistics of a normal distribution or 2 statistics. P Let us consider such methods for estimating parameter errors.

7.1. Differential-Correction Method
This method is based on the statistics for the normal distribution of measurements for a single observed point. The dispersions and, accordingly, rms errors, of the parameters are expressed in terms of the dispersion of the observations using the formula for the sum of dispersions of the observational data. The method assumes a priori that the model is correct and adequate to the observations, and that the dispersion of the observational data is known independently. The parameter errors obtained using the differential-correction method are generally very small. The probability of simultaneous coincidences of all parameter intervals with the exact solutions is appreciably (15-20%) lower than the specified probability 68.2% (see, for example, Tables 16 and 17). The corridor of theoretical light curves corresponding to the boundaries of the confidence intervals for the parameters is very narrow--comparable to the rms error of a single measurement. A substantial fraction of the observed points lie outside the error corridor formed by the limiting theoretical light curves. Therefore, it is likely that the results of fitting of observational data obtained by different researchers will be significantly different. The "outer" errors of the obtained parameters can substantially exceed the "inner" errors (according to Popper [9], by a factor of three to five). According to our numerical experiments, the use of linearization procedures in non-linear problems distorts the confidence intervals obtained using the differential-correction method only slightly when the observational errors are small.
ASTRONOMY REPORTS Vol. 52 No. 2 2008

7.3. Method Based on 2 Statistics P
Since in this method relative rather than absolute variations of the residual functional are used, it assumes that the model is fully adequate. In addition, the dispersion of the observational data is assumed to be known a priori. In this method, the confidence area never degenerates to a null set. In addition, in nonlinear problems, the confidence area is understood in asymptotic terms: the probability of encompassing the exact solution is not equal to the specified probability , but only tends to as M , where M is the number of observational points. However, as simulations show, the probability of of the asymptotic confidence area in non-linear problems encompassing the exact solution is very close to the specified probability for sufficiently large M . In one-parameter problems, the confidence-area method with 2 statistics yields results close to those P obtained with the differential-corrections and MonteCarlo methods. This is natural, since all three use


122

ABUBEKEROV et al.

the same strict, idealized a priori assumptions. In both one-parameter and multi-parameter problems, the 2 method yields confidence areas that always P encompass the exact solution with the specified probability . This represents a substantial advantage of the 2 method over the differential-correction and P Monte-Carlo methods, for which, as is shown by our simulations, the probability of joint coincidence of the confidence intervals for all parameters with the exact solutionsisbelow thespecified probability . When the 2 method is used to estimate the P confidence intervals in multi-parameter problems, it is necessity to project the confidence area onto the parameter axes. The confidence area that encompasses the exact solution with a specified probability is replaced by a parallelepiped containing it, and the probability of encompassing the exact solution increases. If we assume that the probability of coincidence with a projection onto a parameter axis is = 68.2%, then, according to supplementary calculations using the example of (75) and (82), the projections of the confidence area onto the parameter axes, 1 and 2 , obtained for 2 statistics coincide with the conP fidence intervals obtained for differential-correction method, 1 and 2 , for the same probability = 68.2%. However, in this case, the probability of a joint coincidence of the confidence area by all parameters is substantially smaller than the probability = 68.2% (50% or less).

those for the 2 statistics by approximately a factor of P two (see, for example, Figs. 8 and 9). To estimate the parameter confidence intervals, the confidence area must be projected onto the parameter axes, and the fact that the probability of encompassing the exact solution increases in this case must be taken into account. The fact that the confidence area for the 2 statisM tics exceeds that for the 2 statistics does not imply P that the 2 method is worse. In addition to everyM thing else, the 2 method also takes into account M the uncertainty of the fitting results due to possible inadequacy of the model. Therefore, if it is necessary to make an important judgement about the results of model-fitting, precisely the 2 method is recomM mended for the estimation of the associated parameter errors.

7.5. Method Based on Fisher F

M,N -M

Statistics

As in the 2 technique, this method does not M assume adequacy of the model. In addition, in contrast to the 2 method, no a priori knowledge of M the dispersion of the observations is required. The dispersion is derived from the observed scatter of the points in the studied light curve. On this basis, there is some consensus that the FM,N -M method is the most general and uses the smallest number of model assumptions. The only assumption is that observational data have normal distributions in the vicinities of selected sections of the light curve. The confidence-area methods based on the 2 M and Fisher FM,N -M statistics are similar. In both, the confidence areas vary appreciably for different realizations of the observational data; the number of cases when the area degenerates to a null set is = 1 - . Nevertheless, the confidence areas obtained for these statistics are different. We carried out additional calculations aimed at a more detailed study of the difference in the confidence areas obtained for FM,N -M and 2 statistics. In particular, we used the M confidence-area method to search for the unknown parameter 1 using the indicated statistics for the functions (74) and f (, 1 ) = cos + 1 e . (83)

7.4. Method Based on

2 M

Statistics

This method also assumes that the dispersion of the observational data is known a priori. However, the adequacy of the applied model is not assumed in this case. The absolute values of the residual functional are used to find the confidence areas. Therefore, the confidence areas are different for different realizations of the random process (the observed light curve). In a number of cases, when a model is rejected based on a statistical criterion, the confidence area degenerates into a null set. However, it is strictly proven and confirmed by numerical experiments that the confidence area found using the 2 method encompasses the M exact solution with the specified probability , as in the 2 method. P As follows from the smaller number of model assumptions compared to the 2 method (the model's P adequacy is not assumed), the confidence area obtained using the 2 method is larger than that for M the 2 method. The numerical simulations show that P the the confidence areas for the 2 statistics exceed M

We obtained true values for the function 1 ... 101 at points 1 ... 101 uniformly distributed along the horizontal-axis interval [0, 2] for the specified value 1 = 2. Further, a sample of normally distributed ranj j dom 1 ... 101 was generated for j = 1 ... 12,so that j j M(m ) = m and (m ) = 0.03 (m = 1 ... 101).
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

123

N 4000 3000 2000 1000 ()

0 4000 3000 2000 1000 0

0.002 0.004

0.006

0.008

0.010

0.012

0.014 0.016 0.018

(b)

0.0003

0.0009

0.0015

0.0021

0.0027

0.0033 1 /2

Fig. 10. Experimentally derived histogram of the distribution density for the coinfidence-interval half-widths for the functions (a) f (, 1 ) = cos + 1 sin and (b) f (, 1 ) = cos + 1 e . The intervals are obtained using 2 (gray histogram) and M FM,N -M (light histogram) statistics as a result of 104 trials. N is the number of values for which the confidence-interval half-width 1 /2 coincides with the corresponding step of the histogram.

1 Nm j m , Nm j =1 Nm = 12 for the Fisher statistics. Accordingly, we adopted in (45) and (46) M = 101, wm = 12, and case, the standard deviation 0 = 0.03 (in this (m ) = m = 0.03/ 12 = 0.008660). For the 2 statistics, we adopted M = 101 M 12 = 1212, introduced the new values 12(m -1)+j = As before, we calculated
m

=

j m , and calculated the m , so that 12m +j = m (m = 1 ... 101,j = 1 ... 12); in terms of observations, this is equivalent to the statement that each of the groups of 12 values of 1...12 ,13...24 ,... 1200...1212 is obtained at the same phase. Accordingly, wm = 1 and 0 = 0.03 in (45) and (46) (in this case, the standard deviation (m ) = m = 0.03). Further, the error intervals (1 ) were calculated using each of the statistics. The confidence interval (1 ) was calculated using the 1 ... 1212 (2 1212 statistics) and 1 ... 101 (Fisher statistics), which were obtained each time from the same sample of j j 1 ... 101 using the above techniques. Figures 10a and 10b present histograms of the density of the distribution of the confidence-interval half-width 1 /2 for (74) and (83) for F101,1212-101

and 2 1212 statistics. Figure 10 shows that the maximum confidence intervals obtained for the 2 1212 statistics exceed the maximum confidence intervals for the Fisher statistics by about a factor of two. In each of the statistics, the probability that a correct model is rejected is fully consistent with the value = 100% - 68.2% = 31.8% specified a priori. The real "observed" curve is not ideal; i.e., it does not contain values obtained for the same (at the same phase). In this connection, we carried out a simulation to analyze realizations of 1 ... 1212 in which 1 ... 1212 was taken to be uniformly distributed along the horizontal axis (let us call corresponding statistics as conventional Fisher statistics). In this case, conventional Fisher statistics are a priori not applicable, but are nonetheless used in practice. The validity of the Fisher statistics for such a sample of m was studied using the examples of (74) and (83). When calculating the residual using conventional Fisher statistics, the "perturbed" curve for the function 1 ... 1212 was subdivided into 101 intervals of equal length in , with each interval containing 12 points. The average values of k and k (where k = 1 ... 101) inside the interval were calculated from these points, as well as the estimated dispersion obs .

ASTRONOMY REPORTS

Vol. 52 No. 2 2008


124

ABUBEKEROV et al.

N 4000 3000 2000 1000 ()

0 4000 3000 2000 1000 0

0.002

0.004

0.006

0.008

0.010

0.012

0.014

0.016

(b)

0.0005 0.0010 0.0015

0.0020 0.0025 0.0030 0.0035 0.0040 1 /2

Fig. 11. Experimentally derived histogram of the distribution density for the confidence-interval half-widths of the functions (a) f (, 1 ) = cos + 1 sin and (b) f (, 1 ) = cos + 1 e . The intervals are obtained using 2 (gray histogram) and M conventional Fisher FM,N -M (light histogram) statistics as a result of 104 trials. The conventional FM,N -M statistics are approximate, since they are constructed by averaging the individual measurements (uniformly distributed along the horizontal axis) inside selected intervals. N is the number of values for which the coinfidence-interval half-width 1 /2 coincides with the corresponding step of the histogram.

As in the previous numerical experiment, the dimension of the Fisher distribution was F101,1212-101 . Only the uncertainty in was taken into account, not the uncertainty in (this is the most common technique). If the uncertainty in is taken into account, the distribution of the confidence areas may deviate slightly from an exact Fisher distribution. Figures 11a and 11b present histograms for the distribution density of the half-width of the interval 1 /2 obtained as a result of 104 trials using the 2 1212 and Fisher F101,1212-101 statistics for (74) and (83). While the result for the 2 1212 statistics is consistent with the expected probability for rejecting a correct model of = 31.8%, the results for the conventional Fisher statistics deviate appreciably from the theoretical value for . For example, was found to be 20% and 0% for (74) and (83), based on 1000 trials. Thus, averaging points inside intervals affects not only the confidence interval, but also conclusions about the adequacy of the model. Recall that, when fitting the perturbed curve 1 ... 1212 , when the averaged j display a single value j , the results of the fitting for the Fisher statistics are fully consistent with the pre-

specified . Thus, when the Fisher FM,N -M statistics are used, the minimum scatter of the observational data j along the horizontal axis within a single interval is required. In this case, both the probability that the model is adequate to the observational data and the length of the interval (on average) incompetently increase. This incompetent increase in the probability that the model is adequate depends strongly on the type of the function ( ), with the incompetent increase being higher for functions with stronger dependences (Fig. 11). Like the 2 and 2 methods, the FM,N -M techP M nique requires projecting the confidence area (which covers the exact solutions with a specified probability ) onto the parameter axes, resulting in an increase of the probability of encompassing the exact solution. Summarizing, the differential-correction and Monte-Carlo methods yield lower boundaries for the errors of model parameters corresponding to highly idealized a priori model assumptions. The 2 method P yields intermediate values for the parameter errors, and is also based on the idealized assumption that the model used is adequate to the data. The 2 M
ASTRONOMY REPORTS Vol. 52 No. 2 2008


ERRORS OF PARAMETERS IN INVERSE PROBLEMS

125

and FM,N -M methods yield upper boundaries for the parameter errors. These methods are free of artificially idealized model assumptions, and so yield the most reliable fitting results. The 2 and FM,N -M methods M are recommended when conservative judgements about the results of fitting observational data are desired. 8. CONCLUSION Our numerical simulations show the effectiveness of various approaches for calculating parameter errors. The error interval depends dramatically on the selected statistics and assumed a priori information about the model. We stress again that, even in oneparameter problems, the parameter error obtained using the differential-correction and Monte-Carlo methods, on the one hand, and using 2 and M FM,N -M statistics, on the other and, differ appreciably. Our simulations have demonstrated that the parameter errors calculated using the differentialcorrection and Monte-Carlo methods are similar. Depending on the problem, either of these methods may be preferred. We emphasize that, when the differentialcorrection and Monte-Carlo methods are used in multi-parameter problems, we are dealing with the probability of the number of coincidences of the true parameter values with each interval individually. As the simulations show, this probability corresponds to the specified probability. The probability of some number of joint coincidences with the error intervals will be lower than the specified probability by 15-20% [for the model functions (74)(81)]. Consequently, when joint coincidences with the error intervals is considered, these methods cannot be considered reliable. At the same time, our simulations have demonstrated the trustworthiness of the confidence-area method for the above problems. The number of coincidences of the confidence area with the "true" parameter values fully corresponds to the specified probability , so that the number of simultaneous coincidences of the projected areas will not be lower than (since this is the number of coincidences with a parallelepiped in which the confidence area is inscribed). The confidence area depends on the applied statistics (in the case of 2 and FM,N -M statistics, on the M specific observed realization of the function considered). By the "half-width of the projected confidence area," we mean one of its most probable values. In the one-dimensional case, we adopt one of the 1 /2 values in the proximity of the maximum of the histogram
ASTRONOMY REPORTS Vol. 52 No. 2 2008

of the distribution density 1 /2 (Fig. 3) as the halfwidth of the confidence interval. In other words, we adopt some most probable value for the confidence area. When using 2 and FM,N -M statistics, coinM cidence with the confidence area does not depend only on the position of its center relative to the true values, but also to the area's size and its existence (i.e., the area may turn out to form a null set). This can be interpreted as follows: in the differential-correction and confidence-area methods with the 2 statistics, the P model is assumed to be adequate to the observational data, and the confidence intervals for these methods are smaller than the confidence areas obtained using the 2 and Fisher FM,N -M statistics (we mean here M the area near the maximum of its distribution). When error calculation is based on 2 statistics, it must be P borne in mind that the model's adequacy is assumed in advance. The confidence areas obtained for 2 and M FM,N -M statistics exceed the areas obtained for 2 P statistics by a factor of two to three (Figs. 8 and 9). As was already noted, the larger confidence area obtained for 2 and FM,N -M statistics is due to the M fact that, in addition to calculating the confidence area itself, we also check the model's adequacy to the observational data. When Fisher statistics FM,N -M are used, it must be borne in mind that the scatter of the data within a selected averaging interval increases the mean parameter-error intervals, increasing the probability that the model is adequate (Fig. 11a). Taking into account all our presented calculations and considerations, authors usually prefer the confidence-area method with 2 statistics. M ACKNOWLEDGMENTS This work has been supported by the Russian Foundation for Basic Research (project no. 05-0217489), the State Program of Support to Leading Scientific Schools of the Russian Federation (grant no. NSh-5218.2006.2), a Presidential Grant for the Support of Young Russian Post-doctoral Scientists (Candidates of Science) and their Supervisors (MK2059.2007.2), and the Program of the Ministry of Science of the Russian Federation "The Development of the Scientific potential of High Schools" (grant RNP 2.1.1.5940). REFERENCES
1. A. B. Wyse, Lick Observ. Bull. 19, 17 (1939). 2. M. Lampton, B. Margon, and S. Bower, Astrophys. J. 208, 177 (1976).


126

ABUBEKEROV et al. 10. 11. 12. 13. 14. 15. 16. 17. M. Kitamura, Adv. Astron. Astrophys. 3, 27 (1964). E. Budding, Astrophys. Space Sci. 22, 87 (1973). V. Kurutac, Astrophys. Space Sci. 57, 71 (1978). M. Mezzetti, F. Predolin, G. Giuricin, and F. Mardirossian, Astron. Astrophys., Suppl. Ser. 42, 15 (1980). A. M. Shulberg and V. P. Murnikova, Var. Stars 19, 421 (1974). O. Demircan, Astrophys. Space Sci. 56, 389 (1978). A. M. Cherepashchuk, A. V. Goncharskii, and A. G. Yagola, Astron. Zh. 44, 1239 (1967) [Sov. Astron. 11, 990 (1967)]. C. H. Lacy, Astrophys. J. 251, 591 (1981).

3. A. V. Goncharskii, S. Yu. Romanov, and A. M. Cherepashchuk, Finite-Parameter Inverse Problems in Astrophysics (Mosk. Gos. Univ., Moscow, 1991) [in Russian]. 4. A. M. Cherepashchuk, Astron. Zh. 70, 1157 (1993) [Astron. Rep. 37, 585 (1993)]. 5. S.S. Wilks, Mathematical Statistics (Wiley, 1962; Nauka, Moscow, 1967). 6. A. V. Goncharskii, A. M. Cherepashchuk, and A. G. Yagola, Ill-Posed Problems in Astrophysics (Nauka, Moscow, 1985), p. 53 [in Russian]. 7. B. M. Shchigolev, Mathematical Reduction of Observations (Fizmatgiz, Moscow, 1962) [in Russian]. 8. G. E. Kron, Astrophys. J. 96, 173 (1942). 9. D. M. Popper, Astron. J. 89, 132 (1984).

Translated by K. Maslennikov

ASTRONOMY REPORTS Vol. 52 No. 2

2008