Document retrieved from a search engine cache. Address of the original document: http://www.ipa.nw.ru/PAGE/aspirantura/literatura/kchap4.pdf
Date of modification: Fri Feb 19 18:24:39 2016
Date of indexing: Sun Apr 10 01:17:43 2016
Tetsuo Sasao and André B. Fletcher
Introduction to VLBI Systems
Chapter 4
Lecture Notes for KVN Students
Partly based on Ajou University Lecture Notes (to be further edited)
Version 1. (Unfinished.) Issued on February 19, 2006.

Very Long Baseline Interferometry
Contents
1 Technologies Which Made VLBI Possible
  1.1 Basics of Digital Data Processing
    1.1.1 Analog Processing Versus Digital Processing
    1.1.2 Sampling and Clipping
    1.1.3 Discrete-Time Random Process
    1.1.4 Stationary Discrete-Time Random Process
    1.1.5 Sampling
    1.1.6 Comb Function
    1.1.7 Fourier Transform of a Comb Function Is a Comb Function
    1.1.8 Spectra of Discrete-Time Processes
    1.1.9 Spectra of Sampled Data
    1.1.10 Inverse Relations for Spectra of Sampled Data
    1.1.11 Sampling Theorem
    1.1.12 Optimum Sampling Interval
    1.1.13 Sampling Function
    1.1.14 Correlations of Nyquist Sampled Data with Rectangular Passband Spectra
    1.1.15 S/N Ratio of Correlator Output of Sampled Data
    1.1.16 Nyquist Theorem and Nyquist Interval
    1.1.17 Higher-Order Sampling in VLBI Receiving Systems
    1.1.18 Clipping (or Quantization) of Analog Data
    1.1.19 Probability Distribution of Clipped Data
    1.1.20 Cross-Correlation of 1-Bit Quantized Data: van Vleck Relationship
    1.1.21 van Vleck Relationship in Autocorrelation
    1.1.22 Spectra of Clipped Data
    1.1.23 Price's Theorem
    1.1.24 Cross-Correlation of the 2-Bit Quantized Data
    1.1.25 Cross-Correlation Coefficient of the 2-Bit Quantized Data
    1.1.26 Power and Cross-Power Spectra of 2-Bit Quantized Data
    1.1.27 Dispersion of Digital Correlator Output
    1.1.28 S/N Ratio of Digital Correlator Output
    1.1.29 Optimum Parameters v0 and n in the 2-Bit Quantization
    1.1.30 Effect of Oversampling in S/N Ratio of Clipped Data
    1.1.31 Coherence Factor and Sensitivity with Given Bit Rate
  1.2 Frequency Standard
    1.2.1 VLBI Requires "Absolute" Frequency Stability
    1.2.2 How to Describe Frequency Stability?
    1.2.3 Types of Phase and Frequency Noises
    1.2.4 Time Domain Measurements
    1.2.5 "True Variance" and "Allan Variance" of Fractional Frequency Deviation
    1.2.6 True Variance and Allan Variance through Power Spectrum of Fractional Frequency Deviation
    1.2.7 Time-Interval Dependence of Allan Variance
    1.2.8 True Variance and Allan Variance through Autocorrelation of Phase Noise
    1.2.9 Coherence Function
    1.2.10 Approximate Estimation of Coherence Time
    1.2.11 Estimation of Time-Averaged Phase Noise
  1.3 Time Synchronization
  1.4 Recording System
2 Overview of the VLBI System
  2.1 MK-3 As a Prototype of Modern VLBI Systems
    2.1.1 Dual-Frequency Reception
    2.1.2 First Frequency Conversion
    2.1.3 Transmission to Control Building
    2.1.4 Intermediate Frequency Distributer
    2.1.5 Baseband Conversion
    2.1.6 Formatter
    2.1.7 Data Recorder
    2.1.8 Phase and Delay Calibration
    2.1.9 Hydrogen Maser Frequency Standard
    2.1.10 Automated Operation by Control Computer
  2.2 Modern VLBI Systems
    2.2.1 New Recording and Fiber-Link Systems
    2.2.2 Digital Baseband Converters
    2.2.3 e-VLBI
    2.2.4 VLBI Standard Interface (VSI)
3 Difficulties in Ground-Based VLBI
4 Correlation Processing in VLBI
5 Observables of VLBI


1 Technologies Which Made VLBI Possible

[Figure 1 shows, for each of two stations: a Low Noise Amplifier with Local Oscillator, an IF Frequency Distributer, a Video Converter, a Formatter, and a Data Recorder, together with a Phase and Delay Calibrator and an H-Maser Frequency Standard; the recorded data streams from the stations meet in a Correlation Processor, all built on Digital Data Processing.]

Figure 1: Schematic view of the three major technologies, and the digital data processing, which realized VLBI.

Three major technologies enabled the realization of radio interferometers with very long baselines, exceeding thousands of kilometers:

1. high-stability frequency standards,
2. high-accuracy time synchronization, and
3. high-speed, high-density recording, or, nowadays, super-wideband data transmission.

In addition, all signal processing essential to VLBI, such as time marking, recording, delay tracking, fringe stopping, and correlation processing, is done digitally in modern VLBI systems. In this sense, the rapid progress of digital technology in recent decades has formed the foundation of VLBI, as illustrated in Figure 1.



Therefore, we will first examine the theoretical basis of digital signal processing, to the extent necessary to understand the principles and roles of the digital circuitry used in VLBI. Then we will look at the basic elements of the major technologies mentioned above.

1.1 Basics of Digital Data Processing

1.1.1 Analog Processing Versus Digital Processing

For VLBI, digital processing is much better suited than analog processing, as is evident from the following comparison.

Analog delay circuit: A coaxial cable can be used as a delay cable (Figure 2), if the delay is smaller than about 1 µsec (roughly 300 m of cable). For larger delays, other means, such as sonic waves, must be used. Almost impossible to use for an intercontinental baseline. Delay value is unstable against environmental change, with larger error for larger delay. Frequency characteristics change with mechanical connection and disconnection of cables.

Digital delay circuit: A ring buffer composed of a large RAM and shift registers can be used for digital delay tracking (Figure 2). Quite stable and highly repeatable under variable environmental conditions. Does not need any special tuning. Accuracy of operation is determined almost solely by the accuracy of the clock time pulses. Easily covers large intercontinental delays. Delay is tracked only discretely, with some loss of signal power.

Analog correlation processing: Multiplication and averaging with analog devices. Operational range is limited by device characteristics. Affected by the environment.

Digital correlation processing: Multiplication and averaging with logic devices and counters. Stable operation and high repeatability. Wide dynamic range.

Analog spectrometer: Narrow-band analog BPFs with square-law detectors (a "filterbank spectrometer"). Difficult to adjust gains and frequency characteristics of the BPFs (frequency channels). Affected by the environment. Frequency resolution is fixed by the passbands of the BPFs.

Digital spectrometer: Composed of logic devices, shift registers, and counters. Either correlation followed by Fourier transformation (XF type), or Fourier transformation followed by correlation (FX type). Stable operation. Identical characteristics of frequency channels. Frequency resolution can be made variable.
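The digital delay tracking described above can be sketched in code. The following is a minimal illustration (the class and buffer size are our own, not from the text): samples are written at a running write address, and the delayed output is read at an address offset by the desired delay, expressed in whole sampling intervals, which is why the delay is tracked only discretely.

```python
# Sketch of a ring-buffer delay line: write at one address, read at an
# address offset by the delay (in units of the sampling interval).

class RingBufferDelay:
    def __init__(self, size):
        self.buf = [0] * size      # plays the role of the large RAM
        self.size = size
        self.write_addr = 0

    def process(self, sample, delay_samples):
        """Store the newest sample; return the sample delayed by
        `delay_samples` sampling intervals (0 <= delay_samples < size)."""
        self.buf[self.write_addr] = sample
        read_addr = (self.write_addr - delay_samples) % self.size
        out = self.buf[read_addr]
        self.write_addr = (self.write_addr + 1) % self.size
        return out

delay_line = RingBufferDelay(8)
outputs = [delay_line.process(s, 2) for s in [1, 2, 3, 4, 5]]
print(outputs)  # [0, 0, 1, 2, 3]: the input shifted by 2 samples
```

The accuracy of such a circuit indeed depends only on the clock that advances the addresses, which is why it is stable against environmental change.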



Figure 2: Examples of an analog delay circuit using delay cables (left) and a digital delay circuit using a ring buffer (right).

1.1.2 Sampling and Clipping

Two important achievements in the theory of digital data processing were vital for VLBI: the sampling theorem by Shannon (1949), and the clipping theorem by van Vleck and Middleton (1966; the original work was done by van Vleck during World War II). VLBI data are sampled, and clipped (or digitized) with 1-bit or 2-bit quantization (Figure 3). In what follows, we will see how the information in the analog data is essentially restored from the sampled and clipped data. We will also consider the loss of information which accompanies digital data processing.

1.1.3 Discrete-Time Random Process

A discrete sequence of variables x[1], x[2], x[3], ..., x[i], ... is called a "random sequence", or a "discrete-time random process", if x[i] at any i is a random variable, i.e., may vary from trial to trial (Figure 4). This is a "discrete version" of the random process varying continuously in time (henceforth the "continuous-time random process"), which we saw in Chapter 3. We introduce the following statistical concepts for the discrete-time random process.

· The expectation μ[i] of a discrete-time random process x[i] (i = 1, 2, 3, ..., n, ...) is defined by the equation

$$\mu[i] = \langle x[i] \rangle, \qquad (1)$$

where ⟨ ⟩ stands for the ensemble average defined by the joint probability distribution of the random variables x[i] (i = 1, 2, 3, ..., n, ...).


Figure 3: Analog-to-digital (A/D) conversion through sampling, clipping, and bit representation. The figure shows a case of 1-bit quantization: the analog signal is sampled at regular intervals, each sample is clipped to ±1 according to its sign, and the result is sent to the digital circuit as a bit stream (e.g., 0011111110000000000111).



Figure 4: A discrete-time random process is a sequence x[1], x[2], x[3], ..., x[i], ..., whose value at any i is a random variable (illustrated by several trials plotted against i).

· The autocorrelation R[m, n] of the discrete-time random process x[i] is defined by the equation

$$R[m, n] = \langle x[m]\, x^{*}[n] \rangle, \qquad (2)$$

where the symbol * stands for the complex conjugate.

A discrete-time random process x[i] is called "white noise" if its autocorrelation satisfies

$$R[m, n] = \langle\, |x[m]|^{2}\, \rangle\; \delta_{mn}, \qquad (3)$$

where $\delta_{mn}$ is Kronecker's delta symbol:

$$\delta_{mn} = \begin{cases} 1 & (m = n) \\ 0 & (m \neq n) \end{cases}. \qquad (4)$$

· The cross-correlation of two discrete-time random processes

x[1], x[2], x[3], ..., x[n], ...
y[1], y[2], y[3], ..., y[n], ...

is defined by

$$R_{xy}[m, n] = \langle x[m]\, y^{*}[n] \rangle. \qquad (5)$$


1.1.4 Stationary Discrete-Time Random Process

The concept of a stationary random process, which we introduced in Chapter 3 for continuous-time random processes, carries over to the discrete-time random process in the following way (see, for example, Papoulis, 1984).

· Stationary process. A discrete-time random process x[i] is called "stationary" if the expectation

$$\mu[i] = \langle x[i] \rangle = \mu \qquad (6)$$

is a constant independent of i, and if the autocorrelation

$$R[n + m, n] = \langle x[n + m]\, x^{*}[n] \rangle = R[m] \qquad (7)$$

depends only on the difference m of the arguments. In particular, a stationary discrete-time random process is called "white noise" if we have

$$R[m] = R[0]\, \delta_{m0}. \qquad (8)$$

· Jointly stationary processes. Two discrete-time random processes x[i] and y[j] are called "jointly stationary" if they are both stationary, and if their cross-correlation

$$R_{xy}[n + m, n] = \langle x[n + m]\, y^{*}[n] \rangle = R_{xy}[m] \qquad (9)$$

depends only on the difference m of the arguments.

Similarly to the continuous-time case, we introduce

· the correlation coefficient of a zero-mean stationary discrete-time process x[i]:

$$r[m] = \frac{R[m]}{R[0]}, \qquad (10)$$

and

· the cross-correlation coefficient of zero-mean jointly stationary discrete-time processes x[i] and y[j]:

$$r_{xy}[m] = \frac{R_{xy}[m]}{\sqrt{R_{xx}[0]\, R_{yy}[0]}}, \qquad (11)$$

where the autocovariance is just equal to the autocorrelation, and the cross-covariance is equal to the cross-correlation, since we assumed zero-mean processes (i.e., the expectations are equal to zero).
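The correlation coefficient r[m] = R[m]/R[0] of equation (10) can be estimated from data. The following sketch is illustrative only (not from the text): it replaces the ensemble average by a time average over one long realization, which presumes ergodicity, and checks the white-noise property (8), r[m] ≈ 0 for m ≠ 0.

```python
# Estimate r[m] = R[m]/R[0] for a zero-mean real sequence by time averaging.

import random

def autocorr_coeff(x, m):
    """Sample estimate of the correlation coefficient r[m]."""
    n = len(x) - m
    R_m = sum(x[i + m] * x[i] for i in range(n)) / n   # estimate of R[m]
    R_0 = sum(v * v for v in x) / len(x)               # estimate of R[0]
    return R_m / R_0

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(20000)]  # a white-noise realization

print(autocorr_coeff(x, 0))        # exactly 1 by construction
print(abs(autocorr_coeff(x, 5)) < 0.05)  # True: white noise decorrelates
```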


1.1.5 Sampling

We call "sampling" the operation which forms a discrete-time process by periodically picking values of a continuous-time process at a fixed interval of time (the "sampling interval"). The discrete-time process thus created is called the "time-sample" of the original continuous-time process. If a discrete-time random process x[n] is a time-sample of a continuous-time random process x(t) with sampling interval T, i.e., if

$$x[n] = x(nT), \qquad (12)$$

then the statistical properties of x[n] are determined by the statistical properties (i.e., by the probability distribution) of x(t).

· Expectation and autocorrelation of a random time-sample. If we denote the expectation and autocorrelation of a continuous-time random process x(t) as μ(t) and R(t₁, t₂), respectively, then the expectation and autocorrelation of a random time-sample x[i] = x(iT) are given by

$$\mu[n] = \mu(nT), \qquad (13)$$

and

$$R[m, n] = R(mT, nT), \qquad (14)$$

respectively.

Proof:
1. μ[n] = ⟨x[n]⟩ = ⟨x(nT)⟩ = μ(nT).
2. R[m, n] = ⟨x[m] x*[n]⟩ = ⟨x(mT) x*(nT)⟩ = R(mT, nT).

· Cross-correlation of random time-samples. If we denote the cross-correlation of continuous-time random processes x(t) and y(t) as R_xy(t₁, t₂), then the cross-correlation of the random time-samples x[i] = x(iT) and y[i] = y(iT) is given by

$$R_{xy}[m, n] = R_{xy}(mT, nT). \qquad (15)$$

Proof: R_xy[m, n] = ⟨x[m] y*[n]⟩ = ⟨x(mT) y*(nT)⟩ = R_xy(mT, nT).



· Stationary random time-sample. If a continuous-time random process x(t) is a stationary random process with constant expectation ⟨x(t)⟩ = μ and autocorrelation ⟨x(t + τ) x*(t)⟩ = R(τ), then a time-sample x[n] = x(nT) is a stationary discrete-time random process with expectation

$$\mu[n] = \mu, \qquad (16)$$

and autocorrelation

$$R[n + m, n] = R[m] = R(mT). \qquad (17)$$

Proof:
1. The expectation μ[n] of the time-sample x[n],
μ[n] = ⟨x[n]⟩ = ⟨x(nT)⟩ = μ,
is a constant independent of the argument n.
2. The autocorrelation R[n + m, n] of the time-sample x[n],
R[n + m, n] = ⟨x[n + m] x*[n]⟩ = ⟨x(mT + nT) x*(nT)⟩ = R(mT) = R[m],
depends only on the difference m of the arguments.

· Jointly stationary random time-samples. If x(t) and y(t) are jointly stationary continuous-time random processes with cross-correlation ⟨x(t + τ) y*(t)⟩ = R_xy(τ), then their time-samples x[n] = x(nT) and y[n] = y(nT) are jointly stationary discrete-time random processes with cross-correlation

$$R_{xy}[n + m, n] = R_{xy}[m] = R_{xy}(mT). \qquad (18)$$

Proof:
1. The time-samples x[n] and y[n] are both stationary discrete-time random processes, as we saw above.
2. Their cross-correlation
R_xy[n + m, n] = ⟨x[n + m] y*[n]⟩ = ⟨x(mT + nT) y*(nT)⟩ = R_xy(mT) = R_xy[m],
depends only on the difference m of the arguments.


1.1.6 Comb Function

An infinite number of delta functions arranged with equal spacing along the horizontal axis, as shown in Figure 5, is called a "comb function". Thus a comb function Ш(t; T) with period T is given in terms of the delta function δ(t) by the equation

$$Ш(t;\, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT), \qquad (19)$$

where k is an integer.

Figure 5: A comb function.

An alternative expression of the comb function is known in Fourier series form, as shown below.

1. Let us expand the comb function in a Fourier series within one period, $-T/2 < t \leq T/2$:

$$\sum_{k=-\infty}^{\infty} \delta(t - kT) = \sum_{n=-\infty}^{\infty} a_n\, e^{i \frac{2\pi n}{T} t}, \quad \text{for } -\frac{T}{2} < t \leq \frac{T}{2}, \qquad (20)$$

where n is an integer.

2. Following the standard procedure of Fourier series expansion, we calculate the n-th Fourier coefficient $a_n$ by multiplying both sides of equation (20) by $e^{-i \frac{2\pi m}{T} t}$, where m is an arbitrary integer, and integrating over $-T/2 < t \leq T/2$.

· LHS:

$$\int_{-T/2}^{T/2} \sum_{k=-\infty}^{\infty} \delta(t - kT)\, e^{-i \frac{2\pi m}{T} t}\, dt = \int_{-T/2}^{T/2} \delta(t)\, e^{-i \frac{2\pi m}{T} t}\, dt = 1,$$

since t can be equal to kT (t = kT) only when k = 0 within the range $-T/2 < t \leq T/2$.

· RHS:

$$\sum_{n=-\infty}^{\infty} a_n \int_{-T/2}^{T/2} e^{i \frac{2\pi (n-m)}{T} t}\, dt = a_m T,$$

since

$$\int_{-T/2}^{T/2} e^{i \frac{2\pi (n-m)}{T} t}\, dt = \begin{cases} 0 & \text{if } m \neq n \\ T & \text{if } m = n \end{cases}.$$

3. Equating both sides, we have $a_n = 1/T$ for any n, and hence

$$\sum_{k=-\infty}^{\infty} \delta(t - kT) = \frac{1}{T} \sum_{n=-\infty}^{\infty} e^{i \frac{2\pi n}{T} t}, \quad \text{for } -\frac{T}{2} < t \leq \frac{T}{2}. \qquad (21)$$

4. Although we derived this equality in the limited range $-T/2 < t \leq T/2$, it actually holds for a wider range of t. In fact, the functions on both sides of equation (21) do not change if we substitute t with t + mT for an arbitrary integer m. This means that they are both periodic functions with period T. Therefore, equation (21) holds for the whole range of t, i.e., $-\infty < t \leq \infty$. Thus we have the general relation

$$\sum_{k=-\infty}^{\infty} \delta(t - kT) = \frac{1}{T} \sum_{n=-\infty}^{\infty} e^{i \frac{2\pi n}{T} t}, \qquad (22)$$

which holds for any t, and therefore

$$Ш(t;\, T) = \frac{1}{T} \sum_{n=-\infty}^{\infty} e^{i \frac{2\pi n}{T} t} \qquad (23)$$

is the alternative expression of the comb function.

1.1.7 Fourier Transform of a Comb Function Is a Comb Function

The Fourier transform $\tilde{Ш}(\omega;\, T)$ of a comb function Ш(t; T) of argument t with period T is a comb function of argument ω with period $2\pi/T$ (Figure 6).

Figure 6: The Fourier transform of a comb function of t with period T is a comb function of ω with period $\omega_0 = 2\pi/T$.

Proof: According to the general formula of the Fourier transformation, we have

$$\tilde{Ш}(\omega;\, T) = \int_{-\infty}^{\infty} Ш(t;\, T)\, e^{-i\omega t}\, dt = \int_{-\infty}^{\infty} \frac{1}{T} \sum_{n=-\infty}^{\infty} e^{i \frac{2\pi n}{T} t}\, e^{-i\omega t}\, dt$$

$$= \frac{1}{T} \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-i\left(\omega - \frac{2\pi n}{T}\right) t}\, dt = \frac{2\pi}{T} \sum_{n=-\infty}^{\infty} \delta\!\left(\omega - \frac{2\pi n}{T}\right) = \omega_0 \sum_{n=-\infty}^{\infty} \delta(\omega - n\,\omega_0), \qquad (24)$$

where we introduced the notation

$$\omega_0 = \frac{2\pi}{T},$$

and used the general formula of the delta function:

$$\int_{-\infty}^{\infty} e^{-i\omega t}\, dt = 2\pi\, \delta(\omega).$$

The RHS of equation (24) is nothing but a comb function of ω with period $\omega_0 = 2\pi/T$:

$$\omega_0 \sum_{n=-\infty}^{\infty} \delta(\omega - n\,\omega_0) = \omega_0\, Ш(\omega;\, \omega_0). \qquad (25)$$

Thus, $Ш(t;\, T) \leftrightarrow \omega_0\, Ш(\omega;\, \omega_0)$, where the symbol ↔ implies a Fourier transform pair.

1.1.8 Spectra of Discrete-Time Processes

We introduce the following definitions.

· Power spectrum. The power spectrum $S_D(\omega)$ of a stationary random discrete-time process x[n] with autocorrelation R[m] is given by a discrete Fourier transform with an arbitrary parameter T′ (Papoulis, 1984):

$$S_D(\omega) = \sum_{m=-\infty}^{\infty} R[m]\, e^{-i m \omega T'}. \qquad (26)$$

· Cross-power spectrum. The cross-power spectrum $S_{Dxy}(\omega)$ of jointly stationary discrete-time processes x[n] and y[n] with cross-correlation $R_{xy}[m]$ is given by a discrete Fourier transform with an arbitrary parameter T′:

$$S_{Dxy}(\omega) = \sum_{m=-\infty}^{\infty} R_{xy}[m]\, e^{-i m \omega T'}. \qquad (27)$$

These spectra, as defined by discrete Fourier transforms with an arbitrary parameter T′ in equations (26) and (27), are periodic functions of ω with period $2\pi/T'$. They have the same forms as Fourier series, with Fourier coefficients R[m] and $R_{xy}[m]$, respectively. The spectra $S_D(\omega)$ and $S_{Dxy}(\omega)$ are, in general, dependent on the arbitrary parameter T′. Later, for the particular case of sampled discrete-time processes (time-samples), we will choose T′ to be equal to their sampling interval. We will then be able to establish a relationship between the spectrum of a time-sample and the spectrum of its original continuous-time process.

· Inverse relations. The autocorrelation R[m] and cross-correlation $R_{xy}[m]$ of jointly stationary random discrete-time processes in equations (26) and (27) are given through the power spectrum $S_D(\omega)$ and the cross-power spectrum $S_{Dxy}(\omega)$ by the inverse relations:

$$R[m] = \frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_D(\omega)\, e^{i m \omega T'}\, d\omega, \qquad (28)$$

$$R_{xy}[m] = \frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_{Dxy}(\omega)\, e^{i m \omega T'}\, d\omega, \qquad (29)$$

which are nothing but the formulae for Fourier coefficients in the series expansion.
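The Fourier-series pair (26) and (28) can be verified numerically. The following sketch is illustrative only (the parameter value and the autocorrelation sequence are assumptions, not from the text): it builds $S_D(\omega)$ from a short autocorrelation sequence and recovers R[m] by integrating over one period $-\pi/T' < \omega \leq \pi/T'$.

```python
# Numerical check of the inverse relation (28) for a toy R[m].

import cmath
import math

Tp = 0.5                                             # the arbitrary parameter T'
R = {-2: 0.25, -1: 0.5, 0: 1.0, 1: 0.5, 2: 0.25}     # assumed R[m] (R[-m] = R[m])

def S_D(w):
    """Equation (26): discrete Fourier transform of R[m]."""
    return sum(R[m] * cmath.exp(-1j * m * w * Tp) for m in R)

def inverse(m, n_grid=4096):
    """Equation (28): midpoint-rule integral over one period of S_D."""
    w_lo, w_hi = -math.pi / Tp, math.pi / Tp
    dw = (w_hi - w_lo) / n_grid
    total = 0j
    for k in range(n_grid):
        w = w_lo + (k + 0.5) * dw
        total += S_D(w) * cmath.exp(1j * m * w * Tp) * dw
    return (Tp / (2 * math.pi)) * total

print(abs(inverse(1) - R[1]) < 1e-9)  # True: R[1] is recovered
```

The midpoint rule is essentially exact here because the integrand is a trigonometric polynomial over an integer number of periods.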


Proof: We prove the inverse relation for the power spectrum $S_D(\omega)$ given in equation (28) only, since the proof for the cross-power spectrum $S_{Dxy}(\omega)$ (equation (29)) proceeds in just the same way.

1. If $S_D(\omega) = \sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T'}$, then $R[m] = \frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_D(\omega)\, e^{i m \omega T'}\, d\omega$. In fact,

$$\frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_D(\omega)\, e^{i m \omega T'}\, d\omega = \frac{T'}{2\pi} \sum_{n=-\infty}^{\infty} R[n] \int_{-\pi/T'}^{\pi/T'} e^{i (m-n) \omega T'}\, d\omega = R[m],$$

since

$$\int_{-\pi/T'}^{\pi/T'} e^{i (m-n) \omega T'}\, d\omega = \begin{cases} \dfrac{2\pi}{T'} & \text{if } n = m, \\[4pt] 0 & \text{otherwise.} \end{cases}$$

2. If $R[m] = \frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_D(\omega)\, e^{i m \omega T'}\, d\omega$, then $S_D(\omega) = \sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T'}$. We first prove this statement for a limited range of ω confined within the interval $-\pi/T' < \omega \leq \pi/T'$. Inserting the first equation into the RHS of the second, we have

$$\sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T'} = \frac{T'}{2\pi} \int_{-\pi/T'}^{\pi/T'} S_D(\omega') \sum_{n=-\infty}^{\infty} e^{i n (\omega' - \omega) T'}\, d\omega'.$$

Note here that $\sum_n e^{i n (\omega' - \omega) T'}$ is a comb function as given in equation (23), since, introducing the notation $\omega_0' = 2\pi/T'$, we have

$$\sum_{n=-\infty}^{\infty} e^{i n (\omega' - \omega) T'} = \sum_{n=-\infty}^{\infty} e^{i \frac{2\pi n}{\omega_0'} (\omega' - \omega)} = \omega_0'\, Ш(\omega' - \omega;\, \omega_0') = \frac{2\pi}{T'} \sum_{k=-\infty}^{\infty} \delta\!\left(\omega' - \omega - k\,\frac{2\pi}{T'}\right). \qquad (30)$$

Therefore, we obtain

$$\sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T'} = \int_{-\pi/T'}^{\pi/T'} S_D(\omega') \sum_{k=-\infty}^{\infty} \delta\!\left(\omega' - \omega - k\,\frac{2\pi}{T'}\right) d\omega' = S_D(\omega),$$

since the delta function in the integrand takes a non-zero value only when

$$\omega' = \omega + k\,\frac{2\pi}{T'},$$

and, provided that ω is confined within the interval $-\pi/T' < \omega \leq \pi/T'$, this condition holds within the integration range only when k = 0. Now, if we extend the function $S_D(\omega)$ to a periodic function with period $2\pi/T'$, beyond the initially imposed interval $-\pi/T' < \omega \leq \pi/T'$, we have

$$\sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T'} = S_D(\omega),$$

for any range of ω.

1.1.9 Spectra of Sampled Data

Let us consider discrete-time processes x[n] and y[n], which are time-samples obtained by sampling jointly stationary continuous-time random processes x(t) and y(t) with a sampling interval T: x[n] = x(nT) and y[n] = y(nT). Let the autocorrelation of x[n], and the cross-correlation of x[n] and y[n], be R[m] and $R_{xy}[m]$, respectively. They satisfy R[m] = R(mT) and $R_{xy}[m] = R_{xy}(mT)$, in view of equations (17) and (18).

If, in the power spectrum $S_D(\omega)$ and cross-power spectrum $S_{Dxy}(\omega)$ of the discrete-time processes x[n] and y[n], as defined in equations (26) and (27), we choose the arbitrary parameter T′ to be equal to the sampling interval T, i.e.,

$$T' = T, \qquad (31)$$

then $S_D(\omega)$ and $S_{Dxy}(\omega)$ are related to the power spectrum S(ω) and cross-power spectrum $S_{xy}(\omega)$ of the original continuous-time processes x(t) and y(t) by the equations:

$$S_D(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} S(\omega + k\,\omega_0), \qquad (32)$$

$$S_{Dxy}(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} S_{xy}(\omega + k\,\omega_0), \qquad (33)$$

where $\omega_0 = 2\pi/T$.

Proof: We prove equation (32) for the power spectrum $S_D(\omega)$ only, since the proof of equation (33) for the cross-power spectrum $S_{Dxy}(\omega)$ proceeds in just the same way. According to equations (17), (26), and (31), the power spectrum of the time-sample x[n] = x(nT) is given by

$$S_D(\omega) = \sum_{n=-\infty}^{\infty} R[n]\, e^{-i n \omega T} = \sum_{n=-\infty}^{\infty} R(nT)\, e^{-i n \omega T},$$

where T is the sampling interval. Describing the autocorrelation R(τ) of the continuous-time stationary random process x(t) in terms of the power spectrum S(ω) through the inverse Fourier transformation,

$$R(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, e^{i \omega \tau}\, d\omega,$$

we have

$$S_D(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega') \sum_{n=-\infty}^{\infty} e^{i n (\omega' - \omega) T}\, d\omega'.$$

According to equation (30), $\sum_n e^{i n (\omega' - \omega) T}$ is a comb function:

$$\sum_{n=-\infty}^{\infty} e^{i n (\omega' - \omega) T} = \frac{2\pi}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(\omega' - \omega - k\,\frac{2\pi}{T}\right).$$

Therefore,

$$S_D(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} S(\omega')\, \delta\!\left(\omega' - \omega - k\,\frac{2\pi}{T}\right) d\omega' = \frac{1}{T} \sum_{k=-\infty}^{\infty} S\!\left(\omega + k\,\frac{2\pi}{T}\right) = \frac{1}{T} \sum_{k=-\infty}^{\infty} S(\omega + k\,\omega_0),$$

where $\omega_0 = 2\pi/T$.

1.1.10 Inverse Relations for Spectra of Sampled Data

The inverse relation for the power spectrum $S_D(\omega)$ of a discrete-time stationary random process x[n],

$$R[m] = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} S_D(\omega)\, e^{i m \omega T}\, d\omega,$$

as given in equation (28), must yield an autocorrelation which satisfies R[m] = R(mT) if the process x[n] is a time-sample x[n] = x(nT) of a continuous-time stationary random process x(t).

Proof: Substituting equation (32) into the inverse relation, and shifting the integration variable by $k\,2\pi/T$ in each term, we obtain

$$R[m] = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{-\pi/T}^{\pi/T} S\!\left(\omega + k\,\frac{2\pi}{T}\right) e^{i m \omega T}\, d\omega = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{-\pi/T + k\frac{2\pi}{T}}^{\pi/T + k\frac{2\pi}{T}} S(\omega)\, e^{i m \left(\omega - k\frac{2\pi}{T}\right) T}\, d\omega$$

$$= \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{-\pi/T + k\frac{2\pi}{T}}^{\pi/T + k\frac{2\pi}{T}} S(\omega)\, e^{i (m \omega T - 2\pi k m)}\, d\omega = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{-\pi/T + k\frac{2\pi}{T}}^{\pi/T + k\frac{2\pi}{T}} S(\omega)\, e^{i m \omega T}\, d\omega$$

$$= \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega)\, e^{i m \omega T}\, d\omega = R(mT).$$

Similarly, we confirm that the inverse relation

$$R_{xy}[m] = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} S_{Dxy}(\omega)\, e^{i m \omega T}\, d\omega,$$

in equation (29), gives the cross-correlation of the time-samples x[n] and y[n]: $R_{xy}[m] = R_{xy}(mT)$.

1.1.11 Sampling Theorem

Shannon (1949) gave a beautiful proof of the sampling theorem, which he formulated as follows: "If a function f(t) contains no frequencies higher than B cps, it is completely determined by giving its ordinates at a series of points spaced 1/2B seconds apart." A mathematical proof showing that "this is not only approximately, but exactly, true" was given as follows.

"Let F(ω) be the spectrum of f(t). Then

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{i \omega t}\, d\omega = \frac{1}{2\pi} \int_{-2\pi B}^{2\pi B} F(\omega)\, e^{i \omega t}\, d\omega,$$

since F(ω) is assumed zero outside the band B. If we let

$$t = \frac{n}{2B},$$

where n is any positive or negative integer, we obtain

$$f\!\left(\frac{n}{2B}\right) = \frac{1}{2\pi} \int_{-2\pi B}^{2\pi B} F(\omega)\, e^{i \omega \frac{n}{2B}}\, d\omega.$$

On the left are the values of f(t) at the sampling points. The integral on the right will be recognized as essentially the n-th coefficient in a Fourier-series expansion of the function F(ω), taking the interval -B to +B as a fundamental period. This means that the values of the samples f(n/2B) determine the Fourier coefficients in the series expansion of F(ω). Thus they determine F(ω), since F(ω) is zero for frequencies greater than B, and for lower frequencies F(ω) is determined if its Fourier coefficients are determined. But F(ω) determines the original function f(t) completely, since a function is determined if its spectrum is known. Therefore the original samples determine the function f(t) completely."

Shannon (1949) mentioned that Nyquist had pointed out the fundamental importance of the time interval 1/2B seconds in connection with telegraphy, and proposed to call this the "Nyquist interval" corresponding to the band B. Nowadays, we formulate the sampling theorem in a slightly wider form (Figure 7).
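The core of Shannon's band-limit condition can be illustrated numerically. The sketch below is our own illustration (the rates and frequencies are arbitrary choices, not from the text): a sinusoid sampled below its Nyquist rate becomes indistinguishable from a lower-frequency "alias".

```python
# A tone at f and a tone at f + fs produce identical samples at rate fs,
# so sampling below the Nyquist rate cannot preserve the information.

import math

def sample_sine(f, fs, n_samples):
    """Sample sin(2*pi*f*t) at the rate fs."""
    return [math.sin(2 * math.pi * f * k / fs) for k in range(n_samples)]

fs = 8.0                           # sampling rate (arbitrary units)
ok = sample_sine(1.0, fs, 16)      # 1 Hz: well below the Nyquist frequency fs/2 = 4
alias = sample_sine(9.0, fs, 16)   # 9 Hz = 1 Hz + fs: undersampled

same = all(abs(a - b) < 1e-9 for a, b in zip(ok, alias))
print(same)  # True: the two tones are indistinguishable after sampling
```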
Figure 7: Sampling theorem: an analog continuous-time signal with a passband spectrum confined within $nB \leq \nu < (n+1)B$ is fully represented by samples taken at intervals of 1/(2B).

Sampling Theorem: All the information in an analog continuous-time signal with a passband spectrum limited within a frequency range $nB \leq \nu < (n + 1)B$, where B is the bandwidth and $n \geq 0$ is an integer, can be preserved, provided that the signal is sampled at the Nyquist interval 1/(2B).

Here we assume a real process with an even or Hermitian symmetric spectrum with respect to frequency. Thus, "spectrum" here implies the positive frequency part of the spectrum. The sampling frequency 2B, corresponding to the Nyquist interval 1/(2B), is called the "Nyquist rate".
The proof of the above theorem is given by equations (32) and (33):

$$S_D(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} S\!\left(\omega + k\,\frac{2\pi}{T}\right), \qquad S_{Dxy}(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} S_{xy}\!\left(\omega + k\,\frac{2\pi}{T}\right),$$

as illustrated in Figure 8. In fact, equations (32) and (33), and Figure 8, show that

· if the positive frequency part of the analog continuous-time spectrum S(ω) is confined within a passband $nB \leq \nu < (n + 1)B$, where $n \geq 0$ is an integer, B is the bandwidth, and ν is the frequency, and

· if the sampling interval T is equal to the Nyquist interval, T = 1/(2B),

then the analog continuous-time spectrum S(ω) is completely preserved in the spectrum $S_D(\omega)$ of the sampled data (first and second panels from the top of Figure 8). Therefore, all the information in the original analog continuous-time signal is preserved in the sampled data. This proves the sampling theorem. Note, however, that the spectrum $S_D(\omega)$ of the sampled data in the range $0 \leq \nu < B$ is inverted in frequency compared with the original analog continuous-time spectrum S(ω) if the integer n is odd (second panel from the top of Figure 8).

On the other hand,

· if the sampling interval T is larger than the Nyquist interval, i.e., T > 1/(2B), as shown in the third panel from the top of Figure 8, or,

· even though T = 1/(2B), if the analog continuous-time spectrum is confined within $aB \leq \nu < (a + 1)B$, where a is not an integer, as shown in the bottom panel of Figure 8,

then the feet of spectral components with different n's in equations (32) and (33) overlap each other (this is called "aliasing"). Therefore, the information in the analog continuous-time spectrum S(ω) is no longer preserved in the spectrum $S_D(\omega)$ of the sampled data.

1.1.12 Optimum Sampling Interval

In order to see that the Nyquist interval is the optimum interval for sampling, let us consider an analog continuous-time spectrum which is confined within a baseband $0 \leq \nu < B$. Here, the "baseband", otherwise called the "video-band", means a frequency band containing 0 Hz (or "DC", which means "direct current") as its lowest frequency, such as shown in Figure 9. This is the particular case n = 0 of the passband spectrum within $nB \leq \nu < (n + 1)B$. Also, we assume that the bandwidth B here corresponds to the actual extent of the spectrum: the spectrum is non-zero inside the interval $0 \leq \nu < B$, but zero outside. Then we can conceive of three cases, which are shown in Figure 10.

Figure 8: Four cases of the relation between the spectrum of analog continuous-time data and the spectrum of sampled data, as given by equations (32) and (33). Top: the analog continuous-time spectrum is confined within a passband $2mB \leq |\nu| < (2m + 1)B$ and the sampling interval T is equal to the Nyquist interval T = 1/(2B). Second from the top: the analog continuous-time spectrum is confined within a passband $(2m + 1)B \leq |\nu| < 2(m + 1)B$ and T = 1/(2B). Third from the top: T > 1/(2B). Bottom: the analog continuous-time spectrum is confined within a passband with boundaries at non-integer multiples of B, i.e., $aB \leq |\nu| < (a + 1)B$, and T = 1/(2B). Here ν is the frequency, B the bandwidth of the analog continuous-time spectrum, $m \geq 0$ an integer, and a a non-integer number. We assume a real process, and therefore an even or Hermitian symmetric spectrum with respect to frequency. All the information in the original analog continuous-time data is completely preserved in the sampled data in the first two cases, but part of the information is lost after sampling in the last two cases.

Figure 9: A band-limited baseband spectrum confined within a frequency range $0 \leq \nu < B$. Only the positive frequency part is shown.
Figure 10: Spectra of sampled data for oversampling, T < 1/(2B) (top), Nyquist sampling, T = 1/(2B) (middle), and undersampling, T > 1/(2B) (bottom).

1. Oversampling. If we sample an analog continuous-time signal with a sampling interval smaller than the Nyquist interval, T < 1/(2B), as shown in the top panel of Figure 10, we will have a larger number of data points per unit duration of time, but the information content is not improved at all compared with the Nyquist sampling case (sampling at the Nyquist interval) shown in the middle panel of Figure 10. Such sampling with an interval T < 1/(2B) is called "oversampling".

2. Undersampling. On the contrary, if the sampling interval is larger than the Nyquist interval, T > 1/(2B), part of the information in the original analog continuous-time signal is lost in the sampled data due to aliasing, as shown in the bottom panel of Figure 10. This case is called "undersampling".

3. Nyquist sampling. Therefore, Nyquist sampling, i.e., sampling at the Nyquist interval T = 1/(2B), is the optimum sampling for an analog continuous-time signal with a band-limited baseband spectrum (middle panel of Figure 10).

1.1.13 Sampling Function

We saw above that the autocorrelation R(τ) and cross-correlation Rxy(τ) of jointly stationary continuous-time random processes x(t) and y(t) with band-limited baseband spectra are completely restored from the autocorrelation R[n] = R(nT) and cross-correlation Rxy[n] = Rxy(nT) of the time samples x[n] = x(nT) and y[n] = y(nT) of the processes, provided that the sampling interval T is shorter than or equal to the Nyquist interval, i.e. T ≤ 1/(2B), where B is the bandwidth of the spectra. But how can we functionally express R(τ) and Rxy(τ) through R[n] and Rxy[n]? The answer is given by the so-called "second part of the sampling theorem", which states that they satisfy the equations

    R(τ) = Σ_{n=-∞}^{+∞} R[n] sin[π(τ - nT)/T] / [π(τ - nT)/T],       (34)

    Rxy(τ) = Σ_{n=-∞}^{+∞} Rxy[n] sin[π(τ - nT)/T] / [π(τ - nT)/T].   (35)

The sinc function appearing here,

    S(τ - nT) ≡ sin[π(τ - nT)/T] / [π(τ - nT)/T],                     (36)


is called the "sampling function".

Proof: We prove equation (34) for the autocorrelation only, since equation (35) for the cross-correlation can be derived in exactly the same way. According to equation (32), the power spectrum S(ω) of a stationary random continuous-time process x(t) and the power spectrum SD(ω) of its time samples x[n] = x(nT) with a sampling interval T are related to each other by

    SD(ω) = (1/T) Σ_{k=-∞}^{+∞} S(ω + 2πk/T).

Therefore, if the sampling interval is shorter than or equal to the Nyquist interval, T ≤ 1/(2B), we obtain the full analog continuous-time baseband spectrum S(ω) by multiplying the spectrum of the time samples SD(ω) by the rectangular window function P(ω) satisfying

    P(ω) = { 1   for -π/T < ω ≤ π/T,
           { 0   otherwise,                                            (37)

as illustrated in Figure 11. In fact, we have

    S(ω) = T P(ω) SD(ω),                                              (38)

from equation (32).

Figure 11: The analog continuous-time spectrum S(ω) (left panel) is fully restored from the spectrum of the time samples SD(ω) (right panel) by multiplying by the rectangular window function P(ω) given in equation (37).
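The time-domain counterpart of this windowing is the sampling-function expansion of equation (34): a band-limited function is rebuilt at arbitrary arguments from its Nyquist samples alone. A small sketch (assuming NumPy; the test function and the 200-point sample grid are arbitrary illustrative choices):

```python
import numpy as np

def sinc_interpolate(samples, T, tau):
    """Second part of the sampling theorem, equation (34): rebuild the
    continuous-time value at argument tau from samples taken at nT.
    np.sinc(u) is sin(pi*u)/(pi*u), i.e. exactly the sampling function."""
    n = np.arange(len(samples))
    return np.sum(samples * np.sinc((tau - n * T) / T))

B = 1.0                                    # bandwidth [Hz]
T = 1.0 / (2 * B)                          # Nyquist interval
grid = np.arange(200) * T
f = lambda t: np.sinc(1.5 * (t - 24.9))    # band-limited (0.75 Hz < B) test function
samples = f(grid)

tau = 25.13                                # an off-grid argument
print(f(tau), sinc_interpolate(samples, T, tau))   # nearly identical values
```

The residual difference comes only from truncating the infinite sum in equation (34) to 200 terms.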



If we introduce the Fourier transform pairs S(ω) ⇔ R(τ), P(ω) ⇔ p(τ), and SD(ω) ⇔ RD(τ), equation (38) implies

    R(τ) = T p(τ) ∗ RD(τ) = T ∫_{-∞}^{+∞} p(τ - τ') RD(τ') dτ',       (39)

in view of the convolution theorem which we saw in Chapter 3, where the symbol "∗" stands for the operation of convolution. We also saw in Chapter 3 that the inverse Fourier transform of the rectangular window function is a sinc function:

    p(τ) = (1/2π) ∫_{-∞}^{+∞} P(ω) e^{iωτ} dω = (1/2π) ∫_{-π/T}^{π/T} e^{iωτ} dω = sin(πτ/T) / (πτ).   (40)

On the other hand, the inverse Fourier transform of equation (32) yields

    RD(τ) = (1/2π) ∫_{-∞}^{+∞} SD(ω) e^{iωτ} dω = (1/2πT) Σ_{k=-∞}^{+∞} ∫_{-∞}^{+∞} S(ω + 2πk/T) e^{iωτ} dω
          = (1/T) R(τ) Σ_{k=-∞}^{+∞} e^{-i 2πkτ/T} = R(τ) Σ_{n=-∞}^{+∞} δ(τ - nT),   (41)

where we used the shift theorem and the property of the comb function given in equation (22):

    (1/T) Σ_{k=-∞}^{+∞} e^{-i 2πkτ/T} = Σ_{n=-∞}^{+∞} δ(τ - nT).

Therefore, equation (39) reduces to

    R(τ) = T ∫_{-∞}^{+∞} [sin(π(τ - τ')/T) / (π(τ - τ'))] R(τ') Σ_{n=-∞}^{+∞} δ(τ' - nT) dτ'
         = Σ_{n=-∞}^{+∞} R(nT) sin[π(τ - nT)/T] / [π(τ - nT)/T],

which proves equation (34), since R(nT) = R[n].

Now we are in a position to answer an interesting question: why can the accuracy of delay determination in VLBI be much better (i.e. smaller)


than a sampling interval of the digitized voltage signals from which the delay is determined? For example, the typical delay accuracy of the Mark III VLBI system has been 0.1 nanosecond (1 × 10⁻¹⁰ s), while a typical sampling interval in Mark III observations has been 250 nanoseconds (= 1/(2 × 2 MHz)). Details apart, the essential point of the answer is in the sampling theorem: Nyquist-sampled data are capable of determining the delay as accurately as the continuous-time data from which the sampled data are formed, since the two are equivalent to each other in view of the sampling theorem.

The 2B optimal rate and the sampling function were independently discovered by a number of researchers in different countries, besides Shannon (1949). The history even goes back to the 19th century. Interested readers may consult the review paper by Meijering (2002).

1.1.14 Correlations of Nyquist Sampled Data with Rectangular Passband Spectra

Let us consider continuous-time stationary random processes x(t) and y(t) with rectangular power spectra S(ω) with a passband of bandwidth B:

    S(ω) = { a   for 2πnB ≤ |ω| < 2π(n + 1)B,
           { 0   otherwise,                                            (42)

where n is an integer, and n = 0 corresponds to the particular case of the baseband spectrum (Figure 12).

Figure 12: Rectangular passband (top) and baseband (bottom) power spectra.

If we sample the data with the Nyquist interval T = 1/(2B), then we obtain the following properties for the correlations of the time samples.


Autocorrelation:
The autocorrelation R(τ) of an original continuous-time process x(t) is obtained by an inverse Fourier transformation of the passband power spectrum S(ω) (top panel of Figure 12):

    R(τ) = (1/2π) ∫_{-∞}^{+∞} S(ω) e^{iωτ} dω = (1/π) ∫_{0}^{+∞} S(ω) cos(ωτ) dω
         = (a/π) ∫_{2πnB}^{2π(n+1)B} cos(ωτ) dω
         = (a/(πτ)) [sin(2π(n + 1)Bτ) - sin(2πnBτ)]
         = 2aB [sin(πBτ)/(πBτ)] cos[2π(n + 1/2)Bτ],                    (43)
which has the familiar "white-fringe" form, with the fringe pattern enclosed by the bandwidth pattern, as we saw in Chapter 3. For the baseband spectrum (bottom panel of Figure 12), we have n = 0, and the autocorrelation of the continuous-time process has a sinc-function form:

    R(τ) = 2aB sin(2πBτ)/(2πBτ).                                       (44)

For the correlation coefficient of the continuous-time process,

    r(τ) = R(τ)/R(0),

we have

    r(τ) = [sin(πBτ)/(πBτ)] cos[2π(n + 1/2)Bτ]                         (45)

in the case of the general passband spectrum, and

    r(τ) = sin(2πBτ)/(2πBτ)                                            (46)

in the particular case of the baseband spectrum.

Now, if we sample the continuous-time process x(t) with the Nyquist interval T = 1/(2B), the correlation coefficient of the time samples is given by

    r[m] = r(mT) = [sin(πm/2)/(πm/2)] cos[πm(n + 1/2)]                 (47)


Figure 13: Correlation coefficient r(τ) of a continuous-time process with a rectangular passband spectrum with n = 2 (top left) and a baseband spectrum (top right), as given in equations (45) and (46). When this process is sampled with the Nyquist interval T = 1/(2B), the correlation coefficient r[m] of the time samples has the "white-noise" form, equal to 1 if m = 0 and equal to 0 if m ≠ 0 (bottom), since r[m] = r(mT) = 0 for all m except m = 0, as shown in the top two panels.

for the passband spectrum, and

    r[m] = r(mT) = sin(πm)/(πm)                                        (48)

for the baseband spectrum, in particular. Both equations (47) and (48) show the "white-noise" form of the time samples:

    r[m] = δ_{m0} = { 1   if m = 0,
                    { 0   if m ≠ 0,                                    (49)

as given in equation (8), where δij is Kronecker's delta symbol. This shows that different sample points are not correlated, and therefore independent of each other, in time samples of Nyquist-sampled data with rectangular passband spectra. The relationship between the correlation coefficient of the original continuous-time data and that of the sampled data is illustrated in Figure 13.
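The "white-noise" form of equation (49) can be checked numerically. Below is a sketch (assuming NumPy; the band edge at 1/8 of the fine grid's Nyquist frequency is an arbitrary choice) that builds rectangular-spectrum baseband noise, samples it at its Nyquist interval, and measures the correlation coefficient at a few lags:

```python
import numpy as np

rng = np.random.default_rng(1)

# Band-limited baseband Gaussian noise: keep only the lowest 1/8 of the
# spectrum of a fine-grained white-noise series, so the Nyquist interval
# of the filtered data corresponds to every 8th fine sample.
n_fine = 2**17
spec = np.fft.rfft(rng.standard_normal(n_fine))
spec[len(spec) // 8:] = 0.0          # rectangular baseband spectrum
x = np.fft.irfft(spec)[::8]          # Nyquist-sampled time series

r = [np.mean(x[:-m] * x[m:]) / np.mean(x * x) for m in (1, 2, 3)]
print(r)   # all close to 0, while r[0] = 1 by construction
```

The nonzero residuals are only the statistical fluctuations of a finite data stretch, shrinking as the series is lengthened.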


Cross-correlation:
If the cross-power spectrum Sxy(ω) of jointly stationary continuous-time random processes x(t) and y(t) is real (i.e., has zero phase) and rectangular with bandwidth B, such as shown in Figure 12, the situation is much the same as in the autocorrelation case discussed above, and their cross-correlation has the same functional form as equation (43) or (44). Therefore, the cross-correlation coefficient of their time samples has the "white-noise" form, proportional to the one given in equation (49).

Let us now consider a slightly more general case, when the amplitude A(ω) of the cross-power spectrum Sxy(ω) is rectangular, as given in equation (42), but the phase is non-zero due to some delay τd between the correlated signals in the processes x(t) and y(t), which may in general contain both the signals and uncorrelated noises, just as in an interferometer problem. In such a case, the cross-power spectrum, which contains the signal contribution only, has the form

    Sxy(ω) = A(ω) e^{-iωτd},                                           (50)

as we saw in Chapter 3. Strictly speaking, actual passband spectra to be sampled in realistic interferometers are IF spectra after the frequency conversion, and hence their phase spectra usually do not cross the origin, i.e., the phases are non-zero at ω = 0, unlike in equation (50), as we discussed in Chapter 3. Nevertheless, we adopt equation (50) for simplicity, assuming an idealized case of "RF correlation", or a case when the "fringe stopping" is ideally performed so that the phase crosses the origin, but a phase slope still remains due to imperfect "delay tracking". Then, in view of the shift theorem, the cross-correlation

    Rxy(τ) = (1/2π) ∫_{-∞}^{+∞} Sxy(ω) e^{iωτ} dω

should have a form similar to equation (43) or (44), but with the argument τ replaced by τ - τd. Thus, the cross-correlation coefficient is given by

    rxy(τ) = Rxy(τ) / √(Rxx(0) Ryy(0))
           = { ρ [sin(πB(τ - τd))/(πB(τ - τd))] cos[2π(n + 1/2)B(τ - τd)]   (passband),
             { ρ sin(2πB(τ - τd))/(2πB(τ - τd))                             (baseband),   (51)

where ρ is the maximum cross-correlation coefficient:

    ρ = Rxy(τd) / √(Rxx(0) Ryy(0)).                                    (52)

Note that the cross-correlation coefficient in the case of the passband spectrum with n ≠ 0 again shows the "white-fringe" form, with the cosine "fringe pattern" enclosed within the sinc-function envelope of the "bandwidth pattern".
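The baseband line of equation (51), evaluated at the Nyquist sample times τ = mT, can be tabulated in a few lines (a sketch, assuming NumPy, whose np.sinc(u) implements sin(πu)/(πu); the 0.3 T residual delay is an arbitrary illustrative value):

```python
import numpy as np

def r_sampled_baseband(m, tau_d_over_T, rho=1.0):
    """Baseband case of equation (51) at the Nyquist sample times tau = mT:
    rho * sin(pi*(m - tau_d/T)) / (pi*(m - tau_d/T))."""
    return rho * np.sinc(m - tau_d_over_T)

m = np.arange(-3, 4)
print(r_sampled_baseband(m, 0.0))   # nonzero only at m = 0: "white-noise" form
print(r_sampled_baseband(m, 0.3))   # residual delay spreads power over many lags
```

With τd = 0 the samples reproduce the Kronecker-delta form of equation (49); any uncompensated delay smears the coefficient over neighboring lags.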

Figure 14: Cross-correlation coefficient rxy(τ) of jointly stationary continuous-time processes x(t) and y(t) with a rectangular passband spectrum with n = 2 (left: dotted line) and a baseband spectrum with n = 0 (right: dotted line), as given by equation (51). τd is the time delay between the correlated processes x(t) and y(t). Also shown by bars is the cross-correlation coefficient rxy[m] = rxy(mT) of time samples x[i] = x(iT) and y[j] = y(jT) sampled with the Nyquist interval T = 1/(2B), where B is the bandwidth of the rectangular spectrum (equation (53)). The horizontal axis shows the time difference normalized by the sampling interval T = 1/(2B). The vertical axis is the cross-correlation coefficient rxy(τ) normalized by its maximum value ρ = Rxy(τd)/√(Rxx(0) Ryy(0)), where Rxy(τ), Rxx(τ), and Ryy(τ) are the cross-correlation and autocorrelations of x(t) and y(t), correspondingly.

Therefore, if we sample x(t) and y(t) with the Nyquist interval T = 1/(2B), the cross-correlation coefficient of the time samples is

    rxy[m] = rxy(mT)
           = { ρ [sin((π/2)(m - τd/T)) / ((π/2)(m - τd/T))] cos[π(n + 1/2)(m - τd/T)]   (passband),
             { ρ sin(π(m - τd/T)) / (π(m - τd/T))                                       (baseband).   (53)

The relationship between the cross-correlation coefficient of the original continuous-time data and that of the sampled data is illustrated in Figure 14. The cross-correlation coefficient rxy[m] of the time samples given by equation (53) no longer has the symmetric "white-noise" form shown in equation (49) and in the bottom panel of Figure 13, because the delay τd shifts the cross-correlation coefficient of the continuous-time data along the horizontal axis. Also, it now depends upon n, i.e. upon the location of the passband spectrum on the frequency axis, since the "fringe pattern" in the "white fringe" depends on that location. Thus, in the cross-correlation coefficient, the simple "white-noise" form and the independence of sample points are obtained only when the delay τd is reduced to zero (τd = 0) by a suitable compensating operation, such as the "delay tracking" in interferometry.

1.1.15 S/N Ratio of Correlator Output of Sampled Data

Let us now imagine a "semi-analog" correlator (non-existent in reality), which would multiply and integrate (i.e. time-average) sampled but not quantized (not clipped) data streams from two antennas of an interferometer. We will estimate here the signal-to-noise ratio of such a correlator, before examining actual digital correlators, which deal with sampled and quantized data.

Let us assume that the two sampled data streams x[i] and y[i] are time samples of jointly stationary continuous-time random processes x(t) and y(t), which obey the second-order Gaussian probability distribution, as we assumed in the signal-to-noise-ratio discussion in Chapter 3. We further assume that x(t) and y(t) have identical rectangular passband spectra with bandwidth B, as given in equation (42), and that they are sampled with the Nyquist interval T = 1/(2B). Then we have x[i] = x(iT) and y[i] = y(iT). Also, we assume that the delay tracking and the fringe stopping are perfectly performed beforehand, so that two input data streams of exactly the same wave front are being correlated.

In this case, the "correlator output" Rs of the sampled data streams is an average of products of time samples over a certain number N:

    Rs = (1/N) Σ_{i=1}^{N} x[i] y[i].                                  (54)

The expectation of this correlator output is nothing but the cross-correlation Rxy[0] of x[i] and y[j] at zero argument, since

    ⟨Rs⟩ = (1/N) Σ_{i=1}^{N} ⟨x[i] y[i]⟩ = (1/N) Σ_{i=1}^{N} ⟨x(iT) y(iT)⟩ = Rxy(0) = Rxy[0].   (55)

On the other hand, the dispersion σs² of this correlator output is given by

    σs² = ⟨Rs²⟩ - ⟨Rs⟩²,                                               (56)

as we saw in Chapter 3. ⟨Rs²⟩ is described through a double sum of the fourth statistical moment, in view of equation (54). The fourth statistical moment is decomposed into a sum of products of second statistical moments (correlations), as we discussed in Chapter 3, since x[i] = x(iT) and y[j] = y(jT) obey the joint Gaussian probability distribution. Thus, we have

    ⟨Rs²⟩ = (1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} ⟨x[i] y[i] x[j] y[j]⟩
          = (1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} { ⟨x[i] y[i]⟩⟨x[j] y[j]⟩ + ⟨x[i] x[j]⟩⟨y[i] y[j]⟩ + ⟨x[i] y[j]⟩⟨y[i] x[j]⟩ }
          = (1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} { Rxy²[0] + Rxx[i-j] Ryy[i-j] + Rxy[i-j] Rxy[j-i] }
          = Rxy²[0] + (1/N) Rxx[0] Ryy[0] + (1/N) Rxy²[0]
          = ⟨Rs⟩² + (1/N) Rxx[0] Ryy[0] (1 + ρ²),                      (57)

where

    ρ = Rxy[0] / √(Rxx[0] Ryy[0]) = Rxy(0) / √(Rxx(0) Ryy(0))

is the maximum cross-correlation coefficient, given in equation (52), in our assumed case with τd = 0. In deriving the last two lines of equation (57), we used the "white-noise" relations for the autocorrelations,

    Rxx[i-j] = Rxx[0] δij,   and   Ryy[i-j] = Ryy[0] δij,              (58)


in view of equation (49), and the corresponding relation for the cross-correlation,

    Rxy[i-j] = Rxy[0] δij,                                             (59)

which is also satisfied, since we assumed τd = 0. Therefore, the dispersion of the correlator output in equation (56) is now given by

    σs² = (1/N) Rxx[0] Ryy[0] (1 + ρ²),                                (60)

and we obtain the signal-to-noise ratio SNR:

    SNR = ⟨Rs⟩/σs = Rxy[0] √N / √(Rxx[0] Ryy[0] (1 + ρ²)) = ρ √N / √(1 + ρ²).   (61)
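Equation (61) can be checked with a small Monte Carlo experiment (a sketch, assuming NumPy; the white, rectangular-spectrum, Nyquist-sampled streams are emulated directly by independent Gaussian samples, and ρ = 0.1 and N = 2048 are arbitrary test choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def correlator_snr(rho, n_samples, n_trials=1000):
    """Monte Carlo S/N of the 'semi-analog' correlator of equation (54):
    average of products of two unit-variance Gaussian streams whose
    correlation coefficient is rho (tau_d = 0, white samples)."""
    x = rng.standard_normal((n_trials, n_samples))
    noise = rng.standard_normal((n_trials, n_samples))
    y = rho * x + np.sqrt(1 - rho**2) * noise    # corr(x[i], y[i]) = rho
    r_s = np.mean(x * y, axis=1)                 # one correlator output per trial
    return np.mean(r_s) / np.std(r_s)

rho, n = 0.1, 2048
predicted = rho * np.sqrt(n) / np.sqrt(1 + rho**2)   # equation (61)
print(predicted, correlator_snr(rho, n))   # the two values agree closely
```

The agreement improves as the number of trials grows, since the Monte Carlo estimate of σs itself carries statistical noise.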

In the expression for the maximum cross-correlation coefficient ρ, the autocorrelations Rxx(0) and Ryy(0) are usually dominated by system noise contributions from the antenna-receiver systems of a radio interferometer, while the cross-correlation Rxy(0) contains the contribution of the signal from a radio source only, as we discussed in Chapter 3. Thus, when we observe a continuum spectrum source, ρ is approximately given by

    ρ = √(TA1 TA2) / √(TS1 TS2),                                       (62)

as we saw in Chapter 3, where TA1, TA2 are the antenna temperatures, which are assumed constant throughout the frequency band B in the case of the continuum spectrum source, and TS1, TS2 are the system noise temperatures, of antenna 1 and antenna 2. For most radio sources, TA ≪ TS, and therefore ρ ≪ 1. In this case, equation (61) is reduced to

    SNR = ⟨Rs⟩/σs = ρ √N = √(TA1 TA2 / (TS1 TS2)) √N.                  (63)

If we denote the integration time of the correlation processing as τa, the number of samples N with Nyquist interval 1/(2B) is equal to

    N = 2 B τa.                                                        (64)

Therefore, equation (63) for the continuum spectrum source is reduced to

    SNR = √(TA1 TA2 / (TS1 TS2)) √(2 B τa),                            (65)

which is just identical to what we derived for the correlator output of continuous-time voltages in Chapter 3.

This means that the Nyquist sampling does not cause any loss of signal-to-noise ratio in the correlator output, compared with the continuous-time case, as expected from the sampling theorem. This also means that there is no room for oversampling, with a sampling rate higher than the Nyquist rate (T < 1/(2B)), to improve the signal-to-noise ratio, despite the increased number of data points. Thus, the Nyquist sampling is really optimum for radio interferometry.

Note that equation (63) can be interpreted as showing a √N-fold improvement of the signal-to-noise ratio after repeating and averaging N "measurements" of a power (a product of two data streams, in our case). This means that measurements of a power made at the Nyquist interval are independent of each other, in the case of rectangular passband spectra. This is a consequence of the independence of the time samples themselves, discussed earlier.

1.1.16 Nyquist Theorem and Nyquist Interval

The Nyquist theorem (Nyquist, 1928), which we saw in Chapter 2, says that the thermal noise power W per unit bandwidth emitted by a resistor in thermal equilibrium at a temperature T is equal to

    W = k T,                                                           (66)

in the classical limit hν ≪ kT, where k and h are the Boltzmann and Planck constants, respectively. Therefore, the energy E emitted within a rectangular band of bandwidth B during a time interval t is

    E = B t k T.                                                       (67)

Since the energy per degree of freedom is equal to (1/2) k T under thermal equilibrium, the number of degrees of freedom in this energy must be NF = 2 B t. On the other hand, we have NI = 2 B t Nyquist intervals during the time t, for the bandwidth B. In the case of the rectangular band, one Nyquist interval contains one independent sample, as we saw earlier. Therefore, we have NI independent samples in the emitted energy during the time t. The equality NI = NF = 2 B t means that one independent sample (Nyquist interval) in information theory corresponds to one degree of freedom in physics, in the thermal noise.


1.1.17 Higher-Order Sampling in VLBI Receiving Systems

In digital data processing as applied to radio astronomy, sampling of received voltage signals has traditionally been done at the basebands (or video bands), containing DC (zero frequency) as the lowest frequency, after frequency conversions. This was the safest way to obtain reliable sampling when the clock rates of sampler circuits were not high enough, and not very stable.

Figure 15: Diagram of the receiving system of KVN (Korean VLBI Network), which adopts the higher-order sampling technique (figure taken from the KVN webpage http://www.trao.re.kr/kvn/).

However, it is not easy, with existing analog filtering technology, to implement a good enough lowpass filter with sharp rectangular edges. This situation has often resulted in rather poor frequency characteristics of the baseband spectra, and has made it difficult to achieve a signal-to-noise ratio close to the one expected from an ideally rectangular spectrum. Also, because of this difficulty, high-quality baseband converters tend to be expensive, especially when wide frequency bands are required.


Recently, Noriyuki Kawaguchi and his colleagues successfully applied the so-called "higher-order sampling" technique to a number of VLBI systems, including the Japanese VERA (VLBI Exploration of Radio Astrometry). Higher-order sampling is sampling at a passband with n > 0, discussed earlier. In general, it is easier to design good analog bandpass filters, with nearly rectangular band shapes, when the ratio B/ν0 of the bandwidth B to the central frequency ν0 is smaller. Therefore, it is easier to make a nearly rectangular wideband filter for a passband than for a baseband. In fact, the higher-order sampling technique has been effective in wideband receiving systems with typical bandwidths of 512 MHz or wider, for realizing better frequency characteristics and a higher signal-to-noise ratio (Iguchi and Kawaguchi, 2002).

Figure 15 shows a diagram of the KVN (Korean VLBI Network) receiving system, which adopts the higher-order sampling technique. A "baseband converter" cuts a 512 MHz band out of a 2 GHz-wide first IF signal with an 8.5 GHz center frequency (i.e. the 7.5-9.5 GHz band), and converts it to a 512 MHz-wide passband signal with a 768 MHz central frequency (i.e. the 512-1024 MHz band), which is then sampled by a high-speed sampler. Note here that, for the bandwidth B = 512 MHz, the passband nB ≤ ν < (n + 1)B with nB = 512 MHz implies an odd number n (i.e. n = 1), where ν is the frequency and n is an integer. Therefore, the spectrum of the sampled signal is inverted with respect to the spectrum of the original continuous-time signal, as we saw in Figure 8. In order to avoid possible inconveniences with the inverted spectra, the LO (local oscillator) frequency of the "baseband converter" is chosen so that the passband spectrum is obtained in the lower sideband (LSB), i.e. the spectrum is inverted with respect to the first IF spectrum. In this way, one can obtain a spectrum of the sampled signal which is identical to the one contained in the first IF signal.
It seems worthwhile to mention an interesting question here. The sampling theorem says that the optimal sampling rate for a passband nB ≤ ν < (n + 1)B is 2B. Does this mean that we can use an inexpensive low-speed 4 Msps (mega-samples per second) sampler for sampling a signal in a high-frequency passband of, say, 10.000-10.002 GHz? The answer is "NO", as N. Kawaguchi clearly explains. Although the required sampling interval is really 1/(2B), the sampling timings must be controlled with much greater accuracy, better than 1/ν0, where ν0 is the central frequency of the passband (the "carrier" frequency). Otherwise, we will get chaotically "jittered" data at each of the VLBI stations, from which we will never find any good fringe. Therefore, the required sampler must be as accurate and as stable as, say, a 20 Gsps sampler. Consequently, high-speed sampling technology is indispensable for successful application of the wideband higher-order sampling technique.
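The spectral inversion for odd n, discussed above, can be illustrated numerically (a sketch, assuming NumPy; the scaled-down 0.5-1.0 Hz band stands in for a real 512-1024 MHz passband):

```python
import numpy as np

# Higher-order sampling: a tone at 0.9 Hz inside the n = 1 passband
# B <= nu < 2B with B = 0.5 Hz, sampled at only 2B = 1 Hz, i.e. far
# below twice the carrier frequency.
B, f_tone, n = 0.5, 0.9, 1024
fs = 2 * B
t = np.arange(n) / fs
x = np.cos(2 * np.pi * f_tone * t)
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
f_apparent = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
# Odd n inverts the spectrum: the tone reappears near 2B - 0.9 = 0.1 Hz,
# i.e. at the mirrored position within the 0 - 0.5 Hz output band.
print(f_apparent)
```

No information is lost: the tone is recovered, only at the mirrored frequency, which is why the KVN system chooses its LO so that the final spectrum comes out with the intended orientation.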


1.1.18 Clipping (or Quantization) of Analog Data

The sampling replaces data that are continuous in time with data that are discrete in time. This is undoubtedly a big step towards digitizing analog data. However, sampling alone still leaves the values of the time samples analog; they may vary arbitrarily from sample to sample. For a complete analog-to-digital (A/D) conversion, we need to replace each continuously variable value with an element of a finite set of discrete values expressible by a certain number of bits. This step is called "clipping" or "quantization". The number of discrete values expressible by a given number of bits determines the "number of levels of quantization". Therefore, 1-bit, 2-bit, ..., n-bit quantizations usually correspond to 2-level, 4-level, ..., 2^n-level quantizations, respectively. There have been exceptions to this n-bit, 2^n-level rule of quantization, such as 2-bit, 3-level quantization. However, such exceptional quantization schemes are rarely used in present-day VLBI systems.

We will henceforth denote a discrete-time process with quantized values as x̂[i]. If the number of quantization levels is m, with discrete values x̂1, x̂2, ..., x̂m, then x̂[i] must take one of these m values, as illustrated in Figure 16. A quantized process x̂[i] is supposed to be related in a prescribed way to the original discrete-time process x[i] with analog values before clipping.
Figure 16: An image of a quantized discrete-time process.

The larger the number of quantization levels (i.e. bits), the more information remains after clipping. For reducing data size and increasing processing speed, however, a smaller number of bits is preferable. Thus one has to choose an optimal number of quantization levels for one's particular purpose. In VLBI, 1-bit (2-level) and 2-bit (4-level) quantizations are mostly used. Figure 17 and Table 1 show how quantized values are related to original analog values in the cases of VLBI 1-bit (left panel) and 2-bit (right panel)


quantizations. In the 1-bit quantization scheme, the clipped quantity takes only one of two values, +1 or -1, depending on whether the original analog value is positive or negative, respectively. It is generally accepted that bit 0 is assigned to the -1 state and bit 1 to the +1 state.
Figure 17: Relations between analog (unquantized) values x and clipped (quantized) values x̂ for 1-bit (left) and 2-bit (right) quantizations, respectively. Bit assignments are shown in the bottom panels. In the case of 2-bit quantization (right), a representative bit assignment for a data recorder is shown along with that for a correlator chip, given in parentheses.

Quantization          1-bit (2-level)          2-bit (4-level)
Analog value          x < 0      0 ≤ x         x < -v0     -v0 ≤ x < 0    0 ≤ x < v0    v0 ≤ x
Clipped value         x̂ = -1     x̂ = +1        x̂ = -n      x̂ = -1         x̂ = +1        x̂ = +n
Recorder bit ass.     0          1             s 0, m 0    s 0, m 1       s 1, m 0      s 1, m 1
Correlator bit ass.   0          1             s 1, m 1    s 1, m 0       s 0, m 0      s 0, m 1

Table 1: Clipping criteria and bit assignments of VLBI 1-bit and 2-bit quantizations. In the bit assignments for the 2-bit quantization scheme, "s" and "m" stand for the sign bit and magnitude bit, respectively.

In the 2-bit, 4-level quantization scheme, the 4 quantization states are separated by three threshold values: -v0, 0, and v0. They correspond to the 4 clipped values -n, -1, +1, and +n, as shown in the right panel of Figure 17 and in Table 1. The values of the parameters v0 and n are chosen so that the signal-to-noise ratio


of the correlator output is maximized. Note that n thus determined is not necessarily an integer.

Bit assignments to the 4 quantization states are not yet uniquely standardized in the 2-bit quantization scheme. Existing VLBI data recording systems mostly adopt the 00, 01, 10, and 11 assignments for the -n, -1, +1, and +n states, where the first and second bits stand for the sign and magnitude bits, respectively, but a widely used 2-bit quantization correlator chip adopts the 11, 10, 00, and 01 assignments, as shown in Figure 17 and Table 1. Therefore, recorded bits should be rearranged before correlation when we use a correlator built with this chip.

Sign bits in the 2-bit quantized data are equivalent to the 0 and 1 bits of 1-bit quantized data. Therefore, it is usually not difficult to cross-correlate 1-bit quantized and 2-bit quantized data, obtained at different stations in the same VLBI observation with the same sampling rate, using a 1-bit correlation mode, if we sacrifice some of the information in the 2-bit quantized data. In this case, the only necessary step is to pick out the sign bits from the 2-bit quantized data and cross-correlate them with the 1-bit quantized data. Of course, direct cross-correlation of data with different quantization schemes, such as 1-bit and 2-bit, can be performed without losing information (see, for example, Hagen and Farley, 1973), provided that a special logical circuit for this purpose is built into the correlator.
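The clipping rules of Table 1 can be written as a short sketch (assuming NumPy; the threshold v0 ≈ 0.98 σ and weight n ≈ 3.34 are illustrative values often quoted as near-optimal for 4-level quantization of Gaussian noise, not taken from this text):

```python
import numpy as np

def clip_1bit(x):
    """1-bit (2-level) clipping: only the sign survives, as +1 or -1."""
    return np.where(x >= 0, 1, -1)

def clip_2bit(x, v0, n):
    """2-bit (4-level) clipping with thresholds -v0, 0, +v0 and clipped
    values -n, -1, +1, +n (n need not be an integer)."""
    return np.where(x >= v0, float(n),
           np.where(x >= 0, 1.0,
           np.where(x >= -v0, -1.0, -float(n))))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)          # stand-in for analog voltage samples
print(clip_1bit(x))
print(clip_2bit(x, v0=0.98, n=3.34))
```

Mapping the four resulting states to the recorder or correlator bit pairs of Table 1 is then a simple lookup.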

Figure 18: A schematic view of the time variation of original analog data (top) and 2-bit quantized data (bottom).

In the 1-bit quantization scheme, only the sign information of the original analog data remains in the clipped data, and no information on the magnitude (amplitude) is left at all, as illustrated in Figure 3. In the 2-bit quantized


data, some information on the magnitude (amplitude) of the analog data is left, as illustrated in Figure 18, but that information is very vague. Then, how much of the scientific information contained in the original analog data of VLBI can we restore after clipping the data with a 1-bit or 2-bit quantization scheme, which looks quite rough, at least at first glance? The clipping theorem gives a surprising answer. The theorem was originally developed by J. H. van Vleck in a study of radar jamming during World War II, conducted by the Radio Research Laboratory of Harvard University (Report No. 51, July 21, 1943), and was made public more than 20 years later by van Vleck and Middleton (1966).

1.1.19 Probability Distribution of Clipped Data

Before proceeding to the clipping theorem, we will examine how to describe the probability distribution of clipped data. As we discussed earlier, signals from astronomical radio sources, as well as the system noises produced in antenna-receiving systems and in their environments, are well approximated by Gaussian random processes. Therefore, let us consider the data as jointly stationary continuous-time random processes x(t) and y(t), which obey the second-order Gaussian probability density introduced in Chapter 3. Here we assume a zero-mean case (i.e. the expectations of x(t) and y(t) are equal to zero), and use notations suited to the current discussion, slightly changing those adopted in Chapter 3. Also, we assume that both x(t) and y(t) are real processes. Then, we describe the zero-mean second-order Gaussian probability density of the jointly stationary continuous-time random processes x(t + τ) and y(t) as

    f(x, y; τ) ≡ f(x, y; t + τ, t)
      = [1 / (2π σx σy √(1 - rxy²(τ)))] exp{ -[x²/σx² - 2 rxy(τ) x y/(σx σy) + y²/σy²] / [2(1 - rxy²(τ))] },   (68)

where we introduced the notations σx² ≡ Rxx(0) and σy² ≡ Ryy(0) for the dispersions of x(t) and y(t), and

    rxy(τ) ≡ Rxy(τ) / √(Rxx(0) Ryy(0))

for the cross-correlation coefficient of x(t + τ) and y(t). Here, Rxy(τ), Rxx(τ), and Ryy(τ) are the cross-correlation and autocorrelations of x(t) and y(t), as before. We first assume that the cross-correlation coefficient rxy(τ) is smaller than


1 in absolute value (i.e. rxy²(τ) < 1), in order to avoid a possible singularity in our calculations, which may occur when rxy²(τ) = 1.


Figure 19: In the case of continuous-time processes, the joint probability density describes the probability of the processes taking values within infinitesimal ranges at certain times.

Of course, this joint probability density satisfies the general definition of the cross-correlation,

    Rxy(τ) = ⟨x(t + τ) y(t)⟩ = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x y f(x, y; τ) dx dy,   (69)

as we saw in Chapter 3 (note that in our zero-mean case the cross-covariance Cxy(τ) is just equal to the cross-correlation, i.e. Cxy(τ) = Rxy(τ)). It is worth recalling here that the joint probability density means that f(x, y; τ) dx dy is the probability for x(t + τ) and y(t) to be within the ranges x ≤ x(t + τ) < x + dx and y ≤ y(t) < y + dy, for any t (see Figure 19). The condition "for any t" corresponds to our case of stationary random processes.

Now, if we consider discrete-time processes x[i] and y[i], which are time samples of the above jointly stationary real continuous-time random processes x(t) and y(t), i.e. x[i] = x(iT) and y[i] = y(iT), where T is a sampling


interval, the time samples x[i] and y[i] are also jointly stationary, as we saw earlier, and their cross-correlation

    Rxy[m] = ⟨x[n + m] y[n]⟩ = Rxy(mT) = ⟨x(nT + mT) y(nT)⟩

is described by the same joint probability density f(x, y; τ) of the continuous-time processes as

    Rxy[m] = Rxy(mT) = ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} x y f(x, y; mT) dx dy.   (70)

Let us then consider that we clip the time samples x[i] and y[i], and obtain clipped discrete-time processes, which we denote as x̂[i] and ŷ[i]. They now take only discrete values of a finite number N (this means N-level quantization): x̂1, x̂2, ..., x̂N and ŷ1, ŷ2, ..., ŷN.

Figure 20: The joint probability P(x̂i, ŷj; m) of the clipped processes x̂[n + m] and ŷ[n] describes the probability for them to take certain discrete values x̂i and ŷj on the quantization levels (here a 5-level case is shown).

In this case, their cross-correlation Rx̂ŷ[m] = ⟨x̂[n + m] ŷ[n]⟩ is described by the equation

    Rx̂ŷ[m] = Σ_{i=1}^{N} Σ_{j=1}^{N} x̂i ŷj P(x̂i, ŷj; m),             (71)

where P(x̂i, ŷj; m) is the joint probability for x̂[n + m] and ŷ[n] to take the values

    x̂[n + m] = x̂i,   ŷ[n] = ŷj,


for any n (see Figure 20). This cross-correlation is what should be yielded by a digital correlator. In fact, in view of the ergodicity, the cross-correlation is well approximated by a time average of products of N quantized data \hat{x}[n+m] and \hat{y}[n]:

R_{\hat{x}\hat{y}}[m] \cong \frac{1}{N} \sum_{n=1}^{N} \hat{x}[n+m]\, \hat{y}[n],   (72)

if N is sufficiently large; and the digital correlator is nothing but a machine which time-averages a large number of products of digital (i.e. sampled and clipped) data.

At first glance, equations (70) and (71) look quite different, and it appears difficult to relate R_{\hat{x}\hat{y}}[m] to R_{xy}[m]. However, if the quantization condition is clearly specified, we can calculate P(x_i, y_j; m) rather easily from f(x, y; mT) (van Vleck and Middleton, 1966). In the case of 1-bit, 2-level quantization, the condition is

\hat{x}[i] = +1 \ \text{for}\ x[i] = x(iT) \ge 0, \qquad \hat{x}[i] = -1 \ \text{for}\ x[i] = x(iT) < 0,
\hat{y}[i] = +1 \ \text{for}\ y[i] = y(iT) \ge 0, \qquad \hat{y}[i] = -1 \ \text{for}\ y[i] = y(iT) < 0.   (73)

Therefore, the probability for \hat{x}[i] to be +1, for example, is nothing but the probability for x(iT) to satisfy 0 \le x(iT) < +\infty. Thus we can describe the joint probabilities of all combinations of quantization states through the equations

P(+1, +1; m) = \int_{0}^{+\infty} \int_{0}^{+\infty} f(x, y; mT)\, dx\, dy,
P(-1, -1; m) = \int_{-\infty}^{0} \int_{-\infty}^{0} f(x, y; mT)\, dx\, dy,
P(+1, -1; m) = \int_{-\infty}^{0} \left[ \int_{0}^{+\infty} f(x, y; mT)\, dx \right] dy,
P(-1, +1; m) = \int_{0}^{+\infty} \left[ \int_{-\infty}^{0} f(x, y; mT)\, dx \right] dy.   (74)

The integrals in the RHS of these equations are readily calculated, since f(x, y; mT) is given by the joint Gaussian probability density in equation (68).
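The time-averaging correlator of equation (72) and the quadrant probabilities of equation (74) can be checked by a small Monte Carlo sketch. This is an illustrative check, not part of the original text; the sample size and random seed are arbitrary, and unit-variance data are assumed. As the next subsection shows, both estimates printed here converge to (2/\pi) \arcsin r.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 0.5            # analog cross-correlation coefficient
N = 200_000

# Jointly Gaussian, zero-mean, unit-variance pairs with <x y> = r
x = rng.standard_normal(N)
y = r * x + np.sqrt(1.0 - r**2) * rng.standard_normal(N)

# 1-bit clipping, eq. (73)
xh = np.where(x >= 0, 1.0, -1.0)
yh = np.where(y >= 0, 1.0, -1.0)

# Digital correlator: time average of products, eq. (72)
R_hat = np.mean(xh * yh)

# Quadrant probabilities of eq. (74), estimated by counting
Ppp = np.mean((xh == 1) & (yh == 1))
Ppm = np.mean((xh == 1) & (yh == -1))
print(R_hat)            # close to (2/pi) arcsin(r)
print(2*Ppp + 2*Ppm)    # ~1, the normalization of eq. (78)
```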


1.1.20  Cross-Correlation of 1-Bit Quantized Data: van Vleck Relationship

In the case of 1-bit quantization, we have N = 2, and

x_1 = -1, \quad x_2 = +1, \qquad y_1 = -1, \quad y_2 = +1.   (75)

Therefore, the cross-correlation R_{\hat{x}\hat{y}}[m] of the clipped data \hat{x}[i] and \hat{y}[i] is given by

R_{\hat{x}\hat{y}}[m] = \sum_{i=1}^{2} \sum_{j=1}^{2} x_i\, y_j\, P(x_i, y_j; m)
= (+1)(+1)\, P(+1,+1;m) + (-1)(-1)\, P(-1,-1;m) + (+1)(-1)\, P(+1,-1;m) + (-1)(+1)\, P(-1,+1;m)
= P(+1,+1;m) + P(-1,-1;m) - P(+1,-1;m) - P(-1,+1;m).   (76)

Because of the symmetry properties of the joint Gaussian probability density in equation (68), the joint probabilities given in equation (74) satisfy

P(+1,+1;m) = P(-1,-1;m), \qquad P(+1,-1;m) = P(-1,+1;m).   (77)

Also, by the definition of probability, the sum of the probabilities of all possible cases must be equal to 1, and hence

P(+1,+1;m) + P(-1,-1;m) + P(+1,-1;m) + P(-1,+1;m) = 2\, P(+1,+1;m) + 2\, P(+1,-1;m) = 1.   (78)

Then, from equations (76), (77), and (78), we obtain

R_{\hat{x}\hat{y}}[m] = 2\, P(+1,+1;m) - 2\, P(+1,-1;m) = 4\, P(+1,+1;m) - 1.   (79)
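Equations (77)-(79) can also be verified numerically. The sketch below (illustrative, not part of the original text; \sigma_x = \sigma_y = 1 assumed) evaluates the quadrant probability P(-1,-1;m) = P(+1,+1;m) of the joint Gaussian density by one-dimensional numerical integration, using the conditional factorization of the bivariate normal. The value of 4\,P(+1,+1;m) - 1 matches the arcsine expression derived in the remainder of this subsection.

```python
import numpy as np
from math import erf, sqrt, pi

def P_mm(r, grid=20001):
    """P(-1,-1;m) = P(x<0, y<0) for unit-variance jointly Gaussian x, y
    with correlation r, eq. (74).  Uses the conditional factorization
    P(y<0 | x) = Phi(-r x / sqrt(1 - r^2)) and a trapezoidal rule."""
    x = np.linspace(-8.0, 0.0, grid)
    dx = x[1] - x[0]
    phi = np.exp(-0.5 * x**2) / sqrt(2.0 * pi)          # N(0,1) density
    cond = 0.5 * (1.0 + np.array(
        [erf(-r * xi / sqrt(1.0 - r**2) / sqrt(2.0)) for xi in x]))
    val = phi * cond
    return np.sum(0.5 * (val[1:] + val[:-1])) * dx

r = 0.3
print(4.0 * P_mm(r) - 1.0)        # eq. (79)
print(2.0 / pi * np.arcsin(r))    # arcsine form derived below, eq. (88)
```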

Substituting the explicit form of the joint Gaussian probability density in equation (68) into equation (74), we calculate 4\, P(+1,+1;m):

4\, P(+1,+1;m) = 4 \int_{0}^{+\infty} \int_{0}^{+\infty} f(x, y; mT)\, dx\, dy
= \frac{2}{\pi \sigma_x \sigma_y \sqrt{1 - r_{xy}^2}} \int_{0}^{+\infty} \int_{0}^{+\infty} \exp\left[ -\frac{1}{2\, (1 - r_{xy}^2)} \left( \frac{x^2}{\sigma_x^2} - \frac{2\, r_{xy}\, x\, y}{\sigma_x \sigma_y} + \frac{y^2}{\sigma_y^2} \right) \right] dx\, dy,   (80)

where T is the sampling interval, as before, and we denote the cross-correlation coefficient r_{xy}[m] = r_{xy}(mT) simply as r_{xy}, for brevity. Let us introduce new variables \rho and \theta, which satisfy

x = \sigma_x\, \rho \cos\theta, \qquad y = \sigma_y\, \rho \sin\theta.   (81)

Then the above integral is reduced to

4\, P(+1,+1;m) = \frac{2}{\pi \sqrt{1 - r_{xy}^2}} \int_{0}^{\pi/2} d\theta \int_{0}^{+\infty} \rho\, \exp\left[ -\frac{\rho^2\, (1 - r_{xy} \sin 2\theta)}{2\, (1 - r_{xy}^2)} \right] d\rho = \frac{2}{\pi} \int_{0}^{\pi/2} \frac{\sqrt{1 - r_{xy}^2}}{1 - r_{xy} \sin 2\theta}\, d\theta.   (82)

If we further introduce another new variable \psi, which satisfies

\tan\psi = \frac{\tan\theta - r_{xy}}{\sqrt{1 - r_{xy}^2}}, \quad \text{and therefore} \quad \psi = \arctan \frac{\tan\theta - r_{xy}}{\sqrt{1 - r_{xy}^2}},   (83)

then we have

d\psi = \frac{1}{1 + \left( \frac{\tan\theta - r_{xy}}{\sqrt{1 - r_{xy}^2}} \right)^2}\, \frac{1}{\sqrt{1 - r_{xy}^2}}\, \frac{d\theta}{\cos^2\theta} = \frac{\sqrt{1 - r_{xy}^2}}{1 - r_{xy} \sin 2\theta}\, d\theta.   (84)

Note that this has the same form as the integrand of equation (82). The limits of integration \theta = 0 and \theta = \pi/2 now correspond to \psi_0 = -\arctan \frac{r_{xy}}{\sqrt{1 - r_{xy}^2}} and \psi = \pi/2, respectively. Therefore, equation (82) becomes

4\, P(+1,+1;m) = \frac{2}{\pi} \int_{\psi_0}^{\pi/2} d\psi = \frac{2}{\pi} \left( \frac{\pi}{2} - \psi_0 \right) = 1 + \frac{2}{\pi} \arctan \frac{r_{xy}}{\sqrt{1 - r_{xy}^2}}.   (85)






Denoting the cross-correlation coefficient r_{xy} through a sine function, r_{xy} = \sin\phi, so that r_{xy} / \sqrt{1 - r_{xy}^2} = \tan\phi, we obtain

4\, P(+1,+1;m) = 1 + \frac{2}{\pi} \arctan(\tan\phi) = 1 + \frac{2\phi}{\pi} = 1 + \frac{2}{\pi} \arcsin(r_{xy}).   (86)

We must specify here the range of \arcsin(r_{xy}), since, in general, the arcsine is a multi-valued function, while the probability P(+1,+1;m) is certainly not. In view of the general properties of probability, P(+1,+1;m) must be confined within the range

0 \le P(+1,+1;m) \le \frac{1}{2}.

Indeed, the upper limit corresponds to the case of complete correlation (identical data), for which

P(+1,+1;m) = P(-1,-1;m) = \frac{1}{2}, \qquad \text{because} \qquad P(+1,-1;m) = P(-1,+1;m) = 0,

while the lower limit corresponds to the case of complete anti-correlation (identical data but with opposite signs), for which

P(+1,+1;m) = P(-1,-1;m) = 0.

Since the cross-correlation coefficient r_{xy} of the original unclipped data must be 1 for complete correlation and -1 for complete anti-correlation, the arcsine function in equation (86) must be confined within the range

-\frac{\pi}{2} \le \arcsin(r_{xy}) \le \frac{\pi}{2}.   (87)

Finally, from equation (79), we obtain

R_{\hat{x}\hat{y}}[m] = 4\, P(+1,+1;m) - 1 = \frac{2}{\pi} \arcsin(r_{xy}[m]) = \frac{2}{\pi} \arcsin \frac{R_{xy}(mT)}{\sqrt{R_{xx}(0)\, R_{yy}(0)}}.   (88)

Since \hat{x}[i]\, \hat{x}[i] = 1 and \hat{y}[i]\, \hat{y}[i] = 1 for any i, and therefore R_{\hat{x}\hat{x}}[0] = 1 and R_{\hat{y}\hat{y}}[0] = 1 for the 1-bit quantized data, their cross-correlation coefficient is equal to their cross-correlation:

r_{\hat{x}\hat{y}}[m] = \frac{R_{\hat{x}\hat{y}}[m]}{\sqrt{R_{\hat{x}\hat{x}}[0]\, R_{\hat{y}\hat{y}}[0]}} = R_{\hat{x}\hat{y}}[m].   (89)

Therefore, equation (88) can also be written as

r_{\hat{x}\hat{y}}[m] = \frac{2}{\pi} \arcsin(r_{xy}[m]).   (90)
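The relationship (90), and the restoring formula that follows from it below, are easy to confirm by simulation. The following sketch (illustrative, not part of the original text; unit-variance data and an arbitrary sample size assumed) clips jointly Gaussian data to 1 bit, correlates, and applies the inverse sine correction.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400_000
for r in (0.1, 0.5, 0.9):
    x = rng.standard_normal(N)
    y = r * x + np.sqrt(1.0 - r**2) * rng.standard_normal(N)
    # 1-bit clip and correlate: the output follows eq. (90)
    # (np.sign(0) = 0, but that occurs with probability zero here)
    R_hat = np.mean(np.sign(x) * np.sign(y))
    # van Vleck correction: restore the analog coefficient
    r_restored = np.sin(0.5 * np.pi * R_hat)
    print(r, round(R_hat, 3), round(r_restored, 3))
```

Even at r = 0.9, well outside the linear regime, the sine correction recovers the analog coefficient to within the Monte Carlo noise.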

In the particular case of a small cross-correlation coefficient, | r_{xy}[m] | \ll 1, which is usually the case in radio interferometry, we have the approximate linear relation

r_{\hat{x}\hat{y}}[m] \cong \frac{2}{\pi}\, r_{xy}[m].   (91)

Although we derived equations (88) and (90) assuming r_{xy}^2 < 1, it is worth confirming here that the resultant equations remain valid in the limiting cases of complete correlation (r_{xy} = 1 and r_{\hat{x}\hat{y}} = 1) and complete anti-correlation (r_{xy} = -1 and r_{\hat{x}\hat{y}} = -1), too.

Equation (88) or (90) is generally called the "van Vleck relationship". This is indeed a surprising result, which shows that the cross-correlation coefficient r_{xy}[m] of the original analog data is almost completely restored from the cross-correlation coefficient r_{\hat{x}\hat{y}}[m] of the 1-bit quantized data by the simple equation

r_{xy}[m] = \sin\left( \frac{\pi}{2}\, r_{\hat{x}\hat{y}}[m] \right).   (92)

In the case of a small cross-correlation coefficient, | r_{xy}[m] | \ll 1, we have

r_{xy}[m] \cong \frac{\pi}{2}\, r_{\hat{x}\hat{y}}[m].   (93)

As we saw before, the cross-correlation R_{\hat{x}\hat{y}}[m] (= r_{\hat{x}\hat{y}}[m]) of digital data is readily obtained from a digital correlator. Therefore, equations (92) and (93) mean that we can completely recover the functional form, and therefore the spectral shape, of the original cross-correlation coefficient from the output R_{\hat{x}\hat{y}}[m] of a digital correlator operating on 1-bit quantized data, though the amplitude is reduced by a factor of \cong 2/\pi (see, for example, Figure 21). It is natural that the cross-correlation R_{xy}(mT) itself of the original analog data cannot be directly obtained from the 1-bit quantized data alone, which carry sign information only. Nevertheless, we can restore the cross-correlation R_{xy}(mT) of the original analog data, if their dispersions R_{xx}(0)


Figure 21: Spectra of water maser lines derived from 1-bit quantized data. The right and left panels show water maser lines of the Asymptotic Giant Branch (AGB) stars RT Vir and VY CMa, respectively, obtained at two epochs (top and bottom panels). Thin lines show total-power (i.e. autocorrelation) spectra obtained with the Mizusawa 10 m antenna. Thick lines show VLBI cross-power spectra obtained with the Mizusawa 10 m - Kagoshima 6 m baseline. The cross-power spectra show lower flux densities compared with the total-power spectra because the maser features are slightly extended, and hence partially resolved, on the VLBI baseline. (Figure courtesy of Imai et al., Astron. Astrophys., 317, L67-L70, 1997.)



and R_{yy}(0) are known, for example, from suitable system-noise measurements.

The derivation of this relationship owes to the clever use of the probability relations in equation (74) by van Vleck. Note that J. H. van Vleck is the scientist who received the 1977 Nobel Prize in Physics for his major field of research: the "fundamental theoretical investigations of the electronic structure of magnetic and disordered systems".

1.1.21  van Vleck Relationship in Autocorrelation

Although we examined the derivation of the van Vleck relationship for the cross-correlation, there is no restriction on applying the same logic to the autocorrelation of a zero-mean stationary random process x(t). As a matter of fact, van Vleck originally derived his relationship for the autocorrelation. The only difference is that we have to use, instead of equation (68), the second-order Gaussian probability density for values of x(t) taken at times t + \tau and t:

f(x_1, x_2; \tau) = \frac{1}{2 \pi \sigma_x^2 \sqrt{1 - r_{xx}^2(\tau)}} \exp\left[ -\frac{x_1^2 - 2\, r_{xx}(\tau)\, x_1\, x_2 + x_2^2}{2\, \sigma_x^2\, (1 - r_{xx}^2(\tau))} \right],   (94)

where \sigma_x^2 = R_{xx}(0) is the dispersion and r_{xx}(\tau) = \frac{R_{xx}(\tau)}{R_{xx}(0)} is the correlation coefficient of x(t). Exactly the same logic as in the cross-correlation case leads to the equation

r_{\hat{x}\hat{x}}[m] = R_{\hat{x}\hat{x}}[m] = \frac{2}{\pi} \arcsin(r_{xx}[m]) = \frac{2}{\pi} \arcsin \frac{R_{xx}(mT)}{R_{xx}(0)},   (95)

for the correlation coefficient of the 1-bit quantized data \hat{x}[i], which is the van Vleck relationship in the autocorrelation.

1.1.22  Spectra of Clipped Data

The derivation of the van Vleck relationship, as we saw above, is quite general and applicable to any data obeying a joint Gaussian probability density. In particular, the relationship is equally valid for continuous-time data, though we have dealt only with sampled data in our discussion of the digitization of analog data. Thus, we generally have

R_{\hat{x}\hat{y}}(\tau) = \frac{2}{\pi} \arcsin(r_{xy}(\tau)),   (96)

for the cross-correlation, and

R_{\hat{x}\hat{x}}(\tau) = \frac{2}{\pi} \arcsin(r_{xx}(\tau)), \qquad R_{\hat{y}\hat{y}}(\tau) = \frac{2}{\pi} \arcsin(r_{yy}(\tau)),   (97)

for the autocorrelations, with arbitrary time interval \tau. Therefore, we can calculate the cross-power spectrum S_{\hat{x}\hat{y}}(\omega), and the power spectra S_{\hat{x}\hat{x}}(\omega) and S_{\hat{y}\hat{y}}(\omega), of the clipped data in terms of ordinary Fourier transformations:

S_{\hat{x}\hat{y}}(\omega) = \int_{-\infty}^{+\infty} R_{\hat{x}\hat{y}}(\tau)\, e^{-i \omega \tau}\, d\tau = \frac{2}{\pi} \int_{-\infty}^{+\infty} \arcsin(r_{xy}(\tau))\, e^{-i \omega \tau}\, d\tau,
S_{\hat{x}\hat{x}}(\omega) = \int_{-\infty}^{+\infty} R_{\hat{x}\hat{x}}(\tau)\, e^{-i \omega \tau}\, d\tau = \frac{2}{\pi} \int_{-\infty}^{+\infty} \arcsin(r_{xx}(\tau))\, e^{-i \omega \tau}\, d\tau,
S_{\hat{y}\hat{y}}(\omega) = \int_{-\infty}^{+\infty} R_{\hat{y}\hat{y}}(\tau)\, e^{-i \omega \tau}\, d\tau = \frac{2}{\pi} \int_{-\infty}^{+\infty} \arcsin(r_{yy}(\tau))\, e^{-i \omega \tau}\, d\tau.   (98)

Let us consider the case when the analog data have rectangular spectra. This is an important case from a practical point of view, since rectangular bandpass filters are widely adopted in radio astronomy. We saw earlier that the correlation coefficients of analog data take sinc-function forms when their spectra are rectangular. Then, how will the spectra of the data look after clipping?

Figure 22 shows the power spectrum S_{\hat{x}\hat{x}}(\omega) of 1-bit quantized data (solid line in the bottom panel) which is derived from analog data with a rectangular power spectrum S_{xx}(\omega) of bandwidth B (broken line in the bottom panel). The power spectrum of the analog data, S_{xx}(\omega), is normalized here by the dispersion R_{xx}(0). In other words, the "power spectrum" in this Figure is the Fourier transform of the correlation coefficient r_{xx}(\tau). Therefore, the areas under the spectra (i.e. the powers) of the analog and clipped data are equal. Also shown are (1) the correlation coefficient r_{xx}(\tau) of the analog data, having a sinc-function form (upper left panel), and (2) the correlation coefficient r_{\hat{x}\hat{x}}(\tau) = R_{\hat{x}\hat{x}}(\tau) = \frac{2}{\pi} \arcsin(r_{xx}(\tau)) of the clipped data (upper right panel).

Although the bulk of the spectrum after clipping still remains nearly band-limited, the peak amplitude is somewhat reduced, and a low-level skirt appears outside the original band, spreading over a wide range of frequency. Thus, the original Nyquist interval 1/(2B) is no longer strictly optimum for sampling the clipped data.

On the other hand, Figure 23 shows the cross-power spectrum S_{\hat{x}\hat{y}}(\omega) of 1-bit quantized data \hat{x} and \hat{y} (solid line in the bottom panel) which are derived from


Figure 22: Original rectangular power spectrum S_{xx}(\omega) with bandwidth B (broken line in the bottom panel) of the analog data, and the power spectrum S_{\hat{x}\hat{x}}(\omega) (solid line in the bottom panel) of the clipped data, derived from the analog data by means of 1-bit quantization. The horizontal axis of the bottom panel shows frequency normalized by the bandwidth B. The upper panels show the correlation coefficient r_{xx}(\tau) of the original analog data, having a sinc-function form (left), and the correlation coefficient r_{\hat{x}\hat{x}}(\tau) of the clipped data (right). The horizontal axes of the upper panels show delay normalized by the Nyquist interval 1/(2B) of the original analog data.
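The band-limited case shown in Figure 22 can be reproduced numerically: tabulate the sinc-form correlation coefficient, apply the arcsine relation (96), and Fourier transform both. The sketch below is illustrative and not part of the original text; the grid sizes are arbitrary, and B is set to 1. It exhibits the out-of-band skirt of the clipped spectrum while the total power is conserved.

```python
import numpy as np

# Correlation coefficient of band-limited analog data, r(tau) = sinc(2 B tau)
# (rectangular baseband spectrum of bandwidth B), and of its 1-bit clipped
# version, eq. (96): (2/pi) arcsin(r).
B = 1.0
n = 8192
dt = 1.0 / (32.0 * B)                     # fine sampling of the lag axis
tau = (np.arange(n) - n // 2) * dt
r_analog = np.sinc(2.0 * B * tau)         # np.sinc(u) = sin(pi u)/(pi u)
r_clip = (2.0 / np.pi) * np.arcsin(r_analog)

f = np.fft.fftshift(np.fft.fftfreq(n, d=dt))
S_analog = np.real(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(r_analog)))) * dt
S_clip = np.real(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(r_clip)))) * dt

df = f[1] - f[0]
out = np.abs(f) > 1.1 * B                 # frequencies outside the band
print("total powers:", S_analog.sum() * df, S_clip.sum() * df)  # both ~ r(0) = 1
print("out-of-band power, clipped:", np.abs(S_clip[out]).sum() * df)
```

The out-of-band power of the clipped data (a few per cent of the total) is exactly the "skirt" visible in the bottom panel of Figure 22.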


Figure 23: Original rectangular cross-power spectrum S_{xy}(\omega) with bandwidth B (broken line in the bottom panel) of the analog data, and the cross-power spectrum S_{\hat{x}\hat{y}}(\omega) (solid line in the bottom panel) of the clipped data, derived from the analog data by means of 1-bit quantization. The horizontal axis of the bottom panel shows frequency normalized by the bandwidth B. The upper panels show the cross-correlation coefficient r_{xy}(\tau) of the original analog data, having a sinc-function form (left), and the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) of the clipped data (right). The maximum cross-correlation coefficient of the analog data is assumed to be 0.001. The horizontal axes of the upper panels show delay normalized by the Nyquist interval 1/(2B) of the original analog data.


analog data x and y having a real rectangular cross-power spectrum S_{xy}(\omega) of bandwidth B (broken line in the bottom panel). The cross-power spectrum of the analog data, S_{xy}(\omega), in this Figure is normalized by the geometric mean of the dispersions, \sqrt{R_{xx}(0)\, R_{yy}(0)}, and is therefore the Fourier transform of the cross-correlation coefficient r_{xy}(\tau). We assume here that the maximum cross-correlation coefficient of the analog data is small and, for definiteness, set it to 0.001. The upper panels of the Figure show (1) the cross-correlation coefficient r_{xy}(\tau) of the analog data, having a sinc-function form (left), and (2) the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) = R_{\hat{x}\hat{y}}(\tau) \cong \frac{2}{\pi}\, r_{xy}(\tau) of the clipped data (right).

Unlike in the power-spectrum case, the spectrum after clipping remains rectangular in this cross-power-spectrum case, though the peak amplitude is reduced by a factor of 2/\pi \cong 0.6366 compared with the original analog one. This is because the cross-correlation coefficient of the clipped data is just proportional to the cross-correlation coefficient of the original analog data, with a proportionality coefficient of 2/\pi, in the approximation of a small cross-correlation coefficient, | r_{xy}(\tau) | \ll 1, which we assumed here.

1.1.23  Price's Theorem

Now we proceed to the 2-bit quantization case. Although we could derive the cross-correlation of the 2-bit quantized data by evaluating the probabilities of the quantization states, as we learned in the 1-bit quantization case, use of Price's theorem (Price, 1958) offers a simpler solution. The theorem states the following.

Suppose we have zero-mean stationary random processes x(t) and y(t) which satisfy the joint Gaussian probability density

f(x, y) \equiv f(x, y; \tau) = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1 - r^2}} \exp\left[ -\frac{1}{2\, (1 - r^2)} \left( \frac{x^2}{\sigma_x^2} - \frac{2\, r\, x\, y}{\sigma_x \sigma_y} + \frac{y^2}{\sigma_y^2} \right) \right],

where \sigma_x^2 = R_{xx}(0) and \sigma_y^2 = R_{yy}(0) are the dispersions and r \equiv r_{xy}(\tau) = \frac{R_{xy}(\tau)}{\sigma_x \sigma_y} is the cross-correlation coefficient of the processes x(t) and y(t). Then, for an arbitrary function g(x, y) with expectation

\langle g(x, y) \rangle = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f(x, y)\, dx\, dy,   (99)

we have

\frac{1}{\sigma_x^n \sigma_y^n} \frac{\partial^n \langle g(x, y) \rangle}{\partial r^n} = \left\langle \frac{\partial^{2n} g(x, y)}{\partial x^n\, \partial y^n} \right\rangle   (100)

(Price, 1958).

Auxiliary Formula

In order to prove Price's theorem, we first derive an auxiliary formula (Papoulis, 1984):

\frac{1}{\sigma_x^n \sigma_y^n} \frac{\partial^n f(x, y)}{\partial r^n} = \frac{\partial^{2n} f(x, y)}{\partial x^n\, \partial y^n}, \quad \text{or} \quad \frac{\partial^n f'(x', y')}{\partial r^n} = \frac{\partial^{2n} f'(x', y')}{\partial x'^n\, \partial y'^n},   (101)

where

x' = \frac{x}{\sigma_x}, \qquad y' = \frac{y}{\sigma_y}, \qquad f'(x', y') = \frac{1}{2 \pi \sqrt{1 - r^2}}\, \exp\left[ -\frac{x'^2 - 2\, r\, x'\, y' + y'^2}{2\, (1 - r^2)} \right].   (102)

We verify this formula using the method of mathematical induction.

- For n = 1, simple differentiations show

\frac{\partial f'(x', y')}{\partial r} = \frac{r\, (1 - r^2) + (r\, x' - y')(r\, y' - x')}{2 \pi\, (1 - r^2)^{5/2}}\, \exp\left[ -\frac{x'^2 - 2\, r\, x'\, y' + y'^2}{2\, (1 - r^2)} \right],

and

\frac{\partial^2 f'(x', y')}{\partial x'\, \partial y'} = \frac{r\, (1 - r^2) + (r\, x' - y')(r\, y' - x')}{2 \pi\, (1 - r^2)^{5/2}}\, \exp\left[ -\frac{x'^2 - 2\, r\, x'\, y' + y'^2}{2\, (1 - r^2)} \right].

Therefore, \frac{\partial f'}{\partial r} = \frac{\partial^2 f'}{\partial x'\, \partial y'}, i.e. the formula is valid for n = 1.

- Now, if the formula is valid for n = m:

\frac{\partial^m f'(x', y')}{\partial r^m} = \frac{\partial^{2m} f'(x', y')}{\partial x'^m\, \partial y'^m},

then, for n = m + 1, we have

\frac{\partial^{m+1} f'(x', y')}{\partial r^{m+1}} = \frac{\partial}{\partial r} \frac{\partial^m f'(x', y')}{\partial r^m} = \frac{\partial}{\partial r} \frac{\partial^{2m} f'(x', y')}{\partial x'^m\, \partial y'^m} = \frac{\partial^{2m}}{\partial x'^m\, \partial y'^m} \frac{\partial f'(x', y')}{\partial r} = \frac{\partial^{2m}}{\partial x'^m\, \partial y'^m} \frac{\partial^2 f'(x', y')}{\partial x'\, \partial y'} = \frac{\partial^{2(m+1)} f'(x', y')}{\partial x'^{m+1}\, \partial y'^{m+1}},

i.e. the formula is also valid for n = m + 1. Since the formula is valid for n = 1, it is valid for arbitrary n \ge 1. Thus we have confirmed the auxiliary formula.

Proof of Price's Theorem

We now prove Price's theorem, again using the method of mathematical induction.

- For n = 1, we have

\frac{1}{\sigma_x \sigma_y} \frac{\partial \langle g(x, y) \rangle}{\partial r} = \frac{1}{\sigma_x \sigma_y} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g\, \frac{\partial f}{\partial r}\, dx\, dy = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g\, \frac{\partial^2 f}{\partial x\, \partial y}\, dx\, dy

= \int_{-\infty}^{+\infty} \left[ g\, \frac{\partial f}{\partial y} \right]_{x=-\infty}^{x=+\infty} dy - \int_{-\infty}^{+\infty} \left[ \frac{\partial g}{\partial x}\, f \right]_{y=-\infty}^{y=+\infty} dx + \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial^2 g}{\partial x\, \partial y}\, f\, dx\, dy,

where we used the auxiliary formula

\frac{1}{\sigma_x \sigma_y} \frac{\partial f(x, y)}{\partial r} = \frac{\partial^2 f(x, y)}{\partial x\, \partial y},

and then integrated by parts twice. The first two terms in the RHS vanish as long as g(x, y)\, e^{-(x^2 + y^2)} converges to zero at x = \pm\infty and y = \pm\infty. Thus,

\frac{1}{\sigma_x \sigma_y} \frac{\partial \langle g(x, y) \rangle}{\partial r} = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial^2 g(x, y)}{\partial x\, \partial y}\, f(x, y)\, dx\, dy = \left\langle \frac{\partial^2 g(x, y)}{\partial x\, \partial y} \right\rangle,

i.e. the theorem is valid for n = 1.

- Now, if the theorem is valid for n = m:

\frac{1}{\sigma_x^m \sigma_y^m} \frac{\partial^m \langle g(x, y) \rangle}{\partial r^m} = \left\langle \frac{\partial^{2m} g(x, y)}{\partial x^m\, \partial y^m} \right\rangle,

then, for n = m + 1, we have

\frac{1}{\sigma_x^{m+1} \sigma_y^{m+1}} \frac{\partial^{m+1} \langle g(x, y) \rangle}{\partial r^{m+1}} = \frac{1}{\sigma_x \sigma_y} \frac{\partial}{\partial r} \left( \frac{1}{\sigma_x^m \sigma_y^m} \frac{\partial^m \langle g(x, y) \rangle}{\partial r^m} \right) = \frac{1}{\sigma_x \sigma_y} \frac{\partial}{\partial r} \left\langle \frac{\partial^{2m} g(x, y)}{\partial x^m\, \partial y^m} \right\rangle

= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial^{2m} g(x, y)}{\partial x^m\, \partial y^m}\, \frac{1}{\sigma_x \sigma_y} \frac{\partial f(x, y)}{\partial r}\, dx\, dy = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial^{2m} g(x, y)}{\partial x^m\, \partial y^m}\, \frac{\partial^2 f(x, y)}{\partial x\, \partial y}\, dx\, dy.

Integrating by parts again, we see that the last term is equal to

\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial^{2(m+1)} g(x, y)}{\partial x^{m+1}\, \partial y^{m+1}}\, f(x, y)\, dx\, dy = \left\langle \frac{\partial^{2(m+1)} g(x, y)}{\partial x^{m+1}\, \partial y^{m+1}} \right\rangle,
as long as g(x, y)\, e^{-(x^2 + y^2)} converges to zero at x = \pm\infty and y = \pm\infty. Therefore, the theorem is valid for n = m + 1 and, hence, for any n \ge 1. Thus we have proved Price's theorem.

Price's Theorem in the 1-Bit Quantization Case

Let us experience the usefulness of Price's theorem, taking the 1-bit quantization case as an example. If we choose g(x, y) = \hat{x}\, \hat{y}, then we have

\langle g(x, y) \rangle = \langle \hat{x}\, \hat{y} \rangle = R_{\hat{x}\hat{y}},

with

\hat{x}(x) = +1 \ \text{for}\ x \ge 0, \quad \hat{x}(x) = -1 \ \text{for}\ x < 0; \qquad \hat{y}(y) = +1 \ \text{for}\ y \ge 0, \quad \hat{y}(y) = -1 \ \text{for}\ y < 0,

in the 1-bit quantization case. We again assume that x and y obey the joint Gaussian probability density

f(x, y) = \frac{1}{2 \pi \sigma_x \sigma_y \sqrt{1 - r^2}} \exp\left[ -\frac{1}{2\, (1 - r^2)} \left( \frac{x^2}{\sigma_x^2} - \frac{2\, r\, x\, y}{\sigma_x \sigma_y} + \frac{y^2}{\sigma_y^2} \right) \right].

Now, Price's theorem says

\frac{1}{\sigma_x \sigma_y} \frac{\partial R_{\hat{x}\hat{y}}}{\partial r} = \left\langle \frac{\partial^2 (\hat{x}\, \hat{y})}{\partial x\, \partial y} \right\rangle = \left\langle \frac{\partial \hat{x}}{\partial x}\, \frac{\partial \hat{y}}{\partial y} \right\rangle,

and we have

\frac{\partial \hat{x}}{\partial x} = 2\, \delta(x), \qquad \frac{\partial \hat{y}}{\partial y} = 2\, \delta(y),

in the 1-bit quantization case. Therefore,

\left\langle \frac{\partial \hat{x}}{\partial x}\, \frac{\partial \hat{y}}{\partial y} \right\rangle = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} 4\, \delta(x)\, \delta(y)\, f(x, y)\, dx\, dy = \frac{2}{\pi \sigma_x \sigma_y \sqrt{1 - r^2}}.

Thus we have

\frac{\partial R_{\hat{x}\hat{y}}}{\partial r} = \frac{2}{\pi \sqrt{1 - r^2}}, \quad \text{and hence} \quad R_{\hat{x}\hat{y}} = \frac{2}{\pi} \int_{0}^{r} \frac{dr'}{\sqrt{1 - r'^2}} = \frac{2}{\pi} \arcsin(r),
where the limits of the integration are chosen to satisfy R_{\hat{x}\hat{y}} = 0 at r = 0. Thus, Price's theorem allows us to derive the van Vleck relationship in a really straightforward way.

1.1.24  Cross-Correlation of the 2-Bit Quantized Data

Now we consider, for definiteness, the cross-correlation R_{\hat{x}\hat{y}}(\tau) of 2-bit quantized data \hat{x}(t) and \hat{y}(t), though the logic below is equally applicable to the autocorrelation as well. Let the clipping criteria for the 2-bit quantization be those given in Figure 17 and Table 1. We assume a simple case where the dispersions of the original analog data x(t) and y(t) are identical, and denote the common value as \sigma, i.e. \sigma_x = \sigma_y = \sigma. Then the joint probability density is given by

f(x, y) \equiv f(x, y; \tau) = \frac{1}{2 \pi \sigma^2 \sqrt{1 - r^2}}\, \exp\left[ -\frac{x^2 - 2\, r\, x\, y + y^2}{2\, \sigma^2\, (1 - r^2)} \right],   (103)

where r \equiv r_{xy}(\tau) is the cross-correlation coefficient of the analog data. According to Price's theorem, the cross-correlation of the clipped data, R_{\hat{x}\hat{y}} \equiv R_{\hat{x}\hat{y}}(\tau), satisfies

\frac{1}{\sigma^2} \frac{\partial R_{\hat{x}\hat{y}}}{\partial r} = \left\langle \frac{\partial^2 (\hat{x}\, \hat{y})}{\partial x\, \partial y} \right\rangle = \left\langle \frac{\partial \hat{x}}{\partial x}\, \frac{\partial \hat{y}}{\partial y} \right\rangle,

where the derivatives are now given by

\frac{\partial \hat{x}}{\partial x} = (n - 1)\, \delta(x + v_0) + 2\, \delta(x) + (n - 1)\, \delta(x - v_0),
\frac{\partial \hat{y}}{\partial y} = (n - 1)\, \delta(y + v_0) + 2\, \delta(y) + (n - 1)\, \delta(y - v_0).

Using these derivatives, we can calculate

\frac{\partial R_{\hat{x}\hat{y}}}{\partial r} = \sigma^2 \left\langle \frac{\partial \hat{x}}{\partial x}\, \frac{\partial \hat{y}}{\partial y} \right\rangle = \sigma^2 \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{\partial \hat{x}}{\partial x}\, \frac{\partial \hat{y}}{\partial y}\, f(x, y)\, dx\, dy.

After simple manipulations, we obtain

\frac{\partial R_{\hat{x}\hat{y}}}{\partial r} = \frac{1}{\pi \sqrt{1 - r^2}} \left[ (n - 1)^2 \left( e^{-\frac{v_0^2}{\sigma^2 (1 + r)}} + e^{-\frac{v_0^2}{\sigma^2 (1 - r)}} \right) + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2 (1 - r^2)}} + 2 \right].
Therefore, the solution which satisfies the condition R_{\hat{x}\hat{y}} = 0 at r = 0 is

R_{\hat{x}\hat{y}} = \frac{1}{\pi} \int_{0}^{r} \frac{1}{\sqrt{1 - r'^2}} \left[ (n - 1)^2 \left( e^{-\frac{v_0^2}{\sigma^2 (1 + r')}} + e^{-\frac{v_0^2}{\sigma^2 (1 - r')}} \right) + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2 (1 - r'^2)}} + 2 \right] dr'.   (104)

In the limiting case where | r | \ll 1, the cross-correlation is given by

R_{\hat{x}\hat{y}} \cong \frac{r}{\pi} \left[ 2\, (n - 1)^2\, e^{-\frac{v_0^2}{\sigma^2}} + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2}} + 2 \right] = \frac{2\, r}{\pi} \left[ (n - 1)\, E + 1 \right]^2,   (105)

where we introduced the notation

E \equiv e^{-\frac{v_0^2}{2 \sigma^2}}.   (106)
Note that the cross-correlation R_{\hat{x}\hat{y}}(\tau) of the clipped data is proportional to the analog cross-correlation coefficient r = r_{xy}(\tau) in this case.

1.1.25  Cross-Correlation Coefficient of the 2-Bit Quantized Data

Now we would like to calculate the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) of the 2-bit quantized data. For this purpose, we need their dispersions, \sigma_{\hat{x}}^2 = R_{\hat{x}\hat{x}}(0) and \sigma_{\hat{y}}^2 = R_{\hat{y}\hat{y}}(0), in addition to the cross-correlation R_{\hat{x}\hat{y}}(\tau).


We assumed here that the analog data x(t) and y(t) have equal dispersions, R_{xx}(0) = R_{yy}(0) = \sigma^2, and therefore they obey identical Gaussian probability densities:

f(x) = \frac{1}{\sqrt{2 \pi}\, \sigma}\, e^{-\frac{x^2}{2 \sigma^2}}, \qquad f(y) = \frac{1}{\sqrt{2 \pi}\, \sigma}\, e^{-\frac{y^2}{2 \sigma^2}}.   (107)

Since the 4 quantization states of the 2-bit clipped data take the values -n, -1, +1, and +n, the joint probabilities of the quantization states of \hat{x}[n] and \hat{x}[n] (or \hat{y}[n] and \hat{y}[n]), taken at the same time, satisfy

P(-n, -n; 0) + P(-1, -1; 0) + P(+1, +1; 0) + P(+n, +n; 0) = 1,   (108)

and the dispersion, R_{\hat{x}\hat{x}}(0) = \langle \hat{x}(t)\, \hat{x}(t) \rangle = \sum_{i=1}^{4} x_i\, x_i\, P(x_i, x_i; 0), must be

R_{\hat{x}\hat{x}}(0) = n^2\, P(-n, -n; 0) + P(-1, -1; 0) + P(+1, +1; 0) + n^2\, P(+n, +n; 0).   (109)

Combining equations (108) and (109), and repeating the same for \hat{y}, we obtain

R_{\hat{x}\hat{x}}(0) = R_{\hat{y}\hat{y}}(0) = \Phi + n^2\, (1 - \Phi),   (110)

where we introduced the notation

\Phi \equiv P(-1, -1; 0) + P(+1, +1; 0) = \frac{1}{\sqrt{2 \pi}\, \sigma} \int_{-v_0}^{+v_0} e^{-\frac{\xi^2}{2 \sigma^2}}\, d\xi.   (111)
We used here the explicit form of the probability density in equation (107). Thus, the cross-correlation coefficient of the 2-bit quantized data, r_{\hat{x}\hat{y}}(\tau), is given by

r_{\hat{x}\hat{y}}(\tau) = \frac{R_{\hat{x}\hat{y}}(\tau)}{\sqrt{R_{\hat{x}\hat{x}}(0)\, R_{\hat{y}\hat{y}}(0)}} = \frac{R_{\hat{x}\hat{y}}(\tau)}{\Phi + n^2\, (1 - \Phi)}
= \frac{1}{\pi\, [\Phi + n^2\, (1 - \Phi)]} \int_{0}^{r_{xy}(\tau)} \frac{1}{\sqrt{1 - r'^2}} \left[ (n - 1)^2 \left( e^{-\frac{v_0^2}{\sigma^2 (1 + r')}} + e^{-\frac{v_0^2}{\sigma^2 (1 - r')}} \right) + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2 (1 - r'^2)}} + 2 \right] dr',   (112)

as a function of the cross-correlation coefficient of the original analog data, r = r_{xy}(\tau). This is an analogue of the van Vleck relationship in the 2-bit quantization case.

In the limiting case of | r_{xy}(\tau) | \ll 1, equations (105) and (110) yield

r_{\hat{x}\hat{y}}(\tau) \cong \frac{2\, [(n - 1)\, E + 1]^2}{\pi\, [\Phi + n^2\, (1 - \Phi)]}\, r_{xy}(\tau).   (113)

This is an analogue of equation (91) in the 2-bit quantization case.
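For the parameter values used in the text (n = 3, v_0 = 0.996\, \sigma), the proportionality coefficient in equation (113) can be evaluated directly from equations (106) and (111). The sketch below (not part of the original text; \sigma is set to 1, so v_0 is in units of the analog rms) gives approximately 0.88, noticeably better than the 1-bit value 2/\pi \cong 0.64.

```python
from math import erf, exp, sqrt, pi

# Proportionality factor of eq. (113), with sigma = 1
n = 3.0
v0 = 0.996

E = exp(-v0**2 / 2.0)          # eq. (106)
Phi = erf(v0 / sqrt(2.0))      # eq. (111): P(|x| < v0) for unit-variance x
factor = 2.0 * ((n - 1.0) * E + 1.0)**2 / (pi * (Phi + n**2 * (1.0 - Phi)))
print(round(factor, 3))        # -> 0.881
```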
Figure 24: Correlation coefficients of 1-bit (dotted line) and 2-bit (solid line) quantized data with n = 3 and v_0 = 0.996\, \sigma, as functions of the analog correlation coefficient.

Exactly the same logic leads to the same functional forms as equations (112) and (113) for the relationship between the correlation coefficient r_{\hat{x}\hat{x}}(\tau) of the 2-bit quantized data and that of the original analog data, r_{xx}(\tau). Thus the correlation coefficient of the 2-bit quantized data is given by

r_{\hat{x}\hat{x}}(\tau) = \frac{1}{\pi\, [\Phi + n^2\, (1 - \Phi)]} \int_{0}^{r_{xx}(\tau)} \frac{1}{\sqrt{1 - r'^2}} \left[ (n - 1)^2 \left( e^{-\frac{v_0^2}{\sigma^2 (1 + r')}} + e^{-\frac{v_0^2}{\sigma^2 (1 - r')}} \right) + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2 (1 - r'^2)}} + 2 \right] dr',   (114)

for the general case, and

r_{\hat{x}\hat{x}}(\tau) \cong \frac{2\, [(n - 1)\, E + 1]^2}{\pi\, [\Phi + n^2\, (1 - \Phi)]}\, r_{xx}(\tau),   (115)

for the limiting case of | r_{xx}(\tau) | \ll 1.

Figure 24 shows the relationship between the correlation coefficient (which could be either the autocorrelation coefficient or the cross-correlation coefficient) of the clipped data and that of the original analog data. The dotted line shows the van Vleck relationship for the 1-bit quantization case, as given by equation (90) or (95). The solid line shows the relationship for the 2-bit quantization case, as given by equation (112) or (114), for the particular set of parameters n = 3 and v_0 = 0.996\, \sigma.

1.1.26  Power and Cross-Power Spectra of 2-Bit Quantized Data

We can now calculate the power spectrum S_{\hat{x}\hat{x}}(\omega) and the cross-power spectrum S_{\hat{x}\hat{y}}(\omega) of the 2-bit quantized data, normalized by the dispersion \sigma_{\hat{x}}^2 and by the geometric mean of the dispersions, \sigma_{\hat{x}} \sigma_{\hat{y}}, respectively, by Fourier transforming the correlation coefficient r_{\hat{x}\hat{x}}(\tau) and the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) given in equation (112):

S_{\hat{x}\hat{x}}(\omega) = \int_{-\infty}^{+\infty} r_{\hat{x}\hat{x}}(\tau)\, e^{-i \omega \tau}\, d\tau = \int_{-\infty}^{+\infty} F(r_{xx}(\tau))\, e^{-i \omega \tau}\, d\tau,
S_{\hat{x}\hat{y}}(\omega) = \int_{-\infty}^{+\infty} r_{\hat{x}\hat{y}}(\tau)\, e^{-i \omega \tau}\, d\tau = \int_{-\infty}^{+\infty} F(r_{xy}(\tau))\, e^{-i \omega \tau}\, d\tau,   (116)

where the function F(r) is given by

F(r) = \frac{1}{\pi\, [\Phi + n^2\, (1 - \Phi)]} \int_{0}^{r} \frac{1}{\sqrt{1 - r'^2}} \left[ (n - 1)^2 \left( e^{-\frac{v_0^2}{\sigma^2 (1 + r')}} + e^{-\frac{v_0^2}{\sigma^2 (1 - r')}} \right) + 4\, (n - 1)\, e^{-\frac{v_0^2}{2 \sigma^2 (1 - r'^2)}} + 2 \right] dr'.

Let us again consider the case when the analog data have rectangular spectra. Figure 25 shows the power spectrum S_{\hat{x}\hat{x}}(\omega) of the 2-bit quantized data for the case with n = 3 and v_0 = 0.996\, \sigma (solid line in the bottom panel), and that of the original analog data, S_{xx}(\omega), with a rectangular shape of bandwidth B (broken line in the bottom panel). Similarly to the 1-bit case, the areas under the spectra (i.e. the powers) of the analog and clipped data are equal. Also shown are (1) the correlation coefficient r_{xx}(\tau) of the analog data, having a sinc-function


Figure 25: Original rectangular power spectrum S_{xx}(\omega) with bandwidth B (broken line in the bottom panel) of the analog data, and the power spectrum S_{\hat{x}\hat{x}}(\omega) (solid line in the bottom panel) of the 2-bit quantized data with n = 3 and v_0 = 0.996\, \sigma. The horizontal axis of the bottom panel shows frequency normalized by the bandwidth B. The upper panels show the correlation coefficient r_{xx}(\tau) of the original analog data, having a sinc-function form (left), and the correlation coefficient r_{\hat{x}\hat{x}}(\tau) of the 2-bit quantized data (right). The horizontal axes of the upper panels show delay normalized by the Nyquist interval 1/(2B) of the original analog data.


Figure 26: Original rectangular cross-power spectrum S_{xy}(\omega) with bandwidth B (broken line in the bottom panel) of the analog data, and the cross-power spectrum S_{\hat{x}\hat{y}}(\omega) (solid line in the bottom panel) of the 2-bit quantized data with n = 3 and v_0 = 0.996\, \sigma. The horizontal axis of the bottom panel shows frequency normalized by the bandwidth B. The upper panels show the cross-correlation coefficient r_{xy}(\tau) of the original analog data, having a sinc-function form (left), and the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) of the 2-bit quantized data (right). The maximum cross-correlation coefficient of the analog data is assumed to be 0.001. The horizontal axes of the upper panels show delay normalized by the Nyquist interval 1/(2B) of the original analog data.


form (upper left panel), and (2) the correlation coefficient r_{\hat{x}\hat{x}}(\tau) of the 2-bit quantized data (upper right panel).

The spectrum after clipping again shows a somewhat reduced peak amplitude and a low-level skirt extending over a wide range of frequency. Thus, the original Nyquist interval 1/(2B) is again not strictly optimum for sampling the clipped data, though to a smaller extent compared with the 1-bit case.

On the other hand, Figure 26 shows the cross-power spectrum S_{\hat{x}\hat{y}}(\omega) of the 2-bit quantized data \hat{x} and \hat{y} for the case with n = 3 and v_0 = 0.996\, \sigma (solid line in the bottom panel), and that of the original analog data x and y, having a real rectangular cross-power spectrum S_{xy}(\omega) of bandwidth B (broken line in the bottom panel). We assumed here again that the maximum cross-correlation coefficient of the analog data is as small as 0.001, for definiteness. The upper panels of the Figure show (1) the cross-correlation coefficient r_{xy}(\tau) of the analog data, having a sinc-function form (left), and (2) the cross-correlation coefficient r_{\hat{x}\hat{y}}(\tau) of the clipped data (right). Note that the spectrum after clipping again remains rectangular, because of the approximate linearity between the analog and clipped cross-correlation coefficients.

1.1.27  Dispersion of Digital Correlator Output

Let us now examine the signal-to-noise ratio of the output of a digital correlator. How does the S/N ratio differ from the one expected for the analog or "semi-analog" correlator, which we saw earlier? For this purpose, we first consider the dispersion of the digital correlator output.

The output R of a digital correlator, which averages products of N samples of clipped data \hat{x}[i] and \hat{y}[i] from two antennas of a radio interferometer, is

R = \frac{1}{N} \sum_{i=1}^{N} \hat{x}[i]\, \hat{y}[i].   (117)

We again assume here that the delay tracking and the fringe stopping are completely performed before the multiplication, so that the signal parts in the two data streams \hat{x}[i] and \hat{y}[i] are perfectly aligned. Under the assumption of jointly stationary random processes, the expectation \langle R \rangle and the dispersion \sigma_R^2 of the output R are given by

\langle R \rangle = \langle \hat{x}[i]\, \hat{y}[i] \rangle = R_{\hat{x}\hat{y}}[0],   (118)

\sigma_R^2 = \langle R^2 \rangle - \langle R \rangle^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \langle \hat{x}[i]\, \hat{y}[i]\, \hat{x}[j]\, \hat{y}[j] \rangle - R_{\hat{x}\hat{y}}^2[0].   (119)

Note that usually N is a huge numb er. For example, if we integrate data sampled with a rate of 4 Msps (mega sample p er second) for 1 second, then N 4, 000, 000. Let us assume that the original analog data x(t) and y (t) have rectangular baseband sp ectra with bandwith B , and the data are sampled with Nyquist rate 2B . In such a case, different sample pairs are indep endent in the analog data, as we discussed earlier, i.e. Rxy [m], Rxx [m], and Ryy [m] are all zero, if m = 0. The same statement is valid for the 1­bit quantized and 2­bit quantized data, since correlation co efficients, and correlations, to o, of these clipp ed data are equal to zero when those of the original analog data are zero, as we can easily see in equations (90), (95), and (112). Thus, Rxy [m], ^^ Rxx [m], and Ryy [m] are also all zero, if m = 0. In addition, we assume that ^^ ^^ the cross­correlation Rxy [0] is much smaller than the auto correlations Rxx [0] ^^ ^^ and Ryy [0], as usually so in radio interferometers. ^^ Then in the double sum of equation (119):
$$\sum_{i=1}^{N}\sum_{j=1}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[j]\,\hat{y}[j] \rangle = \sum_{i=1}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[i]\,\hat{y}[i] \rangle + \sum_{i=1}^{N}\sum_{j \neq i}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[j]\,\hat{y}[j] \rangle, \qquad (120)$$

the dominating terms will be

$$\sum_{i=1}^{N}\sum_{j=1}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[j]\,\hat{y}[j] \rangle \cong \sum_{i=1}^{N} \langle \hat{x}[i]\,\hat{x}[i] \rangle \langle \hat{y}[i]\,\hat{y}[i] \rangle + \sum_{i=1}^{N}\sum_{j \neq i}^{N} \langle \hat{x}[i]\,\hat{y}[i] \rangle \langle \hat{x}[j]\,\hat{y}[j] \rangle$$
$$= N\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] + N(N-1)\,R_{\hat{x}\hat{y}}^2[0] \cong N\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] + N^2\,R_{\hat{x}\hat{y}}^2[0], \qquad (121)$$

where we neglected 1 compared with the large number $N$. Now equations (119) and (121) yield an approximate formula for the dispersion of the digital correlator output:

$$\sigma_R^2 \cong \frac{1}{N}\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] + R_{\hat{x}\hat{y}}^2[0] - R_{\hat{x}\hat{y}}^2[0] = \frac{1}{N}\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0]. \qquad (122)$$

1.1.28 S/N Ratio of Digital Correlator Output

We are now ready to calculate the signal ($\langle R \rangle$) to noise ($\sigma_R$) ratio $SNR$ of the digital correlator output, using equation (122). We obtain

$$SNR = \frac{\langle R \rangle}{\sigma_R} = \sqrt{N}\,\frac{R_{\hat{x}\hat{y}}[0]}{\sqrt{R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0]}} = \sqrt{N}\,r_{\hat{x}\hat{y}}[0] = \sqrt{N}\,\frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,r_{xy}[0], \qquad (123)$$

where $r_{xy}[0]$ is the maximum cross-correlation coefficient of the original analog data, under our assumption of perfect delay tracking and fringe stopping. As we saw earlier, the maximum cross-correlation coefficient for a continuum spectrum source is approximately given by

$$r_{xy}[0] \cong \sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}},$$

where $T_{A1}$, $T_{A2}$ and $T_{S1}$, $T_{S2}$ are antenna temperatures and system noise temperatures, respectively, at antennas 1 and 2, both assumed constant throughout the frequency band $B$. On the other hand, if the integration time is $\tau_a$, the number of samples $N$ with the Nyquist interval $1/(2B)$ is equal to $N = 2B\tau_a$. Therefore, equation (123) reduces to

$$SNR = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,\sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}}\,\sqrt{2B\tau_a} = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,SNR_{analog}, \qquad (124)$$

where

$$SNR_{analog} = \sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}}\,\sqrt{2B\tau_a},$$
is the signal-to-noise ratio of an analog correlator output for the same continuum spectrum source received with the same antenna-receiver systems, as we saw in equation (65). Thus the so-called "coherence factor" $\eta_c$, which is defined as

$$\eta_c = \frac{SNR}{SNR_{analog}}, \qquad (125)$$

is given by

$$\eta_c = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}, \qquad (126)$$

in this case. For the case of the small cross-correlation coefficient $|r_{xy}[0]| \ll 1$, which is usually the case in radio interferometry, $r_{\hat{x}\hat{y}}[0]$ of the clipped data is proportional to $r_{xy}[0]$ of the original analog data. Then, in view of equations (91) and (113), we have

$$\eta_c = \frac{2}{\pi} \cong 0.64, \qquad (127)$$

for the 1-bit quantization case, and

$$\eta_c = \frac{2}{\pi}\,\frac{[(n-1)\,E + 1]^2}{\Phi + n^2\,(1 - \Phi)}, \qquad (128)$$


for the 2-bit quantization case. With the coherence factor $\eta_c$, equation (124) reduces to

$$SNR = \eta_c\,\sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}}\,\sqrt{2B\tau_a}, \qquad (129)$$

which is an important formula for estimating the sensitivity of a radio interferometer observing a continuum spectrum source. Note that the coherence factor as given by equation (126) still takes into account only the loss due to the effects of clipping. Usually, the factor is further reduced by losses occurring in the digital processing of hardware correlators.

1.1.29 Optimum Parameters v0 and n in the 2-Bit Quantization

We mentioned earlier that the two parameters in the 2­bit quantization, the threshold v0 and the higher­level value n, are chosen so that the signal­to­ noise ratio of the digital correlator output is maximized. This condition is
Figure 27: Coherence factor of the 2-bit quantized data as a function of the normalized threshold $v_0/\sigma$ and the higher-level value $n$, in the limiting case of the small cross-correlation coefficient.


fulfilled when the coherence factor $\eta_c$ takes its maximum. In the limiting case of the small cross-correlation coefficient $|r_{xy}[0]| \ll 1$, which is common in radio interferometry, we can calculate the coherence factor $\eta_c$ using equation (128) with the explicit forms of the functions $E$ and $\Phi$ given in equations (106) and (111). Figure 27 shows the coherence factor as a function of the two parameters, $v_0$ normalized by the analog dispersion $\sigma$, and $n$, in the small cross-correlation coefficient case. As we see from the Figure, the maximum value of the coherence factor $\eta_c = 0.883$ is obtained when $n = 3.34$ and $v_0 = 0.982\,\sigma$. From a practical point of view, however, the digital circuitry of 2-bit hardware correlators is more easily designed when $n$ is an integer. Therefore, $n = 3$, $v_0 = 0.996\,\sigma$ with $\eta_c = 0.881$, and $n = 4$, $v_0 = 0.942\,\sigma$ with $\eta_c = 0.880$, are often used in existing VLBI 2-bit quantization systems. This is why we adopted $n = 3$, $v_0 = 0.996\,\sigma$ in Figures 24, 25, and 26. Accordingly, in the limiting case of the small cross-correlation coefficient $|r_{xy}[0]| \ll 1$, the coherence factor, which takes into account the effect of clipping only, is given by

$$\eta_c \cong 0.64, \qquad (130)$$

for the 1-bit quantization case, and

$$\eta_c \cong 0.88, \qquad (131)$$

for the 2-bit quantization case.

1.1.30 Effect of Oversampling in S/N Ratio of Clipped Data
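The optimization above can be sketched numerically. The fragment below is a minimal illustration (not from the original text), assuming the common Gaussian forms $E = e^{-v_0^2/2\sigma^2}$ and $\Phi = \mathrm{erf}(v_0/\sqrt{2}\,\sigma)$ for the functions $E$ and $\Phi$ of equations (106) and (111); with equation (128) it reproduces the quoted optimum $\eta_c \approx 0.883$ near $n \approx 3.34$, $v_0 \approx 0.98\,\sigma$.

```python
import math

def eta_c(n, v0):
    """Coherence factor of 2-bit quantized data in the small-correlation
    limit, equation (128); E and Phi are assumed to be the Gaussian forms
    E = exp(-v0^2/2), Phi = erf(v0/sqrt(2)), with v0 in units of sigma."""
    E = math.exp(-0.5 * v0 * v0)
    Phi = math.erf(v0 / math.sqrt(2.0))
    return (2.0 / math.pi) * ((n - 1.0) * E + 1.0) ** 2 / (Phi + n * n * (1.0 - Phi))

# coarse grid search for the optimum (n, v0)
best = max(((eta_c(n / 100.0, v0 / 1000.0), n / 100.0, v0 / 1000.0)
            for n in range(200, 451)            # n in [2.00, 4.50]
            for v0 in range(800, 1201)),        # v0/sigma in [0.800, 1.200]
           key=lambda t: t[0])

print(best)                   # peak near eta_c ~ 0.883 at n ~ 3.34, v0 ~ 0.98 sigma
print(eta_c(3.0, 0.996))      # integer choice n = 3: ~0.881
print(eta_c(4.0, 0.942))      # integer choice n = 4: ~0.880
```

The flatness of the peak is what makes the integer choices $n = 3$ or $n = 4$ nearly lossless in practice.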

As we saw earlier, the power spectra of clipped data show low-level but broad skirts beyond the band edges of the original rectangular spectra of the corresponding analog data (Figures 22 and 25). Therefore, Nyquist sampling of the original analog data at the rate $2B$ is no longer optimum for the clipped data. Sampling at a rate higher than the analog Nyquist rate $2B$, which is a little incorrectly called "oversampling", improves the signal-to-noise ratio of the clipped data. In this case, different sample points are no longer independent, and we have to take into account contributions of autocorrelations between different sample points, i.e. $R_{\hat{x}\hat{x}}[m]$ and $R_{\hat{y}\hat{y}}[m]$ with $m \neq 0$, when we calculate the signal-to-noise ratio. Let us consider that we sample our data at a rate which is $\beta$ times the Nyquist rate $2B$ of the original band-limited analog data. Then, in the calculation of the dispersion $\sigma_R^2$ of the digital correlator output shown


in equations (119)-(122), we must keep the autocorrelation terms with non-zero arguments. Specifically, the second term in the RHS of equation (120) must now be

$$\sum_{i=1}^{N}\sum_{j \neq i}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[j]\,\hat{y}[j] \rangle \cong \sum_{i=1}^{N}\sum_{j \neq i}^{N} \left[ \langle \hat{x}[i]\,\hat{y}[i] \rangle \langle \hat{x}[j]\,\hat{y}[j] \rangle + \langle \hat{x}[i]\,\hat{x}[j] \rangle \langle \hat{y}[i]\,\hat{y}[j] \rangle \right],$$
where the products of the autocorrelations in the second term of the RHS could well be larger than the products of the cross-correlations in the first term in this oversampling case. Thus the dispersion now becomes

$$\sigma_R^2 = \frac{1}{N^2}\left[ \sum_{i=1}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[i]\,\hat{y}[i] \rangle + \sum_{i=1}^{N}\sum_{j \neq i}^{N} \langle \hat{x}[i]\,\hat{y}[i]\,\hat{x}[j]\,\hat{y}[j] \rangle \right] - R_{\hat{x}\hat{y}}^2[0]$$
$$\cong \frac{1}{N}\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] + \frac{1}{N^2} \sum_{i=1}^{N}\sum_{j \neq i}^{N} R_{\hat{x}\hat{x}}[i-j]\,R_{\hat{y}\hat{y}}[i-j], \qquad (132)$$

where the second term in the RHS stands for the contribution of the autocorrelations between different $i$-th and $j$-th sample points. Since the number of combinations of $i$ and $j$ having the same difference $k = i - j$ is $N - |k|$, and the autocorrelations are likely to be large enough for small $|k| \ll N$ only, we have

$$\sigma_R^2 \cong \frac{1}{N}\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] + \frac{2}{N^2} \sum_{k=1}^{N-1} (N-k)\,R_{\hat{x}\hat{x}}[k]\,R_{\hat{y}\hat{y}}[k] \qquad (133)$$
$$\cong \frac{1}{N}\,R_{\hat{x}\hat{x}}[0]\,R_{\hat{y}\hat{y}}[0] \left( 1 + 2 \sum_{k=1}^{\infty} r_{\hat{x}\hat{x}}[k]\,r_{\hat{y}\hat{y}}[k] \right).$$

Thus, for a continuum spectrum source, the signal-to-noise ratio of the digital correlator output of $N = 2\beta B \tau_a$ oversampled data is approximately given by

$$SNR = \frac{\langle R \rangle}{\sigma_R} = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,\sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}}\,\frac{\sqrt{2\beta B \tau_a}}{\sqrt{1 + 2\sum\limits_{k=1}^{\infty} r_{\hat{x}\hat{x}}[k]\,r_{\hat{y}\hat{y}}[k]}}. \qquad (134)$$

Therefore, the coherence factor in the case of oversampling is given by

$$\eta_c = \frac{SNR}{SNR_{analog}} = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,\frac{\sqrt{\beta}}{\sqrt{1 + 2\sum\limits_{k=1}^{\infty} r_{\hat{x}\hat{x}}[k]\,r_{\hat{y}\hat{y}}[k]}}. \qquad (135)$$


We can estimate this factor using equations (95) and (114), which describe the correlation coefficients of the clipped data $r_{\hat{x}\hat{x}}[k]$ and $r_{\hat{y}\hat{y}}[k]$ as functions of the analog correlation coefficients $r_{xx}[k]$ and $r_{yy}[k]$, and which we denote as $r_{\hat{x}\hat{x}}[k] = f_{cl}(r_{xx}[k])$ and $r_{\hat{y}\hat{y}}[k] = f_{cl}(r_{yy}[k])$. Here $f_{cl}$ is the clipping function, given in equation (95) and equation (114) for the 1-bit and 2-bit quantization cases, respectively. When the original analog data have rectangular baseband spectra with bandwidth $B$, as we have assumed in the present discussion, the analog correlation coefficients $r_{xx}(\tau)$ and $r_{yy}(\tau)$ have sinc function forms:

$$r_{xx}(\tau) = r_{yy}(\tau) = \frac{\sin(2\pi B \tau)}{2\pi B \tau},$$

as given in equation (46). Therefore, the correlation coefficients of their time samples $x[k]$ and $y[k]$ with the sampling interval $T_s = 1/(2\beta B)$ are given by

$$r_{xx}[k] = r_{yy}[k] = r_{xx}(k\,T_s) = r_{yy}(k\,T_s) = \frac{\sin(\pi k / \beta)}{\pi k / \beta}. \qquad (136)$$
Figure 28: Coherence factor $\eta_c$ for the analog (+), 2-bit quantized (x), and 1-bit quantized (*) data as functions of the oversampling ratio $\beta$.

Thus, the signal-to-noise ratio and the coherence factor of the oversampled clipped data are given by

$$SNR = \eta_c\,\sqrt{\frac{T_{A1}\,T_{A2}}{T_{S1}\,T_{S2}}}\,\sqrt{2B\tau_a}, \qquad (137)$$


$$\eta_c = \frac{r_{\hat{x}\hat{y}}[0]}{r_{xy}[0]}\,\frac{\sqrt{\beta}}{\sqrt{1 + 2\sum\limits_{k=1}^{\infty} f_{cl}^2\!\left(\dfrac{\sin(\pi k / \beta)}{\pi k / \beta}\right)}}. \qquad (138)$$
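As a sketch of evaluating equation (138) numerically for the 1-bit case, the fragment below assumes the van Vleck form $f_{cl}(r) = \frac{2}{\pi}\arcsin(r)$ for the clipping function of equation (95), and truncates the series at a large $k$ (adequate, since the terms fall off roughly as $1/k^2$).

```python
import math

def one_bit_coherence(beta, kmax=200_000):
    """Coherence factor of 1-bit quantized, beta-times oversampled data,
    equation (138), with the van Vleck clipping function
    f_cl(r) = (2/pi)*arcsin(r) and sinc-shaped analog correlations."""
    s = 0.0
    for k in range(1, kmax + 1):
        x = math.pi * k / beta
        r = math.sin(x) / x                          # eq. (136)
        s += ((2.0 / math.pi) * math.asin(r)) ** 2
    return (2.0 / math.pi) * math.sqrt(beta) / math.sqrt(1.0 + 2.0 * s)

print(one_bit_coherence(1.0))   # Nyquist sampling: ~0.64
print(one_bit_coherence(2.0))   # double-speed sampling: ~0.744
```

At $\beta = 1$ the sinc terms vanish and the pure van Vleck factor $2/\pi$ is recovered.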

Figure 28 shows the coherence factor $\eta_c$ for the analog, 2-bit quantized, and 1-bit quantized data as functions of the oversampling ratio $\beta$, calculated by means of equation (138). The coherence factor of the clipped data improves gradually with increasing $\beta$, approaching constant values. On the other hand, the coherence factor of the analog data is always 1, irrespective of the oversampling ratio $\beta$, in accordance with the sampling theorem. In the particular case of "double-speed sampling", $\beta = 2$, we have $\eta_c = 0.744$ for the 1-bit quantized data, and $\eta_c = 0.930$ for the 2-bit quantized data.

1.1.31 Coherence Factor and Sensitivity with Given Bit Rate

The coherence factor $\eta_c$ with different clipping and oversampling schemes is summarized in Table 2.

  Number of bits N_b   Number of quantization levels   eta_c (beta = 1)   eta_c (beta = 2)
  1                    2                               0.64               0.74
  2                    4                               0.88               0.93

Table 2: Relationship of the coherence factor $\eta_c$ with the number of bits $N_b$ and the oversampling ratio $\beta$.

It is obvious that the coherence factor increases as we increase the number of bits $N_b$ and the oversampling ratio $\beta$. However, actual observations are usually limited by the maximum bit rate $b$ of the data streams allowed by correlators or recorders. If the maximum bit rate $b$ is fixed, the maximum permissible sampling rate, which is $\beta$ times the Nyquist rate $2B$ with the analog bandwidth $B$, is $b / N_b = 2\beta B$. Therefore, the maximum allowable bandwidth is limited by the bit rate as $B = b / (2\beta N_b)$. Hence, in view of equation (137), the maximum signal-to-noise ratio, i.e. the sensitivity of the interferometric observation, for a continuum spectrum source, is proportional to

$$SNR \propto \eta_c\,\sqrt{B} \propto \frac{\eta_c}{\sqrt{\beta\,N_b}}. \qquad (139)$$


  Number of bits N_b   Number of quantization levels   eta_c/sqrt(beta N_b) (beta = 1)   eta_c/sqrt(beta N_b) (beta = 2)
  1                    2                               0.64                              0.52
  2                    4                               0.62                              0.47

Table 3: Factor $\eta_c / \sqrt{\beta N_b}$, which determines the signal-to-noise ratio, estimated for clipped and oversampled data with a fixed bit rate.

Table 3 shows the factor $\eta_c / \sqrt{\beta N_b}$, which determines the signal-to-noise ratio for a continuum spectrum source. When the bit rate is given, the sensitivity turns out to be largest in the simplest case of 1-bit quantization ($N_b = 1$) with Nyquist sampling ($\beta = 1$)! This is the reason why many VLBI observations still use the 1-bit quantization scheme with Nyquist sampling. However, if we observe a line spectrum source, such as an astronomical maser source, the frequency range containing the signal from the radio source is confined within the limited spectral profile of the source, where we often see many narrow lines. Thus, in order to estimate the signal-to-noise ratio of a spectral line, we must replace the bandwidth $B$ of our receiving system in equation (137) by the width of the spectral line. The line width is intrinsic to the radio source, and constant irrespective of system-dependent parameters such as the number of bits $N_b$ or the oversampling ratio $\beta$. Therefore, the larger the coherence factor $\eta_c$, the higher the sensitivity in this case. Consequently, modern VLBI systems tend to adopt the 2-bit quantization scheme for better performance in line-spectrum-source observations. Note also that the 2-bit quantization scheme ($N_b = 2$) with Nyquist sampling ($\beta = 1$) offers almost the same sensitivity as the 1-bit quantization scheme with Nyquist sampling, i.e. 0.62 against 0.64, for a continuum source, as is evident from Table 3. Figure 29 shows an example of maser source spectra derived from 2-bit quantized data.
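The entries of Table 3 follow directly from Table 2 and equation (139); a minimal sketch:

```python
import math

# Coherence factors from Table 2, keyed by (number of bits Nb, oversampling ratio beta)
eta_c = {(1, 1): 0.64, (1, 2): 0.74, (2, 1): 0.88, (2, 2): 0.93}

# Sensitivity factor eta_c / sqrt(beta * Nb) of equation (139), at a fixed bit rate
for (nb, beta), eta in sorted(eta_c.items()):
    factor = eta / math.sqrt(beta * nb)
    print(f"Nb={nb}, beta={beta}: {factor:.2f}")
```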



Figure 29: Spectra of the water maser source Cep A, in an active region of star formation, derived from 2-bit quantized data. The upper panel shows the total-power spectrum with the Mizusawa 20 m antenna, and the lower panel shows the phase and amplitude of the cross-power spectrum on the Mizusawa-Ogasawara baseline of VERA. The profiles of the two spectra are fairly different, which indicates partial resolution of the maser features on the VLBI baseline. Note that the amplitude scales are different in the two panels. (T. Hirota, private communication in 2005.)


1.2 Frequency Standard

1.2.1 VLBI Requires "Absolute" Frequency Stability
Figure 30: Connected-Element Interferometer (upper panel) vs. VLBI system (lower panel).

Development of the highly stable frequency standard was indispensable for the realization of VLBI. In the case of the connected-element interferometer, a common frequency standard is used to generate the local oscillator (LO) reference signals for frequency conversion in all antennas (see upper panel of Figure 30). Therefore, any phase noise due to instability in the frequency standard is common to the signals from all antennas. Such common phase noise from any given antenna is always compensated by the same phase noise from another antenna in the correlation processing. In fact, let us assume that a signal from a radio source, which we assume to be point-like for simplicity, received by antenna A is:

$$v_A(t) \propto \sin[\omega t - \phi(t)],$$

and the same signal received by antenna B is:

$$v_B(t) \propto \sin[\omega (t - \tau_g) - \phi(t)],$$


where $\omega$ is the frequency of the radio signal, $\tau_g$ is the geometric delay between the two antennas, and $\phi(t)$ is the common phase noise due to the frequency standard. After multiplication of these signals, we have

$$v_A(t)\,v_B(t) \propto \frac{1}{2}\left\{\cos(\omega \tau_g) - \cos[2\omega t - \omega \tau_g - 2\phi(t)]\right\},$$

and after integration (averaging), we have

$$\langle v_A(t)\,v_B(t) \rangle \propto \frac{1}{2}\cos(\omega \tau_g), \qquad (140)$$

since the rapidly oscillating second term with frequency $2\omega$ is averaged out. Thus, no effect of the phase noise remains in the correlator output! Consequently, correlation results in connected-element interferometers are almost unaffected by the instability of the frequency standard. We then readily obtain the almost pure fringe pattern $\cos(\omega \tau_g)$ in the correlation results, as shown in equation (140), as long as we perform a sufficiently long integration to suppress the thermal noise occurring in the receiving systems and in the environment, and achieve a high enough signal-to-noise ratio. This is why very high stability of frequency standards is not necessarily required in connected-element interferometers.

How about VLBI, then? Each antenna in VLBI uses its own independent frequency standard to generate the LO reference signal (see lower panel of Figure 30). Then, a signal from a point-like radio source received by antenna A is given by

$$v_A(t) \propto \sin[\omega t - \phi_A(t)],$$

and the same signal received by antenna B is:

$$v_B(t) \propto \sin[\omega (t - \tau_g) - \phi_B(t)],$$

where $\phi_A(t)$ and $\phi_B(t)$ are the phase noises of the independent frequency standards. After multiplication, we have

$$v_A(t)\,v_B(t) \propto \frac{1}{2}\left\{\cos[\omega \tau_g - \phi_A(t) + \phi_B(t)] - \cos[2\omega t - \omega \tau_g - \phi_A(t) - \phi_B(t)]\right\},$$

and after integration (averaging),

$$\langle v_A(t)\,v_B(t) \rangle \propto \frac{1}{2}\cos[\omega \tau_g - \phi_A(t) + \phi_B(t)]. \qquad (141)$$


Therefore, no compensation of the phase noises can be expected in the correlator output of VLBI. In other words, correlation results in VLBI are always directly affected by the instability of the frequency standards! The noise in the fringe phase (the argument of the sinusoidal fringe pattern in equation (141)) gives rise to two difficulties in VLBI. First, the fringe phase, which is an important observable in radio interferometry, is contaminated by the phase noise. Second, the phase noise severely limits the sensitivity of VLBI. Indeed, it becomes impossible for us to completely stop the oscillation of the sinusoidal fringe pattern if the phase noise varies in time, even when we ideally compensate the geometric delay $\tau_g$ by applying accurate enough delay tracking and fringe stopping. Then, if we further integrate (time-average) the correlator output, hoping to get a higher signal-to-noise ratio, the phase-noise-induced oscillation of the fringe pattern results in a smaller amplitude of the averaged signal. The thermal noise contribution in the correlator output is surely suppressed by the integration, but the averaged signal itself could be reduced even more rapidly due to the oscillation. Depending on the ratio between the integration time and the timescale of the oscillation, the integration may not improve the signal-to-noise ratio at all, and may even degrade it. Such an effect is called the "coherence loss". Therefore, the integration time must be short enough not to cause a large coherence loss, but this implies that the signal-to-noise ratio is limited by the short integration time. This is why we need "absolute" stability of frequency standards in VLBI, in order to ease these difficulties. Of course, not only the noise due to the frequency standards, but any other phase noise, due for example to the propagation delay through the turbulent atmosphere, causes similar difficulties, as we will see later.
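The coherence loss can be illustrated with a toy simulation (not from the original text): for random-walk phase noise with an assumed step size, the magnitude of the averaged fringe $e^{i\phi}$ shrinks as the integration lengthens.

```python
import cmath
import random

random.seed(7)

# Random-walk phase noise: phi accumulates independent Gaussian increments
# (step size 2e-3 rad per sample is an assumed, illustrative value).
n = 200_000
phi = [0.0] * n
for i in range(1, n):
    phi[i] = phi[i - 1] + random.gauss(0.0, 2.0e-3)

def coherence(m):
    """|C(T)| of the averaged fringe over the first m samples."""
    return abs(sum(cmath.exp(1j * p) for p in phi[:m]) / m)

for m in (1_000, 50_000, 200_000):
    print(m, coherence(m))   # amplitude decays as the integration lengthens
```

Longer integration averages over a wider spread of phases, so the vector sum, and with it the effective signal amplitude, shrinks; this is the coherence loss described above.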
Now we would like to discuss how to quantitatively describe the frequency stability, and how to estimate the effects of incomplete stability.

1.2.2 How to Describe Frequency Stability?

Let us consider a reference signal $v(t)$ which, in the ideal case, is expected to have the form:

$$v(t) = v_0 \cos(\omega_0 t), \qquad (142)$$

with a nominal frequency $\omega_0$. In actuality, however, any real reference signal has a phase noise $\phi(t)$, and therefore has the form:

$$v(t) = v_0 \cos(\omega_0 t + \phi(t)). \qquad (143)$$


In this actual case, the instantaneous frequency $\omega_a(t)$ will be

$$\omega_a(t) = \omega_0 + \frac{d\phi(t)}{dt}. \qquad (144)$$

We then introduce the concept of the "fractional frequency deviation (FFD)" $y(t)$, which is defined by

$$y(t) = \frac{\delta\omega(t)}{\omega_0} = \frac{\omega_a(t) - \omega_0}{\omega_0} = \frac{1}{\omega_0}\frac{d\phi(t)}{dt}, \qquad (145)$$

as a measure of the frequency stability. Let us assume that $\phi(t)$ and $y(t)$ are stationary random processes. Then, their autocorrelations are functions of the time difference $\tau$:

$$R_{\phi\phi}(\tau) = \langle \phi(t+\tau)\,\phi(t) \rangle, \qquad R_{yy}(\tau) = \langle y(t+\tau)\,y(t) \rangle, \qquad (146)$$

and their power spectra are given by the Fourier transform relations:

$$S_{\phi\phi}(\omega) = \int_{-\infty}^{\infty} R_{\phi\phi}(\tau)\,e^{-i\omega\tau}\,d\tau, \qquad R_{\phi\phi}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{\phi\phi}(\omega)\,e^{i\omega\tau}\,d\omega,$$
$$S_{yy}(\omega) = \int_{-\infty}^{\infty} R_{yy}(\tau)\,e^{-i\omega\tau}\,d\tau, \qquad R_{yy}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(\omega)\,e^{i\omega\tau}\,d\omega. \qquad (147)$$

In view of equation (145), the autocorrelations of $y(t)$ and $\phi(t)$ are mutually related by

$$R_{yy}(t, t') = \langle y(t)\,y(t') \rangle = \frac{1}{\omega_0^2}\left\langle \frac{d\phi(t)}{dt}\,\frac{d\phi(t')}{dt'} \right\rangle = \frac{1}{\omega_0^2}\,\frac{\partial^2 R_{\phi\phi}(t, t')}{\partial t\,\partial t'}. \qquad (148)$$

In our case of stationary random processes, this reduces to

$$R_{yy}(\tau) = -\frac{1}{\omega_0^2}\,\frac{d^2 R_{\phi\phi}(\tau)}{d\tau^2}, \qquad (149)$$

where $\tau = t - t'$. From equations (147) and (149), we have

$$R_{yy}(\tau) = -\frac{1}{\omega_0^2}\,\frac{d^2}{d\tau^2}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} S_{\phi\phi}(\omega)\,e^{i\omega\tau}\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\omega^2}{\omega_0^2}\,S_{\phi\phi}(\omega)\,e^{i\omega\tau}\,d\omega,$$

and, therefore, the power spectra of $y(t)$ and $\phi(t)$ are related to each other by:

$$S_{yy}(\omega) = \frac{\omega^2}{\omega_0^2}\,S_{\phi\phi}(\omega). \qquad (150)$$


Let us make a comment here about the conventions which have traditionally been used for describing power spectra in frequency stability discussions. Since $y(t)$ and $\phi(t)$ are real functions of time, the autocorrelations $R_{\phi\phi}(\tau)$ and $R_{yy}(\tau)$, and the power spectra $S_{\phi\phi}(\omega)$, $S_{yy}(\omega)$, are all real and even functions of $\tau$ and $\omega$. Using this property, and using the frequency $\nu$ instead of the angular frequency $\omega = 2\pi\nu$, we can describe the power spectra in the "single-sided forms" $S'_{\phi\phi}(\nu)$ and $S'_{yy}(\nu)$, which have been widely used in frequency stability discussions:

$$S'_{\phi\phi}(\nu) = 4\int_{0}^{\infty} R_{\phi\phi}(\tau)\cos(2\pi\nu\tau)\,d\tau, \qquad R_{\phi\phi}(\tau) = \int_{0}^{\infty} S'_{\phi\phi}(\nu)\cos(2\pi\nu\tau)\,d\nu,$$
$$S'_{yy}(\nu) = 4\int_{0}^{\infty} R_{yy}(\tau)\cos(2\pi\nu\tau)\,d\tau, \qquad R_{yy}(\tau) = \int_{0}^{\infty} S'_{yy}(\nu)\cos(2\pi\nu\tau)\,d\nu. \qquad (151)$$

These single-sided spectra $S'_{\phi\phi}(\nu)$ and $S'_{yy}(\nu)$ are related to our double-sided power spectra $S_{\phi\phi}(\omega)$ and $S_{yy}(\omega)$ by:

$$S'_{\phi\phi}(\nu) = 2\,S_{\phi\phi}(2\pi\nu), \qquad S'_{yy}(\nu) = 2\,S_{yy}(2\pi\nu), \qquad (152)$$

for the positive frequency range $\nu \geq 0$. Therefore, their mutual relationship has often been given by:

$$S'_{yy}(\nu) = \frac{\nu^2}{\nu_0^2}\,S'_{\phi\phi}(\nu), \qquad (153)$$

instead of equation (150), where $\nu_0 = \omega_0 / (2\pi)$. However, we will continue to use our double-sided power spectrum forms of equations (147), following our previous discussions.

1.2.3 Types of Phase and Frequency Noises

Measurements have shown that the noises in frequency stability are classified into the following types according to the power-law index $\alpha$ of the power spectrum of the FFD, $S_{yy}(\omega) = H_\alpha\,\omega^\alpha$, with roughly constant coefficients $H_\alpha$. Each noise type, or power-law component of the FFD spectrum, has its own characteristic name, listed in Table 4. An oscillator in a frequency standard often shows a combination of all the power-law components of Table 4 in various frequency ranges, as shown schematically in Figure 31.

  alpha   Name of Noise Type         S_yy(omega)           S_phiphi(omega)
  2       White phase                H_2 omega^2           omega_0^2 H_2
  1       Flicker phase              H_1 omega             omega_0^2 H_1 omega^-1
  0       White frequency            H_0                   omega_0^2 H_0 omega^-2
  -1      Flicker frequency          H_-1 omega^-1         omega_0^2 H_-1 omega^-3
  -2      Random walk of frequency   H_-2 omega^-2         omega_0^2 H_-2 omega^-4

Table 4: Power-law types of phase and frequency noise spectra.

Figure 31: A schematic spectrum of the FFD, $y(t)$, of an oscillator showing all the power-law components of Table 4 in various frequency ranges. Horizontal and vertical axes show the angular frequency $\omega$ and the power spectrum $S_{yy}(\omega)$, respectively, in log scales.


1.2.4 Time Domain Measurements

It has traditionally been easier to make measurements in the time domain than in the frequency domain. Hence, frequency stability characteristics are usually given in terms of time domain measurements. Suppose that we have $N$ phase values: $\phi(t_1), \phi(t_2), \cdots, \phi(t_k), \cdots, \phi(t_N)$, measured at equally spaced time points: $t_1, t_2, \cdots, t_k, \cdots, t_N$, with time interval $T$, where $T = t_{k+1} - t_k$ for any $k$. Then, using these phase values measured in the time domain, we form a discrete time series of the "mean fractional frequency deviation" $\bar{y}[k]$, which is defined by

$$\bar{y}[k] = \frac{\phi(t_{k+1}) - \phi(t_k)}{\omega_0\,T} = \frac{1}{\omega_0\,T}\int_{t_k}^{t_{k+1}} \frac{d\phi(t')}{dt'}\,dt' = \frac{1}{T}\int_{t_k}^{t_{k+1}} y(t')\,dt', \qquad (154)$$

and can also be described as

$$\bar{y}[k] = \frac{1}{T}\int_{t - T/2}^{t + T/2} y(t')\,dt', \qquad (155)$$

where $t = t_k + T/2$. This is a running mean of the FFD $y(t)$ at the point $t = t_k + T/2$ over the time interval $T$. Thus, $\bar{y}[k]$ for any $k$ is given by a linear system acting on $y(t)$:

$$\bar{y}[k] = y(t) * a(t) = \int_{-\infty}^{\infty} y(t - t')\,a(t')\,dt', \qquad (156)$$

with an impulse response $a(t)$:

$$a(t) = \begin{cases} \dfrac{1}{T} & \text{if } -\dfrac{T}{2} < t \leq \dfrac{T}{2}, \\[4pt] 0 & \text{otherwise}, \end{cases} \qquad (157)$$

at $t = t_k + T/2$, similarly to what we saw in Chapter 3.

1.2.5 "True Variance" and "Allan Variance" of Fractional Frequency Deviation

Using the mean fractional frequency deviation $\bar{y}[k]$, which is obtained by averaging the FFD $y(t)$ over the time interval $T$, as shown in equation (155), we introduce the "true variance of the fractional frequency deviation (TVAR)" as a function of the time interval $T$:

$$\sigma_I^2(T) = \langle \bar{y}^2[k] \rangle, \qquad (158)$$

where we assumed a zero mean of the mean FFD $\bar{y}[k]$: $\langle \bar{y}[k] \rangle = 0$. In view of the ergodicity, we can estimate the TVAR $\sigma_I^2(T)$ by means of a time average $\langle\ \rangle_t$ of $N - 1$ values of the squared mean FFD $\bar{y}^2[k]$:

$$\langle\ \rangle_t = \frac{1}{N-1}\sum_{k=1}^{N-1} \bar{y}^2[k]. \qquad (159)$$

Generally speaking, there seems to be no difficulty in calculating the time average $\langle\ \rangle_t$, and, therefore, this estimated TVAR appears to be a good measure of the frequency stability. In actuality, however, $\langle\ \rangle_t$, as given in equation (159), diverges for some important types of frequency noise, and therefore cannot be used for characterizing the frequency stability as a whole. The assumption of a stationary random process, or the ergodicity, is not likely to be strictly fulfilled in these diverging cases. In order to overcome this difficulty, David W. Allan proposed to take the difference of two successive samples:

$$\Delta\bar{y}[k] = \bar{y}[k+1] - \bar{y}[k], \qquad (160)$$

and to calculate the average of the squared difference $(\Delta\bar{y}[k])^2$ divided by two:

$$\langle\ \rangle_a = \frac{1}{2(N-2)}\sum_{k=1}^{N-2} (\Delta\bar{y}[k])^2, \qquad (161)$$

(Allan, 1966). In view of equation (154), the two-sample difference $\Delta\bar{y}[k]$ of the mean FFD is easily derived from the measured phase noise $\phi[k]$ by

$$\Delta\bar{y}[k] = \frac{\phi(t_{k+2}) - 2\,\phi(t_{k+1}) + \phi(t_k)}{\omega_0\,T}. \qquad (162)$$

The mean square sum $\langle\ \rangle_a$ gives an estimate of the "two-sample variance" or "Allan variance (AVAR)":

$$\sigma_y^2(T) = \frac{\langle (\Delta\bar{y}[k])^2 \rangle}{2}, \qquad (163)$$
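As an illustration (not from the original text), the estimator of equation (161) applied to simulated white frequency noise should give an AVAR scaling as $1/T$ (cf. Table 5), halving when the averaging interval doubles:

```python
import random

random.seed(1)

# Simulated white frequency noise: independent unit-variance FFD samples.
y = [random.gauss(0.0, 1.0) for _ in range(400_000)]

def avar(y, m):
    """Two-sample (Allan) variance, eq. (161), for averaging interval T = m samples."""
    means = [sum(y[i:i + m]) / m for i in range(0, len(y) - m + 1, m)]
    diffs = [means[k + 1] - means[k] for k in range(len(means) - 1)]
    return sum(d * d for d in diffs) / (2.0 * len(diffs))

a100 = avar(y, 100)   # expected ~ 1/100 for unit-variance white FFD
a200 = avar(y, 200)   # expected ~ 1/200: halves when T doubles
print(a100, a200, a100 / a200)
```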


as a function of the time interval $T$. The estimation of the AVAR with $\langle\ \rangle_a$ has a better chance to converge than that of the TVAR with $\langle\ \rangle_t$, as schematically illustrated in Figure 32. Therefore, the estimated AVAR is widely accepted as a good measure of the frequency stability.

Figure 32: The mean square sum of the two-sample difference of the mean FFD (lower panel) has a better chance to converge than that of the mean FFD itself (upper panel), even when the mean FFD shows diverging long-timescale behaviour.

In view of equation (154), we can describe the two-sample difference of the mean FFD $\Delta\bar{y}[k]$ through time integrations of the FFD $y(t)$:

$$\Delta\bar{y}[k] = \bar{y}[k+1] - \bar{y}[k] = \frac{1}{T}\left[\int_{t_{k+1}}^{t_{k+2}} y(t')\,dt' - \int_{t_k}^{t_{k+1}} y(t')\,dt'\right] = \frac{1}{T}\left[\int_{t}^{t+T} y(t')\,dt' - \int_{t-T}^{t} y(t')\,dt'\right], \qquad (164)$$

where $t = t_{k+1}$. This is nothing but a linear system acting on the FFD $y(t)$:

$$\Delta\bar{y}[k] = y(t) * b(t) = \int_{-\infty}^{\infty} y(t - t')\,b(t')\,dt', \qquad (165)$$

with an impulse response:

$$b(t) = \begin{cases} \dfrac{1}{T} & \text{if } -T < t \leq 0, \\[4pt] -\dfrac{1}{T} & \text{if } 0 < t \leq T, \\[4pt] 0 & \text{otherwise}, \end{cases} \qquad (166)$$

at time $t = t_{k+1}$.

1.2.6 True Variance and Allan Variance through Power Spectrum of Fractional Frequency Deviation

Equations (156) and (165) describe the discrete-time series $\bar{y}[k]$ (the mean FFD) and $\Delta\bar{y}[k]$ (the two-sample difference of the mean FFD) through linear systems with impulse responses $a(t)$ and $b(t)$, at times $t = t_k + T/2$ and $t = t_{k+1}$, respectively. We can formally extend these linear systems to yield continuous-time functions $\bar{y}(t)$ and $\Delta\bar{y}(t)$:

$$\bar{y}(t) = y(t) * a(t) = \int_{-\infty}^{\infty} y(t - t')\,a(t')\,dt', \qquad \Delta\bar{y}(t) = y(t) * b(t) = \int_{-\infty}^{\infty} y(t - t')\,b(t')\,dt', \qquad (167)$$

at arbitrary time $t$, and introduce their autocorrelations

$$R_{\bar{y}\bar{y}}(\tau) = \langle \bar{y}(t+\tau)\,\bar{y}(t) \rangle, \qquad R_{\Delta\bar{y}\Delta\bar{y}}(\tau) = \langle \Delta\bar{y}(t+\tau)\,\Delta\bar{y}(t) \rangle, \qquad (168)$$

and power spectra $S_{\bar{y}\bar{y}}(\omega)$ and $S_{\Delta\bar{y}\Delta\bar{y}}(\omega)$. Of course, $\bar{y}(t)$ takes the particular value $\bar{y}[k]$ at $t = t_k + T/2$, and $\Delta\bar{y}(t)$ takes the particular value $\Delta\bar{y}[k]$ at $t = t_{k+1}$, i.e.

$$\bar{y}(t) = \bar{y}[k] \ \ \text{at}\ \ t = t_k + \frac{T}{2}, \qquad \Delta\bar{y}(t) = \Delta\bar{y}[k] \ \ \text{at}\ \ t = t_{k+1}. \qquad (169)$$

Then, we can describe the TVAR, $\sigma_I^2(T)$, through the autocorrelation $R_{\bar{y}\bar{y}}(\tau)$, and hence through the power spectrum $S_{\bar{y}\bar{y}}(\omega)$:

$$\sigma_I^2(T) = \langle \bar{y}^2[k] \rangle = R_{\bar{y}\bar{y}}(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{\bar{y}\bar{y}}(\omega)\,d\omega. \qquad (170)$$

Similarly, for the AVAR, $\sigma_y^2(T)$, we have

$$\sigma_y^2(T) = \frac{\langle (\Delta\bar{y}[k])^2 \rangle}{2} = \frac{R_{\Delta\bar{y}\Delta\bar{y}}(0)}{2} = \frac{1}{4\pi}\int_{-\infty}^{\infty} S_{\Delta\bar{y}\Delta\bar{y}}(\omega)\,d\omega. \qquad (171)$$

Since the functions $\bar{y}(t)$ and $\Delta\bar{y}(t)$ are related to the FFD $y(t)$ through the linear system equations (167), we can describe the power spectra $S_{\bar{y}\bar{y}}(\omega)$ and $S_{\Delta\bar{y}\Delta\bar{y}}(\omega)$ through the power spectrum $S_{yy}(\omega)$ of the FFD $y(t)$, which we introduced in equation (147). For this purpose, let us introduce the system functions $\mathcal{A}(\omega)$ and $\mathcal{B}(\omega)$ of the impulse responses $a(t)$ and $b(t)$, respectively:

$$\mathcal{A}(\omega) = \int_{-\infty}^{\infty} a(t)\,e^{-i\omega t}\,dt = \frac{1}{T}\int_{-T/2}^{T/2} e^{-i\omega t}\,dt = \frac{\sin\!\left(\frac{\omega T}{2}\right)}{\frac{\omega T}{2}},$$

$$\mathcal{B}(\omega) = \int_{-\infty}^{\infty} b(t)\,e^{-i\omega t}\,dt = \frac{1}{T}\left[\int_{-T}^{0} e^{-i\omega t}\,dt - \int_{0}^{T} e^{-i\omega t}\,dt\right] = 2\,i\,\frac{\sin^2\!\left(\frac{\omega T}{2}\right)}{\frac{\omega T}{2}}. \qquad (172)$$

From the general properties of linear systems, we have

$$S_{\bar{y}\bar{y}}(\omega) = S_{yy}(\omega)\,|\mathcal{A}(\omega)|^2, \qquad S_{\Delta\bar{y}\Delta\bar{y}}(\omega) = S_{yy}(\omega)\,|\mathcal{B}(\omega)|^2.$$

Therefore, equations (170) and (171) reduce to

$$\sigma_I^2(T) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(\omega)\,\frac{\sin^2\!\left(\frac{\omega T}{2}\right)}{\left(\frac{\omega T}{2}\right)^2}\,d\omega, \qquad (173)$$

$$\sigma_y^2(T) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(\omega)\,\frac{2\sin^4\!\left(\frac{\omega T}{2}\right)}{\left(\frac{\omega T}{2}\right)^2}\,d\omega. \qquad (174)$$

Figure 33: Squared system functions for the TVAR, $|\mathcal{A}(\omega)|^2$ (left), and the AVAR, $|\mathcal{B}(\omega)|^2$ (right). Horizontal axes show the product $\nu T$ of the frequency and the time interval.
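A quick numerical check of equation (174): for white frequency noise $S_{yy}(\omega) = H_0$, direct integration should reproduce $\sigma_y^2(T) = H_0/T$. The sketch below (not from the original text) uses a simple midpoint rule with an assumed truncation of the integral.

```python
import math

def avar_from_spectrum(T, syy, x_max=2000.0, dx=0.002):
    """Allan variance from eq. (174) by direct numerical integration,
    using the substitution x = omega*T/2 (even integrand: integrate x >= 0
    and double); x_max and dx are assumed truncation/step parameters."""
    total = 0.0
    for i in range(int(x_max / dx)):
        x = (i + 0.5) * dx                 # midpoint rule, avoiding x = 0
        s = math.sin(x)
        total += syy(2.0 * x / T) * 2.0 * s ** 4 / (x * x) * dx
    # factor 2 for the negative-frequency half, 1/(2 pi) prefactor,
    # and d(omega) = (2/T) dx from the substitution
    return (1.0 / (2.0 * math.pi)) * 2.0 * (2.0 / T) * total

# white frequency noise S_yy = H0 = 1 should give sigma_y^2(T) = H0 / T (Table 5)
v1 = avar_from_spectrum(1.0, lambda w: 1.0)
v2 = avar_from_spectrum(2.0, lambda w: 1.0)
print(v1, v2)   # ~1.0 and ~0.5
```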


These are the equations which relate the TVAR and the AVAR to the power spectrum $S_{yy}(\omega)$ of the FFD $y(t)$. Figure 33 shows the squared system functions $|\mathcal{A}(\omega)|^2$ and $|\mathcal{B}(\omega)|^2$ for the TVAR and the AVAR, respectively. It is evident from this Figure that low frequency noise is effectively suppressed in the AVAR.

1.2.7 Time-Interval Dependence of Allan Variance

We can calculate the dependence of the TVAR and the AVAR on the time interval $T$ according to equations (173) and (174) for each power-law index of the FFD spectrum $S_{yy}(\omega)$ given in Table 4. The results are shown in Table 5, where $\omega_h$ in the white phase ($\alpha = 2$) and flicker phase ($\alpha = 1$) components is a cut-off frequency at the high frequency side of the FFD spectrum.

  Noise Type                 S_yy(omega)     S_phiphi(omega)           sigma_y^2(T)                    sigma_I^2(T)
  White phase                H_2 omega^2     omega_0^2 H_2             3 H_2 omega_h / (pi T^2)        2 H_2 omega_h / (pi T^2)
  Flicker phase              H_1 omega       omega_0^2 H_1 omega^-1    3 H_1 ln(omega_h T) / (pi T^2)  --
  White frequency            H_0             omega_0^2 H_0 omega^-2    H_0 / T                         H_0 / T
  Flicker frequency          H_-1 omega^-1   omega_0^2 H_-1 omega^-3   (2 ln 2) H_-1 / pi              --
  Random walk of frequency   H_-2 omega^-2   omega_0^2 H_-2 omega^-4   H_-2 T / 3                      --

Table 5: Time-interval dependence of the AVAR, $\sigma_y^2(T)$, and the TVAR, $\sigma_I^2(T)$, for each power-law component of the FFD spectrum. $\omega_h$ is a cut-off frequency at the high frequency side of a power-law spectrum $S_{yy}(\omega)$.

Figure 34 schematically shows the time-interval dependence of the "Allan standard deviation (ASD)", $\sigma_y(T)$, which is defined as the square root of the AVAR. It is clear from this Figure that, having measured the ASD as a function of the time interval between samples $T$, we can readily distinguish the noise types, except for the case of the white phase and the flicker phase. This demonstrates the real usefulness of the ASD (or the AVAR) as a measure of frequency stability. Figure 35 shows the characteristic performance of various frequency standards in terms of the ASD versus the time interval. The ASD's of frequency standards are approximated by different noise types in different time-interval ranges. The active hydrogen maser, which shows ASD $\simeq 10^{-15}$ in its flicker frequency regime, exhibits the highest frequency stability in the time-interval range of 1,000 to 10,000 seconds.

Figure 34: A schematic view of the time-interval $T$ dependence of the Allan standard deviation $\sigma_y(T)$ in log scales. Each noise type shows its own gradient in this diagram, though the lines of the white phase and the flicker phase are hardly distinguishable by the difference of their gradients.

Figure 35: Performance of various frequency standards in terms of the ASD versus the time interval between samples. QZ: Quartz, RB: Rubidium, CS: Cesium, HM: Active Hydrogen Maser. (Figure courtesy of HEWLETT PACKARD, Application Note 1289.)


1.2.8 True Variance and Allan Variance through Autocorrelation of Phase Noise

We can also describe the TVAR and the AVAR through the autocorrelation $R_{\phi\phi}(T)$ of the phase noise $\phi(t)$. In fact, for the TVAR, equations (154) and (158) yield

$$\sigma_I^2(T) = \langle \bar{y}^2[k] \rangle = \frac{\langle [\phi(t_k + T) - \phi(t_k)]^2 \rangle}{(\omega_0 T)^2} = \frac{2\,[R_{\phi\phi}(0) - R_{\phi\phi}(T)]}{(\omega_0 T)^2}. \qquad (175)$$

Also, for the AVAR, equations (162) and (163) lead to

$$\sigma_y^2(T) = \frac{\langle (\Delta\bar{y}[k])^2 \rangle}{2} = \frac{\langle [\phi(t_k + 2T) - 2\,\phi(t_k + T) + \phi(t_k)]^2 \rangle}{2\,(\omega_0 T)^2} = \frac{3\,R_{\phi\phi}(0) - 4\,R_{\phi\phi}(T) + R_{\phi\phi}(2T)}{(\omega_0 T)^2}. \qquad (176)$$

Now, from equation (175), we have

$$\sigma_I^2(T) - \sigma_I^2(2T) = \frac{2\,[R_{\phi\phi}(0) - R_{\phi\phi}(T)] - \frac{1}{2}\,[R_{\phi\phi}(0) - R_{\phi\phi}(2T)]}{(\omega_0 T)^2} = \frac{3\,R_{\phi\phi}(0) - 4\,R_{\phi\phi}(T) + R_{\phi\phi}(2T)}{2\,(\omega_0 T)^2}. \qquad (177)$$

Therefore, in cases where the TVAR $\sigma_I^2(T)$ does not diverge, we have the following relationship between the AVAR and the TVAR:

$$\sigma_y^2(T) = 2\,[\sigma_I^2(T) - \sigma_I^2(2T)]. \qquad (178)$$
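Relation (178) can be checked numerically for a non-diverging case such as white frequency noise, using the time-average estimators of equations (159) and (161); a minimal sketch (not from the original text):

```python
import random

random.seed(3)
y = [random.gauss(0.0, 1.0) for _ in range(400_000)]   # white FFD samples

def block_means(y, m):
    return [sum(y[i:i + m]) / m for i in range(0, len(y) - m + 1, m)]

def tvar(y, m):
    means = block_means(y, m)
    return sum(v * v for v in means) / len(means)        # eq. (159)

def avar(y, m):
    means = block_means(y, m)
    d = [means[k + 1] - means[k] for k in range(len(means) - 1)]
    return sum(v * v for v in d) / (2 * len(d))          # eq. (161)

m = 100
lhs = avar(y, m)
rhs = 2.0 * (tvar(y, m) - tvar(y, 2 * m))
print(lhs, rhs)   # the two sides agree for this non-diverging case
```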

1.2.9 Coherence Function

As we saw earlier, the phase noise in the fringe phase causes a serious problem, the coherence loss, for the sensitivity of VLBI. The coherence loss due to the phase noise $\phi(t)$, after integrating the correlator output for a duration $T$, can be estimated by introducing the "coherence function" (Rogers and Moran, 1981), which is defined by

$$C(T) = \frac{1}{T}\int_{0}^{T} e^{i\phi(t)}\,dt. \qquad (179)$$

The magnitude of the squared coherence function is represented by its dispersion:

$$\langle C^2(T) \rangle = \frac{1}{T^2}\int_{0}^{T}\int_{0}^{T} \langle e^{i[\phi(t) - \phi(t')]} \rangle\,dt\,dt'. \qquad (180)$$


Now, if we assume the Gaussian distribution of the phase noise difference = (t) - (t ): 2 1 (181) f () = e- 22 , 2 where 2 = 2 = [(t) - (t )]2 , then, using the formula
- -x2 -iax

e

dx =



e

-

a2 4

,

we obtain e
i

=

-

f () ei d = e

-

2 2

=e

-

2 2

.

Hence, we reduce equation (180) to 1 C (T ) = 2 T
2 TT

e
00

-

[(t)-(t )]2 2

1 dt dt = 2 T

TT

e
00

-

D (t, t ) 2

dt dt ,

(182)

where D (t, t ) is the so­called "temp oral structure function" defined by D (t, t ) = [(t) - (t )]2 . Under the assumption of the stationary random phase noise, we have D (t, t ) = D ( ), with = t - t . Then we can reduce the double integral in equation (182) 1 to a single integral. In fact, noting that dz = 2 d and L = 2 (T - ) in Figure 36, we obtain 2 C (T ) = 2 T
2 T

(183)

0

(T - ) e

-

D ( ) 2

2 d = T

T

0

1-

T

e

-

D ( ) 2

d .

(184)

According to equation (154), the mean FFD \bar{y}[k] with the sample interval \tau is

\bar{y}[k] = \frac{\phi(t_k + \tau) - \phi(t_k)}{\omega_0 \tau}.

Therefore, we can describe the structure function D_\phi(\tau) through the TVAR I^2(\tau) as

D_\phi(\tau) = \omega_0^2 \tau^2 \langle \bar{y}^2 \rangle = \omega_0^2 \tau^2 I^2(\tau).    (185)


Figure 36: Geometry of the integration in equation (182), in the (t, t') plane: a diagonal strip of width dz and length L = \sqrt{2} \, (T - \tau), along the line \tau = t - t' = const.

This enables us to describe the dispersion of the coherence function through the TVAR:

\langle | C(T) |^2 \rangle = \frac{2}{T} \int_0^T \left( 1 - \frac{\tau}{T} \right) e^{-\frac{\omega_0^2 \tau^2 I^2(\tau)}{2}} \, d\tau = \frac{2}{T} \int_0^T \left( 1 - \frac{\tau}{T} \right) \exp \left[ -\frac{\omega_0^2 \tau^2}{4 \pi} \int_{-\infty}^{\infty} S_y(\omega) \, \frac{\sin^2 \left( \frac{\omega \tau}{2} \right)}{\left( \frac{\omega \tau}{2} \right)^2} \, d\omega \right] d\tau.    (186)
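When I^2(\tau) is known, the single integral in the first form of equation (186) (equivalently, equation (184)) is easy to evaluate numerically. The sketch below assumes white frequency noise, for which the structure function grows linearly, D_\phi(\tau) = a \tau, with an illustrative strength a, and bisects for the integration time at which \langle | C(T) |^2 \rangle falls to 0.85:

```python
import numpy as np

def coherence_sq(T, D, n=4001):
    """<|C(T)|^2> from eq. (184): (2/T) * int_0^T (1 - tau/T) exp(-D(tau)/2) dtau."""
    tau = np.linspace(0.0, T, n)
    f = (1.0 - tau / T) * np.exp(-D(tau) / 2.0)
    dtau = tau[1] - tau[0]
    return 2.0 / T * (np.sum(f) - 0.5 * (f[0] + f[-1])) * dtau   # trapezoid rule

a = 1.0e-3                          # structure-function slope [rad^2 / s], assumed
D = lambda tau: a * tau             # white frequency noise: D(tau) = a * tau

lo, hi = 1.0, 1.0e5                 # bracket the coherence time, then bisect
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if coherence_sq(mid, D) > 0.85 else (lo, mid)
tau_c = 0.5 * (lo + hi)
print(tau_c)                        # about 1.0e3 s for these parameters
```

Bisection is safe here because the dispersion of the coherence function decreases monotonically with T.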

Thus, when a functional form of the TVAR I^2(\tau) is given, we can calculate the dispersion of the coherence function \langle | C(T) |^2 \rangle. Then we can estimate the "coherence time" \tau_c, i.e. the interval during which we can more or less coherently integrate our signal. As the criterion for the coherence time \tau_c, we usually adopt the time interval which gives \langle | C(\tau_c) |^2 \rangle \simeq 0.85.

1.2.10 Approximate Estimation of Coherence Time

Precise estimation of the coherence time using equation (186) is sometimes impractical, for example, when the estimation of the TVAR with equation (159) diverges. For practical purposes, the coherence time is usually estimated in a simpler way. The phase noise accumulated during a time T is approximately given by \omega_0 \sigma_y T, where \sigma_y is "some standard deviation" of the FFD, for which we usually adopt the ASD. (Note that \sigma_y is, in general, a function of the time interval T.) It is obvious that we will not obtain any meaningful correlation result if the phase noise varies by more than 2 \pi during an integration time T. It is the usual practice to require that the accumulated phase noise must not exceed 1 radian:

\omega_0 \, \sigma_y(T) \, T \leq 1 \ \mathrm{radian}.    (187)

Thus, the coherence time \tau_c is estimated as the time which satisfies

\omega_0 \, \sigma_y(\tau_c) \, \tau_c \simeq 1.    (188)

If we observe at \nu = 8 GHz, and require that \tau_c = 1000 sec, say, we need a frequency stability better than

\sigma_y(1000 \ \mathrm{sec}) \leq \frac{1}{2 \pi \times 8 \times 10^{9} \times 10^{3}} \simeq 2 \times 10^{-14},

which is sometimes described as the "stability of a clock which would deviate by 1 sec in 5 \times 10^{13} sec, or 1.6 million years!"
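This arithmetic is summarized in the short sketch below (the observing frequency and required coherence time are the example values above):

```python
import math

nu = 8.0e9                 # observing frequency [Hz]
tau_c = 1000.0             # required coherence time [s]
omega0 = 2.0 * math.pi * nu

# Criterion (188): omega0 * sigma_y(tau_c) * tau_c <= 1
sigma_y_max = 1.0 / (omega0 * tau_c)
print(f"required ASD at {tau_c:.0f} s: {sigma_y_max:.1e}")        # about 2e-14

# The same stability, expressed as a clock drifting 1 sec in 1/sigma_y_max sec:
years = (1.0 / sigma_y_max) / (365.25 * 24 * 3600)
print(f"i.e. 1 sec of drift in about {years / 1.0e6:.1f} million years")
```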

The Active Hydrogen Maser Frequency Standard, which has a frequency stability of 10^{-16} < \sigma_y < 10^{-14} at time scales around 1000 sec (see Figure 35), fulfills this requirement, and is widely used in VLBI observations throughout the world.

1.2.11 Estimation of Time-Averaged Phase Noise

In some applications of VLBI observation, for example in VLBI astrometry, it is meaningful to theoretically estimate the phase noise expected after time-averaging of the correlator output for a duration of time T. If we have a time series of measured fringe phases with a high enough signal-to-noise ratio, the time-averaged phase noise is given by a simple model:

\bar{\phi} = \frac{1}{T} \int_0^T \phi(t) \, dt,    (189)

where \phi(t) is the phase noise at time t. This model is good enough for the case of the "vector averaging" of the correlator output, such as shown in equation (179) for the coherence function, as long as the phase noise is kept well smaller than 1 radian. In fact, if \phi(t) \ll 1, we have

\frac{1}{T} \int_0^T e^{i \phi(t)} \, dt \simeq \frac{1}{T} \int_0^T (1 + i \, \phi(t)) \, dt \simeq e^{i \bar{\phi}}.

Now, we will derive formulae for the dispersion of the residual phase noise from its time average:

\Delta \phi = \phi(t) - \bar{\phi} = \phi(t) - \frac{1}{T} \int_0^T \phi(t') \, dt',    (190)


and that of the time-averaged phase itself, given by equation (189). The disp ersions are
2

1 (T ) = T 2 =2 T

T 0 T 0

1 (t) - T

T 0

2

(t )dt

dt 1 T2
T 0 2 (T - ) 0 2 I 2 ( )d .

(T - )[R (0) - R ( )]d =
T 0 2

(191)
2 (T ) = ¯

1 T

(t)dt

=

2 T2

T 0

(T - )R ( )d .

(192)

Therefore, we can calculate these dispersions if we know the TVAR I^2(\tau) and the autocorrelation of the phase noise R_\phi(\tau). Note that equation (191) describes the accumulated phase noise around its time average. In particular, equation (186) can be reduced to

\langle | C(T) |^2 \rangle = \frac{2}{T} \int_0^T \left( 1 - \frac{\tau}{T} \right) e^{-\frac{\omega_0^2 \tau^2 I^2(\tau)}{2}} \, d\tau \simeq \frac{2}{T} \int_0^T \left( 1 - \frac{\tau}{T} \right) \left[ 1 - \frac{\omega_0^2 \tau^2 I^2(\tau)}{2} \right] d\tau = 1 - \sigma_{\Delta \phi}^2(T),    (193)

as long as \omega_0^2 \tau^2 I^2(\tau) / 2 \ll 1.

In the simplest case of the white phase noise, we have

I^2(\tau) = \frac{2 H_2 \omega_h}{\pi \tau^2}, \quad \mathrm{and} \quad \sigma_y^2(\tau) = \frac{3 H_2 \omega_h}{\pi \tau^2},    (194)

for the TVAR and the AVAR, and

S_\phi(\omega) = \omega_0^2 H_2, \quad \mathrm{and} \quad R_\phi(\tau) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} S_\phi(\omega) \, e^{i \omega \tau} \, d\omega = \omega_0^2 H_2 \, \delta(\tau),    (195)

for the power spectrum and the autocorrelation of the phase noise. Then, equations (191) and (192) yield

\sigma_{\Delta \phi}^2(T) = \frac{1}{T^2} \int_0^T (T - \tau) \, \omega_0^2 \tau^2 I^2(\tau) \, d\tau = \frac{\omega_0^2 H_2 \omega_h}{\pi},    (196)

for the dispersion of the residual phase noise, and

\sigma_{\bar{\phi}}^2(T) = \frac{2}{T^2} \int_0^T (T - \tau) \, R_\phi(\tau) \, d\tau = \frac{\omega_0^2 H_2}{T},    (197)

for the dispersion of the time-averaged phase noise.

It is evident that, for the white phase noise, the dispersion of the residual phase noise \sigma_{\Delta \phi}^2(T) does not depend on the averaging time T. This dispersion is readily estimated if we have measured the AVAR \sigma_y^2(\tau), from which we can easily extract the coefficient H_2 \omega_h. On the other hand, the dispersion of the time-averaged phase noise \sigma_{\bar{\phi}}^2(T) decreases with increasing averaging time as 1 / T, just like the thermal noise. In this case, H_2 alone figures as the unknown coefficient, which is not directly available from the measured AVAR. We can estimate the cut-off frequency \omega_h, and then H_2, in the following way. Suppose that actual measurements of the phase noise are performed with a time interval \tau_m. If we regard each measured phase noise value as a kind of running mean over the duration \tau_m, the high-frequency cut-off should be \omega_h \simeq 1 / \tau_m. Then, we can estimate \sigma_{\bar{\phi}}^2(T), using this \omega_h and the H_2 derived from the measured AVAR coefficient H_2 \omega_h.
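The two behaviors, a T-independent residual dispersion and a time-averaged dispersion falling as 1/T, are easy to confirm by simulation for white phase noise (illustrative rms and averaging lengths; the number of averaged samples N plays the role of T):

```python
import numpy as np

rng = np.random.default_rng(2)
s = 0.1                              # rms of the white phase noise [rad]
res = {}
for N in (100, 1000, 10000):         # averaging length, proportional to T
    phi = rng.normal(0.0, s, (500, N))        # 500 independent realizations
    phibar = phi.mean(axis=1)                 # time-averaged phase, eq. (189)
    resid = phi - phibar[:, None]             # residual phase noise, eq. (190)
    res[N] = (phibar.var(), resid.var())
    print(N, res[N])

# var(phibar) = s^2 / N falls as 1/T, while var(resid) stays near s^2.
```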

1.3 Time Synchronization

In order to find the white fringe within the coherence interval, the clocks of the element antennas must be synchronized with an accuracy \tau_{sync} which is better than the coherence interval 2 / B, where B is the recorded bandwidth, as we saw in Chapter 3. Therefore, even in the early days of VLBI observations, with the typical bandwidth of B = 2 MHz, we needed a high time-synchronization accuracy:

\tau_{sync} < 1 \ \mu\mathrm{sec} \quad \mathrm{for} \ B = 2 \ \mathrm{MHz},

which was not easily available from the time transfer systems of that time, which used surface waves. Nowadays, the requirement is much more severe, since we use observing bandwidths of B = 256 MHz or wider. Then, we need

\tau_{sync} < 7.8 \ \mathrm{nsec} \ (!) \quad \mathrm{for} \ B = 256 \ \mathrm{MHz}.

In actuality, multi-lag correlators, which we will discuss later, can significantly ease this requirement. But, in any case, we need a highly accurate time synchronization, better than 100 nsec. Fortunately, GPS (Global Positioning System) satellites are now capable of providing time synchronization at the level of a few tens of nsec. Therefore, right now we do not have any essential problem in time synchronization technology. Once the VLBI fringe is successfully detected, VLBI itself serves as the best time synchronization device, with 1 nsec level accuracy.
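The numbers above follow directly from the coherence-interval condition (a one-line check; the bandwidths are the example values in the text):

```python
# Required clock synchronization accuracy: better than the coherence interval 2/B.
for B in (2.0e6, 256.0e6):                 # recorded bandwidth [Hz]
    print(f"B = {B / 1e6:.0f} MHz -> tau_sync < {2.0 / B * 1e9:.1f} nsec")
```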


1.4 Recording System

Even in the very early stages, VLBI required a very high data rate and capacity for the recording system. For example, if we wish to record digitized data with bandwidth B = 2 MHz, sampled at the Nyquist rate (which equals 2 B samples/sec) with 1-bit (or 2-level) quantization, for a time duration of 400 sec, we need a data recording rate of 4 Mbit/sec, and a data capacity of at least 1.6 Gbit. Nowadays, new recording systems, such as the VERA system, allow 1 hour of recording at a 1 Gbit/sec rate, per volume of magnetic tape. Therefore, such a tape records 3.6 Tbit of data.
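The recording-rate arithmetic in this paragraph can be written down as a tiny helper (a sketch; the function name is illustrative):

```python
def data_rate(bandwidth_hz, bits_per_sample=1, n_channels=1):
    """Recorded bit rate for Nyquist-sampled (2B samples/s) data [bit/s]."""
    return 2 * bandwidth_hz * bits_per_sample * n_channels

rate = data_rate(2.0e6)              # early VLBI: B = 2 MHz, 1-bit sampling
print(rate, rate * 400)              # 4 Mbit/s, and 1.6 Gbit in 400 s

print(1.0e9 * 3600)                  # a 1 Gbit/s recorder fills 3.6 Tbit per hour
```

The same helper, with 2 sidebands and 14 converters as channels, reproduces the 112 Mbit/sec MK-3 figure quoted later in this chapter.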

Figure 37: VLBI systems (from Thompson, Moran, and Swenson, 2001).

Generations of "VLBI systems" were marked by the development of digital data recording technology, as shown in Figure 37. The Communications Research Laboratory (CRL, now NICT: National Institute of Information and Communications Technology) has developed the Mark II-compatible K-1, the experimental real-time VLBI system K-2, the Mark III-compatible K-3, and the cassette-based K-4 systems, by its own efforts (Takahashi, et al., 2000). Figures 38, 39, 40 and 41 show the VLBA recorder, the S2 recorder, the K-4 recorder, and the tape handler for the K-4 recorder. Figure 42 shows the S2, K-4, and VLBA (Mark IV) tapes.


Figure 38: VLBA recorder (MIT Haystack Observatory/NRAO, USA).



Figure 39: S2 recorder (Center for Research in Earth and Space Technology, Canada).



Figure 40: K-4 recorder (NICT/CRL, Japan).



Figure 41: Automated tap e handler for K-4 (NICT/CRL, Japan).



Figure 42: Magnetic tap es used in VLBI: S2 (top left), K-4 (left), and VLBA/Mark IV (right).



2 Overview of the VLBI System

2.1 MK-3 As a Prototype of Modern VLBI Systems

Figure 43 shows a schematic view of the Mark III / K-3 VLBI system, often called "MK-3", which was initially developed by the MIT Haystack Observatory with sponsorship from NASA, and by the CRL (now NICT), in the late 1970's. Although this system has now been almost completely replaced by newer systems, such as the Mark IV, Mark V, VLBA, K-4, K-5 and VSOP, many important elements of these modern VLBI systems were implemented for the first time in the MK-3 system, and have been further developed in the latest systems. Therefore, we briefly describe here the basic components, and their functions, of the MK-3 system.

Figure 43: A schematic view of the Mark III / K-3 VLBI system: the P & D antenna unit, the S- and X-band low-noise amplifiers, the downconverter, and the local oscillator in the receiver room; and the P & D ground unit, the H-maser frequency standard, the IF frequency distributer, the VLBI terminal (video converters, formatter, and data recorder), the antenna control unit, and the control computer in the control building.

The MK-3 system was originally developed mainly for realizing high-precision VLBI geodesy, although the system has been extensively used for astrophysical radio source imaging observations as well. In order to accurately estimate the group delay, the system adopted a multi-frequency channel design, together with high-speed recording technology; these features have been retained in most of the latest systems.

2.1.1 Dual-Frequency Reception

In geodetic applications, a VLBI antenna usually receives radio waves from distant extragalactic sources at two frequencies simultaneously, in order to correct for the effects of the frequency-dependent propagation delay in the ionosphere. The two frequencies, S-band (2 GHz) and X-band (8 GHz), are most widely used in global geodetic VLBI observations. The RF frequency bands in the MK-3 system typically covered 200 MHz for S-band (e.g., from 2120 to 2320 MHz), and 420 MHz for X-band (e.g., from 8180 to 8600 MHz). Multi-frequency coaxial feed horns are widely used for dual-frequency reception. Other systems, including FSS (Frequency Selective Surfaces) and spiral arrays (see, e.g., Figure 44), are also used.

Figure 44: The S/X spiral-array feed system, developed by Hosei University, Japan, is used for dual-frequency reception in VERA.

2.1.2 First Frequency Conversion

In the receiver room, usually built into the antenna structure, the RF signals are amplified by the S-band and X-band low-noise amplifiers, and downconverted to, for example, 100 - 300 MHz (S-band) and 100 - 520 MHz (X-band), respectively, with the local oscillator signals at frequencies of 2020 MHz (S-band) and 8080 MHz (X-band), which are generated from a reference signal provided by the Hydrogen Maser Frequency Standard.

2.1.3 Transmission to Control Building

Then, the IF signals are fed to a so-called MK-3 VLBI terminal rack in a control building, via transmission cables which are usually laid in underground ducts, in order to minimize phase fluctuations due to cable length variations induced by temperature changes.

2.1.4 Intermediate Frequency Distributer

Figure 45: Data flow in the MK-3 VLBI terminal: the IF signal from the receiver enters the IF distributer, which makes n identical copies; the video converters select specified frequency ranges and convert them to video bands; and the formatter digitizes and formats the data and adds time codes, before passing them to the VLBI recorder.

In the VLBI terminal rack, the S-band and X-band IF signals are copied over into 6 (S-band) and 8 (X-band) identical signals, respectively, by the so-called "Intermediate Frequency Distributer" unit (Figure 45).

2.1.5 Baseband Conversion

The 6 and 8 identical IF signals are then sent to 14 special units consisting of frequency downconverters, called "Video Converters" (VC's) (or "Baseband Converters", BBC's), located in the same rack. Each Video Converter selects a specified frequency range in the IF band, and downconverts this range into a video band, typically from 0 to 2 MHz. For this purpose, a frequency synthesizer, which is built into the Video Converter, generates an LO frequency at the edge of the selected IF range, using a reference signal provided by the Hydrogen Maser Frequency Standard. The Video Converter is equipped with a sideband-rejection mixer, which is capable of converting both the lower and upper sidebands around the selected LO frequency, separately. Thus, in total, up to 28 (14 × 2 SB's) baseband frequency channels are generated by the 14 Video Converters. Usually, of course, the baseband channels are selected in such a way that they correspond to different RF frequency ranges. Table 6 and Figure 46 show an example of the LO frequency distribution used in a recent geodetic VLBI observation. In this apparently strange distribution, the data among the baseband channels, which are to be subsequently digitized and recorded, are spread over fairly wide ranges of the RF bands; this is specifically designed for better estimation of the group delay observables via a technique called "bandwidth synthesis", which will be discussed later.
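The bookkeeping between baseband and RF frequencies is simple; a sketch (the function name is illustrative), using channel 1 of Table 6 as an example:

```python
def baseband_to_rf(f_lo, f_baseband, sideband):
    """RF frequency observed at a given baseband frequency of one Video Converter."""
    if sideband == "USB":               # upper sideband: RF above the LO
        return f_lo + f_baseband
    if sideband == "LSB":               # lower sideband: RF below the LO
        return f_lo - f_baseband
    raise ValueError(f"unknown sideband {sideband!r}")

# Channel 1 of Table 6: LO at 8209.99 MHz; its 4 MHz USB covers 8209.99 - 8213.99 MHz.
lo_edge = baseband_to_rf(8209.99, 0.0, "USB")
hi_edge = baseband_to_rf(8209.99, 4.0, "USB")
print(lo_edge, hi_edge)
```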

Figure 46: Distributions of the baseband channels within the RF frequency ranges, in the sample geodetic VLBI observation of Table 6. Top: X-band (8200 - 8600 MHz), and bottom: S-band (2220 - 2360 MHz).


Ch#   band   LO frequency (MHz)   bandwidth (MHz)
 1     X         8209.99               4.00
 2     X         8219.99               4.00
 3     X         8249.99               4.00
 4     X         8309.99               4.00
 5     X         8419.99               4.00
 6     X         8499.99               4.00
 7     X         8549.99               4.00
 8     X         8569.99               4.00
 9     S         2239.99               4.00
10     S         2244.99               4.00
11     S         2259.99               4.00
12     S         2289.99               4.00
13     S         2319.99               4.00
14     S         2329.99               4.00
15     S         2344.99               4.00
16     S         2354.99               4.00

Table 6: An example of the distribution of LO frequencies in the baseband channels. Note that this example is taken from a recent geodetic VLBI observation, which used the K-4 VLBI system. The K-4 is newer than the MK-3, and has 16 Video Converters (and therefore 16 baseband channels), each with 4 MHz bandwidth.


2.1.6 Formatter

The baseband signals from the 14 Video Converters (possibly containing both USB and LSB channels) are then fed to 14 "Formatter" units in the same rack, each of which converts the analog baseband signal into a digital signal, using a high-speed sampler with a one-bit (or two-level) quantization scheme, at the Nyquist sampling rate (i.e. 2 B_{VC} samples/sec, where B_{VC} \simeq 2 MHz is the bandwidth of each video band channel). At the same time, the Formatter periodically generates time mark codes, using the reference signal and the clock pulse provided by the Hydrogen Maser Frequency Standard, and inserts them into the digitized data. These time mark codes play basic and important roles in the later digital correlation processing.

2.1.7 Data Recorder

Up to 28 channels of the digitized, formatted data are sent to a high-speed data recording device; in the case of the MK-3 system and its successors, this is the Honeywell M-96 open-reel digital recorder (see Figure 38). The formatted multi-channel data are recorded in parallel tracks. From the above discussion, we see that the maximum recording rate in the MK-3 system is given by:

1 bit/sample × 2 × 2 Msample/sec (Nyquist rate) × 2 sidebands × 14 Video Converters = 112 Mbit/sec.

2.1.8 Phase and Delay Calibration

The 14 Video Converters (VC's) add their own arbitrary initial phases to the converted video band signals. For an accurate estimation of the group delay, it is necessary to calibrate the phase offsets among the video bands due to the VC initial phases. For this purpose, a special device called the "Phase and Delay Calibrator" is used (Figure 47). This calibrator system, which consists of a Ground Unit in the VLBI terminal rack and an Antenna Unit in the receiver room, generates a so-called "comb-tone" signal, which is nothing but the comb function given in equations (19) and (23), i.e. equally spaced (in time) delta functions (a "pulse series"). As we saw before, the Fourier transform of a comb function in the time domain is a comb function in the frequency domain. Therefore, in the frequency domain, the comb-tone signal consists of a large number of sine waves at equally spaced frequencies. Moreover, the sine waves have regularly aligned phases, based on the reference signal from the Hydrogen Maser Frequency Standard. The Antenna Unit generates the comb-tones and adds them to the observed signal in the RF band. In actual observations, the LO frequencies of the Video Converters are selected in such a way that at least one comb-tone signal falls into each of the video bands (e.g., at a frequency of 10 kHz in the MK-3 system). The comb-tone signals pass through the same units as the observed data do, and, therefore, are affected by the same phase offsets due to the VC initial phases as the observed data are. Then, the data containing the comb-tones are digitized and recorded. These comb-tones are later detected in the correlation processing, and the phases of the comb-tones in the different video bands are compared with each other, in order to estimate, and then remove from the observed data, the unknown phase offsets due to the VC initial phases.
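The tone-phase detection described above can be sketched as a digital lock-in: correlate the recorded samples with a complex tone at the known calibration frequency. The numbers below are illustrative (a 10 kHz tone in a 2 MHz video band, Nyquist-sampled at 4 Msps, with an assumed injected phase of 0.7 rad):

```python
import numpy as np

def tone_phase(x, fs, f_tone=10.0e3):
    """Estimate the phase of a calibration tone by correlating the samples
    with a complex tone at the known frequency."""
    n = np.arange(len(x))
    return np.angle(np.sum(x * np.exp(-2j * np.pi * f_tone / fs * n)))

fs = 4.0e6                                   # Nyquist sampling of a 2 MHz video band
n = np.arange(40_000)                        # an integer number of 10 kHz periods
rng = np.random.default_rng(3)
x = np.cos(2.0 * np.pi * 10.0e3 / fs * n + 0.7) + 0.5 * rng.normal(size=n.size)
print(tone_phase(x, fs))                     # recovers the injected ~0.7 rad
```

Comparing such phase estimates across the video bands gives the relative VC phase offsets to be removed.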
Figure 47: The phase and delay calibration system in the MK-3. The Ground Unit drives the Antenna Unit with a reference tone from the H-maser frequency standard, and the cable delay is monitored by a round-trip measurement. The Antenna Unit injects comb pulses in time (equivalently, a sum of tones \sum_n B_n \cos[n \omega (t - \tau)]) at the antenna feed; these comb tones go through the same path as the radio source signal (IF frequency distributer, Video Converter, Formatter, and Data Recorder), appearing in the spectrum, after Fourier transform, as equally spaced tones, with one tone at 10 kHz in every video band channel. The phase of the 10 kHz tone is detected in the Correlation Processor by correlating with a standard 10 kHz signal.



2.1.9 Hydrogen Maser Frequency Standard

The Hydrogen Maser Frequency Standard provides very high stability reference signals, usually at 5 MHz or 10 MHz, to the first downconverters in the receiver room, the Video Converters, the Formatters, and the Phase and Delay Calibrator. It also provides clock pulses, usually at 1 PPS (pulse per second), to many devices, to guarantee their synchronous operation. The Hydrogen Maser Frequency Standard is usually placed in a special magnetically-shielded and temperature-controlled room, to shield it against external disturbances as much as possible. Figure 48 shows a Hydrogen Maser Frequency Standard installed at the VERA Mizusawa station.

Figure 48: A Hydrogen Maser Frequency Standard in the VERA Mizusawa station.

2.1.10 Automated Operation by Control Computer

All devices in a VLBI observing station are usually designed to be fully controllable by a single control computer (Figure 49). All commands for setting LO frequencies, sampler modes, recording rates, etc., are issued remotely. Also, commands for antenna operations, measurements of the system noise temperature and meteorological parameters, and so on, are issued from the control computer. In a VLBI observation, a so-called VLBI schedule file is given to the control computer, in which all setting information, the coordinates of the observed radio sources, and the station locations are listed. Also, the detailed time sequence of the observing events, such as repointing the antenna to the next source, tape start, tape stop, and so on, is given in Universal Time (UT).

Figure 49: A typical automated VLBI operation system. The control computer, running the Field System, reads a schedule file (containing the observation sequence, and the source and antenna information), issues commands to the antenna control unit, the VLBI terminal, and the data recorder, and writes a log file and a Mark III database.

The control computer automatically conducts all steps of the observation by issuing commands to the antenna control unit, the VLBI terminal, and the data recorder, according to the schedule file. As a result, the only remaining task to be done by an operator in a normal VLBI observation is to change the recording tapes once an hour, or at most a few times a day, depending on the system design. The most widely used software for automated control of VLBI equipment, developed at the NASA Goddard Space Flight Center (NASA/GSFC), is called the "VLBI Field System".

2.2 Modern VLBI Systems

2.2.1 New Recording and Fiber-Link Systems

The major successors of the MK-3 system are listed in Figure 37. They are the VLBA and Mark IV systems, based on advanced head controls of the Honeywell M-96 recorder (Figure 38); the S2 system, using 8 video cassette recorders in parallel (Figure 39); and the K-4 system, based on an ID1 digital video cassette recorder (Figure 40). Nowadays, modern VLBI recording systems with 1 Gbps recording speed have been developed, and some of them are already in use (Table 7). They are the Mark V system (Figure 50), using a hard disk array instead of magnetic tapes; the K-5 system, also based on a hard disk array (Figure 51); the GBR-1000 system, based on an HDTV video cassette recorder (Figure 52); the VERA recording system, based on a new ID1 digital video cassette recorder (Figure 53); and the fiber-linked e-VLBI.

name       type                    bitrate
Mark V     hard disk array         1024 Mbps
K-5        hard disk array         512 & 2048 Mbps
GBR-1000   HDTV                    1024 Mbps
VERA       ID1                     1024 Mbps
e-VLBI     real-time fiber link    > 2048 Mbps

Table 7: New generation recording and fiber-link systems.

Figure 50: Mark V recording system based on a new hard disk array design.



Figure 51: K­5 VLBI terminal.



Figure 52: GBR-1000 1 Gbps VLBI system based on an HDTV video cassette recorder.



Figure 53: VERA recording system based on a new ID1 digital video cassette recorder.

2.2.2 Digital Baseband Converters

Further progress is being made in several countries toward implementing digital BBC systems based on digital filtering technology. The idea is first to digitize the wideband IF data using a high-speed sampler (analog-to-digital converter), and second to cut the digital data into baseband channels by means of a high-speed digital filter. The baseband data then no longer suffer from the irregular bandpass characteristics of analog BBC's, or from the arbitrary phases added by the LO's of the BBC's. In fact, the digital filter can in principle remove the need for the phase calibration systems discussed above, as far as the LO phase problem alone is concerned. Also, the digital filter allows flexible organization of the baseband channels to meet different scientific requirements. For example, single-channel ultra-wideband data could be suited to imaging very weak continuum sources, while 16-channel data are necessary for geodetic observations. Such a digital BBC system was developed and successfully utilized in the 4 stations of the VERA array.

Figure 54: Data Acquisition System of the KVN, adopting the digital filter system for the baseband conversion.

Figure 54 shows the Data Acquisition System (DAS) of the KVN, which uses a digital BBC system for the baseband conversion. The DAS includes 4 high-speed samplers located in the receiver room. The 4 samplers are equipped for 4 frequency bands to be simultaneously received in mm-wave VLBI observations. Each sampler is capable of digitizing IF data of 512 MHz bandwidth, using the higher-order sampling technique discussed earlier, at 1 Gsps speed in the 2-bit quantization mode (therefore, the output bit-rate is 2 Gbps). The digital data are transmitted to the control building via optical fiber cables. Then the digital filter forms 1, 2, 4, 8, or 16 baseband channels with bandwidths of 256, 128, 64, 32, and 16 or 8 MHz, correspondingly, out of the 512 MHz / 2-bit / 1 Gsps input, depending on the scientific purposes.

2.2.3 e-VLBI

Several groups in the world are now developing VLBI systems based on high-speed data transmission via optical fibers. The idea is to replace the data tapes by ultra-wide-band transmission cables. This technology is now called "e-VLBI". Figures 55 and 56 show the first successful EVN (European VLBI Network) e-VLBI observation, conducted by three observatories in April 2004, and a beautiful image of the gravitational lens object B0218+357 obtained through this coordinated effort.

Of course, this technology by no means transforms VLBI into a connected-element interferometer, since the frequency standards at the different stations must remain independent, in view of the essential technical difficulties in transmitting reference signals over thousands of kilometers without significant delays or phase fluctuations. Nevertheless, this technology will bring VLBI much closer to the connected-element interferometer, in the sense that the observed data could be correlated and analyzed in real time, or almost in real time ("near-real-time"). For example, pioneering real-time e-VLBI experiments conducted in the KSP (Key Stone Project, 1995-2001) of the CRL (now NICT) regularly yielded final geodetic results a few minutes after each observation. Moreover, the optical fiber cables offer even higher data transmission rates, and therefore higher sensitivity, than the magnetic tapes or hard disk arrays. Real-time VLBI experiments with a 2.5 Gbps transmission rate have been successfully conducted since 1998 (Figure 57). Even much higher transmission rates are expected in new connected-element interferometer arrays (for example, 96 Gbps per antenna is planned for the Expanded VLA (EVLA), and for the Atacama Large Millimeter and submillimeter Array (ALMA)). High-speed correlators are now being developed to meet these high data rates. At the same time, less expensive and more widely accessible VLBI data transmission, via broad-band Internet using the IP protocol, is also being intensively studied, and has been successfully tested. This IP-based e-VLBI, or IP-VLBI, seems to be a particularly promising technology, which will make VLBI observations much more user-friendly for many astronomers and geophysicists around the world. The K-5 system, developed at the NICT, is designed to realize the IP-VLBI concept (Figure 58).

Figure 55: The first EVN e-VLBI observation (2004 April), using radio telescopes at Jodrell Bank, UK; Westerbork, the Netherlands; and Onsala, Sweden. The lower panels show beats of the fringe amplitudes produced by two closely spaced sources, and an image map obtained during the observation (from URL: http://www.jive.nl).

Figure 56: A close-up view of the image of the radio-loud gravitational lens JVAS B0218+357, obtained in the first EVN e-VLBI observation (from URL: http://www.jive.nl).

2.2.4 VLBI Standard Interface (VSI)

Another remarkable example of progress in modern VLBI is the definition of the international "VLBI Standard Interface (VSI)" specifications (http://web.haystack.edu/vsi). The world VLBI community has long suffered from the incompatibility of the various VLBI systems shown above, which were developed at different institutions. In order to cross-correlate data recorded on different tapes by different VLBI systems A and B, say, one had to convert the format of B's data into A's format, copy the converted data to A's tape, and then cross-correlate them with A's correlator, or vice versa. The purpose of the VSI specifications is to make all VLBI systems in the world compatible, provided only that they obey these standard specifications. So far, a hardware specification called "VSI-H" and a software specification called "VSI-S" have been worked out by an international working group of VLBI specialists. Another specification for e-VLBI, to be called "VSI-E", is now under intensive discussion and development.

Figure 57: 2.5 Gbps fiber-linked real-time VLBI experiment "OLIVE-GALAXY", Japan (1998, December).

Figure 58: IP-based e-VLBI developed for the K-5 system.

The VSI-H hardware specification defines a standard interface between the "data-acquisition system (DAS)" (i.e. VLBI terminals) and the "data-transmission system (DTS)" (i.e. tapes or hard disks), as well as between the "data-transmission system" and the "data-processing system (DPS)" (i.e. correlators). Figure 59 shows how the interface could be realized. Every DAS, DTS, and DPS must have common connectors with well-specified pin-assignments and data rates. The actual data transmission media (the handler units for tapes or hard disks) are supposed to be equipped with a "data input module (DIM)" and a "data output module (DOM)", which interface between the common connectors and the connectors of the handler hardware. The actual data formats in the individual data transmission media are arbitrary, provided that the DIM accepts the input data stream (specified by VSI-H), and the DOM yields the output data stream (also specified by VSI-H).

Figure 59: VSI-H functional diagram (http://web.haystack.edu/vsi).

The VSI-S software specification defines protocols for handling VSI-H-compliant equipment. A number of VSI-H-compliant VLBI systems have been developed in various countries, and successfully cross-correlated with each other.

3 Difficulties in Ground-Based VLBI

UNFINISHED.

4 Correlation Processing in VLBI

UNFINISHED.

5 Observables of VLBI

UNFINISHED.


References
Allan, D.W., 1966, Statistics of Atomic Frequency Standards, Proc. IEEE, 54, 221-230.

Hagen, J.B., and Farley, D.T., 1973, Digital Correlation Techniques in Radio Science, Radio Sci., 8, 775-784.

Iguchi, S., and Kawaguchi, N., 2002, Higher-Order Sampling, Over Sampling and Digital Filtering Techniques for Radio Interferometry, IEICE Trans. Commun., E85-B, 1806-1816.

Meijering, E., 2002, A Chronology of Interpolation: From Ancient Astronomy to Modern Signal and Image Processing, Proc. IEEE, 90, 319-342.

Papoulis, A., 1984, Probability, Random Variables, and Stochastic Processes, 2nd Edition, McGraw-Hill (New York).

Price, R., 1958, A Useful Theorem for Nonlinear Devices Having Gaussian Inputs, IRE Trans. Inf. Theory, IT-4, 69-72.

Rogers, A.E.E., and Moran, J.M., 1981, Coherence Limits for Very-Long-Baseline Interferometry, IEEE Trans. Instrum. Meas., IM-30, 283-286.

Shannon, C.E., 1949, Communication in the Presence of Noise, Proc. IRE, 37, 10-21.

Takahashi, F., Kondo, T., Takahashi, Y., and Koyama, Y., 2000, Very Long Baseline Interferometer, Ohmsha (Tokyo).

Thompson, A.R., Moran, J.M., and Swenson, G.W., Jr., 2001, Interferometry and Synthesis in Radio Astronomy, John Wiley & Sons (New York).

Van Vleck, J.H., and Middleton, D., 1966, The Spectrum of Clipped Noise, Proc. IEEE, 54, 2-19.
