
Start times in the 4 wapps differ in pulsar observing

06jan07

An archive of the headers from the wapp pulsar files is made each month. At the end of the month, all of the pulsar files on disc are scanned and the headers are archived. Many pulsar data files are removed from the online disc before the end of each month, so this process only archives a fraction of the data actually taken.

An observation can use 1 to 4 wapps. When starting an observation, the wapps in use are told to configure themselves and then start taking data on the next hardware 1 second tick. The start time for each wapp is stored in the header of that wapp's datafile.

The start of data taking is determined by a hardware 1 second tick (locked to the hydrogen maser). The time recorded in the header comes from the local clock on each wapp. This time is maintained by the ntp daemon on each wapp.

There has been a problem where the time on each wapp was drifting (see ntp problems with wapps). This seemed to clear up when the wapp Linux kernels were upgraded.

I recently went through the wapp pulsar archive and looked at the number of times the start times for the different wapps used in an observation differed. The data covered jan05 through jan07. The time difference was usually 1 second.

The plots show when the wapp start_times differed (.ps) (.pdf):

Top plot. Total observations, total mismatched observations: The black line is the total number of observations that were looked at by month. The red line is the total number of these observations that had mismatches in at least one of the header start times.
Bottom: The fraction of time there was a mismatch: This is the number of mismatched observations divided by the total number of observations. There was a large drop in jun05 when the wapp4 kernel was updated. This update solved the ntp time drift. The fraction of observations with a mismatch still runs between 5 and 20%.
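
The ratio in the bottom plot can be sketched as below. This is only an illustration: the function name, the (month, start_times) input layout, and the sample values are made up here and are not the archive format or real data.

```python
from collections import defaultdict

def mismatch_fraction_by_month(observations):
    """observations: iterable of (month, start_times) pairs, where
    start_times holds the header start time (in seconds) of each wapp
    used in that observation. The layout is an assumption for this sketch."""
    total = defaultdict(int)
    mismatched = defaultdict(int)
    for month, start_times in observations:
        total[month] += 1
        # An observation is mismatched if at least one wapp header disagrees.
        if len(set(start_times)) > 1:
            mismatched[month] += 1
    return {m: mismatched[m] / total[m] for m in total}

# Made-up example values, not taken from the archive.
obs = [
    ("jan05", [1000, 1000, 1001, 1000]),  # one wapp is 1 second late
    ("jan05", [2000, 2000]),
    ("feb05", [3000, 3000, 3000, 3000]),
]
fractions = mismatch_fraction_by_month(obs)
```

An observation counts once no matter how many of its wapps disagree, which matches the "mismatches in at least one of the header start times" definition used for the red line.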

The question is whether the start time in the header is the real start time (one or more wapps started late) or all of the wapps started together and the time in the header is wrong. When the ntp time was drifting then the time in the header could have been wrong. After the kernels were upgraded my guess is that the times in the header are correct and the wapps are actually starting at different times.

How pulsar observations are started.

As far as I can tell, the wapp startup sequence for pulsar observing is:

In the cima gui, wapp_sendhdr() in wapp.tcl loops over the wapps, filling in the wapp header for each one and then sending the message to wappcon on that wapp. The requests therefore reach the wapps spaced by the time it takes to go through this loop (it looks like computing the polyco coefficients might take a little while).
wappcon on each wapp receives the wapp header from a socket. It waits for wapprt to be not busy and then loads the header into shared memory for wapprt to grab.
wapprt gets the header out of shared memory and then prepares to take data. This includes:

Configures the wapp for the requested data taking mode.
Checks the available disc space on all discs.
Preallocates all files needed for the observation.

It then waits until the clock time is .5 seconds before the next second. It does this by:

Get the current fraction of a second (FracSec) from the cpu time.
If FracSec < .5 seconds, wait for .5 - FracSec. You should wake up .5 seconds before the next tick. You stay in the current second.
If FracSec > .5 then wait for 1.5 - FracSec. This will put you at .5 seconds into the next second.
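
The wait computation above can be sketched as follows. This is not the wapprt source; the function names are hypothetical, and treating FracSec exactly equal to .5 as the "wait for 1.5 - FracSec" case is an assumption.

```python
import time

def frac_of_second(now=None):
    """Fractional part of the clock time in seconds (FracSec)."""
    if now is None:
        now = time.time()
    return now % 1.0

def wait_before_tick(frac_sec):
    """Seconds to wait so that we wake up .5 seconds into a second."""
    if frac_sec < 0.5:
        # Stay in the current second and wake .5 s before its closing tick.
        return 0.5 - frac_sec
    # Assumption: FracSec == .5 is handled here as well.
    # Wake at .5 s into the next second.
    return 1.5 - frac_sec
```

A caller would do something like time.sleep(wait_before_tick(frac_of_second())) and then arm the hardware to start on the following 1 second tick.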

The above algorithm has a built-in race condition.

If a wapp gets to the wait code before .5 seconds then it starts on the next tick.
If a wapp gets to the wait code on or after .5 seconds then it will wait 1 extra second.
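
A small sketch of the race, assuming each wapp starts on the first hardware tick after it wakes from the wait. The function name and the arrival times are made up for illustration.

```python
import math

def start_tick(arrival):
    """Whole second on which data taking starts for a wapp that reaches
    the wait code at clock time `arrival` (seconds)."""
    frac = arrival % 1.0
    wait = 0.5 - frac if frac < 0.5 else 1.5 - frac
    wake = arrival + wait        # always close to .5 s into some second
    return math.floor(wake) + 1  # first tick after waking (assumed)

# Two wapps reaching the wait code only 20 milliseconds apart:
tick_a = start_tick(100.49)  # just before the .5 second threshold
tick_b = start_tick(100.51)  # just after the .5 second threshold
```

Here tick_a is 101 and tick_b is 102: a 20 millisecond difference in arrival becomes a full 1 second difference in start time, which is the size of the mismatch usually seen in the headers.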

The algorithm has the following problems:

The requests to each wapp are not issued simultaneously.
After receiving the start request, different wapps could have different setup times (especially allocating and checking the discs).
There does not seem to be any time synchronization when the initial request from the gui is sent. It is sent whenever the user pushes the button. You'd like to see it happen around a 1 second tick. That would give the maximum time before the .5 second threshold arrived. The worst time would be if the user pushed the button a few milliseconds before .5 seconds of the current second.
