Original document: http://www.atnf.csiro.au/vlbi/dokuwiki/doku.php/correlator/archiving

correlator:archiving [ATNF VLBI Wiki]


Archiving LBA or AuScope data products from the correlator

Archiving the Correlator Output Data

The correlator archive now lives on iVEC's new data store. (You'll need your own iVEC account for access.)

If you log in to the data store via your web browser, there are some data management scripts for command line use available under "Tools".
ashell.py can be run from the corr account on CUPPA. It is advisable to run it in screen, as the transfers can take a while.

There is a wrapper script for ashell.py that automatically tars up and transfers the data required for the archive:

archivec.py $CORR_DATA/<expname> /projects/VLBI/Archive/LBA/<exp_parent>

It will prompt for username and password.

To do this by hand:

corr@cuppa03:~$ ashell.py 
Welcome to ashell v0.8, type 'help' for a list of commands
ivec:offline>login
Username: hbignall # insert your iVEC user name here - same login as you use on other Pawsey Centre machines e.g. galaxy
Password:
ivec:online>cf VLBI/Archive/Curtin/[AuScope] # changes to this directory on the data store 
ivec:online>cd /data/corr/corrdat            # changes the local working directory
ivec:online>put /data/corr/corrdat/{expname} # will transfer the whole {expname} directory to the data store
ivec:online>logout                           # when finished, to prevent someone else using your login
ivec:offline>exit

Archiving the pipeline outputs

The pipeline outputs are distributed to PIs via the wiki. lba_feedback.py creates the wiki page, and archpipe sends the archive plots and the wiki page to the wiki. The wiki pages are linked from the correlator records spreadsheet (LBA or AuScope).

cd /data/corr/pipe/<expname>/out
lba_feedback.py <expname>.wikilog > <expname>.txt
archpipe <expname>
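
The same steps can be scripted. The following is a minimal sketch, not part of the existing tooling: it assumes the pipeline area is /data/corr/pipe as in the commands above, that the wiki text file is named <expname>.txt, and that both tools are on the PATH. The function names and the experiment code used in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def pipeline_archive_steps(expname, pipe_root="/data/corr/pipe"):
    """Return (working directory, wiki text file, commands) for one experiment."""
    out_dir = Path(pipe_root) / expname / "out"
    wiki_txt = out_dir / f"{expname}.txt"  # assumed output name
    commands = [
        ["lba_feedback.py", f"{expname}.wikilog"],  # its stdout goes to wiki_txt
        ["archpipe", expname],
    ]
    return out_dir, wiki_txt, commands


def run_pipeline_archive(expname, pipe_root="/data/corr/pipe"):
    """Run both steps in order; check=True stops at the first failure."""
    out_dir, wiki_txt, (feedback_cmd, archpipe_cmd) = pipeline_archive_steps(
        expname, pipe_root
    )
    # Equivalent of the shell redirect "> <expname>.txt"
    with open(wiki_txt, "w") as fh:
        subprocess.run(feedback_cmd, cwd=out_dir, stdout=fh, check=True)
    subprocess.run(archpipe_cmd, cwd=out_dir, check=True)
```

Calling run_pipeline_archive("v999z") (a made-up experiment code) would run the two commands from /data/corr/pipe/v999z/out, so archpipe is not started if lba_feedback.py fails.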

Notes on what needs to go into the archive (for correlation with DiFX versions 1.5 and higher)

Associated with each job

  • .joblist (not critical)
  • .v2d
  • .vex
  • .difxlog (DiFX messages from errormon2)

Unique to each job

  • .input
  • .calc
  • .im
  • .uvw [DiFX 1.5]
  • .delay [DiFX 1.5]
  • .rate [DiFX 1.5]
  • .flag
  • .difx (output directory - may contain multiple files)

Additional files for pulsar modes only:

  • .polyco
  • .binconfig

Final output

  • FITS files - may be associated with multiple jobs, arbitrarily named. Include README (as per description of output files on wiki page).
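
The per-job lists above lend themselves to a quick completeness check before archiving. This is a rough sketch only: it assumes every per-job file is named <job><suffix> in a single directory, which may not hold for every file type (e.g. the pulsar files), and the function name and flags are hypothetical.

```python
from pathlib import Path

# Suffixes copied from the lists above; the grouping into sets is an assumption.
PER_JOB = [".input", ".calc", ".im", ".flag", ".difx"]
DIFX_1_5_ONLY = [".uvw", ".delay", ".rate"]
PULSAR_ONLY = [".polyco", ".binconfig"]


def missing_job_files(job_dir, job_name, difx_1_5=False, pulsar=False):
    """Return the expected per-job suffixes that are absent for job_name."""
    expected = list(PER_JOB)
    if difx_1_5:
        expected += DIFX_1_5_ONLY
    if pulsar:
        expected += PULSAR_ONLY
    job_dir = Path(job_dir)
    # exists() is true for both plain files and the .difx output directory
    return [s for s in expected if not (job_dir / f"{job_name}{s}").exists()]
```

An empty list means nothing from the selected sets is missing; the files shared between jobs (.joblist, .v2d, .vex, .difxlog) and the FITS output are not covered by this check.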

Notes

Jobs may live in subdirectories, and files from jobs in different subdirectories may have identical filenames.

For accountability it's important to keep the directory structure.
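
As an illustration of why the structure matters when tarring by hand, here is a small sketch using Python's standard tarfile module. It is not a description of how archivec.py works internally (that is not documented here); the function name is hypothetical.

```python
import tarfile
from pathlib import Path


def tar_experiment(exp_dir, tar_path):
    """Tar exp_dir so member names stay relative to the experiment directory."""
    exp_dir = Path(exp_dir)
    with tarfile.open(tar_path, "w") as tar:
        # arcname keeps "<expname>/<subdir>/<file>" and drops the absolute prefix;
        # add() recurses into subdirectories by default
        tar.add(exp_dir, arcname=exp_dir.name)
    return tar_path
```

Because each member keeps its subdirectory in its name, two files with the same filename in different subdirectories remain distinct in the tar file.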

Ideally we want to keep all relevant files for all production jobs.

NB: the following is mainly relevant for old versions of DiFX (pre DiFX-2). In some cases SWIN format output data may be impractically large for online storage, and it may not be desirable to keep this intermediate-stage data. One example is output from DiFX 1.5 when a full band is correlated at high spectral resolution but the user only wants the subset of the band containing the spectral lines. In this case the output FITS data will generally be a manageable size, but the SWIN output data in the .difx directories will typically be at least several times larger (e.g. it covers 16 MHz, while the region of interest is only 2 MHz wide).

It may be useful to keep all jobs, including clock search and test jobs (e.g. to have some data at higher spectral resolution for checking). Clock search jobs are usually in a "clocks" subdirectory, but it won't necessarily be obvious which jobs are test or final production. Test jobs could be manually moved to a subdirectory (e.g. "test"). Other (dated) subdirectories may exist for production jobs, especially where multiple runs were needed. Note: running with espresso now creates a comment.txt file containing a description of each job (the operator is prompted to edit the file at completion of correlation). Espresso also allows test jobs to be specified at run time, in which case they are written to a "test" subdirectory in the output data area.

Not useful to keep:

  • jobs with zero-length or no output files
  • MPI related files (threads, machines, run)
  • station file lists (already in .input)
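
Jobs with no usable output can be spotted before archiving. The sketch below covers only the first item in the list above and only reports candidates; it deletes nothing. It assumes output lives in directories whose names end in .difx, as described earlier, and the function name is hypothetical.

```python
from pathlib import Path


def empty_difx_outputs(exp_dir):
    """Return .difx output directories under exp_dir holding no non-empty file."""
    empty = []
    for difx_dir in sorted(Path(exp_dir).rglob("*.difx")):
        if not difx_dir.is_dir():
            continue
        # A directory counts as empty if it has no files or only zero-length files
        has_data = any(
            f.is_file() and f.stat().st_size > 0 for f in difx_dir.rglob("*")
        )
        if not has_data:
            empty.append(difx_dir)
    return empty
```

The returned paths are worth reviewing by hand, since a job listed here may still be of interest for its input files or comment.txt.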
correlator/archiving.txt · Last modified: 2016/03/30 14:44 by cormac