Äîêóìåíò âçÿò èç êýøà ïîèñêîâîé ìàøèíû. Àäðåñ îðèãèíàëüíîãî äîêóìåíòà : http://www.eso.org/~qc/dfos/phoenix_instances.html
Äàòà èçìåíåíèÿ: Fri Apr 1 12:52:06 2016
Äàòà èíäåêñèðîâàíèÿ: Sun Apr 10 01:19:13 2016
Êîäèðîâêà:
Ïîèñêîâûå ñëîâà: stonehenge

Common DFO tools: phoenix instances

Common DFOS tools:
Documentation

dfos = Data Flow Operations System, the common tool set for DFO

make printable

PHO
ENIX

phoenix: instances

Find here an overview of the characteristics of the currently existing PHOENIX instances:

UVES

The UVES PHOENIX process was the first one to be implemented. It builds on the experience from the first reprocessing project (UVES data 2000-2007). It supports the ECHELLE mode (with and without SLICERs) but the extended mode and not the FLAMES/UVES (MOS) data.

It uses the existing historical ABs and the existing historical master calibrations. It provides a flux calibration with master response curves that are versioned historically. This is available for certain standard setups but not for all. Hence all IDPs in these standard setups come flux-calibrated, while the IDPs for the unsupported setups come without flux calibration.

To observe:

The master response curves are meant to provide the proper slope of the spectra but cannot always provide the proper scale. In case of significant changes of the chromatic properties of optical elements of the instrument, or of calibration lamps, the master response curves for the respective new period would need to be constructed from daily response curves.
UVES IDPs are not pipeline-provided but are generated by a post-pipeline tool called idp2sdp that is provided by vforchi@eso.org.
QC reports (actually spectral overviews) are generated and stored on qcweb, but not distributed to the archive users.
UVES IDPs for the RED arm come in one single spectrum (composed of the two REDU and REDL pipeline products). There is one IDP per raw file.

History: http://qcweb/UVESR_2/monitor/FINISHED/histoMonitor.html

Release description: http://www.eso.org/observing/dfo/quality/PHOENIX/UVES_ECH/processing.html

Monitoring of quality: by an automatic scoring system, with a dedicated qc1 database table. A WISQ monitoring process is TBD.

Stream processing: in the standard monthly way, can also be done on a daily basis. Requires the master calibrations to be available, which is normally the case within a few days. Within the monthly loop, a given day is processed in the standard CONDOR way: ABs first, then QC reports.

XSHOOTER

The XSHOOTER PHOENIX process uses the existing historical ABs and the existing historical master calibrations. It supports all three SLIT modes, but not IFU. The NODDING stacks are not combined but processed individually (because it is impossible to read from the headers if stacking would be appropriate or not).

It uses the existing historical ABs and the existing historical master calibrations. It provides a flux calibration with master response curves that are versioned historically. They are available for all setups, hence all XSHOOTER IDPs come flux-calibrated. A telluric correction in the NIR arm is not provided.

To observe:

The master response curves are meant to provide the proper slope of the spectra but cannot always provide the proper scale. In case of significant changes of the chromatic properties of optical elements of the instrument, or of calibration lamps, the master response curves for the respective new period would need to be constructed from daily response curves.
The pipeline products come in the IDP format. The idpConvert_xs tool adds/modifies some header keys.
QC reports (spectral overviews) are generated and stored on qcweb, and distributed to the archive users.
XSHOOTER IDPs come in one spectrum per ARM. There is one IDP per raw file.

History: http://qcweb/XSHOOT_R/monitor/FINISHED/histoMonitor.html

Release description: http://www.eso.org/observing/dfo/quality/PHOENIX/XSHOOTER/processing.html

Monitoring of quality: by an automatic scoring system, with a dedicated qc1 database table. A WISQ monitoring process is TBD.

GIRAFFE

The GIRAFFE PHOENIX process partly uses the existing historical ABs and the existing historical master calibrations. The early data (from 2003/2004) have seen the phoenix 2.0 project GIRAF_M which constructed new master calibrations (they were previously not existing). Based on those master calibrations, the tool phoenixPrepare has been developed to use calSelector for the creation of the science ABs. These new ABs, together with the historical ABs for later periods, were then used to create the historical batch.

The PHOENIX process covers the Medusa1/2 modes (MOS), but not the IFU modes. GIRAFFE IDPs come per science fibre, which means anything between 1 and 120 (typical number: 80-100) spectra per raw file. SKY and SIMCAL fibres are not producing IDPs but are added as ancillary files.

The process does not provide a flux calibration. Data come on a heliocentric wavelength scale.

To observe:

The pipeline provides the IDPs, including the splitting of the fibre signals into individual spectra. The idpConvert_gi tool adds/modifies some header keys.
QC reports (spectral overviews) are generated and stored on qcweb, and distributed to the archive users.
GIRAFFE IDPs come in one spectrum per fibre. There are many (up to 120) IDPs per raw file.
GIRAFFE phoenix jobs heavily saturate the muc08 machine. They should be scheduled with coordination with the other instances.

History: http://qcweb/GIRAF_R/monitor/FINISHED/histoMonitor.html . There is also the monitor for the master calibration project, under http://qcweb.hq.eso.org/GIRAF_M/monitor/FINISHED/histoMonitor.html.

Release description: http://www.eso.org/observing/dfo/quality/PHOENIX/GIRAFFE/processing.html

Monitoring of quality: by an automatic scoring system, with a dedicated qc1 database table. A WISQ monitoring process is done under http://www.eso.org/observing/dfo/quality/WISQ/HEALTH/trend_report_GIRAFFE_IDP_QC_HC.html.

Stream processing: in the standard monthly way. The daily call is also possible but rather inefficient. Due to the special setup of the GIRAFFE IDPs (many IDPs from one raw file), the AB processing takes rather short (a minute per file) while the QC reports take long (a minute per fibre). The optimized processing scheme is therefore the QC job concentration: as many QC jobs run in parallel as possible. For simplicity, this is reached by collecting QC jobs (hence the name) rather than starting to split the QC jobs by fibre. Individual QC job are still executed sequentially. Therefore a daily processing may take almost as long as a monthly processing.

The monthly processing requires the master calibrations to be available, which is normally the case within a few days.

MUSE

Find a detailed description of the MUSE PHOENIX process here.

Overview

Here is an overview of the technical aspects of the various phoenix processes and accounts. Highlights are marked in red.

Instrument/ release	hostname	N_cores	disk size/TB	disk space	pattern	schema for AB processing	QC reports	performance limited by	typical exectime
									AB	QC
UVESR_2	sciproc@muc08	32	7.0	not an issue	monthly*	condor, daily	condor, daily		<1 min	0.5 min
XSHOOTER_R	xshooter_ph@muc08				monthly*	condor, daily	condor, daily		2-7 min	0.5 min
GIRAF_R	giraffe_ph@muc08				monthly*	condor, daily	QCJOB_CONCENTRATION: monthly batch for efficiency	QC reports (because of 1 per fibre: ~100 per raw file)	0.3 min	30-40 min
MUSE_R	2 instances: SLAVE: muse_ph@muc09MASTER: muse_ph2@muc10	32	3.5 (50% quota of 7 TB disk)	to be monitored carefully (1 month of data about 3 TB if in 'reduced'; more if on $DFS_PRODUCT	daily	each account: 2 streams, INTernal**	seriell	- AB processing in 2 streams - memory (512 GB each, sufficient for N=16 combination)	5-30 min***	1 min
		48	10		daily

* can also be done daily, but then completeness needs to be carefully monitored
** most recipes run in two streams (using 24 cores each)
*** depending on AB type

Instrument/ release	SCIENCE AB source	stacks or single raw files?	conversion from pipeline format to IDP required?	monthly batch:
Instrument/ release	SCIENCE AB source	stacks or single raw files?	conversion from pipeline format to IDP required?	typical total number of files per month (ingested)****	IDPs only	exec times per month [hrs]	size [GB]	ingestion time [hrs]
UVESR_2	qcweb:UVES	single	yes, conversion tool idp2sdp	1,000*****	1,000	1	1	0.5
XSHOOTER_R	qcweb:XSHOOTER	single	no***	1,000	500	4	7	0.5
GIRAF_R	qcweb:GIRAF_R and qcweb:GIRAF_R2*	single	no***	20,000	10,000	5 - 10	7	25
MUSE_R	qcweb:MUSE_smart**	stacks	no***	...	...	many ...	...3000	long

* early epoch: mcalibs created with phoenix, ABs created with phoenixPrepare
** created with phoenixPrepare_MUSE
*** pipeline format is IDP; tool needed for header manipulation
**** including ANC files
***** plus 1 png per IDP that is not ingested

Data model: schema for AB source

In general the phoenix process takes the DFOS SCIENCE ABs, edits them with some PGI (in order to e.g. update pipeline parameters, unify static calibration schemes), and stores them after processing in a new repository. The idea is to have them there available in case of future further reprocessing projects. This might become an option (if e.g. a substantial algorithmic improvement would become available) or even a necessity (if e.g. a pipeline bug is discovered, resulting in an error with the already processed IDPs). Then, it is very advantageous to not start from the DFOS ABs again andapply again all modifications to the ABs and then add the new improvements, but instead take the existing phoenix ABs and reprocess them directly.

This approach is generally called a data model: all information about the processing is stored for easy replay. In the phoenix case, one would start from the new AB source, turn off the old AB modification PGI, replace it, if needed, by a new one, and start the entire reprocessing with a new pipeline version, new parameters, new master calibrations etc.

The standard data model for IDP production is:

Standard data model for PHOENIX: ABs are taken from the DFOS site, edited, processed, and get stored under a new tree.

For MUSE, there is the need to have an intermediate storage of ABs, and a final storage after editing and processing:

Data model for MUSE PHOENIX: ABs are taken from the DFOS site, get their input dataset modified, then are processed, and get stored under a new tree.

If an existing IDP project would need to be reprocessed, it could take the phoenix storage and reprocess from there:

Data model for future reprocessing projects

Common DFOS tools: Documentation

Common DFOS tools:
Documentation