Original document: http://www.adass.org/adass/proceedings/adass96/fitzpatrickm.html
Last modified: Tue Jun 23 21:15:32 1998
Next: The ROSAT RESULTS ARCHIVE: Tools and Methods
Previous: WIYN Data Distribution and Archiving
Up: Data Archives
Table of Contents - Index - PS reprint - PDF reprint
Mike Fitzpatrick and Doug Tody
IRAF Group,
NOAO, PO Box 26732, Tucson, AZ 85726
David L. Terrett
Rutherford Appleton Laboratory,
Chilton, Didcot, Oxfordshire OX11 0QX, United Kingdom
[1] Image Reduction and Analysis Facility, distributed by the National Optical Astronomy Observatories.
[2] National Optical Astronomy Observatories, operated by the Association of Universities for Research in Astronomy, Inc. (AURA) under cooperative agreement with the National Science Foundation.
The subject of automatic mirroring can be approached in one of two ways: from the standpoint of those wishing to export their archive for mirroring, and of those wishing to be a mirror for an existing archive. Although this paper deals with the specific issues we faced in setting up a mirror of the IRAF archives, the techniques presented are general, and can easily be applied to any similar archive.
On both ends there were some expected setup glitches in verifying the thousands of links involved, in bringing both systems to a common understanding of the requirements for the HTTP server and local software, and in establishing a routine procedure for maintaining the mirror. The initial experiment between the NOAO IRAF Group and the UK Mirror at Rutherford Appleton Laboratory has worked out many of these problems, and has made it much easier for us to establish other mirrors. In the first five months of operation, the UK IRAF Mirror Site has distributed more than 4300 files to 120 different nodes in more than ten countries, providing a faster, more reliable link for UK and European sites. Negotiations are underway to establish mirrors in other parts of the world where FTP access to the NOAO Tucson archives or UK mirror sites is prohibitively slow.
The host-independent manner in which the WWW pages are written means that they can also be used from a CD-ROM running on a local machine, in effect duplicating the IRAF archive on any machine. We discuss the limitations and special setup required in this case.
There are only a few steps involved in preparing your archive so it can be easily mirrored elsewhere:
The mirroring site will have a Web address different from the original site. If Web pages contain explicit HTTP URLs, then these pages will still refer to the original archive when the pages are mirrored, negating the point of the Web mirror. The simplest solution is to substitute file relative URLs in all cases except where one really does want a URL to point to a specific network host. For the exporting site this means each link will need to be examined and changed in the following ways:
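As a rough illustration of the rewrite described above, the following Python sketch converts explicit HTTP URLs that point at the exporting host into file-relative URLs, and leaves links to other hosts untouched. The host name, the regular expression, and the function name are assumptions made for this example; they are not part of the IRAF archive tools, and a real archive would need a proper HTML parser for pages with unusual markup.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Hypothetical host name of the exporting archive; adjust for a real site.
ARCHIVE_HOST = "iraf.noao.edu"

# Simplified pattern: only double-quoted href/src attributes are handled.
LINK_RE = re.compile(r'(href|src)="([^"]+)"', re.IGNORECASE)


def relativize(html, page_path, host=ARCHIVE_HOST):
    """Rewrite absolute http:// links to `host` as file-relative links.

    page_path is the URL path of the page being rewritten, for example
    "/iraf/web/docs/index.html". Links to any other host, and links
    using another scheme such as ftp://, are returned unchanged.
    """
    page_dir = posixpath.dirname(page_path)

    def fix(match):
        attr, url = match.group(1), match.group(2)
        parts = urlsplit(url)
        if parts.scheme != "http" or parts.netloc != host:
            return match.group(0)  # deliberately host-specific link
        rel = posixpath.relpath(parts.path or "/", start=page_dir)
        if parts.query:
            rel += "?" + parts.query
        if parts.fragment:
            rel += "#" + parts.fragment
        return f'{attr}="{rel}"'

    return LINK_RE.sub(fix, html)
```

Because the rewritten links are relative to the page's own directory, the same pages resolve correctly whether they are served from the original site, from a mirror, or read directly from a local disk.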
There are several things to be done to make most CGI scripts portable:
Scripts specify their interpreter with a ``#!/bin/csh'' path as the first line. Such paths may not be universal, however. The mirroring site is responsible for creating the system links needed to resolve these paths.
You may wish to arrange for mirror site usage logs to be propagated back to the original site. This can be done as a weekly cron job that greps for entries containing a certain keyword in the logs (``iraf'' in our case) and automatically mails them to a specific maildrop.
If the archive is large, it is best to make a snapshot tape of the full directory tree to be mirrored and send that tape to the mirroring institution to populate the initial directory tree. Once the initial system is installed and working, updates should be small and will be handled automatically by the mirror software.
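The log-forwarding job mentioned above can be sketched in Python as follows. This is only an illustration of the grep-and-mail idea, not the script used at either site: the maildrop address and function names are hypothetical, and the keyword match is case-sensitive, as with a plain grep.

```python
from email.message import EmailMessage

KEYWORD = "iraf"                      # keyword named in the text
MAILDROP = "mirror-logs@example.org"  # hypothetical maildrop address


def select_entries(log_lines, keyword=KEYWORD):
    """Return the log lines that mention the keyword."""
    return [line for line in log_lines if keyword in line]


def build_report(entries, sender, recipient=MAILDROP):
    """Package the selected entries as a plain-text mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Mirror access log: {len(entries)} entries"
    msg.set_content("\n".join(entries) + "\n")
    return msg
```

A weekly cron entry could run a small driver that reads the server's access log, calls these two functions, and hands the result to a local mail server, for instance with `smtplib.SMTP("localhost").send_message(msg)`.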
Now that the initial IRAF mirror site has been established, we should have worked out most of the bugs in the scripts and documents on our end, but there are still concerns for new sites wishing to establish a mirror:
The complete IRAF archive now requires approximately 3 GB of storage; this will probably increase by another 1 GB in the next year as more software is released. Potential mirrors should consider the purchase of a new dedicated disk.
The RAL mirror site is maintained using a package called MIRROR from Lee McLoughlin of the University of London; other packages are also available. This particular package required Perl 4, which had to be installed specifically to support the package. A cron script is run nightly to update the archive, and a separate script is run once weekly to mail access logs back to Tucson. The archive scripts directory is mirrored separately to a different directory, in part because execute permissions are stripped in the mirroring process and in part so new code may be hand checked, as a security measure.
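To make the idea of a nightly update concrete, here is a minimal Python sketch of how a mirroring tool might decide what to transfer, by comparing file sizes and modification times between the source listing and the local copy. It is an assumption-laden illustration and not a description of how the MIRROR package itself is implemented; the function names and the (size, mtime) representation are invented for the example.

```python
import os


def scan_tree(root):
    """Map each file's path relative to root to a (size, mtime) pair."""
    listing = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            listing[rel] = (info.st_size, int(info.st_mtime))
    return listing


def plan_update(remote, local):
    """Compare two listings and return (files to fetch, files to delete).

    A file is fetched when it is missing locally or when its size or
    modification time differs from the remote copy. A file is deleted
    when it no longer exists on the remote side.
    """
    fetch = sorted(path for path, meta in remote.items()
                   if local.get(path) != meta)
    delete = sorted(path for path in local if path not in remote)
    return fetch, delete
```

Once the initial tree is in place, the fetch list on a typical night is short, which is why incremental updates stay small.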
The UK mirror was already serving Web documents and had a configured HTTP server. New sites, or those using the CD-ROM, may need to configure a server. The only changes required to support the mirror were alias definitions for the IRAF CGI scripts directory. This means editing the httpd/conf/srm.conf file with an Alias and ScriptAlias definition for the scripts directory which points to the iraf Web scripts directory on the mirror, and aliases for the root-relative links. For example,
    Alias /iraf/web /iraf/web
    Alias /iraf/ftp /iraf/ftp
    Alias /scripts /mirror/iraf/web/scripts
    ScriptAlias /scripts/ /iraf/web/scripts
One other problem is that most HTTP servers define a default MIME type of plain text for documents whose type the server cannot determine from the file name extension. This means that tar files, compressed PostScript files, etc., show up as jumbled text in the browser rather than being identified as binary or starting an external viewer. To work around this, we suggest the following definition in the server's srm.conf file:
    Redirect /iraf/ftp/ ftp://iraf.noao.edu/
This causes most browsers to create a save pop-up window rather than trying to display the file, which is what is most often desired.
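The default-type problem described above can be illustrated with a short Python sketch using the standard `mimetypes` module. It shows the general idea of falling back to a binary type when the extension is not recognized, so that a browser offers to save the file instead of rendering it as text. The choice of `application/octet-stream` as the fallback and the function name are assumptions made for this example, not settings taken from the mirror's configuration.

```python
import mimetypes

# Assumed fallback: a generic binary type, which browsers usually
# offer to save rather than display.
DEFAULT_TYPE = "application/octet-stream"


def content_type(filename, default=DEFAULT_TYPE):
    """Guess a MIME type from the file name, falling back to `default`."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default
```

With a plain-text default, a file with no recognizable extension would be sent as text; with the binary default above, it is treated as a download.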
Aside from the initial setup and verification of new scripts, the process is now largely automatic, requiring an estimated one hour per month to maintain the mirror. Only rarely has the nightly update failed to complete; in each case it has succeeded the following night.
While the host-independent nature of the WWW pages means the archive can be distributed on and browsed from a CD-ROM, there are a few issues of concern for viewing the CD-ROM Web pages as though they were a live Web site:
We welcome inquiries from any sites wishing to set up additional IRAF mirrors, or from sites interested in using the techniques outlined in this paper to mirror their own archives. Contact iraf@noao.edu for further information.