Astronomical Data Analysis Software and Systems XIII
ASP Conference Series, Vol. 314, 2004
F. Ochsenbein, M. Allen, and D. Egret, eds.

A Prototype toward Japanese Virtual Observatory (JVO)
Masatoshi OHISHI, Yoshihiko MIZUMOTO, Naoki YASUDA¹, Yuji SHIRASAKI, Masahiro TANAKA, Satoshi HONDA
National Astronomical Observatory of Japan, 2-21-1, Osawa, Mitaka, Tokyo 181-8588, Japan

Yoshifumi MASUNAGA²
Ochanomizu University, 2-1-1, Otsuka, Bunkyo, Tokyo 112-8610, Japan

Ken MIURA³, Hirokuni MONZEN, Kenji KAWARAI, Yasuhide ISHIHARA, Yasushi YAMAGUCHI and Hiroshi YANAKA
Fujitsu Ltd., 1-9-3, Nakase, Mihama, Chiba 261-8588, Japan

Abstract. We developed the first prototype toward a Japanese Virtual Observatory (JVO) by using the Globus Tool Kit 2 (GTK2). We found that the system worked as we had expected, including the functionality of the JVO Query Language. However, it took a very long time to initiate each Grid process. We therefore replaced the GTK2 with a tool distributed by the NSF Middleware Initiative, and shortened the polling interval of the jobmanager from 30 seconds to 3 seconds. As a result, this serious problem was partially resolved, and the elapsed time for a query was reduced to about half of its previous value.

1. Introduction

The National Astronomical Observatory of Japan (NAOJ) operates the Subaru telescope in Hawaii and large radio telescopes in Nobeyama. All the observed data are digitally archived and are accessible via the Internet. The radio telescopes of Nobeyama produce about 1 TByte per year, and the Subaru telescope outputs about 20 TBytes per year. Because astronomical objects radiate electromagnetic waves over a wide frequency range, it has been recognized that multi-wavelength analyses are essential to understand the physical and chemical behavior of galaxies, stars, planets and so on. JVO is designed to provide astronomers with seamless access to federated databases and data analysis systems by utilizing state-of-the-art GRID technology through the 10 Gbps SuperSINET (http://www.sinet.ad.jp/english/), which was installed in 2002. The basic concept and a new query language to access the distributed databases, the JVO Query Language, are already described by Mizumoto et al. (2003). This paper describes in detail the implementation and assessment of the first prototype toward JVO.

¹ Institute for Cosmic Ray Research, University of Tokyo
² National Astronomical Observatory of Japan
³ National Institute of Informatics

Copyright 2004 Astronomical Society of the Pacific. All rights reserved.

[Figure 1: JVO Prototype System Architecture (diagram; only its labels survive in this text version). Labelled components:
  JVO Client: Netscape Communicator, Java2 Plugin, JVO Client (applet).
  JVO Controller (Solaris8, possible on Linux): Java2SE 1.4, JVO Server, UDDI4J, Globus Toolkit (Client); UDDI Registry / Servlet Engine (Apache, Tomcat, soapuddi, JDBC). Functions: get & parse GSDL, pre-condition check, execute Service, post-condition check.
  UDDI Maintenance Tool: Java2SE 1.4, UDDI4J (register, update, delete).
  Service hosts (Redhat Linux): Globus Toolkit (Server), GridFTP, GSDL, JVO Services (select, count, image, X match), SQL library, DB Service (PostgreSQL or ORACLE).
  Data: 2MASS database; SUBARU database with SupCam z-band and i-band image data.
  Links: Gb Ether, Internet; get GSDL by GridFTP, issue SQL, spawn Service, check status, copy image data by GridFTP.
  Legend: Free Software, Commercial Software, In House Software, Function.
  Note in figure: the Globus Toolkit includes the free software OpenSSL (Secure Socket Library), OpenLDAP (LDAP server) and wu-ftpd (ftp server).]

Figure 1. Architecture of JVO prototype. Note that the prototype has not been connected to other VOs yet.

2. Implementation of the JVO Prototype

We implemented the first prototype in a closed subnet in NAOJ. The architecture of the JVO prototype is shown in Figure 1. We adopted the Globus Toolkit 2 for the prototype. However, we also took into account the Web service concept that is included in the OGSA. Here we describe how the prototype works. First, researchers tell the JVO, through the JVO portal, how they want to perform their "Virtual Observation", using simple instructions written in JVOQL. The JVO portal interprets them and generates a "work-flow" by consulting the UDDI servers to find where available JVO services are registered. Based on the work-flow, built-in or user-defined services are called sequentially by the JVO controller. Prior to command execution, the JVO controller



issues a "pre-condition check" to assign distributed resources dynamically according to their availability. When one step of the work flow is finished, the result is examined by a "post-condition check" to determine whether the step finished successfully. If it did, the JVO controller generates the next step(s) of the work flow and executes them. If it did not, the JVO controller searches for an alternative server that provides the same service and, if one is available, executes the same step on that server. Successful execution results of the work flow are transferred from remote servers to the JVO controller through GridFTP, and are presented to the researchers by the JVO client.

Cross-matching (X-match) query results from multiple-wavelength data is a very important service in the JVO. Each query is sent from the JVO controller to an appropriate database server. Then the smallest query result is transferred by GridFTP from its server to the server holding the next smallest result. The recipient server is asked to run its X-match engine, and the result is in turn transferred by GridFTP to the server with the third smallest query result. The final result is transferred by GridFTP to the JVO controller.

3. Assessment of the Prototype

We used several JVOQL scripts to assess this prototype. Table 1 lists each step of the work flow and the elapsed time for each step.

Table 1. A sample of work flow and elapsed time

Step #   Host         Command          Elapsed Time
0        mizu-g       JVOQLparser.sh   1' 12"
0.0      mizu-g       jvo-query.sh     1' 15"
0.1      minazuki-g   jvo-query.sh     1' 09"
0.2      mizu-g       Scheduler.sh     1' 14"
0.2.0    mizu-g       jvo-query.sh     1' 15"
0.2.1    minazuki-g   post-xmatch.sh   1' 33"
0.2.2    mizu-g       jvo-query.sh     1' 21"
0.2.3    minazuki-g   jvo-query.sh     2' 26"
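Step 0.2 in Table 1 is the scheduling step. Section 2 states the ordering rule it has to implement: the smallest query result is sent to the server holding the next smallest result, and the final result returns to the JVO controller. The following is a minimal sketch of that rule only, not the actual Scheduler.sh. The function name `plan_xmatch_chain`, the controller name `jvo-controller`, and the record counts are hypothetical; the two host names are taken from Table 1.

```python
from typing import Dict, List, Tuple


def plan_xmatch_chain(counts: Dict[str, int], controller: str) -> List[Tuple[str, str]]:
    """Return (source, destination) transfers for a chained cross-match.

    counts maps each database server to the number of records its query hit.
    Servers are visited from the smallest result to the largest, and the
    final cross-matched result is sent back to the controller.
    """
    if not counts:
        return []
    # Sort by hit count; ties are broken by server name to keep the plan stable.
    order = sorted(counts, key=lambda host: (counts[host], host))
    hops = list(zip(order, order[1:]))    # each result moves to the next larger one
    hops.append((order[-1], controller))  # final result returns to the controller
    return hops


if __name__ == "__main__":
    # Hypothetical record counts for the two hosts named in Table 1.
    example = {"mizu-g": 1200, "minazuki-g": 350}
    for src, dst in plan_xmatch_chain(example, controller="jvo-controller"):
        print(f"{src} -> {dst}")
```

Sending the smallest result first keeps the volume moved by GridFTP at each hop as low as possible, which is presumably why the count step precedes the transfers.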

Table 1 contains several commands implemented as shell scripts: JVOQLparser.sh reads an input JVOQL script and parses it into individual SQL queries; jvo-query.sh issues individual queries to database servers, counts the database records hit, and cuts images out of image databases; Scheduler.sh collects the count results and determines the order in which to request query results and image data from the database servers; and post-xmatch.sh starts the cross-match engine. These commands were submitted by using the GRAM service of the Globus Tool Kit 2. As we expected, all steps were generated automatically, and we obtained the results successfully. We examined the robustness of our prototype by forcing the system to issue a command, at step 0.2.2, which would fail at one server but succeed at another server. At first the command issued to the "wrong" server failed, but the system then reissued the same command to the "right" server through dynamic generation of the work flow.
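The robustness test exercises the controller behaviour described in Section 2: check the pre-condition, run the step, check the post-condition, and fall back to another server that offers the same service. A minimal sketch of that loop follows. It is an illustration under assumptions, not the JVO controller code: `run_step`, the callables passed to it, the server names, and the simulated failure are all hypothetical, and real steps would be submitted through GRAM rather than called as Python functions.

```python
from typing import Callable, Iterable, Optional, Tuple


def run_step(
    servers: Iterable[str],
    pre_check: Callable[[str], bool],
    execute: Callable[[str], object],
    post_check: Callable[[object], bool],
) -> Tuple[Optional[str], Optional[object]]:
    """Try one work-flow step on each candidate server until one succeeds.

    Returns (server, result) for the first server that passes both checks,
    or (None, None) if every candidate fails.
    """
    for server in servers:
        if not pre_check(server):   # resource unavailable: skip this server
            continue
        try:
            result = execute(server)
        except RuntimeError:        # the command itself failed on this server
            continue
        if post_check(result):      # step finished successfully
            return server, result
    return None, None


if __name__ == "__main__":
    # Simulated step that fails on the first ("wrong") server only.
    def fake_execute(server: str) -> str:
        if server == "server-a":
            raise RuntimeError("command failed")
        return f"result from {server}"

    chosen, output = run_step(
        ["server-a", "server-b"],
        pre_check=lambda s: True,
        execute=fake_execute,
        post_check=lambda r: r is not None,
    )
    print(chosen, output)
```

In this sketch the list of candidate servers is fixed in advance; in the prototype the alternative server is found by searching for one that provides the same service.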



However, we found that the elapsed times were too long for all steps. We knew that the elapsed time for each command was less than a few seconds when it was issued in a non-Globus environment. It should be noted that the final step, 0.2.3, corresponds to cutting out images and needs a very long CPU time even in a non-Globus environment. Such long elapsed times seemed to be due to the authentication process and the "globus-job-run" command of the Globus Tool Kit 2. It is well known that the authentication process takes nearly 10 seconds and that the "globus-job-run" command spends a long time in its initial hand-shaking procedure before issuing the "real command". Since JVO is a pseudo-real-time system, it was crucial to reduce this large overhead in each process.

4. Improvement of the Prototype

We introduced the NSF Middleware Initiative (NMI)¹ to accelerate the slow authentication process in the Globus Tool Kit 2, because NMI provides a binary module for the authentication. We then analyzed the source code of "globus-job-run", and found that the polling interval was fixed at 30 seconds. We therefore changed the polling interval to 3 seconds and recompiled the tool kit. As a result, the elapsed times in Table 1 were shortened by more than a factor of 2. For example, the elapsed time for step 0 became 20 to 25 seconds. Although we succeeded in accelerating all steps in our prototype, the elapsed times are still much longer than those in a non-Globus environment. Thus it is necessary to investigate further, for example by examining the source code of the tool kit, to make our system run much faster.

5. Summary
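The effect of the polling interval discussed in Section 4 can be illustrated with a toy model in which the job status is checked only at fixed intervals, so a finished job is noticed at the next poll. This is a simplified model under stated assumptions, not the Globus code: the function name `detected_after`, the 2-second run time, and the assumption that the first poll happens one full interval after submission are illustrative, and authentication and hand-shaking overheads are ignored.

```python
import math


def detected_after(run_time_s: float, poll_interval_s: float) -> float:
    """Seconds from submission until a poller first sees the job as finished.

    Assumes polls at poll_interval_s, 2 * poll_interval_s, ... after submission.
    """
    if run_time_s < 0 or poll_interval_s <= 0:
        raise ValueError("run time must be >= 0 and poll interval must be > 0")
    # At least one poll is always needed, even for an instantaneous job.
    polls_needed = max(1, math.ceil(run_time_s / poll_interval_s))
    return polls_needed * poll_interval_s


if __name__ == "__main__":
    run_time = 2.0  # hypothetical command that needs about two seconds
    for interval in (30.0, 3.0):
        seen = detected_after(run_time, interval)
        print(f"poll every {interval:.0f} s -> completion seen after {seen:.0f} s")
```

In this model a 2-second command is reported after 30 seconds with a 30-second interval and after 3 seconds with a 3-second interval. The improvement measured in Section 4 is smaller than that, which is consistent with other overheads remaining.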

We constructed the first version of the JVO prototype based on GTK2, and confirmed that the JVOQL has sufficient functionality to access federated databases. We found that the prototype worked as we had expected; however, it took a very long time to initiate each Grid process. We therefore replaced the GTK2 with a tool distributed by the NSF Middleware Initiative, and shortened the polling interval of the jobmanager from 30 seconds to 3 seconds. As a result, this serious problem was partially resolved, and the elapsed time for a query became less than half of its previous value.

Acknowledgments. This research was supported by the Grant-in-aid "Information Science" carried out by the MEXT (14019092 and 15017289).

References

Mizumoto, Y., et al. 2003, in ASP Conf. Ser., Vol. 295, ADASS XII, ed. H. E. Payne, R. I. Jedrzejewski, & R. N. Hook (San Francisco: ASP), 96

¹ http://www.nsf-middleware.org/