Original document: http://acat02.sinp.msu.ru/presentations/GGtut/Dev-06-ResourceManagement1.pdf (last modified Sat Jul 6 22:29:40 2002)
GRAM: Grid Resource Allocation & Management
Globus Toolkit™ Developer Tutorial
The Globus Project™
Argonne National Laboratory
USC Information Sciences Institute
http://www.globus.org/
Copyright (c) 2002 University of Chicago and The University of Southern California. All Rights Reserved. This presentation is licensed for use under the terms of the Globus Toolkit Public License. See http://www.globus.org/toolkit/download/license.html for the full text of this license.


Resource Management Review
- Resource Specification Language (RSL) is used to communicate requirements
- The Grid Resource Allocation and Management (GRAM) API allows programs to be started on remote resources, despite local heterogeneity
- A layered architecture allows application-specific resource brokers and co-allocators (e.g. DUROC) to be defined in terms of GRAM services

Globus Toolkit™ Developer Tutorial: GRAM

March 25, 2002


GRAM Components
[Figure: GRAM component architecture. Labels recovered from the diagram:]
- Client
  - MDS client API calls to locate resources (MDS: Grid Index Info Server)
  - MDS client API calls to get resource info (MDS: Grid Resource Info Server)
  - GRAM client API calls to request resource allocation and process creation
  - GRAM client API state change callbacks
  - Globus Security Infrastructure
- Site boundary
- MDS: Grid Resource Info Server
  - Query current status of resource
- Gatekeeper
  - Create (Job Manager)
- Job Manager
  - Parse (RSL Library)
  - Request (Local Resource Manager)
  - Monitor & control (processes)
- Local Resource Manager
  - Allocate & create processes


Resource Management APIs
- Globus Toolkit has APIs for RSL, GRAM, and DUROC:
  - globus_rsl
  - globus_gram_client
  - globus_gram_myjob
  - globus_duroc_control
  - globus_duroc_runtime



Resource Specification Language
- Much of the power of GRAM is in the RSL
- Common language for specifying job requests
  - GRAM service translates this common language into scheduler-specific language
- GRAM service constrains RSL to a conjunction of (attribute=value) pairs
  - E.g. &(executable="/bin/ls")(arguments="-l")
- GRAM service understands a well-defined set of attributes
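
The conjunction form above can be sketched in a few lines of Python. This is an illustration only, not part of the Globus Toolkit: the helper names `rsl_quote` and `build_rsl` are hypothetical, and the quote-doubling rule for embedded double quotes is an assumption about RSL quoting.

```python
def rsl_quote(value: str) -> str:
    """Wrap a value in double quotes, doubling any embedded quote (assumed rule)."""
    return '"' + value.replace('"', '""') + '"'


def build_rsl(attributes: dict[str, str]) -> str:
    """Join (attribute=value) relations into a single '&' conjunction."""
    relations = "".join(
        f"({name}={rsl_quote(value)})" for name, value in attributes.items()
    )
    return "&" + relations


request = build_rsl({"executable": "/bin/ls", "arguments": "-l"})
print(request)  # &(executable="/bin/ls")(arguments="-l")
```

Keeping the attributes in a dictionary until the last moment makes it easy for a broker to add or override attributes before the string is handed to GRAM.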


globus_rsl
- Module for manipulating RSL expressions
  - Parse an RSL string into a data structure
  - Functions to manipulate the data structure
  - Unparse the data structure into a string
- Can be used to assist in writing brokers or filters which refine an RSL specification
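
The parse / manipulate / unparse cycle can be sketched as follows. This is a Python stand-in, not the globus_rsl C API: `parse_rsl` and `unparse_rsl` are hypothetical names, and the sketch only handles a flat '&' conjunction whose values are either quoted strings or single bare tokens. The "queue" default stands for an assumed broker policy.

```python
import re

# One relation: (name=value), where value is "quoted" or a single bare token.
_RELATION = re.compile(r'\(\s*(\w+)\s*=\s*(?:"([^"]*)"|([^()\s"]+))\s*\)')


def parse_rsl(text: str) -> dict[str, str]:
    """Parse a flat '&' conjunction into an attribute dictionary."""
    text = text.strip()
    if not text.startswith("&"):
        raise ValueError("only '&' conjunctions are handled in this sketch")
    body = text[1:]
    attributes: dict[str, str] = {}
    pos = 0
    while pos < len(body):
        if body[pos].isspace():
            pos += 1
            continue
        match = _RELATION.match(body, pos)
        if match is None:
            raise ValueError(f"cannot parse relation at offset {pos}")
        name, quoted, bare = match.groups()
        attributes[name] = quoted if quoted is not None else bare
        pos = match.end()
    return attributes


def unparse_rsl(attributes: dict[str, str]) -> str:
    """Turn the dictionary back into an RSL string, quoting every value."""
    return "&" + "".join(f'({k}="{v}")' for k, v in attributes.items())


# A tiny "filter": refine a request by filling in a default queue.
spec = parse_rsl('&(executable="/bin/ls")(arguments="-l")')
spec.setdefault("queue", "short")  # hypothetical broker policy
print(unparse_rsl(spec))
```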



RSL Attributes For GRAM
- (executable=string)
  - Program to run
  - A file path (absolute or relative) or URL
- (directory=string)
  - Directory in which to run (default is $HOME)
- (arguments=arg1 arg2 arg3...)
  - List of string arguments to program
- (environment=(E1 v1)(E2 v2))
  - List of environment variable name/value pairs



RSL Attributes For GRAM
- (stdin=string)
  - Stdin for program
  - A file path (absolute or relative) or URL
- (stdout=string)
  - Stdout for program
  - A file path (absolute or relative) or URL
- (stderr=string)
  - Stderr for program
  - A file path (absolute or relative) or URL



RSL Attributes For GRAM
- (count=integer)
  - Number of processes to run (default is 1)
- (hostCount=integer)
  - On SMP multi-computers, number of nodes to distribute the "count" processes across
- (project=string)
  - Project (account) against which to charge
- (queue=string)
  - Queue into which to submit job



RSL Attributes For GRAM
- (maxTime=integer)
  - Maximum wall clock or CPU runtime (scheduler's choice) in minutes
- (maxWallTime=integer)
  - Maximum wall clock runtime in minutes
- (maxCpuTime=integer)
  - Maximum CPU runtime in minutes



RSL Attributes For GRAM
- (maxMemory=integer)
  - Maximum amount of memory for each process in megabytes
- (minMemory=integer)
  - Minimum amount of memory for each process in megabytes



RSL Attributes For GRAM
- (jobType=value)
  - Value is one of "mpi", "single", "multiple", or "condor"
    - mpi: Run the program using "mpirun -np <count>"
    - single: Only run a single instance of the program, and let the program start the other count-1 processes
    - multiple: Start <count> instances of the program using the appropriate scheduler mechanism
    - condor: Start <count> Condor processes running in the "standard universe"
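
As a rough illustration of how the jobType value changes what gets started, the sketch below maps each type to the command lines a launcher might run. It is not the job manager's actual logic: `plan_launch` is a hypothetical helper, it assumes that the process count follows "mpirun -np", and Condor submission is deliberately left out.

```python
def plan_launch(job_type: str, executable: str, count: int = 1) -> list[list[str]]:
    """Return the command lines a launcher might start for a given jobType."""
    if job_type == "mpi":
        # One mpirun invocation; mpirun itself starts `count` processes.
        return [["mpirun", "-np", str(count), executable]]
    if job_type == "single":
        # One instance only; the program starts the other count-1 processes.
        return [[executable]]
    if job_type == "multiple":
        # `count` independent instances of the same program.
        return [[executable] for _ in range(count)]
    if job_type == "condor":
        raise NotImplementedError("Condor submission is outside this sketch")
    raise ValueError(f"unknown jobType: {job_type!r}")


print(plan_launch("mpi", "my_app", count=4))
print(plan_launch("multiple", "my_app", count=2))
```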



RSL Attributes for GRAM
- (gramMyjob=value)
  - Value is one of "collective", "independent"
  - Defines how the globus_gram_myjob library will operate on the processes
    - collective: Treat all processes as part of a single job
    - independent: Treat each of the processes as an independent uniprocessor job
- (dryRun=true)
  - Do not actually run job



RSL Attributes for GRAM
- (save_state=yes)
  - Causes the jobmanager to save job state/information to a persistent file on disk
  - Allows recovery from a jobmanager crash
  - New in Globus Toolkit v2.0
- (two_phase=<int>)
  - Implements a two-phase commit for job submission and completion
  - <int> = seconds to wait before the job times out
  - New in Globus Toolkit v2.0



RSL Attributes for GRAM
- (restart=<old job contact>)
  - Start a new jobmanager that, instead of submitting a new job, starts watching over an existing job
  - New in Globus Toolkit v2.0
- (stdout_position=<int>)
- (stderr_position=<int>)
  - Specified as part of a job restart
  - Restart file streaming from this byte
  - New in Globus Toolkit v2.0



RSL Substitutions
- RSL supports simple variable substitutions
- Substitutions are declared using a list of pairs
  - (rslSubstitution=(SUB1 val1)(SUB2 val2))
- A substitution is invoked with $(SUB)
- Processing order:
  - Within a scope, processed left-to-right
  - Outer scope processed before inner scope
  - A variable definition can reference previously defined variables



RSL Substitution Example
- This
    &(rslSubstitution=(URLBASE "ftp://host:1234"))
     (rslSubstitution=(URLDIR $(URLBASE)/dir))
     (executable=$(URLDIR)/myfile)
- is equivalent to this
    &(executable=ftp://host:1234/dir/myfile)
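
The left-to-right expansion in this example can be reproduced with a small Python sketch. It is not the RSL library's implementation: `expand` and `resolve` are hypothetical names, and the `inherited` argument is an assumed way to model an outer scope that has already been processed.

```python
import re

_REFERENCE = re.compile(r"\$\((\w+)\)")


def expand(text: str, variables: dict[str, str]) -> str:
    """Replace each $(NAME) in text with its value from variables."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined RSL substitution: {name}")
        return variables[name]

    return _REFERENCE.sub(lookup, text)


def resolve(definitions, text, inherited=None):
    """Process definitions left to right, then expand text.

    Each definition may reference variables defined before it, including
    those inherited from an outer scope.
    """
    variables = dict(inherited or {})
    for name, raw_value in definitions:
        variables[name] = expand(raw_value, variables)
    return expand(text, variables)


definitions = [
    ("URLBASE", "ftp://host:1234"),
    ("URLDIR", "$(URLBASE)/dir"),
]
print(resolve(definitions, "$(URLDIR)/myfile"))  # ftp://host:1234/dir/myfile
```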



GRAM Defined RSL Substitutions
- GRAM defines a set of RSL substitutions before processing the job request
- Machine information
  - GLOBUS_HOST_MANUFACTURER
  - GLOBUS_HOST_CPUTYPE
  - GLOBUS_HOST_OSNAME
  - GLOBUS_HOST_OSVERSION



GRAM Defined RSL Substitutions
- Paths to Globus
  - GLOBUS_LOCATION
- Miscellaneous
  - HOME
  - LOGNAME
  - GLOBUS_ID



GRAM Examples
The globus-job-run client is a sample GRAM client that integrates GASS services for executable staging and standard I/O redirection, using command-line arguments rather than RSL.
    % globus-job-run pitcairn.mcs.anl.gov /bin/ls
    % globus-job-run pitcairn.mcs.anl.gov -s myprog
    % globus-job-run pitcairn.mcs.anl.gov \
          -s myprog -stdin -s in.txt -stdout -s out.txt



GRAM Examples
The globusrun client is a more involved tool that allows complicated RSL expressions.
    % globusrun -r pitcairn.mcs.anl.gov -f myjob.rsl
    % globusrun -r pitcairn.mcs.anl.gov \
          '&(executable=myprog)'
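
When globusrun is driven from a script, passing the RSL as a separate argument avoids the shell quoting issues that the '&' and parentheses would otherwise cause. The sketch below only builds the argument list and prints it; it does not run anything. `globusrun_command` is a hypothetical helper, and it uses only the -r and -f options shown above.

```python
import shlex


def globusrun_command(contact, rsl=None, rsl_file=None):
    """Build an argv list for globusrun from a contact and one RSL source."""
    if (rsl is None) == (rsl_file is None):
        raise ValueError("pass exactly one of rsl or rsl_file")
    argv = ["globusrun", "-r", contact]
    if rsl_file is not None:
        argv += ["-f", rsl_file]   # RSL read from a file
    else:
        argv.append(rsl)           # RSL given inline as a single argument
    return argv


argv = globusrun_command("pitcairn.mcs.anl.gov", rsl="&(executable=myprog)")
print(shlex.join(argv))  # shell-safe rendering, for logging only
```

The list form can be handed to subprocess.run() unchanged on a machine where the Globus client tools are installed.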



globus_gram_client
- globus_gram_client_job_request()
  - Submit a job to a remote resource
  - Input:
    - Resource manager contact string
    - RSL specifying the job to be run
    - Callback contact string, for notification
  - Output:
    - Job contact string



Finding The Gatekeeper
- globus_gram_client_job_request() requires a resource manager contact string to find the gatekeeper
    hostname[:port][/service][:subject]
  - hostname: host of gatekeeper
    - Required
  - port: port on which gatekeeper is listening
    - Defaults to the well-known gsigatekeeper port, 2119
  - service: gatekeeper service to invoke
    - Defaults to "jobmanager"
  - subject: security subject name of gatekeeper
    - Defaults to standard host cert form: ".../cn=host/hostname"
    - Applies fuzzy match to deal with interface names, etc.
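
The contact string format can be made concrete with a small parser. This is a sketch under stated assumptions, not the toolkit's own parsing code: `Contact` and `parse_contact` are hypothetical names, the regular expression assumes the service name contains no colon, and a missing subject is returned as None rather than expanded to the default host certificate form.

```python
import re
from typing import NamedTuple, Optional

DEFAULT_PORT = 2119             # well-known gsigatekeeper port
DEFAULT_SERVICE = "jobmanager"  # default gatekeeper service

# hostname[:port][/service][:subject]
_CONTACT = re.compile(
    r"(?P<host>[^:/]+)"
    r"(?::(?P<port>\d+))?"
    r"(?:/(?P<service>[^:]+))?"
    r"(?::(?P<subject>.+))?"
)


class Contact(NamedTuple):
    host: str
    port: int
    service: str
    subject: Optional[str]


def parse_contact(text: str) -> Contact:
    """Split a resource manager contact string and fill in the defaults."""
    match = _CONTACT.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid contact string: {text!r}")
    return Contact(
        host=match["host"],
        port=int(match["port"] or DEFAULT_PORT),
        service=match["service"] or DEFAULT_SERVICE,
        subject=match["subject"],
    )


print(parse_contact("pitcairn.mcs.anl.gov"))
print(parse_contact("sp139.sdsc.edu:2119:/O=Grid/.../CN=host/sp097.sdsc.edu"))
```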


Job Contact
- globus_gram_client_job_request() returns a job contact
  - Opaque string
  - Other globus_gram_client_*() functions use the job contact to find the right job manager to which requests are made
  - Job contact string can be passed between processes, even on different machines



globus_gram_client
- globus_gram_client_job_status()
  - Check the status of the job
    - UNSUBMITTED, PENDING, ACTIVE, FAILED, DONE, SUSPENDED
  - Can also get job status through callbacks
    - globus_gram_client_callback_{allow,disallow,check}()
- globus_gram_client_job_cancel()
  - Cancel/kill a pending or active job



globus_gram_client
- globus_gram_client_job_signal()
  - Controls the jobmanager
  - COMMIT_REQUEST*
    - Submit job
  - COMMIT_END*
    - Clean up job
  - COMMIT_EXTEND*
    - Wait additional N seconds
  - * Applies when the job was submitted with the two_phase RSL attribute
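
The role of the three commit signals can be illustrated with a toy state machine. This is a simplified model, not the job manager's code: the class `TwoPhaseJob`, its phase names, and the explicit `now` clock argument are all assumptions made for the sketch, and job execution between submission and cleanup is skipped.

```python
from enum import Enum


class Signal(Enum):
    COMMIT_REQUEST = "commit_request"
    COMMIT_END = "commit_end"
    COMMIT_EXTEND = "commit_extend"


class TwoPhaseJob:
    """Toy model of a job manager holding a two-phase submission."""

    def __init__(self, timeout_seconds: float, now: float = 0.0):
        self.deadline = now + timeout_seconds
        self.phase = "waiting_for_commit"

    def signal(self, sig: Signal, now: float, extra_seconds: float = 0.0) -> str:
        # A request still waiting for its commit expires at the deadline.
        if self.phase == "waiting_for_commit" and now >= self.deadline:
            self.phase = "timed_out"
        if self.phase == "waiting_for_commit":
            if sig is Signal.COMMIT_EXTEND:
                self.deadline += extra_seconds   # wait additional N seconds
            elif sig is Signal.COMMIT_REQUEST:
                self.phase = "submitted"         # job is really submitted now
        elif self.phase == "submitted" and sig is Signal.COMMIT_END:
            self.phase = "cleaned_up"            # client confirmed completion
        return self.phase


job = TwoPhaseJob(timeout_seconds=30)
job.signal(Signal.COMMIT_EXTEND, now=10, extra_seconds=20)  # deadline 30 -> 50
print(job.signal(Signal.COMMIT_REQUEST, now=40))  # submitted
print(job.signal(Signal.COMMIT_END, now=100))     # cleaned_up

late = TwoPhaseJob(timeout_seconds=30)
print(late.signal(Signal.COMMIT_REQUEST, now=31))  # timed_out
```

Because the job is only submitted once the commit arrives, a client that loses its connection before committing can safely retry without risking a duplicate job.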



globus_gram_client
- globus_gram_client_job_signal(), continued
  - STDIO_UPDATE
    - Allows client to submit an RSL that changes some I/O attributes of the job
      - stdout, stderr, stdout_position, stderr_position, remote_io_url
  - STDIO_SIZE
    - Verify that streamed I/O has been completely received
  - STOP_MANAGER
    - Tells the job manager to exit, but leave the job running



State Change Callbacks
- A GRAM-managed job can be in the states:
  - Unsubmitted, Pending, Active, Failed, Done, Suspended
- GRAM client can register for asynchronous state change callbacks
  - Registration can be done during submission
    - globus_gram_client_job_request()
  - Registration can be done later by any process, using the job contact
    - globus_gram_client_job_callback_register()
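
The callback pattern can be sketched in Python as a listener registry filtered by a state mask. This models the idea only and is not the GRAM client API: `JobMonitor` is a hypothetical class, the flag values produced by auto() are illustrative rather than the toolkit's constants, and "job-1" stands in for an opaque job contact string.

```python
from enum import Flag, auto


class JobState(Flag):
    UNSUBMITTED = auto()
    PENDING = auto()
    ACTIVE = auto()
    FAILED = auto()
    DONE = auto()
    SUSPENDED = auto()


class JobMonitor:
    """Delivers state changes to callbacks whose mask includes the new state."""

    def __init__(self):
        self._listeners = []

    def register(self, mask: JobState, callback) -> None:
        self._listeners.append((mask, callback))

    def state_changed(self, job_contact: str, new_state: JobState) -> None:
        for mask, callback in self._listeners:
            if new_state & mask:
                callback(job_contact, new_state)


events = []
monitor = JobMonitor()
# Only ask to hear about terminal outcomes.
monitor.register(
    JobState.DONE | JobState.FAILED,
    lambda contact, state: events.append((contact, state.name)),
)
for state in (JobState.PENDING, JobState.ACTIVE, JobState.DONE):
    monitor.state_changed("job-1", state)
print(events)  # [('job-1', 'DONE')]
```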



globus_gram_client
- globus_gram_client_callback_allow()
  globus_gram_client_callback_disallow()
  globus_gram_client_callback_check()
  - Create/destroy a client port to listen for asynchronous state change callbacks
  - Callback to local function on state change
- globus_gram_client_job_callback_register()
  globus_gram_client_job_callback_unregister()
  - Register with job manager to receive callbacks



globus_gram_myjob
- When a set of processes in a single job start up, they may need to self-organize
  - How many processes are in the job?
  - What is my rank within the job?
  - Simple send/receive between job processes
- This API is a minimal set of functions to allow this self-organization
- This is a bootstrapping library. It is NOT meant to be a general-purpose message passing library for use by applications
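
The kind of bootstrapping described here (size, rank, and a simple send/receive) can be imitated with threads and queues. This is a stand-in to show the pattern, not the globus_gram_myjob API: `ToyJob` and `bootstrap` are hypothetical names, and threads play the role of the job's processes.

```python
import queue
import threading


class ToyJob:
    """Gives each 'process' the job size and a per-rank inbox."""

    def __init__(self, size: int):
        self.size = size
        self._inboxes = [queue.Queue() for _ in range(size)]

    def send(self, dest_rank: int, message: str) -> None:
        self._inboxes[dest_rank].put(message)

    def receive(self, my_rank: int) -> str:
        return self._inboxes[my_rank].get()  # blocks until a message arrives


def bootstrap(size: int) -> list[str]:
    """Rank 0 gathers one announcement from every other rank."""
    job = ToyJob(size)
    gathered = []

    def process(rank: int) -> None:
        if rank == 0:
            for _ in range(job.size - 1):
                gathered.append(job.receive(0))
        else:
            job.send(0, f"rank {rank} of {job.size}")

    threads = [threading.Thread(target=process, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(gathered)


print(bootstrap(3))  # ['rank 1 of 3', 'rank 2 of 3']
```

Once rank 0 knows its peers, the application would typically hand over to a real communication library, which is why the bootstrap layer can stay this small.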


DUROC Review
- Simultaneous allocation of a resource set
  - Handled via optimistic co-allocation based on free nodes or queue prediction
  - In the future, advance reservations will also be supported
- globusrun will co-allocate specific multi-requests
  - Uses a Globus component called the Dynamically Updated Request Online Co-allocator (DUROC)



A Co-allocation Multirequest
+( &(resourceManagerContact="flash.isi.edu:2119/jobmanager-lsf:/O=Grid/.../CN=host/flash.isi.edu")
    (count=1)
    (label="subjob A")
    (executable=my_app1)
 )
 ( &(resourceManagerContact="sp139.sdsc.edu:2119:/O=Grid/.../CN=host/sp097.sdsc.edu")
    (count=2)
    (label="subjob B")
    (executable=my_app2)
 )

Slide callouts: the two subjobs use different resource managers, different counts, and different executables.
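
A multirequest like the one above is just a '+' followed by one parenthesized '&' conjunction per subjob, so it is easy to generate. The sketch below is illustrative: `subjob` and `multirequest` are hypothetical helpers, and the contact strings are shortened to host and port.

```python
def subjob(contact: str, label: str, count: int, executable: str) -> str:
    """Format one subjob as a parenthesized '&' conjunction."""
    return (
        f'(&(resourceManagerContact="{contact}")'
        f'(count={count})'
        f'(label="{label}")'
        f'(executable={executable}))'
    )


def multirequest(subjobs: list[str]) -> str:
    """Combine subjobs into a '+' multirequest."""
    return "+" + "".join(subjobs)


request = multirequest([
    subjob("flash.isi.edu:2119", "subjob A", 1, "my_app1"),
    subjob("sp139.sdsc.edu:2119", "subjob B", 2, "my_app2"),
])
print(request)
```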


RSL Attributes For DUROC
- (subjobStartType=value)
  - Alters the startup barrier mechanism
  - Values are "strict-barrier", "loose-barrier", "no-barrier"
- (subjobCommsType=value)
  - Values are "blocking-join" and "independent"
  - If the value is set to "independent", the subjob won't be seen from the other subjobs when doing inter-subjob communication



RSL Attributes For DUROC
- (label=string)
  - Identifier for this subjob
- (resourceManagerContact=string)
  (resourceManagerName=string)
  - Resource manager to which to submit a subjob



globus_duroc_control
- Submit a multi-request
- Edit a pending request
  - Add new nodes, edit out failed nodes
- Commit to configuration
  - Delay to last possible minute
  - Barrier synchronization
- Initialize computation
  - Bootstrap library
- Monitor and control collection



globus_duroc_runtime
- globus_duroc_runtime_barrier()
  - All processes in DUROC job must call this
  - It will wait until the DUROC control module releases all processes from the barrier
- globus_duroc_runtime_inter_subjob_*()
  - Bootstrap library between subjobs
- globus_duroc_runtime_intra_subjob_*()
  - Bootstrap library within a subjob
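
The effect of the startup barrier can be demonstrated with Python's threading.Barrier. This is an analogy rather than DUROC itself: here the barrier opens as soon as every thread has arrived, whereas in DUROC the control module decides when to release the processes. `run_job` is a hypothetical helper.

```python
import threading


def run_job(process_count: int) -> list[tuple[str, int]]:
    """Start workers that all wait at one barrier; return the event log."""
    barrier = threading.Barrier(process_count)
    log = []
    lock = threading.Lock()

    def worker(rank: int) -> None:
        with lock:
            log.append(("arrived", rank))
        barrier.wait()  # blocks until all process_count workers have arrived
        with lock:
            log.append(("released", rank))

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(process_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


log = run_job(4)
# Every "arrived" entry precedes every "released" entry.
print([event for event, _ in log])
```

No worker proceeds to application code until the whole set is present, which is what lets a co-allocated job treat startup as all-or-nothing.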



Job Manager Files
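[Figure: files used by the job manager. Labels recovered from the diagram:]
- GRIS and client: monitoring, job status
- Gatekeeper
- X509_USER_PROXY (UP)
- Jobmanager
- GASS_CACHE: UP, stdout, stderr, staged EXE, staged stdin
- Submission: scheduler description (Exe=x, Args=y, Env=z)
- JOB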



GRAM exercises
- Note: GRAM has three APIs:
  - client, myjob, job_manager
  - Most users will never use the job_manager API
- Go to the "gram" subdirectory
- Documentation
  - http://www.globus.org/gram
- Follow instructions in the file README



DUROC exercises
- Note: DUROC has two APIs:
  - control, runtime
- Go to the "duroc" subdirectory
- Documentation
  - http://www.globus.org/duroc
- Follow instructions in the file README



RSL exercises
- Go to the "rsl" subdirectory
- Documentation
  - http://www.globus.org/gram/rsl
- Follow instructions in the file README



Changes: 1.1.x → 2.0
- Once-and-only-once submission
  - Through the two-phase commit signal
- Recoverability
  - Job manager can be restarted
  - Restart/redirect stdout/stderr
- Generalized signaling of job manager



Future "GRAM 1.6"
- Asynchronous client API
- New RSL attribute to pass through scheduler-specific commands
  - No more piggy-backing on the environment attributes
- File staging
  - Scratch dir, input, output
- Advanced output management
  - Stream/store stdout and stderr to multiple destinations



Interesting Issues
- The Globus Toolkit does not include a resource broker or a metascheduler!
  - We have helped many people to build these using GRAM and MDS services; many now exist
    - E.g. Condor-G, DRM, PUNCH, Nimrod/G, Cactus, AppLeS
