This wiki is obsolete; see the NorduGrid web pages for up-to-date information.

A-REX transition

NOTE: This page is out of date and is kept for purely historical reasons.

only testing - don't use this as a guide!

Installing and configuring A-REX - first experiences

From scratch

  • google nordugrid a-rex getting started gives me NOX/Tests/A-REX test cases as first hit, NOX/SysAdminManual as second hit
  • I have fedora, so choosing installation from binaries with yum
    • yum install nordugrid-arc-nox nordugrid-arc-nox-hed nordugrid-arc-nox-arex nordugrid-arc-nox-client nordugrid-arc-nox-isis nordugrid-arc-nox-charon nordugrid-arc-nox-hopi
  • Setting up A-REX
    • No ini-file, NOX/SysAdminManual refers to xml profiles
    • Choosing A-REX with fork lrms
    • Note: no mention of certificates before the end of NOX/SysAdminManual. They should be mentioned right after or within the installation description, as no service can work without certificates. (A sysadmin never reads the manual to the end before starting a test installation.)
    • Note2: It should be mentioned explicitly that fork is a simple back-end that does not use any batch system, so no specific configuration of the underlying system is needed. (I had to google a lot to find this info in backends-arc1.pdf (just google it ;))
    • Found /usr/share/arc/examples/config/ComputingElementWithFork, with a close to working ini (had to change @prefix@ to /usr in xml profile), using that instead. Should be mentioned in manual.
    • Started A-REX, seems to work
  • Testing A-REX
    • arcproxy from nordugrid-arc-nox-client-1.2.0-1.fc14.i686 doesn't work with instantCA
    • Can submit job as described in manual
    • arcget -a fails to get job results but still deletes the job(!), arcls jobID fails with "ERROR: Failed listing files" - should I set up gridftpd or something?
      • From A-REX log file: [2010-12-15 11:40:49] [Arc.MCC.HTTP] [WARNING] [10041/159653704] HTTP Error: 404 Not Found
    • Switch over to Ubuntu Lucid, as it has older openssl
    • Now choosing arex-charon-echo.xml and storage-hopi.xml as described in manual.
      • Both xmls use an outdated config for logfile and loglevel. <Logger level="VERBOSE">/var/log/arched.log</Logger> should be <Logger><File>/tmp/arex_arched.log</File><Level>DEBUG</Level></Logger>
      • This setup requires the old /etc/arc_arex.conf.
    • Manual states that installing nordugrid-arc-nox-arex will give me /etc/arc_arex.conf automatically -- I did not get that
      • Choosing /usr/share/arc/examples/config/SecureComputingAndStorageElement_DNlist/arc_arex.conf, removing pbs-stuff and change lrms to fork
      • Manual states that I should set the path to charon_policy.xml, but nothing about what it should look like
      • Copied from /usr/share/arc/examples/charon/charon_policy_arc.xml.example and guessed
    • A-REX runs but all jobs failed unexpectedly in LRMS.
      • Turns out that the A-REX session dir needs to be writable by 'nobody' even though grid-mapfile maps to 'jonkni' and usermap in xml is <arex:defaultLocalName>jonkni</arex:defaultLocalName>.
    • Job submission now works
  • What is missing
    • Updated documentation
    • Working ini-configuration
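The Logger change noted under "Testing A-REX" can be sketched in a few lines. This is only an illustration, assuming the old element has no XML namespace and that VERBOSE is the only level name that needs renaming (the one case seen on this page); the function name convert_logger is hypothetical, and the file path is kept as-is rather than moved to /tmp.

```python
import xml.etree.ElementTree as ET

# Assumption: only the VERBOSE -> DEBUG rename seen on this page is mapped;
# any other level name is passed through unchanged.
LEVEL_RENAMES = {"VERBOSE": "DEBUG"}


def convert_logger(old_xml: str) -> str:
    """Turn <Logger level="X">path</Logger> into the nested File/Level form."""
    old = ET.fromstring(old_xml)
    level = old.get("level", "")
    path = (old.text or "").strip()

    new = ET.Element("Logger")
    ET.SubElement(new, "File").text = path
    ET.SubElement(new, "Level").text = LEVEL_RENAMES.get(level, level)
    return ET.tostring(new, encoding="unicode")


print(convert_logger('<Logger level="VERBOSE">/var/log/arched.log</Logger>'))
```

Real profiles carry namespaces and more elements, so a production conversion would have to locate the Logger element inside the full document first.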

From existing grid manager

Check-list

Table of config params from GM.pdf (NOTE! Not synced with arc.conf.template). All tests are run on RHEL5 64-bit arc-0.8.3.1 release.

Option | Tested | Working in old GM | Working in A-REX via arc.conf | Comment
Global commands
daemon=yes/no | Yes | Yes | Yes | A-REX: Only read if defined in grid-manager block
logfile=[path] | Yes | Yes | Yes | A-REX: Only read if defined in grid-manager block, still writes most of log to default path if arched fails
user=[uid[:gid]] | Yes | Yes | Yes | OGM: Requires UNAME:GNAME, not UID:GID. A-REX: Requires UNAME only, only read if defined in grid-manager block
pidfile=[path] | Yes | Yes | Yes | A-REX: Only read if defined in grid-manager block
debug=number | Yes | Yes | Yes | A-REX has more debug levels than grid-manager
grid-manager
joblog=[path] | Yes | Yes | Yes |
jobreport=[URL ... number] | No | | | Don't know how to set up the service that should receive the job reports
securetransfer=yes/no | Yes | No | No | A-REX: Jobs fail when securetransfer and passive option in gridftp block don't match (both must be yes or no at same time). Should be documented. BOTH: Encrypt=1 or Encrypt=True for every transfer even with securetransfer=no
passivetransfer=yes/no | Yes | No | No | Passive transfer is on ("PASV/SPAS" in gridftp.log) even with passivetransfer=no
localtransfer=yes/no | Yes | Yes | Yes |
maxjobs=[max_processed_jobs [max_running_jobs]] | Yes | Yes | Yes |
maxload=[max_frontend_jobs [emergency_frontend_jobs [max_transferred_files]]] | Yes | Yes | Yes |
maxloadshare=max_share share_type | Yes | Yes | Yes |
wakeupperiod=time | Yes | not tested | Yes | A-REX: wakes up at least every 3 seconds if wakeupperiod>3, seems to have effect when wakeupperiod=1
authplugin=state options plugin | Yes | Yes | Yes |
localcred=timeout plugin | No | | | Don't know how to configure, who uses this?
norootpower=yes/no | Yes | Yes | Yes |
allowsubmit=[group ...] | No | | N/A | Not in A-REX manual, no point in testing
speedcontrol=min_speed min_time min_average_speed max_inactivity | Yes | Yes | Yes |
preferredpattern=pattern | No | | | Need a lfc url with replicas
copyurl=template replacement | Yes | No | Yes | No sign of replacement in log for GM
linkurl=template replacement [node_path] | Yes | No | Yes | No sign of replacement in log for GM
Per UNIX user commands
mail=e-mail_address | Yes | Yes | No | No mails are sent from A-REX, see bug #2182
defaultttl=ttl [ttr] | Yes | Yes | Yes | Setting defaultttl to 60 seconds causes session dirs to be deleted in less than 12 hours
lrms=default_lrms_name default_queue_name | Yes | Yes | Yes |
sessiondir=path [drain] | Yes | Yes | Yes | A-REX: drain hasn't been tested
cachedir=path [link_path] | Yes | Yes | Yes | A-REX: link_path hasn't been tested
remotecachedir=path [link_path] | Yes | Yes | Yes |
cachesize=high_mark [low_mark] | Yes | Yes | Yes | Works, but manual says numbers should be in bytes, while they should be in percent. (http://www.nordugrid.org/documents/GM.pdf is probably outdated)
cachelifetime=lifetime | Yes | Yes | Yes | Seems to take longer before deletion in A-REX
maxrerun=number | Yes | Yes | No | ngresume does not work with A-REX, see bug #2184
maxtransfertries=number | Yes | Yes | No | A-REX doesn't retry srm://gmail.com/file.dat, gives up after first attempted port.
control=path username [username [...]] | Yes | Yes | Yes |
helper=username command [argument [argument [...]]] | Yes | Yes | Yes |
Global, LRMS specific (PBS)
pbs_bin_path=path | Yes | Yes | Yes |
pbs_log_path=path | No | | | Not tested yet
gnu_time=path | Yes | Yes | No | Jobs still run fine with /usr/bin/date in A-REX
tmpdir=path | No | | | Not tested yet
runtimedir=path | No | | | Not tested yet
shared_filesystem=yes/no | Yes | Yes | Yes |
nodename=command | No | | | Not in arc.conf.template; obsolete?
scratchdir=path | Yes | Yes | Yes |
shared_scratch=path | Yes | Yes | Yes |
Argument substitutions
%R - session root | Yes | No | Yes | sessiondir="/cluster/charged/sessions", but %R is //.jobs when control is set (in both versions, but A-REX does not support %c)
%C - control dir | Yes | Yes | Yes |
%U - username | Yes | Yes | Yes |
%u - userid | Yes | Yes | Yes |
%g - groupid | Yes | Yes | Yes |
%H - home dir | Yes | Yes | Yes |
%Q - default queue | Yes | Yes | Yes |
%L - default lrms | Yes | Yes | Yes |
%W - installation path | Yes | Yes | Yes |
%G - globus path | Yes | Yes | Yes |
%c - list of all control directories | Yes | Yes | N/A | Somehow control overwrites %R
%I - job ID (for plugins only, substituted at runtime) | Yes | Yes | Yes |
%S - job state (for authplugin plugins only, substituted at runtime) | Yes | Yes | Yes |
%O - reason (for localcred plugins only, substituted at runtime) | Yes | No | No | %O is not substituted
Command line options
-h - short help | Yes | Yes | Yes |
-d - debug level | Yes | Yes | No | -d means "dump generated xml config"
-L - log file (overwrites value in configuration file) | Yes | Yes | No | No such option in arched
-P - file containing process id (overwrites value in configuration file) | Yes | Yes | No | -p in arched
-U - user and group id to use for running daemon | Yes | Yes | No | -u in arched
-f - do not make process daemon | Yes | Yes | No | -f in arched
-c - name of configuration file | Yes | Yes | Yes |
-C - remove old information before starting | Yes | No | No |
Signals
SIGINT, Ctrl-C | Yes | Yes | Yes |
SIGTERM | Yes | Yes | Yes |

All config options of the original GM

  • daemon=yes|no - specifies whether the GM should run in the background after started. Defaults to yes.
  • logfile=[path] - specifies name of file for logging debug/informational output. Defaults to /dev/null for daemon mode and stderr for foreground mode.

  • user=[uid[:gid]] - specifies user id (and optionally group id) to which the GM must switch after reading configuration. Defaults to not switch.

  • pidfile=[path] - specifies file where process id of GM process will be stored. Defaults to not write.
  • debug=number - specifies level of debug information. More information is printed for higher levels. Currently the highest effective number is 3 and lowest 0. Defaults to 2.

All commands above are generic for every daemon-enabled server in the ARC NorduGrid toolkit (such as GFS and HTTPSD).

  • joblog=[path] - specifies where to store log file containing information about started and finished jobs.
  • jobreport=[URL ... number] - specifies that GM has to report information about jobs being processed (started, finished) to a centralized service running at the given URL. Multiple entries and multiple URLs are allowed. number specifies how long (in days) old records are to be kept if they failed to be reported. The last specified value becomes effective.
  • securetransfer=yes|no - specifies whether to use encryption while transferring data. Currently works for GridFTP only. Default is no. It is overridden by values specified in URL options.
  • passivetransfer=yes|no - specifies whether GridFTP transfers are passive. Setting this option to yes can solve transfer problems caused by firewalls. Default is no.
  • localtransfer=yes|no - specifies whether to pass file downloading/uploading task to computing node. If set to yes the GM will not download/upload files, but compose a script which is submitted to the LRMS in order that the LRMS can execute file transfer. This requires the GM and Globus installation to be accessible from computing nodes and environment variables GLOBUS_LOCATION and NORDUGRID_LOCATION to be set accordingly. Default is no.
  • maxjobs=[max_processed_jobs [max_running_jobs]] - specifies maximum number of jobs being processed by the GM at different stages:
    • max_processed_jobs - maximum number of concurrent jobs processed by GM. This does not limit the number of jobs which can be submitted to the cluster.
    • max_running_jobs - maximum number of jobs passed to Local Resource Management System. Missing value or -1 means no limit.
  • maxload=[max_frontend_jobs [emergency_frontend_jobs [max_transferred_files]]] - specifies maximum load caused by jobs being processed on frontend:
    • max_frontend_jobs - maximum number of jobs in PREPARING and FINISHING states (downloading and uploading files). Jobs in these states can cause a heavy load on the GM host. This limit is applied before moving jobs to PREPARING and FINISHING states.
    • emergency_frontend_jobs - if the limit of max_frontend_jobs is used only by PREPARING or only by FINISHING jobs, aforementioned number of jobs can be moved to another state. This is used to avoid the case where jobs cannot finish due to a large number of recently submitted jobs.
    • max_transferred_files - maximum number of files being transferred in parallel by every job. Used to decrease load on not so powerful frontends.

Missing value or -1 means no limit.
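The "missing value or -1 means no limit" convention for maxjobs and maxload can be sketched as a small parser. This is a minimal illustration, not the GM's own code: parse_limits is a hypothetical helper, and treating the first field of each option as optional too is an assumption.

```python
from typing import List, Optional


def parse_limits(value: str, count: int) -> List[Optional[int]]:
    """Parse up to `count` whitespace-separated integers; None means no limit."""
    fields = value.split()
    if len(fields) > count:
        raise ValueError(f"expected at most {count} values, got {len(fields)}")
    limits: List[Optional[int]] = []
    for i in range(count):
        if i >= len(fields) or int(fields[i]) == -1:
            limits.append(None)  # missing value or -1: unlimited
        else:
            limits.append(int(fields[i]))
    return limits


# maxjobs="100 -1": at most 100 processed jobs, no limit on running jobs
max_processed, max_running = parse_limits("100 -1", 2)
# maxload="10 2": third field missing, so no limit on transferred files
max_frontend, emergency, max_files = parse_limits("10 2", 3)
```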

  • maxloadshare=max_share share_type - specifies a sharing mechanism for data transfer. max_share is the maximum number of processes that can run per transfer share and share_type is the scheme used to assign jobs to transfer shares. See Section 8.5 for possible values and more details.
  • wakeupperiod=time - specifies how often checks for external changes are performed (such as a newly arrived job or a job finished in the LRMS). time is a minimal time period specified in seconds. Default is 3 minutes.
  • authplugin=state options plugin - specifies plugin (external executable) to be run every time job is about to switch to state. The following states are allowed: ACCEPTED, PREPARING, SUBMIT, FINISHING, FINISHED and DELETED. If the exit code of plugin is not 0, the job is canceled by default. Options consist of name=value pairs separated by commas. The following names are supported:
    • timeout - specifies how long in seconds execution of the plugin is allowed to last (mandatory, "timeout=" can be skipped for backward compatibility).
    • onsuccess, onfailure and ontimeout - define the action taken in each case (onsuccess happens if exit code is 0). Possible actions are:
      • pass - continue execution,
      • log - write information about result into logfile and continue execution,
      • fail - write information about result into logfile and cancel job.
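The authplugin options string described above could be parsed roughly as follows. This is a sketch under assumptions: parse_authplugin_options is a hypothetical name, the onsuccess=pass and onfailure=fail defaults follow from "the job is canceled by default", and the ontimeout=fail default is a guess that the text does not confirm.

```python
from typing import Dict, Optional, Union

ACTIONS = {"pass", "log", "fail"}


def parse_authplugin_options(options: str) -> Dict[str, Union[int, str, None]]:
    """Parse e.g. 'timeout=10,onfailure=log' (or the legacy '10,onfailure=log')."""
    result: Dict[str, Union[int, str, None]] = {
        "timeout": None,
        "onsuccess": "pass",
        "onfailure": "fail",
        "ontimeout": "fail",  # assumption: default not stated in the text
    }
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep:
            # bare number: backward-compatible form of timeout=<seconds>
            name, value = "timeout", item
        if name == "timeout":
            result["timeout"] = int(value)
        elif name in ("onsuccess", "onfailure", "ontimeout"):
            if value not in ACTIONS:
                raise ValueError(f"unknown action {value!r} for {name}")
            result[name] = value
        else:
            raise ValueError(f"unknown option {name!r}")
    if result["timeout"] is None:
        raise ValueError("timeout is mandatory")
    return result


print(parse_authplugin_options("10,onfailure=log"))
```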

  • localcred=timeout plugin - specifies plugin (external executable or function in shared library) to be run every time job has to do something on behalf of local user. Execution of plugin may not last longer than timeout seconds. If plugin looks like function@path then function int function(char*,char*,char*,...) from shared library path is called (timeout is not functional in this case). If exit code is not 0, current operation will fail.

  • norootpower=yes/no - if set to yes, all processes involved in job management will use the local identity of a user to which a Grid identity is mapped in order to access the filesystem at the path specified in the sessiondir command (see below). Sometimes this may involve running a temporary external process.
  • allowsubmit=[group ...] - list of authorization groups of users allowed to submit new jobs while "allownew=no" is active in jobplugin.so configuration (see below in section 8.3). Multiple commands are allowed.
  • speedcontrol=min_speed min_time min_average_speed max_inactivity - specifies how slow a data transfer is allowed to be. A transfer is canceled if the transfer rate (bytes per second) is lower than min_speed for at least min_time seconds, or if average rate is lower than min_average_speed, or no data is received for longer than max_inactivity seconds.
  • preferredpattern=pattern - specifies a preferred pattern on which to sort multiple replicas of an input file. It consists of one or more patterns separated by a pipe character (|) listed in order of preference. Replicas will be ordered by the earliest match. If the dollar character ($) is used at the end of a pattern, the pattern will be matched to the end of the hostname of the replica.
  • copyurl=template replacement - specifies that URLs starting from template should be accessed in a different way (most probably Unix open). The template part of the URL will be replaced with replacement. replacement can either be a URL or a local path starting from ’/’. It is advisable to end template with ’/’.
  • linkurl=template replacement [node_path] - mostly identical to copyurl but the file will not be copied - instead a soft-link will be created. replacement specifies the way to access the file from the frontend, and is used to check permissions. node_path specifies how the file can be accessed from computing nodes, and will be used for soft-link creation. If node_path is missing - local_path will be used instead. Neither node_path nor replacement should be URLs.
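The three speedcontrol conditions in the list above can be written out as a check over per-second rate samples. This is only a sketch: the SpeedControl class and should_cancel function are hypothetical, one sample per second is assumed, and evaluating the average over the whole sample list (rather than continuously) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeedControl:
    min_speed: float          # bytes per second
    min_time: float           # seconds
    min_average_speed: float  # bytes per second
    max_inactivity: float     # seconds


def should_cancel(rates: List[float], cfg: SpeedControl) -> Optional[str]:
    """Return the name of the violated limit, or None if the transfer may go on.

    rates holds one transfer-rate sample per second, oldest first.
    """
    slow_run = 0  # consecutive seconds below min_speed
    idle_run = 0  # consecutive seconds without any data
    for rate in rates:
        slow_run = slow_run + 1 if rate < cfg.min_speed else 0
        idle_run = idle_run + 1 if rate == 0 else 0
        if slow_run >= cfg.min_time:
            return "min_speed"
        if idle_run > cfg.max_inactivity:
            return "max_inactivity"
    if rates and sum(rates) / len(rates) < cfg.min_average_speed:
        return "min_average_speed"
    return None


cfg = SpeedControl(min_speed=100, min_time=3, min_average_speed=50, max_inactivity=5)
print(should_cancel([200, 50, 50, 50], cfg))
```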

NOTE: URLs which fit into copyurl or linkurl are treated as more easily accessible than other URLs. This means that if the GM has to choose between several URLs from which to download input files, these will be tried first.
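The copyurl prefix rewrite and the "tried first" ordering from the NOTE above can be illustrated like this. It is a sketch, not the GM implementation: apply_copyurl and order_sources are hypothetical names, and the example hosts and paths are made up.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]  # (template, replacement)


def apply_copyurl(url: str, rules: List[Rule]) -> Optional[str]:
    """Replace the template prefix of url with replacement, if any rule fits."""
    for template, replacement in rules:
        if url.startswith(template):
            return replacement + url[len(template):]
    return None


def order_sources(urls: List[str], rules: List[Rule]) -> List[str]:
    """Put URLs that fit a rule first; the sort is stable otherwise."""
    return sorted(urls, key=lambda u: apply_copyurl(u, rules) is None)


# Hypothetical rule: a storage element whose data is also mounted locally.
rules = [("gsiftp://se.example.org/data/", "/mnt/data/")]
print(apply_copyurl("gsiftp://se.example.org/data/a/b.dat", rules))
print(order_sources(["http://x.example.org/f", "gsiftp://se.example.org/data/f"], rules))
```

Ending the template with '/' as the text advises avoids accidental matches such as /data2 fitting a /data template.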

Per-UNIX user commands:

  • mail=e-mail_address - specifies an email address from which notification mails are sent.
  • defaultttl=ttl [ttr] - specifies the time in seconds for the SD to be available after job finishes (ttl) and after job is deleted (ttr) due to ttl. Defaults are 7 days for ttl and 30 days for ttr.
  • lrms=default_lrms_name default_queue_name - specifies names for the LRMS and queue. A queue name can also be specified in the JD (currently it is not allowed to override the LRMS used using JD).
  • sessiondir=path [drain] - specifies the path to the directory in which the SD is created. Multiple session directories may be specified by specifying multiple sessiondir commands. In this case jobs are spread evenly over the session directories. If the path is * the default is used - $HOME/.jobs. When adding a new session directory, be sure to restart the GM so that jobs assigned there are processed. A session directory can be drained prior to removal by adding the “drain” option (no restart is required in this case). No new jobs will be assigned to this session directory by the GFS, but running jobs will still be accessible. When all jobs are processed the session directory can be removed and the GM should be restarted.
  • cachedir=path [link_path] - specifies a directory to store cached data (see Section 7). Multiple cache directories may be specified by specifying multiple cachedir commands. Cached data will be distributed over multiple caches according to free space in each. Specifying no cachedir command or commands with an empty path disables caching. The optional link_path specifies the path at which path is accessible on computing nodes, if it is different from the path on the GM host. If link_path is set to ’.’ files are not soft-linked, nor are per-job links created, but files are copied to the session directory. If a cache directory needs to be drained, then cachedir should specify “drain” as the link_path.
  • remotecachedir=path [link_path] - specifies caches which are under the control of other GMs, but which this GM can have read-only access to (see Section 7.3). Multiple remote cache directories may be specified by specifying multiple remotecachedir commands. If a file is not available in paths specified by cachedir, the GM looks in remote caches. link_path has the same meaning as in cachedir, but the special path “replicate” means files will be replicated from remote caches to local caches when they are requested.

  • cachesize=high_mark [low_mark] - specifies high and low watermarks for space used on the file system on which the cache directory is located, as a percentage of total file system capacity. When high_mark is exceeded, files will be deleted to bring the used space down to low_mark. It is a good idea to have each cache on its own separate file system. To turn off cache deletion, "cachesize" without parameters can be specified. These cache settings apply to all caches specified by cachedir commands.
  • cachelifetime=lifetime - if cache cleaning is enabled, files accessed less recently than the lifetime time period will be deleted. Example values of this option are 1800, 90s, 24h, 30d. When no suffix is given the unit is seconds.
  • maxrerun=number - specifies maximum number of times job will be allowed to rerun after it has failed in the LRMS. Default value is 2. This only specifies an upper limit. The actual number is provided in the job description and defaults to 0.
  • maxtransfertries=number - specifies the maximum number of times download and upload will be attempted per job (retries are only performed if an error is judged to be temporary). This number must be greater than 0 and defaults to 10.
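The cachesize watermarks and the cachelifetime suffixes described above can be sketched as two small helpers. Both names are hypothetical, only the suffixes that appear in the examples (s, h, d) are handled, and the real cleaning logic is certainly more involved.

```python
_UNITS = {"s": 1, "h": 3600, "d": 86400}  # only the suffixes shown in the text


def parse_lifetime(value: str) -> int:
    """Convert '1800', '90s', '24h' or '30d' to seconds (no suffix = seconds)."""
    value = value.strip()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)


def bytes_to_free(used: int, total: int, high_mark: float, low_mark: float) -> int:
    """How much to delete once usage exceeds high_mark (marks are percentages)."""
    used_pct = 100.0 * used / total
    if used_pct <= high_mark:
        return 0  # below the high watermark: nothing to clean
    return used - int(total * low_mark / 100.0)


print(parse_lifetime("24h"))                 # seconds in 24 hours
print(bytes_to_free(900, 1000, 80, 70))      # bring 90% usage down to 70%
```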

All per-user commands should be put before the control command which initiates the serviced user.

  • control=path username [username [...]] - This option initiates a UNIX user as being serviced by the GM. path refers to the control directory (see Section 6 for the description of control directory). If the path is * the default one is used - $HOME/.jobstatus. username stands for UNIX name of the local user. Multiple names can be specified. If the name is * it is substituted by all names found in file /etc/grid-security/grid-mapfile (for the format of this file one should study the Globus project [12]).

The special name ’.’ (dot) can also be used. The corresponding control directory will be used for any user. This option should be the last one in the configuration file. The command controldir=path is also available. It uses the special username ’.’ and is always executed last independent of its placement in the file.
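The expansion of the * username in the control command can be illustrated as below. This is a sketch that assumes the common grid-mapfile layout of a quoted DN followed by a local account name on each line; both function names are hypothetical and the DNs in the example are made up.

```python
import shlex
from typing import List


def mapped_usernames(mapfile_lines: List[str]) -> List[str]:
    """Collect the distinct local user names from grid-mapfile lines."""
    names: List[str] = []
    for line in mapfile_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)  # handles the quoted DN with spaces
        if len(fields) >= 2 and fields[-1] not in names:
            names.append(fields[-1])
    return names


def expand_control_users(usernames: List[str], mapfile_lines: List[str]) -> List[str]:
    """Replace '*' by all mapped names, keeping order and dropping duplicates."""
    expanded: List[str] = []
    for name in usernames:
        candidates = mapped_usernames(mapfile_lines) if name == "*" else [name]
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


mapfile = [
    '"/O=Grid/O=Example/CN=Jon Doe" jonkni',
    '"/O=Grid/O=Example/CN=Ann Roe" griduser',
    '"/O=Grid/O=Example/CN=Bob Poe" griduser',
]
print(expand_control_users(["*"], mapfile))
```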

  • helper=username command [argument [argument [...]]] - associates an external program with a local UNIX user. This program will be kept running under the account of the user specified by username. Special names can be used: ’*’ - all names from /etc/grid-security/grid-mapfile, ’.’ - root user. The user should be already configured with the control option (except root, who is always configured). command is an executable and arguments are passed as arguments to it. The stderr output of the executable (or the error message if the executable failed to run) is redirected to a file in the control directory called job.helper.username.errors.

The following commands are global commands and are specific to the underlying LRMS (PBS in this case).

  • pbs_bin_path=path - path to directory which contains PBS commands.
  • pbs_log_path=path - path to directory with PBS server’s log files.
  • gnu_time=path - path to time utility.
  • tmpdir=path - path to directory for temporary files.
  • runtimedir=path - path to directory which contains runtime environment scripts.
  • shared_filesystem=yes|no - specifies whether computing nodes have access to the session directory through a shared filesystem like NFS. Corresponds to the environment variable RUNTIME_NODE_SEES_FRONTEND.

  • nodename=command - command to obtain hostname of computing node.
  • scratchdir=path - path on computing node to move session directory to before execution.
  • shared_scratch=path - path on frontend where scratchdir can be found.

In each command’s arguments (paths, executables, ...), the following substitutions can be used:

  • %R - session root - see command sessiondir
  • %C - control dir - see command control

  • %U - username
  • %u - userid - numerical
  • %g - groupid - numerical
  • %H - home dir - home specified in /etc/passwd
  • %Q - default queue - see command lrms
  • %L - default lrms - see command lrms
  • %W - installation path - ${NORDUGRID_LOCATION}
  • %G - globus path - ${GLOBUS_LOCATION}
  • %c - list of all control directories
  • %I - job ID (for plugins only, substituted at runtime)
  • %S - job state (for authplugin plugins only, substituted at runtime)
  • %O - reason (for localcred plugins only, substituted at runtime). Possible reasons are:
    • new - new job, new credentials
    • renew - old job, new credentials
    • write - write/delete file, create/delete directory (through gridftp)
    • read - read file, directory, etc. (through gridftp)
    • extern - call external program (grid-manager)
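The % substitutions listed above amount to a single-character lookup. A minimal sketch follows; expand_substitutions is a hypothetical name, the values in the example are invented, and leaving unknown codes untouched is an assumption rather than documented GM behaviour.

```python
import re
from typing import Mapping


def expand_substitutions(text: str, values: Mapping[str, str]) -> str:
    """Replace %X codes in text using values; unknown codes are left as-is."""
    def replace(match: "re.Match[str]") -> str:
        code = match.group(1)
        return values.get(code, match.group(0))

    return re.sub(r"%([A-Za-z])", replace, text)


# Hypothetical values for one serviced user.
values = {
    "R": "/cluster/sessions",   # session root
    "C": "/var/spool/control",  # control dir
    "U": "jonkni",              # username
    "u": "1000",                # numerical userid
    "Q": "batch",               # default queue
    "L": "fork",                # default lrms
}
print(expand_substitutions("/opt/plugins/check.sh %C %I %U", values))
```

In the example %I is not in the mapping, so it stays in the string for a later, per-job substitution step.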

Some configuration parameters can be specified from the command line when starting the GM:

grid-manager [-h] [-C level] [-d level] [-c path] [-F] [-U uid[:gid]] [-L path] [-P path]

  • -h - short help
  • -d - debug level
  • -L - log file (overwrites value in configuration file)
  • -P - file containing process id (overwrites value in configuration file)
  • -U - user and group id to use for running daemon
  • -F - do not make process daemon
  • -c - name of configuration file
  • -C - remove old information before starting: 1 - remove finished jobs, 2 - remove active jobs too, 3 - also remove everything that looks like junk.
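For a transition, the option differences recorded in the "Command line options" rows of the check-list on this page could be applied mechanically to an existing grid-manager command line. The sketch below encodes only what that table says and nothing more: translate_options is a hypothetical helper, options with no stated arched equivalent are dropped and reported, and values are passed through unchanged (the table notes that A-REX expects a user name only for user=, so a uid:gid value for -U may need manual editing).

```python
from typing import Dict, List, Optional, Tuple

# grid-manager option -> (arched option or None, option takes a value)
# Source: the check-list table on this page, not the arched documentation.
GM_TO_ARCHED: Dict[str, Tuple[Optional[str], bool]] = {
    "-h": ("-h", False),
    "-c": ("-c", True),
    "-P": ("-p", True),
    "-U": ("-u", True),
    "-F": ("-f", False),
    "-d": (None, True),  # arched -d dumps the generated xml config instead
    "-L": (None, True),  # no such option in arched
    "-C": (None, True),  # listed as not working
}


def translate_options(gm_args: List[str]) -> Tuple[List[str], List[str]]:
    """Return (arguments to keep for arched, grid-manager options dropped)."""
    kept: List[str] = []
    dropped: List[str] = []
    i = 0
    while i < len(gm_args):
        opt = gm_args[i]
        if opt not in GM_TO_ARCHED:
            raise ValueError(f"unknown grid-manager option: {opt}")
        new_opt, takes_value = GM_TO_ARCHED[opt]
        value = None
        if takes_value:
            if i + 1 >= len(gm_args):
                raise ValueError(f"option {opt} needs a value")
            value = gm_args[i + 1]
            i += 1
        i += 1
        if new_opt is None:
            dropped.append(opt)
        else:
            kept.append(new_opt)
            if value is not None:
                kept.append(value)
    return kept, dropped


print(translate_options(["-c", "/etc/arc.conf", "-F", "-L", "/tmp/gm.log"]))
```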

The GM can be stopped in different ways by sending the following signals:

  • SIGINT, Ctrl-C - stop starting new external processes and exit after all running ones have finished. If SIGINT is sent twice, behavior is the same as in case of SIGTERM.
  • SIGTERM - stop external processes by sending SIGTERM to them and exit.
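The two-stage shutdown described above (SIGINT drains, a second SIGINT or a SIGTERM terminates) can be modelled as a small state machine. This is an illustration of the described behaviour, not the GM source; ShutdownState and install_handlers are hypothetical names, and the actual stopping of child processes is left out.

```python
import signal


class ShutdownState:
    """Tracks which shutdown stage the daemon is in."""

    def __init__(self) -> None:
        self.mode = "running"

    def handle(self, signum, frame=None) -> None:
        if self.mode == "terminating":
            return  # already stopping everything
        if signum == signal.SIGTERM or self.mode == "draining":
            # SIGTERM, or a second SIGINT: stop external processes and exit
            self.mode = "terminating"
        elif signum == signal.SIGINT:
            # first SIGINT: start nothing new, wait for running processes
            self.mode = "draining"


def install_handlers(state: ShutdownState) -> None:
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGINT, state.handle)
    signal.signal(signal.SIGTERM, state.handle)


state = ShutdownState()
state.handle(signal.SIGINT)
print(state.mode)
state.handle(signal.SIGINT)
print(state.mode)
```

A main loop would check state.mode on each wake-up and either stop launching helpers ("draining") or send SIGTERM to its children ("terminating").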

TODO

Remaining testing before transition

  • Feature testing
  • Performance/stability testing
    • Submit a number of jobs and check time when all are finished/failed and how many in each
      • With different files
      • With same file, caching enabled
    • See how many jobs are needed to crash the service
    • ...