This wiki is obsolete, see the NorduGrid web pages for up to date information.
NOX/Tests/Client tests
General issues
- There is no realistic testbed:
- The information index has only a few sites at NIIF
- Various examples picked from templates are probably outdated
- No Chelonia/Bartender endpoints are advertised anywhere
- There is no way to know whether MyProxy, VOMS or SLCS services are available
- There is no way to submit a real job that does some reasonable calculation, as only 3 RTEs (2 of them are Java) are visible, and no other relevant information is visible (not even the operating system)
- General confusion about the configuration format and file name (XML vs INI): all man pages refer to an XML file, the installation deploys arcclient.xml in /etc, and so forth.
- The client configuration is now INI only. All references to arcclient.xml and the XML client configuration have been removed. (A minimal INI sketch is given after this list.)
- Time is treated inconsistently everywhere: e.g., the maximal queue wall time is reported as "P1DT6H1M" (an ISO 8601 duration, hardly readable for users), proxy validity is always given in GMT, and so on.
- Should be fixed in revision 15509.
- Information is overly verbose and most of the time very cryptic, e.g.:
Free Slots With Duration: P12Y4M2DT12H: 1
- See above.
or:
[2009-10-26 00:59:33] [Arc.Plugin] [ERROR] [1843/28687952] Could not find loadable module by name (empty) ((empty))
- This error could be generated when specifying a wrong middleware plugin, such as FOO:https://example.com/cluster, where the FOO plugin does not exist. Checks have now been added to the code (revisions 15694 and 15695), which should catch this error at an earlier stage. If the above error message can still be reproduced, please let me know.
or:
[2009-10-26 00:40:45] [Arc.MCC.TLS] [ERROR] [987/14137344] SSL error: -1 - (empty):(empty):(empty)
or, randomly, at job submission:
[2009-10-26 02:37:34] [Arc.A-REX-Client] [ERROR] [24094/21088896] The response to a service status request is Fault message: Response is not valid SOAP
- The message has been promoted to INFO, along with other similar ERROR messages.
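For reference, a minimal INI-style client configuration could look as follows. This is a sketch only: the keys and values are copied from the configuration examples later on this page, not from a specification.

[common]
defaultservices=index:ARC1:https://knowarc2.grid.niif.hu:50000/isis

[alias]
arex1=computing:ARC1:http://knowarc1.grid.niif.hu:50000/arex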
Summary of installed arc* commands
SVN build | Binary repos | Comment | Judgement
---|---|---|---
Security | | |
arcproxy | arcproxy | | OK
arcslcs | | SLCS generation utility |
arcdecision | arcdecision | |
Jobs | | |
arccat | arccat | | OK
arcclean | arcclean | Completely silent | OK
arcget | arcget | | OK
arcinfo | arcinfo | | OK
arckill | arckill | Completely silent | OK
arcmigrate | arcmigrate | Unclear what the "migratable" states are | ?
arcrenew | arcrenew | Completely silent, difficult to verify | ?
arcresub | arcresub | Problems submitting to a target | ?
arcresume | arcresume | Difficult to test, needs a "resumable state" | ?
arcstat | arcstat | | OK
arcsub | arcsub | Still very verbose | OK
arcsync | arcsync | | OK
Data | | |
arcrm | arcrm | |
arcsrmping | arcsrmping | |
arccp | arccp | |
arcls | arcls | |
chelonia | | |
Other | | |
arcecho | arcecho | |
CLI from SVN on Ubuntu 9.04
System
- Intel Core2 Duo
- Ubuntu 9.04 Jaunty x86_64; kernel 2.6.28-15-generic
- Pre-defined environment variables:
- X509_USER_KEY
- X509_USER_CERT
- X509_USER_PROXY
Build
- Pre-installed:
- libglobus-gssapi-gsi-dev and all the pulled dependencies from the NorduGrid Jaunty repo - needed for MyProxy interactions
- IGTF certificates v1.31 "classic" from tarball
- 0.9.4rc2 tag from SVN
- Build and installation:
sudo ./autogen.sh
sudo ./configure --disable-a-rex-service --disable-isi-service --disable-charon-service \
    --disable-compiler-service --disable-hopi-service --disable-paul-service \
    --disable-sched-service --disable-storage-service --disable-janitor-service \
    --disable-java --disable-python --disable-doc
sudo make install
Installs everything in '/usr/local/' (very uncommon for Ubuntu, but works)
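If the '/usr/local' prefix is undesirable on Ubuntu, the standard autotools prefix override should work (not tested here):

sudo ./configure --prefix=/usr [plus the same --disable-... flags as above]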
Tests
Common tests
- When the system configuration file does not exist (e.g., it never exists on Windows), the following warning is always printed:
[2009-10-25 02:39:43] [Arc.UserConfig] [WARNING] [7380/141377160] System configuration file (/etc/arc/client.conf) does not exists.
SOLUTION: promote this to a higher debug level
- Done in revision 15696 and 15697.
- When no configuration file is found at all, a misleading warning is printed:
[2009-10-25 02:25:16] [Arc.UserConfig] [WARNING] [7117/146706056] System configuration file (/etc/arc/client.conf) does not exists.
SOLUTION: Print a message that no configuration file is found at all, or add $HOME/.arc/client.conf to the list of possible candidates.
- This happens when the system configuration cannot be found. It has been solved in the same way as the situation above.
- In RC2, every tool produced a warning after loading the old /usr/local/etc/arc/client.conf:
[2009-10-15 20:20:32] [Arc.UserConfig] [WARNING] [21400/14995024] Unknown section client, ignoring it
REASON: old ~/.arc/client.conf, which is still needed for v0.8. SOLUTION: Martin changed the verbosity level to INFO and added a debug message before processing (previously it came only after)
arc* -v
No issues
Man-pages are not localized
arcproxy
arcproxy --help
In RC2 it dumped the gettext header:
Usage: arcproxy [OPTION...]
Project-Id-Version: Arc
Report-Msgid-Bugs-To:
POT-Creation-Date: 2009-10-15 19:48+0200
...
REASON: Empty argument string to OptionParser. ACTION: Martin fixed the code.
man arcproxy
Has funny info:
...
COPYRIGHT
    We need to have this
FILES
AUTHOR
    Written by developers
ACTION: ask someone to synchronise this bit in man pages
arcproxy
arcproxy -O
arcproxy -I
arcproxy -I -P cow
arcproxy -c validityPeriod="1 second"
arcproxy -t 1
No issues
arcproxy -C usercert-old.pem -K userkey-old.pem
When a proxy is generated from expired credentials, arcproxy fails and yet reports success, with a validity time coinciding with the certificate expiration date:
...
[2009-10-16 23:59:44] [Arc.Credential] [ERROR] [8500/24735312] Certificate verification failed
Proxy generation succeeded
Your proxy is valid until: Thu, 17 Sep 2009 15:13:03 GMT
REASON: unknown. SOLUTION: TBD
arcproxy -T /etc
Proxy creation fails as expected (no trusted certificates found), and yet success is reported, with an odd validity period:
...
[2009-10-17 00:04:10] [Arc] [ERROR] [8689/33365584] Certificate verification error: unable to get issuer certificate
[2009-10-17 00:04:10] [Arc.Credential] [ERROR] [8689/33365584] Certificate verification failed
Proxy generation succeeded
Your proxy is valid until: Sat, 17 Oct 2009 10:04:10 GMT
REASON: unknown. SOLUTION: TBD
CURIOSITY: validity time is always printed in GMT:
...
Proxy generation succeeded
Your proxy is valid until: Fri, 16 Oct 2009 23:51:46 GMT

date
Fri Oct 16 23:51:53 CEST 2009
- Fixed in revision 15700. Time is now printed in local time.
arcproxy -z ~/.arc/other.conf
Prefers the X509_USER_* values to those specified in the configuration file - it is unclear whether this is the expected behaviour.
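A quick way to reproduce the precedence issue (a sketch; it assumes ~/.arc/other.conf points at a different certificate/key pair than the environment variables do):

export X509_USER_CERT=$HOME/.globus/usercert.pem
export X509_USER_KEY=$HOME/.globus/userkey.pem
arcproxy -z ~/.arc/other.conf
arcproxy -I
# the reported Identity follows X509_USER_CERT, not the credentials in other.conf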
VOMS tests
arcproxy -V ./vomses -S knowarc.eu
arcproxy -S knowarc.eu
arcproxy -S knowarc.eu:all
arcproxy -S knowarc.eu:list
arcproxy -G -S knowarc.eu:all
arcproxy -S atlas:/atlas/Role=production
arcproxy -O -S atlas
arcproxy -S atlas -O -c validityPeriod="5 hours"
No issues
MyProxy tests
arcproxy -L knowarc1.grid.niif.hu -U oxana -M PUT
arcproxy -L knowarc1.grid.niif.hu -U oxana -M GET
arcproxy -S atlas -L knowarc1.grid.niif.hu -U oxana -M PUT
arcproxy -L knowarc1.grid.niif.hu -U oxana -M GET
No issues, except that the MyProxy server cannot store VOMS extensions (as expected).
Curious usage of X509_VOMS_DIR
arcproxy appears to use the X509_VOMS_DIR variable as a pointer to the vomses file (the list of VOMS server contact points). The native VOMS client uses this variable to point to the directory that contains the VOMS server credentials needed to validate the proxy.
WARNING: possibility of confusion when the same variable has different meanings for different tools
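To make the clash concrete (the native-client path below is an assumption based on common VOMS installations, not something tested here):

# native VOMS client: directory holding the VOMS servers' credentials
export X509_VOMS_DIR=/etc/grid-security/vomsdir
# arcproxy in this release: the vomses contact-point file itself
export X509_VOMS_DIR=$HOME/.voms/vomses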
arcsync
- BIG PROBLEM: retrieves all jobs from an http A-REX, from every user (a feature, actually, but a very bad one)
- AFAIK there is nothing to do about it. The service is insecure, and if you choose to sync against an insecure service then you will get all the jobs registered there. There is no way to identify which jobs are yours and which are not.
- man page refers to client.xml
- Fixed in trunk.
- the man page and the help output differ: option "-m" does not exist, contrary to what is written in the man page
- Fixed in trunk. There should be no "-m" flag. The default is to merge; otherwise use the "-t" flag to truncate the joblist before adding synced jobs.
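Illustrative usage of the two modes (the -c target is hypothetical, and -c is assumed to work as for the other arc* tools):

arcsync -c https://example.org:60000/arex
arcsync -c https://example.org:60000/arex -t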
arcsub
General notes:
- a pity that some useful options are gone (-U, -C, -dryrun)
- Unknown attributes should be allowed.
- The functionality of the -C flag can partly be achieved by using an alias. If you feel some functionality is missing, please open a feature request.
- AFAIK dry run is not supported by A-REX. It is still possible to dry-run on the grid-manager, by putting the dryrun attribute in the xRSL job description (see the sketch after these notes). If the dry-run functionality is required, a feature request should be opened.
- How do I know which brokers are available?
- At the moment there is no way. We have been talking about this in the arclib team, but nothing has been done yet. A feature request should be opened on this...
- How do I plug in my own broker?
- Read the manual... :o)
- FastestCPU does not exist, though it is documented in the deliverable
- The name was changed to Benchmark. Without arguments, the Benchmark broker is equivalent to the old FastestCPU.
- The tool is still very verbose: below is an example of a simple echo job submission with default settings
> arcsub echo-stage.jsdl ERROR: Failed to bind to ldap server (index1.nordugrid.org) ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: The service status could not be retrieved ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk) ERROR: Failed to bind to ldap server (topaasi.grid.utu.fi) ERROR: Failed to bind to ldap server (spektroliitti.lut.fi) ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk) ERROR: Failed to bind to ldap server (kvartsi.hut.fi) ERROR: Failed to bind to ldap server (akaatti.tut.fi) ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk) ERROR: Failed to bind to ldap server (opaali.phys.jyu.fi) ERROR: Conversion failed: adotf ERROR: Conversion failed: adotf ERROR: Invalid period string: 4320.0 ERROR: Invalid period string: 120.0 ERROR: Conversion failed: - ERROR: Ldap bind timeout (lcg.bitp.kiev.ua) ERROR: Failed to bind to ldap server (hexgrid.bccs.uib.no) ERROR: Connect: Failed authentication: 535 Not allowed ERROR: Submit: Failed to connect Submission to gsiftp://neolith2.nsc.liu.se:2811/jobs failed, trying next target ERROR: Connect: Failed authentication: 535 Not allowed ERROR: Submit: Failed to connect Submission to gsiftp://svea.c3se.chalmers.se:2811/jobs failed, trying next target ERROR: Connect: Failed authentication: 535 Not allowed ERROR: Submit: Failed to connect Submission to gsiftp://neolith2.nsc.liu.se:2811/jobs failed, trying next target ERROR: Connect: Failed authentication: 535 Not allowed ERROR: Submit: Failed to connect Submission to gsiftp://gtpps2.csc.fi:2811/jobs failed, trying next target ERROR: Can not create the SSL Context object ERROR: SSL error: 336236785 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Creating delegation to CREAM delegation service failed ERROR: Creating delegation failed Submission to https://cream.grid.upjs.sk:8443/ce-cream/services failed, trying next target Job submitted with jobid: gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
- Unresolved
- When a job is killed or otherwise not in the joblist, a misleading ERROR is printed; a more informative message is needed
~ > arckill gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
~ > arcstat gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
WARNING: Job not found in job list: gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
ERROR: No job controllers loaded
- Unresolved
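As for the dry-run note above, a grid-manager dry run would be requested via the dryrun attribute in the xRSL job description. A sketch (attribute name per the classic xRSL manual; not verified against a grid-manager here):

&("executable" = "/bin/echo")
 ("arguments" = "hello")
 ("dryrun" = "yes")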
Tests:
- Normal job submission succeeds only to A-REX and grid-manager sites, though with plenty of errors and warnings (see the verbose output above). CREAM and UNICORE do not seem to work.
arcsub echo.jsdl
- Fixes regarding CREAM support have been made; however, there is still an unresolved issue, see bug 1755.
- Submission to a specific ARC0 site fails
arcsub -c ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid echo.jsdl
[2009-10-26 00:39:19] [Arc.Plugin] [ERROR] [787/16326224] Could not find loadable module by name (empty) ((empty))
[2009-10-26 00:39:19] [Arc.Plugin] [ERROR] [787/16326224] Could not find loadable module by name ARC0 and HED:TargetRetriever ((empty))
[2009-10-26 00:39:19] [Arc.Loader] [ERROR] [787/16326224] TargetRetriever ARC0 could not be created
Job submission aborted because no clusters returned any information
REASON: it is highly non-trivial to figure out that the cause is missing Globus libraries at the ./configure step. By guessing, trial and error, and 3 rebuilds, it started working.
- Checks have been added which give a more informative error message.
- AFAIK UNICORE is currently not supported.
- Submission to specific ARC1 sites succeeds
- Submission to aliases: works as expected (arexes is a recursive alias; a sketch of the alias definitions follows the commands below)
arcsub -c arex1 echo.jsdl
arcsub -c arexes echo.jsdl
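For reference, the aliases used above are presumably defined along these lines in the [alias] section of the client configuration. The arex1/arex2 lines match the configuration shown later on this page; the recursive arexes definition is an assumption about the alias syntax:

[alias]
arex1=computing:ARC1:http://knowarc1.grid.niif.hu:50000/arex
arex2=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex
arexes=arex1 arex2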
- Submission using specified broker works as expected
arcsub -b FastestQueue echo.jsdl
- PROBLEM: setting brokername=Cow in the client configuration SUCCEEDS!
arcsub -c arexes echo.jsdl
Job submitted with jobid: http://knowarc1.grid.niif.hu:50000/arex/1342912565143791867204702
- A check has been added in revision 15704, which gives an error message if the broker is not found.
- Verify job description - works as expected
arcsub -c ARC1:https://example.org:60000/arex job.jsdl -x
arcstat
PROBLEM: the man page mentions option "-i", but it is not implemented. Fixed in trunk.
- Query all jobs: works as expected, except for being very verbose, complaining of "Could not find loadable module by name..."
arcstat -a
- Query specific job: both short and long versions work as expected
arcstat <jobID>
arcstat -l <jobID>
PROBLEM: very little information from A-REX, even with -l. Basically, I don't even know whether the jobs are mine.
- The information retrieval has been updated. Please re-do the evaluation.
- Query jobs on a specific cluster:
arcstat -c <url>
PROBLEM: when the cluster alias cannot be resolved, it proceeds to stat all jobs!
~ > arcstat -c cow
ERROR: Could not resolve alias "cow" it is not defined.
ERROR: Failed to bind to ldap server (pgs02.grid.upjs.sk)
WARNING: Job state information not found: gsiftp://pgs02.grid.upjs.sk:2811/jobs/18401226184112135916457
WARNING: Job state information not found: gsiftp://shiva.rhi.hi.is:2811/jobs/1040812261924051614242126
WARNING: Job state information not found: gsiftp://gateway01.dcsc.ku.dk:2811/jobs/94731226193948987458939
WARNING: Job state information not found: gsiftp://gateway01.dcsc.ku.dk:2811/jobs/100911226193970685376796
WARNING: Job state information not found: gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/281991235081230880182882
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2434012350812551337331521
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/996312350820311427289495
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/15706123508214085320563
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/20871235082527988736939
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/940512415410181382505774
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/1149812415412531723520409
WARNING: Job state information not found: gsiftp://benedict.grid.aau.dk:2811/jobs/17591241541267971426534
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/119751241541281615800024
WARNING: Job state information not found: gsiftp://siri.lunarc.lu.se:2811/jobs/188812553078141140454717
WARNING: Job state information not found: gsiftp://siri.lunarc.lu.se:2811/jobs/256371255306399952746572
Job: gsiftp://grad.uppmax.uu.se:2811/jobs/2231012574512572087245670
 Name: JSDL stdin/stdout test
 State: Failed (FAILED)
 Error: Failed extracting LRMS ID due to some internal error
Job: gsiftp://grid.tsl.uu.se:2811/jobs/802412574550371508616905
 Name: Test job
 State: Finished (FINISHED)
...
- Unresolved
- Query jobs in joblist: works as expected
arcstat -j joblist
- Query jobs with a given status: only works when "-a" is specified
arcstat -s Finished
[2009-10-26 01:01:23] [Arc.arcstat] [ERROR] [2073/39910992] No jobs given

arcstat -a -s Finished
[2009-10-26 01:00:40] [Arc.Plugin] [ERROR] [2062/10149456] Could not find loadable module by name (empty) ((empty))
[2009-10-26 01:00:40] [Arc.Plugin] [ERROR] [2062/10149456] Could not find loadable module by name ARC0 and HED:JobController ((empty))
[2009-10-26 01:00:40] [Arc.Loader] [ERROR] [2062/10149456] JobController ARC0 could not be created
[2009-10-26 01:00:40] [Arc.A-REX-Client] [ERROR] [2062/10149456] The status of the job (https://knowarc1.grid.niif.hu:60000/arex/18559124152278119431851) could not be retrieved.
[2009-10-26 01:00:40] [Arc.JobController.ARC1] [ERROR] [2062/10149456] Failed retrieving job status information
[2009-10-26 01:00:44] [Arc.JobController] [WARNING] [2062/10149456] Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/18559124152278119431851
Job: https://knowarc1.grid.niif.hu:60000/arex/1056712565140411350802261
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50003/arex/122271256514211711205873
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50003/arex/122271256514322848951627
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50000/arex/134291256514330947900441
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50000/arex/1342912565143791867204702
 State: Finished (FINISHED)
- In revision 15707 it became possible to specify only -s <state>, in which case all jobs in the joblist with <state> will be queried.
- The error messages from A-REX-Client and JobController.ARC1 have been promoted to VERBOSE and INFO respectively.
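Example of the post-15707 behaviour (a sketch):

arcstat -s Finished

which queries the whole joblist and reports the jobs in state Finished.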
arcget
- Man page refers to client.xml - fixed in RC5
- Get all jobs - OK
arcget -a
- Get specific job:
arcget <jobID>
PROBLEM: completely silent; in fact, does not print anything at all. Must report something like "<N> job(s) received"
PROBLEM: on Windows, cannot create the download directory:
C:\Users\oxana>arcget http://knowarc1.grid.niif.hu:50006/arex/247131256752821488940856
[2009-10-29 11:01:12] [Arc.UserConfig] [WARNING] [7648/35793680] System configuration file (/usr/i686-pc-mingw32/sys-root/mingw/etc/arc/client.conf) does not exists.
[2009-10-29 11:01:18] [Arc.URL] [WARNING] [7648/35793680] Attempt to assign relative path to URL - making it absolute
[2009-10-29 11:01:20] [Arc.DataPoint.File] [ERROR] [7648/35793680] Failed to create/find directory /C:\Users\oxana\247131256752821488940856, (22)
[2009-10-29 11:01:20] [Arc.DataMover] [ERROR] [7648/35793680] Failed to start writing to destination: file:/C:\Users\oxana\247131256752821488940856\primenumbers
FIXED PROBLEM: initially failed on all A-REX instances at NIIF because of a linefeed symbol in the A-REX configuration file (?! ask Gabor to elaborate).
- Get specific job, keeping the job on the site - OK
arcget -k https://knowarc1.grid.niif.hu:60000/arex/1056712564879191002972630
- Get jobs on a specific cluster - OK
arcget -c <url>
- Get jobs stored in joblist - OK
arcget -j joblist
- Get jobs having the specified status - OK
arcget -s Finished
- Save jobs into another directory - OK
~ > pwd
/home/oxana
~ > arcget -D /tmp gsiftp://grid.tsl.uu.se:2811/jobs/802412574550371508616905
~ > ls /tmp/802412574550371508616905/
job.gmlog  job.log
arcclean
Clean all jobs
arcclean -a
Clean specific job
arcclean <jobID>
Clean jobs on a specific cluster
arcclean -c <url>
Clean job specified in joblist
arcclean -j joblist
Clean jobs with the specified status
arcclean -s Failed
Force cleaning jobs
arcclean -f -j joblist
arckill
- Kill a specific job - OK
~ > arckill gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
PROBLEM: Completely silent. Must report "<N> jobs are scheduled for slaughter"
- Kill all jobs - OK
arckill -a
- Kill jobs in joblist - OK
arckill -j joblist
- Kill jobs on a specific cluster - OK
arckill -c <url>
- Kill jobs with specified status - OK
arckill -s Running
- Kill job, but keep files on server and retrieve files afterwards - OK
arckill -k <jobID>
arcget <jobID>
arcinfo
- The man page appears to be a copy-and-paste of the arcstat one
arcinfo
- When the proxy expires and the configuration has https URLs, a segmentation fault occurs randomly (in RC2 on Jaunty; not found in RC3 on Fedora 11):
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] Failed to establish SSL connection
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] SSL error: 336151573 - SSL routines:SSL3_READ_BYTES:sslv3 alert certificate expired
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] Failed to send content of buffer
Segmentation fault
- When ARC0 index services are listed in defaultservices, errors are produced, and yet execution continues:
[2009-10-25 02:00:41] [Arc.Plugin] [ERROR] [26784/25116240] Could not find loadable module by name (empty) ((empty))
[2009-10-25 02:00:41] [Arc.Plugin] [ERROR] [26784/25116240] Could not find loadable module by name ARC0 and HED:TargetRetriever ((empty))
SOLUTION: demote to a higher debug level, provide more informative text
arcinfo -z client-gabor.conf
No issues
Query cluster and index server
arcinfo -c <url>
arcinfo -i <url>
Repeat the above with the long-format '-l' flag.
arccat
- Concatenate output (stdout, stderr and gmlog) of a specific job - did not work on NIIF's servers, but has since been fixed.
- Concatenate output for all jobs: proceeds through all jobs as expected
arccat -a
- Concatenate output for all jobs on a specific cluster: proceeds through all jobs as expected
arccat -c arex1
- Concatenate output for jobs with specified status: proceeds through all jobs as expected
arccat -s Finished -a
- Get gmlog - works
arccat -l -c grid-tsl
arcresub
- Re-submission from a site: something works
arcresub -c arex1
PROBLEM: extreme verbosity; prints tons of output and even resubmits something, but it is impossible to match failures to jobs and to understand why something was not resubmitted
- Resubmit a specific job, or resubmit to a specific target - never seems to work
arcresub <jobID>
arcresub -q <url> <jobID>
arcresub -m <jobID>
~> arcresub -q arex1 gsiftp://grad.uppmax.uu.se:2811/jobs/2231012574512572087245670
Job submission aborted because no clusters returned any information
PROBLEM: for ARC0 and ARC1, keeps printing "Job submission aborted because no clusters returned any information"
arcmigrate
arcmigrate <jobID>
PROBLEM: the man page says that jobs can be migrated in the Running/Executing/Queuing states
PROBLEM: before announcing that the job is not in a queuing state, arcmigrate still polls the entire information system for targets. It should be the other way around.
- Migrate to a specific site: does not respect the target
arcmigrate -q <url> <jobID>
PROBLEM: polls entire information system even when -q is specified, and picks another target
PROBLEM: extremely verbose:
> arcmigrate https://knowarc1.grid.niif.hu:60000/arex/2412612574621581805552407 ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: The service status could not be retrieved ERROR: Failed to bind to ldap server (topaasi.grid.utu.fi) ERROR: Failed to bind to ldap server (kvartsi.hut.fi) ERROR: Failed to bind to ldap server (spektroliitti.lut.fi) ERROR: Failed to bind to ldap server (akaatti.tut.fi) ERROR: Failed to bind to ldap server (opaali.phys.jyu.fi) ERROR: Conversion failed: adotf ERROR: Conversion failed: adotf ERROR: Invalid period string: 4320.0 ERROR: Invalid period string: 120.0 ERROR: Conversion failed: - ERROR: Ldap bind timeout (lcg.bitp.kiev.ua) ERROR: Failed to bind to ldap server (hexgrid.bccs.uib.no) WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a CREAM cluster. WARNING: Cannot migrate to a ARC0 cluster. WARNING: Cannot migrate to a CREAM cluster. Job migrated with jobid: https://knowarc1.grid.niif.hu:60000/arex/2412612574622531164449078
arcrenew
- Proxy renewal
arcrenew <jobID>
PROBLEM: completely silent. There is no easy way to check whether it succeeded or not: nothing shows the proxy expiration time on ARC0, and while ARC1 shows the proxy expiration time, renewal does not work:
> arcrenew https://knowarc1.grid.niif.hu:60000/arex/2412612574608872038873988
ERROR: Renewal of ARC1 jobs is not supported
ERROR: Failed renewing job https://knowarc1.grid.niif.hu:60000/arex/2412612574608872038873988
PROBLEM: is this actually true, that ARC1 jobs cannot get their proxies renewed?!
arcrenew -a
arcrenew -s <status> -a
arcrenew -j joblist
arcrenew -c <url>
arcresume
> arcresume http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456
ERROR: Job http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456 does not report a resumable state
ERROR: Failed to find delegation credentials in client configuration
ERROR: Failed resuming job http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456
PROBLEM: requires a "resumable state", and yet nothing explains what the resumable states are.
PROBLEM: What are "credentials in client configuration"? All credentials were fine before and after.
Comment (Katarina):
Tested resume by "hiding" the input file so that the job failed in the PREPARING state. Resume works only for ARC0 computing elements (sending an xRSL job to ARC0). Resume does not work for the arexes.
arcslcs
arcslcs has an incompatible usage of "-c" (should be "-z")
chelonia
Has an incompatible usage of "-v" (should be "-d"); has no "-h" option
Does not print method help, as advertised:
chelonia modify
/usr/bin/chelonia:239: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5
ERROR: ARC python client library not found (maybe PYTHONPATH is not set properly?)
If you want to run without the ARC client libraries, use the '-w' flag
(the -w flag has no effect)
CLI Windows
03.11 update
The new installer (from 29.10) was tested, but the installation suffers from the "copy client.conf.example" problem: the file is not found because the paths are wrong. This is fixed in SVN, but not yet available in the installer.
05.11 update
For 05.11, testing was done using the zip file installation http://knowarc1.grid.niif.hu/windows/arc1-xp-vista-compatible.zip, where the copy problem was solved. The client.conf had to be added by hand.
System
- Intel Dual Core
- Windows XP Pro
Installation
You need to install these packages:
- ftp://download.nordugrid.org/test/NorduGridARC.exe - as of 05.11 there is a problem with this installer; use the zip file instead:
http://knowarc1.grid.niif.hu/windows/arc1-xp-vista-compatible.zip
- Globus package http://www.knowarc.eu/download/GlobusToolkit_421.exe - now included in the package above
- GTK runtime libraries ??
First, the certificates. Have a copy of the .globus directory from your home directory on whatever system you normally use ready.
Open a DOS prompt or a file manager and go to your Windows home directory. It is usually found at a path like
C:\Documents and Settings\<username>\
Create a .globus directory. In Explorer one may have problems creating a directory whose name starts with a dot, so create it from a DOS prompt:
>mkdir .globus
Copy the contents over to the local .globus directory in whatever way is most convenient.
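For instance, if the original credentials are reachable on a mapped network drive (the Z: source path below is hypothetical):

>xcopy /E /I Z:\.globus .globus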
Client configuration
In the home directory (typically C:\Documents and Settings\<username>) there is an "Application Data" directory (a hidden directory; switch on visibility in Explorer via Tools -> Folder Options -> View, "Hidden files and folders").
Create a .arc folder:
> cd "Application Data" >mkdir .arc
You now have a .arc folder at C:\Documents and Settings\<username>\Application Data\.arc. Copy the client.xml file into the .arc folder: http://knowarc1.grid.niif.hu/windows/demo/client.xml. This file contains cluster aliases.
Some environment setup (assuming a default installation into the Program Files folder):
Globus and NorduGridARC need to be in the path. To do that, append the following to the Path environment variable:
;C:\Program Files\Globus\bin;C:\Program Files\NorduGridARC\bin
(This is now done automatically.)
If missing, the environment variables can be set from the Control Panel under "System Properties" -> "Advanced" -> "Environment Variables":
GLOBUS_LOCATION set to C:\Program Files\Globus
X509_CERT_DIR set to C:\Program Files\NorduGridARC\etc\grid-security\certificates
X509_USER_CERT set to %HOMEPATH%\.globus\usercert.pem
X509_USER_KEY set to %HOMEPATH%\.globus\userkey.pem
In the prompt, check what the proxy file is called:
>dir %TEMP%
>set X509_USER_PROXY=%TEMP%\x509up_u0
Tests
arcproxy
arcproxy
arcproxy -I
arcproxy -O -S knowarc.eu
All work fine
arcsub 05.11
Works fine for A-REX:
arcsub -c arc0 job.xtsl
Job submitted with jobid: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2044812563891681363597456

arcsub -c arex1 testjobwww3.jsdl
Job submitted with jobid: http://knowarc1.grid.niif.hu:50000/arex/1342912563892491848381691
Tried to add ARC0 services:
arc0=computing:ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid
arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid
but they are not recognized:
arcsub -c arc0 testjobwww3.jsdl
ERROR: Could not resolve alias "arc0" it is not defined.
Job submission aborted because no clusters returned any information
arcsub -c cream job.jdl
No CREAM targets were found.
arcstat 05.11
Works, but provides an overwhelming amount of output - one has to hunt for the actual job info. Could it somehow be reduced (by default)?
arcstat -a
C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir>arcstat -a ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/154512466269442249306) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312480960421804289383) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480960942102207750) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480986551484725515) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480987372004004782) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248178660816299854) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248178693270274735) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481788161833884369) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481794011689773322) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481794841135549564) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481795061490164002) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248179664658629600) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481806331995853780) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248181654764503202) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/19031248266727719885386) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312482688801649760492) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/19031248269664596516649) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312482708001189641421) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487751911274785528) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/135681248776174890038130) could not be retrieved. 
ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487761862099818677) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/135681248778282954184859) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487783181690167551) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/13568124877853011805764) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487798162116602801) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487815761120847641) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/1432512487896321201817824) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812489549801085755434) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/1432512490424262056397442) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/143251249042517279751030) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/278701250254923894246671) could not be retrieved. ERROR: Failed retrieving job status information ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217) could not be retrieved. 
ERROR: Failed retrieving job status information WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/154512466269442249306 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312480960421804289383 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480960942102207750 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480986551484725515 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480987372004004782 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248178660816299854 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248178693270274735 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481788161833884369 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481794011689773322 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481794841135549564 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481795061490164002 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248179664658629600 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481806331995853780 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248181654764503202 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/19031248266727719885386 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312482688801649760492 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/19031248269664596516649 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312482708001189641421 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487751911274785528 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/135681248776174890038130 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487761862099818677 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/135681248778282954184859 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487783181690167551 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/13568124877853011805764 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487798162116602801 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487815761120847641 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/1432512487896321201817824 WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812489549801085755434 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/1432512490424262056397442 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/143251249042517279751030 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/278701250254923894246671 WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217 Job: 
http://knowarc1.grid.niif.hu:50000/arex/1342912563830101935539909 State: Deleted (DELETED) Exit Code: 0 Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563846191295524800 State: Deleted (DELETED) Exit Code: 0 Job: http://knowarc1.grid.niif.hu:50000/arex/13429125638579881230185 State: Deleted (DELETED) Exit Code: 0 Job: http://knowarc1.grid.niif.hu:50000/arex/134291256386274486281849 State: Deleted (DELETED) Exit Code: 0 Job: https://knowarc1.grid.niif.hu:60000/arex/105671256386577232995984 State: Deleted (DELETED) Exit Code: 0 Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563892491848381691 State: Deleted (DELETED) Exit Code: 0 Job: http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175 State: Running (INLRMS:EXECUTED) Job: http://knowarc1.grid.niif.hu:50004/arex/244681257411410435311740 State: Running (INLRMS:EXECUTED) Job: https://knowarc1.grid.niif.hu:60000/arex/2412612574114162090478554 State: Running (INLRMS:EXECUTED) ERROR: Failed to bind to ldap server (index1.nordugrid.org) WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2437412466271271111006935 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/251511246627161344082881 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2536212466271742017330082 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/253751246627175216184522 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2541012466271751071459481 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/254151246627175623801147 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2544412466271761481664569 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1692312466287812129046274 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/294951248178561968194419 WARNING: Job state information not found: gsiftp://ce02.titan.uio.no:2811/jobs/103681248270203427504322 WARNING: Job state information not found: gsiftp://ce02.titan.uio.no:2811/jobs/106671248270242401341040 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1539512487873681581185043 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/27196124938638531541090 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2955512502547611962155131 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/199071256385636680402761 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2866212563867231353919820 WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/289281256386733861975919 ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Failed to send SOAP message ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Failed to send SOAP message ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Failed to send SOAP message ERROR: Failed to establish SSL connection ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Failed to send SOAP message ERROR: Failed to establish SSL connection 
ERROR: SSL error: -1 - (empty):(empty):(empty) ERROR: Failed to send content of buffer ERROR: Failed to send SOAP message WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=7f314be5-cf98-4a75-8c00-88cc5d9 fb05 WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=d2f479ad-18e5-42d9-8408-edd898c 7485 WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=700b85aa-8379-4b32-b68e-cb79896 6be2 WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=4320102f-de7f-4c55-816f-ca09944 9922 WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=548f2c05-47f6-4b77-ac02-6e6db86 60e9 ERROR: The job status could not be retrieved ERROR: Could not retrieve job information ERROR: The job status could not be retrieved ERROR: Could not retrieve job information ERROR: The job status could not be retrieved ERROR: Could not retrieve job information ERROR: The job status could not be retrieved ERROR: Could not retrieve job information WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM779180392 WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM898036279 WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM439661494 WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM875668628
arccat 05.11
The errors reported at the all-hands meeting were related to an error in the site configuration, as well as a missing statement in the JSDL file (DeleteOnTermination).
arccat http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
ERROR: Illegal URL - no hostname given
ERROR: Cannot output stdout for job (http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175), non-valid destination URL (c:\DOCUME~1\Katarina\LOCALS~1\Temp\arccat.QX752U)
arccat -l http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
ERROR: Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
arcls 05.11
Works fine.
arccp 05.11
Works fine for A-REX.
arcget 05.11
Not working
C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir>arcget http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175 WARNING: Attempt to assign relative path to URL - making it absolute ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt ERROR: File download failed: Can't write to destination ERROR: Failed dowloading http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175/err.txt to file:/C:\Documents and Settings\Katarina\My Docu ments\KnowARC\testdir\2431212574114041550319175\err.txt WARNING: Attempt to assign relative path to URL - making it absolute ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22) ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: File download failed: Can't write to destination ERROR: Failed dowloading 
http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175/out.txt to file:/C:\Documents and Settings\Katarina\My Docu ments\KnowARC\testdir\2431212574114041550319175\out.txt ERROR: Failed downloading job http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
arckill
Again, only tested on ARC0.
arckill <jobID>
arckill -k <jobID>
Both work.
CLI from Ubuntu binary packages
System
- Intel Centrino2 vPro
- Ubuntu 9.04 Jaunty x86_64; kernel 2.6.28-15-generic
- Pre-defined environment variables:
- X509_USER_KEY
- X509_USER_CERT
- X509_USER_PROXY
Installation
- Pre-installed:
- IGTF certificates v1.31 "classic" from tarball
- Using the NorduGrid test repo ('http://download.nordugrid.org/repos/ubuntu/ jaunty-testing .' in sources.list):
sudo apt-get install nordugrid-arc1-client
Claims to install nordugrid-arc1-lib (0.9.4~rc2-1), nordugrid-arc1-plugins-base (0.9.4~rc2-1), nordugrid-arc1-client (0.9.4~rc2-1) and a bunch of Globus packages for no obvious reason.
In reality, it does not install the ARC binaries - only man pages.
File list of nordugrid-arc1-client_0.9.4~rc2-1_amd64.deb:
/.
/usr
/usr/share
/usr/share/man
/usr/share/man/man5
/usr/share/man/man5/arcclient.xml.5.gz
/usr/share/man/man1
/usr/share/man/man1/arccp.1.gz
/usr/share/man/man1/arcsub.1.gz
/usr/share/man/man1/arcrm.1.gz
/usr/share/man/man1/arcls.1.gz
/usr/share/man/man1/arcdecision.1.gz
/usr/share/man/man1/arcstat.1.gz
/usr/share/man/man1/arcslcs.1.gz
/usr/share/man/man1/arcsync.1.gz
/usr/share/man/man1/arcclean.1.gz
/usr/share/man/man1/arcinfo.1.gz
/usr/share/man/man1/arccat.1.gz
/usr/share/man/man1/arcresub.1.gz
/usr/share/man/man1/arcget.1.gz
/usr/share/man/man1/perftest.1.gz
/usr/share/man/man1/arckill.1.gz
/usr/share/man/man1/arcecho.1.gz
/usr/share/man/man1/arcrenew.1.gz
/usr/share/man/man1/chelonia.1.gz
/usr/share/man/man1/arcresume.1.gz
/usr/share/man/man1/arcmigrate.1.gz
/usr/share/man/man1/arcsrmping.1.gz
/usr/share/man/man1/arcproxy.1.gz
/usr/share/doc
/usr/share/doc/nordugrid-arc1-client
/usr/share/doc/nordugrid-arc1-client/changelog.gz
/usr/share/doc/nordugrid-arc1-client/changelog.Debian.gz
CLI from Fedora binary packages
System
- Intel Centrino
- Fedora 11 i386; kernel 2.6.30.8-64.fc11.i686.PAE
- Pre-defined environment variables:
- X509_USER_KEY
- X509_USER_CERT
- X509_USER_PROXY
- X509_VOMS_DIR
- Katarina:
- Intel Dual core
- Fedora 7 i386
Installation
- Pre-installed:
- ARC client v 0.8.1b and respective dependencies from the NorduGrid test repo
- IGTF certificates v1.31 "classic" from the NorduGrid repo
- Using NorduGrid test repo 'http://ftp.nordugrid.org/repos/fedora/$releasever/$basearch/testing':
yum install nordugrid-arc1-client
Installs nordugrid-arc1-0.9.4-0.rc2.fc11.i586.rpm, nordugrid-arc1-client-0.9.4-0.rc2.fc11.i586.rpm and nordugrid-arc1-plugins-base-0.9.4-0.rc2.fc11.i586.rpm
Everything installs under '/usr', '/etc', etc., and not under '/usr/local' as advertised in the Guide
- FC7 (Katarina): nothing preinstalled
- yum install nordugrid-arc1
- yum install nordugrid-python
- The update to RC3 came automatically once it was available from the repository. Works perfectly.
Tests
arcproxy
arcproxy
Succeeds, but produces 2 warnings:
[2009-10-16 01:40:55] [Arc.UserConfig] [WARNING] [3446/137125512] Unknown section client, ignoring it
REASON: old ~/.arc/client.conf, which is still needed for v0.8. SOLUTION: Martin changed the verbosity level to INFO and added a debug message before processing (previously it came only after)
[2009-10-16 01:40:55] [Arc.OpenSSL] [WARNING] [3446/137125512] Failed to lock arccrypto library in memory
Tests FC7 Katarina
Differences compared to the all-hands results
- Gabor fixed some configuration on the server side
- The JSDL job was equipped with DeleteOnTermination = false
- The Globus packages were installed, so ARC0 works
Relevant parts of client.conf:
[common]
defaultservices=index:ARC1:https://knowarc2.grid.niif.hu:50000/isis

[alias]
arc0=computing:ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid
arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid
#arex1=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex
#arex2=computing:ARC1:https://knowarc1.grid.niif.hu:50000/arex
arex1=computing:ARC1:http://knowarc1.grid.niif.hu:50000/arex
arex2=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex
arex3=computing:ARC1:http://knowarc1.grid.niif.hu:50003/arex
arex4=computing:ARC1:http://knowarc1.grid.niif.hu:50004/arex
arex5=computing:ARC1:http://knowarc1.grid.niif.hu:50005/arex
arex6=computing:ARC1:http://knowarc1.grid.niif.hu:50006/arex
arex7=computing:ARC1:http://knowarc1.grid.niif.hu:50007/arex
arex8=computing:ARC1:http://knowarc1.grid.niif.hu:50008/arex
arex9=computing:ARC1:http://knowarc1.grid.niif.hu:50009/arex
Test jobs
jsdl job
Compared to the first test (all-hands meeting), <DeleteOnTermination>false</DeleteOnTermination> was added for the 03.11 tests
<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl"
               xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <JobDescription>
    <JobIdentification>
      <JobName>Windows test job</JobName>
    </JobIdentification>
    <Application>
      <posix:POSIXApplication>
        <posix:Executable>/bin/sh</posix:Executable>
        <posix:Argument>test.sh</posix:Argument>
        <posix:Argument>Testing</posix:Argument>
        <posix:Output>out.txt</posix:Output>
        <posix:Error>err.txt</posix:Error>
      </posix:POSIXApplication>
    </Application>
    <DataStaging>
      <FileName>test.sh</FileName>
      <DeleteOnTermination>false</DeleteOnTermination>
      <Source><URI>http://knowarc1.grid.niif.hu/ogf25/storage/test.sh</URI></Source>
    </DataStaging>
    <DataStaging>
      <FileName>out.txt</FileName>
      <DeleteOnTermination>false</DeleteOnTermination>
    </DataStaging>
    <DataStaging>
      <FileName>err.txt</FileName>
      <DeleteOnTermination>false</DeleteOnTermination>
    </DataStaging>
  </JobDescription>
</JobDefinition>
xrsl jobs
&("executable" = "run.sh" ) ("arguments" = "2" ) ("inputfiles" = ("run.sh" "http://www.fys.uio.no/~katarzp/test/run.sh" ) ("Makefile" "http://www.fys.uio.no/~katarzp/test/Makefile" ) ("prime.cpp" "http://www.fys.uio.no/~katarzp/test/prime.cpp" ) ) ("stderr" = "primenumbers" )("outputfiles" = ("primenumbers" "" )) ("jobname" = "ARC testjob from www" ) ("stdout" = "stdout" ) ("gmlog" = "gmlog" ) ("CPUTime" = "8" )
arcproxy
Getting a proxy works fine, but if you do not have one yet, the message may be a bit scary:
arcproxy -I
[2009-11-03 15:26:39] [Arc.OpenSSL] [WARNING] [3935/139171360] Failed to lock arccrypto library in memory
[2009-11-03 15:26:39] [Arc.Credential] [ERROR] [3935/139171360] Can't get the first byte of input BIO to get its format
Segmentation fault
VOMS proxy
arcproxy -S knowarc.eu
[2009-11-03 15:36:34] [Arc.OpenSSL] [WARNING] [4119/163444256] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
....++++++
.....++++++
[2009-11-03 15:36:38] [Arc] [ERROR] [4119/163444256] Cannot get voms server knowarc.eu information from file: /home/katarzp/.voms
Proxy generation succeeded

arcproxy -I
[2009-11-03 15:36:41] [Arc.OpenSSL] [WARNING] [4123/148125216] Failed to lock arccrypto library in memory
Subject: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=1209213760
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 59 minutes 57 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: X.509 Proxy Certificate Profile RFC compliant restricted proxy
I was a bit worried about the message "Cannot get voms server knowarc.eu information from file: /home/katarzp/.voms". My VOMS list is in the file $HOME/.voms/vomses (it works fine for the example below). Then I moved the $HOME/.voms/vomses file to $HOME/.voms and got:
arcproxy -S knowarc.eu
[2009-11-03 15:40:34] [Arc.OpenSSL] [WARNING] [4185/147871264] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
........................++++++
..........................................++++++
Contacting VOMS server (named knowarc.eu ): arthur.hep.lu.se on port: 15001
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] Failed to establish SSL connection
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] Failed to send content of buffer
[2009-11-03 15:40:38] [Arc] [ERROR] [4185/147871264] ???: STATUS_UNDEFINED (No explanation.)
Instead of copying it into what looks like the expected location, I pointed directly to the vomses file and got the same:
arcproxy -S knowarc.eu --vomses=/home/katarzp/.voms/vomses
[2009-11-03 15:57:09] [Arc.OpenSSL] [WARNING] [4453/153048608] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
...............++++++
...................++++++
Contacting VOMS server (named knowarc.eu ): arthur.hep.lu.se on port: 15001
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] Failed to establish SSL connection
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] Failed to send content of buffer
[2009-11-03 15:57:12] [Arc] [ERROR] [4453/153048608] ???: STATUS_UNDEFINED (No explanation.)

arcproxy --voms=atlas --vomses=/home/katarzp/.voms/vomses
[2009-11-03 16:00:40] [Arc.OpenSSL] [WARNING] [4513/136640032] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
....++++++
.......++++++
Contacting VOMS server (named atlas ): voms.cern.ch on port: 15001
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] Failed to establish SSL connection
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] Failed to send content of buffer
[2009-11-03 16:00:43] [Arc] [ERROR] [4513/136640032] ???: STATUS_UNDEFINED (No explanation.)
Result:
arcproxy -I
[2009-11-03 16:01:28] [Arc.OpenSSL] [WARNING] [4527/154777120] Failed to lock arccrypto library in memory
Subject: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=1222106176
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 59 minutes 15 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: X.509 Proxy Certificate Profile RFC compliant restricted proxy
This one works fine with $HOME/.voms/vomses
arcproxy -O -S knowarc.eu
arcproxy -I
[2009-11-03 15:33:12] [Arc.OpenSSL] [WARNING] [4068/138683936] Failed to lock arccrypto library in memory
Subject: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=proxy
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 57 minutes 58 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: Legacy Globus impersonation proxy
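Since the failures above seem to depend on where the vomses file lives, here is a quick sketch that reports which of the candidate locations discussed in this section exist (the search order and the /etc/vomses fallback are assumptions, not documented behaviour):

#!/usr/bin/python
# Sketch: check candidate vomses locations. Only the two $HOME paths
# appear in the tests above; /etc/vomses is an assumed system-wide spot.
import os
home = os.environ['HOME']
for path in (os.path.join(home, '.voms', 'vomses'),
             os.path.join(home, '.voms'),
             '/etc/vomses'):
    print path, '->', 'exists' if os.path.exists(path) else 'missing'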
arcsub
First tested a good old xRSL job, which downloads input from an HTTP location, compiles the code and calculates prime numbers.
arcsub -c arex1 testjobwww.xrsl
arcsub -c arex2 testjobwww.xrsl
[2009-10-26 23:47:13] [Arc.OpenSSL] [WARNING] [4461/153974056] Failed to lock arccrypto library in memory
Job submitted with jobid: https://knowarc1.grid.niif.hu:60000/arex/105671256597263621536123
Works for all NIIF A-REX services. Warning for arex2 (the only HTTPS service).
Then tested the test.jsdl job.
arcsub -c arex2 test.jsdl
[2009-10-26 16:35:34] [Arc.OpenSSL] [WARNING] [11145/160253472] Failed to lock arccrypto library in memory
Job submitted with jobid: https://knowarc1.grid.niif.hu:60000/arex/1056712565713591783296158
Works for all A-REX services. Warning for arex2 (HTTPS). At the moment (03.11) the HTTPS service (arex2) is not available.
03.11 tests
ARC0 tests, after installing the Globus packages. I think I managed to submit jobs last week. Now (03.11) this is the output, though it could be a problem on the server side.
arcsub -c arc0 testjobwww.xrsl
[2009-11-03 10:33:51] [Arc.OpenSSL] [WARNING] [3479/137664136] Failed to lock arccrypto library in memory
[2009-11-03 10:33:59] [Arc.FTPControl] [ERROR] [3479/137635360] Connect: Failed authentication: globus_ftp_control: gss_init_sec_context failed/GSS Major Status: Authentication Failed/globus_gsi_gssapi: SSLv3 handshake problems/globus_gsi_gssapi: Unable to verify remote side's credentials/globus_gsi_gssapi: SSLv3 handshake problems: Couldn't do ssl handshake/OpenSSL Error: s3_clnt.c:894: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Invalid CRL: The available CRL has expired
[2009-11-03 10:33:59] [Arc.Submitter.ARC0] [ERROR] [3479/137635360] Submit: Failed to connect
Submission to gsiftp://knowarc1.grid.niif.hu:2811/jobs failed, trying next target
[2009-11-03 10:33:59] [Arc.FTPControl] [ERROR] [3479/137635360] Connect: Failed authentication: globus_ftp_control: gss_init_sec_context failed/GSS Major Status: Authentication Failed/globus_gsi_gssapi: SSLv3 handshake problems/globus_gsi_gssapi: Unable to verify remote side's credentials/globus_gsi_gssapi: SSLv3 handshake problems: Couldn't do ssl handshake/OpenSSL Error: s3_clnt.c:894: in library: (null), function (null): (null)/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Invalid CRL: The available CRL has expired
[2009-11-03 10:33:59] [Arc.Submitter.ARC0] [ERROR] [3479/137635360] Submit: Failed to connect
Submission to gsiftp://knowarc1.grid.niif.hu:2811/jobs failed, trying next target
Job submission failed, no more possible targets
Yes, it is a server problem. Changed to arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid and submission works fine.
arcsub -c arc0 testjobwww.xrsl
[2009-11-03 10:36:29] [Arc.OpenSSL] [WARNING] [3527/154600152] Failed to lock arccrypto library in memory
Job submitted with jobid: gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
If no cluster is given, what should happen? Some defaults, presumably the defaultservices index from client.conf? At the moment the job is not submitted:
arcsub testslow.jsdl
[2009-11-03 16:15:52] [Arc.OpenSSL] [WARNING] [4856/140648360] Failed to lock arccrypto library in memory
[2009-11-03 16:15:52] [Arc] [ERROR] [4856/140648360] SSL error: 12 - (empty):(empty):(empty)
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] Failed to establish SSL connection
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] SSL error: 336134278 - SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] Failed to send content of buffer
Job submission aborted because no clusters returned any information
arcstat
Works fine for both A-REX and ARC0 jobs, except for some error messages:
arcstat -a
[2009-11-03 10:51:41] [Arc.OpenSSL] [WARNING] [3629/143394336] Failed to lock arccrypto library in memory
[2009-11-03 10:51:42] [Arc] [ERROR] [3629/143394336] SSL error: 12 - (empty):(empty):(empty)
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] Failed to establish SSL connection
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] SSL error: 336134278 - SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] Failed to send content of buffer
[2009-11-03 10:51:42] [Arc.A-REX-Client] [ERROR] [3629/143394336] Failed to send SOAP message
[2009-11-03 10:51:42] [Arc.JobController.ARC1] [ERROR] [3629/143394336] Failed retrieving job status information
[2009-11-03 10:51:43] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/241261256752838667371441
Job: http://knowarc1.grid.niif.hu:50000/arex/241611257239618970891242
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50003/arex/2431212572396251634684492
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50000/arex/241611257240471994297934
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
 State: Finished (FINISHED)
Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240535396630099
 State: Finished (FINISHED)
[2009-11-03 10:51:56] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/22701256754327498076244
[2009-11-03 10:51:56] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1622012567547201782103219
Job: gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
 Name: ARC testjob from www
 State: Finished (FINISHED)
 Exit Code: 0
For a single job there are no error messages:
arcstat -l http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
 State: Finished (FINISHED)
 Stdin: /dev/null
 Stdout: out.txt
 Stderr: err.txt
 Submitted: 2009-11-03 10:28:45
 End Time: 2009-11-03 10:31:48
 Results must be retrieved before: 2009-11-06 10:31:48
arcls
Problems? Shouldn't it show the output files?
arcls http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:24:45] [Arc.arcls] [ERROR] [11345/141080096] Failed listing metafiles
03.11 tests
A configuration error was fixed on the server side; arcls works fine:
[katarzp@localhost test]$ arcls http://knowarc1.grid.niif.hu:50003/arex/243121257240535396630099
primenumbers
gmlog
[katarzp@localhost test]$ arcls http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
err.txt
test.sh
out.txt
The ARC0 service gives a warning:
arcls gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
[2009-11-03 10:47:28] [Arc.OpenSSL] [WARNING] [3611/163272224] Failed to lock arccrypto library in memory
gmlog
stdout
primenumbers
A.K: Fixed in revision 15269 of trunk.
arccat
xRSL job - no output. Gabor R. did a fix on the NIIF server side; see the 03.11 tests below for the current status.
[katarzp@localhost test]$ arccat http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833
[2009-10-26 16:14:13] [Arc.JobController] [ERROR] [11019/155788832] File download failed: Failed while reading from source
[katarzp@localhost test]$ arccat -l http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833
[2009-10-26 16:14:23] [Arc.JobController] [ERROR] [11030/154388000] File download failed: Failed while reading from source
jsdl job - no output
[katarzp@localhost test]$ arccat http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:24:55] [Arc.JobController] [ERROR] [11347/149460512] File download failed: Failed while reading from source
[katarzp@localhost test]$ arccat -l http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:25:01] [Arc.JobController] [ERROR] [11358/156669472] Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
03.11 tests
arccat works for the jsdl job's standard output:
arccat http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
stdout from job http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
Hello Testing
Welcome 1 times
Welcome 2 times
Welcome 3 times
Welcome 4 times
Welcome 5 times
Welcome 6 times
Welcome 7 times
Welcome 8 times
Welcome 9 times
Welcome 10 times
However, arccat -l (gmlog) does not work:
arccat -l http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
[2009-11-03 11:52:52] [Arc.JobController] [ERROR] [4391/163735072] Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
Sending an xRSL job to the A-REX services behaves the opposite way: the standard output is not visible, while the gmlog can be read.
Gabor comments that it can be related to the fact that in JSDL one needs to set DeleteOnTermination to false if one wants the output to be kept. It is added in the JSDL test job, while when submitting the xRSL job one has to trust the translation. It looks like stderr is kept, but not stdout. Question: maybe DeleteOnTermination=false should be the default?
<DataStaging>
  <FileName>out.txt</FileName>
  <DeleteOnTermination>false</DeleteOnTermination>
</DataStaging>
<DataStaging>
  <FileName>err.txt</FileName>
  <DeleteOnTermination>false</DeleteOnTermination>
</DataStaging>
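For comparison, the Python bindings achieve the same effect by registering the output file in DataStaging, as the "Submit jobs" example further down this page does; a minimal sketch:

#!/usr/bin/python
# Sketch, mirroring the submit example below: a file appended to
# DataStaging is kept when the job terminates, which is the bindings'
# counterpart of DeleteOnTermination=false in JSDL.
import arc
job = arc.JobDescription()
job.Application.Executable.Name = '/bin/sh'
job.Application.Output = 'out.txt'
job_output = arc.FileType()
job_output.Name = 'out.txt'
job.DataStaging.File.append(job_output)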
Both arccat and arccat -l work for the ARC0 service:
arccat -l gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
arcget
arcget http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833
Runs for a while, but nothing seems to be downloaded, at least not to the working directory.
arcget -D /scratch/knowarc/test http://knowarc1.grid.niif.hu:50003/arex/122271256597278471460765
Nothing is downloaded, but the jobs are removed from the job list.
03.11 tests
jsdl job on arex
arcget http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
ls 2487812572440821650868745/
err.txt  out.txt  test.sh
xRSL job on A-REX (stdout not available)
arcget http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423
ls 248781257244754809208423/
gmlog  primenumbers
The content of gmlog:
<HTML>
  <HEAD> <TITLE>ARex: Job Logs</TITLE> </HEAD>
  <BODY>
    <UL>
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/description">description</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/grami">grami</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/status">status</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/input">input</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/local">local</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/errors">errors</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/output">output</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/rte">rte</A> - log file
      <LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/diag">diag</A> - log file
    </UL>
  </BODY>
</HTML>
Works fine for ARC0.
arcget gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
[2009-11-03 12:04:10] [Arc.OpenSSL] [WARNING] [4463/134698528] Failed to lock arccrypto library in memory
ls 8901257240953331117143/
gmlog  primenumbers  stdout
ls 8901257240953331117143/gmlog/
description  diag  errors  input  local  output  status
arccp
Works fine. Copied files from the session directory. Also valid on 03.11.
arccp http://knowarc1.grid.niif.hu:50003/arex/2431212572396251634684492/out.txt test_out.txt
arckill
arckill
Works fine; removes the job.
arckill -k
Keeps the job, so one can use arcls and arccat to see where the job stopped. The job list is nicely updated:
Job: http://knowarc1.grid.niif.hu:50004/arex/244681257260942572912558
 State: Killed (KILLED)
Job: http://knowarc1.grid.niif.hu:50005/arex/2465212572609451975194226
 State: Killed (KILLED)
Job: http://knowarc1.grid.niif.hu:50006/arex/2471312572609481992071298
 State: Killed (KILLED)
arckill -c arex4
Works fine with a cluster as argument.
BUT this is not working:
arckill -s INLRMS:R
[2009-11-03 16:44:27] [Arc.arckill] [ERROR] [5337/162956832] No jobs given
arckill -s Running
[2009-11-03 16:46:06] [Arc.arckill] [ERROR] [5365/165852704] No jobs given
arcclean
Works fine with a job or cluster as argument
arcclean http://knowarc1.grid.niif.hu:50003/arex/243121257261796512764518
arcclean -c arex5
BUT not with a state
arcclean -s KILLED
[2009-11-03 16:48:02] [Arc.arcclean] [ERROR] [5401/145307168] No jobs given
arcclean -s Killed
[2009-11-03 16:48:06] [Arc.arcclean] [ERROR] [5403/161232416] No jobs given
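The Python bindings do take a list of status codes, as the "Job retrieval" example below shows, so state-based selection can at least be exercised from there while the CLI -s options fail; a hedged sketch (whether "KILLED" is a status string Get() accepts is an assumption, not verified here):

#!/usr/bin/python
# Sketch based on the retrieval example below: Get() filters on a list
# of status codes. "KILLED" matches the state printed by arcstat above,
# but its acceptance by Get() is an assumption.
import arc
usercfg = arc.UserConfig("")
jobmaster = arc.JobSupervisor(usercfg, [])
for job in jobmaster.GetJobControllers():
    job.Get(["KILLED"], "/scratch/knowarc/test/", False)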
ARC client python
The ARC client library is also available in Python. It is very useful as a basis for writing Grid applications, job management scripts and the like.
Target generation
Martin Skou is working on this.
usercfg = arc.UserConfig("","")
targen = arc.TargetGenerator(usercfg)
targen.GetTargets(0, 1)
targets = targen.FoundTargets()
As well as
usercfg = arc.UserConfig("","")
print usercfg.GetSelectedServices(arc.COMPUTING)
both give segmentation faults.
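Until the crash is fixed, routing the library log to stdout at DEBUG level before touching the target generator (the same pattern as in the job-listing script below) at least shows how far it gets; a minimal sketch:

#!/usr/bin/python
# Sketch: enable DEBUG logging first, so the last printed log line
# indicates where the segfault happens.
import arc, sys
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.DEBUG)
usercfg = arc.UserConfig("","")
targen = arc.TargetGenerator(usercfg)
targen.GetTargets(0, 1)   # crashes here in this build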
Job listing
#!/usr/bin/python
import arc, sys

usercfg = arc.UserConfig("")
joblist = "jobs.xml"

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcstat.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.DEBUG)

#jobmaster = arc.JobSupervisor(usercfg, [sys.argv[1]], joblist)  # OUTDATED
jobmaster = arc.JobSupervisor(usercfg, [])
jobcontrollers = jobmaster.GetJobControllers()
print 'jobcontrollers ', jobcontrollers
for job in jobcontrollers:
    job.PrintJobStatus([], True)
Works fine. Finds all jobs previously submitted from the CLI.
Job retrieval
#!/usr/bin/python
import arc, sys

# User configuration file.
# Initialise a default user configuration.
usercfg = arc.UserConfig("")
# List of job ids to process.
jobids = sys.argv[1:]
# List of clusters to process.
clusters = []
# Job list containing active jobs.
joblist = "jobs.xml"
# Process only jobs with the following status codes.
# If the list is empty all jobs will be processed.
status = []
# Directory where the job directory will be created.
downloaddir = "/scratch/knowarc/test/"
# Keep the files on the server.
keep = False

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcget.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.DEBUG)

jobmaster = arc.JobSupervisor(usercfg, [])
jobcontrollers = jobmaster.GetJobControllers()
for job in jobcontrollers:
    job.Get(status, downloaddir, keep)
As with the CLI arcget: it seems to be doing something and removes the jobs from the job list, but no files are downloaded.
03.11 tests
The server side fix works. All jobs are downloaded and removed from the list.
ARC lib examples
The ARC client library is also available in Python. It is very useful as a basis for writing Grid applications, job management scripts and the like. The following examples show the basic job cycle: submission, status checking, output retrieval and cleaning of jobs.
Submit jobs
This basic example shows how to retrieve possible target clusters and submit jobs.
#!/usr/bin/python
import arc, sys

joblist = "jobs.xml"
usercfg = arc.UserConfig("","")

logger = arc.Logger(arc.Logger_getRootLogger(), "arcsub.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.ERROR)

targen = arc.TargetGenerator(usercfg)
targen.GetTargets(0, 1)
targets = targen.FoundTargets()

job = arc.JobDescription()
job.Application.Executable.Name = '/bin/echo'
job.Application.Executable.Argument.append('Hello')
job.Application.Executable.Argument.append('World')
job.Application.Output = 'std.out'

# std.out will not be deleted when the job is finished
job_output = arc.FileType()
job_output.Name = 'std.out'
job.DataStaging.File.append(job_output)

info = arc.XMLNode(arc.NS(), 'Jobs')
for target in targets:
    submitter = target.GetSubmitter(usercfg)
    print 'Submitting to ', target.Cluster.ConnectionURL()
    submitted = submitter.Submit(job, target)
    if submitted:
        print "Job ID: " + submitted.fullstr()
    # Uncomment break if one wants to submit only one job
    #break
Job status
This example shows how to list the status of all jobs a user has in the system. It corresponds to the "arcstat -a" command.
#!/usr/bin/python
import arc, sys

usercfg = arc.UserConfig("")
joblist = "jobs.xml"

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcstat.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.ERROR)

jobmaster = arc.JobSupervisor(usercfg, [])
jobcontrollers = jobmaster.GetJobControllers()
for job in jobcontrollers:
    job.PrintJobStatus([], True)
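Presumably the first argument to PrintJobStatus is a status filter, analogous to the list Get() takes in the retrieval example; if so (an unverified assumption), listing only finished jobs would look like this:

# Assumed variant: pass a status-code list instead of []. "FINISHED"
# matches the state strings printed by arcstat; not verified here.
for job in jobcontrollers:
    job.PrintJobStatus(["FINISHED"], True)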
Get results
This example shows how to download the jobs once they are finished. It corresponds to "arcget -a".
#!/usr/bin/python
import arc, sys

# User configuration file.
# Initialise a default user configuration.
usercfg = arc.UserConfig("","")
# List of job ids to process.
jobids = sys.argv[1:]
# List of clusters to process.
clusters = []
# Job list containing active jobs.
joblist = "jobs.xml"
# Process only jobs with the following status codes.
# If the list is empty all jobs will be processed.
status = []
# Directory where the job directory will be created.
downloaddir = "/scratch/knowarc/test/"
# Keep the files on the server.
keep = False

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcget.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.ERROR)

jobmaster = arc.JobSupervisor(usercfg, [])
jobcontrollers = jobmaster.GetJobControllers()
i = 0
for job in jobcontrollers:
    job.Get(status, downloaddir, keep)
    i += 1
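Note that the script collects jobids and clusters but never uses them: the JobSupervisor is built with an empty list. Presumably passing the job ids would restrict processing; a hedged variant (the semantics of the two-argument constructor are an assumption, inferred from the outdated three-argument form commented out in the job-listing script above):

# Assumed variant: restrict retrieval to job ids given on the command line.
jobmaster = arc.JobSupervisor(usercfg, jobids)
for job in jobmaster.GetJobControllers():
    job.Get(status, downloaddir, keep)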