This wiki is obsolete, see the NorduGrid web pages for up to date information.

NOX/Tests/Client tests

General issues

  • There is no realistic testbed:
    • Infoindex has only a few sites at NIIF
    • Various examples picked from templates are probably outdated
    • No Chelonia/Bartender endpoints are advertised anywhere
    • There is no way to know whether there are MyProxy or VOMS or SLCS services available
  • There is no way to submit a real job that does some reasonable calculation, as only 3 RTEs (2 of them are Java) are visible, and no other relevant information is visible (not even the operating system)
  • General confusion about configuration format and file name (XML vs INI): all man pages refer to an XML file, installation deploys arcclient.xml in /etc, and so forth.
    • The client configuration is ONLY INI. All references to arcclient.xml and XML client configuration have been removed.
  • Time is inconsistently treated everywhere: e.g. maximal queue wall time is reported as "P1DT6H1M" (whatever it is), proxy validity is always in GMT, etc etc etc.
    • Should be fixed in revision 15509.
  • Information is overly verbose and most of the time very cryptic, e.g.:
 Free Slots With Duration:
 P12Y4M2DT12H: 1
  • See above.

or:

 [2009-10-26 00:59:33] [Arc.Plugin] [ERROR] [1843/28687952] Could not find loadable module by name (empty) ((empty))
  • This error could be generated when specifying a wrong middleware plugin like FOO:https//example.com/cluster, where the FOO plugin does not exist. Checks have now been added to the code (revisions 15694 and 15695), which should catch this error at an earlier stage. If the above error message can be reproduced, please let me know.

or:

 [2009-10-26 00:40:45] [Arc.MCC.TLS] [ERROR] [987/14137344] SSL error: -1 - (empty):(empty):(empty)

or, randomly, at job submission:

 [2009-10-26 02:37:34] [Arc.A-REX-Client] [ERROR] [24094/21088896] The response to a service status request is Fault message: Response is not valid SOAP
  • The message has been promoted to INFO, along with other similar ERROR messages.
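
The period strings quoted above, such as "P1DT6H1M" and "P12Y4M2DT12H", look like ISO 8601 durations. As a rough illustration of how a client could present them in a more readable way, here is a minimal Python sketch. The function name describe_period is invented for this page and is not part of ARC; the sketch assumes integer fields only and ignores the week form.

```python
import re

# ISO 8601 duration pattern, e.g. "P1DT6H1M" (integer fields only, no week form).
_DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)

def describe_period(text):
    """Turn an ISO 8601 duration string into a readable phrase (hypothetical helper)."""
    match = _DURATION.match(text)
    if match is None or text in ("P", "PT"):
        raise ValueError("not an ISO 8601 duration: %r" % text)
    parts = []
    for unit in ("years", "months", "days", "hours", "minutes", "seconds"):
        value = match.group(unit)
        if value is not None:
            number = int(value)
            # Drop the plural "s" when the value is exactly 1.
            parts.append("%d %s" % (number, unit if number != 1 else unit[:-1]))
    return ", ".join(parts)

print(describe_period("P1DT6H1M"))      # 1 day, 6 hours, 1 minute
print(describe_period("P12Y4M2DT12H"))  # 12 years, 4 months, 2 days, 12 hours
```

A plain number such as "4320.0" is rejected with a ValueError by this sketch, which is one possible way to report the "Invalid period string" case explicitly.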

Summary of installed arc* commands

SVN build    Binary repos  Comment                                         Judgement
Security
arcproxy     arcproxy                                                      OK
arcslcs                    SLCS generation utility
arcdecision  arcdecision
Jobs
arccat       arccat                                                        OK
arcclean     arcclean      Completely silent                               OK
arcget       arcget                                                        OK
arcinfo      arcinfo                                                       OK
arckill      arckill       Completely silent                               OK
arcmigrate   arcmigrate    Unclear what are "migratable" states            ?
arcrenew     arcrenew      Completely silent, difficult to verify          ?
arcresub     arcresub      Problems submitting to a target                 ?
arcresume    arcresume     Difficult to test, needs a "resumable state"    ?
arcstat      arcstat                                                       OK
arcsub       arcsub        Still very verbose                              OK
arcsync      arcsync                                                       OK
Data
arcrm        arcrm
arcsrmping   arcsrmping
arccp        arccp
arcls        arcls
chelonia
Other
arcecho      arcecho

CLI from SVN on Ubuntu 9.04

System

  • Intel Core2 Duo
  • Ubuntu 9.04 Jaunty x86_64; kernel 2.6.28-15-generic
  • Pre-defined environment variables:
    • X509_USER_KEY
    • X509_USER_CERT
    • X509_USER_PROXY

Build

  • Pre-installed:
    • libglobus-gssapi-gsi-dev and all the pulled dependencies from the NorduGrid Jaunty repo - needed for MyProxy interactions
    • IGTF certificates v1.31 "classic" from tarball
  • 0.9.4rc2 tag from SVN
    • Build and installation:
  sudo ./autogen.sh
  sudo ./configure --disable-a-rex-service --disable-isi-service --disable-charon-service \
                   --disable-compiler-service --disable-hopi-service --disable-paul-service \
                   --disable-sched-service --disable-storage-service --disable-janitor-service \
                   --disable-java --disable-python --disable-doc
  sudo make install

Installs everything in '/usr/local/' (very uncommon for Ubuntu, but works)

Tests

Common tests

  • When system configuration file does not exist (e.g., it never exists on Windows), the following warning is always printed:
[2009-10-25 02:39:43] [Arc.UserConfig] [WARNING] [7380/141377160] System configuration file (/etc/arc/client.conf) does not exists.

SOLUTION: promote this to a higher debug level

  • Done in revision 15696 and 15697.
  • When no configuration file is found at all, a misleading warning is printed:
 [2009-10-25 02:25:16] [Arc.UserConfig] [WARNING] [7117/146706056] System configuration file (/etc/arc/client.conf) does not exists.

SOLUTION: Print a message that no configuration file is found at all, or add $HOME/.arc/client.conf to the list of possible candidates.

  • This is when the system configuration cannot be found. This has been solved as in the above situation.
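
To make the proposed solution concrete (report that no configuration file was found at all, instead of complaining about one particular path), here is a small Python sketch. It is not the ARC implementation: the function names and the lookup order (user file first, then system file) are assumptions made for illustration; only the two paths come from the notes above.

```python
import os

def find_client_config(candidates):
    """Return the first existing file among the candidate paths, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None

def config_message(candidates):
    """Build one message that covers the whole search, not a single path."""
    found = find_client_config(candidates)
    if found is None:
        return "No client configuration file found (looked in: %s)" % ", ".join(candidates)
    return "Using client configuration file %s" % found

# Assumed lookup order: per-user file first, then the system-wide file.
candidates = [
    os.path.join(os.path.expanduser("~"), ".arc", "client.conf"),
    "/etc/arc/client.conf",
]
print(config_message(candidates))
```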
  • In RC2, every tool produced a warning after loading the old /usr/local/etc/arc/client.conf:
  [2009-10-15 20:20:32] [Arc.UserConfig] [WARNING] [21400/14995024] Unknown section client, ignoring it

REASON: old ~/.arc/client.conf, which is still needed for v0.8. SOLUTION: Martin changes verbosity level to INFO, and adds a debug message before processing (was only after)


arc* -v

No issues

Man-pages are not localized

arcproxy

  arcproxy --help

In RC2 it was dumping the gettext header:

Usage:                                  
  arcproxy [OPTION...] Project-Id-Version: Arc
Report-Msgid-Bugs-To:                         
POT-Creation-Date: 2009-10-15 19:48+0200                                                                       
...

REASON: Empty argument string to OptionParser. ACTION: Martin fixed the code.

  man arcproxy

Has funny info:

...
COPYRIGHT
       We need to have this

FILES
AUTHOR
       Written by developers

ACTION: ask someone to synchronise this bit in man pages

  arcproxy
  arcproxy -O
  arcproxy -I
  arcproxy -I -P cow
  arcproxy -c validityPeriod="1 second"
  arcproxy -t 1

No issues

  arcproxy -C usercert-old.pem -K userkey-old.pem

When a proxy is generated from expired credentials, arcproxy fails and yet reports success, with a validity time coinciding with the certificate expiration date:

...
[2009-10-16 23:59:44] [Arc.Credential] [ERROR] [8500/24735312] Certificate verification failed
Proxy generation succeeded
Your proxy is valid until: Thu, 17 Sep 2009 15:13:03 GMT

REASON: unknown. SOLUTION: TBD

  arcproxy -T /etc

Proxy creation fails as expected (no trusted certificates found), and yet success is reported, with odd validity period:

...
[2009-10-17 00:04:10] [Arc] [ERROR] [8689/33365584] Certificate verification error: unable to get issuer certificate
[2009-10-17 00:04:10] [Arc.Credential] [ERROR] [8689/33365584] Certificate verification failed
Proxy generation succeeded
Your proxy is valid until: Sat, 17 Oct 2009 10:04:10 GMT

REASON: unknown. SOLUTION: TBD

CURIOSITY: validity time is always printed in GMT:

  ...
Proxy generation succeeded
Your proxy is valid until: Fri, 16 Oct 2009 23:51:46 GMT
  date
Fri Oct 16 23:51:53 CEST 2009
  • Fixed in revision 15700. Time is now printed in local time.
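
For comparison, the GMT timestamp from the output above can be converted to the tester's zone with a few lines of Python. This is only a sketch of the conversion, not the code from revision 15700. The fixed +2 hour offset for CEST and the function name to_zone are assumptions; a real client would use the system's local time zone.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

def to_zone(gmt_text, offset_hours, zone_name):
    """Convert an RFC 2822 style GMT timestamp to a fixed-offset zone."""
    moment = parsedate_to_datetime(gmt_text)  # timezone-aware, UTC
    zone = timezone(timedelta(hours=offset_hours), zone_name)
    return moment.astimezone(zone).strftime("%a %b %d %H:%M:%S %Z %Y")

# The proxy expiry printed by arcproxy, shown in the same style as `date`.
print(to_zone("Fri, 16 Oct 2009 23:51:46 GMT", 2, "CEST"))
# Sat Oct 17 01:51:46 CEST 2009
```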
  arcproxy -z ~/.arc/other.conf

Prefers X509_USER_* values to those specified in the configuration - unclear whether this is the expected behaviour?

VOMS tests

  arcproxy -V ./vomses -S knowarc.eu
  arcproxy -S knowarc.eu
  arcproxy -S knowarc.eu:all
  arcproxy -S knowarc.eu:list
  arcproxy -G -S knowarc.eu:all
  arcproxy -S atlas:/atlas/Role=production
  arcproxy -O -S atlas
  arcproxy -S atlas -O -c validityPeriod="5 hours"

No issues

MyProxy tests

  arcproxy -L knowarc1.grid.niif.hu -U oxana -M PUT
  arcproxy -L knowarc1.grid.niif.hu -U oxana -M GET
  arcproxy -S atlas -L knowarc1.grid.niif.hu -U oxana -M PUT
  arcproxy -L knowarc1.grid.niif.hu -U oxana -M GET

No issues; except that MyProxy server can not store VOMS extensions (as expected).

Curious usage of X509_VOMS_DIR

arcproxy appears to use X509_VOMS_DIR variable as a pointer to the vomses file (list of VOMS server contact points). The native VOMS client uses this variable to point to the directory that contains VOMS server credentials, needed to validate the proxy.

WARNING: possibility of confusion when the same variable has different meaning for different tools

arcsync

  • BIG PROBLEM: retrieves all jobs from http-AREX, from every user (a feature, actually, but a very bad one)
    • AFAIK there is nothing to do about it. The service is insecure, and if you choose to sync against an insecure service then you will get all the jobs registered there. There is no way to identify which jobs are yours and which are not.
  • man page refers to client.xml
    • Fixed in trunk.
  • man pages and help page are different: option "-m" doesn't exist, contrary to what is written in man page
    • Fixed in trunk. There should be no "-m" flag. Default is to merge, otherwise use the "-t" flag to truncate the joblist before adding sync'ed jobs.
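
The difference between the default merge and the "-t" truncate behaviour can be pictured with a short Python sketch. This only models the semantics described above; the function sync_joblist and the job IDs are hypothetical and have nothing to do with the actual arcsync code or joblist file format.

```python
def sync_joblist(local_jobs, synced_jobs, truncate=False):
    """Model of arcsync semantics: merge by default, start from empty with truncate."""
    result = [] if truncate else list(local_jobs)
    for job in synced_jobs:
        if job not in result:  # avoid duplicate entries when merging
            result.append(job)
    return result

# Hypothetical job IDs, for illustration only.
local = ["https://example.org/arex/1", "https://example.org/arex/2"]
synced = ["https://example.org/arex/2", "https://example.org/arex/3"]

print(sync_joblist(local, synced))                 # jobs 1, 2 and 3
print(sync_joblist(local, synced, truncate=True))  # jobs 2 and 3 only
```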

arcsub

General notes:

  • pity that some useful options are gone (-U, -C, -dryrun)
    • Unknown attributes should be allowed.
    • The functionality of using the -C flag can partly be achieved by using an alias. If you feel some functionality is missing, please open a feature request.
    • AFAIK dry run is not supported by A-REX. It is still possible to dry run on grid-manager, which is done by putting the dryrun attribute in the XRSL jobdescription. If the dry run functionality is required then a feature request should be opened.
  • How do I know which brokers are available?
    • At the moment there is no way. We have been talking about this in the arclib team, however nothing has been done yet. A feature request should be opened on this...
  • How do I plug in own broker?
    • Read the manual... :o)
  • FastestCPU does not exist, though is documented in the deliverable
    • The name was changed to Benchmark. Without arguments the Benchmark broker is equivalent to the old FastestCPU.
  • The tool is still very verbose: below is an example of a simple echo job submission with default settings
 > arcsub echo-stage.jsdl
ERROR: Failed to bind to ldap server (index1.nordugrid.org)
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: The service status could not be retrieved
ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk)
ERROR: Failed to bind to ldap server (topaasi.grid.utu.fi)
ERROR: Failed to bind to ldap server (spektroliitti.lut.fi)
ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk)
ERROR: Failed to bind to ldap server (kvartsi.hut.fi)
ERROR: Failed to bind to ldap server (akaatti.tut.fi)
ERROR: Failed to bind to ldap server (gridsrv4.nbi.dk)
ERROR: Failed to bind to ldap server (opaali.phys.jyu.fi)
ERROR: Conversion failed: adotf
ERROR: Conversion failed: adotf
ERROR: Invalid period string: 4320.0
ERROR: Invalid period string: 120.0
ERROR: Conversion failed: -
ERROR: Ldap bind timeout (lcg.bitp.kiev.ua)
ERROR: Failed to bind to ldap server (hexgrid.bccs.uib.no)
ERROR: Connect: Failed authentication: 535 Not allowed
ERROR: Submit: Failed to connect
Submission to gsiftp://neolith2.nsc.liu.se:2811/jobs failed, trying next target
ERROR: Connect: Failed authentication: 535 Not allowed
ERROR: Submit: Failed to connect
Submission to gsiftp://svea.c3se.chalmers.se:2811/jobs failed, trying next target
ERROR: Connect: Failed authentication: 535 Not allowed
ERROR: Submit: Failed to connect
Submission to gsiftp://neolith2.nsc.liu.se:2811/jobs failed, trying next target
ERROR: Connect: Failed authentication: 535 Not allowed
ERROR: Submit: Failed to connect
Submission to gsiftp://gtpps2.csc.fi:2811/jobs failed, trying next target
ERROR: Can not create the SSL Context object
ERROR: SSL error: 336236785 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Creating delegation to CREAM delegation service failed
ERROR: Creating delegation failed
Submission to https://cream.grid.upjs.sk:8443/ce-cream/services failed, trying next target
Job submitted with jobid: gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
  • Unresolved
  • When a job is killed or otherwise is not in joblist, misleading ERROR is printed; some more informative message is needed
 ~ > arckill gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
 ~ > arcstat gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
 WARNING: Job not found in job list: gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080
 ERROR: No job controllers loaded
  • Unresolved

Tests:

  • Normal job submission succeeds only at A-REX and GM sites, though with plenty of errors and warnings (see arcsync above). CREAM and UNICORE do not seem to work.
 arcsub echo.jsdl
  • Fixes regarding CREAM support have been made; however, there is still an unresolved issue, see bug 1755.
  • Submission to a specific ARC0 site fails
  arcsub -c ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid echo.jsdl
 [2009-10-26 00:39:19] [Arc.Plugin] [ERROR] [787/16326224] Could not find loadable module by name (empty) ((empty))
 [2009-10-26 00:39:19] [Arc.Plugin] [ERROR] [787/16326224] Could not find loadable module by name ARC0 and HED:TargetRetriever ((empty))
 [2009-10-26 00:39:19] [Arc.Loader] [ERROR] [787/16326224] TargetRetriever ARC0 could not be created
 Job submission aborted because no clusters returned any information

REASON: it is highly non-trivial to figure that the reason is some missing Globus libraries during the ./configure step. By guessing, trial and error, and 3 rebuilds, it started working.

  • Checks have been added which will give a more informative error message.
  • AFAIK UNICORE is currently not supported.
  • Submission to specific ARC1 sites succeeds
  • Submission to aliases: works as expected (arexes is a recursive alias)
  arcsub -c arex1 echo.jsdl
  arcsub -c arexes echo.jsdl
  • Submission using specified broker works as expected
  arcsub -b FastestQueue echo.jsdl
  • PROBLEM: Setting brokername=Cow in client configuration SUCCEEDS!
 arcsub -c arexes echo.jsdl
 Job submitted with jobid: http://knowarc1.grid.niif.hu:50000/arex/1342912565143791867204702
  • A check has been added in revision 15704, which gives an error message if the broker is not found.
  • Verify job description - works as expected
  arcsub -c ARC1:https//example.org:60000/arex job.jsdl -x
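
The kind of broker check described above (rejecting brokername=Cow instead of silently accepting it) could look roughly like the following Python sketch. It is not the code from revision 15704. The function check_broker is hypothetical, and the list of names contains only the two brokers mentioned on this page; the real set of available brokers may well differ.

```python
# Broker names mentioned on this page; the real list of plugins may differ.
KNOWN_BROKERS = ("FastestQueue", "Benchmark")

def check_broker(name, known=KNOWN_BROKERS):
    """Return the broker name if it is known, otherwise raise a clear error."""
    if name not in known:
        raise ValueError(
            "Broker %r not found; available brokers: %s" % (name, ", ".join(known))
        )
    return name

print(check_broker("FastestQueue"))
try:
    check_broker("Cow")
except ValueError as problem:
    print("ERROR: %s" % problem)
```

Listing the available names in the error message would also partly answer the "which brokers are available?" question raised in the general notes.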

arcstat

PROBLEM: man page mentions option "-i", but it is not implemented. Fixed in trunk.

  • Query all jobs: works as expected, except for being very verbose, complaining about "Could not find loadable module by name..."
  arcstat -a
  • Query specific job: both short and long versions work as expected
  arcstat <jobID>
  arcstat -l <jobID>

PROBLEM: very little information from A-REX, even with -l. Basically, I don't even know whether the jobs are mine.

  • The information retrieval has been updated. Please re-do the evaluation.
  • Query jobs on a specific cluster:
  arcstat -c <url>

PROBLEM: when cluster alias can not be resolved, proceeds to stat all jobs!

~ > arcstat -c cow
ERROR: Could not resolve alias "cow" it is not defined.
ERROR: Failed to bind to ldap server (pgs02.grid.upjs.sk)
WARNING: Job state information not found: gsiftp://pgs02.grid.upjs.sk:2811/jobs/18401226184112135916457
WARNING: Job state information not found: gsiftp://shiva.rhi.hi.is:2811/jobs/1040812261924051614242126 
WARNING: Job state information not found: gsiftp://gateway01.dcsc.ku.dk:2811/jobs/94731226193948987458939
WARNING: Job state information not found: gsiftp://gateway01.dcsc.ku.dk:2811/jobs/100911226193970685376796
WARNING: Job state information not found: gsiftp://arc-ce.smokerings.nsc.liu.se:2811/jobs/281991235081230880182882
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2434012350812551337331521      
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/996312350820311427289495       
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/15706123508214085320563        
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/20871235082527988736939        
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/940512415410181382505774                 
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/1149812415412531723520409                
WARNING: Job state information not found: gsiftp://benedict.grid.aau.dk:2811/jobs/17591241541267971426534         
WARNING: Job state information not found: gsiftp://lscf.nbi.dk:2811/jobs/119751241541281615800024                 
WARNING: Job state information not found: gsiftp://siri.lunarc.lu.se:2811/jobs/188812553078141140454717           
WARNING: Job state information not found: gsiftp://siri.lunarc.lu.se:2811/jobs/256371255306399952746572           
Job: gsiftp://grad.uppmax.uu.se:2811/jobs/2231012574512572087245670
 Name: JSDL stdin/stdout test
 State: Failed (FAILED)
 Error: Failed extracting LRMS ID due to some internal error 
      
Job: gsiftp://grid.tsl.uu.se:2811/jobs/802412574550371508616905
 Name: Test job
 State: Finished (FINISHED) 
   
...
  • Unresolved
  • Query jobs in joblist: works as expected
  arcstat -j joblist
  • Query jobs with a given status: only works when "-a" is specified
   arcstat -s Finished
   [2009-10-26 01:01:23] [Arc.arcstat] [ERROR] [2073/39910992] No jobs given

   arcstat -a -s Finished
   [2009-10-26 01:00:40] [Arc.Plugin] [ERROR] [2062/10149456] Could not find loadable module by name (empty) ((empty))
   [2009-10-26 01:00:40] [Arc.Plugin] [ERROR] [2062/10149456] Could not find loadable module by name ARC0 and HED:JobController ((empty))
   [2009-10-26 01:00:40] [Arc.Loader] [ERROR] [2062/10149456] JobController ARC0 could not be created
   [2009-10-26 01:00:40] [Arc.A-REX-Client] [ERROR] [2062/10149456] The status of the job (https://knowarc1.grid.niif.hu:60000/arex/18559124152278119431851) could not be retrieved.
   [2009-10-26 01:00:40] [Arc.JobController.ARC1] [ERROR] [2062/10149456] Failed retrieving job status information
   [2009-10-26 01:00:44] [Arc.JobController] [WARNING] [2062/10149456] Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/18559124152278119431851
   Job: https://knowarc1.grid.niif.hu:60000/arex/1056712565140411350802261
    State: Finished (FINISHED)
 
   Job: http://knowarc1.grid.niif.hu:50003/arex/122271256514211711205873
    State: Finished (FINISHED)
 
   Job: http://knowarc1.grid.niif.hu:50003/arex/122271256514322848951627
    State: Finished (FINISHED)
 
   Job: http://knowarc1.grid.niif.hu:50000/arex/134291256514330947900441
    State: Finished (FINISHED)
  
   Job: http://knowarc1.grid.niif.hu:50000/arex/1342912565143791867204702
    State: Finished (FINISHED)
  • In revision 15707, it is possible to only specify -s <state> in which case all jobs with <state> in the joblist will be queried.
  • The error messages from A-REX-Client and JobController.ARC1 have been promoted to VERBOSE and INFO respectively.
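
Regarding the "arcstat -c cow" problem above: one way to avoid falling back to all jobs is to treat an unresolved alias as a hard error. The Python sketch below shows the idea, including recursive aliases such as arexes. The names resolve_alias and UnknownAlias, as well as the alias table with example.org URLs, are all made up for illustration and do not reflect the ARC client library.

```python
class UnknownAlias(Exception):
    """Raised when an alias is undefined or refers back to itself."""

def resolve_alias(name, aliases, _path=()):
    """Expand an alias (possibly recursive) into a flat list of service URLs."""
    if name in _path:
        raise UnknownAlias('alias "%s" refers back to itself' % name)
    if name not in aliases:
        raise UnknownAlias('Could not resolve alias "%s", it is not defined' % name)
    targets = []
    for entry in aliases[name]:
        if entry in aliases:
            # Nested alias: expand it, remembering the path to detect loops.
            targets.extend(resolve_alias(entry, aliases, _path + (name,)))
        else:
            targets.append(entry)
    return targets

# Hypothetical alias table; the URLs are placeholders.
ALIASES = {
    "arex1": ["ARC1:https://arex1.example.org:60000/arex"],
    "arex2": ["ARC1:https://arex2.example.org:60000/arex"],
    "arexes": ["arex1", "arex2"],
}

print(resolve_alias("arexes", ALIASES))
try:
    resolve_alias("cow", ALIASES)
except UnknownAlias as problem:
    # The caller stops here instead of querying every job in the job list.
    print("ERROR: %s" % problem)
```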

arcget

  • Man page refers to client.xml - fixed in RC5
  • Get all jobs - OK
  arcget -a
  • Get specific job:
  arcget <jobID>

PROBLEM: completely silent; in fact, does not print anything at all. Must report something like "<N> job(s) received"

PROBLEM: On Windows, can not create download directory:

C:\Users\oxana>arcget http://knowarc1.grid.niif.hu:50006/arex/247131256752821488940856
[2009-10-29 11:01:12] [Arc.UserConfig] [WARNING] [7648/35793680] System configuration file (/usr/i686-pc-mingw32/sys-root/mingw/etc/arc/client.conf) does not exists.
[2009-10-29 11:01:18] [Arc.URL] [WARNING] [7648/35793680] Attempt to assign relative path to URL - making it absolute
[2009-10-29 11:01:20] [Arc.DataPoint.File] [ERROR] [7648/35793680] Failed to create/find directory /C:\Users\oxana\247131256752821488940856, (22)
[2009-10-29 11:01:20] [Arc.DataMover] [ERROR] [7648/35793680] Failed to start writing to destination: file:/C:\Users\oxana\247131256752821488940856\primenumbers
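
The log above shows a destination of the form file:/C:\Users\..., which mixes URL and Windows path syntax. For reference, this is what a well-formed file URL for the same directory looks like when built with Python's pathlib. It only illustrates the expected form and is not how ARC constructs its URLs; the helper name download_url is invented here.

```python
from pathlib import PureWindowsPath

def download_url(base_dir, job_id, file_name):
    """Build a file URL for a job's download directory on Windows."""
    target = PureWindowsPath(base_dir) / job_id / file_name
    return target.as_uri()

print(download_url(r"C:\Users\oxana", "247131256752821488940856", "primenumbers"))
# file:///C:/Users/oxana/247131256752821488940856/primenumbers
```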

FIXED PROBLEM: Initially, was failing at all A-REX instances at NIIF because of a linefeed symbol in the A-REX configuration file (?! ask Gabor to elaborate).

  • Get specific job, keeping the job on the site - OK
  arcget -k https://knowarc1.grid.niif.hu:60000/arex/1056712564879191002972630
  • Get jobs on a specific cluster - OK
  arcget -c <url>
  • Get jobs stored in joblist - OK
  arcget -j joblist
  • Get jobs having the specified status - OK
  arcget -s Finished
  • Save jobs into another directory - OK
  ~ > pwd
  /home/oxana      
  ~ > arcget -D /tmp gsiftp://grid.tsl.uu.se:2811/jobs/802412574550371508616905 
  ~ > ls /tmp/802412574550371508616905/
  job.gmlog  job.log

arcclean

Clean all jobs

  arcclean -a

Clean specific job

  arcclean <jobID>

Clean jobs on a specific cluster

  arcclean -c <url>

Clean job specified in joblist

  arcclean -j joblist

Clean jobs with the specified status

  arcclean -s Failed

Force cleaning jobs

  arcclean -f -j joblist

arckill

  • Kill a specific job - OK
~ > arckill gsiftp://jeannedarc.hpc2n.umu.se:2811/jobs/58091257459269795497080

PROBLEM: Completely silent. Must report "<N> jobs are scheduled for slaughter"

  • Kill all jobs - OK
  arckill -a
  • Kill jobs in joblist - OK
  arckill -j joblist
  • Kill jobs on a specific cluster - OK
  arckill -c <url>
  • Kill jobs with specified status - OK
  arckill -s Running
  • Kill job, but keep files on server and retrieve files afterwards - OK
  arckill -k <jobID>
  arcget <jobID>

arcinfo

  • Man pages appear to be a copy-and-paste of arcstat
 arcinfo
  • When proxy expires, and configuration has https URLs, segmentation fault is randomly occurring (in RC2 on Jaunty; not found in RC3 on Fedora 11):
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] Failed to establish SSL connection
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] SSL error: 336151573 - SSL routines:SSL3_READ_BYTES:sslv3 alert certificate expired
[2009-10-25 02:57:49] [Arc.MCC.TLS] [ERROR] [26558/9481552] Failed to send content of buffer
Segmentation fault
  • When ARC0 index services are listed in defaultservices, errors are produced, and yet execution continues:
[2009-10-25 02:00:41] [Arc.Plugin] [ERROR] [26784/25116240] Could not find loadable module by name (empty) ((empty))
[2009-10-25 02:00:41] [Arc.Plugin] [ERROR] [26784/25116240] Could not find loadable module by name ARC0 and HED:TargetRetriever ((empty))

SOLUTION: Demote to a higher debug level, provide more informative text

arcinfo -z client-gabor.conf

No issues


Query cluster and index server

  arcinfo -c <url>
  arcinfo -i <url>

Repeat above with the long option '-l' flag.

arccat

  • Concatenate output (stdout, stderr and gmlog) of a specific job - didn't work on NIIF's servers, but got fixed since.
  • Concatenate output for all jobs: proceeds through all jobs as expected
  arccat -a 
  • Concatenate output for all jobs on a specific cluster: proceeds through all jobs as expected
  arccat -c arex1
  • Concatenate output for jobs with specified status: proceeds through all jobs as expected
  arccat -s Finished -a
  • Get gmlog - works
  arccat -l -c grid-tsl

arcresub

  • Re-submission from a site: something works
  arcresub -c arex1

PROBLEM: extreme verbosity, prints tons of output and even resubmits something, but it is impossible to match failures to jobs and understand why something was not resubmitted

  • Resubmit specific job, or to a specific target - never seems to work
  arcresub <jobID>
  arcresub -q <url> <jobID>
  arcresub -m <jobID>
~> arcresub -q arex1 gsiftp://grad.uppmax.uu.se:2811/jobs/2231012574512572087245670
Job submission aborted because no clusters returned any information

PROBLEM: for ARC0 and ARC1, keeps printing "Job submission aborted because no clusters returned any information"

arcmigrate

  arcmigrate <jobID>

PROBLEM: man-page says that jobs can be migrated in Running/Executing/Queuing states

PROBLEM: before announcing that the job is not in queueing state, arcmigrate still polls the entire information system for targets. It should be the other way around.

  • migrate to a specific site: does not respect the target
  arcmigrate -q <url> <jobID>

PROBLEM: polls entire information system even when -q is specified, and picks another target

PROBLEM: extremely verbose:

> arcmigrate https://knowarc1.grid.niif.hu:60000/arex/2412612574621581805552407
ERROR: Failed to establish SSL connection                                                            
ERROR: SSL error: -1 - (empty):(empty):(empty)                                                       
ERROR: Failed to send content of buffer                                                              
ERROR: The service status could not be retrieved                                                     
ERROR: Failed to bind to ldap server (topaasi.grid.utu.fi)                                           
ERROR: Failed to bind to ldap server (kvartsi.hut.fi)                                                
ERROR: Failed to bind to ldap server (spektroliitti.lut.fi)                                          
ERROR: Failed to bind to ldap server (akaatti.tut.fi)                                                
ERROR: Failed to bind to ldap server (opaali.phys.jyu.fi)                                            
ERROR: Conversion failed: adotf                                                                      
ERROR: Conversion failed: adotf                                                                      
ERROR: Invalid period string: 4320.0                                                                 
ERROR: Invalid period string: 120.0                                                                  
ERROR: Conversion failed: -                                                                          
ERROR: Ldap bind timeout (lcg.bitp.kiev.ua)                                                          
ERROR: Failed to bind to ldap server (hexgrid.bccs.uib.no)                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.                                                           
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a CREAM cluster.
WARNING: Cannot migrate to a ARC0 cluster.
WARNING: Cannot migrate to a CREAM cluster.
Job migrated with jobid: https://knowarc1.grid.niif.hu:60000/arex/2412612574622531164449078
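
One simple way to tame output like the above would be to collapse repeated messages into a single line with a count. The Python sketch below illustrates the idea on a few lines copied from the log; it is a suggestion for presentation only, and the function name summarise is not an existing ARC option or API.

```python
from collections import Counter

def summarise(lines):
    """Collapse identical messages, keeping first-seen order and a repeat count."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return ["%s (x%d)" % (msg, n) if n > 1 else msg for msg, n in counts.items()]

sample = [
    "ERROR: Conversion failed: adotf",
    "ERROR: Conversion failed: adotf",
    "WARNING: Cannot migrate to a ARC0 cluster.",
    "WARNING: Cannot migrate to a CREAM cluster.",
    "WARNING: Cannot migrate to a ARC0 cluster.",
]
for line in summarise(sample):
    print(line)
```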

arcrenew

  • Proxy renewal
  arcrenew <jobID>

PROBLEM: Completely silent. No easy way to check whether it succeeded or not: nothing shows proxy expiration time on ARC0, and while ARC1 shows proxy expiration time, renewal does not work:

> arcrenew https://knowarc1.grid.niif.hu:60000/arex/2412612574608872038873988
ERROR: Renewal of ARC1 jobs is not supported
ERROR: Failed renewing job https://knowarc1.grid.niif.hu:60000/arex/2412612574608872038873988

PROBLEM: Is this actually true and ARC1 jobs can not get proxies renewed?!

  arcrenew -a
  arcrenew -s <status> -a
  arcrenew -j joblist
  arcrenew -c <url>

arcresume

> arcresume http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456
ERROR: Job http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456 does not report a resumable state
ERROR: Failed to find delegation credentials in client configuration
ERROR: Failed resuming job http://knowarc1.grid.niif.hu:50000/arex/241611257456176468896456

PROBLEM: Requires a "resumable state" and yet nothing explains what are resumable states?

PROBLEM: What are "credentials in client configuration"? All credentials were fine before and after.

Comment : (Katarina)

Tested resume by "hiding" the input file so that the job failed in the PREPARING state. Resume works only for ARC0 computing elements (sending an xrsl job to ARC0). Resume is not working for the arexes.

arcslcs

arcslcs has incompatible usage of "-c" (should be "-z")

chelonia

Has incompatible usage of "-v" (should be "-d"); has no option "-h"

Does not print method help, as advertised:

 chelonia modify
 /usr/bin/chelonia:239: DeprecationWarning: the md5 module is deprecated; use hashlib instead
   import md5
 ERROR: ARC python client library not found (maybe PYTHONPATH is not set properly?)
      If you want to run without the ARC client libraries, use the '-w' flag

(-w flag has no effect)

CLI Windows

03.11 update

The new installer (from 29.10) was tested but the installation suffers from the "copy client.conf.example" problem. The file is not found as the paths are wrong. It is fixed in svn, but not available in the installer.

05.11 update

The 05.11 testing was done using the zip file installation (http://knowarc1.grid.niif.hu/windows/arc1-xp-vista-compatible.zip), where the copy problem was solved. The client.conf had to be added by hand.

System

  • Intel Dual Core
  • Windows XP Pro

Installation

You need to install these packages:

http://knowarc1.grid.niif.hu/windows/arc1-xp-vista-compatible.zip

First, the certificate. Have ready a copy of the .globus directory from your home directory on whatever system you are using.

Open a dos prompt or a file manager. Go to your Windows home directory. It is usually found at a path like

C:\Documents and Settings\<username>\

Create a .globus directory. In the Explorer one may get problems with creating a directory starting with a dot. Create the directory from a dos prompt

>mkdir .globus

Copy the content over to the local .globus directory whatever way is most simple.

Client configuration

In the home directory (typically C:\Documents and Settings\<username>) there is an "Application Data" directory (hidden directory, switch on visibility in Explorer Tools menu -> Folder option -> View, "Hidden files and folders")

Create a .arc folder

> cd "Application Data"
>mkdir .arc

Now that there is a .arc folder in C:\Documents and Settings\<username>\Application Data\.arc, copy the client.xml file from http://knowarc1.grid.niif.hu/windows/demo/client.xml into the .arc folder. This file contains cluster aliases.

Some environment setup (assuming a very default installation in the Program Files folder):

Globus and NorduGridARC need to be in the path. To do that, append the following to the environment variable Path:

;C:\Program Files\Globus\bin;C:\Program Files\NorduGridARC\bin  

This is done automatically now

If any of the following environment variables are missing, they can be set from the Control Panel in "System Properties" -> "Advanced" -> "Environment variables":

GLOBUS_LOCATION   set to C:\Program Files\Globus
X509_CERT_DIR     set to C:\Program Files\NorduGridARC\etc\grid-security\certificates
X509_USER_CERT    set to %HOMEPATH%\.globus\usercert.pem
X509_USER_KEY     set to %HOMEPATH%\.globus\userkey.pem

At the prompt, check what the proxy file is called and set X509_USER_PROXY accordingly:

>dir %TEMP%
>set X509_USER_PROXY=%TEMP%\x509up_u0
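
The environment setup above can be verified with a short script. The following is a minimal sketch, not part of the ARC distribution: it only checks that each variable named above is set and that the path it holds exists. The function name and the status strings are made up for illustration.

```python
import os

# Variables named in the setup above; each is expected to hold an existing path.
REQUIRED = [
    "GLOBUS_LOCATION",
    "X509_CERT_DIR",
    "X509_USER_CERT",
    "X509_USER_KEY",
    "X509_USER_PROXY",
]


def check_environment(env):
    """Return a dict mapping each variable to 'unset', 'missing path' or 'ok'."""
    report = {}
    for name in REQUIRED:
        value = env.get(name)
        if not value:
            report[name] = "unset"
        elif not os.path.exists(os.path.expandvars(value)):
            report[name] = "missing path"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_environment(os.environ).items():
        print(f"{name}: {status}")
```

A "missing path" result for X509_USER_KEY would, for example, catch a stray space inside the configured path.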

Tests

arcproxy

arcproxy 
arcproxy -I
arcproxy -O -S knowarc.eu

All work fine

arcsub 05.11

Works fine for arex

arcsub -c arc0 job.xtsl
Job submitted with jobid: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2044812563891681363597456
arcsub -c arex1 testjobwww3.jsdl
Job submitted with jobid: http://knowarc1.grid.niif.hu:50000/arex/1342912563892491848381691

Tried to add ARC0 services

arc0=computing:ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid
arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid

but they are not recognized.

arcsub -c arc0 testjobwww3.jsdl
ERROR: Could not resolve alias "arc0" it is not defined.
Job submission aborted because no clusters returned any information
arcsub -c cream job.jdl

No cream targets found.

arcstat 05.11

Works, but produces an overwhelming amount of output; one has to search for the actual job info. Could it be reduced by default?

arcstat -a

 C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir>arcstat -a
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/154512466269442249306) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312480960421804289383) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480960942102207750) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480986551484725515) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412480987372004004782) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248178660816299854) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248178693270274735) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481788161833884369) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481794011689773322) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481794841135549564) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481795061490164002) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248179664658629600) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/156412481806331995853780) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/15641248181654764503202) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/19031248266727719885386) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312482688801649760492) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/19031248269664596516649) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/190312482708001189641421) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487751911274785528) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/135681248776174890038130) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487761862099818677) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/135681248778282954184859) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487783181690167551) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/13568124877853011805764) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487798162116602801) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812487815761120847641) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/1432512487896321201817824) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (https://knowarc1.grid.niif.hu:60000/arex/1356812489549801085755434) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/1432512490424262056397442) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/143251249042517279751030) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/278701250254923894246671) could not be retrieved.
ERROR: Failed retrieving job status information
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217) could not be retrieved.
ERROR: Failed retrieving job status information
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/154512466269442249306
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312480960421804289383
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480960942102207750
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480986551484725515
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412480987372004004782
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248178660816299854
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248178693270274735
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481788161833884369
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481794011689773322
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481794841135549564
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481795061490164002
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248179664658629600
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/156412481806331995853780
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/15641248181654764503202
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/19031248266727719885386
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312482688801649760492
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/19031248269664596516649
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/190312482708001189641421
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487751911274785528
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/135681248776174890038130
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487761862099818677
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/135681248778282954184859
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487783181690167551
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/13568124877853011805764
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487798162116602801
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812487815761120847641
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/1432512487896321201817824
WARNING: Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/1356812489549801085755434
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/1432512490424262056397442
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/143251249042517279751030
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/278701250254923894246671
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217
Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563830101935539909
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563846191295524800
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50000/arex/13429125638579881230185
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50000/arex/134291256386274486281849
 State: Deleted (DELETED)
 Exit Code: 0

Job: https://knowarc1.grid.niif.hu:60000/arex/105671256386577232995984
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563892491848381691
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
 State: Running (INLRMS:EXECUTED)

Job: http://knowarc1.grid.niif.hu:50004/arex/244681257411410435311740
 State: Running (INLRMS:EXECUTED)

Job: https://knowarc1.grid.niif.hu:60000/arex/2412612574114162090478554
 State: Running (INLRMS:EXECUTED)

ERROR: Failed to bind to ldap server (index1.nordugrid.org)
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2437412466271271111006935
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/251511246627161344082881
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2536212466271742017330082
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/253751246627175216184522
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2541012466271751071459481
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/254151246627175623801147
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2544412466271761481664569
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1692312466287812129046274
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/294951248178561968194419
WARNING: Job state information not found: gsiftp://ce02.titan.uio.no:2811/jobs/103681248270203427504322
WARNING: Job state information not found: gsiftp://ce02.titan.uio.no:2811/jobs/106671248270242401341040
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1539512487873681581185043
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/27196124938638531541090
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2955512502547611962155131
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/199071256385636680402761
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/2866212563867231353919820
WARNING: Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/289281256386733861975919
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Failed to send SOAP message
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Failed to send SOAP message
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Failed to send SOAP message
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Failed to send SOAP message
ERROR: Failed to establish SSL connection
ERROR: SSL error: -1 - (empty):(empty):(empty)
ERROR: Failed to send content of buffer
ERROR: Failed to send SOAP message
WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=7f314be5-cf98-4a75-8c00-88cc5d9
fb05
WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=d2f479ad-18e5-42d9-8408-edd898c
7485
WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=700b85aa-8379-4b32-b68e-cb79896
6be2
WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=4320102f-de7f-4c55-816f-ca09944
9922
WARNING: Job state information not found: https://testbed5.grid.upjs.sk:8080/KnowARC-testbed/services/BESActivity?res=548f2c05-47f6-4b77-ac02-6e6db86
60e9
ERROR: The job status could not be retrieved
ERROR: Could not retrieve job information
ERROR: The job status could not be retrieved
ERROR: Could not retrieve job information
ERROR: The job status could not be retrieved
ERROR: Could not retrieve job information
ERROR: The job status could not be retrieved
ERROR: Could not retrieve job information
WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM779180392
WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM898036279
WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM439661494
WARNING: Job state information not found: https://cream.grid.upjs.sk:8443/ce-cream/services/CREAM2/CREAM875668628
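
Until the default output is reduced, it can be condensed after the fact. The following is a minimal sketch that assumes only the line shapes visible in the output above ("Job: ", " State: ", and lines starting with "ERROR:" or "WARNING:"). It keeps the job states and merely counts the diagnostics. The function name is made up for illustration.

```python
from collections import Counter


def summarise_arcstat(text):
    """Return (jobs, counts): job URL -> state, and a tally of ERROR/WARNING lines."""
    jobs = {}
    counts = Counter()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Job: "):
            current = line[len("Job: "):]
            jobs[current] = None
        elif line.startswith("State: ") and current is not None:
            jobs[current] = line[len("State: "):]
        elif line.startswith("ERROR:"):
            counts["ERROR"] += 1
        elif line.startswith("WARNING:"):
            counts["WARNING"] += 1
    return jobs, counts


# A few lines copied from the output above.
SAMPLE = """\
ERROR: The status of the job (http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217) could not be retrieved.
ERROR: Failed retrieving job status information
WARNING: Job state information not found: http://knowarc1.grid.niif.hu:50000/arex/278701250255050573870217
Job: http://knowarc1.grid.niif.hu:50000/arex/1342912563892491848381691
 State: Deleted (DELETED)
 Exit Code: 0

Job: http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
 State: Running (INLRMS:EXECUTED)
"""

jobs, counts = summarise_arcstat(SAMPLE)
for url, state in jobs.items():
    print(state, url)
print(dict(counts))
```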

arccat 05.11

The errors reported at the all-hands meeting were related to an error in the site configuration as well as a missing statement in the jsdl file (DeleteOnTermination).

arccat http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
ERROR: Illegal URL - no hostname given
ERROR: Cannot output stdout for job (http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175), non-valid destination URL (c:\DOCUME~1\Katarina\LOCALS~1\Temp\arccat.QX752U)
arccat -l http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
ERROR: Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175

arcls 05.11

Works fine.

arccp 05.11

Works for arex.

arcget 05.11

Not working

C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir>arcget http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
WARNING: Attempt to assign relative path to URL - making it absolute
ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt

ERROR: File download failed: Can't write to destination
ERROR: Failed dowloading http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175/err.txt to file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\err.txt
WARNING: Attempt to assign relative path to URL - making it absolute
ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt

ERROR: Failed to create/find directory /C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175, (22)
ERROR: Failed to start writing to destination: file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt

ERROR: File download failed: Can't write to destination
ERROR: Failed dowloading http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175/out.txt to file:/C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir\2431212574114041550319175\out.txt
ERROR: Failed downloading job http://knowarc1.grid.niif.hu:50003/arex/2431212574114041550319175
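
The destinations in the errors above have the form file:/C:\Documents and Settings\..., which mixes a URL prefix with backslashes and unescaped spaces. For comparison, here is a sketch of how a Windows path maps to a conventional file URL, using Python's pathlib. It illustrates the URL form only and makes no claim about how arcget or arccat build their destinations.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_url(path):
    """Convert an absolute Windows path to a file URL (forward slashes, escaped spaces)."""
    return PureWindowsPath(path).as_uri()


print(windows_path_to_file_url(r"C:\Documents and Settings\Katarina\My Documents\KnowARC\testdir"))
# prints file:///C:/Documents%20and%20Settings/Katarina/My%20Documents/KnowARC/testdir
```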

arckill

Again only tested on arc0

arckill <jobID>
arckill -k <jobID>

Both work.

CLI from Ubuntu binary packages

System

  • Intel Centrino2 vPro
  • Ubuntu 9.04 Jaunty x86_64; kernel 2.6.28-15-generic
  • Pre-defined environment variables:
    • X509_USER_KEY
    • X509_USER_CERT
    • X509_USER_PROXY

Installation

 sudo apt-get install nordugrid-arc1-client

Claims to install nordugrid-arc1-lib (0.9.4~rc2-1), nordugrid-arc1-plugins-base (0.9.4~rc2-1), nordugrid-arc1-client (0.9.4~rc2-1) and a number of Globus packages for no obvious reason.

In reality, does not install ARC binaries - only man pages

File list of nordugrid-arc1-client_0.9.4~rc2-1_amd64.deb:

 /.
 /usr
 /usr/share
 /usr/share/man
 /usr/share/man/man5
 /usr/share/man/man5/arcclient.xml.5.gz
 /usr/share/man/man1
 /usr/share/man/man1/arccp.1.gz
 /usr/share/man/man1/arcsub.1.gz
 /usr/share/man/man1/arcrm.1.gz
 /usr/share/man/man1/arcls.1.gz
 /usr/share/man/man1/arcdecision.1.gz
 /usr/share/man/man1/arcstat.1.gz
 /usr/share/man/man1/arcslcs.1.gz
 /usr/share/man/man1/arcsync.1.gz
 /usr/share/man/man1/arcclean.1.gz
 /usr/share/man/man1/arcinfo.1.gz
 /usr/share/man/man1/arccat.1.gz
 /usr/share/man/man1/arcresub.1.gz
 /usr/share/man/man1/arcget.1.gz
 /usr/share/man/man1/perftest.1.gz
 /usr/share/man/man1/arckill.1.gz
 /usr/share/man/man1/arcecho.1.gz
 /usr/share/man/man1/arcrenew.1.gz
 /usr/share/man/man1/chelonia.1.gz
 /usr/share/man/man1/arcresume.1.gz
 /usr/share/man/man1/arcmigrate.1.gz
 /usr/share/man/man1/arcsrmping.1.gz
 /usr/share/man/man1/arcproxy.1.gz
 /usr/share/doc
 /usr/share/doc/nordugrid-arc1-client
 /usr/share/doc/nordugrid-arc1-client/changelog.gz
 /usr/share/doc/nordugrid-arc1-client/changelog.Debian.gz
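
A check like the one above can be automated. The following is a minimal sketch that takes a package file list as a sequence of paths, however it was obtained (for instance from `dpkg -L nordugrid-arc1-client`), and reports which entries live in a bin or sbin directory. The function name and the choice of directories are assumptions made for illustration.

```python
def executables_in(file_list):
    """Return the paths from a package file list that live in a bin or sbin directory."""
    bin_dirs = ("/bin/", "/sbin/", "/usr/bin/", "/usr/sbin/")
    stripped = (path.strip() for path in file_list)
    return [path for path in stripped if path.startswith(bin_dirs)]


# A few entries copied from the file list above.
listing = [
    " /usr/share/man/man1/arcsub.1.gz",
    " /usr/share/man/man1/arcstat.1.gz",
    " /usr/share/doc/nordugrid-arc1-client/changelog.gz",
]
print(executables_in(listing))
# prints []
```

An empty result for a client package is a quick sign that only documentation was shipped.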


CLI from Fedora binary packages

System

  • Intel Centrino
  • Fedora 11 i386; kernel 2.6.30.8-64.fc11.i686.PAE
  • Pre-defined environment variables:
    • X509_USER_KEY
    • X509_USER_CERT
    • X509_USER_PROXY
    • X509_VOMS_DIR
  • Katarina:
    • Intel Dual Core
    • Fedora 7 i386

Installation

 yum install nordugrid-arc1-client

Installs nordugrid-arc1-0.9.4-0.rc2.fc11.i586.rpm, nordugrid-arc1-client-0.9.4-0.rc2.fc11.i586.rpm, nordugrid-arc1-plugins-base-0.9.4-0.rc2.fc11.i586.rpm

Everything installs in root ('/usr', '/etc'...) and not '/usr/local' as advertised in the Guide

  • FC7 (Katarina): nothing preinstalled
    • yum install nordugrid-arc1
    • yum install nordugrid-python
    • The update to rc3 came up automatically once it was available from the repository. Works perfectly.


Tests

arcproxy

  arcproxy

Succeeds, but produces 2 warnings:

 [2009-10-16 01:40:55] [Arc.UserConfig] [WARNING] [3446/137125512] Unknown section client, ignoring it

REASON: old ~/.arc/client.conf, which is still needed for v0.8. SOLUTION: Martin changes verbosity level to INFO, and adds a debug message before processing (was only after)

 [2009-10-16 01:40:55] [Arc.OpenSSL] [WARNING] [3446/137125512] Failed to lock arccrypto library in memory

Tests FC7 Katarina

Differences compared to the all-hands results

  • Gabor fixed some configuration on the server side
  • The jsdl job was equipped with DeleteOnTermination = false
  • The globus packages were installed so the arc0 works

client.conf relevant stuff:

[common]
                                 
defaultservices=index:ARC1:https://knowarc2.grid.niif.hu:50000/isis


[alias]
arc0=computing:ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid
arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid
#arex1=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex                                                                                                                 
#arex2=computing:ARC1:https://knowarc1.grid.niif.hu:50000/arex                                                                                                                 
arex1=computing:ARC1:http://knowarc1.grid.niif.hu:50000/arex
arex2=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex
arex3=computing:ARC1:http://knowarc1.grid.niif.hu:50003/arex
arex4=computing:ARC1:http://knowarc1.grid.niif.hu:50004/arex
arex5=computing:ARC1:http://knowarc1.grid.niif.hu:50005/arex
arex6=computing:ARC1:http://knowarc1.grid.niif.hu:50006/arex
arex7=computing:ARC1:http://knowarc1.grid.niif.hu:50007/arex
arex8=computing:ARC1:http://knowarc1.grid.niif.hu:50008/arex
arex9=computing:ARC1:http://knowarc1.grid.niif.hu:50009/arex
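
Each alias value above has three colon-separated parts: a service type, a middleware flavour and an endpoint URL. The following is a minimal sketch of reading such entries with Python's configparser. It assumes the file is close enough to plain INI for that parser; this has not been checked against the ARC client's own parser, and the function name is made up for illustration.

```python
import configparser

# Shortened copy of the configuration shown above.
SAMPLE = """
[common]
defaultservices=index:ARC1:https://knowarc2.grid.niif.hu:50000/isis

[alias]
arc0=computing:ARC0:ldap://knowarc1.grid.niif.hu:2135/nordugrid-cluster-name=knowarc1.grid.niif.hu,Mds-Vo-name=local,o=grid
#arex1=computing:ARC1:https://knowarc1.grid.niif.hu:60000/arex
arex1=computing:ARC1:http://knowarc1.grid.niif.hu:50000/arex
"""


def parse_aliases(text):
    """Return {alias: (service_type, flavour, url)} from the [alias] section."""
    # Only "=" separates key and value; ":" occurs inside the values.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    aliases = {}
    if parser.has_section("alias"):
        for name, value in parser.items("alias"):
            # Split on the first two colons only, so the URL keeps its own colons.
            service_type, flavour, url = value.split(":", 2)
            aliases[name] = (service_type, flavour, url)
    return aliases


for name, parts in parse_aliases(SAMPLE).items():
    print(name, parts)
```

Splitting at most twice matters because the endpoint itself contains colons (scheme and port) and, for the LDAP entries, further "=" characters.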

Test jobs

jsdl job

Compared to the first test (all-hands meeting) <DeleteOnTermination>false</DeleteOnTermination> was added for the 03.11 tests

<JobDefinition
 xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl"
 xmlns:posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
 >
 <JobDescription>
   <JobIdentification>
     <JobName>Windows test job</JobName>
   </JobIdentification>
   <Application>
     <posix:POSIXApplication>
       <posix:Executable>/bin/sh</posix:Executable>
       <posix:Argument>test.sh</posix:Argument>
       <posix:Argument>Testing</posix:Argument>
       <posix:Output>out.txt</posix:Output>
       <posix:Error>err.txt</posix:Error>
     </posix:POSIXApplication>
   </Application>
   <DataStaging>
     <FileName>test.sh</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
     <Source><URI>http://knowarc1.grid.niif.hu/ogf25/storage/test.sh</URI></Source>
   </DataStaging>
   <DataStaging>
     <FileName>out.txt</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
   </DataStaging>
   <DataStaging>
     <FileName>err.txt</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
   </DataStaging>
 </JobDescription>
</JobDefinition>
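
Since a missing DeleteOnTermination statement caused trouble earlier, a job description can be checked for it before submission. The following is a minimal sketch using Python's standard XML parser. It reports every DataStaging entry that does not carry DeleteOnTermination set to false. The function name is made up for illustration, and this is not a full JSDL validation.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the job description above.
JSDL_NS = {"jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl"}


def staging_without_keep(jsdl_text):
    """Return file names whose DataStaging lacks DeleteOnTermination=false."""
    root = ET.fromstring(jsdl_text)
    flagged = []
    for staging in root.iterfind(".//jsdl:DataStaging", JSDL_NS):
        name = staging.findtext("jsdl:FileName", default="", namespaces=JSDL_NS)
        flag = staging.findtext("jsdl:DeleteOnTermination", default="", namespaces=JSDL_NS)
        if flag.strip().lower() != "false":
            flagged.append(name.strip())
    return flagged


EXAMPLE = """<JobDefinition xmlns="http://schemas.ggf.org/jsdl/2005/11/jsdl">
 <JobDescription>
   <DataStaging>
     <FileName>out.txt</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
   </DataStaging>
   <DataStaging>
     <FileName>err.txt</FileName>
   </DataStaging>
 </JobDescription>
</JobDefinition>"""

print(staging_without_keep(EXAMPLE))
# prints ['err.txt']
```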

xrsl jobs

&("executable" = "run.sh" )
("arguments" = "2" )
("inputfiles" = ("run.sh" "http://www.fys.uio.no/~katarzp/test/run.sh" ) 
("Makefile" "http://www.fys.uio.no/~katarzp/test/Makefile" ) 
("prime.cpp" "http://www.fys.uio.no/~katarzp/test/prime.cpp" ) )
("stderr" = "primenumbers" )("outputfiles" = ("primenumbers" "" ))
("jobname" = "ARC testjob from www" )
("stdout" = "stdout" )
("gmlog" = "gmlog" )
("CPUTime" = "8" )


arcproxy

Works fine to get a proxy. But if you don't have one, the message may be a bit scary.

arcproxy -I
[2009-11-03 15:26:39] [Arc.OpenSSL] [WARNING] [3935/139171360] Failed to lock arccrypto library in memory
[2009-11-03 15:26:39] [Arc.Credential] [ERROR] [3935/139171360] Can't get the first byte of input BIO to get its format
Segmentation fault

Voms proxy

arcproxy -S knowarc.eu
[2009-11-03 15:36:34] [Arc.OpenSSL] [WARNING] [4119/163444256] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
....++++++
.....++++++
[2009-11-03 15:36:38] [Arc] [ERROR] [4119/163444256] Cannot get voms server knowarc.eu information from file: /home/katarzp/.voms 
Proxy generation succeeded

arcproxy -I
[2009-11-03 15:36:41] [Arc.OpenSSL] [WARNING] [4123/148125216] Failed to lock arccrypto library in memory
Subject:  /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=1209213760
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 59 minutes 57 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: X.509 Proxy Certificate Profile RFC compliant restricted proxy

I was a bit worried about the message "Cannot get voms server knowarc.eu information from file: /home/katarzp/.voms". My voms list is in the file $HOME/.voms/vomses (which works fine for the example below). I then moved the $HOME/.voms/vomses file to $HOME/.voms and got:

arcproxy -S knowarc.eu
[2009-11-03 15:40:34] [Arc.OpenSSL] [WARNING] [4185/147871264] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
........................++++++
..........................................++++++
Contacting VOMS server (named knowarc.eu ): arthur.hep.lu.se on port: 15001
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] Failed to establish SSL connection
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 15:40:38] [Arc.MCC.TLS] [ERROR] [4185/147871264] Failed to send content of buffer
[2009-11-03 15:40:38] [Arc] [ERROR] [4185/147871264] ???: STATUS_UNDEFINED (No explanation.)

Instead of copying the file into what looks like the expected location, I pointed to vomses explicitly and got the same result:

arcproxy -S knowarc.eu --vomses=/home/katarzp/.voms/vomses
[2009-11-03 15:57:09] [Arc.OpenSSL] [WARNING] [4453/153048608] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
...............++++++
...................++++++
Contacting VOMS server (named knowarc.eu ): arthur.hep.lu.se on port: 15001
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] Failed to establish SSL connection
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 15:57:12] [Arc.MCC.TLS] [ERROR] [4453/153048608] Failed to send content of buffer
[2009-11-03 15:57:12] [Arc] [ERROR] [4453/153048608] ???: STATUS_UNDEFINED (No explanation.)

arcproxy --voms=atlas --vomses=/home/katarzp/.voms/vomses
[2009-11-03 16:00:40] [Arc.OpenSSL] [WARNING] [4513/136640032] Failed to lock arccrypto library in memory
Your identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Enter pass phrase for /home/katarzp/.globus/userkey.pem:
....++++++
.......++++++
Contacting VOMS server (named atlas ): voms.cern.ch on port: 15001
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] Failed to establish SSL connection
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 16:00:43] [Arc.MCC.TLS] [ERROR] [4513/136640032] Failed to send content of buffer
[2009-11-03 16:00:43] [Arc] [ERROR] [4513/136640032] ???: STATUS_UNDEFINED (No explanation.)
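
To see which server and port a vomses entry resolves to before contacting it, the file can be inspected directly. The following is a minimal sketch, assuming the common vomses layout of five quoted fields per line (alias, host, port, server DN, VO name). The host and port in the example come from the output above; the server DN is a hypothetical placeholder.

```python
import shlex


def parse_vomses(text):
    """Return a list of dicts, one per line that has the assumed five-field layout."""
    servers = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = shlex.split(line)  # handles the double-quoted fields
        if len(fields) < 5:
            continue  # not in the assumed layout, skip
        alias, host, port, dn, vo = fields[:5]
        servers.append({"alias": alias, "host": host, "port": int(port), "dn": dn, "vo": vo})
    return servers


# The DN below is a hypothetical placeholder, not the real server DN.
EXAMPLE = '"knowarc.eu" "arthur.hep.lu.se" "15001" "/O=Example/CN=host/arthur.hep.lu.se" "knowarc.eu"'

for server in parse_vomses(EXAMPLE):
    print(server["alias"], server["host"], server["port"])
# prints knowarc.eu arthur.hep.lu.se 15001
```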

Result:

 arcproxy -I
[2009-11-03 16:01:28] [Arc.OpenSSL] [WARNING] [4527/154777120] Failed to lock arccrypto library in memory
Subject:  /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=1222106176
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 59 minutes 15 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: X.509 Proxy Certificate Profile RFC compliant restricted proxy

This one works fine with $HOME/.voms/vomses

arcproxy -O -S knowarc.eu

arcproxy -I
[2009-11-03 15:33:12] [Arc.OpenSSL] [WARNING] [4068/138683936] Failed to lock arccrypto library in memory
Subject:  /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel/CN=proxy
Identity: /O=Grid/O=NorduGrid/OU=fys.uio.no/CN=Katarina Pajchel
Timeleft for proxy: 11 hours 57 minutes 58 seconds
Proxy path: /tmp/x509up.nG3931
Proxy type: Legacy Globus impersonation proxy
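
For scripted checks, the "Timeleft for proxy" line can be converted to seconds. The following is a minimal sketch based only on the wording seen in the output above. Handling of a "days" unit is an assumption, since no such line appears here.

```python
import re

# Seconds per unit; "day" is assumed, the output above only shows hours, minutes and seconds.
_UNITS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def timeleft_seconds(line):
    """Convert e.g. 'Timeleft for proxy: 11 hours 59 minutes 57 seconds' to seconds."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s+(day|hour|minute|second)s?", line):
        total += int(amount) * _UNITS[unit]
    return total


print(timeleft_seconds("Timeleft for proxy: 11 hours 59 minutes 57 seconds"))
# prints 43197
```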

arcsub

First tested a good old xrsl job which downloads its input from an HTTP location, compiles a program and calculates prime numbers.

arcsub -c arex1 testjobwww.xrsl
arcsub -c arex2 testjobwww.xrsl
[2009-10-26 23:47:13] [Arc.OpenSSL] [WARNING] [4461/153974056] Failed to lock arccrypto library in memory
Job submitted with jobid: https://knowarc1.grid.niif.hu:60000/arex/105671256597263621536123

Works for all niif arexes. Warning for arex2 (the only https service)

Tested then a test.jsdl job.

arcsub -c arex2 test.jsdl 
[2009-10-26 16:35:34] [Arc.OpenSSL] [WARNING] [11145/160253472] Failed to lock arccrypto library in memory
Job submitted with jobid: https://knowarc1.grid.niif.hu:60000/arex/1056712565713591783296158

Works for all A-REX services, with a warning for arex2 (https). At the moment (03.11) the https service (arex2) is not available.

03.11 tests

ARC0 tests, after installing the globus package: I think I managed to submit jobs last week. Now (03.11) this is the output, but then again it could be a problem on the server side.

arcsub -c arc0 testjobwww.xrsl
[2009-11-03 10:33:51] [Arc.OpenSSL] [WARNING] [3479/137664136] Failed to lock arccrypto library in memory
[2009-11-03 10:33:59] [Arc.FTPControl] [ERROR] [3479/137635360] Connect: Failed authentication: globus_ftp_control: gss_init_sec_context failed/GSS Major Status: Authentication Failed/globus_gsi_gssapi: SSLv3 handshake problems/globus_gsi_gssapi: Unable to verify remote side's credentials/globus_gsi_gssapi: SSLv3 handshake problems: Couldn't do ssl handshake/OpenSSL Error: s3_clnt.c:894: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Invalid CRL: The available CRL has expired
[2009-11-03 10:33:59] [Arc.Submitter.ARC0] [ERROR] [3479/137635360] Submit: Failed to connect
Submission to gsiftp://knowarc1.grid.niif.hu:2811/jobs failed, trying next target
[2009-11-03 10:33:59] [Arc.FTPControl] [ERROR] [3479/137635360] Connect: Failed authentication: globus_ftp_control: gss_init_sec_context failed/GSS Major Status: Authentication Failed/globus_gsi_gssapi: SSLv3 handshake problems/globus_gsi_gssapi: Unable to verify remote side's credentials/globus_gsi_gssapi: SSLv3 handshake problems: Couldn't do ssl handshake/OpenSSL Error: s3_clnt.c:894: in library: (null), function (null): (null)/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Could not verify credential/globus_gsi_callback_module: Invalid CRL: The available CRL has expired
[2009-11-03 10:33:59] [Arc.Submitter.ARC0] [ERROR] [3479/137635360] Submit: Failed to connect
Submission to gsiftp://knowarc1.grid.niif.hu:2811/jobs failed, trying next target
Job submission failed, no more possible targets
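
The Arc.FTPControl error above is long but has a regular shape: a chain of nested causes joined by "/", with the most specific cause last. A minimal Python 3 sketch, assuming that "/" never occurs inside a single cause (true for the message above, but not guaranteed in general), pulls out the final cause. The function name root_cause is made up for illustration.

```python
def root_cause(message):
    """Return (chain, last_cause) for a '/'-separated error chain."""
    chain = [part.strip() for part in message.split("/") if part.strip()]
    return chain, chain[-1]

# Shortened copy of the message logged by Arc.FTPControl above.
msg = ("Connect: Failed authentication: globus_ftp_control: gss_init_sec_context failed/"
       "GSS Major Status: Authentication Failed/"
       "globus_gsi_callback_module: Could not verify credential/"
       "globus_gsi_callback_module: Invalid CRL: The available CRL has expired")

chain, cause = root_cause(msg)
print(cause)
```

Here the last element points at an expired CRL rather than at anything in the job description.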

It is indeed a server problem. Changed to arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid and submission works fine.

arcsub -c arc0 testjobwww.xrsl
[2009-11-03 10:36:29] [Arc.OpenSSL] [WARNING] [3527/154600152] Failed to lock arccrypto library in memory
Job submitted with jobid: gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
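
The cluster definition used above follows the pattern name=type:flavour:URL. A small Python 3 sketch, assuming exactly this three-field layout (the labels "type" and "flavour" are chosen here for illustration and are not taken from the ARC documentation), splits such a line so that the URL keeps its own ":" and "=" characters.

```python
def parse_alias(line):
    """Split 'name=type:flavour:url' into its parts."""
    name, sep, value = line.partition("=")    # only the first '=' separates the name
    if not sep:
        raise ValueError("no '=' in alias line: %r" % line)
    kind, flavour, url = value.split(":", 2)   # the URL keeps its own ':' characters
    return {"name": name.strip(), "type": kind, "flavour": flavour, "url": url}

alias = parse_alias("arc1=computing:ARC0:ldap://grid.tsl.uu.se:2135/"
                    "nordugrid-cluster-name=grid.tsl.uu.se,Mds-Vo-name=local,o=grid")
print(alias["flavour"], alias["url"])
```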

If no cluster is given, what should happen? Some defaults? At the moment the job is not submitted:

arcsub testslow.jsdl
[2009-11-03 16:15:52] [Arc.OpenSSL] [WARNING] [4856/140648360] Failed to lock arccrypto library in memory
[2009-11-03 16:15:52] [Arc] [ERROR] [4856/140648360] SSL error: 12 - (empty):(empty):(empty)
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] Failed to establish SSL connection
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] SSL error: 336134278 - SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
[2009-11-03 16:15:52] [Arc.MCC.TLS] [ERROR] [4856/140648360] Failed to send content of buffer
Job submission aborted because no clusters returned any information

arcstat

Works fine for both arex and arc0 jobs, except for some error messages:

arcstat -a
[2009-11-03 10:51:41] [Arc.OpenSSL] [WARNING] [3629/143394336] Failed to lock arccrypto library in memory
[2009-11-03 10:51:42] [Arc] [ERROR] [3629/143394336] SSL error: 12 - (empty):(empty):(empty)
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] Failed to establish SSL connection
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] SSL error: -1 - (empty):(empty):(empty)
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] SSL error: 336134278 - SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] Failed to send content of buffer
[2009-11-03 10:51:42] [Arc.A-REX-Client] [ERROR] [3629/143394336] Failed to send SOAP message
[2009-11-03 10:51:42] [Arc.JobController.ARC1] [ERROR] [3629/143394336] Failed retrieving job status information
[2009-11-03 10:51:43] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: https://knowarc1.grid.niif.hu:60000/arex/241261256752838667371441
Job: http://knowarc1.grid.niif.hu:50000/arex/241611257239618970891242
 State: Finished (FINISHED)

Job: http://knowarc1.grid.niif.hu:50003/arex/2431212572396251634684492
 State: Finished (FINISHED)

Job: http://knowarc1.grid.niif.hu:50000/arex/241611257240471994297934
 State: Finished (FINISHED)

Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
 State: Finished (FINISHED)

Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240535396630099
 State: Finished (FINISHED)

[2009-11-03 10:51:56] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/22701256754327498076244
[2009-11-03 10:51:56] [Arc.JobController] [WARNING] [3629/143394336] Job state information not found: gsiftp://knowarc1.grid.niif.hu:2811/jobs/1622012567547201782103219
Job: gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
 Name: ARC testjob from www
 State: Finished (FINISHED)
 Exit Code: 0
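
Every client log line on this page has the same layout: [timestamp] [component] [LEVEL] [number/number] message. When the output gets noisy, as with arcstat -a above, a short Python 3 sketch like the following can separate log lines from the actual job listing and count messages per level. The regular expression is inferred only from the lines shown here; reading the two numbers in the fourth bracket as process and thread identifiers is an assumption.

```python
import re
from collections import Counter

LOG_LINE = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"\[(?P<component>[^\]]+)\] "
    r"\[(?P<level>[A-Z]+)\] "
    r"\[(?P<pid>\d+)/(?P<thread>\d+)\] "   # assumed: process id / thread id
    r"(?P<message>.*)$"
)

def split_output(text):
    """Return (log_records, other_lines) for mixed client output."""
    records, other = [], []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
        elif line.strip():
            other.append(line)
    return records, other

sample = """\
[2009-11-03 10:51:41] [Arc.OpenSSL] [WARNING] [3629/143394336] Failed to lock arccrypto library in memory
[2009-11-03 10:51:42] [Arc.MCC.TLS] [ERROR] [3629/143394336] Failed to establish SSL connection
[2009-11-03 10:51:42] [Arc.JobController.ARC1] [ERROR] [3629/143394336] Failed retrieving job status information
Job: http://knowarc1.grid.niif.hu:50000/arex/241611257239618970891242
 State: Finished (FINISHED)
"""

records, other = split_output(sample)
levels = Counter(record["level"] for record in records)
print(dict(levels))
print(other)
```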

For a single job there are no error messages:

arcstat -l http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
State: Finished (FINISHED)
Stdin: /dev/null
Stdout: out.txt
Stderr: err.txt
Submitted: 2009-11-03 10:28:45
End Time: 2009-11-03 10:31:48
Results must be retrieved before: 2009-11-06 10:31:48
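
The timestamps in the long listing can be used to check how much time passed between submission and end of the job, and how long the results stay on the server. A Python 3 sketch, assuming the "Key: value" layout shown above and the timestamp format YYYY-MM-DD HH:MM:SS:

```python
from datetime import datetime

def parse_fields(text):
    """Turn 'Key: value' lines into a dict (splits on the first ': ')."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields[key] = value
    return fields

listing = """\
Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
State: Finished (FINISHED)
Submitted: 2009-11-03 10:28:45
End Time: 2009-11-03 10:31:48
Results must be retrieved before: 2009-11-06 10:31:48
"""

fmt = "%Y-%m-%d %H:%M:%S"
fields = parse_fields(listing)
submitted = datetime.strptime(fields["Submitted"], fmt)
ended = datetime.strptime(fields["End Time"], fmt)
deadline = datetime.strptime(fields["Results must be retrieved before"], fmt)

print("submission to end:", ended - submitted)
print("results kept for:", deadline - ended)
```

For the job above this gives about three minutes from submission to end and a retention period of three days.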

arcls

Problems? Shouldn't it show the output files?

 arcls http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:24:45] [Arc.arcls] [ERROR] [11345/141080096] Failed listing metafiles

03.11 tests

A configuration error was fixed at the server side. arcls works fine

 [katarzp@localhost test]$ arcls http://knowarc1.grid.niif.hu:50003/arex/243121257240535396630099
 primenumbers
 gmlog

 [katarzp@localhost test]$ arcls http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
 err.txt
 test.sh
 out.txt

The arc0 service gives a warning:

arcls gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
[2009-11-03 10:47:28] [Arc.OpenSSL] [WARNING] [3611/163272224] Failed to lock arccrypto library in memory
gmlog
stdout
primenumbers


A.K: Fixed in revision 15269 of trunk.

arccat

xrsl job - no output. Gabor R. did a fix on the NIIF server side; see the 03.11 tests below for the current status.

[katarzp@localhost test]$ arccat http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833
[2009-10-26 16:14:13] [Arc.JobController] [ERROR] [11019/155788832] File download failed: Failed while reading from source
[katarzp@localhost test]$ arccat -l http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833
[2009-10-26 16:14:23] [Arc.JobController] [ERROR] [11030/154388000] File download failed: Failed while reading from source

jsdl job - no output

[katarzp@localhost test]$ arccat http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:24:55] [Arc.JobController] [ERROR] [11347/149460512] File download failed: Failed while reading from source
[katarzp@localhost test]$ arccat -l http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086
[2009-10-26 17:25:01] [Arc.JobController] [ERROR] [11358/156669472] Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50005/arex/125731256571476222227086

03.11 tests

arccat for jsdl job, standard output works

arccat http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
stdout from job http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
Hello Testing
Welcome 1 times
Welcome 2 times
Welcome 3 times
Welcome 4 times
Welcome 5 times
Welcome 6 times
Welcome 7 times
Welcome 8 times
Welcome 9 times
Welcome 10 times

However, arccat -l (gmlog) does not work:

arccat -l http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745
[2009-11-03 11:52:52] [Arc.JobController] [ERROR] [4391/163735072] Can not determine the gmlog location: http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745

Sending an xrsl job to the arex services works the opposite way: the standard output is not visible, while the gmlog can be read.

Gabor's comment: this can be related to the fact that in jsdl one needs to set DeleteOnTermination to false if one wants the output to be kept. This is set in the jsdl test job, while when submitting the xrsl job one has to trust the translation. It looks like the stderr is kept, but not the stdout. Question: maybe DeleteOnTermination=false should be the default?

   <DataStaging>
     <FileName>out.txt</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
   </DataStaging>
   <DataStaging>
     <FileName>err.txt</FileName>
     <DeleteOnTermination>false</DeleteOnTermination>
   </DataStaging>
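
The two DataStaging blocks above can also be generated rather than written by hand. A minimal Python 3 sketch using only the standard library follows. It reproduces just the element names shown above and deliberately leaves out the JSDL namespaces and the rest of the job description, so the output is a fragment and not a complete JSDL document. The helper name keep_output is hypothetical.

```python
import xml.etree.ElementTree as ET

def keep_output(filename):
    """Build a DataStaging element that asks for `filename` to be kept."""
    staging = ET.Element("DataStaging")
    ET.SubElement(staging, "FileName").text = filename
    ET.SubElement(staging, "DeleteOnTermination").text = "false"
    return staging

# One fragment per output file that should survive job termination.
fragments = [ET.tostring(keep_output(name), encoding="unicode")
             for name in ("out.txt", "err.txt")]
for fragment in fragments:
    print(fragment)
```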

Both arccat and arccat -l work for the ARC0 service

arccat -l gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143

arcget

arcget http://knowarc1.grid.niif.hu:50005/arex/1257312565691711103923833

Runs for a while, but nothing seems to be downloaded, at least not to the working directory.

arcget -D /scratch/knowarc/test http://knowarc1.grid.niif.hu:50003/arex/122271256597278471460765

Nothing downloaded. But the jobs are removed from the job list.

03.11 tests

jsdl job on arex

arcget http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745

ls 2487812572440821650868745/
err.txt  out.txt  test.sh

xrsl job on arex (stdout not available)

arcget http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423

ls 248781257244754809208423/
gmlog  primenumbers

The content of gmlog:

<HTML>
<HEAD>
<TITLE>ARex: Job Logs</TITLE>
</HEAD>
<BODY>
<UL>
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/description">description</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/grami">grami</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/status">status</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/input">input</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/local">local</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/errors">errors</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/output">output</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/rte">rte</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/diag">diag</A> - log file
</UL>
</BODY>
</HTML>
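
For the xrsl job on arex, the downloaded gmlog is thus not a directory but an HTML index whose links point to the individual log files on the server. A Python 3 sketch, assuming the index always has the simple <A HREF=...> structure shown above, extracts those links so that the files could be fetched one by one. The class name LinkCollector is made up for illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (name, url) pairs from the anchors of a gmlog index page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":                       # HTMLParser lower-cases tag and attribute names
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Text directly inside an open <a> element is taken as the file name.
        if self._href and data.strip():
            self.links.append((data.strip(), self._href))
            self._href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

index = """<HTML><BODY><UL>
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/errors">errors</A> - log file
<LI><I>file</I> <A HREF="http://knowarc1.grid.niif.hu:50007/arex/248781257244754809208423/gmlog/status">status</A> - log file
</UL></BODY></HTML>"""

collector = LinkCollector()
collector.feed(index)
for name, url in collector.links:
    print(name, url)
```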

Works fine for ARC0.

arcget  gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143
[2009-11-03 12:04:10] [Arc.OpenSSL] [WARNING] [4463/134698528] Failed to lock arccrypto library in memory

ls 8901257240953331117143/
gmlog  primenumbers  stdout
l 8901257240953331117143/gmlog/
description  diag  errors  input  local  output  status
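
In the successful arcget runs above the results end up in a directory named after the last path component of the job ID. A Python 3 sketch of that mapping follows. It is inferred from the listings on this page rather than from the ARC sources, and the download_dir parameter only mirrors what the -D option is expected to do.

```python
import posixpath
from urllib.parse import urlparse

def local_job_dir(job_id, download_dir="."):
    """Guess the local directory that holds the results of a given job ID."""
    job_name = posixpath.basename(urlparse(job_id).path.rstrip("/"))
    return posixpath.join(download_dir, job_name)

arex_dir = local_job_dir(
    "http://knowarc1.grid.niif.hu:50007/arex/2487812572440821650868745")
arc0_dir = local_job_dir(
    "gsiftp://grid.tsl.uu.se:2811/jobs/8901257240953331117143",
    "/scratch/knowarc/test")
print(arex_dir)
print(arc0_dir)
```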

arccp

Works fine. Copied files from the session directory. Also valid on 03.11.

arccp http://knowarc1.grid.niif.hu:50003/arex/2431212572396251634684492/out.txt test_out.txt

arckill

arckill 

Works fine, removes the job.

arckill -k 

Keeps the job. One can use arcls and arccat to see where the job stopped. The job list is nicely updated.

Job: http://knowarc1.grid.niif.hu:50004/arex/244681257260942572912558
 State: Killed (KILLED)

Job: http://knowarc1.grid.niif.hu:50005/arex/2465212572609451975194226
 State: Killed (KILLED)

Job: http://knowarc1.grid.niif.hu:50006/arex/2471312572609481992071298
 State: Killed (KILLED)
arckill -c arex4

Works fine with cluster as argument.

BUT this is not working:

arckill -s INLRMS:R
[2009-11-03 16:44:27] [Arc.arckill] [ERROR] [5337/162956832] No jobs given
arckill -s Running
[2009-11-03 16:46:06] [Arc.arckill] [ERROR] [5365/165852704] No jobs given

arcclean

Works fine with a job or cluster as argument

arcclean http://knowarc1.grid.niif.hu:50003/arex/243121257261796512764518
arcclean -c arex5

BUT not with a state

arcclean -s KILLED
[2009-11-03 16:48:02] [Arc.arcclean] [ERROR] [5401/145307168] No jobs given
arcclean -s Killed
[2009-11-03 16:48:06] [Arc.arcclean] [ERROR] [5403/161232416] No jobs given
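
Until selection by state works, one possible workaround is to take the job IDs from the arcstat output and pass them to arcclean or arckill explicitly, which does work as shown above. A Python 3 sketch, assuming the "Job:" / "State:" layout printed by arcstat on this page; it only parses text and does not call any ARC command itself. The function name jobs_in_state is hypothetical.

```python
import re

STATE_LINE = re.compile(r"^\s*State:\s*(?P<name>.+?)\s*\((?P<code>[^)]+)\)\s*$")

def jobs_in_state(arcstat_output, wanted):
    """Return job IDs whose state name or code equals `wanted` (case-insensitive)."""
    selected, current = [], None
    for line in arcstat_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Job: "):
            current = stripped[len("Job: "):]
            continue
        match = STATE_LINE.match(line)
        if match and current:
            names = (match.group("name").lower(), match.group("code").lower())
            if wanted.lower() in names:
                selected.append(current)
            current = None
    return selected

output = """\
Job: http://knowarc1.grid.niif.hu:50004/arex/244681257260942572912558
 State: Killed (KILLED)

Job: http://knowarc1.grid.niif.hu:50003/arex/243121257240525281690910
 State: Finished (FINISHED)
"""

print(jobs_in_state(output, "KILLED"))
```

The resulting IDs can then be given to arcclean as ordinary job arguments.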

ARC client python

The ARC client library is also available in Python. It is very useful as a basis for writing Grid applications, job management scripts or similar.

Target generation

Martin Skou is working on this.

usercfg = arc.UserConfig("","")
targen = arc.TargetGenerator(usercfg)
targen.GetTargets(0, 1)
targets = targen.FoundTargets()

As well as

usercfg = arc.UserConfig("","")
print usercfg.GetSelectedServices(arc.COMPUTING)

gives segmentation faults.

Job listing

#!/usr/bin/python
import arc, sys;
usercfg = arc.UserConfig("");
joblist = "jobs.xml";
# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcstat.py");
logcout = arc.LogStream(sys.stdout);
arc.Logger_getRootLogger().addDestination(logcout);
arc.Logger_getRootLogger().setThreshold(arc.DEBUG);
#jobmaster = arc.JobSupervisor(usercfg, [sys.argv[1]], joblist); OUTDATED
jobmaster = arc.JobSupervisor(usercfg,[]);
jobcontrollers = jobmaster.GetJobControllers();
print 'jobcontrollers ', jobcontrollers
for job in jobcontrollers:
  job.PrintJobStatus([], True);

Works fine. Finds all jobs previously submitted from CLI.

Job retrieval

 
#!/usr/bin/python

import arc, sys;

# User configuration file.
# Initialise a default user configuration.
usercfg = arc.UserConfig("");

# List of job ids to process.
jobids = sys.argv[1:];

# List of clusters to process.
clusters = [];

# Job list containing active jobs.
joblist = "jobs.xml";

# Process only jobs with the following status codes.
# If list is empty all jobs will be processed.
status = [];

# Directory where the job directory will be created.
downloaddir = "/scratch/knowarc/test/";

# Keep the files on the server.
keep = False;

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcget.py");
logcout = arc.LogStream(sys.stdout);
arc.Logger_getRootLogger().addDestination(logcout);
arc.Logger_getRootLogger().setThreshold(arc.DEBUG);


jobmaster = arc.JobSupervisor(usercfg,[]);
jobcontrollers = jobmaster.GetJobControllers();

for job in jobcontrollers:
  job.Get(status, downloaddir, keep);

As with the CLI arcget: it seems to be doing something and removes the jobs from the job list, but no files are downloaded.

03.11 tests

The server side fix works. All jobs are downloaded and removed from the list.


ARC lib examples

The ARC client library is also available in Python. It is very useful as a basis for writing Grid applications, job management scripts or similar. The following examples show the basic job cycle with submission, status and output retrieval, and cleaning of jobs.

Submit jobs

This basic example shows how to retrieve possible target clusters and submit jobs.

#!/usr/bin/python

import arc, sys
joblist = "jobs.xml"
usercfg = arc.UserConfig("","")

logger = arc.Logger(arc.Logger_getRootLogger(), "arcsub.py")
logcout = arc.LogStream(sys.stdout)
arc.Logger_getRootLogger().addDestination(logcout)
arc.Logger_getRootLogger().setThreshold(arc.ERROR)

targen = arc.TargetGenerator(usercfg)
targen.GetTargets(0, 1)
targets = targen.FoundTargets()

job = arc.JobDescription()
job.Application.Executable.Name = '/bin/echo'
job.Application.Executable.Argument.append('Hello')
job.Application.Executable.Argument.append('World')
job.Application.Output = 'std.out'

# std.out will not be deleted when the job has finished
job_output = arc.FileType()
job_output.Name = 'std.out'
job.DataStaging.File.append(job_output)

info = arc.XMLNode(arc.NS(), 'Jobs')

for target in targets:
  submitter = target.GetSubmitter(usercfg)
  print 'Submitting to ', target.Cluster.ConnectionURL()
  submitted = submitter.Submit(job, target)

  if submitted:
    print "Job ID: " + submitted.fullstr()
    # Uncomment break if one wants to submit only one job
    #break; 


Job status

This example shows how to list the status of all jobs a user has in the system. It corresponds to the "arcstat -a" command.

#!/usr/bin/python

import arc, sys;

usercfg = arc.UserConfig("");
joblist = "jobs.xml";

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcstat.py");
logcout = arc.LogStream(sys.stdout);
arc.Logger_getRootLogger().addDestination(logcout);
arc.Logger_getRootLogger().setThreshold(arc.ERROR);

jobmaster = arc.JobSupervisor(usercfg,[]);
jobcontrollers = jobmaster.GetJobControllers();

for job in jobcontrollers:
  job.PrintJobStatus([], True);

Get results

This example shows how to download the jobs once they are finished. It corresponds to "arcget -a".

#!/usr/bin/python

import arc, sys;

# User configuration file.
# Initialise a default user configuration.
usercfg = arc.UserConfig("","")

# List of job ids to process.
jobids = sys.argv[1:];

# List of clusters to process.
clusters = [];

# Job list containing active jobs.
joblist = "jobs.xml";

# Process only jobs with the following status codes.
# If list is empty all jobs will be processed.
status = [];

# Directory where the job directory will be created.
downloaddir = "/scratch/knowarc/test/";

# Keep the files on the server.
keep = False;

# Logging...
logger = arc.Logger(arc.Logger_getRootLogger(), "arcget.py");
logcout = arc.LogStream(sys.stdout);
arc.Logger_getRootLogger().addDestination(logcout);
arc.Logger_getRootLogger().setThreshold(arc.ERROR);

jobmaster = arc.JobSupervisor(usercfg,[]);
jobcontrollers = jobmaster.GetJobControllers();

i = 0 

for job in jobcontrollers:
    job.Get(status, downloaddir, keep);
    i += 1