ARC Compute Clients

From NorduGrid

This page concerns the "ARC Compute Clients" product in EMI.

The EMI project will last for 3 years, from May 2010 to April 2013, and will bring together the 3 major European middlewares: ARC, gLite and UNICORE. The project will consolidate and harmonize the 3 middlewares, but there will also be room for proactive maintenance and evolution (development).

In this page and its sub-pages, information about proactive maintenance, harmonization, evolution, testing and analysis related to the ARC Compute Clients product is gathered. This page contains a component list, a table of milestones, deliverables and technical objectives, a work plan and tables of open bugs. If you have questions, comments or feedback, please use the Discussion page.

Components

The product consists of all the computing-related user command-line tools in ARC and the libarcclient library. These are listed below:

  • Pre-WS computing client
    • ngsub - Submission
    • ngresub - Resubmission
    • ngstat - Job and resource querying
    • ngget - Output retrieval
    • ngcat - Output and log catenation
    • ngkill - Kill job
    • ngrenew - Credential renewal
    • ngresume - Resume job from a failed state
    • ngsync - Job list synchronization
    • ngtest - Test client or server setup
    • ngclean - Job cleaning
  • WS computing client
    • arcsub - Submission
    • arcresub - Resubmission
    • arcmigrate - Migration
    • arcstat - Job querying
    • arcinfo - Resource querying
    • arcget - Output retrieval
    • arccat - Output and log catenation
    • arckill - Kill job
    • arcrenew - Credential renewal
    • arcresume - Resume job from a failed state
    • arcsync - Job list synchronization
    • arcclean - Job cleaning
  • libarcclient
    • Library used by the WS computing client

Milestones, Deliverables and Technical objectives

The following milestones, deliverables and technical objectives are related to the ARC Compute Clients product. The milestones and deliverables are collected from the EMI DoW, while technical objectives (TO) are taken from DNA1.3.1, more specifically from the table in section 5.2. A list of all the EMI milestones and their status can be found here, while information about deliverables is here.

Name | Due month | State | Comments
DNA1.3.1 - Technical Development Plan | 2 | ONGOING | All input for the deliverable has been provided. Deliverable not delivered yet.
DJRA1.1.1 - Compute Area Work Plan and Status Report | 3 | ONGOING | All input for the deliverable has been provided. Deliverable not delivered yet.
MJRA1.13 - Agreement on common information exchange methods | 3 | UNCLEAR |
MJRA1.2 - Agreement on common job submission and management methods | 6 | ONGOING | F2F and phone meetings are being held to reach an agreement.
Finalize the XML and LDAP rendering of the GLUE2 model (TO 26) | 9 | UNCLEAR |
DNA1.3.2 - Technical Development Plan | 11 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
DJRA1.1.2 - Compute Area Work Plan and Status Report | 12 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
MJRA1.14 - EMI Information exchange used in computing (TO 15) | 12 | HOLD |
MJRA1.3 - Successful implementation of the common job submission and management method (TO 14) | 18 | HOLD | Waiting for completion of MJRA1.2.
DNA1.3.3 - Technical Development Plan | 23 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
DJRA1.1.3 - Compute Area Work Plan and Status Report | 24 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
Consolidation and harmonization of compute area clients/APIs (TO 32) | 24 | ONGOING | Consists of improving libarcclient.
DJRA1.1.4 - Compute Area Work Plan and Status Report | 36 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
Extend job definition languages, resource modeling schema (GLUE) and job management services to be able to request access to virtualized resource managers and appliances (TO 43) | 36 | UNCLEAR |

Work plan

Overview table

Description | Priority | Assignee | Comments
Improve technical documentation | 5 (Very high) | Martin | Writing documentation of the API. Also writing a guide on how to create an ACC module for libarcclient.
Create a framework for web content management | 5 (Very high) | |
Detailed performance analysis | 3 (Normal) | |
Extend the resource discovery module | 4 (High) | |
Extend unit testing | 3 (Normal) | |
Create functional tests | 4 (High) | |
Revise READMEs and man pages | 3 (Normal) | |
Improve user documentation | 3 (Normal) | |
Write libarcclient article | 3 (Normal) | Martin, Others? |
Secure functionality over transition | 5 (Very high) | All |
ArcJobTool support | 4 (High) | Martin | Support is ongoing.
P-GRADE support | 3 (Normal) | | Contact needs to be established.
Ganga support | 3 (Normal) | | Contact needs to be established.
Improve data broker | 4 (High) | Salman |
Improve libarcclient language bindings | 4 (High) | |
Bug fixing | 4 (High) | ALL | Pick a bug!
Extension to clouds | 2 (Low) | | This task might be appropriate for a student project.
Extension to remote LRMS | 2 (Low) | | This task might be appropriate for a student project.
Extension to localhost | 2 (Low) | | This task might be appropriate for a student project.
Interactive jobs | 2 (Low) | | This task might be appropriate for a student project.
Performance analysis of language bindings | 3 (Normal) | |
Extend support for job migration | 3 (Normal) | |
Support EMI ES | 1 (Very low) | | This task cannot be started before MJRA1.2 is reached. When it is reached, the priority will be increased considerably.
Harden ACCs | 2 (Low) | Gabor | Creating ACC unit tests. A framework for testing SOAP-based clients has been created.
Harden the job description translators | 2 (Low) | |
Job submission and management speed up | 3 (Normal) | |
Resource discovery and brokering speed-up using cached info | 3 (Normal) | |
Resource discovery and brokering speed-up (ISIS) | 1 (Very low) | Salman | This work can currently not be attributed to EMI since ISIS is not in EMI.

Proactive maintenance

  • Maintaining the arc* compute clients and the libarcclient
Bug fixing; see the tables in the #Open Bugs section.
  • Resource discovery and brokering speed-up
Salman worked on a solution where the ISIS holds a limited set of semi-static attributes; however, it has not been committed yet. Results were posted to the KnowARC mailing list. A continuation of this can probably not be attributed to EMI, since the ISIS is not part of EMI. Salman and Martin are working together on integrating this work.
Another way to achieve this is by caching semi-static resource information on the client side. This might be work for a bachelor project or an autonomous project in the master studies.
  • Job submission and management speed up
Multi-job submission can be sped up by using a single connection when submitting multiple jobs to a single CE, see bug 808.
Job management can probably be sped up when multiple jobs on a single CE should be managed. This should also be done by utilizing a single connection, see bug 1887.
Carry out resource discovery, matchmaking and target ranking in parallel for each target. More information on this task can be found here.
  • Harden the job description translators (xrsl, jdl, jsdl)
Unit tests should be extended to reach ~100% coverage of the parsers.
Go through the translators and verify that they behave properly, and that all attributes are supported.
Make job description translators pluggable. Done.
The JSDL parser should warn about unrecognized elements.
Support multiple XRSL job descriptions in the same file.
Support logical OR expressions in xrsl.
Extend the JobDescription class to be able to support XRSL fully.
  • Harden ACCs (middleware-specific plugins)
Create functional tests which can be used in an automated test framework, like the one deployed by Marek in KnowARC.
Create unit tests which can test the ACC modules.
Verify that every ACC is functional (develop tests), and resolve any issues.
Improve UNICORE support: the client is capable of finding resources, but cannot yet submit jobs. Also investigate the delegation procedure used in UNICORE; it seems that the service expects a certificate and key by default.
Extending arcsync to CREAM, see D2.2-1.
  • Ensure that all functionality in the pre-WS Compute client is transferred to the WS Compute client
See Secure functionality over transition
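
The idea of carrying out resource discovery, matchmaking and target ranking in parallel for each target can be illustrated with a small Python sketch. It is not libarcclient code: `query_target`, `matches` and `rank` are hypothetical stand-ins for the real discovery, matchmaking and brokering steps, and the target names and attributes are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical static data standing in for what a real query would fetch.
FAKE_INFO = {
    "ce1.example.org": {"free_slots": 4, "max_walltime": 60},
    "ce2.example.org": {"free_slots": 0, "max_walltime": 600},
    "ce3.example.org": {"free_slots": 16, "max_walltime": 600},
}


def query_target(name):
    """Stand-in for resource discovery; a real client would contact the CE."""
    return dict(FAKE_INFO[name], name=name)


def matches(info, job):
    """Stand-in for matchmaking: can this target run the job at all?"""
    return info["free_slots"] > 0 and info["max_walltime"] >= job["walltime"]


def rank(info):
    """Stand-in for a broker: more free slots gives a better rank."""
    return info["free_slots"]


def evaluate(name, job):
    """Discover, match and rank one target, independently of the others."""
    info = query_target(name)
    if not matches(info, job):
        return None
    return (rank(info), name)


def select_targets(names, job, max_workers=8):
    """Evaluate all targets concurrently; return matching names, best first."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda n: evaluate(n, job), names))
    ranked = sorted((r for r in results if r is not None), reverse=True)
    return [name for _, name in ranked]


if __name__ == "__main__":
    print(select_targets(list(FAKE_INFO), {"walltime": 120}))
```

Because each target is evaluated in its own task, one slow or unreachable target does not delay the querying of the others; only the final sorting waits for all results.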

Harmonization

  • Support EMI Execution Service (ES) interface on the client-side
The creation of the EMI ES specification is the subject of the MJRA1.2 milestone. Work on implementing the specification cannot start before that milestone has been reached. Milestone MJRA1.3 concerns this implementation. More info on the specification is located here.

Evolution

  • ArcJobTool support
Adjust libarcclient to satisfy the needs of the ArcJobTool GUI.
Support the developer team.
  • P-GRADE support
Currently P-GRADE uses the CLI of the pre-WS client to interact with ARC resources. This presentation was presented at the PUCOWO 2010 conference.
Contact P-GRADE developers to promote libarcclient and learn about their interest in libarcclient (Java bindings).
Adjust libarcclient to satisfy the needs of the P-GRADE portal.
Support the developer team.
  • Ganga support
Currently Ganga uses the CLI of the pre-WS client version 0.6 to interact with ARC resources. Ganga can probably also interact with ARC through Panda.
Contact Ganga developers to promote libarcclient and learn about their interest in libarcclient (Python bindings).
Adjust libarcclient to satisfy the needs of the Ganga tool.
Support the developer team.
  • Extend support for job migration
Currently only jobs residing on an A-REX CE can be migrated, and they can only be migrated to other A-REX CEs.
Examine whether job migration from GM, CREAM or UNICORE can be supported. If it is not possible, detailed information should be added as comments to the source code. If it is possible, or possible to a limited extent, a solution should be crafted.
Same as above, but for job migration to GM, CREAM or UNICORE.
Enable migration to also get remote files from the old cluster.
  • Extend the support for hold/resume (archold) (requires server side support)
Currently there is no support for holding jobs in ARC. It is planned that the EMI ES will include hold operations. This task is thus part of #Support EMI ES.
  • Advancing and extending the functionality of the data broker module
A dedicated page concerning data broker module improvements is located here.
  • Extend the resource discovery module
All relevant GLUE2 information for the ARC1 plug-in should be communicated to and available on the client side.
VO information should be fetched and used for match making.
  • Run job interactively (arcint) (requires server side support)
Might be work for a student project.
  • Develop a plug-in to be able to submit jobs to localhost as an "execution server"
This task should develop a plug-in which uses the local machine to run jobs. It should be investigated which features of libarcclient can be supported. Grid job testing and utilizing the local machine as an execution service are motivating factors.
  • Develop a plug-in to be able to submit jobs directly to LRMS (no middleware installed on cluster)
It should be investigated to what extent the libarcclient library is able to interact with a remote LRMS. A remote LRMS should be understood as an LRMS run remotely without any middleware installed on it, accessed through a dedicated connection. Supporting different connection protocols might require a lot of work; on the other hand, this task might benefit from the fact that the major LRMSs are already supported on the server side.
  • Develop a plug-in to be able to submit jobs to Clouds
It should be investigated to what extent the libarcclient library is able to interact with cloud computing.
After investigation, a plug-in should be developed which uses the libarcclient API/interface. This task might be work for a bachelor project, an autonomous master study project or maybe even a master project.
  • Improve libarcclient language bindings
The language bindings should be tested in general. Unit tests for these should be created. Consider using junit and pyunit.
Java bindings on Windows and Mac should be fixed. See bugs. Fixed. There are currently no issues.
  • ARC client on non-Linux platforms
Stand-alone client for Mac OS X should be created, see packaging repository.
Language bindings should be packaged for Windows (assigned to Anders).
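
The localhost item above, using the local machine as an "execution server", can be illustrated with a small Python sketch. It is not a libarcclient plug-in: `LocalJob` and `run_local_job` are hypothetical names, and the job description is reduced to an executable, its arguments and a session directory in which stdout and stderr are collected.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class LocalJob:
    """Hypothetical, heavily reduced job description."""
    executable: str
    arguments: list = field(default_factory=list)


def run_local_job(job, session_dir):
    """Run the job on the local machine inside a per-job session directory.

    stdout and stderr are written to files in session_dir, loosely mimicking
    how output is collected for a remote job. Returns the exit code.
    """
    session = Path(session_dir)
    session.mkdir(parents=True, exist_ok=True)
    with open(session / "stdout", "wb") as out, \
         open(session / "stderr", "wb") as err:
        completed = subprocess.run(
            [job.executable, *job.arguments],
            cwd=session, stdout=out, stderr=err,
        )
    return completed.returncode


if __name__ == "__main__":
    # The Python interpreter is used as a portable example executable.
    job = LocalJob(sys.executable, ["-c", "print('hello from localhost')"])
    with tempfile.TemporaryDirectory() as tmp:
        code = run_local_job(job, tmp)
        print(code, (Path(tmp) / "stdout").read_text().strip())
```

Features such as input staging, job state tracking and credential handling are deliberately left out; which of them a real plug-in could support is exactly what the task asks to investigate.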

Performance

  • Break down performance analysis of the WS computing client
A detailed performance analysis should be carried out which analyses in depth where computing time is spent; specifically, it should be able to point to specific lines in the code. The analysis should be constructed so that it can run in an automatic manner, and the results should be in a format suitable for a web page.
  • Performance analysis of libarcclient with and without language bindings
A performance analysis of the language bindings should be carried out, comparing their performance with that of libarcclient without language bindings. The analysis should be constructed so that it can run in an automatic manner, and the results should be in a format suitable for a web page.
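
An automated comparison whose results can be published on a web page could start from a harness like the one below. It is only a sketch: the two workloads are placeholders for "the same operation performed directly and through a language binding", and the function names and the JSON layout are assumptions, not part of ARC.

```python
import json
import statistics
import time


def benchmark(func, repeats=5):
    """Time func() `repeats` times; return summary statistics in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "repeats": repeats,
        "min": min(samples),
        "mean": statistics.mean(samples),
        "max": max(samples),
    }


def compare(workloads, repeats=5):
    """Benchmark each named workload and return the report as JSON text."""
    report = {name: benchmark(func, repeats) for name, func in workloads.items()}
    return json.dumps(report, indent=2, sort_keys=True)


# Placeholder workloads; a real analysis would call the client library here.
def native_call():
    sum(range(10_000))


def binding_call():
    sum(i for i in range(10_000))


if __name__ == "__main__":
    print(compare({"native": native_call, "binding": binding_call}))
```

Emitting JSON keeps the measurement step separate from presentation, so the same output could feed a web page or be archived for comparison between releases.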

Testing

Note: Testing efforts should be coordinated with the SA2.5 task, more specifically the ARC partner UPJS (Marek Kocan). Marek also has testing experience from the KnowARC project. The ARC1/Testing page might contain valuable information. The EMI twiki page EmiSa2CertTestGuidelines contains EMI testing guidelines and test definitions.

  • Extend unit testing
Unit testing of libarcclient should be greatly improved. It should be investigated which classes in libarcclient are suitable for unit testing, and unit tests should be created for them. The test coverage of the classes should be as high as possible without changing the code of the classes; see Test Coverage for more on test coverage.
  • Create functional tests
Functional tests should be created which can be run in an automated manner. It should be examined whether the virtual testbed setup created by Daniel Johansson could be used for this purpose. Tests should be created for all the WS computing client commands, and the tests should cover all the common uses of command line options.
It should be noted that the functionality of arcresume and arcrenew was not completely ensured in the KnowARC project.
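
Automated functional tests of command-line tools typically run each command and check its exit code and output. The sketch below shows that pattern with Python's `unittest` and `subprocess` modules. To stay runnable without an ARC installation it invokes the Python interpreter as a stand-in command; in a real test bed the command line would be one of the WS computing client commands, and the strings checked would have to come from their actual output, which is not specified here.

```python
import subprocess
import sys
import unittest


def run_command(argv, timeout=60):
    """Run a command line and return (exit code, stdout, stderr) as text."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


class CommandLineTest(unittest.TestCase):
    # Stand-in commands; replace with client commands in a real test bed.
    OK_COMMAND = [sys.executable, "-c", "print('job submitted')"]
    FAILING_COMMAND = [sys.executable, "-c", "import sys; sys.exit(1)"]

    def test_success_reports_on_stdout(self):
        code, out, _ = run_command(self.OK_COMMAND)
        self.assertEqual(code, 0)
        self.assertIn("job submitted", out)

    def test_failure_gives_nonzero_exit_code(self):
        code, _, _ = run_command(self.FAILING_COMMAND)
        self.assertNotEqual(code, 0)


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CommandLineTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

One test case per command and per common option combination would give the coverage asked for above, and the runner's result can be collected by whatever automated framework is chosen.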

Documentation and Promotion

  • Create a framework for web content management
A framework for managing web content should be set up, in order to present documentation and similar material on the web. The content includes documentation, performance results, test coverage, code examples, tutorials, training material etc.
  • Update README files and man pages
The READMEs and manual pages should be reviewed and updated. A lot of different READMEs exist (packaging READMEs).
  • User documentation
Review and improve user manual.
Create a web page containing user documentation. Also a page for first time users should be created.
  • Technical documentation
Technical documentation already exists for libarcclient. However, it should be improved and extended. The documentation should preferably be written in a way which makes it usable both as a document (PDF) and as a web page.
A web page containing technical documentation should be created. A general layout should be used, see #Create a general Web layout above.
All classes in libarcclient should be documented with doxygen-style comments.
A web page containing API documentation generated from the doxygen-style comments should be created. It should also explain how each class is used from Python and Java, if different from C++.
Tutorials and code examples for using the libarcclient API should be created and presented in a web page. Python examples can be found here, and Java here.
  • Write libarcclient article
An article about the libarcclient should be written to promote its capabilities.
The article should include a comparison of the performance of the WS computing client against the pre-WS computing client. A performance comparison of the two has already been carried out by Marek in the KnowARC project. That analysis should be examined, and if it is not found sufficient for the article it should be extended.

Open Bugs

New and reopened

ID | P | Version | Summary | Component | Assignee
2828 | P3 | 2.0.0rc4 | status job Failed for the cluster type https | User Interface | Martin Skou Andersen


Blockers and criticals

no bugzilla tickets were found


General

ID | P | Severity | Version | Summary (50 tasks) | Component | Assignee
2822 | P3 | major | 2.0.0rc4 | Can not retrieve multiple jobs (LDAP server is reported not accessible) | User Interface | Martin Skou Andersen
2787 | P3 | major | 11.05 | feature request for updating status of ExecutionTargets | ARClib | Martin Skou Andersen
1951 | P3 | major | 11.05 | Outdated CRLs render all client tools useless | ARClib | Martin Skou Andersen
2193 | P4 | minor | 0.9.4 | arcsub -c should print error when it fails | User Interface | Martin Skou Andersen
804 | P3 | minor | NOX | Insufficient client error messages when job information is not found | ARClib | Martin Skou Andersen
2679 | P3 | minor | 11.05 | Interrupting arcsub can wipe jobs.xml | User Interface | Ivan Marton
2356 | P3 | normal | 1.0.0b4 | bad element generated from the ProcessCountLimit by JSDL | ARClib | Martin Skou Andersen
2798 | P3 | normal | 2.0.0rc4 | Interface prefixes to -c option (ARC0: etc) are no longer supported | User Interface | Martin Skou Andersen
2459 | P3 | normal | 11.05 | arcsync not removing expired jobs | ARClib | Martin Skou Andersen
2820 | P3 | normal | 11.05 | Got segmentation fault from arccp | User Interface | David Cameron
2608 | P3 | normal | 11.05 | RunTest fails on Debian kfreebsd-amd64 and Debian kfreebsd-i386 | ARClib | Aleksandr Konstantinov
2662 | P3 | normal | 11.05 | Invalid proxies accepted (both 1.0.1 and 1.1.0) | ARClib | Aleksandr Konstantinov
2831 | P3 | normal | 11.05 | brokername=FastestQueue does not work in client.conf | User Interface | Martin Skou Andersen
2786 | P3 | normal | 11.05 | arc.Endpoint does not work when using 'org.nordugrid.ldapglue2' | ARClib | Martin Skou Andersen
2830 | P3 | normal | 11.05 | uncaught exception "Thread creation in ParallelLdapQueries failed" aborts program | ARClib | Aleksandr Konstantinov
1941 | P3 | normal | NOX | can't submit jobs with arcjobtool | User Interface | Martin Skou Andersen
2019 | P3 | normal | 0.8.2.2 | arc* tools treat absence of CA keys inconsistently | User Interface | weizhong qiang
2030 | P3 | normal | 0.8.2.2 | walltime not set in PBS jobs with ARC 0.8.2.2 client | User Interface | Ivan Marton
2063 | P3 | normal | NOX | arcget cleans jobs even if it can't download them | User Interface | Mattias Ellert
2162 | P3 | normal | 0.8.3.1 | Server responded: Failed to allocate port for data transfer | User Interface | Mattias Ellert
2215 | P3 | normal | NOX | arc* commands do not work without a proxy? | User Interface | Aleksandr Konstantinov
2424 | P3 | normal | 11.05 | arcsub prints ERRORs instead of WARNINGs | User Interface | Ivan Marton
2428 | P3 | normal | 11.05 | arcsub gives different result when using X509_CERT_DIR variable and when using cacertificatesdirectory in client.conf | User Interface | weizhong qiang
2550 | P3 | normal | SVN | arcproxy and voms configuration | User Interface | weizhong qiang
849 | P3 | normal | 0.9.4 | cpuTime wallTime for parallel jobs | ARClib | Martin Skou Andersen
1605 | P3 | normal | 0.9 | Publish UserDomain in client tools | ARClib | Martin Skou Andersen
2560 | P3 | normal | 11.05 | Unhandled credentials lifetime for MyProxy PUT operation | User Interface | weizhong qiang
1755 | P3 | normal | 0.9 | CREAM: Unable to clean when running arcget and arckill | ARClib | Mattias Ellert
2622 | P3 | normal | 11.05 | arcsub gives ERROR: Failed to connect to grid.uio.no(IPv4):443 | User Interface | Ivan Marton
2321 | P3 | normal | 1.0.0b4 | ERROR: Failed cancelling job - when trying to kill job submitted to Cream | ARClib | Martin Skou Andersen
2647 | P3 | normal | 11.05 | arcget, arcstat -c do not accept long format of resource description | User Interface | Martin Skou Andersen
2327 | P3 | normal | 1.0.0b4 | arcsub always empty AccessControl element generate from xrsl | ARClib | Martin Skou Andersen
2658 | P3 | normal | 11.05 | arcsub hangs when submitting many jobs | User Interface | Martin Skou Andersen
2334 | P3 | normal | 1.0.0b4 | bad element generated from the InputSandbox by JDL | ARClib | Martin Skou Andersen
2663 | P3 | normal | 11.05 | arcproxy not using default locations of user certificates (Windows) | User Interface | Aleksandr Konstantinov
2336 | P3 | normal | 1.0.0b4 | no output generated if InputSandboxBaseURI is specified in JDL | ARClib | Martin Skou Andersen
2344 | P3 | normal | 1.0.0b4 | no output generated if CandidateHosts/HostName is specified in JSDL | ARClib | Martin Skou Andersen
2682 | P3 | normal | SVN | Misleading error message using several VOMSes when server shortened VOMS AC validity time | User Interface | weizhong qiang
2345 | P3 | normal | 1.0.0b4 | no output generated if FileSystem/DiskSpace is specified in JSDL | ARClib | Martin Skou Andersen
2770 | P3 | normal | 11.05 | arcinfo 2.0.0rc3 returns Health State: (empty) from ARC1 CE | User Interface | Martin Skou Andersen
2352 | P3 | normal | 1.0.0b4 | no output generated if IndividualCPUCount, IndividualPhysicalMemory... are specified in JSDL | ARClib | Martin Skou Andersen
2772 | P3 | normal | 11.05 | arccat doesn't show "stdout from job <JOBID>" line | User Interface | Martin Skou Andersen
2354 | P3 | normal | 1.0.0b4 | bad element generated from the WallTimeLimit by JSDL | ARClib | Martin Skou Andersen
2773 | P3 | normal | 11.05 | Job submission failed arcsub | User Interface | Martin Skou Andersen
2355 | P3 | normal | 1.0.0b4 | bad element generated from the CPUTimeLimit by JSDL | ARClib | Martin Skou Andersen
2795 | P3 | normal | 2.0.0rc4 | arcclean -s DELETED does not clean deleted jobs | User Interface | Martin Skou Andersen
868 | P3 | trivial | 11.05 | Incorrect error message from arcsub when queue information is unavailable | User Interface | Mattias Ellert
1024 | P5 | trivial | 0.8.2.2 | Double-counting re-tries with bulk ngget | User Interface | Mattias Ellert
2415 | P3 | trivial | 1.0.0b5 | arcinfo man page unclear on authorization | User Interface | Martin Skou Andersen
2620 | P3 | trivial | 11.05 | arcsub prints WARNING instead of ERROR when CA keys not found | User Interface | Aleksandr Konstantinov


Enhancements

ID | P | Version | Summary (9 tasks) | Component | Assignee
2279 | P3 | 1.0.0b1 | Helpful error message in case of absent plugin | ARClib | Martin Skou Andersen
810 | P3 | 0.8.2.2 | Bad sites are repeatedly tried for multi-job job-submission | User Interface | Balazs Konya
1279 | P3 | 0.9.1 | arcsub for ARC0 error message in case of absent globus plugins should be improved | User Interface | Martin Skou Andersen
1673 | P3 | 0.8.2.2 | arcstat slows down with large job history | User Interface | Ivan Marton
2492 | P3 | 11.05 | Feature request for arctest -R | User Interface | Martin Skou Andersen
2627 | P3 | 11.05 | arcsub performance is good, better than ngsub provided the flavour is specified (e.g.-c ARC0:....) | User Interface | Ivan Marton
2643 | P3 | 11.05 | Better error message needed when non-RFC proxy is used with HED | User Interface | Aleksandr Konstantinov
2673 | P3 | SVN | arcproxy should support batch query of validity | User Interface | weizhong qiang
1532 | P4 | 0.8.2.2 | add preferred SURL option | User Interface | David Cameron


Feature requests

ID | P | Version | Summary (5 tasks) | Component | Assignee
1928 | P3 | SVN | Client tools should check proxy validity | ARClib | Martin Skou Andersen
2038 | P3 | NOX | Flag for doing middleware specific submission | User Interface | Mattias Ellert
2545 | P3 | 11.05 | VOMS AC for MyProxy credentials retrieval | User Interface | weizhong qiang
2829 | P3 | SVN | Feature request: Support for RESTful VOMS interface in arcproxy | User Interface | weizhong qiang
1463 | P4 | 0.8.2.2 | arccat: specification of arbitrary file name to return? | User Interface | Mattias Ellert

Minutes

Regular meetings in the ARC Compute Clients PT are taking place. Below you will find the minutes of these meetings:

KnowARC remnants

Command line tools status

Command ARC0 ARC1 CREAM UNICORE
arcsub Ok Ok Ok Job submission but no staging
arcstat Ok Ok Ok Ok
arcinfo Ok Ok Ok Ok
arcget Ok Ok Ok No
arcclean Ok Ok Ok No
arckill Ok Ok Ok Ok
arccat Ok Ok No No
arcresub Ok Ok Ok No
arcmigrate No Ok No No
arcrenew Ok No No No
arcsync Ok Ok No No
arcresume Ok No No No
arctest No No No No
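
For automated checks, for instance to decide which functional tests apply to which middleware, the status table above can be encoded as data. The sketch below transcribes the table into a Python dictionary; the helper names `status` and `fully_supported` are illustrative only.

```python
# Column order of the status table.
FLAVOURS = ("ARC0", "ARC1", "CREAM", "UNICORE")

# One entry per command, values in FLAVOURS order, transcribed from the table.
STATUS = {
    "arcsub":     ("Ok", "Ok", "Ok", "Job submission but no staging"),
    "arcstat":    ("Ok", "Ok", "Ok", "Ok"),
    "arcinfo":    ("Ok", "Ok", "Ok", "Ok"),
    "arcget":     ("Ok", "Ok", "Ok", "No"),
    "arcclean":   ("Ok", "Ok", "Ok", "No"),
    "arckill":    ("Ok", "Ok", "Ok", "Ok"),
    "arccat":     ("Ok", "Ok", "No", "No"),
    "arcresub":   ("Ok", "Ok", "Ok", "No"),
    "arcmigrate": ("No", "Ok", "No", "No"),
    "arcrenew":   ("Ok", "No", "No", "No"),
    "arcsync":    ("Ok", "Ok", "No", "No"),
    "arcresume":  ("Ok", "No", "No", "No"),
    "arctest":    ("No", "No", "No", "No"),
}


def status(command, flavour):
    """Return the table cell for a command and middleware flavour."""
    return STATUS[command][FLAVOURS.index(flavour)]


def fully_supported(flavour):
    """List the commands marked exactly "Ok" for the given flavour."""
    return [cmd for cmd in STATUS if status(cmd, flavour) == "Ok"]


if __name__ == "__main__":
    for flavour in FLAVOURS:
        print(flavour, fully_supported(flavour))
```

Partial entries such as the UNICORE cell for arcsub are kept as free text and are therefore not counted as fully supported.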

A list of all the command line options used in the client commands can be found at ARC_Compute_Clients/Command_line_options.

Documents
