This wiki is obsolete, see the NorduGrid web pages for up to date information.

ARC Compute Clients


This page concerns the "ARC Compute Clients" product in EMI.

The EMI project will last for 3 years, from May 2010 to April 2013, and will bring together the 3 major European middlewares: ARC, gLite and UNICORE. Within the project, consolidation and harmonization of the 3 middlewares will be carried out, but there will also be room for proactive maintenance and evolution (development).

On this page and its sub-pages, information about proactive maintenance, harmonization, evolution, testing and analysis related to the ARC Compute Clients product is gathered. This page contains a component list, technical objectives, a work plan and tables of open bugs. If you have questions, comments or feedback, please use the Discussion page.

Components

The product consists of all the computing-related user command line tools in ARC and the libarcclient library. These are listed below:

  • Pre-WS computing client
    • ngsub - Submission
    • ngresub - Resubmission
    • ngstat - Job and resource querying
    • ngget - Output retrieval
    • ngcat - Output and log catenation
    • ngkill - Kill job
    • ngrenew - Credential renewal
    • ngresume - Resume job from a failed state
    • ngsync - Job list synchronization
    • ngtest - Test client- or server setup
    • ngclean - Job cleaning
  • WS computing client
    • arcsub - Submission
    • arcresub - Resubmission
    • arcmigrate - Migration
    • arcstat - Job querying
    • arcinfo - Resource querying
    • arcget - Output retrieval
    • arccat - Output and log catenation
    • arckill - Kill job
    • arcrenew - Credential renewal
    • arcresume - Resume job from a failed state
    • arcsync - Job list synchronization
    • arcclean - Job cleaning
  • libarcclient
    • Library used by the WS computing client

Milestones, Deliverables and Technical objectives

The following milestones, deliverables and technical objectives are related to the ARC Compute Clients product. The milestones and deliverables are collected from the EMI DoW, while technical objectives (TO) are taken from DNA1.3.1, more specifically from the table in section 5.2. A list of all the EMI milestones and their status can be found here, while information about deliverables is here.

Name | Due month | State | Comments
DNA1.3.1 – Technical Development Plan | 2 | ONGOING | All input for the deliverable has been provided. Deliverable not delivered yet.
DJRA1.1.1 – Compute Area Work Plan and Status Report | 3 | ONGOING | All input for the deliverable has been provided. Deliverable not delivered yet.
MJRA1.13 - Agreement on common information exchange methods | 3 | UNCLEAR |
MJRA1.2 - Agreement on common job submission and management methods | 6 | ONGOING | F2F and phone meetings are being held to reach an agreement
Finalize the XML and LDAP rendering of the GLUE2 model (TO 26) | 9 | UNCLEAR |
DNA1.3.2 - Technical Development Plan | 11 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
DJRA1.1.2 - Compute Area Work Plan and Status Report | 12 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
MJRA1.14 - EMI Information exchange used in computing (TO 15) | 12 | HOLD |
MJRA1.3 - Successful implementation of the common job submission and management method (TO 14) | 18 | HOLD | Waiting for completion of MJRA1.2
DNA1.3.3 - Technical Development Plan | 23 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
DJRA1.1.3 - Compute Area Work Plan and Status Report | 24 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
Consolidation and harmonization of compute area clients/APIs (TO 32) | 24 | ONGOING | Consists of improving libarcclient
DJRA1.1.4 - Compute Area Work Plan and Status Report | 36 | HOLD | Input for this deliverable should be provided approximately 1 month before due date.
Extend job definition languages, resource modeling schema (GLUE) and job management services to be able to request access to virtualized resource managers and appliances (TO 43) | 36 | UNCLEAR |

Work plan

Overview table

Description | Priority | Assignee | Comments
Improve technical documentation | 5 (Very high) | Martin | Writing documentation of the API. Also writing a guide on how to create an ACC module for libarcclient.
Create a framework for web content management | 5 (Very high) | |
Detailed performance analysis | 3 (Normal) | |
Extend the resource discovery module | 4 (High) | |
Extend unit testing | 3 (Normal) | |
Create functional tests | 4 (High) | |
Revise READMEs and man pages | 3 (Normal) | |
Improve user documentation | 3 (Normal) | |
Write libarcclient article | 3 (Normal) | Martin, Others? |
Secure functionality over transition | 5 (Very high) | All |
ArcJobTool support | 4 (High) | Martin | Support is ongoing.
P-GRADE support | 3 (Normal) | | Contact needs to be established.
Ganga support | 3 (Normal) | | Contact needs to be established.
Improve data broker | 4 (High) | Salman |
Improve libarcclient language bindings | 4 (High) | |
Bug fixing | 4 (High) | ALL | Pick a bug!
Extension to clouds | 2 (Low) | | This task might be appropriate for a student project.
Extension to remote LRMS | 2 (Low) | | This task might be appropriate for a student project.
Extension to localhost | 2 (Low) | | This task might be appropriate for a student project.
Interactive jobs | 2 (Low) | | This task might be appropriate for a student project.
Performance analysis of language bindings | 3 (Normal) | |
Extend support for job migration | 3 (Normal) | |
Support EMI ES | 1 (Very low) | | This task cannot be started before MJRA1.2 is reached. Once it is reached, the priority will be raised significantly.
Harden ACCs | 2 (Low) | Gabor | Creating ACC unit tests. A framework for testing SOAP-based clients has been created.
Harden the job description translators | 2 (Low) | |
Job submission and management speed up | 3 (Normal) | |
Resource discovery and brokering speed-up using cached info | 3 (Normal) | |
Resource discovery and brokering speed-up (ISIS) | 1 (Very low) | Salman | This work cannot currently be attributed to EMI since ISIS is not in EMI.

Proactive maintenance

  • Maintaining the arc* compute clients and the libarcclient
Bug fixing; see the tables in the Open Bugs section.
  • Resource discovery and brokering speed-up
Salman worked on a solution where the ISIS holds a limited set of semi-static attributes; however, it has not been committed yet. Results were posted to the KnowARC mailing list. A continuation of this can probably not be attributed to EMI, since the ISIS is not part of EMI. Salman and Martin are working together on integrating this work.
Another way to achieve this is to cache semi-static resource information on the client side. This might be work for a bachelor project or an autonomous project in the master studies.
  • Job submission and management speed up
Multi-job submission can be sped up by using a single connection when submitting multiple jobs to a single CE, see bug 808.
Job management can probably be sped up when multiple jobs on a single CE should be managed. This should also be done by utilizing a single connection, see bug 1887.
Carry out resource discovery, matchmaking and target ranking in parallel for each target. More information on this task can be found here.
  • Harden the job description translators (xrsl, jdl, jsdl)
Unit tests should be extended to reach ~100% coverage of the parsers.
Go through the translators and verify that they behave properly, and that all attributes are supported.
Make job description translators pluggable. Done.
The JSDL parser should warn about unrecognized elements.
Support for multiple XRSL in the same file.
Support logical OR expressions in xrsl.
Extend the JobDescription class to be able to support XRSL fully.
  • Harden ACCs (middleware-specific plugins)
Create functional tests which can be used in an automated test framework, like the one deployed by Marek in KnowARC.
Create unit tests which can test the ACC modules.
Verify that every ACC is functional (develop tests), and resolve any issues.
Improve UNICORE support: the client is capable of finding resources, but cannot yet submit jobs. Also investigate the delegation procedure used in UNICORE; it seems that the service expects a certificate and key by default.
Extending arcsync to CREAM, see D2.2-1.
  • Ensure that all functionality in the pre-WS Compute client is transferred to the WS Compute client
See Secure functionality over transition
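
The single-connection idea behind bugs 808 and 1887 can be illustrated with a minimal Python sketch. It is not libarcclient code: the `FakeConnection` class, the function names and the example job ID URLs are all hypothetical, and it assumes that a job ID is a URL whose host and port identify the CE.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_jobs_by_ce(job_ids):
    """Group job ID URLs by the host:port of the CE they belong to."""
    groups = defaultdict(list)
    for job_id in job_ids:
        groups[urlparse(job_id).netloc].append(job_id)
    return dict(groups)


class FakeConnection:
    """Hypothetical stand-in for a real CE connection; counts how many were opened."""

    opened = 0

    def __init__(self, endpoint):
        FakeConnection.opened += 1
        self.endpoint = endpoint

    def query(self, job_id):
        # A real client would send a status request over the open connection.
        return (self.endpoint, job_id)


def manage_jobs(job_ids, connect=FakeConnection):
    """Handle all jobs, opening exactly one connection per CE."""
    results = []
    for endpoint, jobs in group_jobs_by_ce(job_ids).items():
        connection = connect(endpoint)  # reused for every job on this CE
        results.extend(connection.query(job_id) for job_id in jobs)
    return results
```

With three jobs spread over two CEs, `manage_jobs` opens two connections rather than three; the saving grows with the number of jobs per CE.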

Harmonization

  • Support EMI Execution Service (ES) interface on the client-side
The creation of the EMI ES specification is the subject of the MJRA1.2 milestone. Work on implementing the specification cannot start before the milestone has been reached. Milestone MJRA1.3 concerns this implementation. More info on the specification is located here.

Evolution

  • ArcJobTool support
Adjust libarcclient to satisfy the needs of the ArcJobTool GUI.
Support the developer team.
  • P-GRADE support
Currently P-GRADE uses the CLI of the pre-WS client to interact with ARC resources. This presentation was given at the PUCOWO 2010 conference.
Contact P-GRADE developers to promote libarcclient and learn about their interest in libarcclient (Java bindings).
Adjust libarcclient to satisfy the needs of the P-GRADE portal.
Support the developer team.
  • Ganga support
Currently Ganga uses the CLI of the pre-WS client version 0.6 to interact with ARC resources. Ganga can probably also interact with ARC through Panda.
Contact Ganga developers to promote libarcclient and learn about their interest in libarcclient (Python bindings).
Adjust libarcclient to satisfy the needs of the Ganga tool.
Support the developer team.
  • Extend support for job migration
Currently only jobs residing on an A-REX CE can be migrated, and they can only be migrated to other A-REX CEs.
Examine whether job migration from GM, CREAM or UNICORE can be supported. If it is not possible, detailed information should be added as comments to the source code. If it is possible, or possible to a limited extent, a solution should be crafted.
Same as above, but for job migration to GM, CREAM or UNICORE.
Enable migration to also fetch remote files from the old cluster.
  • Extend the support for hold/resume (archold) (requires server side support)
Currently there is no support for holding jobs in ARC. It is planned that the EMI ES will include hold operations. This task is thus part of #Support EMI ES.
  • Advancing and extending the functionality of the data broker module
A dedicated page concerning data broker module improvements is located here.
  • Extend the resource discovery module
All relevant GLUE2 information for the ARC1 plug-in should be communicated to and available on the client side.
VO information should be fetched and used for match making.
  • Run job interactively (arcint) (requires server side support)
Might be work for a student project.
  • Develop a plug-in to be able to submit jobs to localhost as an "execution server"
This task should develop a plug-in which uses the local machine to run jobs. It should be investigated which features of libarcclient can be supported. Grid job testing and use of the local machine as an execution service are the motivating factors.
  • Develop a plug-in to be able to submit jobs directly to LRMS (no middleware installed on cluster)
It should be investigated to what extent the libarcclient library is able to interact with a remote LRMS. A remote LRMS should be understood as an LRMS running remotely without any middleware installed, accessed through a dedicated connection. Supporting different connection protocols might require a lot of work; on the other hand, this task might benefit from the fact that the major LRMSs are already supported on the server side.
  • Develop a plug-in to be able to submit jobs to Clouds
It should be investigated to what extent the libarcclient library is able to interact with cloud computing.
After investigation, a plug-in should be developed which uses the libarcclient API/interface. This task might be work for a bachelor project, an autonomous master study project or maybe even a master project.
  • Improve libarcclient language bindings
The language bindings should be tested in general. Unit tests for these should be created. Consider using junit and pyunit.
Java bindings on Windows and Mac should be fixed. See bugs. Fixed. There are currently no issues.
  • ARC client on non-linux platforms
Stand-alone client for Mac OS X should be created, see packaging repository.
Language bindings should be packaged for Windows (assigned to Anders).
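
As a rough illustration of the localhost "execution server" idea above, the following Python sketch runs a job on the local machine in a fresh session directory. It does not use the libarcclient plug-in interface; the function name `run_local_job`, its parameters and the session directory layout are assumptions made for this example only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_local_job(executable, arguments=(), stdout_name="stdout.txt"):
    """Run a job locally in its own session directory.

    Returns the exit code and the session directory holding the output.
    """
    session_dir = Path(tempfile.mkdtemp(prefix="localjob-"))
    with open(session_dir / stdout_name, "w") as out:
        completed = subprocess.run(
            [executable, *arguments],
            cwd=session_dir,           # the job sees the session dir as its cwd
            stdout=out,
            stderr=subprocess.STDOUT,  # merge stderr into the same log file
        )
    return completed.returncode, session_dir


if __name__ == "__main__":
    # The Python interpreter itself serves as a portable test executable.
    code, directory = run_local_job(sys.executable, ["-c", "print('hello grid')"])
    print(code, directory)
```

A real plug-in would additionally need to cover input staging, job state tracking and cleaning, which is the part of the investigation the task describes.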

Performance

  • Break down performance analysis of the WS computing client
A detailed performance analysis should be carried out, which analyses in depth where computing time is spent; specifically, it should be able to point to specific lines in the code. The analysis should be constructed so that it can run automatically, and the results should be in a format suitable for a web page.
  • Performance analysis of libarcclient with and without language bindings
A performance analysis of the language bindings should be carried out, comparing their performance with that of libarcclient without language bindings. The analysis should be constructed so that it can run automatically, and the results should be in a format suitable for a web page.
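
One possible shape for an automated analysis of code driven through the Python bindings is sketched below, using only the standard `cProfile` and `pstats` modules. The helper name `profile_to_json` and the JSON field names are assumptions for illustration. The profiler reports time per function together with the file and line where the function is defined, so true line-level attribution, or profiling of the C++ code itself, would need a different tool.

```python
import cProfile
import json
import pstats


def profile_to_json(func, *args, top=5):
    """Profile func(*args) and return the costliest functions as JSON text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)

    rows = []
    # Stats.stats maps (file, line, function) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (filename, lineno, name), (_, ncalls, tottime, cumtime, _) in (
        pstats.Stats(profiler).stats.items()
    ):
        rows.append({
            "file": filename,
            "line": lineno,
            "function": name,
            "calls": ncalls,
            "total_time": tottime,
            "cumulative_time": cumtime,
        })
    rows.sort(key=lambda row: row["cumulative_time"], reverse=True)
    return json.dumps(rows[:top], indent=2)


def example_workload():
    """Placeholder workload; a real run would call into the client library."""
    return sum(i * i for i in range(100000))


if __name__ == "__main__":
    print(profile_to_json(example_workload))
```

The JSON output can be written to a file by a scheduled job and rendered by a web page without further parsing.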

Testing

Note: Testing efforts should be coordinated with the SA2.5 task, more specifically the ARC partner UPJS (Marek Kocan). Marek also has testing experience from the KnowARC project. The ARC1/Testing page might contain valuable information. The EMI twiki page EmiSa2CertTestGuidelines contains EMI testing guidelines and test definitions.

  • Extend unit testing
Unit testing of libarcclient should be greatly improved. It should be investigated which classes in libarcclient are amenable to unit testing, and unit tests should be created for them. The test coverage of the classes should be as high as possible without changing the code of the class; see Test Coverage for more on test coverage.
  • Create functional tests
Functional tests should be created which can be run automatically. It should be examined whether the virtual testbed setup created by Daniel Johansson could be used for this purpose. Tests should be created for all the WS computing client commands, and the tests should cover all the common uses of command line options.
It should be noted that the functionality of the arcresume and arcrenew was not completely ensured in the KnowARC project.
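
An automated functional test of a command line tool can be as simple as running the command and checking its exit code and output. The sketch below uses Python's `unittest` and `subprocess` modules. The `COMMAND` shown is a placeholder that runs the Python interpreter, and the helper `run_command` is hypothetical; real tests would substitute actual client invocations against a testbed and assert on their documented behaviour.

```python
import subprocess
import sys
import unittest


def run_command(argv, timeout=60):
    """Run a command and capture its exit code, stdout and stderr."""
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)


class CommandSmokeTest(unittest.TestCase):
    # Placeholder command; replace with a real client invocation.
    COMMAND = [sys.executable, "-c", "print('ok')"]

    def test_exits_cleanly(self):
        result = run_command(self.COMMAND)
        self.assertEqual(result.returncode, 0)

    def test_prints_expected_output(self):
        result = run_command(self.COMMAND)
        self.assertIn("ok", result.stdout)


if __name__ == "__main__":
    unittest.main()
```

One test class per command, with one test method per common option combination, keeps failures easy to attribute.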

Documentation and Promotion

  • Create a framework for web content management
A framework for managing web content should be set up, in order to present documentation and similar material on the web. The content includes documentation, performance results, test coverage, code examples, tutorials, training material etc.
  • Update README files and man pages
The READMEs and manual pages should be reviewed and updated. A lot of different READMEs exist (packaging READMEs).
  • User documentation
Review and improve user manual.
Create a web page containing user documentation. Also a page for first time users should be created.
  • Technical documentation
Technical documentation already exists for the libarcclient. However it should be improved and extended. The documentation should preferably be written in a way which makes it applicable for a document (PDF) and a webpage.
A web page containing technical documentation should be created. A general layout should be used, see #Create a general Web layout above.
All classes in libarcclient should be attached with comments in doxygen style.
A web page containing API documentation generated from the doxygen-style comments should be created. It should also explain how each class is used from Python and Java, if different from C++.
Tutorials and code examples for using the libarcclient API should be created and presented in a web page. Python examples can be found here, and Java here.
  • Write libarcclient article
An article about the libarcclient should be written to promote its capabilities.
The article should include a comparison of the performance of the WS computing client against the pre-WS computing client. A performance analysis of the two has already been carried out by Marek in the KnowARC project. That analysis should be reviewed, and if it is not found sufficient for the article it should be extended.

Open Bugs

New and reopened

ID | P | Version | Summary (5 tasks) | Component | Assignee
4213 | P3 | latest | Storage clients are not working with tokens | User Interface | Ievgen Sliusar
4215 | P3 | unspecified | ARC 6 client can not submit to ARC 7 if -T arcrest interface option is not explisitly added | User Interface | Ievgen Sliusar
4025 | P3 | 6.12.0 | arcsync produces duplicates of emies/arcrest jobs | User Interface | Martin Skou Andersen
4218 | P3 | unspecified | ARC 7 client - still contains emies and gridftp logging info | User Interface | Ievgen Sliusar
4033 | P3 | 6.13.0 | client.conf is ignored when using ARC6 options | User Interface | Martin Skou Andersen


Blockers and criticals

no bugzilla tickets were found


General

ID | P | Severity | Version | Summary (11 tasks) | Component | Assignee
4232 | P3 | major | 6.21.1 | VOMS test fails with openssl 3.4.0 | ARClib | Aleksandr Konstantinov
3377 | P3 | minor | 13.11.1 | Unclear description of options -o, -i and -j in man pages of arcsub, acstat, arckill etc | User Interface | Martin Skou Andersen
3362 | P4 | minor | 13.11.1 | arc* tools can not lower pre-configured verbosity level | User Interface | Martin Skou Andersen
3661 | P3 | normal | latest | job submitted even if walltime exceeds the allowed value | User Interface | Martin Skou Andersen
4226 | P3 | normal | 7.0.0a2 | ARC7 client suggests to install globus plugins talking to ARC6 server | User Interface | Ievgen Sliusar
3361 | P3 | normal | 13.11.1 | Broken interpretation of executable location and variables | User Interface | Martin Skou Andersen
3472 | P3 | normal | 15.03 | arctest does not print "plugin missing" error for org.nordugrid.gridftpjob interface submission | User Interface | Martin Skou Andersen
3475 | P3 | normal | 15.03 | arcsub doesn't work with memory in job description | User Interface | Martin Skou Andersen
4022 | P3 | normal | latest | python submission script does not exit, hangs in Arc::RunPump::Remove(Arc::Run*) | ARClib | Aleksandr Konstantinov
3370 | P3 | normal | 13.11.1 | arcsync finds no jobs | User Interface | Martin Skou Andersen
868 | P3 | trivial | 11.05 | Incorrect error message from arcsub when queue information is unavailable | ARClib | Florido Paganelli


Enhancements

ID | P | Version | Summary (9 tasks) | Component | Assignee
2627 | P3 | 11.05 | Increase job submission speed with arcsub | User Interface | Martin Skou Andersen
3364 | P3 | 13.11.1 | Better message needed when job can not be found | User Interface | Martin Skou Andersen
2673 | P3 | SVN | arcproxy should support batch query of validity | User Interface | weizhong qiang
3388 | P3 | SVN | CE-Parallel Job Query for arcstat | User Interface | Martin Skou Andersen
810 | P3 | 0.8.2.2 | Bad sites are repeatedly tried for multi-job job-submission | User Interface | Martin Skou Andersen
2922 | P3 | 12.05 | All job management tools should report missing plugins | User Interface | Martin Skou Andersen
3521 | P3 | 15.03 | Provide API to get the reason for failed matchmaking | User Interface | Martin Skou Andersen
3335 | P3 | 13.11 | Client does not provide local input file size and checksum for EMI-ES | User Interface | Martin Skou Andersen
1532 | P4 | 0.8.2.2 | add preferred SURL option | User Interface | David Cameron


Feature requests

no bugzilla tickets were found

Minutes

Regular meetings of the ARC Compute Clients PT are taking place. Below you will find the minutes of these meetings:

KnowARC remnants

Command line tools status

Command | ARC0 | ARC1 | CREAM | UNICORE
arcsub | Ok | Ok | Ok | Job submission but no staging
arcstat | Ok | Ok | Ok | Ok
arcinfo | Ok | Ok | Ok | Ok
arcget | Ok | Ok | Ok | No
arcclean | Ok | Ok | Ok | No
arckill | Ok | Ok | Ok | Ok
arccat | Ok | Ok | No | No
arcresub | Ok | Ok | Ok | No
arcmigrate | No | Ok | No | No
arcrenew | Ok | No | No | No
arcsync | Ok | Ok | No | No
arcresume | Ok | No | No | No
arctest | No | No | No | No

A list of all the command line options used in the client commands can be found at ARC_Compute_Clients/Command_line_options.

Documents