EMI Development/arc-emi-pt-20100618

ARC EMI development plans

June 18 2010, 10:00 - 15:45

Present: Aleksandr, Martin Skou, Zsombor, Jozef, Balazs, Mattias, Anders (chat only), Oxana

Collect bulleted items for product developments in the NorduGrid Wiki (https://wiki.nordugrid.org/index.php/Roadmap); extended text should be sent to EMI. Zsombor asks whether a list of people is needed; Balazs says no.

Clarification: proactive maintenance does not mean bug fixing, but rather, e.g., architectural improvements or fixing something really bad.

ARC Classic SE

The only component is GridFTP. No proactive maintenance, only regular support activity until the component is harmonised out. If Globus releases a new version of the GridFTP library before Classic SE is phased out, we may want to adjust to it. In all likelihood, however, Classic SE will be phased out sooner, and no proactive maintenance or development will be needed. There is no known need for standalone pure GridFTP servers without an SRM front-end, except for Marko's recent effort.

ARC Security Utilities

Components: update-crl, nordugridmap, arcproxy. There are a few more prototypes by Weizhong, but there is a feeling they shouldn't be exposed to EMI. No proactive maintenance is foreseen. Harmonisation: update-crl is being phased out in favor of fetch-crl, which is based on the old EDG utility by Fabio Hernandez (EU Datagrid license). arcproxy and nordugridmap will be proposed as common EMI tools. arcproxy needs more development: improved ARGUS, SAML and MyProxy support. ARGUS integration is also needed for nordugridmap.

ARC LDAP-based Infosys

Components: local LDAP, EGIIS, infoproviders, Grid Monitor. Pro-active support following BDII updates. It is not known, however, whether the latest version is stable enough; EGIIS needs stress tests. Grid Monitor is not expected to have any proactive maintenance. Infoproviders are not going to be improved either. Harmonisation: all components must support the GLUE2 schema. ARC lacks site-level information aggregation as in gLite; this needs to be harmonised. There is nothing to harmonise in infoproviders. Evolution: for infoproviders, we need to add support for virtualized and elastic systems, as well as for parallel needs and virtualization. A new monitor is needed that works with both LDAP-based and WS-based infosys. Privacy and access control in the LDAP-based information system are a tricky issue; Globus' hack comes with many penalties. However, the WS-based infosys will have access control, and the WS-based part of the monitor must be able to support it.

ARC Container

Components: HED and all its libraries. Aleksandr also mentions the information library, ALIS, the security framework and language bindings. Pro-active maintenance: performance is to be enhanced and interfaces are to be cleaned up. The "Easy configuration" task should be finished. The security framework needs a massive clean-up. Java language bindings must be made useful. Nothing needs harmonisation in the container. Evolution: ALIS and LIDI need multiple improvements; list by Aleksandr:

  • Efficient selective query implementation (GLUE document is huge and largely redundant); Balazs and Aleksandr spent 12 minutes agreeing on the following sentence: implement information port/operations and a model of an efficient information interface
  • Integration of privacy filtering and security (access control)
  • Reduction of information flow
  • Multiple interfaces
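The selective query item can be illustrated with a minimal sketch: a client asks for one small part of a large information document instead of retrieving all of it. The XML document, its element names and the `select` helper below are hypothetical and heavily simplified; they follow neither the GLUE2 schema nor any existing ARC interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified information document; element and
# attribute names are illustrative and do not follow the GLUE2 schema.
INFO_DOC = """
<Info>
  <Service name="ce01">
    <Queue name="short"><FreeSlots>12</FreeSlots></Queue>
    <Queue name="long"><FreeSlots>0</FreeSlots></Queue>
  </Service>
  <Service name="ce02">
    <Queue name="short"><FreeSlots>3</FreeSlots></Queue>
  </Service>
</Info>
"""


def select(doc_text, path):
    """Return the text of every element matching an ElementTree path."""
    root = ET.fromstring(doc_text)
    return [element.text for element in root.findall(path)]


# Ask only for the free slots of the queues named "short".
free_short = select(INFO_DOC, "./Service/Queue[@name='short']/FreeSlots")
```

In this sketch the filtering happens next to the document, so only the matching values would need to travel to the client.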

Security development will mostly focus on integration with other solutions; as soon as the integration tasks are defined, it will become clear which components are involved. Zsombor adds that Shibboleth integration will certainly be needed. Language bindings: good Java support is needed (on the same level as Python); unfortunately, Ferenc's expertise is not available. Bindings must be supported on all platforms, including Windows (currently Python bindings don't work on Windows).

ARC Data Clients (should be called ARC Data Libraries)

Components: ng* data clients, arc* data clients, libarcdata2, DMCs (Data Management Components). ng* clients are being phased out (harmonisation activity); the new clients will need stress tests and pro-active maintenance to reach maturity. libarcdata2 is also used by uploaders and downloaders. Further harmonisation: the current SRM2.2 will be simplified and disambiguated (SRM2.2-emi); following such changes, the respective data clients will need to be harmonised. The ARC approach to data libraries (libarcdata2) should be promoted as a general-purpose EMI data library approach (we actually do use some gLite libraries, such as the LFC ones). Emphasis is on portability (different platforms), modularity and extensibility through plugins. In the end, the aim is one single EMI data library. Further harmonisation will be needed if the security infrastructure changes (GSI, delegation). If the LFC catalogue is re-engineered, the corresponding changes on the client side will have to be implemented as well. NFS4.1 is being widely discussed, but it probably does not require any new development; to be investigated. Client-side support for cloud storage needs more investigation too. Of the new protocols, support for the xroot protocol may have to be added.

ARC Compute Clients

Components: ng* and arc* compute CLIs, libarcclient. There was some discussion about whether language bindings also belong here or only in HED. The availability of client libraries in Python and Java should be advertised, and the approach promoted. Balazs suggests explicitly listing all the command line tools. ng* clients will be phased out, and arc* clients will need stress testing and the respective hardening as pro-active maintenance. Performance enhancement: speeding up brokering may require changes on the server side; Salman's approach of caching some static information in ISIS might be relevant, but ISIS is not in EMI. Data-driven brokering also needs improvement (something exists, but this may be considered a new feature). Job migration needs hardening and improvements, but may also be considered new development (evolution). GUI: improvements and hardening of the existing one; one may also think of developing a data management GUI. Improvements are also needed in speeding up job submission (re-using the same channel for one server, caching authorisation and such). Hardening is needed for the job description language translators.

Harmonisation: primarily standardisation; the agreed AGU-BES interface is to be implemented on the client side, as well as the agreed JSDL and GLUE schema. Libarcclient should be promoted in EMI as a general approach to client libraries.

Evolution: GUI (arcjobtool) development (outside EMI) needs to be supported through EMI by providing the underlying libraries with the necessary functionality; a storage GUI might be a future direction (Jonas plans for it anyway), but it may not be welcomed by EMI. These libraries might be useful for other third-party clients and portals (e.g. P-Grade, GANGA); one should expect more feature requests from the customers that will have to be implemented. Job migration currently works between A-REXes only and needs to be extended to other CEs. Data broker functionality needs to be extended (see also above). The resource discovery module has to be extended, probably to support different schemas (it is unclear how this differs from the harmonisation above). Support for interactive jobs is a desirable new feature, but definitely requires server-side support. Similarly, a job hold/resume feature is needed, but requires changes on the servers. A "clever queueing" feature is needed, a kind of "fast track" for privileged VOs; again, this would need server-side support. Another useful feature is the capability to submit to the local host through the same client: as a simple background process on a workstation, or to a local batch system; to be implemented through plugins. An even more ambitious plan is to add a cloud submission plugin.
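The "fast track" idea can be sketched as a priority queue in which jobs from privileged VOs are dequeued first, while submission order is kept within each class. This is only an illustration under stated assumptions: the `FastTrackQueue` class, the VO name and the two-level priority scheme are hypothetical and do not describe any existing ARC or LRMS behaviour.

```python
import heapq
import itertools

# Hypothetical set of privileged VOs; the name is illustrative only.
PRIVILEGED_VOS = {"vo-fast"}


class FastTrackQueue:
    """Toy job queue: privileged VOs first, FIFO within each class."""

    def __init__(self):
        self._heap = []
        # Monotonic counter preserves submission order for equal priority.
        self._counter = itertools.count()

    def submit(self, job_id, vo):
        # Lower number means higher priority.
        priority = 0 if vo in PRIVILEGED_VOS else 1
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def next_job(self):
        """Remove and return the identifier of the next job to run."""
        return heapq.heappop(self._heap)[2]


queue = FastTrackQueue()
queue.submit("job-a", "vo-regular")
queue.submit("job-b", "vo-fast")
queue.submit("job-c", "vo-regular")
first = queue.next_job()
```

A real implementation would live on the server side, as the minutes note, and would also have to guard against starvation of regular jobs.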

ARC Compute Element

Advertised as one product with many components. Components: A-REX, Grid Manager, GridFTP jobplugin interface, CE-Cache, CE-staging, LRMS modules, Janitor, Jura accounting hook. The following components need hardening: A-REX, Janitor, ARC CE cache, LRMS back-ends. Harmonisation: implementation of agreed standards, phasing out GM (GridFTP jobplugin interface), removing the dependency on the Globus Toolkit. New development: data staging system, accounting hooks, support for MPI, monitoring and remote management via messaging, virtualization. Several client-side enhancements will need changes on the CE side: support for interactive jobs, job hold/resume, server-side job request validation, and maybe faster brokering and fast-track queues.