In EGI, resource usage information is collected centrally in the EGI accounting database, the so-called APEL server. Accounting records are pushed to APEL either directly from the CEs or via a site, country, or organization accounting service (e.g. an SGAS server or a country-level APEL deployment).
The collected, processed and/or aggregated accounting data stored in APEL is summarized and displayed in the accounting portal (http://accounting.egi.eu/egi.php). The accounting portal presents a homogeneous, user-friendly view of the data gathered in EGI and helps in understanding resource utilization.
ARC integration with EGI from the accounting point of view requires the generation of accounting records in the proper format and the transfer of those records to the APEL database.
ARC comes with an accounting component, called JURA, that provides an easy, flexible, non-intrusive way of sending accounting records in the proper format to the APEL server. This page gives step-by-step instructions on how to enable EGI-compatible accounting with an ARC CE using JURA. Other working APEL integration options may exist as well (e.g. indirect reporting via SGAS or using the native APEL clients), but they are not described here.
Accounting records: generation and transfer
JURA is capable of generating compute accounting records in the CAR 1.2 format (for APEL) and also in an extended OGF UR 1.0 format (required for SGAS). The generated records can optionally be kept (archived) on the CE (see the JURA documentation).
Once the records are created in the proper format, the next step is to transfer them to the APEL server. This can be done directly, using the official APEL accounting record transfer method via a message broker, or by sending the records first to an accounting aggregation server (e.g. SGAS or site SGAS). JURA supports both the direct record transfer to APEL and the indirect one via an SGAS server. Only the configuration for the direct transfer option is described below.
JURA-based direct integration with EGI APEL
The software to be installed on the computing element is a recent ARC CE installation (ARC release 13.02 update 3 or later) that comes with JURA.
The integration process is a two-phase procedure:
- First a successful testing should be carried out using the APEL test server,
- Then the production setup can be configured and enabled.
Both phases require communication with the APEL team and modification of the arc.conf of the ARC CE.
- Deploy an ARC CE with some basic functionality (e.g. a hello grid job completes successfully). Use at least the 13.02 update 3 release. The nordugrid-arc-arex package contains the JURA files, including the necessary SSM modules.
- Create a dummy GOCDB Service Endpoint entry of type gLite-APEL. Assuming you have the proper GOCDB authorization, go to https://goc.egi.eu/portal/index.php?Page_Type=New_Service_Endpoint and create a new service endpoint under the site you administer. The dummy service endpoint must have the gLite-APEL type; it only needs to contain the hostname and the host DN of the CE where JURA is installed. When you create the gLite-APEL service endpoint, DO NOT select the "Monitored" option: if the endpoint is marked as "Monitored", the org.apel.APEL-Sync Nagios probes become active, and this test will fail constantly because JURA sends no aggregation records.
- Open a GGUS ticket (e.g. "XYZ ARC CE accounting records for APEL") in which you ask the APEL team to authorize your CE to start the APEL integration process. In the ticket, explain clearly that you have an ARC CE and would like to send compute accounting records to APEL with JURA. The mandatory information from your side is the host name and the host DN of your CE. It is also useful to specify the URLs of your GOCDB site and ARC CE entries.
- Modify the arc.conf: add the following configuration block, which enables JURA to send accounting records to the TEST APEL server:
[grid-manager]
...
jobreport="APEL:http://mq.afroditi.hellasgrid.gr:6163"
jobreport_publisher="jura"
jobreport_options="archiving:/var/spool/arc/urs, topic:/queue/global.accounting.test.cpu.central, gocdb_name:<<YOUR SITE NAME>>, benchmark_type:<<HEPSPEC, Si2k or Sf2k>>, benchmark_value:<<FLOAT VALUE>>, benchmark_description:<<OPTIONAL DESCRIPTION>>, use_ssl:false"
- Check the status
Ask the APEL team (via the same GGUS ticket) whether everything is OK and whether the records from your CE are properly formatted, transferred AND inserted into the APEL test server. You can also do some monitoring yourself on this page: http://goc-accounting.grid-support.ac.uk/apeltest2/jobs.html. The table there is normally updated at 30 minutes past each hour.
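The jobreport_options value is a single comma-separated string, so a typo in it is easy to miss. The following is a minimal sketch, not part of ARC or JURA, of assembling that string from named values. The function name build_jobreport_options and the site name EXAMPLE-SITE are hypothetical; the option names, the allowed benchmark types and the test topic are the ones shown on this page.

```python
# Hypothetical helper, not shipped with ARC or JURA: it only assembles the
# text that goes into arc.conf, using the option names from this page.

ALLOWED_BENCHMARK_TYPES = {"HEPSPEC", "Si2k", "Sf2k"}


def build_jobreport_options(archiving, topic, gocdb_name, benchmark_type,
                            benchmark_value, use_ssl,
                            benchmark_description=None):
    """Return the comma-separated value for jobreport_options."""
    if benchmark_type not in ALLOWED_BENCHMARK_TYPES:
        raise ValueError("unexpected benchmark_type: %r" % (benchmark_type,))
    parts = [
        "archiving:%s" % archiving,
        "topic:%s" % topic,
        "gocdb_name:%s" % gocdb_name,
        "benchmark_type:%s" % benchmark_type,
        # float() rejects values that are not numeric
        "benchmark_value:%s" % float(benchmark_value),
    ]
    if benchmark_description:
        parts.append("benchmark_description:%s" % benchmark_description)
    parts.append("use_ssl:%s" % ("true" if use_ssl else "false"))
    return ", ".join(parts)


if __name__ == "__main__":
    # EXAMPLE-SITE and 10.5 are placeholder values for illustration only.
    options = build_jobreport_options(
        archiving="/var/spool/arc/urs",
        topic="/queue/global.accounting.test.cpu.central",
        gocdb_name="EXAMPLE-SITE",
        benchmark_type="HEPSPEC",
        benchmark_value=10.5,
        use_ssl=False,
    )
    print('jobreport_options="%s"' % options)
```

The printed line can be pasted into the [grid-manager] block after replacing the placeholder values with those of your site.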
When the APEL team approves the accounting publishing configuration after a successfully completed test phase (you'll be informed via the GGUS ticket), you should modify your setup for the production APEL server:
- Check that the GOCDB gLite-APEL dummy service endpoint entry has the proper host name and host certificate DN information. Also check that the short name of the site under which the ARC CE with JURA is installed corresponds to the gocdb_name specified in the arc.conf.
- Modify the arc.conf JURA section so that it points to the production APEL server. Please note that the use_ssl option must be turned on as well.
[grid-manager]
...
jobreport="APEL:https://mq.cro-ngi.hr:6162"
jobreport_publisher="jura"
jobreport_options="archiving:/tmp/archive, topic:/queue/global.accounting.cpu.central, gocdb_name:<<YOUR SITE NAME>>, benchmark_type:<<HEPSPEC, Si2k or Sf2k>>, benchmark_value:<<FLOAT VALUE>>, benchmark_description:<<OPTIONAL DESCRIPTION>>, use_ssl:true"
- List of production APEL servers:
- mq.cro-ngi.hr:6162 with use_ssl:true and message encryption options
- If for some reason you would like to send encrypted usage records with JURA, obtain the APEL server's certificate (ask for it in a GGUS ticket or install the ca-policy-egi-core package, which contains the relevant certificate), copy it somewhere on your CE (for example: /etc/grid-security/APELservercert.pem) and edit the low-level SSM config file /usr/share/arc/ssm/sender.cfg:
[certificates]
...
# If this is supplied, outgoing messages will be encrypted
# using this certificate
server: /etc/grid-security/APELservercert.pem
Note that this step is OPTIONAL: at present the APEL server accepts non-encrypted records as well.
- If your CE still has unsent records created with use_ssl:false and a different server URL (e.g. from your test setup), those records need to be modified manually. Go to the <control directory>/logs directory and execute the following commands. The old URL below is the one from the test configuration on this page; adjust it if your earlier setup used a different one.
- Changing the accounting server URL: sed -i 's|loggerurl=APEL:http://mq.afroditi.hellasgrid.gr:6163|loggerurl=APEL:https://mq.cro-ngi.hr:6162|g' *
- Changing the message queue option: sed -i 's/accounting.test.cpu.central/accounting.cpu.central/g' *
- Changing the use_ssl option: sed -i 's/use_ssl:false/use_ssl:true/g' *
- After all these changes, restart the A-REX service (e.g. /etc/init.d/a-rex restart).
- Finally, inform the APEL team about your production setup (via the GGUS ticket) and ask them to confirm that the switch to the production setup was successful.
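Because the sed commands above edit every file in the logs directory in place, it can be useful to see first which files would change. The following is a minimal sketch of the same literal substitutions with a dry-run mode. The function names are hypothetical, the "old" values assume the test configuration shown on this page, and the unsent record files are treated as plain text, which is an assumption about their format.

```python
import os

# Literal replacements mirroring the sed commands above. The "old" values
# assume the test configuration shown on this page; adjust them if your
# earlier setup differed.
REPLACEMENTS = [
    ("loggerurl=APEL:http://mq.afroditi.hellasgrid.gr:6163",
     "loggerurl=APEL:https://mq.cro-ngi.hr:6162"),
    ("accounting.test.cpu.central", "accounting.cpu.central"),
    ("use_ssl:false", "use_ssl:true"),
]


def rewrite_text(text, replacements=REPLACEMENTS):
    """Apply every (old, new) pair as a plain string replacement."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


def rewrite_unsent_records(logs_dir, replacements=REPLACEMENTS, dry_run=True):
    """Return the names of files that would change (or were changed)."""
    changed = []
    for name in sorted(os.listdir(logs_dir)):
        path = os.path.join(logs_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        # newline="" keeps line endings exactly as they are on disk
        with open(path, "r", encoding="utf-8",
                  errors="surrogateescape", newline="") as handle:
            original = handle.read()
        updated = rewrite_text(original, replacements)
        if updated != original:
            changed.append(name)
            if not dry_run:
                with open(path, "w", encoding="utf-8",
                          errors="surrogateescape", newline="") as handle:
                    handle.write(updated)
    return changed
```

Calling rewrite_unsent_records on the <control directory>/logs path with the default dry_run=True only lists the affected files; passing dry_run=False writes the changes. As with the sed commands, restart A-REX afterwards.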
Most important problems and solutions
- P: The following line appears in the SSM log:
crypto - ERROR - unable to write 'random state'
S: Remove the random state file:
$ rm ~/.rnd
- P: How can old records be republished?
S: Republishing is possible only for APEL, and only if the archiving option was set up. You can use this script to collect old records into one or more messages and to place those files in the republishing directory. The script lets you set the following attributes:
- archiving directory as input for the script
- required data gap
- output directory for a new file(s)
Many thanks to Jernej Porenta, who shared this script with us.