Debian



Debian is a fine Linux distribution that is provided by a community of users rather than a commercial institution. More recently, OpenSuSE (Novell) and Fedora (RedHat) have adopted that form of organisation for themselves.


Quick start

Repositories

Official channels for Debian

The Globus packages (but not those for LFC and VOMS) have been accepted into Debian. They are in the testing and unstable distributions. To add these packages to your current Debian installation, please add

 deb http://ftp.[yourcountrycode].debian.org/debian/ testing main contrib non-free

to your /etc/apt/sources.list file. If there is no Debian mirror for your country code, please check the Debian Mirror List. What is now "testing" will most likely become "stable" in 2010. Backports to the old stable ("lenny") have not yet been addressed via the official backports.org initiative; instead, refer to NorduGrid's repository as described below. The Globus packages have been in Ubuntu since its Karmic Koala release.
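The mirror line can also be generated from a country code. The following is a minimal sketch: the function name debian_mirror_line is hypothetical, and it assumes that a mirror named ftp.[yourcountrycode].debian.org exists for the code you pass (check the Debian Mirror List). It only prints the line, so it can be reviewed before it is added to /etc/apt/sources.list.

```shell
#!/bin/sh
# Hypothetical helper: print the sources.list line for a given country code.
# Assumption: ftp.<code>.debian.org exists; verify against the Debian Mirror List.
debian_mirror_line() {
    code="$1"
    printf 'deb http://ftp.%s.debian.org/debian/ testing main contrib non-free\n' "$code"
}

# Example: print the line for the country code "de".
debian_mirror_line de
```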

NorduGrid's repository

In addition to ARC itself, a number of Globus packages are needed to enable the Globus Security Infrastructure (GSI) and the popular GridFTP file transfer protocol. VOMS is needed for Virtual Organisation management, and LFC is needed for data indexing.

  # Base channel - must be enabled
  deb http://download.nordugrid.org/repos/11.05/debian/ squeeze main
  deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze main
  
  # Updates to the base release - should be enabled
  deb http://download.nordugrid.org/repos/11.05/debian/ squeeze-updates main
  deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze-updates main
  
  # Scheduled package updates - optional
  #deb http://download.nordugrid.org/repos/11.05/debian/ squeeze-experimental main
  #deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze-experimental main


Replace squeeze with lenny or etch where appropriate. Older releases are in a different repository; see the information about ARC 0.8.x repositories.
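Substituting the codename by hand is easy to get wrong, so the entries can be printed for any codename instead. This is only a sketch: the function name nordugrid_sources is hypothetical, and it assumes that the 11.05 repository actually carries the codename you ask for. It prints the base and updates channels and leaves the optional experimental channel out.

```shell
#!/bin/sh
# Hypothetical helper: print NorduGrid 11.05 repository entries for a Debian codename.
# Assumption: the repository provides the given codename (squeeze, lenny or etch).
nordugrid_sources() {
    codename="$1"
    base="http://download.nordugrid.org/repos/11.05/debian/"
    # Base channel first, then the updates channel.
    for suite in "$codename" "$codename-updates"; do
        printf 'deb %s %s main\n' "$base" "$suite"
        printf 'deb-src %s %s main\n' "$base" "$suite"
    done
}

# Example: print the entries for lenny.
nordugrid_sources lenny
```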

GPG Key

The archives are signed by the key available at:

 http://download.nordugrid.org/DEB-GPG-KEY-nordugrid.asc

Download the key and then install it in your apt configuration using the command:

 sudo apt-key add DEB-GPG-KEY-nordugrid.asc

Client installation

First, install the Globus certificate utilities. You are very likely to need them to create proxies and work with certificates:

  apt-get install globus-gsi-cert-utils-progs
  apt-get install globus-proxy-utils 


Next, you need the keys of potentially relevant Certification Authorities (CAs). A simple approach is to install all of them by using

  sudo apt-get install 'ca-*'

This is, however, not recommended from a security perspective; security-conscious people should install the necessary certificates one by one.

One can also install CA keys from the original IGTF repository; see the section "Installing CA keys from IGTF repository".


Then install the ARC client itself, which will pull in other necessary packages:

  apt-get install nordugrid-arc-client

In all likelihood, you will need to install Globus plugins as well (as long as GridFTP is around):

  apt-get install nordugrid-arc-plugins-globus


Installing CA keys from EGI repository

EGI has experimental support for Debian when it comes to CA keys. Add this repository:

  #### EGI Trust Anchor Distribution ####
  deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core

Get the GPG key:

  wget -q -O - \
    https://dist.eugridpma.info/distribution/igtf/current/GPG-KEY-EUGridPMA-RPM-3 \
    | apt-key add -

And install the packages:

  apt-get update
  apt-get install ca-policy-egi-core


Installing CA keys from IGTF repository

When nothing works or you need an urgent update before the packages are in repositories, get the keys directly from the upstream provider:

  wget http://dist.eugridpma.info/distribution/current/igtf-policy-installation-bundle-1.40.tar.gz
  tar xvzf igtf-policy-installation-bundle-1.40.tar.gz
  cd igtf-policy-installation-bundle-1.40
  ./configure --with-profile=classic
  sudo make install

You may need to browse that repository first to find the most recent version number and use it instead of 1.40.
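Since the version number appears in three of the commands above, it helps to keep it in one place. This is a sketch only: the helper names igtf_bundle_name and igtf_bundle_url are hypothetical, and the URL layout is assumed to stay the same as for 1.40. The script prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Hypothetical helpers: derive the bundle name and download URL from a version number.
# Assumption: newer bundles follow the same naming and URL layout as 1.40.
igtf_bundle_name() {
    printf 'igtf-policy-installation-bundle-%s' "$1"
}

igtf_bundle_url() {
    printf 'http://dist.eugridpma.info/distribution/current/%s.tar.gz\n' "$(igtf_bundle_name "$1")"
}

# Set this to the most recent version found in the repository listing.
version="1.40"

# Print the download and unpack commands for review.
echo "wget $(igtf_bundle_url "$version")"
echo "tar xvzf $(igtf_bundle_name "$version").tar.gz"
echo "cd $(igtf_bundle_name "$version")"
```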

Status

  • Globus packages were accepted by Debian in June 2009
  • Production ARC packages are in Debian unstable as of June 2011

Motivation to use Debian with Computational Grids

Debian is special in its support of many platforms, across all of which the same set of files is made available at the same locations. Hence, if a piece of software runs on one target, it should also run on all the other Debian systems, provided that memory and other hardware requirements are met. This is exactly what one needs for Grid Computing, where the hardware underneath (PowerPC, Sun, Intel, AMD) may differ.

Communities

  • Bioinformatics is mostly provided via the Debian-Med project, a task of which is the provisioning of a Bioinformatics infrastructure. The KnowARC project has produced an interface to the production ARC middleware from the workflow management tool Taverna, which is very popular among bioinformaticians. See grid.inb.uni-luebeck.de.
  • Physics is partially addressed by the Debian-Science initiative. This group has already provided the following packages that are available in the regular Debian distribution
  • CERN is buffering packages prior to an official upload in their own Debian repository since there are still license issues with the remote distribution for CERN.

With Mattias Wadenstein and Steffen Möller, there are at least two individuals who are very close to both the Debian Society and NorduGrid. Mattias Ellert contributes some fantastic work to Globus, albeit possibly aimed more at Fedora than at Debian :) Anders Wäänänen laid out the packaging of ARC and maintains all the nice regular builds.

Accessories

Debian packages as Runtime Environments

The script debian2runtime was developed within the KnowARC project to convert the description of a Debian package into a script that may serve as a runtime environment with production ARC.

Download and post-processing of external data

The script getData was developed together with the Debian-Med community. It downloads, updates and indexes common public databases, and it comes with an extensive description that is available on the Debian Wiki pages.

Detailed notes

Configuration

When ARC is run as a client, no daemons need to be started. As with the production ARC, the command line tools such as arcsub, arcstat and arcget (counterparts to ngsub, ngstat, ngget, etc.) can be used directly. Configuration is therefore complete once the nordugrid-arc1-client package is installed.

The configuration as a server has several building blocks. The key starting point is that ARC follows a service-based approach, where Grid computing is realised as a set of services that contact each other and can be contacted by clients. In the current production ARC, the essential services are GridFTP, A-REX (formerly Grid Manager; internal, with no exposed interface), and the infosystem services. Services are hosted inside the Hosting Environment Daemon (HED), which provides external interfaces and internal communications. Using HED, you can also more easily add services for your very own Grid application (group-organised calendar, collective ice cream orders, ..., data management of large compute jobs).

Every service has its own configuration file, which can be merged in case several services are deployed in one box. The services are then started with HED (binary: arched). Please follow the instructions on usage of ARC components.

Installation of packages

Client installation

For a client installation, proceed as follows:

 # execute as root
 apt-get install nordugrid-arc-client 

Presuming that you have certificates that grant Grid access in place already, test it with

 $ arctest -J 2 -d INFO && sleep 300 && arcstat -a
 $ arccat "that job id" 

If you can read "Hello, grid!" in the output, the test was successful.
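The check for the greeting can be scripted rather than read off the screen. The helper below is a sketch; the name contains_greeting is hypothetical, and it assumes that the job output contains the literal text quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the text on standard input contains the greeting.
# Assumption: the test job prints the literal string "Hello, grid!".
contains_greeting() {
    grep -q 'Hello, grid!'
}

# Usage sketch (needs a real job id, so it is left commented out):
#   arccat "that job id" | contains_greeting && echo "test job succeeded"
```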

Server installation

For a server installation, execute as root

 apt-get install nordugrid-arc-client nordugrid-arc-server

The client package is only needed to test this installation. If the -libs package is not pulled in automatically, please install it manually.

The configuration file is expected at

 /etc/arc.conf

as usual for ARC, but the install script does not place it there. Once the host certificates are installed and the directories are prepared, one can start the services:

 for i in ftpd "-manager" "-infosys"; do
   /etc/init.d/grid$i start
 done
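Because the install script does not provide the configuration file, a quick check before starting the services can catch a missing file early. This is a minimal sketch; the function name check_arc_conf is hypothetical, and it only tests that the file exists and is readable, not that its contents are valid.

```shell
#!/bin/sh
# Hypothetical helper: verify that the ARC configuration file exists and is readable.
# The default path /etc/arc.conf is the location named in the text above.
check_arc_conf() {
    conf="${1:-/etc/arc.conf}"
    if [ -r "$conf" ]; then
        echo "found $conf"
    else
        echo "missing or unreadable: $conf" >&2
        return 1
    fi
}

# Report the status of the default location without aborting the shell.
check_arc_conf || echo "create /etc/arc.conf before starting the services"
```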

Now start a test job

 f=$(hostname --long)
 arctest -J 2 -d INFO -c "$f"

Error messages are in /var/log/ and in /var/spool/nordugrid/jobstatus.

Installation of Globus (obsolete now)

You are now ready to perform the downloads:

 # execute as root
 apt-get install gsoap voms globus globus-dev globus-doc

should do. You may be asked whether these packages should be trusted if some of them are not signed. You may also be informed that some packages are needed but not installed. The apt-get tool is likely to retrieve those packages automatically, as aptitude does; if it does not, just add the missing packages to the command line.

Shared libraries cache refresh

 # echo "/opt/globus/lib" > /etc/ld.so.conf.d/globus.conf
 # ls -l /etc/ld.so.conf.d/globus.conf
 -rw-r--r-- 1 root root 16 2008-07-25 16:44 /etc/ld.so.conf.d/globus.conf
 # ldconfig 
 ldconfig: /opt/globus/lib/libltdl_gcc32dbg.so.3 is not a symbolic link
 ldconfig: /opt/globus/lib/libxmlsec1_openssl_gcc32dbg.so.1 is not a symbolic link

To build Globus packages for Debian or Ubuntu yourself, this Wiki has instructions that are very simple to follow.

