Debian
Debian is a fine distribution of Linux that is provided by a community of users rather than a commercial institution. More recently, OpenSuSE (Novell) and Fedora (RedHat) have adopted that form of organisation as well.
Quick start
Repositories
Official channels for Debian
The Globus packages (but not those for LFC and VOMS) have been accepted into Debian. They are in the testing and unstable distributions. To add these packages to your current Debian installation, please add
deb http://ftp.[yourcountrycode].debian.org/debian/ testing main contrib non-free
to your /etc/apt/sources.list file. If there is no Debian mirror for your country code, please check the Debian Mirror List. What is now "testing" will become "stable" in (most likely) 2010. Backports to the old stable ("lenny") have not yet been addressed via the official backports.org initiative. Instead, it is suggested to use NorduGrid's repository as shown below. The Globus packages have been in Ubuntu since its Karmic Koala release.
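The mirror line above can be generated per country code, which helps when the same setup is repeated on several machines. The following is a minimal sketch; the function name debian_mirror_line is made up for this example, and it only prints the line rather than editing /etc/apt/sources.list.

```shell
#!/bin/sh
# Print the APT source line for a Debian country mirror.
# Usage: debian_mirror_line COUNTRYCODE [SUITE]
# (debian_mirror_line is a hypothetical helper, not part of any package.)
debian_mirror_line() {
    cc=$1
    suite=${2:-testing}
    # Country codes are expected as two letters (de, se, dk, ...).
    case $cc in
        [a-z][a-z]) ;;
        *) echo "expected a two-letter country code, got '$cc'" >&2
           return 1 ;;
    esac
    printf 'deb http://ftp.%s.debian.org/debian/ %s main contrib non-free\n' \
        "$cc" "$suite"
}

# Example: print the line for the German mirror. Check it against the
# Debian Mirror List before adding it to /etc/apt/sources.list.
debian_mirror_line de
```

Not every country code has a mirror, so the printed host name still needs to be verified against the mirror list.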
NorduGrid's repository
In addition to ARC itself, a number of Globus packages are needed to enable the Globus Security Infrastructure (GSI) and the popular GridFTP file transfer protocol. VOMS is needed for Virtual Organisation management, and LFC is needed for data indexing.
# Base channel - must be enabled
deb http://download.nordugrid.org/repos/11.05/debian/ squeeze main
deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze main
# Updates to the base release - should be enabled
deb http://download.nordugrid.org/repos/11.05/debian/ squeeze-updates main
deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze-updates main
# Scheduled package updates - optional
#deb http://download.nordugrid.org/repos/11.05/debian/ squeeze-experimental main
#deb-src http://download.nordugrid.org/repos/11.05/debian/ squeeze-experimental main
Replace squeeze with lenny or etch when appropriate. Older releases are in a different repository, see information about ARC 0.8.x repositories.
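Swapping the release name by hand in every line is error-prone, so the substitution can be scripted. This is only a sketch: the function name nordugrid_sources is an assumption of this example, and it covers the base and updates channels of the 11.05 repository but not the optional experimental one.

```shell
#!/bin/sh
# Print the NorduGrid 11.05 repository entries for one Debian release.
# Usage: nordugrid_sources [squeeze|lenny|etch]
# (nordugrid_sources is a hypothetical helper name.)
nordugrid_sources() {
    codename=${1:-squeeze}
    base=http://download.nordugrid.org/repos/11.05/debian/
    # Only the releases named on this page are accepted.
    case $codename in
        squeeze|lenny|etch) ;;
        *) echo "unsupported release: $codename" >&2
           return 1 ;;
    esac
    # Base channel first, then the updates channel.
    for suite in "$codename" "$codename-updates"; do
        printf 'deb %s %s main\n' "$base" "$suite"
        printf 'deb-src %s %s main\n' "$base" "$suite"
    done
}

# Example: show the entries for lenny.
nordugrid_sources lenny
```

The output can be reviewed and then pasted into /etc/apt/sources.list, followed by apt-get update.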
GPG Key
The archives are signed by the key available at:
http://download.nordugrid.org/DEB-GPG-KEY-nordugrid.asc
Download the key and then install it in your apt configuration using the command:
sudo apt-key add DEB-GPG-KEY-nordugrid.asc
Client installation
First, install the Globus certificate utilities; you are very likely to need them to create proxies and work with certificates:
apt-get install globus-gsi-cert-utils-progs
apt-get install globus-proxy-utils
Then you need the keys of potentially relevant Certification Authorities (CAs); a simple approach is to install all of them by using
sudo apt-get install ca-*
This is, however, not recommended from a security perspective; security-conscious people should install the necessary certificates one by one.
One can also install CA keys from the original IGTF repository; see the section "Installing CA keys from IGTF repository" below.
Then install the ARC client itself, which will pull in other necessary packages:
apt-get install nordugrid-arc-client
In all likelihood, you will need to install Globus plugins as well (as long as GridFTP is around):
apt-get install nordugrid-arc-plugins-globus
Installing CA keys from EGI repository
EGI has experimental support for Debian when it comes to CA keys. Add this repository:
#### EGI Trust Anchor Distribution ####
deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core
Get the GPG key
wget -q -O - \
https://dist.eugridpma.info/distribution/igtf/current/GPG-KEY-EUGridPMA-RPM-3 \
| apt-key add -
And install the packages:
apt-get update
apt-get install ca-policy-egi-core
Installing CA keys from IGTF repository
When nothing works or you need an urgent update before the packages are in repositories, get the keys directly from the upstream provider:
wget http://dist.eugridpma.info/distribution/current/igtf-policy-installation-bundle-1.40.tar.gz
tar xvzf igtf-policy-installation-bundle-1.40.tar.gz
cd igtf-policy-installation-bundle-1.40
./configure --with-profile=classic
sudo make install
You may need to browse that repository first to find the most recent version number and use it instead of 1.40.
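Finding the most recent bundle can be done with a small filter instead of by eye. The sketch below assumes that the directory listing contains file names of the form igtf-policy-installation-bundle-VERSION.tar.gz; the function name latest_bundle_version and the sample listing are made up for this example, and sort -V requires GNU coreutils.

```shell
#!/bin/sh
# Read text on stdin (for example a saved directory listing) and print
# the highest bundle version found in it.
# (latest_bundle_version is a hypothetical helper name.)
latest_bundle_version() {
    grep -o 'igtf-policy-installation-bundle-[0-9][0-9.]*\.tar\.gz' |
        sed -e 's/^igtf-policy-installation-bundle-//' -e 's/\.tar\.gz$//' |
        sort -u -V |
        tail -n 1
}

# Example with a made-up listing; version sort ranks 1.40 above 1.9.
printf '%s\n' \
    'igtf-policy-installation-bundle-1.9.tar.gz' \
    'igtf-policy-installation-bundle-1.40.tar.gz' \
    'igtf-policy-installation-bundle-1.38.tar.gz' |
    latest_bundle_version

# Assumption: if the server returns an index page for the directory,
# the listing could be fetched like this:
#   wget -q -O - http://dist.eugridpma.info/distribution/current/ | latest_bundle_version
```

A plain numeric or alphabetical sort would put 1.9 after 1.40, which is why version sort is used here.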
Status
- Globus packages were accepted by Debian in June 2009
- Production ARC packages are in Debian unstable as of June 2011
Motivation to use Debian with Computational Grids
Debian is special in its support of many platforms, across all of which the same set of files is made available at the same locations. Hence, if a piece of software runs on one target, then it should also run on all the other Debian systems, provided that memory and other hardware requirements are met. This is exactly what one needs for Grid Computing when the hardware underneath (PowerPC, Sun, Intel, AMD) may differ.
Communities
- Bioinformatics is mostly provided via the Debian-Med project, a task of which is the provisioning of a Bioinformatics infrastructure. The KnowARC project has produced an interface to the production ARC middleware from the workflow management tool Taverna, which is very popular among bioinformaticians. See grid.inb.uni-luebeck.de.
- Physics is partially addressed by the Debian-Science initiative. This group has already provided the following packages that are available in the regular Debian distribution
- CERN is buffering packages in their own Debian repository prior to an official upload, since there are still license issues with the remote distribution for CERN.
With Mattias Wadenstein and Steffen Möller, there are at least two individuals that are very close to the Debian Society and the NorduGrid. Mattias Ellert does some fantastic bits to Globus, albeit more aiming at Fedora than at Debian, possibly :) Anders Wäänänen laid out the packaging of ARC and maintains all the nice regular builds.
Accessories
Debian packages as Runtime Environments
The script debian2runtime was developed within the KnowARC project to convert the description of a Debian package into a script that may serve as a runtime environment with production ARC.
Download and post-processing of external data
Together with the Debian-Med community, the script getData was developed; it downloads, updates and indexes common public databases. It comes with an extensive description that is available on the Debian Wiki pages.
Detailed notes
Configuration
When run as a client, no daemons need to be started. As with the production ARC, the command line tools such as arcsub, arcstat and arcget (counterparts to ngsub, ngstat, ngget etc.) can be used directly. Configuration is thus complete once the nordugrid-arc1-client package is installed.
The configuration as a server has several building blocks. The key starting point is that ARC follows a service-based approach, where Grid computing is realised as a set of services that contact each other and can be contacted by clients. In the current production ARC, essential services are: GridFTP, A-REX (formerly Grid Manager, internal, no exposed interface), and infosystem services. Services are hosted inside the Hosting Environment Daemon (HED), which provides external interfaces and internal communications. Using HED, one can also more easily add services for one's very own Grid application (group-organised calendar, collective ice cream orders, ... , data management of large compute jobs).
Every service has its own configuration file, which can be merged in case several services are deployed in one box. The services are then started with HED (binary: arched). Please follow the instructions on usage of ARC components.
Installation of packages
Client installation
For a client installation, proceed as follows:
# execute as root
apt-get install nordugrid-arc-client
Presuming that you have certificates that grant Grid access in place already, test it with
$ arctest -J 2 -d INFO && sleep 300
$ arcstat -a
$ arccat "that job id"
If you can read the "Hello, grid!", then you were successful.
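Instead of a fixed five-minute sleep, the check can be repeated until it succeeds. The helper below is a generic sketch; the name retry is made up for this example, and the commented use with arccat assumes that arccat exits with a non-zero status while the job output is not yet available, which should be verified for your ARC version.

```shell
#!/bin/sh
# Run a command until it succeeds, up to ATTEMPTS times, sleeping DELAY
# seconds between attempts. Returns 0 on success, 1 once attempts run out.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
# (retry is a hypothetical helper, not an ARC tool.)
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Harmless demonstration: succeeds at the first attempt.
retry 3 1 test -d /etc && echo "/etc is there"

# Possible use for the test job (assumption, see the note above):
#   retry 10 30 arccat "that job id"
```

With ten attempts and a delay of 30 seconds, the wait is bounded by roughly the same five minutes but ends as soon as the output can be read.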
Server installation
For a server installation, execute as root
apt-get install nordugrid-arc-client nordugrid-arc-server
The client package is only needed to test this installation. If the -libs package is not pulled in with it, then please install it manually.
The configuration file is expected at
/etc/arc.conf
as usual for ARC, but it is not placed there by the install script. Once the host certificates are installed and the directories are prepared, one can start the services
for i in ftpd "-manager" "-infosys"; do
  /etc/init.d/grid$i start
done
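The loop above expands to the init scripts gridftpd, grid-manager and grid-infosys. A slightly more defensive variant, sketched below, reports a missing or failing script instead of stopping silently; the function name start_grid_services and its optional directory argument are assumptions made for this example.

```shell
#!/bin/sh
# Start the three ARC init scripts and report any that are missing or fail.
# Usage: start_grid_services [INIT_DIR]   (INIT_DIR defaults to /etc/init.d)
# (start_grid_services is a hypothetical helper name.)
start_grid_services() {
    initdir=${1:-/etc/init.d}
    status=0
    for name in gridftpd grid-manager grid-infosys; do
        script=$initdir/$name
        if [ -x "$script" ]; then
            "$script" start || {
                echo "failed to start $name" >&2
                status=1
            }
        else
            echo "missing init script: $script" >&2
            status=1
        fi
    done
    return "$status"
}

# Run as root once /etc/arc.conf and the host certificates are in place:
#   start_grid_services
```

The function returns a non-zero status if any of the three services could not be started, so it can be used in a larger setup script.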
Now start a test job
f=`hostname --long`
arctest -J 2 -d INFO -c $f
Error messages are in /var/log/ and in /var/spool/nordugrid/jobstatus.
Installation of Globus (obsolete now)
You are now ready to perform the downloads:
# execute as root
apt-get install gsoap voms globus globus-dev globus-doc
should do. You may be asked whether these packages should be trusted if some are not signed. Also, you may be informed that some packages are needed but not installed. The apt-get tool will likely retrieve those needed packages automatically, as aptitude does; otherwise, just add the missing packages to the command line.
Shared libraries cache refresh
# echo "/opt/globus/lib" > /etc/ld.so.conf.d/globus.conf
# ls -l /etc/ld.so.conf.d/globus.conf
-rw-r--r-- 1 root root 16 2008-07-25 16:44 /etc/ld.so.conf.d/globus.conf
# ldconfig
ldconfig: /opt/globus/lib/libltdl_gcc32dbg.so.3 is not a symbolic link
ldconfig: /opt/globus/lib/libxmlsec1_openssl_gcc32dbg.so.1 is not a symbolic link
To build Globus packages for Debian or Ubuntu yourself, this Wiki has instructions that are very simple to follow.