Chelonia - a lightweight self-healing distributed storage
Chelonia consists of a set of SOAP-based services residing within the ARC Hosting Environment Daemon (HED). Together, the services provide a data storage system which is:
- self-healing, reliable and robust - by automatically ensuring the given number of replicas is always present
- seamless - by providing a global ("cloud"-wide) namespace
- scalable - through accepting a large variety of storage nodes
- resilient and consistent - through a replicated metadata database
In Chelonia, data is managed in a hierarchical global namespace with files and subcollections grouped into collections. A dedicated root collection serves as a reference point for accessing the namespace. The hierarchy can then be referenced using Logical Names. The global namespace is accessed in the same manner as in local filesystems.
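As an illustration of the namespace model described above, the following minimal sketch resolves a Logical Name by walking from a root collection. It is not Chelonia's actual API: the `File` and `Collection` classes, the `resolve` function, and the slash-separated Logical Name syntax are all assumptions made for this example.

```python
# Hypothetical in-memory model of a hierarchical namespace (not Chelonia's API).
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class File:
    size: int  # size in bytes; the only metadata kept in this sketch


@dataclass
class Collection:
    # Maps an entry name to a file or a subcollection.
    entries: Dict[str, Union["Collection", File]] = field(default_factory=dict)


def resolve(root: Collection, logical_name: str) -> Union[Collection, File]:
    """Walk from the root collection along a slash-separated Logical Name."""
    node: Union[Collection, File] = root
    for part in (p for p in logical_name.split("/") if p):
        if not isinstance(node, Collection):
            raise NotADirectoryError(f"{part!r} lies below a file, not a collection")
        if part not in node.entries:
            raise FileNotFoundError(logical_name)
        node = node.entries[part]
    return node


# Example namespace: /physics/run1.dat
root = Collection({"physics": Collection({"run1.dat": File(size=1024)})})
found = resolve(root, "/physics/run1.dat")
```

Resolving `"/"` returns the root collection itself, which mirrors the role of the root as the reference point for the whole namespace.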
The capabilities of Chelonia are provided by several interconnected web services (see Figure).
- The Librarian (L) services store and manage the metadata for all the entities in the system using the A-Hash services.
- The A-Hash (A-H) service is a replicated database which is used by the Librarian services to store the metadata. A-Hash is based on Berkeley DB.
- The Shepherd (S) services manage the storage nodes, which hold the actual file data.
- The Bartender (B) services serve the users by negotiating with the other services to fulfil the user's requests.
The services communicate with each other through the Message Chain Components in HED. The communication channels are depicted by lines in the Figure.
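The division of labour between the four services can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real implementation: the services are reduced to plain Python classes, SOAP messaging is replaced by direct method calls, and every class, method name, and URL is hypothetical.

```python
# Hypothetical stand-ins for the four services; real services exchange SOAP messages.
class AHash:
    """Metadata store, reduced here to a key-value mapping."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class Librarian:
    """Manages metadata and keeps it in the A-Hash."""

    def __init__(self, ahash):
        self._ahash = ahash

    def register(self, logical_name, locations):
        self._ahash.set(logical_name, {"locations": list(locations)})

    def lookup(self, logical_name):
        return self._ahash.get(logical_name)


class Shepherd:
    """Fronts one storage node and hands out transfer URLs for its files."""

    def __init__(self, base_url):
        self.base_url = base_url

    def transfer_url(self, logical_name):
        return f"{self.base_url}{logical_name}"


class Bartender:
    """User-facing service: consults the Librarian, then a Shepherd."""

    def __init__(self, librarian, shepherds):
        self._librarian = librarian
        self._shepherds = shepherds  # maps a Shepherd id to a Shepherd

    def get_file(self, logical_name):
        meta = self._librarian.lookup(logical_name)
        if meta is None:
            raise FileNotFoundError(logical_name)
        # Use the first recorded replica location in this sketch.
        shepherd = self._shepherds[meta["locations"][0]]
        return shepherd.transfer_url(logical_name)


librarian = Librarian(AHash())
librarian.register("/physics/run1.dat", ["s1"])
bartender = Bartender(librarian, {"s1": Shepherd("https://node1.example.org")})
url = bartender.get_file("/physics/run1.dat")
```

The point of the sketch is the layering: the user only talks to the Bartender, metadata lookups go through the Librarian to the A-Hash, and the file data itself is fetched from the storage node behind a Shepherd.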
Chelonia has a modular architecture for file transfer protocols, although currently only HTTP(S) is supported. The replicas of the files are stored on different storage nodes. A storage node here is a network-accessible computer that has storage space to share and runs a Shepherd service and a supported HTTP server.
The lightweight ARC client tools provide two Command Line Interfaces (CLI) which give the basic data movement and listing capabilities. Chelonia also comes with a FUSE (Filesystem in Userspace) module which allows users to mount the storage namespace into the local file system, enabling the use of graphical browsers and simple drag-and-drop file management.
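To make the self-healing idea concrete, the sketch below computes which storage nodes should receive a new replica when some nodes become unavailable. It is only an illustration under assumptions: the function `plan_repairs`, its data structures, and the alphabetical choice of target nodes are hypothetical and do not describe how Chelonia actually selects nodes.

```python
# Hypothetical repair planner; node selection in Chelonia may work differently.
from typing import Dict, List, Set


def plan_repairs(
    replicas: Dict[str, Set[str]],  # file id -> nodes recorded as holding a replica
    alive_nodes: Set[str],          # nodes currently reachable
    wanted: int,                    # required number of replicas per file
) -> Dict[str, List[str]]:
    """Return, per under-replicated file, the nodes that should get a new replica."""
    plan: Dict[str, List[str]] = {}
    for file_id, holders in replicas.items():
        live = holders & alive_nodes          # replicas that are still reachable
        missing = wanted - len(live)
        if missing <= 0:
            continue                          # enough live replicas, nothing to do
        # Only nodes without a live replica qualify, so replicas stay on distinct nodes.
        candidates = sorted(alive_nodes - live)
        plan[file_id] = candidates[:missing]
    return plan


replicas = {"fileA": {"node1", "node2"}, "fileB": {"node2", "node3"}}
alive = {"node1", "node3", "node4"}           # node2 has gone offline
plan = plan_repairs(replicas, alive, wanted=2)
# plan == {"fileA": ["node3"], "fileB": ["node1"]}
```

Running such a check periodically is one way a system can keep the replica count at the requested level without user intervention.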
Chelonia storage can also be specified as a source of input data for jobs on an ARC-connected Grid, as well as the output location for the result files.