Monday, August 11, 2014

Mounting XtreemFS Volumes using Autofs

Autofs is a useful tool for mounting networked file systems automatically on access, for instance on machines without permanent network connectivity such as notebooks. We prepared a short tutorial that describes how to use the automounter for XtreemFS volumes.

This assumes you'd like a shared directory called /scratch/xtfs/shared across all of your machines that anyone can read from and write to. While I use /scratch in this example, the more traditional /net could be used instead.
  • Assume all of XtreemFS is installed, set up properly, volumes are created...
  • Have autofs installed (and started or not).
  • Create an /etc/auto.master with these contents:
# All xtreemfs volumes will be automounted in /scratch/xtfs
/scratch/xtfs   /etc/auto.xtfs
# Include /etc/auto.master.d/*.autofs
# Include central master map if it can be found using
# nsswitch sources.
# Note that if there are entries for /net or /misc (as
# above) in the included master map any keys that are the
# same will not be seen as the first read key seen takes
# precedence.
  • Then create an /etc/auto.xtfs (which you'll have to modify for your MRC).
shared -fstype=fuse,allow_other
  • Restart autofs (with a command similar to this):
sudo /etc/init.d/autofs restart
  • Do this for each machine on which you'd like to use autofs.
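Since these steps have to be repeated on every machine, the master map edit can be scripted. The following is a minimal Python sketch and not part of the original tutorial: the helper name ensure_master_entry is hypothetical, and it only handles the single entry from the example above. It adds the entry once and leaves an existing one untouched, so running it repeatedly is safe.

```python
def ensure_master_entry(master_text, mount_root="/scratch/xtfs",
                        map_file="/etc/auto.xtfs"):
    """Return master_text with an entry for mount_root, adding it if absent.

    Hypothetical helper: it compares only the first field of each line,
    which is the mount point in an auto.master entry. Comment lines start
    with '#', so they never match.
    """
    for line in master_text.splitlines():
        fields = line.split()
        if fields and fields[0] == mount_root:
            return master_text  # entry already present, nothing to do
    if master_text and not master_text.endswith("\n"):
        master_text += "\n"
    return master_text + f"{mount_root}   {map_file}\n"


# Example (writing the file back needs root privileges):
# with open("/etc/auto.master") as f:
#     updated = ensure_master_entry(f.read())
```

After writing the updated text back to /etc/auto.master, autofs still has to be restarted as described above.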
Thanks to Pete for contributing this tutorial!

Tuesday, June 3, 2014

XtreemFS moved to Github

We moved our Git repository from Google Code to GitHub. The new Project page is available at All tickets from the issue tracker have been migrated and are available with the same issue number. Other services like the public mailing list or the binary package repositories are not affected.

We are looking forward to your feedback and contributions.

Thursday, March 27, 2014

Public demo server updated to XtreemFS 1.5

We updated our public demo server to XtreemFS 1.5. To try out XtreemFS without setting up your own server, just install the client and mount our volume:

mkdir ~/xtreemfs_demo 
mount.xtreemfs ~/xtreemfs_demo 
cd ~/xtreemfs_demo

For testing, you can create directories and files as you like. Please do not upload illegal or copyrighted material. For legal reasons, every file creation and write is logged with the IP address and a timestamp. Files are automatically deleted every hour.
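To check quickly that a mounted volume accepts reads and writes, a small round-trip test is enough. The Python sketch below is only an illustration: the function name round_trip and the file name pattern are made up, and the directory you pass in is assumed to be the mount point created above. It writes a small file, reads it back, and removes it again.

```python
import os
import uuid


def round_trip(directory, payload=b"hello xtreemfs\n"):
    """Write payload to a uniquely named file in directory, read it back,
    remove the file, and return True if the bytes match."""
    path = os.path.join(directory, f"smoke-{uuid.uuid4().hex}.tmp")
    with open(path, "wb") as f:
        f.write(payload)
    try:
        with open(path, "rb") as f:
            return f.read() == payload
    finally:
        os.remove(path)  # clean up even if the read fails
```

For example, round_trip(os.path.expanduser("~/xtreemfs_demo")) should return True once the volume is mounted.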

Wednesday, March 12, 2014

XtreemFS 1.5 released: Improved support for Hadoop and SSDs

Berlin, Germany. Today, we released a new stable version of the cloud file system XtreemFS.
XtreemFS 1.5 (Codename "Wonderful Waffles") comes with the following major changes:

  • Improved Hadoop Support: Read and write buffers were added to improve the performance of small requests. We also implemented support for multiple volumes, e.g., to store input and output on volumes with different replication policies.
  • SSD Support: So far, an OSD was optimized for rotating disks by using a single thread for disk accesses. Solid-state disks (SSDs) cope well with simultaneous requests and show higher throughput with increased parallelism. To achieve more parallelism per OSD when using SSDs, multiple storage threads are now supported.
  • Multi-Homing Support: XtreemFS can be made available for multiple networks and clients will pick the correct address automatically.
  • Multiple OSDs per Machine: Machines with multiple disks have to run an OSD for each disk. We simplified this process with the new xtreemfs-osd-farm init.d script.
  • Bugfixes for Read/Write and Read-Only Replication: We fixed a problem which prevented read/write replicated files from failing over correctly. We also fixed a problem where on-demand read-only replication could hang and stall access.
  • Replication Status Page: The DIR status page now visualizes the current replica status of open files. For example, it shows which replica is the current primary and whether a replica is unavailable.

Replication Status Page: "osd0" is the backup replica for the open file, "osd1" the primary and "osd2" is currently unavailable.
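To illustrate why the write buffers mentioned in the Hadoop item help with small requests, here is a generic Python sketch of write coalescing. It is not the XtreemFS Hadoop client code: the class name CoalescingWriter, the sink callable and the capacity parameter are all hypothetical. Many small writes are collected and forwarded as fewer, larger requests.

```python
class CoalescingWriter:
    """Collects small writes and forwards them to `sink` in larger chunks.

    `sink` is any callable that accepts a bytes object; here it stands in
    for an expensive remote write request.
    """

    def __init__(self, sink, capacity=64 * 1024):
        self._sink = sink
        self._capacity = capacity
        self._chunks = []
        self._size = 0

    def write(self, data):
        self._chunks.append(bytes(data))
        self._size += len(data)
        if self._size >= self._capacity:
            self.flush()  # buffer is full, send one large request

    def flush(self):
        if self._size:
            self._sink(b"".join(self._chunks))
            self._chunks = []
            self._size = 0
```

A caller has to invoke flush() once at the end, otherwise the last buffered bytes are never sent.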
Tutorial for Read/Write Replication Fail-Over
Do you want to see the new replication status page in action? We prepared a tutorial which walks you through the setup of a read/write replicated XtreemFS volume on a single machine. 

The tutorial lets you stream a video from the volume and simulate the outage of a replica. You'll learn about the details of the XtreemFS replication protocol and why the video stalls for some seconds and then playback resumes. 

XtreemFS in a Briefcase
Our friends at AlmereGrid took the tutorial to the next level: they created a setup of eight Raspberry Pi mini-computers running XtreemFS, packaged in a briefcase! Check their website for more details. Here's their video, which shows the briefcase and demonstrates the fail-over:

CloudCase - XtreemFS Cloud file system demonstration from contrail-project.

Developing for XtreemFS
Did you know that you can use XtreemFS directly in your application with our C++ and Java client libraries? This way you avoid any overhead due to FUSE and can access advanced XtreemFS features, such as adding replicas, which are otherwise only available through the maintenance tool "xtfsutil".

From using XtreemFS, it's only a small step to diving into the XtreemFS source code itself. We collected several introductory documents for novices in a Google Drive folder "XtreemFS Public". For example, have a look at how to set up the XtreemFS Server Java projects in Eclipse. Have fun!

Friday, May 17, 2013

Processing an MRC metadata dump with XSLT

TL;DR We describe how to dump the metadata of an XtreemFS installation to an XML file. The XML dump is then filtered with XSLT for files located on a specific OSD. You can use this example for your own analyses of your file system's metadata.

At our institute we run an XtreemFS installation for scientific users. The installation spans 16 OSDs which are hosted at our site and are regularly accessed by three other institutes throughout Germany. During recent maintenance work we lost all chunks of one OSD by human error: I accidentally deleted all chunks of that OSD because I mistook the directory for a backup whereas it was the last remaining copy. Since the installation is meant for temporary scientific data, we decided against replication and backups at deployment to maximize the available capacity. (Single-disk failures are covered by the underlying RAID5 used on each OSD.)

Nonetheless, it was necessary to inform all users about their deleted files. Therefore, I had to find out which files were placed on the affected OSD. XtreemFS stores the list of replicas per file at the MRC (Metadata and Replica Catalog). The MRC can dump and restore its metadata in XML format. To find the affected files, I filtered the XML dump using XSLT. This blog post details the required steps. You can use the provided example to run your own analyses on your file system's metadata.

Create an MRC database dump
You can use the XtreemFS tool xtfs_mrcdbtool to dump or restore the MRC database. The MRC will write/read the dump locally. Therefore, you have to specify where the MRC should write the dump on its machine:
xtfs_mrcdbtool -mrc dump /tmp/dump.xml
This command will tell the MRC to write the database dump to the file /tmp/dump.xml. Make sure that the MRC has write permission for the given path. If you configured an "admin_password" for the MRC, you have to set the option --admin_password as well.

Filter the XML database dump using XSLT 
The MRC database dump is in XML format. The XML tree in the dump contains the file system tree of each volume.

You can use XSLT (Extensible Stylesheet Language Transformations) to filter the dump and transform the output to an even more human-readable form. I've added an example file to our code repository: filter_files.xslt. You have to use an XSLT processor to transform the original XML dump. For example, use xsltproc:
xsltproc -o filtered_files_output.txt filter_files.xslt /tmp/dump.xml
The resulting file filtered_files_output.txt will have the following output format:
volume name/path on volume|creation time|file size|file's owner name
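Because the output is one pipe-separated record per line, it is easy to post-process, for example to build a per-user list of affected files. The following Python sketch is not part of the original workflow: the function name group_by_owner is hypothetical, and it assumes that the file size is a plain integer. The creation time is kept as a string because its format is not specified here.

```python
from collections import defaultdict


def group_by_owner(lines):
    """Map owner name -> list of (path, creation_time, size) tuples."""
    by_owner = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split from the right so a '|' inside a path does not break parsing.
        path, ctime, size, owner = line.rsplit("|", 3)
        by_owner[owner].append((path, ctime, int(size)))
    return dict(by_owner)


# Example:
# with open("filtered_files_output.txt") as f:
#     affected = group_by_owner(f)
```
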
Modify the filter_files.xslt file to include or exclude other file attributes. This example handles only files which are (at least partially) placed on the OSD with the UUID "zib.mosgrid.osd15". This is achieved by the following instruction in the XSLT file, which limits the set of selected "file" elements:
<xsl:template match="file[xlocList/xloc/osd/@location='zib.mosgrid.osd15']">
Write your own XPath expression to implement your own filters. If you want all files, just write match="file" without the brackets.
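If you prefer a scripting language over XSLT, the same selection can be approximated in Python with the standard library. This is a sketch under assumptions, not a replacement for filter_files.xslt: only the file/xlocList/xloc/osd/@location nesting is taken from the XPath expression above, while the "filesystem", "volume" and "dir" element names, the "name" attribute and the sample data are hypothetical. ElementTree supports only a subset of XPath, so the sketch walks the tree itself instead of using the nested predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dump. Element and attribute names other than
# file/xlocList/xloc/osd/@location are assumptions for illustration.
SAMPLE_DUMP = """
<filesystem>
  <volume name="vol1">
    <dir name="data">
      <file name="a.dat">
        <xlocList><xloc><osd location="zib.mosgrid.osd15"/></xloc></xlocList>
      </file>
      <file name="b.dat">
        <xlocList><xloc><osd location="zib.mosgrid.osd03"/></xloc></xlocList>
      </file>
    </dir>
  </volume>
</filesystem>
"""


def files_on_osd(root, osd_uuid):
    """Return the paths of all files with at least one replica on osd_uuid."""
    hits = []

    def walk(element, prefix):
        for child in element:
            name = child.get("name")
            path = f"{prefix}/{name}" if prefix and name else (name or prefix)
            if child.tag == "file":
                osds = child.findall("xlocList/xloc/osd")
                if any(o.get("location") == osd_uuid for o in osds):
                    hits.append(path)
            else:
                walk(child, path)  # descend into volumes and directories

    walk(root, "")
    return hits
```

With a real dump, you would pass ET.parse("/tmp/dump.xml").getroot() instead of parsing SAMPLE_DUMP, after adapting the assumed names to the actual file.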

Tuesday, November 13, 2012

XtreemFS 1.4 released at Supercomputing 2012

Salt Lake City, Utah. Today we released XtreemFS 1.4, a new stable release of the cloud file system XtreemFS. This release is the result of almost one thousand changes ("commits") to the code repository, and extensive testing throughout the year. We worked both on major improvements to the existing code and new features:

  • Improved stability: Clients and servers are rock solid now. In particular, we fixed client crashes due to network timeouts and issues with the Read/Write file replication.
  • Asynchronous writes: Once enabled (mount option "--enable-async-writes"), write() requests will be executed in the background. This improves the write throughput without weakening semantics. We recommend enabling async writes.
  • Windows Client (beta): Complete rewrite based on the stable C++ libxtreemfs and using the Dokan alternative Callback File System by EldoS corporation. Try it by mounting our public demo server!
  • Hadoop support: Use XtreemFS as a replacement for HDFS in your Hadoop setup. This version of XtreemFS comes with a rewritten Hadoop client based on libxtreemfs for Java, which also provides data locality information to Hadoop.
  • libxtreemfs for Java: Access XtreemFS directly from your Java application. See the user guide for more information.
  • Vivaldi integration: The Vivaldi replica placement and selection policies enable clients to select close-by replicas based on actual network latencies. These latencies are estimated using virtual network coordinates which are also visualized in the DIR web-interface. Check out the demonstration on the web-interface of our public demo server.
  • Extended OSD Selection: Now you can assign custom attributes to OSDs and limit the placement of files on OSDs based on those attributes.
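The idea behind the Vivaldi item above is that every node gets virtual network coordinates, and the distance between two coordinates estimates the latency between the nodes. The Python sketch below illustrates replica selection on that basis. It is not the XtreemFS policy implementation: it assumes plain two-dimensional coordinates, and the function name closest_replica and the replica names in the example are made up.

```python
import math


def closest_replica(client_coords, replica_coords):
    """Return the replica whose virtual coordinates are nearest to the client.

    client_coords: (x, y) tuple; replica_coords: dict name -> (x, y).
    The Euclidean distance between two coordinates serves as the
    latency estimate.
    """
    return min(replica_coords,
               key=lambda name: math.dist(client_coords, replica_coords[name]))


# Example with hypothetical coordinates:
# closest_replica((0.0, 0.0), {"osd0": (10.0, 0.0), "osd1": (3.0, 4.0)})
```
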

This version also includes an updated version of the DIR/MRC replication and adds fail-over support for DIR replicas. As DIR/MRC replication is still in a very early stage, this feature is intended as a technology preview for more experimental users.

We are currently at the Supercomputing 2012 exhibition where we present XtreemFS at the Contrail booth #2535 as part of the Contrail project. Since the event takes place in Salt Lake City, Utah, we decided on "Salty Sticks" as the release name for the 1.4 version.

Request for Contributions
As XtreemFS is an open source project, we are always looking forward to external contributions and we believe that this release serves as an ideal starting point for that. Here's an incomplete list of things you might be interested in contributing:
  • chef recipe or puppet configuration for automatic deployment
  • a fancy Qt GUI for the client
  • S3-compatible interface based on the client library libxtreemfs
  • direct integration with Qemu/KVM using the C++ libxtreemfs

XtreemFS Survey
Finally, do not forget to fill out our survey if you use, have used, or plan to use XtreemFS.

Thursday, October 11, 2012

XtreemFS User Survey

XtreemFS is free software with an anonymous download, and therefore we
only know a fraction of our users. If you are using, have been using,
or plan to use XtreemFS, we would love to hear from you!

To that end, we ask you to fill out this year's XtreemFS survey:

We know that this will take a few minutes of your time, but your
responses will help us tremendously.

If you feel uncomfortable sharing specific information, just skip the
question. But be assured that your information will not be shared with

For any questions or direct feedback, write to