Tuesday, November 13, 2012

XtreemFS 1.4 released at Supercomputing 2012

Salt Lake City, Utah. Today we released XtreemFS 1.4, a new stable release of the cloud file system XtreemFS. This release is the result of almost one thousand changes ("commits") to the code repository, and extensive testing throughout the year. We worked both on major improvements to the existing code and new features:

  • Improved stability: Clients and servers are rock solid now. In particular, we fixed client crashes due to network timeouts and issues with the Read/Write file replication.
  • Asynchronous writes: Once enabled (mount option "--enable-async-writes"), write() requests are executed in the background. This improves write throughput without weakening semantics. We recommend enabling async writes.
  • Windows Client (beta): Complete rewrite based on the stable C++ libxtreemfs and using Callback File System by EldoS Corporation, an alternative to Dokan. Try it by mounting our public demo server!
  • Hadoop support: Use XtreemFS as a replacement for HDFS in your Hadoop setup. This version of XtreemFS comes with a rewritten Hadoop client based on libxtreemfs for Java, which also provides data locality information to Hadoop.
  • libxtreemfs for Java: Access XtreemFS directly from your Java application. See the user guide for more information.
  • Vivaldi integration: The Vivaldi replica placement and selection policies enable clients to select close-by replicas based on actual network latencies. These latencies are estimated using virtual network coordinates which are also visualized in the DIR web-interface. Check out the demonstration on the web-interface of our public demo server.
  • Extended OSD Selection: Now you can assign custom attributes to OSDs and limit the placement of files on OSDs based on those attributes.
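
To illustrate the idea behind the Vivaldi policies, here is a minimal Python sketch of picking the closest replica from virtual network coordinates. It is not the XtreemFS implementation or API: the two-dimensional coordinates, the OSD names, and the functions estimated_latency and select_closest_replica are assumptions made up for this example, and real Vivaldi coordinates may carry more components.

```python
import math

def estimated_latency(a, b):
    """Estimate latency as the Euclidean distance between two virtual coordinates."""
    return math.dist(a, b)

def select_closest_replica(client_coord, replica_coords):
    """Return the name of the replica whose coordinate is nearest to the client."""
    return min(
        replica_coords,
        key=lambda name: estimated_latency(client_coord, replica_coords[name]),
    )

# Hypothetical coordinates, for illustration only.
client = (0.0, 0.0)
replicas = {
    "osd-a": (3.0, 4.0),
    "osd-b": (30.0, 40.0),
    "osd-c": (60.0, 80.0),
}

closest = select_closest_replica(client, replicas)
print(closest, estimated_latency(client, replicas[closest]))
```

The point of the coordinates is that a client can rank replicas without measuring the latency to each one directly.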

This version also includes an updated version of the DIR/MRC replication and adds fail-over support for DIR replicas. As DIR/MRC replication is still at a very early stage, this feature is intended as a technology preview for more experimental users.

We are currently at the Supercomputing 2012 exhibition, where we present XtreemFS at the Contrail booth #2535 as part of the Contrail project. Since the event takes place in Salt Lake City, Utah, we chose "Salty Sticks" as the release name for version 1.4.

Request for Contributions
As XtreemFS is an open source project, we always welcome external contributions, and we believe that this release serves as an ideal starting point for them. Here's an incomplete list of things you might be interested in contributing:
  • Chef recipe or Puppet configuration for automatic deployment
  • a fancy Qt GUI for the client
  • S3-compatible interface based on the client library libxtreemfs
  • direct integration with QEMU/KVM using the C++ libxtreemfs

XtreemFS Survey
Finally, do not forget to fill out our survey if you use, have used, or plan to use XtreemFS.

Thursday, October 11, 2012

XtreemFS User Survey

XtreemFS is free software with an anonymous download, and therefore we
only know a fraction of our users. If you are using, have been using,
or plan to use XtreemFS, we would love to hear from you!

To that end, we ask you to fill out this year's XtreemFS survey:

We know that this will take a few minutes of your time, but your
responses will help us tremendously.

If you feel uncomfortable sharing specific information, just skip the
question. But be assured that your information will not be shared with

For any questions or direct feedback, write to felix@xtreemfs.org

Wednesday, July 11, 2012

What is object-based storage (and what it is not)

TL;DR Object-based storage is a term that categorizes the internal architecture of a file system; it is not a particular feature set or interface. While the internal architecture of a file system has many implications for its performance and features, its outer appearance remains that of a file system.

We have often stressed the fact that XtreemFS is an object-based file system. While talking to our users, however, we have realized that this term causes more confusion than enlightenment. I blame this poor choice on our academic ignorance, and I hope I can clear up the confusion a bit. In the end, "object" is not a very descriptive term, and most people associate it with object-oriented programming (totally unrelated) or the objects in Amazon's S3 system (only somewhat related).

In storage, an object is a variable-sized, but limited container of bytes. You probably wonder why this trivial concept deserves its own term and became relevant to the storage community at all. There are two main reasons: first the name itself, and second its main property, namely the fact that it is variable-sized.

Blocks, Blocks, Blocks

While storage hardware keeps a series of bytes, no storage hardware (disks, tapes, flash, even RAM) exports a byte-level interface. The reason is efficiency: addressing single bytes would require many long addresses (metadata overhead), and reading and writing single bytes is inefficient (think checksums, latency, seeking, etc.). The unit that is actually used is the block, a fixed-size container of bytes.

File systems organize blocks into larger and variable-sized containers. This is also true for distributed file systems. As many distributed file systems do not run on bare hardware, they can actually choose a certain block size. There is a wide range of file systems where the block size is fixed for all files in the file system. Such a system is not very flexible: you need to choose a block size that fits all, and in turn all your files should be of similar size. There was a saying about Google's GFS (a block-based file system with a 64MB block size): it can hold any set of files, as long as they're large and not too many.
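
The trade-off of a single block size can be made concrete with a bit of arithmetic. The following Python sketch is only an illustration: the helper names blocks_needed and slack_bytes are made up here, and the slack figure assumes that every block reserves its full size, which not every system does.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def blocks_needed(file_size, block_size):
    """Number of fixed-size blocks required to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)

def slack_bytes(file_size, block_size):
    """Unused bytes in the last block, assuming each block reserves its full size."""
    return blocks_needed(file_size, block_size) * block_size - file_size

# Compare a small and a large block size for a large and a tiny file.
for block_size in (4 * KIB, 64 * MIB):
    for file_size in (1 * GIB, 1 * KIB):
        print(
            f"block={block_size:>9} file={file_size:>11} "
            f"blocks={blocks_needed(file_size, block_size):>7} "
            f"slack={slack_bytes(file_size, block_size):>9}"
        )
```

A small block size means many block addresses to track for large files; a large block size means a tiny file still occupies a whole large block under the stated assumption.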

There is a second aspect of blocks shared between local and distributed file systems. Blocks are agnostic about files, i.e., a block does not know which file it belongs to. While that's a no-brainer for local file systems, storage servers of block-based distributed file systems are somewhat degraded because they only store anonymous blocks. Only the metadata server knows how the blocks make up files.

Here come the objects

You can imagine the joy in the storage community when systems and standards arrived that allowed choosing a block size per file. This innovation deserved a new term: object. Objects also have a second aspect that makes them great for distributed file system architectures: they raise the abstraction level a bit by making the storage server aware of which file an object belongs to. Objects are not addressed by a file-agnostic block identifier as blocks are, but by a file identifier and a sequential object number. This has many advantages for the architecture, as storage servers can actually host file system logic (like for replication), which they are equipped for when they run on commodity hardware.
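
As a rough sketch of this addressing scheme, the following Python snippet maps a byte offset within a file to a file identifier, a sequential object number, and an offset inside that object. The ObjectAddress type, the locate function, and the file identifiers are hypothetical names for illustration; they are not taken from XtreemFS.

```python
from typing import NamedTuple

class ObjectAddress(NamedTuple):
    file_id: str           # which file the object belongs to
    object_number: int     # sequential number of the object within the file
    offset_in_object: int  # byte position inside that object

def locate(file_id, byte_offset, object_size):
    """Map a byte offset in a file to its object, given the file's own object size."""
    object_number, offset_in_object = divmod(byte_offset, object_size)
    return ObjectAddress(file_id, object_number, offset_in_object)

# Each file can use its own object size (hypothetical values).
small = locate("file-1", 5000, 4 * 1024)            # 4 KiB objects
large = locate("file-2", 2621440, 1024 * 1024)      # 1 MiB objects
print(small)
print(large)
```

Because the address carries the file identifier, a storage server can tell which file each object belongs to without asking the metadata server.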

As I hinted earlier: objects are not super-relevant for the user, because all file systems make you work with files (in XtreemFS, even POSIX files). And Amazon's S3 objects are not the objects we are talking about here, because they are not size-limited. They are rather files without a hierarchical namespace.