The Open Source Components of Valhalla

Building the Pantheon platform meant tackling one of the big remaining “hard problems” in cloud services head-on: how to provide a secure, scalable, synchronous network-attached filesystem for Drupal. We solved this problem with Valhalla, a breakthrough technology that serves files for the tens of thousands of Drupal sites running on Pantheon.

Parts of Valhalla are proprietary, but much of the engineering work is either open itself or goes to support open-source foundations. Between the Valhalla client itself (GPL’ed) and the upstream libraries we use, we’re making a lasting impact on users of open software.

FuseDAV: The Open Valhalla Client

Most network filesystems have two main components: the client and the server. While we began Valhalla using an existing, somewhat long-in-the-tooth WebDAV client (davfs2), it soon became clear that we needed a much more powerful and modern client component to take our platform to scale.

The most obvious contribution from Valhalla, and certainly the largest in lines of code, is this client. It began as a derivative of the earlier FuseDAV project, originally started by Lennart Poettering (of systemd fame). He wanted a modern replacement for davfs2, but his personal interest could only take it so far. Pantheon used his work as a foundation for what we’ve developed.

Some of the features we’re most proud of, like caching that supports connectivity interruptions, work when FuseDAV is used with any WebDAV server. This caching allows remote file mounts to weather the inevitable networking hiccoughs and slowdowns that are a part of life on the internet, preventing the dreaded NFS lock-ups that have plagued sysadmins for as long as this technology has existed.
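To make the idea concrete, here is a minimal sketch of a cache that keeps serving files through a connectivity interruption. It is not FuseDAV's implementation (which is written in C); the class name, the `fetch` callable, and the `max_age` parameter are all hypothetical, chosen only to illustrate the general technique of falling back to a stale copy when the server cannot be reached.

```python
import time


class StaleTolerantCache:
    """Serve cached content when the backing server is unreachable.

    Hypothetical illustration only; not FuseDAV's actual cache design.
    """

    def __init__(self, fetch, max_age=2.0, clock=time.monotonic):
        self._fetch = fetch        # callable(path) -> bytes; may raise OSError
        self._max_age = max_age    # seconds an entry counts as fresh
        self._clock = clock        # injectable so the behavior is easy to test
        self._entries = {}         # path -> (content, stored_at)

    def read(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[1] <= self._max_age:
            return entry[0]        # fresh: no network round trip
        try:
            content = self._fetch(path)
        except OSError:
            if entry is not None:
                return entry[0]    # server unreachable: serve the stale copy
            raise                  # nothing cached, so surface the error
        self._entries[path] = (content, now)
        return content
```

The trade-off in a design like this is that a reader may briefly see old content during an outage, in exchange for the mount never blocking on a dead connection.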

FuseDAV is also the only multithreaded FUSE client for WebDAV, which makes it one of the few multithreaded FUSE clients, period. We test every release against the RFC-compliant PyWebDAV server to verify that everything works well even without our WebDAV protocol extensions.

Safer connection lifecycle handling in libcurl

The libcurl project strives to properly handle connection issues over a variety of SSL and TLS implementations. Ever since moving FuseDAV to libcurl (instead of Neon), Pantheon’s put libcurl through a really hard stress-test: our customers’ sites. We found a failure case in which a request could hang instead of timing out if the connection was interrupted during a particular phase of the connection lifecycle.

The fix for this issue is now available to everyone using libcurl with NSS, including users of Fedora, CentOS, and RHEL. It also improves connection handling for indirect users of libcurl, like Yum and PHP. Props to Fedora’s libcurl package maintainer for his assistance in identifying the key parts needing backports to older releases.

A new, efficient way to stream-parse HTTP responses

As recently as a couple of months ago, the best examples (in public source code or documentation) of using libcurl to parse XML or HTML responses involved pulling the entire response into a file or memory. But, as we switched FuseDAV from Neon to libcurl, we needed something better for large PROPFIND responses. So, we developed a new, efficient way to integrate libcurl’s response buffer handling with Expat’s stream parsing support. It’s now an official part of libcurl’s documentation.
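The shape of the technique can be sketched without a network connection. The example below is not the code we contributed to libcurl (that is C); it uses Python's standard `xml.parsers.expat` binding to Expat, and a plain iterable of byte chunks stands in for the buffers that libcurl hands to a write callback. The function name and the choice to collect `href` values are assumptions made for illustration.

```python
import xml.parsers.expat


def parse_propfind_hrefs(chunks):
    """Collect DAV: href values from a multistatus body fed chunk by chunk.

    `chunks` is any iterable of bytes; here it stands in for the buffers a
    transfer library delivers as the response arrives (an assumption).
    """
    # With a namespace separator set, names arrive as "URI localname".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    hrefs = []
    current = None  # list of text pieces while inside an href, else None

    def start_element(name, attrs):
        nonlocal current
        if name == "DAV: href":
            current = []

    def character_data(data):
        # Text may be delivered in several pieces, so accumulate it.
        if current is not None:
            current.append(data)

    def end_element(name):
        nonlocal current
        if name == "DAV: href" and current is not None:
            hrefs.append("".join(current))
            current = None

    parser.StartElementHandler = start_element
    parser.CharacterDataHandler = character_data
    parser.EndElementHandler = end_element

    for chunk in chunks:
        parser.Parse(chunk, False)   # more data may follow
    parser.Parse(b"", True)          # signal end of document
    return hrefs
```

Because each chunk is parsed as it arrives and then dropped, memory use depends on the data the handlers choose to keep rather than on the size of the whole response.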

A cleaner API for efficient, network-based FUSE file systems

In the high-level FUSE API, there are two ways to support open file handles that continue to access unlinked (read: deleted) files. By default, FUSE renames the file to something hidden, but this incurs unnecessary network overhead by writing to and reading from a server even though the content is only needed locally.

The efficient alternative is FUSE’s support for the combined “hard_remove” and “flag_nullpath_ok” options. The former is a mount option, while the latter is a flag in the operations struct.

Given the fear-inducing documentation for “hard_remove” (which pretty much says it’s always a bad idea), a developer would have to look at the FUSE source to realize that smart network-based file systems ought to enable and support both. Even then, it’s awkward to do so because “hard_remove” is a mount option, not an operations struct flag. The FUSE project is taking our suggestion to move flags related to “hard_remove” into the operations struct to make development of efficient network-based file systems more straightforward. This change should appear in the next API revision.
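As a rough illustration of what a file system takes on when it opts out of the hidden-rename behavior, here is a toy model in Python. It is not FUSE code and not how FuseDAV is written. The class and its methods are hypothetical, and a plain dict stands in for the remote server. The point it shows is that every open file is tracked by handle, so reads keep working after the path is gone and the path argument may be empty, which is the situation “flag_nullpath_ok” declares support for.

```python
import itertools


class HandleKeyedFS:
    """Toy model: open files are served by handle, never looked up by path.

    Hypothetical sketch; `remote` (a dict of path -> bytes) stands in for
    the server side of a network file system.
    """

    def __init__(self, remote):
        self.remote = remote
        self._open = {}                      # handle -> locally held content
        self._next_handle = itertools.count(1)

    def open(self, path):
        handle = next(self._next_handle)
        self._open[handle] = self.remote[path]   # fetch once, keep locally
        return handle

    def unlink(self, path):
        # Remove on the server right away; no rename to a hidden file.
        del self.remote[path]

    def read(self, path, handle):
        # `path` may be None for an unlinked file; only the handle is used.
        return self._open[handle]

    def release(self, path, handle):
        # Last close: drop the local copy. Nothing is left on the server.
        del self._open[handle]
```

In this model an unlink costs one server operation and later reads cost none, whereas the hidden-rename approach keeps round-tripping to the server for content that is only needed locally.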

Our work at Pantheon spans multiple disciplines, languages, and open source projects. While we can’t release every piece of the platform for free, we’re making a solid impact across the Linux ecosystem. Even if you never have cause to use our FuseDAV client, you will probably benefit at some point from our work on the common workhorse libraries that are used under the hood in thousands of open source software applications.
