Why Drupal Multisite Is Not Enterprise Grade

I’m just going to come out and say it. I don't think Drupal multisite qualifies as “enterprise grade.” We don’t support it on Pantheon for a reason — it’s a conscious choice based on our experience and technical perspective.

I’m not trying to flame-bait; multisite is right there in Drupal core (at the top of default.settings.php for over five years), and clearly it can work, but then again so can building your CMS in Erlang. The question is whether this is truly something you’d recommend, whether this is something that can drive the industry, whether it actually rises to the level of best practice.

For those of you who don’t know, “multisite” in a Drupal context refers to a pattern where you load different configuration files based on the incoming URL, allowing you to connect a single codebase to different databases for different domain names, thus serving multiple logical sites (hence “multisite”) from one installation. It’s a clever hack, and like a lot of clever hacks it can save you a lot of time.
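
To make the pattern concrete, here is a minimal sketch of the lookup idea in Python. It is not Drupal's actual code (Drupal is PHP, and its real lookup also considers the port and URL path); the function names are hypothetical and the rules are simplified to hostname matching only, with a fallback to a "default" directory.

```python
from typing import Iterable, List


def candidate_site_dirs(host: str) -> List[str]:
    """Return config directory names to try, most specific first.

    Simplifying assumption: only the hostname is considered; the port
    and any URL path are ignored.
    """
    # Normalise the Host header: lowercase, drop the port and a trailing dot.
    hostname = host.lower().split(":")[0].rstrip(".")
    labels = hostname.split(".")
    # www.example.com -> www.example.com, example.com, com
    candidates = [".".join(labels[i:]) for i in range(len(labels))]
    candidates.append("default")
    return candidates


def resolve_site_dir(host: str, existing_dirs: Iterable[str]) -> str:
    """Pick the first candidate that exists; fall back to 'default'."""
    existing = set(existing_dirs)
    for name in candidate_site_dirs(host):
        if name in existing:
            return name
    return "default"


# Hypothetical set of per-site config directories under one codebase.
sites = {"example.com", "shop.example.com", "default"}
print(resolve_site_dir("www.example.com", sites))
print(resolve_site_dir("shop.example.com:8080", sites))
print(resolve_site_dir("other.org", sites))
```

Each directory would hold its own settings file pointing at its own database, which is how one installation ends up serving many logical sites.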

But it also frequently means incurring a lot of technical debt, especially if what you actually need is to manage a large portfolio of sites with some amount of diversity in design, functionality, or scale.

Again, I’m not trash-talking. I've done multisite professionally as a consultant, and contributed code to the Aegir project. I’ve seen others do solid builds, and there are probably highly qualified teams working on several of these right now.

The point is I've been there, personally, in the trenches, and I’m saying this now because I don’t think we have to keep beating our heads against the wall. Multisites tend to be brittle and complex, and I’ve certainly never felt good about the single points of failure the architecture invites. Managing multisite at scale is a lot of extra work, and I’ve always been uncomfortable with the tradeoffs required. I think there's a better way to architect Drupal to serve millions of customers in the future.

Here are the top three failings of Multisite that make it not enterprise grade:

1. The Brittleness of a Single Codebase

The biggest drawback with multisite as a pattern is that it immediately cuts against what draws people to Drupal and open-source in the first place: the rich contributed space and ability to iterate, innovate and customize. Unless you’re literally cranking out carbon copies, the single codebase straitjacket quickly becomes an uncomfortable fit.

For instance, department sites at a university have similar functionality, but some may be more ambitious than others in their use of the web. Some may have grants or programs that make novel use of their site. Some may have students who want to develop or contribute. Strict multisite doesn’t allow that.

Likewise, in chapter-based organizations most chapters have common needs, and will enjoy the quick start and design template that come from a common codebase. But different geographical regions all have their own needs, their own internal systems, sometimes even different regulatory requirements. Multisite requires you to engineer one install that solves all problems ahead of time, which can make for a huge project, verging on the impossible.

Finally, inside any corporate marketing department, microsites or campaign initiatives want to get off the ground quickly, and no CMO wants to waste time and money reinventing the wheel. But try telling a business unit, a brand, or a campaign team what they can and can’t do with their web presence. Try telling them their agency or vendor of choice can’t install their favorite module, or touch the theme. You’ll have a revolution on your hands.

Locking everyone into a single codebase restricts potential and impedes innovation. The choices are A) chafing away within the straitjacket, B) racking up technical debt with divergent site use-cases on a single codebase, or C) somehow building the ultimate “all things to all people” multisite product before the first site launches.

Combining multisite architecture and a rich portfolio of use-cases leads to frustrated stakeholders, baroque deployment workflows, unintended regressions, and finally platform fragmentation and the inevitable un-winding/spinning-out of sites that no longer fit the mold... all the time you saved up front gets paid back, with interest.

2. Single Points of Failure

Multisite architecture is also problematic in that it invites single point(s) of failure, something any enterprise organization seeks to mitigate.

When you share a common codebase across tens, hundreds or thousands of sites, you only have to deploy one bug to take many or all of them down. As you have more and more instances, it becomes increasingly time-consuming to vet a proposed change across all the properties it would impact. If you’re not able to test, you can’t deploy with confidence. Agility and innovation suffer.

Similarly, most multisite infrastructure implementations call for many sites on a single server, often resulting in over-subscription. You can save some cash by throwing a couple hundred sites on a box, but that means a traffic spike on one site can knock them all offline, and managing that box is going to become someone’s job. You can separate higher value properties to their own infrastructure, but that increases cost and complexity, especially when you’re also starting to manage divergent use-cases. It turns into a deployment nightmare.

The single point of failure also exists on the security side. Even if you spread out to a cluster, in most multisite architectures many sites share a MySQL server, relying on different users, passwords, and databases within a single process. They also typically run within a single docroot under an Apache or Nginx “virtualhost”, again running as the same process with no Linux userland separation between them.

This makes it impossible to segregate resource utilization (the “noisy neighbor” problem is unsolvable without separate servers for every site) and increases the attack surface for security exploits. If any one site is ever compromised, it becomes a beachhead for further incursions on the common MySQL or application resources.

3. The Illusion of Simplicity

Multisite’s promise of simplicity — one codebase, many sites; what could be easier? — belies the complexity of building out a successful Drupal platform for an enterprise. It’s an illusion. One should never underestimate the amount of work it takes in requirements-gathering, product design, development, QA and deployment/maintenance to create and support one codebase that runs a large number of sites.

It’s not just a matter of throwing together a couple modules and then spinning out a bunch of copies. You can’t just crank sites out of a factory, at least not if you expect quality and value. It takes serious time and attention to evaluate needs across an organization, develop the right range of (safe) configuration options, designate the design alternatives, etc.

You’ll also need tools to manage your deployments. Aegir is an open-source project that can be used to manage multisites, but it’s fairly complex to set up and run. I’ve seen people roll their own with Jenkins and Puppet, but that usually meant months of DevOps time. There are also proprietary solutions out there that try to make multisite workable.

But if your use-case calls for deep configuration, custom design, or per-site code, get ready to put in some hard yards figuring out what your infrastructure will need to allow (and what it won’t), and how you’ll manage changes to the common vs the particular. When the assembly-line breaks down, everyone suffers.

Pantheon One: The Pantheon Alternative

When we designed Pantheon, we wanted something different and better: an innovation in how you build, launch and run sites that evolved beyond what we learned from doing traditional enterprise-style builds. We’ve built the only SaaS management platform for websites large and small. But it’s not just about individual sites. We also wanted to deliver a better alternative for the portfolio, a real enterprise-ready answer to multisite.

Our answer is codenamed Pantheon One. It's currently managing hundreds of sites for universities like Berkeley and ASU, chapter-based organizations like FRAG, and companies with many brands. For instance, the reason we tout so many wineries is a Pantheon One implementation.

Pantheon's unique ability to work in concert with our partners and internal technical organizations has made these projects huge wins for everyone. Here's how:

A Common Upstream

Pantheon One leverages Git, using distributed version control the way Linus intended, and delivers all the efficiencies of developing on a common codebase without the straitjacket or risk. With Pantheon One, you take control of the available “start states” for your users on Pantheon: instead of vanilla Drupal and install profiles like Panopoly, they’ll be presented with a menu of choices you create and approve.

These start states can include community projects like Phase2’s Open Public, but also your own. As your team develops and improves your preferred start-state(s) for new sites, updates are delivered via the same mechanism we already use to provide one-click updates to Drupal core.

This gives every site the ability to test and deploy the update, and to leverage the distributed power of Git to add their own modules, themes, etc (should you choose to allow it) and develop their own functionality and use-cases without jeopardizing any other property in your portfolio.
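
As a rough illustration of the difference, here is a toy Python model of the upstream idea. It is not Pantheon's implementation; the `Site` class, its fields, and the version numbers are all hypothetical. The point it tries to show is that each site adopts an upstream update on its own schedule and keeps its local additions, instead of every site changing the moment shared code is deployed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    """Hypothetical per-site record: its own copy of the code plus local extras."""
    name: str
    upstream_version: int = 1
    local_modules: List[str] = field(default_factory=list)

    def apply_upstream(self, version: int) -> None:
        # Only this site moves forward; its local modules are left alone.
        self.upstream_version = max(self.upstream_version, version)


def sites_behind(sites: List[Site], latest: int) -> List[str]:
    """Names of sites that have not yet adopted the latest upstream version."""
    return [s.name for s in sites if s.upstream_version < latest]


portfolio = [
    Site("biology"),
    Site("physics", local_modules=["lab_booking"]),
    Site("alumni"),
]

# The upstream publishes version 2; one site tests and deploys it first.
portfolio[0].apply_upstream(2)
print(sites_behind(portfolio, latest=2))
```

In a strict multisite setup there is no equivalent of `sites_behind`: the shared codebase is either updated for everyone or for no one.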

Pantheon also includes godlike tools for central administration: the ability to allocate access, generate reports, and push out updates en masse if needed for security.

Dedicated and Secure Resources

Just like everything else on Pantheon, we use our next-generation containerized infrastructure for every Pantheon One site. That means no site can step on any other’s resources, or pose a security risk.

This also means every site has access to the full suite of Pantheon tools as needed: Redis, Solr, Multidev, team management, New Relic... you name it, we’re ready to deliver it. Every site is a first-class citizen of the cloud; if it takes off we can let you scale it at a moment’s notice. Per-site customizations in the code are seamlessly possible if you need them, but there’s never a need to make infrastructure changes to support a growing or evolving site.

Focus on Your Use Case

Pantheon can’t solve the hardest question (“what should my Drupal start state do?”), but we can let you focus on that like a laser by taking away all the administrative headaches. Pantheon One has access control, DNS setup, shared SSL, billing integration, and bulk workflow baked in.

That means you and your Drupal team can focus on the most important problems for your business rather than getting lost in the weeds of infrastructure. It still takes a lot of hard work, and potentially professional help, to build a good Drupal product as your enterprise platform. With Pantheon you can focus your attention directly on this problem, and forget all the risks and hang-ups and plumbing complexities that typically hamper teams when they have to contend with the “multisite” pattern.

If you want to know more about Pantheon One, contact us today.

If you're building WordPress sites, you can learn more about our new WordPress Multisite support on our new resource page.
