David Strauss, Co-Founder & CTO December 4, 2012 Reading estimate: 4 minutes
How Pantheon Scales
We often get questions from customers about how Pantheon scales to meet production traffic demands. Usually, people want to know how much memory or how many cores their site gets for running PHP and the database, which are not quite the right questions for Pantheon projects.
Let’s step through common site bottlenecks and explore how Pantheon supports scale—and your sites can too with the right design.
The Application: PHP
This is now the main bottleneck we see in Drupal applications, typically because of many modules and the heavy theme layer in Drupal 7. This is also the easiest to scale on Pantheon through a system we call DROPs.
A “DROP” on Pantheon includes an nginx instance, a pool of up to eight PHP-FPM processes, and a files directory client. Each basic DROP includes at least 256MB of memory for PHP, shared among the processes.
We monitor machines hosting DROPs to ensure that we strike the right balance to give applications access to processing power without contention. We also ensure this balance at the kernel level: DROPs have secure isolation and enforced parity for CPU, memory, bandwidth, and disk access. I discuss some of the technology we use for this isolation in Drupal Watchdog (Volume 2, Issue 2: “Daemon and Process Isolation”).
DROPs give Pantheon customers unprecedented precision in scaling Drupal applications while balancing economies of scale from larger process pools—benefits like shared APC opcode caches. New DROPs can be provisioned on-demand by jumping up to the next tier through the site dashboard. They go into service immediately and get billed with the site. We even prorate them down to the hour as they get added or removed.
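As a rough illustration of what hourly proration means, the sketch below charges a DROP only for the hours it was in service. The monthly price, the 730-hour month, and the choice to round partial hours up are all hypothetical values picked for the example, not Pantheon's actual rates or billing formula.

```python
from datetime import datetime

HOURS_PER_MONTH = 730  # assumption: average month length used for proration


def prorated_charge(monthly_price: float, added: datetime, removed: datetime) -> float:
    """Charge for a resource by the hour; partial hours round up (an assumption)."""
    seconds = (removed - added).total_seconds()
    if seconds <= 0:
        return 0.0
    hours = -(-seconds // 3600)  # ceiling division on elapsed seconds
    return round(monthly_price * hours / HOURS_PER_MONTH, 2)


# Hypothetical DROP priced at 73.00 per month, in service for 36 hours.
charge = prorated_charge(73.00, datetime(2012, 12, 4, 8), datetime(2012, 12, 5, 20))
print(charge)
```

Under these assumed numbers, a DROP added for a day and a half costs a small fraction of the monthly price rather than a full month.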
At our edge, we balance traffic among the DROPs your site has. If one (or even a few) are down, we transparently redirect traffic to healthy ones. We perform this redirection before the first request even fails—this isn’t your standard load balancer with health checks every few minutes.
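One way to picture this kind of transparent failover is a balancer that retries a request on another backend the moment one fails, instead of waiting for a periodic health check. The following is a minimal sketch under that assumption; the class, the backend names, and the use of `ConnectionError` as the failure signal are hypothetical and do not describe Pantheon's actual edge.

```python
from typing import Callable, Dict, List, Set


class NoHealthyBackend(Exception):
    """Raised when every backend has failed."""


class EdgeBalancer:
    def __init__(self, backends: Dict[str, Callable[[str], str]]) -> None:
        self.backends = backends
        self.order: List[str] = list(backends)
        self.down: Set[str] = set()
        self.next_index = 0

    def handle(self, path: str) -> str:
        # Try each backend at most once, starting at the round-robin position.
        for _ in range(len(self.order)):
            name = self.order[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.order)
            if name in self.down:
                continue
            try:
                return self.backends[name](path)
            except ConnectionError:
                # Mark the backend down and move on to the next one,
                # so the client still gets a successful response.
                self.down.add(name)
        raise NoHealthyBackend(path)


def broken(path: str) -> str:
    raise ConnectionError("drop-a is down")


def healthy(path: str) -> str:
    return f"200 OK {path}"


balancer = EdgeBalancer({"drop-a": broken, "drop-b": healthy})
response = balancer.handle("/node/1")
print(response, balancer.down)
```

The request that hits the broken backend is answered by the healthy one, and later requests skip the failed backend entirely.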
The Database: MySQL
MySQL is another hotspot for Drupal applications. At Pantheon, we start by putting in place the best vertical scaling practices: InnoDB, a big buffer pool (for pro and enterprise sites), and a modern build of MySQL 5.5. We run databases on server-class hardware with hardware RAID 1+0 and SAS disks.
Adding more DROPs for PHP automatically scales your database vertically with more memory and CPU access. This extra database capacity is built into the cost of your DROPs.
If that’s not enough, we can deploy replica servers the same way as DROPs. These replica servers automatically update the configuration we send into the application. For a Drupal 7 site with Views, routing Views queries to MySQL replica servers takes one click from our customer support team and one click from you in the View configuration to let it run on secondary MySQL servers.
We can configure replicas in a tree that scales out, even something like four replicas of the primary and four more sub-replicas of each of those replicas.
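To make the size of such a tree concrete, here is a small worked count. The function name and its parameters are hypothetical; it simply assumes every server at each level feeds the same number of replicas.

```python
def replica_count(fan_out: int, depth: int) -> int:
    """Total replicas in a tree where each server feeds `fan_out` replicas per level."""
    return sum(fan_out ** level for level in range(1, depth + 1))


# Four replicas of the primary, each with four sub-replicas.
total = replica_count(fan_out=4, depth=2)
print(total)  # 4 first-level replicas + 16 sub-replicas = 20
```

In the example from the text, that is 20 read replicas behind a single primary, with the primary itself replicating to only four servers.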
The Cache: Redis
We deploy Redis on request for any Pantheon site. Drupal can send its cache items, session data, locks, and queues to Redis. That’s a lot more than was possible with memcached, the old cache of choice.
We deploy native PHP extension drivers for Redis connections, which is faster than interpreted PHP access. Redis even synchronizes your cache data to disk so restarting it doesn’t empty your cache.
The Files: Valhalla
Every Pantheon site accesses a highly available cluster of six nodes for file content. Whenever a file gets added to a Pantheon site, we write it to a minimum of three places. Loading the files through a browser takes a path right to the Valhalla cluster, skipping the common bottleneck of a network-mounted file system for delivery over HTTP. This path sends a strong ETag all the way down to the client browser, enhancing hit rates in Varnish and the browser itself.
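The reason a strong ETag helps hit rates is that a cache or browser can revalidate a file without downloading it again. The sketch below shows the general mechanism, assuming the validator is a SHA-256 digest of the file bytes; how Valhalla actually derives its ETags is not stated here, and the function names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def strong_etag(content: bytes) -> str:
    # Assumption: derive the validator from a SHA-256 digest of the file bytes.
    return '"' + hashlib.sha256(content).hexdigest() + '"'


def serve(content: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes, str]:
    """Return (status, body, etag), answering 304 when the client copy is current."""
    etag = strong_etag(content)
    if if_none_match == etag:
        return 304, b"", etag
    return 200, content, etag


first = serve(b"logo bytes", None)            # cold request: full body
second = serve(b"logo bytes", first[2])       # revalidation: no body sent
third = serve(b"new logo bytes", first[2])    # file changed: full body again
print(first[0], second[0], third[0])
```

Because the validator changes only when the bytes change, every cache between the file cluster and the browser can keep serving its copy until the file is actually updated.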
We’ve also retargeted many critical Drupal operations to interact directly with Valhalla to, for example, invalidate 100 imagecache derivatives at a time when the original gets updated.
We’re currently working on a next-generation client that adds multi-threaded access, a smart directory listing cache that should massively improve performance for sites with folders containing over 5000 files, and better analytics for sites bottlenecked on file access for operations like parsing metadata.
Pantheon is the only platform, anywhere, providing self-service Drupal sites with a highly available files directory and integrated workflow tools.
The Full-Text Index: Solr
Solr doesn’t just give Drupal a quality full-text search engine and flashy faceting tools; it also takes tons of load off MySQL. It’s hard to run a serious Drupal site without it.
We enhance it further with a custom Pantheon, Drupal-integrated Solr client that keeps a high-performance cache of recent queries. This caching helps quite a bit for sites delivering dynamic pages that include Solr-sourced content, like lists of related items.
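A cache of recent queries can be sketched as a small least-recently-used map placed in front of the search backend. This is an illustrative sketch only: the class name, the `max_entries` limit, and the `fake_solr` stand-in are hypothetical, and the real Pantheon client may work differently.

```python
from collections import OrderedDict
from typing import Callable, List


class RecentQueryCache:
    def __init__(self, search: Callable[[str], List[str]], max_entries: int = 128) -> None:
        self.search = search
        self.max_entries = max_entries
        self.entries: "OrderedDict[str, List[str]]" = OrderedDict()

    def query(self, q: str) -> List[str]:
        if q in self.entries:
            # Cache hit: mark the query as most recently used.
            self.entries.move_to_end(q)
            return self.entries[q]
        results = self.search(q)
        self.entries[q] = results
        if len(self.entries) > self.max_entries:
            # Evict the least recently used query.
            self.entries.popitem(last=False)
        return results


calls: List[str] = []


def fake_solr(q: str) -> List[str]:
    # Stand-in for a real Solr request; records each backend call.
    calls.append(q)
    return [f"result for {q}"]


cache = RecentQueryCache(fake_solr, max_entries=2)
cache.query("related:42")
cache.query("related:42")  # served from the cache, no second backend call
print(calls)
```

A page that renders the same related-items list on every request would reach the search backend once and then be served from memory until the entry is evicted.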
The Edge Cache: Varnish
If you’ve scaled a Drupal site in the past three years, you probably used Varnish as your first line of defense against traffic spikes. Pantheon runs a highly available cluster of Varnish servers that fronts projects on Pantheon. It’s kind of like a mini-CDN.
Our operations team has experience mitigating DDoS attacks directly at this edge and escalating to Rackspace network engineers as necessary to maintain the availability of Pantheon projects.
If users can’t resolve your site’s DNS records, nothing else matters. Pantheon has always relied on a globally distributed cluster of DNS servers to provide our customers with reliable service for routing to the cluster of edge servers.
We also maintain our own redundant DNS servers that are ready to go into service if our primary global provider experiences service issues.
Analytics: New Relic
Every Pantheon site has access to New Relic data on performance and scalability. With one click, you can get a two-week Pro trial on even free sites. You can also upgrade to a full New Relic Pro plan from your site’s dashboard. Like everything else, we prorate use of this service down to the hour and include it in your unified Pantheon services bill.