As we announce Pantheon’s transition to the Google Cloud Platform (GCP) from our former home at Rackspace, I’d like to cover why we chose GCP over the alternatives. The short answer is simply “to provide the best product for our customers,” but we can break that into two categories: today and tomorrow.
We’ve called our initial migration “lift and shift,” a term mostly used in enterprise—even mainframe—contexts. For Pantheon, this meant migrating while avoiding cascading changes to our architecture. However, this doesn’t mean there haven’t been major gains already.
Better network scalability.
Before the migration, one of our most common scalability challenges was cache access. Our servers at Rackspace would saturate their network links at just one or two gigabits per second, which remains a common throughput limit for many infrastructure providers. GCP provides 2 Gbps per core, up to a maximum of 16 Gbps per instance. We haven’t seen a saturated network connection to Redis since the migration.
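To make that arithmetic concrete, here is a minimal sketch of how the per-core allowance and the per-instance ceiling combine. The function name and its defaults are illustrative assumptions built only from the two figures quoted above (2 Gbps per core, 16 Gbps per instance); this is not a GCP API, and real limits vary by machine type.

```python
def network_cap_gbps(cores: int,
                     per_core_gbps: float = 2.0,
                     instance_max_gbps: float = 16.0) -> float:
    """Estimate an instance's network ceiling in Gbps.

    Hypothetical helper: the defaults are the figures quoted in this post,
    not values read from any cloud provider API.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Bandwidth grows linearly with core count until it hits the instance ceiling.
    return min(cores * per_core_gbps, instance_max_gbps)


if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 16):
        print(f"{cores:>2} cores: {network_cap_gbps(cores):.0f} Gbps")
```

Under these assumptions, the ceiling is reached at eight cores, and anything larger gets the same 16 Gbps.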
Peering for Global CDN.
One major change to the architecture of Pantheon in 2017 was the introduction of an integrated CDN. We did this to further enhance performance, deliver universal HTTPS, confront rising threats of DDoS attacks, and provide more flexibility in request routing to different datacenters. We also had future infrastructure partners in mind. It’s no accident that GCP and Fastly (our partner for Global CDN) have a peering relationship. This interconnection means more reliability, lower latencies, and lower costs. In this case, “lower costs” are one reason we could replace the old $30/mo HTTPS add-on (restricted to certain plans) with an integrated CDN and HTTPS stack that is provided free for all plans.
In the years since Pantheon’s founding (2010), we’ve seen a shift from bottlenecks in the database to bottlenecks in PHP. We’ve seen this even with the incredible improvements from PHP 7: it’s partly because of database improvements, partly from modern page caches leaving only the heavy work for PHP, and partly from frameworks like Drupal and WordPress leaning heavily on PHP’s more advanced language features. In any case, the CPU running Drupal or WordPress matters more than ever before, and we’re proud to say that GCP leads all clouds in keeping the fleet fresh. Because GCP’s pricing model delivers discounts on a global level (instead of through reserved instances, which require sticking with the same hardware for years), the platform will continue to enjoy up-to-date hardware into the future.
The Home of Kubernetes.
While Pantheon has deployed containers since before Docker or Kubernetes, the future of container orchestration is, clearly, Kubernetes. Pantheon already runs internal load balancers, filesystem services, and other components in containers managed by Kubernetes, and this has delivered improvements to monitoring, availability, and testing.
While we started with “lift and shift,” we certainly don’t plan to stop there. Now that we can use the entire fleet of GCP services—without crossing high-latency, unreliable WAN links—we can take the philosophy we’ve always advocated for websites to our own engineering team. That philosophy is to “stand on the shoulders of giants.” To us, it means focusing on where we can add the most value. Conversely, it means shedding the coding and operation of systems that others can handle better.
A Data-Centric Cloud from a Data-Centric Company.
While other cloud providers have been struggling to shoehorn high-level, data-centric services into clusters of VMs, Google has been building services on top of Bigtable and their incredible fiber network. This takes the value from merely offloading operational duties to being able to do things you can’t do anywhere else: the keywords here are real-time, multi-region, and managed scalability. We’re just starting to dabble in Google’s high-level data tools, but we will use them to deliver better data about websites, both to our customers and to our own operational teams.
Google is on the forefront of discovering vulnerabilities, addressing them quickly, and employing best-in-class security controls (physically and virtually). Their approach to security doesn’t end with the direct offerings on GCP, either. Google has also improved compartmentalization in Kubernetes (e.g. RBAC) and delivered market-leading work in capability-based security with their Firestore APIs. Almost any cloud can recreate the security controls of yesterday; GCP also provides the controls we’ll want going forward.
An Opinionated Cloud.
For Pantheon to (re)build services on high-level tools, we also need to be confident in two things: (1) that the tools we adopt will continue to be maintained and improved and (2) that the tools we adopt will have rich integration with related cloud services. Otherwise, we’re just shedding operational responsibility rather than gaining new capabilities. We’ve been skeptical of an approach we’ve seen outside of GCP, which I’ve called “throw the service against the wall and see if it sticks.” Because there’s little commitment to each service getting long-term investment, it results in (nearly) abandoned APIs and weak integrations with other services on the same cloud. We believe in delivering an opinionated platform to our customers; the same benefits exist for infrastructure.
For the Pantheon of 2020
We’ll always be adjusting our roadmaps, but our long-term goal remains the same: delivering the world’s best website management platform. As we settle into our new home on the Google Cloud, we’re re-dedicating ourselves to that goal by standing on Google’s shoulders to lift the platform capabilities all the higher. This means rebuilding core services on top of high-level tools, migrating to more standardized containers (orchestrated by Kubernetes), and, as a result, having more bandwidth to improve the areas where Pantheon is truly unique. These areas include our dashboard, our developer tools, our WordPress/Drupal workflow integrations, and our choices for the components running the websites themselves.
Here’s to 2018 and the years to come!