Josh Koenig, Co-Founder & Chief Strategy Officer January 23, 2018 Reading estimate: 4 minutes
Pantheon Moves to Google Cloud Platform
Today we’re very happy to publicly announce Pantheon has switched from Rackspace to Google Cloud Platform (GCP) as our primary infrastructure provider. This means faster performance, higher uptime, and increased innovation ahead for all of our customers.
For anyone concerned about this move impacting their site on Pantheon, rest easy: it happened six months ago. We completed the wholesale migration of our CMS container matrix and attached database, filestore, cache, and search index services this past summer, 90% of it in one epic two-week sprint in mid-July.
You read that right. Our team pulled off a complete infrastructure switch for hundreds of thousands of websites, including those under active development and/or heavy production traffic loads, in just two weeks, all without downtime, regressions, or maintenance windows.
A few savvy developers did notice the change at the time (via IP address lookups), but the vast majority of our customers were unaware that anything had happened, which was our goal. You can read the Google Cloud team’s take on the migration here.
It took us much of the rest of 2017 to finish migrating all our internal services, and to complete some high-priority refactoring to begin taking advantage of GCP’s advanced capabilities. That’s why we’ve held off on the announcement until now. We hope our customers and partners will join us in celebrating this milestone.
Why GCP? Innovation and Enterprise Capabilities
While the migration story itself is impressive (more on that later), the really important thing is how this move positions us and our customers for the future. GCP represents more than a hardware upgrade: it puts us in a much stronger position to innovate with new capabilities, and to deliver on oft-requested enterprise requirements.
As a newer entrant to the “cloud wars”, Google is investing heavily in their platform, especially compared to our previous provider, Rackspace. While RAX was a great partner for us, especially in the early days when their VMs were significantly faster than even the highest-priced AWS EC2 instances, they’ve made clear their intent to focus more on support than on their own public cloud offering, so it was time for us to move on.
From our perspective, GCP is the best provider when it comes to being a “platform for platforms.” They have an opinionated suite of services which are architected to fit together in specific ways to solve specific problems, an aesthetic which clearly resonates with us. Their commitment to open source, the open web, and Kubernetes also fits with our long-term roadmap.
We now have access to first-class infrastructure automation tools, an amazing suite of big data tools, and turnkey machine learning and analysis capabilities. We are already using these internally, and look forward to developing new and innovative customer-facing features in the months and years to come.
Furthermore, Google operates what is arguably the world’s most sophisticated global network and physical data center infrastructure. This allows us to begin passing through more and better industry certifications for our customers, as well as giving us a clear path to run sites in and across multiple data centers, including non-US regions.
Stay tuned for updates in the coming months as these efforts begin bearing fruit and we bring new capabilities to market. In the meantime, if you have specific questions regarding what’s new as a result of this shift, we’re happy to chat about it.
How We Did It: Software > Servers
Our migration journey was quite a feat: a testament to our engineering team and the best possible proof point for our “Software > Servers” approach. If we were in the business of providing traditional managed VMs for customers, we’d never have been able to pull off a migration of this scale in the timeframe we did.
The epic migration push in July was a huge success, with over 50,000 of the most active sites on the platform moving over in one two-week sprint. However, that didn’t happen without a lot of planning, testing, and tooling.
First, we were able to leverage a lot of our existing container-balancing technology. Since we already have standard procedures for handling the inevitability of individual servers failing or needing to be replaced, there was an existing foundation in place for “move resource X for site Y from point A to point B” type operations.
This capability is what lets us deliver some of our core features, like smooth scaling to handle internet-scale traffic spikes, as well as unlimited on-demand development environments to allow for parallel development, testing, and training.
Of course, there’s a big difference between spinning up a new dev environment and moving every single resource for every environment for every site halfway across the country and onto a completely separate network. It took several months of engineering and testing to bring our migration tooling up to this task.
We were also able to leverage our partnership with New Relic to very closely monitor site performance and respond to edge cases quickly. New Relic’s “X-ray vision” into CMS internals gave us a high level of confidence in our testing, and also allowed us to fix minor regressions in near-real time without having to halt the migration.
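To make the “move resource X for site Y from point A to point B” idea concrete, here is a minimal Python sketch of a move operation that is gated by a health check and reverts on failure, so the overall run can continue instead of halting. It is an illustration only, not Pantheon's actual tooling: the `Resource` model, the `move_resource` and `migrate` functions, and the `is_healthy` callback (standing in for a monitoring check) are all hypothetical names, and the "move" is reduced to relabeling a location.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Resource:
    """One movable piece of a site (hypothetical model for illustration)."""
    site: str
    kind: str       # e.g. "container", "database", "filestore"
    location: str   # label of the provider currently hosting it


def move_resource(resource: Resource, destination: str,
                  is_healthy: Callable[[Resource], bool]) -> bool:
    """Move one resource; revert it if the post-move health check fails.

    Returns True when the resource ends up at `destination`.
    """
    origin = resource.location
    if origin == destination:
        return True                      # already there, so re-runs are safe
    resource.location = destination      # stand-in for copy and cutover
    if is_healthy(resource):
        return True
    resource.location = origin           # stand-in for rollback
    return False


def migrate(resources: List[Resource], destination: str,
            is_healthy: Callable[[Resource], bool]) -> List[Resource]:
    """Move every resource without stopping; return the ones that reverted."""
    return [r for r in resources
            if not move_resource(r, destination, is_healthy)]
```

In this sketch a failed check only sends that one resource back to its origin and records it for follow-up, which mirrors the idea of fixing edge cases while the rest of the migration keeps going.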
Onward and Upward
With the right platform, tools like Drupal and WordPress can run circles around what traditional enterprise vendors are hawking. We expect increasing adoption of open source CMS technology as more and more companies, organizations, and digital agencies see the business value of embracing the open web.
Ultimately, websites exist to deliver results. In the modern era, that means moving quickly, experimenting, iterating, doubling down on what works, and being able to keep up with your own success.
That’s why more and more websites are powered by CMSs vs. made by hand, and also why more and more CMS-powered websites leverage open source. You just can’t move all that fast when you’re dealing with a massive proprietary systems implementation, and you can’t go all that far if you’re inventing it all on your own.
Pantheon has helped thousands of customers make the leap and realize the benefits of a high-velocity, high-quality, agile web presence. With GCP as our core infrastructure, we’re in a stronger position than ever to deliver value. We look forward to a multitude of improvements to Pantheon’s core platform, dashboard interface, and related tools made possible by this migration.