Pantheon Advanced Page Cache: Drupal Cache Metadata on a Global CDN

Steve Persch, Director, Developer Relations

Pantheon has always provided full-page caching at our edge layer so that your HTML responses reach your website's visitors as fast as possible and your site can absorb huge spikes of traffic. Now, with the launch of our Global CDN, those cached responses and their metadata go to 30+ points of presence worldwide. Responses from your site will arrive much faster when the caches are physically closer to your site visitors.

The new Pantheon Advanced Page Cache module sends Drupal's cache metadata as HTTP headers so that the Global CDN can clear specific items from cache whenever your underlying data changes. That means you can safely set longer TTLs for your content, increasing your cache hit-rate, and delighting more of your users with a snappy sub-second pageload.

That’s great, but as every developer knows, the real work with caching isn’t getting things to cache in the first place; it’s making sure they expire at the right time. Hence the adage that cache invalidation is one of the two “hard problems” in computer science, along with naming things and off-by-one errors. Enter the Pantheon Advanced Page Cache module.

How the Pantheon Advanced Page Cache Module Works

When a browser requests a page like example.com/node/1, the response from a Drupal site will contain a Surrogate-Key header with lots of keys like "node:1" and "user:1". Those keys indicate that the page includes a representation of node 1 and user 1 (the author of the node). If either of those entities is re-saved (the node is edited, or the username of user 1 changes), our module will tell the Global CDN to clear any page that included that entity. Pages that did not contain node 1 or user 1 can remain cached.
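To make the idea concrete, here is a minimal Python sketch of key-based purging. It is a toy model, not the module's or the Global CDN's actual implementation: the `EdgeCache` class, its method names, and the example URLs are all hypothetical. It only shows that purging one key clears exactly the pages tagged with it.

```python
from collections import defaultdict


class EdgeCache:
    """Toy stand-in for a CDN that indexes cached pages by surrogate key."""

    def __init__(self):
        self.pages = {}                      # url -> cached body
        self.urls_by_key = defaultdict(set)  # surrogate key -> urls tagged with it

    def store(self, url, body, surrogate_keys):
        """Cache a response and remember which keys it was tagged with."""
        self.pages[url] = body
        for key in surrogate_keys:
            self.urls_by_key[key].add(url)

    def purge_key(self, key):
        """Drop every cached page tagged with `key`; return the purged URLs."""
        tagged = self.urls_by_key.pop(key, set())
        purged = sorted(url for url in tagged if url in self.pages)
        for url in purged:
            del self.pages[url]
        return purged


cache = EdgeCache()
cache.store("/node/1", "<html>node 1</html>", ["node:1", "user:1"])
cache.store("/node/2", "<html>node 2</html>", ["node:2", "user:1"])
cache.store("/about", "<html>about</html>", ["node:3", "user:2"])

# Re-saving node 1 purges only the page that rendered node 1.
purged = cache.purge_key("node:1")
print(purged)
print(sorted(cache.pages))
```

In this sketch, a later purge of "user:1" would clear /node/2 as well, because that page also rendered user 1, while /about would stay cached.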

Before leveraging the underlying Drupal cache metadata system in this way, many sites would just clear all of their caches with every node save. Or they would set a low cache max-age so that stale content would linger only for a limited period of time. Neither was a good option.

Related Caching Modules

For sites making heavy use of the Views module, you will probably want to use the Views Custom Cache Tags module. This module lets you set different tags for different Views. Those tags then become Surrogate Keys sent to the CDN. For example, a View of article nodes could use the tag "node:type:article" and a View of page nodes could be tagged with "node:type:page". Then when an article node is saved, you can clear the article-specific cache and leave the cache for the page content type alone.
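The effect of per-type tags can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names, the View config keys, and the URLs are hypothetical, and it assumes the site invalidates the type-level tag (for example "node:type:article") whenever a node of that type is saved.

```python
def keys_to_purge(entity_type, entity_id, bundle):
    """Keys assumed to be invalidated when one entity is saved (hypothetical helper)."""
    return {f"{entity_type}:{entity_id}", f"{entity_type}:type:{bundle}"}


def pages_to_clear(cached_pages, purge_keys):
    """Return the URLs whose surrogate keys overlap the purged keys."""
    return sorted(url for url, keys in cached_pages.items() if keys & purge_keys)


# Hypothetical cached pages and the surrogate keys each one was sent with.
cached_pages = {
    "/articles": {"node:type:article", "config:views.view.articles"},
    "/pages": {"node:type:page", "config:views.view.pages"},
    "/node/4160": {"node:4160", "user:1"},
}

# Saving article node 4160 clears the article View and the node's own page,
# but leaves the View of page nodes cached.
cleared = pages_to_clear(cached_pages, keys_to_purge("node", 4160, "article"))
print(cleared)
```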

In practice, your Views probably have more complex filters than simply their content types. So take the time to check, as you save nodes, that the right pages have their caches cleared and the rest stay cached. I personally like to use "curl -I" to look at the HTTP headers of a page. For example, I can pipe to grep to see just the Surrogate Keys and the age of the cache:

curl -IH "Pantheon-Debug:1" http://55-d8papc.pantheonsite.io/custom-cache-tags/article | grep -E 'Surrogate-Key|Age'

 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
 0 72451    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

Surrogate-Key-Raw: block_view config:block.block.bartik_account_menu config:block.block.bartik_branding config:block.block.bartik_breadcrumbs config:block.block.bartik_content config:block.block.bartik_footer config:block.block.bartik_help config:block.block.bartik_local_actions config:block.block.bartik_local_tasks config:block.block.bartik_main_menu config:block.block.bartik_messages config:block.block.bartik_page_title config:block.block.bartik_powered config:block.block.bartik_search config:block.block.bartik_tools config:block_list config:color.theme.bartik config:filter.format.plain_text config:image.style.medium config:search.settings config:system.menu.account config:system.menu.footer config:system.menu.main config:system.menu.tools config:system.site config:user.role.anonymous config:views.view.view_node_type_ab file:1845 file:1846 file:1847 file:1849 file:1850 file:1851 file:1852 file:1854 file:1855 file:1857 http_response node:4143 node:4144 node:4147 node:4154 node:4155 node:4156 node:4158 node:4160 node:4161 node:4163 node:type:article node_view rendered user:0 user:1 user_view

Age: 580

The "Age" header shows how old the cache is, measured in seconds. If I were to re-save node 4160, one of the nodes whose key appears in the header, or really any article node, the cache for this page would be cleared and the next request to it would get an Age of 0.
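If you want to script this check rather than eyeball it, a small parser is enough. The sketch below is an assumption-laden example, not part of the module: `parse_cache_headers` is a hypothetical helper, and the sample text is an abbreviated version of the output above. It accepts raw "Name: value" lines (such as saved curl output) and pulls out the surrogate keys and the age.

```python
def parse_cache_headers(raw_headers):
    """Return (surrogate_keys, age_seconds) from raw 'Name: value' header lines."""
    keys, age = set(), None
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a header line (e.g. curl's progress meter)
        name = name.strip().lower()
        value = value.strip()
        if name in ("surrogate-key", "surrogate-key-raw"):
            keys.update(value.split())  # keys are space-separated
        elif name == "age":
            age = int(value)
    return keys, age


# Abbreviated sample based on the response shown above.
sample = (
    "Surrogate-Key-Raw: node:4160 node:4161 node:type:article user:1\n"
    "Age: 580\n"
)

keys, age = parse_cache_headers(sample)
print("node:4160" in keys, age)
```

With the keys in a set, checking whether a page should be cleared by a given node save becomes a simple membership test.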

Once you trust that you are clearing all the right pages using surrogate keys, you can start increasing the max-age for your page caches. You move from the assumption that every page will expire often (maybe every 5 minutes or so) to thinking that some pages could be cached for hours, or longer, as long as the data referenced in the surrogate keys remains the same. The Cache Control Override module allows the max-age information within a page's render arrays to bubble up to the level of HTTP headers. This module will help you take finer-grained control of max-age across different pages on your site. Without it, the site-wide setting applies to all pages.
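One way to picture that bubbling is as taking the shortest max-age among everything rendered on the page. The Python sketch below illustrates that idea only. It is not the Cache Control Override module's code: the function names, the one-year stand-in for "permanent", and the exact header strings are assumptions, while the use of -1 for "cache permanently" follows Drupal's convention.

```python
PERMANENT = -1  # Drupal's convention: -1 means "cacheable forever"


def merge_max_ages(*max_ages):
    """Combine component max-ages: the shortest finite lifetime wins."""
    finite = [age for age in max_ages if age != PERMANENT]
    return min(finite) if finite else PERMANENT


def cache_control(max_age, permanent_ttl=31536000):
    """Build an illustrative Cache-Control value (permanent_ttl is an assumed one-year cap)."""
    ttl = permanent_ttl if max_age == PERMANENT else max_age
    if ttl <= 0:
        return "no-cache"
    return f"max-age={ttl}, public"


# A page made of a permanent block, an hourly block, and a 5-minute block
# can only be cached for as long as its shortest-lived part.
page_max_age = merge_max_ages(PERMANENT, 3600, 300)
print(page_max_age, cache_control(page_max_age))
```

Under this model, a page built only from permanently cacheable parts keeps a long lifetime and relies on surrogate-key purges to stay fresh.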

Note: It's possible that by the time you are reading this blog post, the functionality of Cache Control Override has made it into core and the separate module is no longer necessary.

Try It Out

Special thanks to Fabian Franz of Tag1 for writing the module. Fabian is a Drupal 7 Core committer and Framework manager. He is also the co-creator of the underlying cache system our module relies upon.

Alpha releases of the module for Drupal 7 and 8 came out last month. For the Drupal 7 version, you'll need Fabian's Drupal 8 Cache Backport module.

Go ahead and try it out! If you have any questions, concerns, or feedback, you can find me and other Pantheors in the Drupal.org issue queue, the Pantheon Community and Power Users Group, and via GitHub pull requests.
