Caching proxies¶
It is common to place a so-called caching reverse proxy in front of Zope when hosting large Plone sites. On Unix, a popular option is Varnish, although Squid is also a good choice. On Windows, you can use Squid or the (commercial, but better) Enfold Proxy.
It is important to realise that whilst plone.app.caching provides some functionality for controlling how Plone interacts with a caching proxy, the proxy itself must be configured separately.
Some operations in plone.app.caching can set response headers that instruct the caching proxy how best to cache content. For example, it is normally a good idea to cache static resources (such as images and stylesheets) and "downloadables" (such as Plone content of the types File or Image) in the proxy. This content will then be served to most users straight from the proxy, which is much faster than Zope.
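As a rough illustration of the kind of headers involved, the sketch below builds a Cache-Control value that lets a shared cache (the proxy) keep a response while browsers are asked to revalidate. The function name and the lifetimes are assumptions made for this example; they are not the API or the values used by plone.app.caching.

```python
def proxy_cache_headers(proxy_seconds, browser_seconds=0):
    """Build headers that let a shared cache keep a response for proxy_seconds.

    s-maxage applies only to shared caches such as a reverse proxy, where it
    overrides max-age; max-age applies to every cache, including the browser.
    """
    if proxy_seconds < 0 or browser_seconds < 0:
        raise ValueError("cache lifetimes must not be negative")
    directives = [f"max-age={browser_seconds}", f"s-maxage={proxy_seconds}"]
    if browser_seconds == 0:
        # Ask browsers to check back with the server before reusing a copy.
        directives.append("must-revalidate")
    return {"Cache-Control": ", ".join(directives)}


# Hypothetical example: keep an image in the proxy for one day.
headers = proxy_cache_headers(86400)
print(headers["Cache-Control"])  # max-age=0, s-maxage=86400, must-revalidate
```

Keeping the browser lifetime at zero means only the proxy holds a long-lived copy, which is the copy Plone is in a position to invalidate.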
The downside of this approach is that an old version of a content item may be returned to a user, because the cache has not been updated since the item was modified. There are three general strategies for dealing with this:
- Since resources are cached in the proxy based on their URL, you can "invalidate" the cached copy by changing an item's URL when it is updated. This is the approach taken by Plone's ResourceRegistries (portal_css, portal_javascript & co): in production mode, the links that are inserted into Plone's content pages for resources managed by ResourceRegistries contain a time-based token, which changes when the ResourceRegistries are updated. This approach has the benefit of also being able to "invalidate" content stored in a user's browser cache.
- All caching proxies support setting timeouts. This means that content may be stale, but typically only up to a few minutes. This is sometimes an acceptable policy for high-volume sites where most users do not log in.
- Most caching proxies support receiving PURGE requests for paths that should be purged. For example, if the proxy has cached a resource at /logo.jpg, and that object is modified, a PURGE request could be sent to the proxy (originating from Zope, not the client) with the same path to force the proxy to fetch a new version the next time the item is requested.
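To make the third strategy concrete, here is a minimal sketch of sending a PURGE request with Python's standard library. The proxy address http://127.0.0.1:6081 and the helper names are assumptions for illustration, the sketch assumes plain HTTP, and it is not how plone.cachepurging itself sends requests. The proxy must also be configured to accept PURGE from the sending host.

```python
import http.client
from urllib.parse import urlsplit


def purge_target(proxy_url, path):
    """Return (host, port, path) for a PURGE request to the proxy."""
    parts = urlsplit(proxy_url)
    port = parts.port or 80  # assumption: the proxy speaks plain HTTP
    if not path.startswith("/"):
        path = "/" + path
    return parts.hostname, port, path


def send_purge(proxy_url, path, timeout=5):
    """Send PURGE for path to the proxy and return the HTTP status code."""
    host, port, path = purge_target(proxy_url, path)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # http.client accepts non-standard method names such as PURGE.
        conn.request("PURGE", path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical proxy address; adjust to your own setup.
    print(send_purge("http://127.0.0.1:6081", "/logo.jpg"))
```

The status code returned depends entirely on how the proxy is configured, so the sketch simply reports it rather than interpreting it.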
The final option, of course, is to avoid caching content in the proxy altogether. The default policies will not allow standard content pages to be cached in the proxy, because it is too difficult to invalidate cached instances. For example, if you change a content item's title, that may require invalidation of a number of pages where that title appears in the navigation tree, folder listings, Collections, portlets, and so on. Tracking all these dependencies and purging in an efficient manner is impossible unless the caching proxy configuration is highly customised for the site.
Purging a caching proxy¶
Synchronous and asynchronous purging is enabled via plone.cachepurging. In the control panel, you can configure the use of a proxy via various options, such as:
- Whether or not to enable purging globally.
- The address of the caching server to which PURGE requests should be sent.
- Whether or not virtual host rewriting takes place before the caching proxy receives a URL. This has implications for how the PURGE path is constructed.
- Any domain aliases for your site, to enable correct purging of content served via e.g. http://example.com and http://www.example.com.
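The last two options interact. If the proxy receives URLs that have already been rewritten for Zope's VirtualHostMonster, the cached copies are stored under the rewritten paths, so a PURGE would need a path of the same shape for each domain alias. The sketch below illustrates that idea only; the function name and the site root /Plone are hypothetical, and this is not the plone.cachepurging implementation.

```python
from urllib.parse import urlsplit


def virtual_host_purge_paths(path, domains, site_root="/Plone"):
    """Return one VirtualHostMonster-style purge path per domain.

    site_root is the (hypothetical) path of the Plone site inside Zope.
    """
    default_ports = {"http": 80, "https": 443}
    paths = []
    for domain in domains:
        parts = urlsplit(domain)
        port = parts.port or default_ports[parts.scheme]
        paths.append(
            f"/VirtualHostBase/{parts.scheme}/{parts.hostname}:{port}"
            f"{site_root}/VirtualHostRoot{path}"
        )
    return paths


for purge_path in virtual_host_purge_paths(
    "/logo.jpg", ["http://example.com", "http://www.example.com"]
):
    print(purge_path)
```

If rewriting happens after the proxy instead, the proxy sees the public path (here /logo.jpg) and no rewritten form would be needed.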
The default purging policy is geared mainly towards purging file and image resources, not content pages, although basic purging of content pages is included. The actual paths to purge are constructed from a number of components providing the IPurgePaths interface. See plone.cachepurging for details on how this works, especially if you need to write your own.
The default purge paths include:
- ${object_path} -- the object's canonical path
- ${object_path}/ -- in case the object is a folder
- ${object_path}/view -- the view method alias
- ${object_path}/${default-view} -- in case a default view template is used
- The download URLs for any Archetypes object fields, in the case of Archetypes content. This includes support for the standard File and Image types.
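The list above can be pictured as a small function that expands one object path into the paths to purge. This is a standalone sketch, not the IPurgePaths components themselves; the function name, its parameters and the example download path are assumptions, and it assumes the object is not the site root.

```python
def default_purge_paths(object_path, default_view=None, download_paths=()):
    """Return the paths to purge for one object, following the list above."""
    base = object_path.rstrip("/")
    paths = [
        base,            # the object's canonical path
        base + "/",      # in case the object is a folder
        base + "/view",  # the view method alias
    ]
    if default_view:
        # Only relevant when a default view template is used.
        paths.append(f"{base}/{default_view}")
    # Download URLs for file-like fields, supplied by the caller.
    paths.extend(download_paths)
    return paths


# Illustrative values only.
for path in default_purge_paths(
    "/files/report",
    default_view="file_view",
    download_paths=["/files/report/at_download/file"],
):
    print(path)
```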
Files and images created (or customised) in the ZMI are purged automatically when modified. Files managed through the ResourceRegistries do not need purging, since they have "stable" URLs. To purge Plone content when modified (or removed), you must select the content types in the control panel. By default, only the File and Image types are purged.
You should not enable purging for types that are not likely to be cached in the proxy. Although purging happens asynchronously at the end of the request, it may still place unnecessary load on your server.
Finally, you can use the Purge tab in the control panel to manually purge one or more URLs. This is a useful way to debug cache purging, as well as a quick solution for the awkward situation where your boss walks in and wonders why the "about us" page is still showing that old picture of him, before he had a new haircut.
Installing and configuring a caching proxy¶
The plone.app.caching package includes some example buildout configurations in the proxy-configs directory. Two versions are included: one demonstrating a Squid-behind-Apache proxy setup and another demonstrating a Varnish-behind-Apache proxy setup. Both examples also demonstrate how to properly configure split-view caching.
These configurations are provided for instructional purposes but with a little modification they can also be used in production. To use in a real production instance, you will need to adjust the configuration to match your setup. For a simple standard setup, you might only need to change the hostname value in the buildout.cfg. Read the README.txt files in each example for more instructions.
There are also some alternative buildout recipes for building and configuring proxy configs: plone.recipe.squid and plone.recipe.varnish. The examples in this package avoid these recipes in favor of a more explicit, and hopefully more educational, template-based approach. Even if you decide to use one of the automated recipes, it will probably be worth your while to study the examples included in this package to get a few pointers.
Running Plone behind Apache 2.2 with mod_cache¶
Apache 2.2 has a known bug in its handling of the HTTP response header Cache-Control with value max-age=0, or an Expires header with a date in the past. In these scenarios mod_cache will not cache the response, no matter what value of s-maxage is set.
https://issues.apache.org/bugzilla/show_bug.cgi?id=35247
One possible workaround for this is to use mod_headers directives in your Apache configuration to set max-age=1 if s-maxage is positive and max-age is 0, and also to drop the Expires header:
Header edit Cache-Control max-age=0(.*s-maxage=[1-9].*) max-age=1$1
Header unset Expires
Dropping the Expires header has the disadvantage that HTTP 1.0 clients and proxies may not cache your responses as you wish.
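To see what the two directives do to a response, the rewrite can be modelled in Python with the same regular expression. This is only a model for checking the pattern against sample header values; the function name is an assumption and nothing here runs inside Apache.

```python
import re

# Same pattern as the Header edit directive; $1 in Apache is \g<1> here.
PATTERN = re.compile(r"max-age=0(.*s-maxage=[1-9].*)")


def apply_workaround(headers):
    """Return a copy of headers with the mod_headers workaround applied."""
    result = dict(headers)
    if "Cache-Control" in result:
        # Header edit replaces at most one match per header value.
        result["Cache-Control"] = PATTERN.sub(
            r"max-age=1\g<1>", result["Cache-Control"], count=1
        )
    # Equivalent of: Header unset Expires
    result.pop("Expires", None)
    return result


before = {
    "Cache-Control": "max-age=0, s-maxage=3600, must-revalidate",
    "Expires": "Sat, 01 Jan 2000 00:00:00 GMT",
}
print(apply_workaround(before))
# {'Cache-Control': 'max-age=1, s-maxage=3600, must-revalidate'}
```

Note that the pattern only matches when max-age=0 appears before s-maxage in the header value, and it leaves the value untouched when s-maxage is 0.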