Cache Busting with Django

2016-04-22 | #django, #helper, #webdev

The moment you start deploying continuously to production systems, you'll run into the problem of cache busting for frontend statics: you need a way around CDN and client-side caching of those files. Django's mechanism for this is not activated by default and is a little bit difficult to find, but it is relatively painless to set up, even in a running and deployed system. And the best thing: it's robust and part of the core. Hooray!

You'll mainly need one central change in your settings.py, changing the default static storage to the manifest or cached static storage:

# default, replace this with one of the two below
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.StaticFilesStorage'

# maps filenames via staticfiles.json in STATIC_ROOT
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

# maps filenames in the configured cache (deprecated in Django 2.2, removed in 3.1)
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.CachedStaticFilesStorage'

How do these storages differ from the default static handling? First of all, this is only relevant for production: with DEBUG = True these storages behave no differently than the default, and the static tag renders the plain, unhashed filenames. In production, however, these storages change the workings of collectstatic and the static template tag.

During collectstatic, Django hashes the file contents and inserts that hash into the filename, just before the extension. It resolves a vanilla filename to a hashed one via a mapping table. With ManifestStaticFilesStorage, this table is staticfiles.json in the folder your statics are collected to (STATIC_ROOT); with CachedStaticFilesStorage, the mapping is stored in your cache.
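To make the renaming step concrete, here is a minimal sketch of the idea in plain Python, without Django. The helper names hashed_name and build_manifest are made up for this illustration, and the choice of MD5 truncated to 12 hex characters is an assumption of the sketch, not something you should rely on.

```python
import hashlib
import posixpath


def hashed_name(name, content):
    """Insert a short content hash before the file extension.

    Hypothetical helper: digest algorithm and length are assumptions.
    """
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return "{0}.{1}{2}".format(root, digest, ext)


def build_manifest(files):
    """Map every vanilla name to its hashed name.

    `files` is a dict of relative path -> file contents (bytes).
    """
    return {name: hashed_name(name, content) for name, content in files.items()}


files = {"css/site.css": b"body { color: red; }"}
manifest = build_manifest(files)
print(manifest)
```

Because the hash depends only on the contents, an unchanged file keeps its name (and stays cached) across deploys, while any edit produces a new URL.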

The static template tag now automatically resolves vanilla names to the newest version with a unique hash and renders that URL instead. collectstatic even searches your CSS via regex for @import and url( and automatically replaces the URLs it finds there as well. So, going from there, you'll only have to manage dynamically loaded assets manually (JavaScript AMD modules, dynamically set img srcs and such).
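The CSS rewriting can be pictured roughly like this. It is a simplified sketch and not Django's actual pattern: it only handles url(...) references (no bare @import "..." strings), and rewrite_css_urls as well as the manifest contents are invented for the example.

```python
import re

# Matches url(...) with optional quotes and captures the referenced path.
URL_PATTERN = re.compile(r"""url\(\s*['"]?([^'")\s]+)['"]?\s*\)""")


def rewrite_css_urls(css, manifest):
    """Replace every url(...) whose target is in the manifest."""

    def replace(match):
        original = match.group(1)
        if original not in manifest:
            # Unknown target (external URL, data URI, ...): leave untouched.
            return match.group(0)
        return 'url("{0}")'.format(manifest[original])

    return URL_PATTERN.sub(replace, css)


example_manifest = {"img/logo.png": "img/logo.0a1b2c3d4e5f.png"}
print(rewrite_css_urls("a { background: url(img/logo.png); }", example_manifest))
```

Anything the manifest does not know about is left exactly as it was written, which is why dynamically built URLs still need manual care.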

Another helpful thing you might want to do (if you are using Git) is to write a simple template filter that adds the current commit hash to a URL in one of your templates. Every commit/deploy then invalidates all URLs handled that way. If, for example, you need cache busting for user-uploaded files, this might be a good way of doing it. If you are using filer, though, you'll mostly not have that problem: media folders are named after the file hashes themselves, so every new upload invalidates itself automatically.

import os

from django import template
from django.conf import settings

register = template.Library()


class GitHeadRevisionTag(template.Node):
    # Cached commit id, read from disk only once per process
    head = None

    @staticmethod
    def _get_head():
        git_dir = os.path.normpath(os.path.join(settings.BASE_DIR, '.git'))
        try:
            # Read the HEAD ref, e.g. "ref: refs/heads/master"
            with open(os.path.join(git_dir, 'HEAD'), 'r') as fhead:
                head = fhead.readline().strip()
            if not head.startswith('ref:'):
                # Detached HEAD: the file contains the commit id itself
                return head
            ref_name = head.split(' ')[1].strip()
            # Read the commit id the ref points to
            with open(os.path.join(git_dir, ref_name), 'r') as fref:
                return fref.readline().strip()
        except (IOError, OSError, IndexError):
            # No readable Git metadata (or a packed ref): no cache buster
            return ''

    @staticmethod
    def get_head_cached():
        if settings.DEBUG:
            return ''
        # Compare against None so an empty result is cached as well
        if GitHeadRevisionTag.head is None:
            GitHeadRevisionTag.head = GitHeadRevisionTag._get_head()
        return GitHeadRevisionTag.head

    def render(self, context):
        return GitHeadRevisionTag.get_head_cached()


@register.tag(name='git_head_revision')
def git_head_revision(parser, token):
    return GitHeadRevisionTag()


@register.filter(name='git_cache_buster')
def git_cache_buster(url):
    revision = GitHeadRevisionTag.get_head_cached()
    if not revision:
        return url
    return '{url}?_gcb={revision}'.format(url=url, revision=revision)

After loading the template library that contains the filter (via {% load %}), you can add a Git-based cache buster to any URL by appending "|git_cache_buster". "settings.BASE_DIR" has to be the project's Git root directory. If DEBUG = True, the filter simply does nothing.
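One caveat: the filter appends "?_gcb=..." directly, so a URL that already carries a query string or a fragment would end up malformed. If you expect such URLs, a variant along these lines might help. It is only a sketch using the standard library (Python 3's urllib.parse), and the function name add_cache_buster and its param argument are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, revision, param="_gcb"):
    """Append the revision as a query parameter, keeping query and fragment."""
    if not revision:
        # Nothing to add, e.g. in DEBUG mode or without Git metadata.
        return url
    scheme, netloc, path, query, fragment = urlsplit(url)
    pairs = parse_qsl(query, keep_blank_values=True)
    pairs.append((param, revision))
    return urlunsplit((scheme, netloc, path, urlencode(pairs), fragment))


print(add_cache_buster("/media/a.png", "abc123"))
print(add_cache_buster("/media/a.png?w=100#top", "abc123"))
```

The filter body could then delegate to this function instead of formatting the string itself.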