HTML5 Page Cache with pjax + Web Storage + Firebase

I was curious if anyone was using HTML5 features like the appCache or localStorage to create some kind of client side cache of rendered pages of a dynamic website, and then using a technology like Firebase or a WebSockets implementation to invalidate the cache.

It seems like this particular type of caching maybe hasn’t been explored enough?

Projects like Rails have taken full advantage of server-side caching of pages and fragments of pages. They’ve even helped you do browser-side caching with things like ETags. However, ETags still require a round trip to the server: the browser has to send the request and wait for the response before it knows whether its copy is still good.

Then there are Cache-Control headers to tell your browser to cache pages. But there doesn’t seem to be a JavaScript API for invalidating what Cache-Control has cached. And finally, there’s HTML5’s appCache and its manifest, but that seems better suited to storing static assets.

Of course, you could design your app with JavaScript projects like Backbone, which remove some of the need for this type of approach, but if you already have a typical web application where HTML templates are rendered on the server, a client-side cache of those dynamic pages might be a nice win.

So I created a real quick and dirty proof of concept to see how you could start caching pjax data in a browser’s Web Storage (I chose to use sessionStorage for this) and then use a realtime technology like Firebase to invalidate the cache.

Let me repeat, this is very quick and very dirty :) I merely wanted to explore the facets of a technique like this. I haven’t added this type of thing to any production site or spent much time to make sure this thing is a tremendously fabulous idea.

But so far, it seems very interesting.

I decided to hack on top of Chris Wanstrath’s pjax project for a couple reasons:

  1. The pjax code already does the smart things to intercept clicks and handle push state appropriately.
  2. localStorage and sessionStorage are limited to roughly 5MB per origin in most browsers. So using pjax to only capture and store part of a page instead of the entire page might be a good way to go.
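
Since that limit is easy to hit, here’s a rough sketch of how a write to the cache could cope with a full store. The cacheFragment name is hypothetical and isn’t part of pjax or my fork, and dropping the whole cache when the quota is hit is just one possible policy. The storage object is passed in, so in the browser you’d hand it sessionStorage.

```javascript
// Hypothetical helper (not part of pjax): store an HTML fragment in a
// Storage-like object. If the write fails (browsers throw a DOMException,
// usually named QuotaExceededError, when the store is full), empty the
// store and retry once.
function cacheFragment(storage, key, html) {
  try {
    storage.setItem(key, html)
    return true
  } catch (err) {
    // Assumption: anything already stored is disposable cache data.
    storage.clear()
    try {
      storage.setItem(key, html)
      return true
    } catch (retryErr) {
      // The fragment does not fit even in an empty store: skip caching it.
      return false
    }
  }
}

// In the browser: cacheFragment(sessionStorage, '/aliens', data)
```

If it returns false, nothing was cached and the next click on that link simply goes to the server as usual.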

Here’s my fork of the pjax project:

The localcache_firebase branch is my update of the original Heroku demo of pjax. You should be able to pull it down and run the demo locally just as before to see the client-side cache in action.

If you pull down the “localcache_firebase” branch and run “rackup” from the root directory, you’ll be running a Sinatra app. If you go into pjax mode, you’ll notice that the first click to a page like “aliens” makes a trip to the server, but the second click to aliens is instantaneous and no trip to the server is made.

There isn’t much to this really. The tiny changes I made were to put the ajaxed data into sessionStorage:

 var location = getLocation(options.url)
 // Key the cached fragment by the URL's path and hash
 sessionStorage.setItem(location.pathname + location.hash, data)

And if pjax is called on a link, it looks to sessionStorage first:

  var location = getLocation(options.url)
  // Look in sessionStorage first; only go to the server on a cache miss
  var data = sessionStorage.getItem(location.pathname + location.hash)
  if (data) {
   handlePush(data, options)
   return true
  }
  pjax.xhr = $.ajax(options)
  $(document).trigger('pjax', [pjax.xhr, options])
  return pjax.xhr

Cache invalidations can happen in two ways:

1) Included with the pjax javascript, a form submit listener is automatically set up to clear your sessionStorage cache on any form submit. This way, if you create a new object from a form, you don’t have to wait for Firebase or WebSocket data to asynchronously tell your browser that the cache is stale.

 $('form').live('submit', function() {
  // Any form submit may change server state, so drop the whole cache
  sessionStorage.clear()
  return true
 })

2) There’s also the assumption you’ll use a realtime technology like Firebase or a WebSocket implementation to listen for events. So in the Heroku demo, I’ve included code that connects to my Firebase account in the head of layout.erb:

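  <script src=""></script>
  <script type="text/javascript">
   var dataRef = new Firebase('')
   dataRef.on('value', function(snapshot) {
    // Any change pushed from Firebase means the cache may be stale
    sessionStorage.clear()
   })
  </script>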

This merely demonstrates that you could use something like Firebase to push events for the things this user belongs to, telling the browser to clear its cache.

For example, let’s pretend we created an app like Basecamp. Anytime there is a new to-do list, we could use Firebase to tell any clients, who belong to the project which owns that to-do list, that they should invalidate their browser cache.
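
Here’s a rough sketch of that scenario, under some loud assumptions: the event payload shape ({ projectId: ... }), the makeInvalidationHandler name, and the list of project ids are all hypothetical, and nothing like this exists in my fork. It still clears everything, just like the demo, but only when the event is for a project this client belongs to.

```javascript
// Hypothetical sketch: build a handler that clears the cache only for
// events about projects this user belongs to. `storage` is any Storage-like
// object (sessionStorage in the browser); `myProjectIds` is an assumed list
// of project ids rendered into the page by the server.
function makeInvalidationHandler(storage, myProjectIds) {
  return function (event) {
    // Assumed payload shape: { projectId: 12 }
    if (!event || myProjectIds.indexOf(event.projectId) === -1) {
      return false // not one of our projects: keep the cache
    }
    storage.clear()
    return true
  }
}

// In the browser this could be wired to the Firebase listener, e.g.:
//   var handler = makeInvalidationHandler(sessionStorage, [7, 12])
//   dataRef.on('value', function (snapshot) { handler(snapshot.val()) })
```

Filtering on the client like this is only one option. You could just as well have each client listen on a per-project location so it never sees events for other projects.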

A couple immediate things to note:

1) Cache invalidation isn’t granular right now. I just clear all of sessionStorage which will in all likelihood break your current app if you are presently using sessionStorage for something else.

However, the cache is storing its data using URL paths as its keys. So on the server side, you could make cache invalidation much more granular by invoking methods to invalidate specific URLs.
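
As a rough sketch of what more granular invalidation could look like: the 'pjax:' key prefix and all of the function names below are hypothetical (the demo stores bare paths and calls sessionStorage.clear()). The prefix keeps cached pages apart from anything else your app keeps in sessionStorage, and the storage object is passed in so you’d hand it sessionStorage in the browser.

```javascript
// Hypothetical prefix so cached pages can be told apart from other data
// an app may keep in the same store.
var CACHE_PREFIX = 'pjax:'

function cachePut(storage, path, html) {
  storage.setItem(CACHE_PREFIX + path, html)
}

function cacheGet(storage, path) {
  return storage.getItem(CACHE_PREFIX + path) // null on a cache miss
}

// Remove only the given URL paths, e.g. a list pushed from the server.
function invalidatePaths(storage, paths) {
  paths.forEach(function (path) {
    storage.removeItem(CACHE_PREFIX + path)
  })
}

// Remove every cached page but leave unrelated keys alone.
function invalidateAll(storage) {
  // Collect the keys first: removing while iterating shifts the indexes.
  var doomed = []
  for (var i = 0; i < storage.length; i++) {
    var key = storage.key(i)
    if (key.indexOf(CACHE_PREFIX) === 0) doomed.push(key)
  }
  doomed.forEach(function (key) {
    storage.removeItem(key)
  })
}
```

With something like this, a pushed event could carry the list of stale paths and the listener would call invalidatePaths(sessionStorage, paths) instead of wiping everything.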

2) Firebase could be much more utilized here. I’m just using it as an event listener and pusher, but we could get rid of Web Storage entirely and store the cache right into Firebase itself. This could get real interesting, since the cache will then propagate itself to other clients and will further reduce server requests, if the cached entries are already primed on the client.

Finally, if you want to see someone doing some really interesting and ambitious things with pjax, check out Adrian Holovaty’s pjax branch in Django. He’s attempting to make pjax itself more granular and only update fragments of a page based on a diff of what changes from request to request.

There really isn’t much to my demonstration, and I just briefly hacked at the pjax project to create this. But I wanted to get your feedback on this approach to see if you are trying something like it already or have felt this could be handy.

P.S. There’s a chance you might dig following me on Twitter, here.

