Oliver Nassar

Caching is easy; releasing caches isn't.

July 22, 2009

When developing web apps that need to scale even moderately, caching will be your saviour. Whether it's assets such as images, JS files, and CSS, or data sets such as records from a users table, XML files, or anything else, caching is key.

But when dealing with data sets such as user records, posts, or comments, caching them will not be the hard part; knowing when to release or renew them will be. For example, if you cache the 10 most recent comments for a blog post to avoid SQL hits to your database, there are a few times you'd need to release or renew them:

  1. A new comment is added
  2. A comment is deleted
  3. The pagination changes from 10 to some other number
  4. The blog post is deleted
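To make those four cases concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `cache` dict stands in for whatever cache you actually use (memcached or similar), the `comments` dict stands in for the comments table, and the function names are hypothetical. The point is only that every write path has to remember to release the cached list.

```python
from collections import defaultdict

# Hypothetical stand-ins: a dict instead of a real cache server,
# and a dict of lists instead of a comments table.
cache = {}
comments = defaultdict(list)  # post_id -> list of comment texts, oldest first
PAGE_SIZE = 10


def cache_key(post_id, limit):
    # The limit is part of the key, so a new page size (case 3)
    # reads from a fresh key instead of a stale 10-item list.
    return f"Post[{post_id}]->Comments(last {limit})"


def recent_comments(post_id, limit=PAGE_SIZE):
    key = cache_key(post_id, limit)
    if key not in cache:
        # Cache miss: this slice is where the SQL query would run.
        cache[key] = comments[post_id][-limit:]
    return cache[key]


def release_comments(post_id):
    # Drop every cached comment list for this post, whatever its limit.
    prefix = f"Post[{post_id}]->Comments("
    for key in [k for k in cache if k.startswith(prefix)]:
        del cache[key]


def add_comment(post_id, text):
    comments[post_id].append(text)
    release_comments(post_id)  # case 1: a new comment is added


def delete_comment(post_id, text):
    comments[post_id].remove(text)
    release_comments(post_id)  # case 2: a comment is deleted


def delete_post(post_id):
    comments.pop(post_id, None)
    release_comments(post_id)  # case 4: the blog post is deleted
```

Forget the `release_comments` call in any one of those functions and readers keep seeing the old list until the entry expires on its own, if it ever does.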

While this may not seem like a big deal, this is just one situation in which the comment data is used, and just one type of data set. Imagine 20 different tables (common) and more than one place where a data set is shown (very common; for example, a widget on your homepage that shows the last 5 comments across all blog posts).

Having a clear naming convention for your cache keys is very important; it is probably the single most important factor in keeping your caching layer organized. It is much easier to track down caches that aren't working properly when the keys look like User[1]->Comments(last 10) rather than a sha1 hash of the SQL query that produces the results.
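A rough sketch of the difference, again in Python. The helper names and the example query are made up for illustration; only `hashlib.sha1` is a real library call.

```python
import hashlib


def readable_key(model, record_id, relation, detail):
    # Hypothetical helper that builds keys in the Model[id]->Relation(detail) style.
    return f"{model}[{record_id}]->{relation}({detail})"


def hashed_key(sql):
    # The alternative: an opaque sha1 digest of the query text.
    return hashlib.sha1(sql.encode("utf-8")).hexdigest()


def keys_for(cache, prefix):
    # With readable keys, finding everything cached for one record
    # is a simple prefix match.
    return sorted(k for k in cache if k.startswith(prefix))


# Illustrative query only; the table and column names are assumptions.
sql = "SELECT * FROM comments WHERE user_id = 1 ORDER BY created_at DESC LIMIT 10"

print(readable_key("User", 1, "Comments", "last 10"))  # User[1]->Comments(last 10)
print(hashed_key(sql))  # 40 hex characters that say nothing about user 1
```

With the readable form, something like `keys_for(cache, "User[1]->")` lists every entry that belongs to that user. With the hashed form you would have to rebuild the exact query string, character for character, just to find the key.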

Just a tip.