When Amazon's CloudFront (CF) was announced, we realized that we could move these kinds of static files to CF as a way to reduce our bandwidth usage and maybe get a performance improvement. If you aren't familiar with it, CF is an Amazon service that fronts their S3 storage system to provide worldwide access and distribution over the 'Net. It's not quite a content delivery network (CDN), but for many companies it's a workable substitute for one.
CloudFront works by fronting S3. Thus, to get something on CloudFront, you upload it to an S3 bucket. You configure CloudFront to point to an S3 bucket and then anything you put in that bucket is available through CloudFront.
Because CloudFront is a simple pointer, if you will, to an S3 bucket, there's no way to tell CF when an item has changed, and naturally Amazon doesn't want to check the bucket on every request. CF just happily serves up whatever is in its cache until the object expires. The shortest expiration time you can set on a CF object is 24 hours.
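To make the staleness problem concrete, here is a toy model of a cache with a fixed 24-hour expiration. It is only a sketch: the `EdgeCache` class, the `library.js` key, and the dictionary standing in for the S3 bucket are all made up for illustration and have nothing to do with how CloudFront is actually implemented.

```python
class EdgeCache:
    """Toy model of a cache that refetches an object only after its TTL expires."""

    def __init__(self, origin, ttl_seconds=24 * 60 * 60):
        self.origin = origin            # dict standing in for the S3 bucket (assumption)
        self.ttl_seconds = ttl_seconds  # 24 hours, the minimum mentioned above
        self._cache = {}                # key -> (body, time it was fetched)

    def get(self, key, now):
        cached = self._cache.get(key)
        if cached is not None:
            body, fetched_at = cached
            if now - fetched_at < self.ttl_seconds:
                return body             # still fresh: the origin is never consulted
        body = self.origin[key]         # expired or never seen: go back to the origin
        self._cache[key] = (body, now)
        return body


bucket = {"library.js": "v1"}           # hypothetical file name and contents
edge = EdgeCache(bucket)

first = edge.get("library.js", now=0)
bucket["library.js"] = "v2"             # we push a fix to the bucket
one_hour_later = edge.get("library.js", now=60 * 60)
after_expiry = edge.get("library.js", now=24 * 60 * 60)
```

An hour after the fix is uploaded, the cache is still handing out the old copy. Only the request made after the 24 hours are up sees the new one.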
Many items you might serve from CloudFront never, or rarely, change: pictures, for example. In Kynetx's case, the files do change from time to time as we update the library. While you could decide to just live with the delay, that's not acceptable if you accidentally push a faulty library out. Can you imagine having this conversation? "Gee, we're sorry, Mr. Customer. We know it's broken. If you'll wait 24 hours, we'll try again."
The common solution seems to be timestamping the filenames so that you just create a new file to serve from CF each time. When you also control the reference to that file, that's a great solution. We don't. The reference is controlled by our customers, and telling them all to update their systems each time we want to push a new file is untenable.
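The timestamping idea is simple enough to sketch in a few lines. The naming scheme, the `timestamped_name` helper, the file path, and the date below are all assumptions made up for illustration. Any scheme that produces a fresh name for each release would do.

```python
from datetime import datetime, timezone
import posixpath


def timestamped_name(filename, when):
    """Insert a UTC timestamp before the extension so each release gets a new name."""
    stem, ext = posixpath.splitext(filename)
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{stem}-{stamp}{ext}"


# Hypothetical release time and path, chosen only for the example.
release = datetime(2009, 3, 1, 12, 30, 0, tzinfo=timezone.utc)
name = timestamped_name("js/library.js", release)
# name == "js/library-20090301123000.js"
```

Because the cache has never seen the new name, the first request for it has to go back to the bucket, so the old cached copy stops mattering.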
You know the saying: "there's no Computer Science problem that can't be solved with a layer of indirection." In homage to that, our solution is to redirect from a static URL to the timestamped filename. That's not ideal, but it works. The downside is the other half of that homily: "there's no performance problem that can't be solved by eliminating a layer of indirection." Still, our measurements show that the price we're paying isn't too high.
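Here is a minimal sketch of what such a redirect might look like, written as a plain function rather than for any particular web server. The host name `cdn.example.com`, the paths, the choice of a 302 status, and the `Cache-Control` header are all assumptions for the sake of the example, not a description of our production setup.

```python
# Maps the stable URL path customers embed to the current timestamped file.
# Host and file names are hypothetical.
CURRENT_RELEASE = {
    "/js/library.js": "https://cdn.example.com/js/library-20090301123000.js",
}


def redirect_for(path, current=CURRENT_RELEASE):
    """Return (status, headers) for a request to a stable path."""
    target = current.get(path)
    if target is None:
        return 404, {}
    # A temporary redirect, so clients keep asking us where the file lives now.
    return 302, {"Location": target, "Cache-Control": "no-cache"}


status, headers = redirect_for("/js/library.js")

# Rolling out (or rolling back) a release is just a change to the mapping.
CURRENT_RELEASE["/js/library.js"] = "https://cdn.example.com/js/library-20090302090000.js"
status_after, headers_after = redirect_for("/js/library.js")
```

The customer's reference never changes. The extra round trip for the redirect is the layer of indirection we pay for.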
We put this change in place and tested it for many weeks and then rolled it out to all our customers. So what do we get? Look at this graph showing the dramatic drop in our outbound traffic when we put the change in place:
I did a workup comparing what we'll pay through Amazon and what we were paying for our network traffic and I'm convinced the savings will be nearly as dramatic as the preceding graph.