Nobody likes link rot. Link rot breaks the Web. But in the age of Twitter, short URLs are a necessity. I think the best way to keep my short URLs from rotting is to take charge of the process myself. This post shows how I built a simple, reliable URL shortening service that runs under my control.
I've been wanting to build a custom URL shortener for myself for a number of reasons. Partly, it's fun to think about. Partly, it's about control. When I use a URL shortener like bit.ly, I'm at their mercy. If they go away (like a number of shorteners have), the links break. Link rot is an ugly thing. It breaks the Web.
How to do it? First, you need a short domain name. I registered wnd.li for this purpose (get it: short for windley). I wanted wnd.ly, but they no longer register three-letter domains in the .ly TLD to organizations outside of Libya.
The second step is to set up the shortener. There are a lot of little scripts out there, but Dave Winer pointed me at a way to do it that is both simple and reliable. One of the reasons why URL shorteners can be unreliable is that most of them are dynamic, database-driven applications. If someone isn't maintaining the database, the shortened links stop working. I've got plenty of services I started on my server at one time or another and no longer maintain.
The method Dave highlights (which he learned from Joe Moreno) is to simply write out files with a meta-refresh to the long URL. The name of the file is the short code. I built a test version at wnd.li/liveweb.
Setting it up starts with configuring Apache to serve the virtual domain:
<VirtualHost 220.127.116.11>
    DocumentRoot <root>/wndli
    ServerName wnd.li
    DefaultType text/html
</VirtualHost>
The DefaultType specification is important because the files won't have extensions. Even without extensions, the server needs to understand that the files are HTML.
The file itself is also simple:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<META HTTP-EQUIV="Refresh" CONTENT="0;URL=/liveweb">
<meta name="robots" content="noindex"/>
<link rel="canonical" href="/liveweb"/>
</head>
<body></body>
</html>
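One detail worth noting about a file like this: the long URL lands inside HTML attributes, so characters such as & and quotes should be escaped before they are written out. Here is a minimal Python sketch of filling in the template, assuming a hypothetical helper named render_redirect_page and a template string that mirrors the file above. Neither name comes from the original setup.

```python
from html import escape

# Hypothetical template mirroring the redirect file shown above.
# {url} is replaced with the (escaped) long URL for each short link.
REDIRECT_TEMPLATE = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<meta http-equiv="Refresh" content="0;URL={url}">
<meta name="robots" content="noindex"/>
<link rel="canonical" href="{url}"/>
</head>
<body></body>
</html>
"""


def render_redirect_page(long_url):
    """Return the HTML for a redirect file pointing at long_url."""
    # Escape &, <, > and quotes so the URL cannot break out of the attributes.
    return REDIRECT_TEMPLATE.format(url=escape(long_url, quote=True))


if __name__ == "__main__":
    print(render_redirect_page("http://example.com/a?x=1&y=2"))
```

Browsers unescape attribute values before following the refresh, so the redirect still goes to the original URL.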
This is the magic: the short links are just files, so there's no database to maintain; nothing to stop working sometime in the future. I can easily copy them to another server somewhere and if my server goes away for whatever reason, just repoint the name to the other server and everything starts working again. Voila! This is what makes this so simple and so reliable. The shortened links don't depend on anything more than files and browsers.
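Because the files are the whole data store, an inventory of every short link can be rebuilt just by reading the directory. The following is a rough Python sketch of that idea, not part of the original setup: RefreshFinder and list_links are hypothetical names, and it assumes every file in the directory is a redirect page of the form shown above.

```python
from html.parser import HTMLParser
from pathlib import Path


class RefreshFinder(HTMLParser):
    """Remember the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        a = dict(attrs)
        is_refresh = (a.get("http-equiv") or "").lower() == "refresh"
        if tag == "meta" and is_refresh and self.target is None:
            # The content looks like "0;URL=http://..." or "0;http://..."
            _, _, rest = (a.get("content") or "").partition(";")
            rest = rest.strip()
            if rest.lower().startswith("url="):
                rest = rest[4:]
            self.target = rest


def list_links(root):
    """Map each short code (file name) under root to its long URL."""
    links = {}
    for path in sorted(Path(root).iterdir()):
        if path.is_file():
            finder = RefreshFinder()
            finder.feed(path.read_text(encoding="utf-8"))
            if finder.target:
                links[path.name] = finder.target
    return links


if __name__ == "__main__":
    import sys

    # Usage: python list_links.py <directory holding the short-link files>
    for code, target in list_links(sys.argv[1] if len(sys.argv) > 1 else ".").items():
        print(code, "->", target)
```

Running something like this against a copy of the directory on a second server is a cheap way to confirm that the copy is complete.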
The other side, creating the short links, is also pretty easy. All it takes is a script that takes a long URL, creates a code, and writes out a file like the one shown above from a template. Here's a simple Perl script that does that using the Algorithm::URL::Shorten module:
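#!/usr/bin/perl -w
use strict;
use Algorithm::URL::Shorten qw(shorten_url);

my $root        = "/tmp";            # directory where the short-link files are written
my $shortdomain = "http://wnd.li/";

# The long URL is the first command-line argument
my $url = $ARGV[0] or die "usage: $0 <long-url>\n";

# Assumption: shorten_url returns a reference to a list of candidate codes;
# use the first one as the file name
my $shorts = shorten_url($url);
my $name   = $shorts->[0];

my $file_content = <<END;
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<META HTTP-EQUIV="Refresh" CONTENT="0;URL=$url">
<meta name="robots" content="noindex"/>
<link rel="canonical" href="$url"/>
</head>
<body>
<p>
You are being redirected to <a href="$url">$url</a>
</p>
</body>
</html>
END

# Write the redirect page to a file named after the short code
open(my $fh, '>', "$root/$name") or die "Cannot write $root/$name: $!\n";
print $fh $file_content;
close($fh);

print <<EOF;
Your short code for $url is $shortdomain/$name
EOF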
I've simplified this somewhat (for example, it doesn't look for collisions), but this gives you the idea. This could be run at the command line, but making it into a Web app wouldn't be hard. I'll eventually do that so it's available to me anywhere and create a bookmarklet to call it on any Web page I happen to be looking at. Then I'll have a fully functional URL shortener that is simple, reliable, and under my control. FTW!
Here's an example of a URL I shortened using this program, stored to my server: http://wnd.li//KPXDOP
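The Perl script above skips collision checking. One way to add it, sketched here in Python under stated assumptions, is to derive several candidate codes per URL and take the first one whose file doesn't already exist. The candidate_codes function below is an illustrative MD5-based scheme with an assumed alphabet. It is not the algorithm Algorithm::URL::Shorten uses, and write_short_link is a hypothetical name.

```python
import hashlib
import tempfile
from pathlib import Path

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # assumed alphabet for codes


def candidate_codes(url, length=6):
    """Return four deterministic candidate codes derived from the URL's MD5 digest."""
    digest = hashlib.md5(url.encode("utf-8")).digest()  # always 16 bytes
    codes = []
    for i in range(0, 16, 4):
        n = int.from_bytes(digest[i:i + 4], "big")  # one 32-bit slice per candidate
        code = ""
        for _ in range(length):
            n, r = divmod(n, len(ALPHABET))
            code += ALPHABET[r]
        codes.append(code)
    return codes


def write_short_link(root, url, page):
    """Store page under the first usable candidate code and return that code."""
    for code in candidate_codes(url):
        path = Path(root) / code
        try:
            # Mode "x" raises FileExistsError instead of overwriting an existing link.
            with open(path, "x", encoding="utf-8") as fh:
                fh.write(page)
            return code
        except FileExistsError:
            # Identical content means this URL was shortened before: reuse the code.
            if path.read_text(encoding="utf-8") == page:
                return code
            # Otherwise the code belongs to another URL; try the next candidate.
    raise RuntimeError("all candidate codes are taken for " + url)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        page = "<html>placeholder redirect page</html>"
        first = write_short_link(root, "http://example.com/long/path", page)
        again = write_short_link(root, "http://example.com/long/path", page)
        print(first, again)  # the same code both times
```

Opening the file in exclusive-create mode keeps the check and the write in a single step, so an existing short link is never silently replaced.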
Bonus: if you want statistics out of your shortener, it's pretty easy to add Google Analytics to the file template I show above.