Learning from Web 1.0
This article appeared as my column for Connect Magazine in January 2006.
People are beginning to refer to the time before the so-called dot-com crash as the Web 1.0 era. Between 1994 and 2001, we went through five versions of HTML, not to mention browsers. Critical technologies, like client-side scripting and application servers, matured. And distributed computing went from something done by a few data-center denizens to a mass-market phenomenon (think of Web hosting and even GeoCities). Some thought that the crash in 2001 was the end, but in fact it was merely a retrenchment. We're now firmly in Web 2.0 territory, and the innovation is just as strong and the ideas just as compelling in this part of the game as they were in the first.
Adam Bosworth was one of the people who made Web 1.0 happen, first as leader of the team at Microsoft that developed Internet Explorer's HTML engine and then at BEA as SVP of Advanced Development. Now he's a VP of Engineering at Google. Adam has been talking and writing about the unintuitive lessons we should learn from the Web. They're worth considering as you engineer your next Web application.
Rule 1: Simple, relaxed, sloppily extensible text formats and protocols often work better than complex and efficient binary ones. It's not hard to see what Adam's getting at here. The world had plenty of distributed computing protocols before HTTP came along, but rather than being a handicap, the simplicity of HTTP was its key strength. HTML is another good example from the Web 1.0 era. HTML is sloppy and imprecise, but it was that very flexibility, together with the forgiving nature of browsers, that meant almost anyone could build a Web page. This brings to mind Postel's law: "Be liberal in what you accept and conservative in what you send."
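To make the principle concrete, here's a minimal Python sketch of Postel's law applied to an invented "Key: Value" text format (the format and function names are mine, not Adam's): the parser is liberal about case, stray whitespace, and blank lines, while the emitter always sends one canonical form.

```python
# Liberal in what you accept: tolerate odd case, stray whitespace,
# and blank lines when parsing a simple "Key: Value" text format.
def parse_headers(text: str) -> dict:
    headers = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                        # quietly skip blank lines
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return headers

# Conservative in what you send: always emit one canonical shape.
def emit_headers(headers: dict) -> str:
    return "\n".join(f"{k.title()}: {v}" for k, v in sorted(headers.items()))

sloppy = "  CONTENT-type :text/html \n\nhost:example.com"
print(emit_headers(parse_headers(sloppy)))
# Content-Type: text/html
# Host: example.com
```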
Rule 2: It's worth making things simple enough that one can harness Moore's law in parallel. Systems should be able to scale in multiple dimensions simultaneously by taking advantage of lots of cheap hardware (and the still cheaper hardware that will be available tomorrow). DNS is perhaps the best example of this. The architecture of DNS allows it to be completely distributed on millions of separate boxes and yet transparently service a tremendous number of requests every day. Google is another example: it's specifically architected to run on lots of cheap hardware (I've seen reports that put the number at over 100,000 servers). Applying this rule requires real thought about your data repositories, to take just one example. Relational databases are notoriously poor performers in this regard.
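Here's one way the idea shows up in practice, sketched in Python: partition data across a fleet of cheap boxes by hashing the key. The shard names are made up, and a production system would add replication and rebalancing (consistent hashing, for instance), but the core move, trading one big expensive machine for many small cheap ones, really is this simple.

```python
# A hedged sketch of scaling out: hash the key, pick a shard.
# The shard hostnames are invented for illustration.
import hashlib

SHARDS = [f"db{n}.example.com" for n in range(16)]    # 16 cheap servers

def shard_for(key: str) -> str:
    digest = hashlib.md5(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]      # stable key -> shard mapping

print(shard_for("user:42"))     # every lookup for this key hits the same box
print(shard_for("user:1138"))   # ...while different keys spread across the fleet
```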
Rule 3: It is acceptable to be stale much of the time. It's not always necessary to have the latest data. Search engines are always out of date because people are constantly creating and updating Web pages that won't be indexed until the next time the spider stops by. Even so, search engine results are useful. In fact, most of the data on the Web is stale most of the time, and the Web still rocks. Allowing for stale data increases system scalability, decreases costs, and promotes loose coupling.
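A time-to-live cache is the simplest embodiment of this rule. In the sketch below (the class and its API are illustrative, not from any particular system), a possibly stale answer is served until it's older than the TTL, and only then do we pay the cost of refreshing from the authoritative source.

```python
# A small sketch of designing for staleness: cache answers and serve
# them until a time-to-live expires. fetch_from_origin is a stand-in
# for whatever slow, authoritative source you have.
import time

class StaleOKCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}                       # key -> (value, fetched_at)

    def get(self, key, fetch_from_origin):
        value, fetched_at = self.store.get(key, (None, 0.0))
        if time.time() - fetched_at < self.ttl:
            return value                      # possibly stale, and that's fine
        value = fetch_from_origin(key)        # refresh only when too old
        self.store[key] = (value, time.time())
        return value

cache = StaleOKCache(ttl_seconds=300)         # tolerate five minutes of staleness
page = cache.get("/index.html", lambda k: f"<html>fresh copy of {k}</html>")
```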
Rule 4: The wisdom of crowds works amazingly well. The quintessential example of the wisdom of crowds is Google's PageRank algorithm. PageRank harnesses the work of millions of people to make search results better. Every time you create a link on the Web you're making Google better. Del.icio.us (which I discussed in June's column) and eBay are other examples of Web sites that are built on the data their customers provide. Not only do architectures of participation, as these kinds of designs are called, make your Web site better, but they also make the user experience more fulfilling. If the only thing users can do on your Web site is click the "buy" button, you've got your work cut out for you.
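For the curious, here's a toy version of the PageRank idea in Python: a page's rank is built from the ranks of the pages that link to it, so every link is a vote. The damping factor, iteration count, and four-page graph are illustrative choices; Google's production algorithm is, of course, considerably more involved.

```python
# A minimal PageRank sketch. The damping factor and iteration count
# are conventional illustrative choices; the link graph is made up.
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                  # a dangling page shares rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:                             # each link is a vote for its target
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(pagerank(graph))   # "c", the most linked-to page, ends up ranked highest
```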
Rule 5: People understand a graph composed of tree-like documents (HTML) related by links (URLs). Adam calls this lesson the most surprising of all. If you sat most people down for a discussion of data structures and graph theory, their eyes would glaze over, but give them HTML and URLs and all of a sudden they're experts. Returning to Rule 1, the hierarchies and graphs that people create are sloppy, but they're meaningful, and that's why people create them.
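The data model behind this rule is easy to sketch: each document is a tree, and the href attributes hanging off its nodes are edges that knit the trees into a graph. The classes and sample pages below are invented for illustration.

```python
# A hedged sketch of Rule 5's data model: documents are trees of
# elements, and hrefs are the edges that turn the trees into a graph.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Element:
    tag: str
    href: Optional[str] = None                # a link: an edge out of this tree
    children: List["Element"] = field(default_factory=list)

def links(root: Element) -> List[str]:
    """Walk one document's tree and collect the URLs pointing elsewhere."""
    found = [root.href] if root.href else []
    for child in root.children:
        found.extend(links(child))
    return found

home = Element("body", children=[
    Element("a", href="http://example.com/about"),
    Element("p", children=[Element("a", href="http://example.com/faq")]),
])
print(links(home))   # ['http://example.com/about', 'http://example.com/faq']
```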
Adam has applied these lessons to critiquing database technology to good effect. You can apply them to the engineering of your Web application as well. Too often we build systems that are inflexible, overly precise, scale poorly, and treat people like cattle. We can do better, and the Web has shown us, in flashing lights, that it works.
Phil Windley teaches Computer Science at Brigham Young University. He writes a blog on enterprise computing at http://www.windley.com. Contact him at phil@windley.com.