Alex Russell on Comet: Beyond AJAX (ETech 2006)


Alex Russell, who works at JotSpot and did the Dojo Toolkit for JavaScript, is talking about Comet and low-latency data to and from browsers (slides). The subtitle is "after AJAX." The goal is responsiveness. AJAX gives you half the answer. AJAX is about me. Social applications are driven by others--the multiuser web. How do we send the datagrams that users make to each other?

To any one user, the server represents the other users. Because the Web is a multiuser experience, single-interaction updates aren't enough. Users in the same "space" need live updates of their own changes and the changes others make. Updates to content affect available actions. Stale context may mean the wrong decision.

If the Web is a conversation, then stale context kills. When you create a page, or tag a picture, or something else, you're changing context. Conversation mediums are defined by latency, interrupt, and bandwidth. He gives some examples (in order of latency): snail-mail, email, IRC, SMS, IM, phone, and finally face-to-face.

Polite users use high-interruption mediums as infrequently as possible. Traditional wikis are fraught with usability issues.

Wikis are conversation enablers that are traditionally medium-to-high latency and not well suited to high volume changes. There are locking issues. AJAX allows more context to go stale. What is changing on the wiki as I edit? Who wants to break my lock? Have attachments been added? Is the text of the page itself changing?

Conversations are ordered events. Granular interfaces require granular events. Granular conversations are more immediate (IM vs. email). Social applications are event buses. Social web apps just batch changes today. There are no effective ways to "subscribe" to server events today. To fix the context, we need to syndicate the events.

Event broadcast requires synchronization. Comet is a technique for pushing data from the server. New term, but old tech. This is enabled by long-lived HTTP connections instead of polling. There are similarities to AJAX: no new plugins, plain-old HTTP, asynchronous, broad browser support.

Here are some examples of systems that use Comet: GMail, GTalk, JotLive, Renkoo, Meebo, cgi.irc, KnowNow, and others. Note that Comet isn't a framework or toolset; it's a concept, like AJAX. Alex is coining the term Comet so that it has a name.

How is Comet different from AJAX? In an AJAX application, the client drives the interaction. The problem is that context and manipulated content go stale at different times.

Comet applications fight lag by avoiding HTTP and TCP/IP set-up and tear-down: a single connection is re-used. But the big kicker for AJAX is polling latency, which Comet avoids. The big takeaway: transfer only the necessary data, exactly when it's most relevant.
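To put rough numbers on the polling-latency point, here is a back-of-the-envelope sketch. It is not from the talk. It assumes events arrive at a uniformly random moment within the polling interval and that the network delay is the same in both directions; the function names are made up for illustration.

```python
def mean_polling_delay(poll_interval_s: float, one_way_s: float) -> float:
    """Average time from a server-side event to the client seeing it, when polling.

    Assumption: the event lands at a uniformly random point between two polls,
    so it waits half an interval for the next request to arrive, and then the
    response still has to travel back to the client.
    """
    return poll_interval_s / 2 + one_way_s


def mean_push_delay(one_way_s: float) -> float:
    """With a connection already held open, the event only travels one way."""
    return one_way_s


# Example: polling every 10 seconds over a link with 50 ms one-way delay.
polling = mean_polling_delay(10.0, 0.05)
push = mean_push_delay(0.05)
```

Under these assumptions the gap between the two is simply half the polling interval, and it ignores the extra cost of setting up a fresh connection for each poll.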

There are two implementation techniques. The first is long-polling, where you reconnect after every datagram. This is simple to implement with XMLHttpRequest. Another method is to use multi-part XMLHttpRequest responses. This works differently on IE and Firefox and doesn't work on Safari. No known system does this portably today.
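The long-polling loop can be sketched without any network at all. The following is an in-process simulation using asyncio, not code from the talk: the class LongPollServer, its poll and publish methods, and the cursor scheme are all assumptions of mine. Each call to poll stands in for one HTTP request that the server holds open until it has something to say.

```python
import asyncio


class LongPollServer:
    """Hypothetical server: parks each request until a new event exists."""

    def __init__(self) -> None:
        self._events: list[str] = []
        self._changed = asyncio.Condition()

    async def publish(self, event: str) -> None:
        # Record the event and wake every parked request.
        async with self._changed:
            self._events.append(event)
            self._changed.notify_all()

    async def poll(self, cursor: int) -> tuple[int, str]:
        # One "request": wait until event number `cursor` exists, then answer.
        async with self._changed:
            await self._changed.wait_for(lambda: len(self._events) > cursor)
            return cursor + 1, self._events[cursor]


async def client(server: LongPollServer, count: int) -> list[str]:
    """Reconnect after every datagram until `count` events have arrived."""
    received: list[str] = []
    cursor = 0
    while len(received) < count:
        cursor, event = await server.poll(cursor)  # returns only when data exists
        received.append(event)
    return received


async def demo() -> list[str]:
    server = LongPollServer()
    listener = asyncio.create_task(client(server, 2))
    await asyncio.sleep(0)  # let the client park on its first poll
    await server.publish("page edited")
    await server.publish("attachment added")
    return await listener
```

The cursor is there so that an event published while the client is between requests is not lost; a real system would need some equivalent.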

Another technique can use what Alex calls a forever-frame. A forever-frame is an iframe or browser frame that receives script blocks and uses progressive rendering. This is highly portable and allows connections to subdomains. The connection only closes when there's an error or the connection recycles.
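On the server side, a forever-frame response is just a page that never finishes: one script block per event. Here is a minimal sketch of the generator a server might stream from. The callback name parent.onCometEvent and the JSON payload format are my assumptions, not something shown in the talk.

```python
import json
from typing import Iterable, Iterator


def forever_frame_chunks(
    events: Iterable[dict],
    callback: str = "parent.onCometEvent",  # hypothetical function in the host page
) -> Iterator[str]:
    """Yield the pieces of a never-ending HTML document, one script block per event."""
    yield "<html><body>"
    for event in events:
        # Escape "</" so a payload cannot close the script block early.
        payload = json.dumps(event).replace("</", "<\\/")
        yield f"<script>{callback}({payload});</script>\n"


chunks = list(forever_frame_chunks([{"user": "alex", "text": "hi"}]))
```

The browser runs each script block as it arrives, so the iframe's parent page gets called once per event. In a real deployment the events iterable would block until the next event is available instead of being a finished list.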

Most commodity Web servers won't cut it today. This is why the alternate subdomain idiom is important. You can run the main app off the primary Web servers and then do the continuous updates to a special server. The problem is that each connection takes a process or thread and there might be thousands of them. Comet can reduce load but not on your current Web infrastructure. Polling is a latency trade-off. Comet is an architectural complexity trade-off.

We need better event-based tools. Servers don't know about events as such. OSes have edge-triggered event IO mechanisms (epoll on Linux and kqueue on FreeBSD). At the network level, we need application environments: Perl's POE, Python's Twisted, Java's Jetty, and event_mpm from Apache (in 2.2). The good news is that the OSes can handle it.
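To show why these OS mechanisms matter for Comet, here is a small sketch of one thread watching several idle connections and only doing work for the one that has data. It uses Python's standard selectors module, which picks epoll on Linux and kqueue on the BSDs; as far as I know it uses them in level-triggered mode by default, not the edge-triggered mode mentioned in the talk. The socket pairs and user names are stand-ins for real client connections.

```python
import selectors
import socket


def drain_ready(selector: selectors.BaseSelector, timeout: float = 1.0) -> dict[str, bytes]:
    """Read once from every connection the OS reports as readable."""
    received: dict[str, bytes] = {}
    for key, _mask in selector.select(timeout):
        received[key.data] = key.fileobj.recv(4096)
    return received


def demo() -> dict[str, bytes]:
    selector = selectors.DefaultSelector()  # epoll, kqueue, or a fallback
    pairs = {}
    for name in ("alice", "bob", "carol"):  # hypothetical connected users
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ, data=name)
        pairs[name] = (server_side, client_side)

    pairs["bob"][1].sendall(b"typing...")  # only one of three connections is active
    try:
        return drain_ready(selector)
    finally:
        selector.close()
        for server_side, client_side in pairs.values():
            server_side.close()
            client_side.close()
```

The idle connections cost a file descriptor each but no thread or process, which is the property a Comet server needs when thousands of clients are mostly waiting.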

The news on the client isn't so good. Clients can't make more than two HTTP connections to any box/subdomain (per spec). Firefox may not adhere to this limit. IE is draconian. The JavaScript code needs to know about this and deal with it--right down to peering with other browser instances and managing connections.

There are some workarounds: multiplex events for multiple components over the same connections. We need message oriented middleware for the client. You can also use DNS hackery with wildcard DNS to increase the available subdomains.
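The multiplexing workaround amounts to wrapping every event in an envelope that names its destination and then fanning events out on arrival. A minimal sketch of that fan-out follows. The envelope format (a JSON object with "channel" and "data" keys) and the class name are assumptions for illustration; in a browser this logic would live in JavaScript, and Python is used here only to show the shape.

```python
import json
from collections import defaultdict
from typing import Any, Callable


class EventDemultiplexer:
    """Routes messages from one shared connection to per-component handlers."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[Any], None]) -> None:
        self._handlers[channel].append(handler)

    def dispatch(self, raw_message: str) -> int:
        """Deliver one message; return how many handlers received it."""
        message = json.loads(raw_message)
        handlers = self._handlers.get(message["channel"], [])
        for handler in handlers:
            handler(message["data"])
        return len(handlers)


# Two page components share a single connection's message stream.
demux = EventDemultiplexer()
chat_log: list = []
presence_log: list = []
demux.subscribe("chat", chat_log.append)
demux.subscribe("presence", presence_log.append)
demux.dispatch('{"channel": "chat", "data": "hello"}')
demux.dispatch('{"channel": "presence", "data": {"user": "alex", "online": true}}')
```

With this in place the number of components on the page no longer affects the number of connections, which keeps the client under the two-connection limit.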

Is Comet good for users? If users are all trying to do the same things at the same place to some piece of data, then you need it. Can presence data improve the conversation? If the data can go stale and no one notices, then you don't need it.

Some early lessons: work with interaction designers. Learn from desktop apps--they have the same design problems. Be consistent. Let users know why the data is changing and who changed it. Communicate connection failures clearly. Push data updates, not functionality changes.

Update: Randy Gordon pointed me to this work on an architecture called SEDA, which stands for "staged event-driven architecture." I haven't read through it yet to get a handle on it, but I didn't want to lose it--attaching it here may be useful to you and will definitely be useful to me.