I'm at the session given by Dave Sifry, creator of Technorati.com.
- Over 1.6 million sources tracked
- 11,000 new weblogs created every day, up from 4-5K per day in March 2003.
- About 35% of weblogs are abandoned (no posts in 3 months)
- Over 100,000 updates per day.
- Median time from weblog post to live index (on Technorati) is 7 minutes. This makes the engine usable for tracking weblog conversations.
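The "abandoned" figure depends on where you draw the cutoff, so here is a minimal sketch of how such a share could be computed. Approximating "3 months" as 90 days, the function names, and the sample dates are all my assumptions; this is not Technorati's actual method.

```python
from datetime import datetime, timedelta

# Assumption: "no posts in 3 months" is approximated as 90 days.
ABANDONED_AFTER = timedelta(days=90)

def is_abandoned(last_post, now):
    """True when the most recent post is older than the cutoff."""
    return now - last_post > ABANDONED_AFTER

def abandoned_share(last_posts, now):
    """Fraction of weblogs whose last post is older than the cutoff."""
    if not last_posts:
        return 0.0
    return sum(is_abandoned(p, now) for p in last_posts) / len(last_posts)

# Made-up sample data: the last-post date of four hypothetical weblogs.
now = datetime(2004, 1, 1)
last_posts = [
    datetime(2003, 12, 25),
    datetime(2003, 6, 1),
    datetime(2003, 11, 1),
    datetime(2003, 1, 1),
]
share = abandoned_share(last_posts, now)
```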
The nice thing about Technorati is that it tracks deep links. Almost no one links to www.amazon.com. They link to some specific page on Amazon (which, BTW, Amazon has enabled by having a RESTful architecture).
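To make the deep-link idea concrete, here is a rough sketch of how one might tell a link to a site's front page from a link to a specific page. The `is_deep_link` helper and the example.com URLs are hypothetical, and I have no idea whether Technorati does it this way.

```python
from urllib.parse import urlparse

def is_deep_link(url):
    """True when the URL points below the site root (has a path or a query)."""
    parts = urlparse(url)
    return parts.path not in ("", "/") or bool(parts.query)

# Illustrative URLs only.
links = [
    "http://www.example.com",
    "http://www.example.com/",
    "http://www.example.com/products/123",
    "http://www.example.com/?item=123",
]
deep = [u for u in links if is_deep_link(u)]
```

Only the last two URLs end up in `deep`; the first two point at the site root.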
Dave points to a thing he hacked together last night to point to products. Dave asks for an experiment: he asks the audience to link to the product page and then periodically check the cosmos for the page to see when their links appear on Technorati.
Dave talks about the Power Law of Blogging and shows his data. The data shows that when you have fair access to media, there will be a relatively small number of things that are linked to by a lot of people. When there were only three networks, they were all evenly distributed. When there are hundreds, you get a power law graph.
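One common explanation for this kind of distribution is a rich-get-richer process: each new link either goes to a brand-new site or to an existing site chosen in proportion to the links it already has. The toy simulation below is my own illustration of that idea, not Dave's data or method; the function name, the 10% new-site probability, and the link count are arbitrary assumptions.

```python
import random
from collections import Counter

def simulate_links(num_links=5000, new_site_prob=0.1, seed=0):
    """Return inbound-link counts per site, sorted from most to least linked."""
    rng = random.Random(seed)
    link_targets = [0]  # the first link goes to site 0
    num_sites = 1
    for _ in range(num_links - 1):
        if rng.random() < new_site_prob:
            # A new site appears and receives its first link.
            target = num_sites
            num_sites += 1
        else:
            # Picking a past link uniformly picks its site in proportion
            # to the number of links that site already has.
            target = rng.choice(link_targets)
        link_targets.append(target)
    counts = Counter(link_targets)
    return sorted(counts.values(), reverse=True)

counts = simulate_links()
top_share = sum(counts[:10]) / sum(counts)  # share of links held by the top 10 sites
```

Plotting `counts` against rank on log-log axes is the usual way to eyeball whether the result looks like a power law.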
Technorati as Platform. Dave's commitment:
- XML API for all functionality based on a RESTful architecture that is free for non-commercial use.
- Today: Link Cosmos, keyword search, top 100, breaking news, and current events.
- Perl, Python, Radio, C#, and ASP interfaces (see developers.technorati.com)
There are IM/SMS notifications, Movable Type plugins, threading in weblog readers, and a high-priority indexer. Some application directions: open reviews (RVW format), keyword and Cosmos filters, discovery and filtering of subscription lists, vote links (differentiating between links that are endorsements and links that are not), and geographic search and indexing.
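As a sketch of what vote links could look like to an indexer, the snippet below sorts the links in a post by a marker on the anchor tag. It assumes the marker is a `rev` attribute with values such as `vote-for` and `vote-against`, which is my recollection of the VoteLinks proposal rather than anything shown in the session; the class name and sample HTML are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: votes are expressed as rev="vote-for" and similar on <a> tags.
VOTE_VALUES = {"vote-for", "vote-against", "vote-abstain"}

class VoteLinkParser(HTMLParser):
    """Collects (href, vote) pairs; links without a vote marker are 'unmarked'."""

    def __init__(self):
        super().__init__()
        self.votes = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rev may hold several space-separated values, or be missing entirely.
        rev = (attr.get("rev") or "").lower().split()
        vote = next((v for v in rev if v in VOTE_VALUES), "unmarked")
        self.votes.append((href, vote))

sample = (
    '<a rev="vote-for" href="http://example.com/a">a</a> '
    '<a href="http://example.com/b">b</a> '
    '<a rev="vote-against" href="http://example.com/c">c</a>'
)
parser = VoteLinkParser()
parser.feed(sample)
```

After `feed`, `parser.votes` lists each link with its vote, so endorsements and non-endorsements can be counted separately.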