This afternoon, I was torn between the session on botnets and one on Amazon's SimpleDB by Mike Culver and Jay Ridgeway. I chose the latter.
The goal is a durable, flexible datastore at a cheap price: $0.14 per machine hour, $0.10/GB into the cloud and $0.18/GB out.
The API call list is short. Domains are used to partition data; it helps to think of them as tables. To add something to a domain, you use this syntax:
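To get a feel for what those rates add up to, here is a rough back-of-the-envelope calculation. It only covers the three prices quoted in the session, and the workload numbers (100 machine hours, 5 GB in, 20 GB out) are made up for illustration, not anything the speakers gave.

```python
# Rates as quoted in the session (USD).
MACHINE_HOUR_USD = 0.14
TRANSFER_IN_USD_PER_GB = 0.10
TRANSFER_OUT_USD_PER_GB = 0.18


def estimate_cost(machine_hours, gb_in, gb_out):
    """Sum the three usage charges quoted in the talk; nothing else is included."""
    return (
        machine_hours * MACHINE_HOUR_USD
        + gb_in * TRANSFER_IN_USD_PER_GB
        + gb_out * TRANSFER_OUT_USD_PER_GB
    )


# Hypothetical workload, chosen only to make the arithmetic concrete.
monthly = estimate_cost(machine_hours=100, gb_in=5, gb_out=20)
print(f"${monthly:.2f}")  # prints $18.10
```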
PUT (item, 123), (description, Sweater), (color, Red), (color, Blue)
The first name-value tuple is the name of the row and needs to be unique. The remaining tuples are attributes, and a name can be repeated to represent an attribute with multiple values. There are no datatypes; everything is a string.
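The data model is easier to see in code. What follows is a toy in-memory sketch of how I understand it, not Amazon's API: the `Domain` class and its `put` and `get` methods are names I made up. It assumes each item maps attribute names to a set of string values.

```python
from collections import defaultdict


class Domain:
    """Toy in-memory stand-in for a SimpleDB domain (hypothetical, not Amazon's API)."""

    def __init__(self, name):
        self.name = name
        # item name -> {attribute name -> set of string values}
        self.items = {}

    def put(self, item_name, pairs):
        """Add (name, value) pairs to an item; repeated names accumulate values."""
        attrs = self.items.setdefault(str(item_name), defaultdict(set))
        for name, value in pairs:
            # No datatypes: every value is stored as a string.
            attrs[name].add(str(value))

    def get(self, item_name):
        """Return the item's attributes as {name: sorted list of values}."""
        attrs = self.items.get(str(item_name), {})
        return {name: sorted(values) for name, values in attrs.items()}


store = Domain("MyStore")
store.put(123, [("description", "Sweater"), ("color", "Red"), ("color", "Blue")])
print(store.get(123))
```

Note that the item name 123 ends up as the string "123", and `color` holds both values rather than the second overwriting the first.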
A query looks like:
Domain = MyStore ['description' = 'Sweater']
Note that this isn't SQL. :-)
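Here is my guess at what an equality query like that amounts to, again as a plain Python sketch over a dictionary rather than a call to the real service. The `query_equals` helper and the second item (456, a scarf) are invented for the example. The one thing worth noticing is that an item matches if any of its values for the attribute equals the one you asked for.

```python
# Hypothetical domain contents: item name -> {attribute name -> set of string values}.
items = {
    "123": {"description": {"Sweater"}, "color": {"Red", "Blue"}},
    "456": {"description": {"Scarf"}, "color": {"Red"}},
}


def query_equals(items, attribute, value):
    """Return the names of items where any value of `attribute` equals `value`."""
    return sorted(
        name
        for name, attrs in items.items()
        if value in attrs.get(attribute, set())
    )


matches = query_equals(items, "description", "Sweater")
red = query_equals(items, "color", "Red")
print(matches)  # ['123']
print(red)      # ['123', '456']
```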
Jay Ridgeway from Nextumi took the mic to talk about their experience using SimpleDB to implement ShareThis. They've made heavy use of SimpleDB. He concluded with the following list of downsides and upsides. On the downside:
- limited features
- minimal toolset and documentation
- no experience in house
- high switching cost
On the upside:
- zero software cost
- minimal staff costs
- low barrier to development
- responsive and reliable
- simple, pragmatic solution for a complex problem.
Nextumi does maintain a copy of the raw data in case Amazon ceases to exist for some reason, but using it would obviously require some redesign of their site. I wonder if anyone has created the SimpleDB API on top of BerkeleyDB or MySQL? That would be handy.
SimpleDB doesn't handle binary data well. The best thing is to put binary data in S3 and put a reference to it in SimpleDB.
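The pattern is simple enough to sketch. Below, two dictionaries stand in for an S3 bucket and a SimpleDB domain so the example runs on its own. The function names, the `image_ref` attribute, the `example-bucket` name and the choice of a SHA-256 hash as the object key are all my assumptions, not something from the talk.

```python
import hashlib

blob_store = {}  # stand-in for an S3 bucket: object key -> bytes
domain = {}      # stand-in for a SimpleDB domain: item name -> {attribute -> set of strings}


def put_item_with_blob(item_name, attributes, blob, bucket="example-bucket"):
    """Store the bytes in the blob store and keep only a string reference on the item."""
    key = hashlib.sha256(blob).hexdigest()  # content hash used as the object key
    blob_store[key] = blob
    attrs = domain.setdefault(item_name, {})
    for name, value in attributes:
        attrs.setdefault(name, set()).add(str(value))
    # The item only ever holds a string pointing at the binary data.
    attrs.setdefault("image_ref", set()).add(f"s3://{bucket}/{key}")
    return key


def load_blob(item_name):
    """Follow the item's reference back to the stored bytes."""
    ref = next(iter(domain[item_name]["image_ref"]))
    key = ref.rsplit("/", 1)[1]
    return blob_store[key]


key = put_item_with_blob("123", [("description", "Sweater")], b"\x89PNG fake image bytes")
print(domain["123"]["image_ref"])
```

The item stays all strings, which is what SimpleDB wants, and the bytes live somewhere built for them.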