Personal Data, Freedom, and Value Creation

Phil Windley // Thu Apr 29 14:31:00 2010 // data economics pds privacy

Image by mkrigsman via Flickr

Data is big business. Whether it's demographics or a FICO score, people know things about you and sell that information to people who want to know about you. If you've read my blog post on the Power of Pull (or listened to the podcast), then you know that I believe we haven't even scratched the surface of where data exchange is going. As more and more of our lives go online, there will be more and more semantic, structured data available about every aspect of our lives.

For example, your golf clubs will automatically register your strokes, power, and so on. Your toilet will automatically analyze your waste and register the results. Your purchases (even those made offline) will be collected and collated. Is this kind of world scary or not? It depends. If all of this is being done without your involvement, permission, or visibility, then it can be very scary. On the other hand, if this kind of personal data ends up in your very own personal data store and you (or agents acting on your behalf) can decide when, where, and for what purpose it's used, then not so much. In fact, it could be the source of increased wealth and quality of life for everyone.

Whether or not it does result in increased wealth depends a lot on how it's done. We could create personal data storage and interchange in a couple of different ways. I like to think of these models as railroads and highways.

When they came about in the mid-1800s, the railroads were a better way of transporting goods and people than the previously existing methods by several orders of magnitude. Suddenly a trip across the continent that previously took months and involved great personal peril could be accomplished in days with relative ease. Whole businesses came into being based on the ability to ship goods around cheaply (remember Sears Roebuck?).

But railroads have disadvantages. For one, they require huge capital investments to create. Consequently, the owners get to make the decisions about how they're used, and most of the wealth that is generated ends up concentrated in relatively few hands. That investment in infrastructure also means that new and better ideas for how to move things around are expensive to try, and consequently most of them end up as only dreams.

Railroads stand in contrast to the US Interstate Highway System. Sure, it is held in common and thus represents a shared investment, but that's not really the point. We could nationalize the railroads, but that wouldn't mitigate the disadvantages I describe above. The interstates have created a situation where individuals can, with a relatively small investment, start a trucking or busing company, try out new business models, and create wealth for themselves, their investors, and their families. Moreover, the interstates enable great personal freedom for people to travel where they want, when they want, in relative anonymity.

There are several important differences between railroads and highways that enable these benefits:

Highways are distributed and require almost no centralized control.
Anyone who meets certain minimal requirements (mostly licensing and safety) can use them.
Standards ensure that a truck built in Michigan can be used on highways in Utah and that I can buy fuel, tires and other consumables with little worry about interoperability.

We need to ensure that the personal data services that get set up over the next several years are more like highways than railroads. I'm not suggesting a huge public works project, since the Internet serves for transport. What matters more is adhering to the three ideas listed above:

The PDS infrastructure should be designed in a way that requires minimal or no centralized control. We may need naming and discovery, and that typically involves a few organizations having disproportionate control over those aspects, but the Internet has taught us a lot about governance in those areas.
The design ought to be such that anyone can implement all or part of the service on their own and have it interoperate with others. This implies that no provider in the ecosystem has a lock-in advantage over the others. FedEx and UPS don't have lock-in based on where their trucks go; they compete on terms and service, not by capturing customers. There shouldn't be "approval" by a central authority, although there will certainly be (and already are) regulations and trust frameworks.
The standards ought to be open and non-proprietary. Users should be able to swap out any piece of the service with another provider of the same type of service without retooling.
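
To make the "swap out any piece without retooling" idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `PersonalDataStore` interface, the two providers, and the `log_purchase` function are hypothetical names, not part of any actual PDS standard or the Personal Data Exchange work. The point is only that code written against a small, open interface keeps working when the provider behind it changes.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class PersonalDataStore(Protocol):
    """Hypothetical open interface; any provider may implement it."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """One illustrative provider: keeps data in a dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class JsonFileStore:
    """A second illustrative provider: keeps data in a JSON file."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def _load(self) -> dict:
        # A missing file simply means nothing has been stored yet.
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text())

    def put(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        self._path.write_text(json.dumps(data))

    def get(self, key: str) -> Optional[str]:
        return self._load().get(key)


def log_purchase(store: PersonalDataStore, item: str) -> None:
    """An application that depends only on the interface, not the provider."""
    store.put("last_purchase", item)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        providers = [InMemoryStore(), JsonFileStore(Path(tmp) / "pds.json")]
        for store in providers:
            log_purchase(store, "golf balls")
            print(type(store).__name__, store.get("last_purchase"))
```

Neither provider needs approval from, or knowledge of, the other; the application competes for nobody's loyalty and works with both. A real ecosystem would of course also need the naming, discovery, and permission pieces discussed above, which this sketch leaves out.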

I think that personal data services have the promise of creating great wealth and improving quality of life. However, if they are done wrong, there will be huge pushback and those benefits will be lost or delayed for years.

When I was CIO for Utah, there were 27 or 28 different databases that kept track of children's health data. The problem was that there was no good way to link a record in one with a record in another. Part of that was technical, but much of it was a result of paranoia about "big brother" having too much control of data about people. As a result, there was no easy way for a receptionist at the children's immunization clinic to know that Johnny hadn't had his free childhood ear exam and recommend an appointment. If even one child grew up deaf because of that, we failed.

Similarly, I'm concerned that many of the benefits that could accrue from personal data services could be lost if they are implemented in ways that concentrate the benefits or, worse, create backdraft that delays or destroys them. If that happens, we'll still have lots of data being generated and collected, but individuals not only won't see the benefit but may see their quality of life diminish.

I'm working with a group of people trying to define charters and standards to create just such an open personal data service ecosystem, called the Personal Data Exchange. Paul Trevithick has produced a strawman architecture that takes a stab at defining how this could work, including applications that use personal data (like Kynetx apps). There will be many discussions and sessions about this concept at the upcoming Internet Identity Workshop on May 17-19 in Mountain View, CA. Come and join us; we'd love to have you participate.