Data Abstractions for Richer Cloud Experiences


Summary

A Cloud OS (COS) for a personal cloud will need a data abstraction layer analogous to the kinds of data abstraction that a traditional OS provides. Unlike a traditional filesystem, however, the COS data layer must deal with a multitude of distributed APIs.

As we discussed earlier, one of the primary services of a cloud OS (COS) will be data abstraction. Traditional operating systems provide data abstraction services by presenting programs and users with a file system view of the data stored in the sectors of the disk.

We have the same kind of data abstraction opportunities in the cloud, although we aren't talking about translating sectors into files, of course. And like the access control problem we discussed earlier, the problems that a cloud OS faces are made more complex by the distributed location and control of the data we want to access.

As the following chart from Programmable Web shows, the growth of the number of APIs has been exponential over the last 12 years.

API growth from Programmable Web

All these APIs present a tremendous opportunity for application developers, and they've taken advantage of it. Many of the most interesting mobile and online applications we've seen over the last few years involve mash-ups of multiple APIs.

But with that abundance of data comes a problem: programmers have to learn the details of each API, including its access methods and error codes. If you're only concerned with the few you need to create a particular app, that's no problem, and thus the API economy has flourished. But there are legitimate uses of all this data that require using not only multiple APIs, but also APIs you may not be aware of as you're writing your application.

Let's take a simple example: a phone number. My phone number is stored at Facebook, LinkedIn, Google, and several other places around the Web. Suppose you're writing an application to run in my personal cloud that needs access to my phone number. What do you do? Right now, there are the following options:

  • You can store my phone number yourself, giving me yet another place where my phone number is stored. The downside is that when my number changes, there's one more place I've got to update. If I forget to update your app, it stops working.
  • You could pick an API, say Google, and just tell people they have to use Google. This works as long as everyone is comfortable using Google.
  • You could choose to support a number of APIs where phone numbers are stored and give users a choice. The downside is that the more APIs you support the harder maintenance becomes.

The problem is exacerbated as the number of data elements increases—especially as they need to come from different APIs. We talked about the issues around authorization that this causes in The Foundational Role of Identity in a Personal Cloud. But the problems don't stop there. There are two important issues beyond authorization that we need to address if we're to abstract data access and make the developer's job easier:

  1. How do we know where to get the data?
  2. What is the format of the data and what do the elements mean?

Solving the first requires location-independent references—when a program needs access to the user's phone number, location-independent references provide an abstract means of finding where that data is stored. So, for example, suppose you store your phone number in GMail contacts and I store mine at Personal.com. The application doesn't have to know that or how to connect to those various services. The program references a name that means "user's phone number" and the data abstraction layer in the COS takes care of the messy details.
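
To make the idea concrete, here is a minimal Python sketch of a location-independent lookup. Everything in it is an assumption made for illustration: the service names, the sample users and numbers, and the `resolve` function are hypothetical and are not part of any COS or XDI specification. The point is only that the calling program asks for "phone" and never names a service.

```python
from typing import Callable, Dict, Tuple

# Hypothetical per-service fetchers. Real ones would call each service's API;
# these return canned sample data so the sketch runs on its own.
def fetch_from_gmail(user: str, name: str) -> str:
    return {"alice": {"phone": "+1 555 0100"}}[user][name]

def fetch_from_personal(user: str, name: str) -> str:
    return {"bob": {"phone": "+1 555 0199"}}[user][name]

FETCHERS: Dict[str, Callable[[str, str], str]] = {
    "gmail": fetch_from_gmail,
    "personal.com": fetch_from_personal,
}

# Where each user keeps each data element (assumed to be configured once,
# by the user, in the abstraction layer rather than in each application).
LOCATIONS: Dict[Tuple[str, str], str] = {
    ("alice", "phone"): "gmail",
    ("bob", "phone"): "personal.com",
}

def resolve(user: str, name: str) -> str:
    """Return the value behind an abstract name; the caller never sees the service."""
    service = LOCATIONS[(user, name)]
    return FETCHERS[service](user, name)

print(resolve("alice", "phone"))
print(resolve("bob", "phone"))
```

The application code is the same for both users even though their numbers live in different places; only the `LOCATIONS` table differs.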

The solution to the second involves semantic data interchange—suppose the program wants the user's phone number but one API stores it as "cell" and another as "mobile." How do we know that's the same thing? For one or two things, it's easy enough to create ad hoc mappings; but that quickly gets old. The data abstraction layer makes these translations automatically. Moreover, there can be multiple formats that are used for storing phone numbers.
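
The ad hoc mapping mentioned above might look like the following Python sketch. The canonical name `mobile_phone` and the alias table are assumptions chosen for illustration; a real abstraction layer would draw these mappings from a shared, standardized vocabulary rather than a hand-written dictionary.

```python
from typing import Any, Dict

# Hypothetical alias table: many service-specific labels, one canonical name.
CANONICAL_FIELD: Dict[str, str] = {
    "cell": "mobile_phone",
    "mobile": "mobile_phone",
}

def to_canonical(record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known aliases to their canonical field; leave unknown keys alone."""
    return {CANONICAL_FIELD.get(key.lower(), key): value
            for key, value in record.items()}

# Two services label the same concept differently.
from_service_a = {"cell": "+1 555 0100"}
from_service_b = {"mobile": "+1 555 0100"}

print(to_canonical(from_service_a) == to_canonical(from_service_b))
```

Writing this table once per application is exactly the part that "quickly gets old," which is why it belongs in the abstraction layer.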

A functional COS should provide the means (i.e., protocols) for performing location-independent data references as well as semantic data interchange. This abstraction layer can ensure that the authorization, location, and semantic issues are dealt with in a consistent way that is easy for the developer and the user. There has been much work on this problem over the last decade, ever since Tim Berners-Lee, James Hendler, and Ora Lassila proposed the Semantic Web in the May 2001 issue of Scientific American. While we acknowledge that much of what has been done in the name of the Semantic Web has seemed overly complicated to developers of modern Web services, we believe that we're beginning to face the exact problems that the ideas behind the Semantic Web were designed to solve.

Our choice for a protocol to provide semantic services to the COS is XDI. XDI (XRI Data Interchange) is a generalized, extensible service for sharing, linking, and synchronizing structured data over the Internet and other data networks using XRI-addressable RDF graphs. XDI is under development by the OASIS XDI Technical Committee. XDI was created to solve the aforementioned problems in a way that is:

  • understandable—XDI does not require pre-defined data schemas for new types of data to be exchanged
  • contextual—The concept of context is built directly into the XDI graph model, so identity, relationships, and permissions can be context-dependent
  • trustable—XDI identification, authorization, and relationship management are integral features of the graph model and protocol
  • portable—An XDI account can be moved to a different host or service provider without breaking links or compromising security or privacy

To see how XDI can help, let's continue the phone number example. A developer using XDI to reference the user's work phone number in a KRL program might write something like this:

user = get_user_iname();
user_work_phone = xri:#{user}+work$!(+tel)

Note: The #{user} syntax shown above is meant to convey the construction of an XRI statement using previously calculated data with a KRL beesting. Some other means of constructing XRIs might ultimately be selected as we make concrete progress on integrating XDI in KRL. If get_user_iname() returned =windley, the resulting XRI reference would be xri:=windley+work$!(+tel). The +work clause provides a context for the phone number. The $!(+tel) clause specifies that we want a single instance of a phone number, not a multi-valued collection (in the case there's more than one).
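
For readers who don't use KRL, the same string construction can be sketched in Python. This is only an illustration of the one reference shape discussed above: the function names are hypothetical, and the regular expression handles just this `=name+context$!(+attribute)` pattern rather than XRI syntax in general.

```python
import re
from typing import Tuple

def work_phone_ref(iname: str) -> str:
    """Build the work-phone reference for an i-name such as '=windley'."""
    return f"xri:{iname}+work$!(+tel)"

# Matches only the shape used in this post: authority, one context, one attribute.
_REF = re.compile(
    r"^xri:(?P<authority>=[^+$]+)"      # e.g. =windley
    r"(?P<context>\+[^$]+)"             # e.g. +work
    r"(?P<attribute>\$!\(\+[^)]+\))$"   # e.g. $!(+tel)
)

def split_ref(ref: str) -> Tuple[str, str, str]:
    """Split a reference of the shape above into its three parts."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"unsupported reference shape: {ref!r}")
    return match["authority"], match["context"], match["attribute"]

ref = work_phone_ref("=windley")
print(ref)
print(split_ref(ref))
```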

Location-independence is the easier property to discuss, so let's start there. Resolving a reference like xri:=windley+work$!(+tel) isn't much different in theory from how a domain name like www.windley.com gets resolved. There is a set of known top-level authorities who know how to determine who or what =windley is. From there, you (literally) follow the graph to the node represented by xri:=windley+work$!(+tel). That node could reference a data value in any API.
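
Following the graph can be pictured with a small Python sketch. The nested dictionary stands in for the graph, its top level plays the role of the authority that knows who =windley is, and the leaf is a pointer to a field in some API. The structure, the `follow` function, and the sample service and field names are all assumptions for illustration and do not reflect how an XDI graph is actually stored or resolved.

```python
from typing import Any, Dict, List

# Hypothetical graph: authority -> context -> attribute -> pointer into an API.
GRAPH: Dict[str, Any] = {
    "=windley": {
        "+work": {
            "$!(+tel)": {"service": "example-contacts", "field": "mobile"},
        },
    },
}

def follow(graph: Dict[str, Any], path: List[str]) -> Any:
    """Walk the graph one segment at a time and return the node at the end."""
    node: Any = graph
    for segment in path:
        if not isinstance(node, dict) or segment not in node:
            raise KeyError(f"no node for segment {segment!r}")
        node = node[segment]
    return node

target = follow(GRAPH, ["=windley", "+work", "$!(+tel)"])
print(target["service"], target["field"])
```

The program that asked for the work phone number never sees the service name; it only sees the value that the abstraction layer fetches from wherever the leaf points.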

Of course, this kind of independence doesn't happen for free. There's no magic way to know that I keep my phone number at Facebook and you have yours on iCloud. But once a user has given her COS access to her data at the services she uses, the COS could, based on standard mappings, know how to access her profiles on those services and provide the right link regardless of who's running the program.

Data dereferencing using XDI

Note: The first example in the preceding figure uses an i-name that has been registered with an XDI registry, similar to the way you would register a domain name today. But XDI does not require the use of XDI registries. You can address any data that's available at any URI that hosts an XDI endpoint. This could be any webserver, as shown in the hypothetical example of Facebook supporting an XDI interface, or it could be an XDI endpoint discoverable through an email address using OpenID Connect. All of them work equally well, because once the discovery process reaches an XDI endpoint, all of the data behind it is addressable using XDI.

Who's providing all these XDI endpoints? Ideally the API owners, but that doesn't have to be the case. Think of the XDI endpoints playing the role that drivers play in a traditional OS. If I add a new kind of disk with a different interface then I need a new driver.

The mapping process shows the power of semantic data interchange. Once maps between common concepts like the user's phone number and its location in various APIs are made, they can be reused over and over again. If the API changes, changing the map in one place updates it for every application and every user.

Moreover, maps can link common semantic concepts so that we know that cell and mobile are the same. Semantic mapping solves three important problems:

  • Poorly defined semantics—an example might be incomplete phone numbers that assume a context, like a country code.
  • Same syntax, different semantics—we might run into data elements that are formatted like phone numbers, but aren't.
  • Different syntax, same semantics—this occurs frequently since different APIs use different string formats for the same concept, like phone numbers.
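
As a rough illustration of the first and third problems, the Python sketch below rewrites differently formatted phone strings into one form and supplies a country code when none is given. The function name, the default country code of "1", and the lower length bound of 8 digits are assumptions made for this example; the upper bound of 15 digits follows E.164. The second problem cannot be solved from syntax alone, so the sketch only rejects strings that don't look like phone numbers at all; telling a phone number from a look-alike depends on knowing what the source field means.

```python
import re
from typing import Optional

# Digits plus the punctuation commonly used when writing phone numbers.
_PHONE_CHARS = re.compile(r"^\+?[\d\s().-]+$")

def normalize_phone(raw: str, default_country_code: str = "1") -> Optional[str]:
    """Return '+<digits>' for a plausible phone string, otherwise None."""
    text = raw.strip()
    if not _PHONE_CHARS.match(text):
        return None
    digits = re.sub(r"\D", "", text)
    if not text.startswith("+"):
        # Assumption: a number without '+' is a national number, so the
        # missing context (the country code) is filled in here.
        digits = default_country_code + digits
    if not 8 <= len(digits) <= 15:
        return None
    return "+" + digits

print(normalize_phone("+1 (801) 555-0100"))
print(normalize_phone("801.555.0100"))
print(normalize_phone("order #12"))
```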

The good news is that we don't have to boil the ocean to get started. Semantics has been made way too mystical and unapproachable. This is really nothing more than the kinds of techniques a good programmer would use to solve these problems, but standardized. A COS could provide mappings for common data elements and common APIs, like contact data or calendars, and make those available to developers. Adding just a few of the most common required elements to the COS would greatly simplify many applications that need access to personal information.

COS-level data abstraction makes programs easier to write and use because:

  • Developers don't have to understand the intricacies of multiple APIs.
  • The COS manages authorization issues, freeing developers from managing that code and allowing users greater visibility into and control over how data is used.
  • The COS provides a consistent configuration experience for users.
  • Developers don't have to write code to manage configuration.

The next post in this series will examine our proposal for a COS programming model.