I Am Sybil

Split personalities

Online, I am Sybil. So are you. You have no digital representation of your individual identity. Rather, you have various identities, disconnected and spread out among the administrative domains of the various services you use.

An independent identity is a prerequisite to being able to act independently. When we are everywhere, we are nowhere. We have no independent identity and are thus constantly subject to the intervening administrative identity systems of the various service providers we use.

Building a self-sovereign identity system changes that. It allows individuals to act and interact as themselves. It allows individuals to have more control over the way they are represented and thus seen online. As the number of things that intermediate our lives explodes, having a digital identity puts you at the center of those interchanges. We gain the power to act instead of being acted upon.

This is why I believe the discussion of online privacy sells us short. Being self-sovereign is about much more than controlling how my personal data is used. That's playing defense and is a cheap substitute for being empowered to act as an individual. Privacy is a mess of pottage compared to the vast opportunities that being an autonomous digital individual enables.

Technically, there are several ways to implement a self-sovereign identity system. Most come down to one of three choices:

  • a public, permissionless distributed ledger (blockchain)
  • a public, permissioned distributed ledger
  • a private, permissioned distributed ledger1

Public or private refers to who can join: a public system allows anyone to get an identity on the ledger, while private systems restrict who can join. I owe this categorization to Jason Law.

Permissioned and permissionless refers to how the ledger's validators are chosen. As I discussed in Properties of Permissioned and Permissionless Blockchains, these two types of ledgers place a different emphasis on protection from censorship and protection from deletion. People of a more libertarian bent will prefer permissionless ledgers because of their emphasis on protection from censorship, while those who need to work within regulatory regimes will prefer permissioned ones.

We could debate the various benefits of each of these types of self-sovereign identity systems, but in truth they are all preferable to what we have today, since each allows individuals to create and control identities independent of the various administrative domains with which people interact. In fact, I suspect that one or more instantiations of each of these three types will exist in parallel to serve different needs. Unlike the physical world, where we live in just one place, online we can have a presence in many different worlds. People will use all of these systems and more.

Regardless of the choices we make, the principle that ought to guide the design of self-sovereign identity systems is respect for people as individuals and ensuring they have the ability to act as such.

In my discussion on the CompuServe of Things, I said:

"On the Net today we face a choice between freedom and captivity, independence and dependence."

I don't believe this is overstated. As more and more of our lives are intermediated by software-based systems, we will only be free if we are free to act as peers of these services. An independent identity is the foundation for that freedom to act.


  1. A private, permissionless ledger is an oxymoron.


Building a Virtual University

Imagine you wanted to create a virtual university (VU)1. VU admits students and allows them to take courses in programs that lead to certificates, degrees, and other credentials. But VU doesn't have any faculty or even any courses of its own. VU's programs are constructed from courses at other universities. VU's students take courses at whichever university offers them. In an extreme version of this model, VU doesn't even credential students. Rather, credentials come from participating institutions that have agreed, on a program-by-program basis, to accept certain transfer credits from other participating universities to fulfill program requirements.

Tom Goodwin writes:

Uber, the world’s largest taxi company, owns no vehicles. Facebook, the world’s most popular media owner, creates no content. Alibaba, the most valuable retailer, has no inventory. And Airbnb, the world’s largest accommodation provider, owns no real estate.

These companies are thin layers sitting on an abundance of supply. They connect customers to that supply. VU follows a similar pattern. VU has no faculty, no campus, no buildings, no sports teams. VU doesn't have any classes of its own. Moreover, as we'll see, VU doesn't even have much student data. VU provides a thin layer of software that connects students anywhere in the world with a vast supply of courses and degree programs.

There are a lot of questions about how VU would work, but what I'd like to focus on in this post is how we could construct (or, as we'll see later, deconstruct) IT systems that support this model.

Traditional University IT System

Before we consider how VU can operate, let's look at a simple block model of how traditional university IT systems work.

Figure: Traditional University IT System Architecture

Universities operate three primary IT systems in support of their core business: a learning management system (LMS), a student information system (SIS), and a course management system.2

The LMS is used to host courses and is the primary place that students use course material, take quizzes, and submit assignments. Faculty build courses in the LMS and use it to evaluate student work and assign grades.

The SIS is the system of record for the university and tracks most of the important data about students. The SIS handles student admissions, registrations, and transcripts. The SIS is also the system that a university uses to ensure compliance with various university and government policies, rules, and regulations. The SIS works hand-in-hand with the course management system that the university uses to manage its offerings.

The SIS tells the LMS who's signed up for a particular course. The LMS tells the SIS what grades each student got. The course management system tells the SIS and LMS what courses are being offered.

Students usually interact with the LMS and SIS through Web pages and dedicated mobile apps.

VU presents some challenges to the traditional university IT model. Since these university IT systems are monoliths, you might be able to do some back-end integrations between VU's systems and the SIS and LMS of each participating university. The student would then have to use VU's systems and those of each participating university.

The Personal API and Learning Records

I've written before about BYU's efforts to build a university API. A university API exposes resources that are directly related to the business of the university such as /students, /instructors, /classes, /enrollments, and so on. Using a standard, consistent API, developers can interact with any relevant university system in a way that protects university and student data and ensures that university processes are followed.
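To make this concrete, here's a rough sketch of what consuming such an API might look like. The host, token, and response shapes are hypothetical; only the resource names come from the API described above.

import requests

BASE = "https://api.example.edu"                  # hypothetical host
HEADERS = {"Authorization": "Bearer <token>"}     # issued via API management

# Look up a student resource.
student = requests.get(f"{BASE}/students/12345", headers=HEADERS).json()

# List that student's enrollments, following the /enrollments resource.
enrollments = requests.get(f"{BASE}/students/12345/enrollments",
                           headers=HEADERS).json()

for enrollment in enrollments.get("items", []):
    print(enrollment["class"], enrollment["status"])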

We've also been exploring how a personal API functions in a university setting. For purposes of this discussion, let's imagine a personal API that provides an interface to two primary repositories of personal data: the student profile and the learning record store (LRS). The profile is straightforward and contains personal information that the student needs to share with the university in order to be admitted, register for and take courses, and complete program requirements.

Figure: A Student Profile

The LRS stores the stream of all learning activities by the student. These learning activities include things like being admitted to a program, registering for a class, completing a reading assignment, taking a quiz, attending class, getting a B+ in a class, or completing a program of study. In short, there is no learning activity that is too small or too large to be recorded in the LRS. The LRS stores a detailed transcript of learning events.3
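As a concrete illustration, a single learning event might be recorded with an xAPI-style statement (see footnote 3). This is a hedged sketch: the LRS endpoint and identifiers are hypothetical, though the actor/verb/object shape and version header follow the xAPI specification.

import requests

statement = {
    "actor":  {"name": "Phillip", "mbox": "mailto:phillip@example.edu"},
    "verb":   {"id": "http://adlnet.gov/expapi/verbs/completed",
               "display": {"en-US": "completed"}},
    "object": {"id": "https://example.edu/courses/cs-330/reading/10",
               "definition": {"name": {"en-US": "reading assignment 10"}}},
}

# POST the statement to the student's LRS through their personal API.
requests.post("https://alice.example.org/api/lrs/statements",   # hypothetical
              json=statement,
              headers={"X-Experience-API-Version": "1.0.3"})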

One significant contrast between the traditional SIS/LMS that students have used and the LRS is this: the SIS/LMS is primarily a record of the current status of the student that records only coarse-grained achievements, whereas the LRS represents the stream of learning activities, large and small. The distinction is significant. My last book was called The Live Web because it explored the differences between systems that make dynamic queries against static data (the traditional SIS/LMS) and those that perform static queries on dynamic streams of data. The LRS is decidedly part of the live web.

The personal API, as its name suggests, may provide an interface to any data the person owns, but right now, we're primarily interested in the profile and LRS data. For purposes of this discussion, we'll refer to the combination of profile and LRS as the "student profile."

We can construct the student profile such that it can be hosted. By hosted, I mean that the student profile is built in a way that each profile could, potentially, be run on different machines in different administrative domains, without loss of functionality. One group of students might be running their profiles inside their university's Domain of One's Own system, another group might be using student profiles hosted by their school, other students might be using a commercial product, and some intrepid students might choose to self-host. Regardless, the API provides the same functionality independent of the domain in which the student profile operates.
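One way to get that host-independence is simple discovery: an application learns where a profile lives from the domain itself. The .well-known path below is an assumption for illustration, not an established convention.

import requests

def profile_api_root(domain):
    """Ask a personal domain where its profile API lives."""
    meta = requests.get(f"https://{domain}/.well-known/personal-api").json()
    return meta["api_root"]        # e.g. "https://alice.example.org/api"

# The caller's code is identical whether the profile is hosted by the
# university, a commercial provider, or the student herself.
profile = requests.get(f"{profile_api_root('alice.example.org')}/profile").json()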

Even when the profile is self-hosted, the information can still be trusted because institutions can digitally sign accomplishments so that others can be assured they're legitimate.
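A minimal sketch of that signing step, assuming Ed25519 keys and the Python cryptography package; the record fields are illustrative:

import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The university signs the accomplishment with its private key...
university_key = Ed25519PrivateKey.generate()
record = json.dumps({"student": "alice", "achievement": "B+ in CS 330"},
                    sort_keys=True).encode()
signature = university_key.sign(record)

# ...and anyone with the university's public key can check it, no matter
# which domain hosts the student profile.
try:
    university_key.public_key().verify(signature, record)
    print("accomplishment is legitimate")
except InvalidSignature:
    print("record was tampered with")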

Deconstructing the SIS

With the personal-API-enabled student profile, we're in a position to deconstruct the university IT systems we discussed above. As shown in the following diagram, the student profile can be separated from the university system. The two interact via their APIs, and students interact with both through applications and Web sites that use those APIs.

Figure: Interactions of the University and Student Profile

The university API and the student profile portions of the personal API are interoperable. Each is built so that it knows about and can use the other. For example, the university API knows how to connect to a student profile API, can understand the schema within, respects the student profile's permissioning structures, and sends accomplishments to the LRS along with other important updates.

For its part, the student profile API knows how to find classes, see what classes the student is registered for, receive university notifications, check grades, connect with classmates, and send important events to the university API.

VU can use both the university systems and the student profile. Students can access all three via their APIs using whatever applications are appropriate.

Figure: Adding a Virtual University

The VU must manage programs made from courses that other universities teach and keep track of who is enrolled in what (the traditional student records function). But VU can rely on the university's LMS, major functions of its SIS, and information in the student profile to get its job done. For example, if VU trusted that the student profile would be consistently available, it would still need to know who its students are, but it could evaluate student progress using transcript records written to the student profile by the universities.

Building the Virtual University

With this final picture in mind, it's easy to see how multiple universities and different student profile systems could work together as part of VU's overall offerings.

Figure: The Virtual University

With these systems in place, VU can build programs from courses taught at many universities, relying on them to do much of the work in teaching students and certifying student work. Here is what VU must do:

  • VU still has the traditional college responsibility of determining what programs to offer, selecting courses that make up those programs, and offering those to its students.
  • VU must run a student admissions process to determine who to admit to which programs.
  • VU has the additional task of coordinating with the various universities that are part of the consortium to ensure they will accept each other's courses as prerequisites and, if necessary, as equivalent transfer credits.
  • VU must evaluate student completion of programs and either issue certifications (degrees, certificates of completion, etc.) itself or through one of its member institutions.

Universities aren't responsible for anything more than they already do. Their IT systems are architected differently to have an API and to interact with the student profile, but otherwise they are very similar in functionality to what is in place now. Accomplishments at each participating institution can be recorded in the student profile.

VU students apply to and are admitted by VU. They register for classes with VU. They look to VU for program guidance. When they take a class, they use the LMS at the university hosting the course. The LMS posts calendar items and notifications to their student profile. The student profile becomes the primary system the student uses to interact with both VU and the various university LMSs. They have little to no interaction with the SIS of the respective universities.

One of the advantages of the hosted model for student repositories is that they don't have to be centrally located or administered. As a result, student data can be located in different political domains in accordance with data privacy laws.

Note that the student profile is more than a passive repository of data that has limited functionality. The student profile is an active participant in the student's experience, managing notifications and other communications, scheduling calendar items, and even managing student course registration and progress evaluation. The student profile becomes a personal learning environment working on the student's behalf in conjunction with the various learning systems the student uses.

Since the best place to integrate student data is in the student profile, it ought to exist long before college. Then students could use their profile to create their application. There's no reason high school activities and results from standardized testing shouldn't be in the LRS. Student-centric learning requires student-centric information management.

We can imagine that this personal learning environment would be useful outside the context of VU and provide the basis for the student's learning even after she graduates. By extending it to allow programs of learning to be created by the student or others, independent of VU, the student profile becomes a tool that students can use over a lifetime.

The design presented here follows the simple maxim that the student is the best place to integrate information about the student. By deconstructing the traditional centralized university systems, we can create a system that supports a much more flexible model of education. APIs provide the means of modularizing university IT systems and creating a student-centric system that sits at the heart of a new university experience.


  1. Don't construe this post to be "anti-university." In fact, I'm very pro-university and believe that there is great power in the traditional university model. Students get more when they are face-to-face with other learners in a physical space. But that is not always feasible and once students leave the university, their learning is usually much less structured. The tools developed in this post empower students to be life-long learners by making them more responsible for managing their own learning environment.
  2. Universities, like all large organizations, also use things like financial systems, HR systems, and customer management systems, along with a host of other smaller and support IT systems. But I'm going to ignore those in this post since they're really boring and not usually specialized from those used by other businesses.
  3. The xAPI is the proposed way that LRSs interact with each other and other learning systems. xAPI is an event-based protocol that communicates triples that have a subject, verb, and object. Using xAPI, systems can communicate information such as "Phillip completed reading assignment 10." In my looser interpretation, an LRS might also store information from Activity Streams or other ways that event-like information can be conveyed.


Why Companies Need Self-Sovereign Identity

The Problem

While the Internet seems to have made most everything else in our lives easier, faster, and more convenient, certain industries like healthcare and finance remain stubbornly in the 20th century.

Consider the problem of a person using their medical data. Most of us have multiple healthcare providers ranging from huge conglomerates that run hospitals and clinics to small businesses operated by doctors to retail pharmacies large and small. None of us get healthcare under a single administrative umbrella. Consequently, we leave a trail of medical data spread about the various healthcare providers' incompatible systems.

In this setup, patients can’t easily identify themselves across different administrative domains. Consequently, linking patient data is made more complicated than it otherwise would be. Interoperability of systems is nearly impossible. The result is, at best, increased costs, inefficiency, and inconvenience. At worst, people die.

The seemingly obvious solution is to create a single national patient identifier. But can you hear the howls and screams? This solution has never gained traction because of privacy concerns and fear of corporate and government control and overreach.

What if there were an end-run around a national patient ID? There is.

Self-Sovereign Identity

Self-sovereign identity is based on the premise that the best point for integrating data about a person is the person. At first the idea seems fanciful, but we've arrived at a point where self-sovereign identity is feasible.

You’d have to be living in a cave to have not heard something about Bitcoin in the past few years. But if you’re not a Bitcoin geek, you might not realize that a critical underlying piece of technology that Bitcoin introduced—and popularized—is the blockchain.

A blockchain is a revolutionary way of building a distributed ledger, and distributed ledgers have exciting possibilities for solving sticky identity problems.

Using a distributed ledger, we can create an identity system that:

  • is not owned and controlled by anyone—in the same manner that the Internet is a set of protocols and some loose governance1, distributed ledgers allow us to create identity systems that are usable by everyone equally, without any single entity being able to control them.
  • provides people-controlled identifiers—this is the essence of the term self-sovereign. The identity is under the control of the person using it (a sketch of this mechanic follows the list).
  • is persistent—since the identifier is under the sovereign control of the patient, it is irrevocable and long-lived, potentially for life.
  • is multipurpose—the identifier used at one healthcare provider can be used at another, not to mention your financial institution, your school, and anywhere else.
  • is privacy enhancing—only the parties to any given transaction can see any of the details of the transaction, including who's involved. People choose what to reveal and to whom in exchange for the services they need. People can have multiple identifiers that correspond to different personas.
  • is trustable—the distributed ledger and its governance process, both human and algorithmic, provide a means whereby systems using the identity can instantly verify claims with third parties when self-assertion (by the patient) is insufficient.
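Here's a sketch of the core mechanic behind people-controlled identifiers: the person generates a keypair, and the identifier is a fingerprint of the public key, so only the keyholder can prove control. The did: prefix and the ledger-registration step are assumptions for illustration.

import hashlib
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()        # never leaves the person
public_bytes = signing_key.public_key().public_bytes(
    encoding=serialization.Encoding.Raw,
    format=serialization.PublicFormat.Raw,
)

# The identifier is derived from the public key; anchoring it in a
# distributed ledger (not shown) is what makes it persistent and trustable.
identifier = "did:example:" + hashlib.sha256(public_bytes).hexdigest()[:32]
print(identifier)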

Healthcare providers, financial institutions, educational institutions, and even governments can be part of this identity system without any of them being in charge of it. Sound impossible? It's not; after all, that's how the Internet works today. Not only is it possible, it's technically feasible right now.

The benefits of using self-sovereign identity go well beyond interoperability and include secure messaging, auditable transaction logs, and natural ways to manage consent.

How Does Self-Sovereign Identity Work?

Think about your online identities. Chances are you don’t control any of them. Someone else controls them and could, without recourse, take them away. While most people can see the problems with that scenario, they can’t imagine another way. They think, “if I want to use an online service, I need an identity from it, right?” We don’t.

There are lots of examples online right now. Have you ever used Google or Facebook to log into some other system (called a “relying party”)? Most of us have. When you log into another system with Facebook, you are using your Facebook identity to establish an account on the other system. Identities and accounts are not the same thing.

Plenty of places already see the value in reducing friction by allowing people to use an online identity from a popular identity provider (e.g. Google or Facebook).

Now imagine that the identity you used to log into a relying party didn’t come from a big company, but instead was something you created yourself on a distributed-ledger-based identity system. And further, imagine that relying parties could trust this identity because of its design. This is the core idea behind self-sovereign identity.

Any online service could use a self-sovereign identity. In fact, because of the strong guarantees around verified claims that a distributed-ledger-based identity system can provide, a self-sovereign identity is significantly more secure and trustworthy than an identity from Facebook, Google, and other social-network-based identity providers, making it usable in healthcare, finance, and other high-security applications.

Companies Need Self-Sovereign Identity

The bottom line is that companies need self-sovereign identity as much as people do. Self-sovereign identity based on a distributed ledger is good for people because it puts them in control of their data and protects their privacy.

But it's also good for companies. This is a classic win-win scenario because the same technology gives companies, governments, and other institutions an identity system they can trust, that enhances their operational capabilities, and is not their responsibility to administer. Organizations can get out of the business of being identity providers, an enticing proposition for numerous reasons. Here are a few examples:

  • As companies face more and more security threats, they are coming to see personally identifying information as a liability. A self-sovereign identity system provides a means for companies to get the data they need to complete transactions without having to keep it in large collections.
  • Claims that have been verified for one purpose can be reused in other contexts. For example, if my financial institution has verified my address as part of their know your customer process, that verified address could be used by my pharmacy. The sketch below shows what such a reusable claim might look like.
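This is a structural sketch of a reusable verified claim. The field names are illustrative, not a standard; the proof would be a signature from the bank along the lines of the signing sketch earlier.

verified_address = {
    "issuer":  "did:example:bank-123",        # who verified the claim
    "subject": "did:example:alice-456",       # who the claim is about
    "claim":   {"address": "123 Main St, Provo, UT"},
    "issued":  "2016-04-01T00:00:00Z",
    "proof":   {"type": "Ed25519Signature", "signature": "<base64>"},
}

def accept_claim(claim, trusted_issuers):
    """Any relying party (pharmacy, school, ...) runs the same check."""
    return claim["issuer"] in trusted_issuers  # plus verifying the signature

assert accept_claim(verified_address, {"did:example:bank-123"})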

The Internet got almost everyone except the telecom companies out of the long-haul networking business: throw your packets on the Internet and pick them up at their destination. Think of the efficiencies this has provided. A distributed-ledger-based identity system can do the same thing for identity.

Let’s create one, shall we?


  1. The Internet is governed (mostly) by protocol: the rules that describe how machines on the Internet talk to each other. These rules, determined by "rough consensus and running code," are encoded in software. There are other decisions that need to be made by humans. For example: who gets blocks of IP addresses, how they're handed out, and what top-level domains are acceptable in the domain name system. These decisions are handled by a body called ICANN. This system, while far from perfect, manages to make decisions about the interoperation of a decentralized infrastructure that allows anyone to join and participate.


Principles of Self-Sovereign Identity

Christopher Allen has a nice slide deck online, Identity on the Blockchain: Perils and Promise from his talk at Consensus 2016 Identity Workshop.

His discussion of self-sovereign identity and the principles he believes identity systems ought to possess starts on slide 6:

There's tremendous power in the simple declaration that every human being is the original source of their identity. Think about your online identities. Chances are you don't control any of them. Someone else controls them and could, without recourse or appeal, take them from you.

This is, of course, untenable. As software intermediates more and more of our lives we must either gain control of our online identities or be prepared to surrender key rights and freedoms that we have taken for granted in the physical world.

In the principles, Chris lays out necessary attributes that a self-sovereign identity system must have to protect human freedom1:

  1. Existence People have an independent existence — they are never wholly digital
  2. Control People must control their identities, celebrity, or privacy as they prefer
  3. Access People must have access to their own data — no gatekeepers, nothing hidden
  4. Transparency Systems and algorithms must be open and transparent
  5. Persistence Identities must be long-lived — for as long as the user wishes
  6. Portability Information and services about identity must be transportable by the user
  7. Interoperability Identities should be as widely usable as possible; e.g. cross borders
  8. Consent People must freely agree to how their identity information will be used
  9. Minimization Disclosure of claims about an identity must be as few as possible
  10. Protection The rights of individual people must be protected against the powerful

I could quibble with some of the wording, but I think this is a pretty good list. Identity systems that support these principles are possible. And they would not just work with existing administrative systems (which no one is proposing would go away), but enhance them.

I know there are some people reading this and thinking of all the reasons it will never work. If you've got specific comments, questions, or critiques, feel free to use the annotation system on the right of the page to post them. Let's have a discussion.


  1. I've changed "user" to "people" in this list because Doc Searls long ago conditioned me to associate it with "drug user" and I've never recovered.


Building a Personal API

As part of my abstract for a talk at APIStrat 2016, I wrote that BYU was interested in equipping students with personal APIs in an effort to teach them digital autonomy and make them more responsible for their learning.

Unlike others who attempted personal APIs, we're lucky in that we control both sides of the transaction. That is, we have control of the university systems and we have students who will generally use the tools we build for them. So rather than build a conventional tool to, say, create a directory, we can build one that uses the student's personal API. The university systems still work and students get a new tool that they learn to use. This is meaningful because students have a significant number and variety of interactions with the university, making it a great place to explore how architectures based on personal APIs can be designed and used.

The BYU Domains project provides a free domain name, with hosting, for every matriculated student and all faculty and staff. BYU Domains gives students an online space of their own to teach them that they can be contributors to online conversations and introduce them to the concept of self-sovereign identity. BYU Domains is a cPanel-based hosting system. Consequently, students can install numerous applications and give each a unique URL through subdomains, paths, or both. This makes a BYU Domain site a great place to host a personal API.

The BYU Domains project is creating a community directory. The purpose of the directory is to highlight uses of BYU Domains to showcase how different members of the campus community are using them. Here's University of Mary Washington's directory as an example.

The traditional way to build this would be to write some software that looks in the BYU Domains administrative database and uses the information stored there to create a directory. Here's a simple diagram illustrating that approach:

Figure: A traditional architecture for the community directory

In this architecture the directory is tightly coupled to the specific system used to implement BYU domains. The application is dependent on the schema in the database. Any changes to the underlying system would necessitate that the directory application be updated as well.

An alternative that avoids this problem is to architect the directory application as an aggregator of information from the personal APIs of any domains that want to participate in the directory. The following diagram illustrates that architecture.

Figure: A personal-API-based architecture for the community directory

In this architecture, the directory uses information from the various APIs of domains that participate. The directory application has a database for information in its domain as well as for caching information from the various APIs. We're unlikely to render a page for the directory by synchronously calling dozens of individual APIs. This design gives the directory application the traditional advantages of API-based architectures:

First, the API represents a contract upon which the directory system can depend. The contract insulates the directory from underlying implementation changes in the personal domain.

Second, the directory has access to the APIs because their owners have authorized the connection using OAuth. Domain owners can take themselves out of the directory at any point by visiting the Authorizations control panel in their domain. This eliminates the need for data sharing agreements and other information governance hurdles.

Third, domains that aren't part of the BYU Domains administrative system can still be part of the directory. For example, a faculty domain hosted by her department could just as easily be featured in the directory as one hosted by BYU Domains.
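A rough sketch of that aggregator, with hypothetical URLs, tokens, and response fields: poll each authorized personal API on a schedule, cache what comes back, and render directory pages only from the cache.

import requests

authorized_domains = {                     # populated via OAuth consent
    "alice.example.org": "<oauth-token>",
    "prof-b.cs.example.edu": "<oauth-token>",
}
cache = {}

def refresh_cache():
    """Run periodically; pages render from the cache, never synchronously."""
    for domain, token in authorized_domains.items():
        resp = requests.get(f"https://{domain}/api/profile",
                            headers={"Authorization": f"Bearer {token}"},
                            timeout=5)
        if resp.ok:
            cache[domain] = resp.json()    # e.g. name, blurb, featured URL

def directory_entries():
    return [{"domain": d, **data} for d, data in cache.items()]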

BYU's University API Committee is currently exploring how to make this work. We've had some good discussions on creating personal APIs that are built incrementally as data is needed, and done flexibly using JSON-LD markup to communicate schema information. I'm excited by the possibilities. We're also exploring API management that can run on the individual domains.

The BYU Domains Community Directory project gives us a good excuse to design and build a basic personal API application for BYU Domains. Follow-on projects, like an xAPI-responsive learning record store (LRS), will take advantage of this foundation to add portfolio data to the student's personal API. University systems can be tuned to use the profile and LRS information while leaving it all firmly in the hands of its owner.


Personal APIs in a University Setting

The API economy is in full swing and numerous commercial entities are engaged in producing APIs as a way of extending their reach. But as more and more of life is intermediated by computers, individuals also benefit from providing a personal API that can be used by applications on equal footing with other APIs.

Brigham Young University is teaching students about digital autonomy so that they are better prepared to be lifelong learners. In addition to a Domain of One's Own project, we have also embarked on a program of giving each student a personal API. By making students responsible for their own data, we teach them that they can be active participants in the digital realm.

In addition to profile information, the API is also a means of accessing the student's personal learning record system (LRS). Students grant university and other systems access to the resources in their API. The personal API provides the same features that other APIs have including API management and authorization. The personal API works in concert with BYU's University API to create a rich, permissioned data ecosystem for application developers inside and outside the university.

This talk will discuss design principles, implementation decisions, initial projects, and our experience to date.


A Pico-Based Platform for ESProto Sensors

Connected things need a platform to accomplish anything more than sending data. Picos make an ideal system for providing intelligence to connected devices. This post shows how I did that for the ESProto sensor system and talks about the work my lab is currently doing to make that easier than ever.

ESProto

ESProto is a collection of sensor devices based on the ESP8266, an Arduino-compatible chip with a built-in WiFi module. My friend Scott Lemon gave me a couple of Wovyn WiFi Emitters based on ESProto1 to play with: a simple temperature sensor and a multi-sensor array (MSA) that includes two temperature transducers (one on a probe), a pressure transducer, and a humidity transducer.

Figure: ESProto Multi-Sensor Array
Figure: ESProto Temperature Sensor

One of the things I love about Scott's design is that the sensors aren't hardwired to a specific platform. When setting up a sensor unit, you provide a URL to which the sensor will periodically POST (via HTTP) a standard payload of data. In stark contrast to most of the Internet of Things products we see on the market, ESProto lets you decide where the data goes.2

Setting up an ESProto sensor device follows the standard methodology for connecting something without a user interface to a WiFi network: (1) put the device in access point mode, (2) connect to it from your phone or laptop, (3) fill out a configuration screen, and (4) reboot. The only difference with the ESProto is that in addition to the WiFi configuration, you enter the data POST URL.

Once configured, the ESProto periodically wakes, takes its readings, POSTs the data payload, and then goes back to sleep. The sleep period can be adjusted, but is nominally 10 minutes.
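To make the data flow concrete, here's a sketch of the heartbeat POST in Python. The payload shape is inferred from the attribute names the KRL rules below consume (emitterGUID, property, healthPercent, data), and the channel URL shape is approximate; don't read it as Wovyn's actual wire format.

import requests

payload = {
    "emitterGUID": "5CCF7F0EC86F",
    "property": {"model": "esproto-msa"},
    "healthPercent": 82,                  # battery health
    "data": {
        "temperature": [{"name": "probe temp", "units": "degrees",
                         "temperatureF": 78.8, "temperatureC": 26.0}],
        "humidity":    [{"name": "enclosure", "humidity": 41.2}],
    },
}

# The configured POST URL is an event channel on the pico that
# represents this device, raising wovynEmitter:thingHeartbeat.
requests.post("https://pico-engine.example.com/sky/event/"
              "<channel>/<eid>/wovynEmitter/thingHeartbeat",
              json=payload)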

The ESProto design can support devices with many different types of transducers in myriad configurations. Scott anticipates that they will be used primarily in commercial settings.

Spimes and Picos

A spime is a computational object that can track the metadata about a physical object or concept through space and time. Bruce Sterling coined the neologism as a contraction of space and time. Spimes can contain profile information about an object, provenance data, design information, historical data, and so on. Spimes provide an excellent conceptual model for the Internet of Things.

Picos are persistent compute objects. Picos run in the pico engine and provide an actor-model for distributed programming. Picos are always on; they are continually listening for events on HTTP-based event channels. Because picos have individual identity, persistent state, customizable programming, and APIs that arise from their programming, they make a great platform for implementing spimes.3

Because picos are always online, they are reactive. When used to create spimes, they don't simply hold meta-data as passive repositories, but rather can be active participants in the Internet of Things. While they are cloud-based, picos don't have to run in a single pico engine to work together. Picos employ a hosting model that allows them to be run on different pico engines and to be moved between them.

In our conception of the Internet of Things, we create a spime for each physical object, whether or not it has a processor. In the case of ESProto, we create a pico-based spime for each ESProto device:

Figure: An ESProto MSA connected to its spime

Spimes can also represent concepts. For organizing devices, we not only represent the device itself with a pico-based spime, we also create a spime for each interesting collection. For example, we might have two spimes, representing a multi-sensor array and a temperature sensor. If these are both installed in a hallway, we could create a spime representing the collection of sensors in the hallway. This spime stores and processes metadata for the hallway, including processing and aggregating data readings from the various sensors in the hallway.

Figure: MSA and Temperature sensors in a Hallway collection

Spimes can belong to more than one collection. The same two sensors that are part of the hallway collection might also, for example, be part of a battery collection that is collecting low battery notifications and is used by maintenance to ensure the batteries are always replaced. The battery collection could also be used to control the battery conservation strategies of various sensor types. For example, sensors could increase their sleep time after they drop below a given battery level. The battery spime would be responsible for managing these policies and ensuring all the spimes representing battery-powered devices were properly configured.

Figure: Multiple MSA and Temperature Sensors, with two in the hallway collection

Spimes and ESProto

Pico-based spimes provide an excellent platform for making use of ESProto and other connected devices. We can create a one-to-one mapping between ESProto devices and a spime that represents them. This has several advantages:

  • things don't need to be very smart—a low-power Arduino-based processor is sufficient for powering the ESProto, but it cannot keep up with the computing needs of a large, flexible collection of sensors without a significant increase in cost. Using spimes, we can keep the processing needs of the devices light, and thus inexpensive, without sacrificing processing and storage requirements.
  • things can be low power—the ESProto device is designed to run on battery power and thus needs to be very low power. This implies that they can't be always available to answer queries or respond to events. Having a virtual, cloud-based persona for the device enables the system to treat the devices as if they are always on, caching readings from the device and instructions for the device.
  • things can be loosely coupled—the actor-model of distributed computing used by picos supports loosely coupled collections of things working together while maintaining their independence through isolation of both state and processing.
  • each device and each collection gets its own identity—there is intellectual leverage in closely mapping the computation domain to the physical domain4. We also gain tremendous programming flexibility in creating an independent spime for each device and collection.

Each pico-based spime can present multiple URL-based channels that other actors can use. In the case of ESProto devices, we create a specific channel for the transducer to POST to. The device is tied, by the URL, to the specific spime that represents it.

Using ESProto with Picos

My lab is creating a general, pico-based spime framework. ESProto presents an excellent opportunity to design the new spime framework.

For this experiment, I used manually configured picos to explore how the spime framework should function. To do this, I used our developer tools to create and configure a pico for each individual sensor I own and put them in a collection.

I also created some initial rulesets for the ESProto devices and for a simple collection. The goal of these rulesets is to test readings from the ESProto device against a set of saved thresholds and notify the collection whenever there's a threshold violation. The collection merely logs the violation for inspection.

The Device Pico

I created two rulesets for the device pico: esproto_router.krl and esproto_device.krl. The router is primarily concerned with getting the raw data dump from the ESProto sensor and making sense of it using the semantic translation pattern. For example, the following rule, check_battery, looks at the ESProto data and determines whether or not the battery level is low. If it is, then the rule raises the battery_level_low event:

rule check_battery {
  select when wovynEmitter thingHeartbeat 
  pre {
    sensor_data = sensorData();
    sensor_id = event:attr("emitterGUID");
    sensor_properties = event:attr("property");
  }
  if (sensor_data{"healthPercent"}) < healthy_battery_level
  then noop()
  fired {
    log "Battery is low";
    raise esproto event "battery_level_low"
      with sensor_id = sensor_id
       and properties = sensor_properties
       and health_percent = sensor_data{"healthPercent"}
       and timestamp = time:now();
  } else {
    log "Battery is fine";    
  }
}

The resulting event, battery_level_low, is much more meaningful and precise than the large data dump that the sensor provides. Other rules, in this or other rulesets, can listen for the battery_level_low event and respond appropriately.

Another rule, route_readings, also provides a semantic translation of the ESProto data for each sensor reading. This rule is more general than the check_battery rule, raising the appropriate event for any sensor that is installed in the ESProto device.

rule route_readings {
  select when wovynEmitter thingHeartbeat
  foreach sensorData(["data"]) setting (sensor_type, sensor_readings)
    pre {
      event_name = ("new_" + sensor_type + "_reading").klog("Event ");
    }
    always {
      raise esproto event event_name attributes
        {"readings":  sensor_readings,
         "sensor_id": event:attr("emitterGUID"),
         "timestamp": time:now()
        };
    }
}

This rule constructs the event from the sensor type in the sensor data and will thus adapt to different sensors without modification. In the case of the MSA, this would raise a new_temperature_reading, a new_pressure_reading, and a new_humidity_reading from the sensor heartbeat. Again, other interested rules could respond to these as appropriate.

The esproto_device ruleset provides the means of setting thresholds. In addition, the check_threshold rule listens for new_*_reading events to check for threshold violations:

rule check_threshold {
  select when esproto new_temperature_reading
           or esproto new_humidity_reading
           or esproto new_pressure_reading
  foreach event:attr("readings") setting (reading)
    pre {
      event_type = event:type().klog("Event type: ");

      // thresholds
      threshold_type = event_map{event_type};
      threshold_map = thresholds(threshold_type);
      lower_threshold = threshold_map{["limits","lower"]};
      upper_threshold = threshold_map{["limits","upper"]};

      // sensor readings
      data = reading.klog("Reading from #{threshold_type}: ");
      reading_value = data{reading_map{threshold_type}};
      sensor_name = data{"name"};

      // decide
      under = reading_value < lower_threshold;
      over = upper_threshold < reading_value;
      msg = under => "#{threshold_type} is under threshold: #{lower_threshold}"
          | over  => "#{threshold_type} is over threshold: #{upper_threshold}"
          |          "";
    }
    if (under || over) then noop();
    fired {
      raise esproto event "threshold_violation" attributes
        {"reading": reading.encode(),
         "threshold": under => lower_threshold | upper_threshold,
         "message": "threshold violation: #{msg} for #{sensor_name}"
        }
    }
}

The rule is made more complex by its generality. Any given sensor can have multiple readings of a given type. For example, the MSA shown in the picture at the top of this post contains two temperature sensors. Consequently, a foreach is used to check each reading for a threshold violation. The rule also constructs an appropriate message to deliver with the violation, if one occurs. The rule conditional checks if the threshold violation has occurred, and if it has, the rule raises the threshold_violation event.

In addition to rules inside the device pico that might care about a threshold violation, the esproto_device ruleset also contains a rule dedicated to routing certain events to the collections that the device belongs to. The route_to_collections rule routes all threshold_violation and battery_level_low events to any collection to which the device belongs.

rule route_to_collections {
  select when esproto threshold_violation
	   or esproto battery_level_low
  foreach collectionSubscriptions() setting (sub_name, sub_value)
    pre {
      eci = sub_value{"event_eci"};
    }
    event:send({"cid": eci}, "esproto", event:type())
      with attrs = event:attrs();
}

Again, this rule makes use of a foreach to loop over the collection subscriptions and send the event upon which the rule selected to the collection.

This is a fairly simple routing rule that just routes all interesting events to all the device's collections. A more sophisticated router could use attributes on the subscriptions to pick what events to route to which collections.

The Collection Pico

At present, the collection pico runs a simple rule, log_violation, that merely logs the violation. Whenever it sees a threshold_violation event, it formats the readings and messages and adds a timestamp5:

rule log_violation {
  select when esproto threshold_violation
  pre {
    readings = event:attr("reading").decode();
    timestamp = time:now(); // should come from device
    new_log = ent:violation_log
                .put([timestamp], {"reading": readings,
                                   "message": event:attr("message")})
                .klog("New log ");
  }
  always {
    set ent:violation_log new_log
  }
}

A request to see the violations results in a JSON structure like the following:

{"2016-05-02T19:03:36Z":
     {"reading": {
         "temperatureC": "26",
      	 "name": "probe temp",
	 "transducerGUID": "5CCF7F0EC86F.1.1",
         "temperatureF": "78.8",
         "units": "degrees"
         },
       "message": "threshold violation: temperature is over threshold of 76 for probe temp"
     },
 "2016-05-02T20:03:18Z":
     {"reading": {
	 "temperatureC": "27.29",
	 "name": "enclosure temp",
	 "transducerGUID": "5CCF7F0EC86F.1.2",
	 "units": "degrees",
	 "temperatureF": "81.12"
         },
      "message": "threshold violation: temperature is over threshold of 76 for enclosure temp"
     },
     ...
}

We have now constructed a rudimentary sensor platform from some generic rules. The platform accommodates multiple collections and records threshold violations for any transducer the ESProto platform accepts.

Figure: An ESProto device, its associated pico, and a collection pico logging a threshold violation

A more complete system would entail rules that do more than just log the violations, allow for more configuration, respond to low battery conditions, and so on.

Spime Design Considerations

The spime framework that we are building is a generalization of the ideas and functionality developed for the Fuse connected-car platform. At the same time, it leverages the lessons of the Squaretag system.

The spime framework will make working with devices like ESProto easier because developers will be able to define a prototype for each device type that defines the channels, rulesets, and initialization events for a new pico. For example, the multi-sensor array could be specified using a prototype such as the following:

{"esproto-msa-16266":
    {"channels": {"name": "transducer",
                  "type": "ESProto"
                 },
     "rulesets": [{"name": "esproto_router.krl",
                   "rid": "b16x37"
                  },
     		  {"name": "esproto_device.krl",
                   "rid": "b16x38"
                  }
                 ],
     "initialization": [{"domain": "esproto",
                         "type": "reset"
                        }
                       ]
    },
 "esproto-temp-2788": {...},
 ...
}

Given such a prototype, a pico representing the spime for the ESProto multi-sensor array could be created by the new_child() rule action6:

wrangler:new_child() with
  name = "msa_00" and
  prototype = "esproto-msa-15266"

Assuming a collection for the hallway already exists, the add_to_collection() action would put the newly created child in the collection7:

spimes:add_to_collection("hallway", "msa_00") with
  collection_role = "" and
  subscriber_role = "esproto_device"

This action would add the spime named "msa_00" to the collection named "hallway" with the proper subscriptions between the device and collection.

Conclusions

This post has discussed the use of picos to create spimes, why spimes are a good organizing idea for the Internet of Things, and demonstrated how they would work using ESProto sensors as a specific example. While the demonstration given here is not sufficient for a complete ESProto system, it is a good foundation and shows the principles that would be necessary to use spimes to build an ESProto platform.

There are several important advantages to the resulting system:

  • Using the spime framework on top of picos is much easier than creating a backend platform from scratch.
  • The use of picos, with their actor-model of distributed programming, eases the burden associated with programming large collections of independent processes.
  • The system naturally scales to meet demand because of the architecture of picos.
  • The use of rules allows picos to naturally layer on functionality. Customizing a pico-based spime is easily accomplished by installing additional rulesets or replacing the stock rulesets with custom implementations. Each pico has a unique set of rulesets and consequently a unique behavior and API.
  • The hosted model of picos enables them to be created, programmed, and operated on one platform and later moved, without loss of functionality or any necessary reprogramming, to another platform. This supports flexibility and substitutability.

Notes:

  1. Wovyn is building sensor products based on ESProto. There's an ESProto Open Source Project that has prototyping PCBs.
  2. I believe this is pretty common in the commercial transducer space. Consumer products build a platform and link their devices to it to provide a simple user experience.
  3. The Squaretag platform was implemented on an earlier version of picos. Squaretag provided metadata about physical objects and was our first experiment with spimes as a model for organizing the Internet of Things.
  4. I am a big fan of domain-driven design and believe it applies as much to physical objects in the Internet of Things as it does to conceptual objects in other programming domains.
  5. Ideally, the timestamp would come from the device pico itself to account for message delivery delays and the collection would only supply one if it was missing, or perhaps add a "received" timestamp.
  6. The wrangler prefix identifies this action as being part of the pico operating system, the home of the operations for managing pico lifecycles.
  7. The spimes prefix identifies this action as the part of the framework for managing spimes. Collections are a spime concept.


Self-Sovereign Identity and Legal Identity

The Peace of Westphalia, which ended the Thirty Years' War in 1648, created the concept of Westphalian sovereignty, the principle of international law that "each nation state has sovereignty over its territory and domestic affairs, to the exclusion of all external powers, on the principle of non-interference in another country's domestic affairs, and that each state (no matter how large or small) is equal in international law."

The next century saw many of these states begin civil registration for their citizens. These registrations, from which our modern system of birth certificates springs, became the basis for personal identity and legal identity in a way which conflated these two concepts.

Birth certificates are both the source of identity and proof of citizenship. People present proof of civil registration for many purposes. The birth certificate is thus the basis for individual identity in most countries. We use our physical control of a piece of paper to prove who we are and, springing from that, our citizenship. Civil registration has become the foundation for how states relate to their citizens. As modern nation states have become more and more powerful in the lives of their citizens, civil registration and its attendant legal identity have come to play a larger and larger role in our lives.

Descartes didn't say "I have a birth certificate, therefore, I am." We are, obviously, more than a legal identity. Nevertheless, civil registration has been with us for almost four centuries, and most of us cannot conceive of any basis for trusted identity independent of it.

And yet, presently, 1.8 billion people are without this basic form of identity. As a result, they have difficulty getting basic government services. Most of these people are refugees displaced by war or territorial disputes, victims of famine or ethnic cleansing, outcasts from society, or victims of unscrupulous employers, smugglers, or organized crime. People who want to help them have difficulty because without legal identity they are illegible to state apparatus.

Without a birth certificate, people are in a bind. They have nothing upon which they can establish a legal identity and become legible to governments. And since birth certificates link identity and citizenship, both of these problems have to be solved at once, creating a paradox for authorities trying to help these unidentified people. In this system, they can't be made legible, and thus able to call on governments for aid or protections, without also being granted citizenship of some kind.

We are at a point in the development of identity where it is possible to develop and deploy technologies that allow individuals to create a self-sovereign basis for their identity, independent of civil registration.

Such systems allow us to tease apart the purposes of the birth certificate by recognizing a self-sovereign identity independent of the proof of citizenship. This doesn't, by itself, solve the problem of providing legal identity, since the self-sovereign identity is self-asserted. But it does provide a foundation upon which a legal identity could be built: specifically, an identifier that a person can prove they control. Constructing a legal identity on this self-sovereign identity is possible, but would require changes to existing statutes, rules, policy, and processes.

ID2020 is a summit being held at the UN in May with the goal to "by 2030, provide legal identity to all, including birth registration." For this to work, I believe we must succeed in recognizing one or more sources of self-sovereign identity that people can use to bootstrap the process. Such systems must be trustworthy enough that governments will be willing to use them as a basis for a legal identity. Governments must be persuaded to accept this self-declared identity as the basis for establishing a relationship to the state.

The self-sovereign identity will not be all that is needed. The self-sovereign identity won't, at first, be associated with any validated claims. It provides only the identifier that the person can prove they have control over. Beyond that, legal systems will have to provide a route for using that identity to validate the attributes needed for a person to form a recognized relationship with various governments and their agencies.

Please note that this doesn't require that the self-sovereign identity be controlled by the state, only that it be trustable by the state. This also doesn't mean that this same self-sovereign identity wouldn't be usable as the basis for identity in other administrative systems. And using the self-sovereign identity in administrative systems need not diminish its independence in any way.

I'm very excited about developments in self-sovereign identity. There is much to do, but I feel like I can finally see a way forward on an idea many in the identity community have been working on for a decade: to understand how people can use identity in a way that doesn't always rely on some administrative authority to grant that identity.


Note: Christopher Allen's The Path to Self-Sovereign Identity got me thinking about writing this post when he circulated an early draft of his post last week. As he points out, a lot of people have been working on this for many years. User-centric identity was the term we used in the early phase to refer to these ideas. The Internet Identity Workshop, which is holding its 22nd meeting this week, was founded to explore user-centric identity on the Internet. As I said, many people in the identity community have been pushing this idea for a long time. And there's finally some light at the end of the tunnel.

Chris is also hosting a Rebooting the Web of Trust workshop in New York in the two days after the UN Summit. This is the second #RebootingWebOfTrust Design Workshop on decentralized identity technologies. The first produced a number of white papers on these ideas.


We're Teaching Programming Wrong

We're teaching programming the wrong way. An interesting study of how people naturally express solutions to problems concludes that starting with imperative programming languages may not be the best way to teach programming skills. What works? Event-based and rule systems, naturally. Maybe we ought to use KRL for introductory programming. I'm willing to try it.

The majority of the statements written by the participants were in a production-rule or event-based style, beginning with words like if or when. However, the raters observed a significant number of statements using other styles, such as constraints, other declarative statements (that were not constraints), and imperative statements.

The dominance of rule- or event-based statements suggests that a primarily imperative language may not be the most natural choice. One characteristic of imperative languages is explicit control over program flow. Although imperative languages have if statements, they are evaluated only when the program flow reaches them. The participants’ solutions seem to be more reactive, without attention to the global flow of control.

Studying the Language and Structure in Non-Programmers’ Solutions to Programming Problems
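As a toy illustration of the contrast (mine, not the study's), here's a tiny rule-engine sketch in Python: statements register what to do when an event occurs, instead of relying on the program's flow of control to reach an if statement.

handlers = {}

def when(event_name):
    """Register a rule: 'when <event_name> happens, do this.'"""
    def register(fn):
        handlers.setdefault(event_name, []).append(fn)
        return fn
    return register

@when("door_open")
def turn_on_light(event):
    print("light on")

def raise_event(name, event=None):
    for fn in handlers.get(name, []):
        fn(event)

raise_event("door_open")    # reactive: no global flow of control needed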


Properties of Permissioned and Permissionless Blockchains

According to Robert Sams, there are three properties we want from a decentralized ledger system:

  1. Avoid forgery He calls these "sins of commission".
  2. Avoid censorship He calls these "sins of omission". Censorship might be hiding transactions, but it's more often going to be regulatory control over transactions.
  3. Avoid deletion He calls these "sins of deletion". This amounts to reversing transactions after they've been written in the ledger.

Sams's thesis is that this is kind of like the CAP theorem: we only get two of the three. Since (1) is necessary in almost all cases and quite easily solved through cryptographic means, we really get to choose between the other two.

Permissionless blockchains (like the ones in Bitcoin, Namecoin, etc.) optimize (2) over (3), while permissioned blockchain systems (like the ones in Ripple, Evernym, etc.) optimize (3) over (2).

Permissionless blockchains are distinguished by their use of proof-of-work (or proof-of-stake) to avoid Sybil attacks caused by the cheap pseudonym problem. Permissionless blockchain systems exhibit features that make banks and other existing institutions leery of them. They are hard to use in a regulatory regime where authorities want to exert control over them. This, for blockchain enthusiasts, is a feature, not a bug. There are places where (2) is more important than (3). In this, permissionless ledgers resemble cash, and most of the disadvantages people raise about Bitcoin are similar to those they'd raise about cash.

Permissioned blockchains have a governance process that selects validators and thus don't have to survive Sybil attacks, since the validators are known--there are no pseudonymous validators. Permissioned blockchains are useful where the ledger bumps up against physical, so-called "off-chain" transactions and has to make claims about, say, a property title. They value (3) over (2) in that case because (a) we have to rely on the legal mechanisms of the state to enforce the asset's ownership, since it's off-chain, and (b) those mechanisms won't rely on transactions that might be reversed (even if such reversals are unlikely and rare) by anonymous validators who can't be held legally responsible for their choices.

Permissionless blockchain cheerleaders will say "write the title in the chain". But there are a bunch of people with guns who have a monopoly on violence (i.e., governments) who are unlikely to relinquish their control of property records, etc. blithely.

So the world will likely be full of a mix of permissioned and permissionless ledger systems that don't interoperate, or do so only uneasily, for some time. The important thing to remember is that they can co-exist.