Self-Sovereign Identity and the Legitimacy of Permissioned Ledgers

Stone record at Behistun

My last blog post was about creating an Internet for identity: a decentralized system that allows people and organizations to create identities independent of intervening administrative authorities. The post describes this system as self-sovereign, and I call it a self-sovereign identity system, or SIS.

I believe the right way to construct an SIS is with a public, permissioned distributed ledger. Permissioned ledgers have important properties that make them especially useful for identity systems.

But permissioning implies governance. Someone has to determine who has permission to participate in approving transactions. If there are people making these kinds of decisions, how can a governed, permissioned ledger justify a claim that it supports self-sovereign identity?

John Locke and Sovereignty

John Locke was an English philosopher who had a big impact on the thinking of America’s founding fathers. Locke was concerned with power: who had it, how it was used, and how society was structured. More importantly, Locke’s theory of mind forms the foundation for our modern ideas about identity and independence.

Locke argued that “sovereign and independent” was man’s natural state and that we gave up our freedom, our sovereignty, in exchange for other things: protection, sociality, and commerce, among others. This grand bargain forms the basis for any society. As a community, the Internet proposes a similar bargain.

The goal of being self-sovereign isn't to be completely independent. With regard to the Internet: only machines without a network connection are completely independent. In the case of identity: only people without any relationships are completely independent. Seen from Locke's viewpoint, sovereignty is a resource each person combines with that of others to create society. Voluntarily giving up some of our rights to a state confers legitimacy on that state and its constitution.

Constitutional Orders and Legitimacy

Wikipedia defines legitimacy as

the right and acceptance of an authority, usually a governing law or a regime.

While this is most often applied to governments, I think we can rightly pose legitimacy questions for technical systems, especially those that have large impacts on people and society.

With respect to legitimacy, Philip Bobbitt says:1

The defining characteristic ... of a constitutional order is its basis for legitimacy. The constitutional order of the industrial nation state, within which we currently live, promised: give us power and we will improve the material well-being of the nation.

In other words, legitimacy comes from the constitutional order: the structure of the governance. Citizens grant legitimacy to constitutional orders that meet their expectations by surrendering part of their sovereignty to them.

Regarding constitutional orders, Bobbitt says the following:2

The constitutional order of a state and its strategic posture toward other states together form the inner and outer membrane of a state. That membrane is secured by violence; without that security, a state ceases to exist. What is distinctive about the State is the requirement that the violence it deploys on its behalf must be legitimate; that is, it must be accepted within as a matter of law, and accepted without as an appropriate act of state sovereignty. Legitimacy must cloak the violence of the State, or the State ceases to be. Legitimacy, however, is a matter of history and thus is subject to change as new events emerge from the future and new understandings reinterpret the past.

Without legitimacy, the state cannot take action because neither its citizens nor those on the outside who interact with it will see that action as authorized, and thus it won't be accepted. Again, I believe these same principles can be applied to technical systems with broad societal impact. Without the support of their users and the organizations that rely on them, technical systems fail to be accepted.

My goal in this article is to apply these ideas about sovereignty, legitimacy, and constitutional orders to identity systems. Identity systems that seek to have large adoption and a broad spectrum of applications must have the support of a large class of users (analogous to citizens) and many relying parties (analogous to other states). They make promises that users and relying parties use to judge their legitimacy. Without legitimacy, they cannot succeed in their goals.

Social login, using your social media account to log into other sites, has become a popular means of reducing the identity management burden for users and relying parties alike. As I said earlier, my last post analyzed an emerging class of identity systems based on distributed ledger technology that I called sovereign identity systems (SIS). My thesis is that we are in the early stages of a change in the constitutional order for identity systems, from social login to sovereign identity systems. The rest of this article will focus on these two constitutional orders for identity systems, the promises they make, and their claims to legitimacy.

The Legitimacy of Social Login

Social media sites have become the largest providers of online identity through the use of social login. When you use Facebook to log into a third party Web site (known in identity circles as a relying party), you are participating in an identity regime that has a particular constitutional order and granting it legitimacy by your participation. Further, the relying party has also chosen to recognize the legitimacy of social login.

The constitutional order of social login is found in the terms and conditions in the contracts of adhesion that social login identity providers impose on people and relying parties alike. The system is a "take it or leave it" proposition with terms that can be changed at will by the social login identity provider.

A constitutional order makes different promises to those in the system (the users) and those on the outside (the relying parties). Let's examine the promises that social login makes:

  • To people, social login says "use the identity we provide to you and we will make logging into sites you visit easy."

  • To relying parties, social login promises "use the identity we provide and trust us to accurately authenticate your users and we will reduce your costs, increase flexibility, and give you more accurate information about your users."

As successful as social login has been, there are a lot of places that social login has failed to penetrate. By and large, financial and health care institutions, for example, have not joined in to use social login. Why is this?

A constitutional theorist would say that social login has failed the legitimacy test. Some relying parties and some people (either completely or for some use cases) have declined to yield their sovereignty to it. Legitimacy ultimately rests on trust that the regime can keep its promises. When that trust is missing or lost, the regime suffers a legitimacy crisis.

For people, the lack of trust in social login might be from fear of identity correlation, fear of what data will be shared, or lack of trust in the security of the social login platform.

For relying parties, the lack of trust may result from the perception that the identity provider performs insufficient identity proofing or from the fear of outsourcing a critical security function (user authentication) to a third party. An additional concern is allowing a third party to have administrative authority over the relying party's users—not being in control of a critical piece of infrastructure. That is, relying parties fear that the rules of the game might change arbitrarily based on the fluctuating business demands of the identity provider.3

These trust failings ultimately stem from the structure of the trust framework, the constitutional order, of social login. Because it's based on terms and conditions imposed by the identity provider whose primary business is something else, people and relying parties alike have less confidence in the future state of the identity system. So, it's good enough for some purposes, but not all.

The Legitimacy of Distributed Ledger Identity Systems

A distributed ledger identity system, what we've been calling SIS, has a different constitutional order. As we discussed in An Internet for Identity, nobody owns SIS, everybody can use it, and anyone can improve it. There are no identity providers. Anyone can create multiple identities on SIS to suit their needs. SIS is a public infrastructure for identity.

There's another important structural difference: SIS allows third party claim issuers to read and write claims about identifiers on the ledger. For example, the DMV (Department of Motor Vehicles) might write a claim about your authority to drive to the ledger. This claim would be sharable, by you, with others who are willing to trust the DMV. Anyone you share this claim with can verify that it came from the DMV and that it hasn't been tampered with.
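
To make this concrete, here's a minimal sketch, in Python, of how such a claim might be signed and verified. The claim structure, identifiers, and choice of Ed25519 are my assumptions for illustration, not the format of any particular ledger:

```python
# Hypothetical sketch: the DMV signs a claim about an identifier and anyone
# the claim is shared with verifies it. The claim fields and identifiers
# are invented for illustration.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The DMV's signing key; in practice its public key would be discoverable
# via the DMV's identifier on the ledger.
dmv_key = Ed25519PrivateKey.generate()

claim = {"issuer": "sis:id:dmv-utah", "subject": "sis:id:alice-7f3a",
         "claim": {"licensed_driver": True}}
message = json.dumps(claim, sort_keys=True).encode()
signature = dmv_key.sign(message)   # the DMV writes (claim, signature)

# Anyone Alice shares the claim with checks that it came from the DMV and
# hasn't been tampered with.
try:
    dmv_key.public_key().verify(signature, message)
    print("claim verified")
except InvalidSignature:
    print("claim rejected")
```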

Claim issuers are distinguished from identity owners and relying parties.4 Their relationship with identity owners is that they issue claims about them. Relying parties rely on the claims that claim issuers issue when those claims are shared by the identity owner. Allowing third-party claim issuers to participate in the identity ecosystem opens up new and powerful opportunities for participants in an SIS ecosystem.

Because claim issuers and relying parties are both distinct from identity owners, we will refer to them collectively as "other parties."

SIS Identity Relationships

SIS makes the following promises:

  • To people: "Create as many identities as you like and use them how you see fit. You can keep them forever. We will ensure that you can use them privately and securely, sharing claims on those identities with other parties as you see fit."

  • To claim issuers: "Write claims about the identities with whom you have a relationship. We will ensure that your claims are secure and private (where appropriate)."

  • To relying parties: "Use claims made by yourself and others (when they are shared with you). We will ensure that these claims are secure and that you can trust them."

There are two ways we can presently realize SIS: with a permissionless ledger or a permissioned ledger. I'm going to assume that both of these are public, meaning anyone can use them. There are private, permissioned ledgers that are used within specific administrative domains. But SIS has to be public—everybody can use it—to meet the criteria of being like the Internet.

One factor in the constitutional order of permissionless and permissioned ledgers is who can validate (and thus, ultimately, write to) the ledger. While the ledger is public and anyone can use it, transactions have to be validated to prevent fraud.

In a permissionless system, anyone can be a validator. This leads to permissionless ledgers, like the Bitcoin blockchain, having to use some mechanism to ensure that validator voting is fair. Specifically, a permissionless ledger has to protect against Sybil attacks: if pseudonyms are cheap, cheaters can create as many as they like and use them in the validation process to vote for fraudulent transactions. Permissionless ledgers, therefore, use techniques like proof of work or proof of stake to increase the cost of participation and mitigate attacks.
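
To make the Sybil-resistance point concrete, here's a toy proof-of-work sketch in Python: finding a nonce whose hash has enough leading zero bits is expensive, while checking one is cheap. This is a deliberate simplification, not Bitcoin's actual difficulty scheme or block format:

```python
# Toy proof of work: expensive to produce, cheap to verify. A gross
# simplification of Bitcoin's scheme, for illustration only.
import hashlib
from itertools import count

DIFFICULTY = 16  # required number of leading zero bits

def proof_of_work(block: bytes) -> int:
    """Search for a nonce that makes the block's hash meet the difficulty."""
    for nonce in count():
        digest = hashlib.sha256(block + str(nonce).encode()).digest()
        if int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0:
            return nonce

def verify(block: bytes, nonce: int) -> bool:
    """One hash to check, so votes backed by work are expensive to fake."""
    digest = hashlib.sha256(block + str(nonce).encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0

nonce = proof_of_work(b"transactions...")
assert verify(b"transactions...", nonce)
```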

Permissioned systems, on the other hand, control who can be a validator. Sybil attacks are not possible when the validators are known beforehand. But this creates a different problem: someone has to vet the validators and ensure that they are known and play by the rules. When a validator breaks the rules, there has to be a judicial process to review the issue and possibly ban the offending node from participating in future transaction validations. Consequently, we need a governance process to identify validators, set rules for their behavior, bind them to contracts, and adjudicate their behavior.
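
By way of contrast, here's a minimal sketch of the permissioned approach. The validator names, quorum rule, and set-based bookkeeping are invented for illustration; real permissioned ledgers use signed votes and far more elaborate consensus protocols:

```python
# Toy permissioned validation: only vetted validators count toward a quorum,
# and governance (not code) decides who is on the list. Names are invented.
KNOWN_VALIDATORS = {"bank-a", "university-b", "ngo-c", "dmv-d", "telco-e"}
QUORUM = 3

def committed(signers: set) -> bool:
    """A Sybil attacker gains nothing by minting pseudonyms: signatures
    from unknown validators simply don't count."""
    return len(signers & KNOWN_VALIDATORS) >= QUORUM

def ban(validator: str) -> None:
    """The judicial process can remove a misbehaving validator."""
    KNOWN_VALIDATORS.discard(validator)

# Two Sybil identities are ignored; three known validators commit the block.
assert committed({"bank-a", "dmv-d", "telco-e", "sybil-1", "sybil-2"})
```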

These different structures will affect how well these two kinds of systems can keep the promises upon which an SIS stakes its legitimacy. I don't believe this is necessarily a competition; both kinds of ledgers will likely co-exist to meet different needs. Still, I believe that permissioned ledgers have an edge in establishing trust with claim issuers and relying parties alike, because a governance process allows a clear case to be made about trust and process.

In particular, we've seen instances in permissionless ledgers that led to blocks being orphaned or a smart contract being hacked. In both cases, code maintainers had to choose how to resolve the issue in the ledger. Their choice posed a legitimacy problem because they had to convince the validators to go along with them and support it. A permissioned system doesn't necessarily prevent these problems, but it can provide a clear, unambiguous judicial process for solving them. Everyone knows what happens when something goes wrong, and that is a clear advantage.

How will permissioned systems fare with people? That depends on the details of their constitutions. To the extent that governance is limited to controlling validators, and does not limit how identities are created and used in the system, people will see clear advantages in permissioned ledgers because those ledgers will attract claim issuers and relying parties that people want to interact with. Heavy-handed governance that limits individual control of identity, on the other hand, will turn people off and push them to permissionless systems where anything goes. This isn't binary; there's a spectrum of choices between these two extremes.

Surviving the Transition

In Bobbitt's theory of constitutional orders, transitions from one constitutional order to a new one always require war. After all, since a state's constitution represents a structure seeking legitimacy for its monopoly on violence, violence is necessarily a founding tenet of the state.

Technological systems don't use their constitutions in the same way. I believe one critical difference is that holding multiple citizenships in different online systems is entirely practical. Consequently, I believe people will continue to use social login alongside newer distributed-ledger identity systems for some time to come.

Claim issuers and relying parties might be another story. There very well could be a competition for other parties between different identity regimes and that is where I believe we'll see the benefits of these different constitutional orders being carefully weighed and hotly debated.

At least for some time, social login is going to win the "we have the most users" contest. But as we saw with Internet adoption, that didn't hold true very long for CompuServe, AOL, and Prodigy (among others). AOL was the only online service company to survive the transition. If SIS can fulfill its promise, the scales could tip swiftly as more and more claim issuers and relying parties see the benefit of SIS and adopt it, bringing their users with them.

Different sovereign identity systems will have different constitutions, differing not just in whether they are permissionless or permissioned, but also within those broad classes. For example, there can be significant differences in the governance of permissioned ledgers. The most successful ones will be more like republics or democratic governments: they exert authority, but only that which is granted to them by the users, for the sake of the users.


  1. Bobbitt, Philip. The Garments of Court and Palace: Machiavelli and the World That He Made (Kindle Locations 462-464).

  2. Bobbitt, Philip. The Shield of Achilles: War, Peace, and the Course of History.

  3. Note that identity providers in the social login regime are not primarily in the business of providing identity. Their business is something else (mostly selling ads) and providing identity for social login is, from their perspective, part of serving that end.

  4. Note that in practical usage, any given entity might play any of these three roles at different times.


An Internet for Identity

Internet

In World of Ends, Doc Searls and Dave Weinberger enumerate the Internet's three virtues:

  1. No one owns it.
  2. Everyone can use it.
  3. Anyone can improve it.

If we wanted to build an identity system that was like the Internet, we'd want it to have those same virtues. To make the discussion below easier, let's call that system SIS (for sovereign identity system).

No One Owns It

Every online identity you have was given to you by someone else.

This simple fact makes every online identity completely different from identity in the physical world where you exist first, independently, as a sovereign human being.

As a result, online relationships are skewed. There's an imbalance of power between people and organizations online. Here's why. Online identity looks like this:

Fig 1: Current online identity model

To do business with Amazon, you have to create an account. So do I. We both get a relationship with Amazon based on an identity that we create within Amazon's namespace. That identity is subject to the (mostly unread) Terms and Conditions that Amazon places on the use of its service. Your use of the account is subject to whatever restrictions Amazon chooses to place on it. These can be changed retroactively. Furthermore, Amazon can take the account away at any time and you have very little recourse. Clearly Amazon owns the account and is letting you use it so long as such use suits their goals.

Of course, Amazon isn't unique here. This is how identity works online. Everyone knows that. For identity to be different, we'd need a way for people to create online identities that they control.

Such an identity system could turn this diagram inside-out, resulting in a picture where you are at the center:

Fig 2: A sovereign identity model

In SIS, individuals, businesses, and other organizations establish identities that exist independently of the other identities in the system. Those identities are peers. Of course, being at the center is a matter of perception. The reality looks more like the Internet:

Fig 3: Peer to peer relationships in a sovereign model

Everyone Can Use It

Every online identity you have is subject to someone else granting you permission.

Anyone can use the Internet. No one has to give you permission. And no one can cut you off. You do need to get an IP address, but that's not a significant burden—they're widely available from multiple sources (and IPv6 was created to reduce that burden even further). Once you and I have addresses, we can exchange IP packets to our heart's content. If your ISP cuts you off, you get another (they're substitutable), give me your new address, and we're back to exchanging packets.

An Internet-like identity system like SIS would be public. Anyone should be able to use it without getting permission from a system administrator or having to agree to terms and conditions that are changed arbitrarily or controlled and adjudicated by a closed process without recourse. An Internet-like identity system should be built so that everyone can participate on equal footing.

Anyone Can Improve It

You can only improve identity systems in ways their owners allow.

The Internet gets improved by lots of people every day—most of whom have no formal relationship with the Internet's governance bodies. This happens in a couple of ways:

First, there is an open process for improving the system. On the Internet that happens through open protocols and open source code. This isn't a free-for-all. There are governance processes that control how these improvements are vetted and incorporated.

Second, anyone can design and build a new service on top of these protocols. DNS, email, and other services are all built on top of the Internet. So are the World Wide Web and things like Amazon, Facebook, and Google.

An Internet-like identity system should allow these same kinds of improvements. While governance is necessary, that doesn't diminish the virtues of the open platform. One of the paradoxes of decentralized platforms is that they require more formal governance than centralized ones.

Because it's public, anyone can use SIS for anything its protocols allow. That freedom gives rise to innovation. I fully expect that SIS would be used for purposes and in ways that its designers could never imagine—just like the Internet.

Properties

As I wrote in The CompuServe of Things, the Internet works because it's (1) decentralized, (2) heterarchical, and (3) interoperable. These properties and others follow from the virtues we've discussed above.

Decentralized—decentralization follows directly from the fact that no one owns it. Ownership is the primary criterion for judging the degree of decentralization in a system. And as I mentioned above, not only can a system be decentralized and governed, some level of governance is necessary for decentralized systems to function well.

Heterarchical—a heterarchy is a "system of organization where the elements of the organization are unranked (non-hierarchical) or where they possess the potential to be ranked a number of different ways." The diagram in Figure 3 (above) shows identities in relationships with each other as peers. This is a heterarchy; there is no inherent ranking of nodes in the architecture of the system.

Interoperable—regardless of what providers or systems we use to connect to SIS, we should be able to interact with any other principals who are using it. We don't have to worry that our identity will only work on some specific provider's systems. And given that it's based on open protocols and software, other identity systems should be able to interoperate with it as well.

Substitutable—SIS is a protocol, an agreement if you will, about how systems that use it must behave to achieve interoperability. That means that anyone who understands the protocol can write software that uses SIS. The end result is that while there will likely be software, systems, and companies who provide access to and services on SIS, your identity and your use of it don't depend on any one of them. There are usable substitutes that provide choice and freedom.

Reliable—people, businesses, and others must be able to use it without worrying that it will go down, stop working, go up in price, or get taken over by someone who would do it, and those who use it, harm. This is larger than mere technical trust that a system will be available. It extends to the business model (or lack thereof) and the governance—the full BLT (business, legal, and technical) stack.

Non-proprietary—no one has the power to change SIS by fiat or take away a person's identity. Further, it can't go out of business and stop operating because its maintenance and operation are distributed instead of being centralized in the hands of a single organization. Because SIS is more of an agreement than a technology or system, so long as those using it agree to work together according to its principles, it will continue to work—just like the Internet.

Frequently Asked Questions

  1. Is SIS an identity provider? No. There are no identity providers in the way we've come to think of them. The whole notion of providing an identity to someone is based in the administrative identity realm.
  2. Why would businesses use SIS? Most businesses see their identity systems as a cost center, not a profit center. And yet they're reluctant to turn over access to their system to some other entity who may turn into a competitor someday. SIS solves this problem by providing an identity system that the business doesn't have to maintain and is non-proprietary. This is a big win for businesses across the Internet because it reduces friction and cost at the same time.
  3. Does this mean that businesses won't have their own accounts? The existence of SIS doesn't mean that businesses don't still need to keep track of their customers, record preferences, and so on. Take Amazon as an example. While SIS could be used to provide credit card and address information to Amazon, they'd still want to know who their customers are and provide wish lists, shopping carts, and so on. Amazon might choose to store the Amazon-specific identity information about a person on SIS.
  4. What would SIS be used for? The short answer is that people and organizations would write claims on SIS about identities in the system. These claims might be public, public and verifiable, encrypted, jointly owned, or self-asserted. Identity owners could make use of these claims in a variety of ways. For example, my bank might write a claim on SIS that I control a particular credit card—encrypted, of course. I could then provide that validated credit card information to others when I needed to make a payment (see the sketch after this list). In some ways it feels like 1Password, KeePass, or iCloud Keychain, but on a much grander scale and far more flexible.
  5. How could SIS be built? A few years ago, SIS was a dream people had, but without a clear technical path to achieving it. The introduction of blockchain technology changed that. Over the last few years, research in distributed ledger technology has exploded and SIS is now not only possible, but versions of it are being conceived and built by several organizations. I prefer a permissioned distributed ledger over permissionless for SIS.
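
As promised in question 4, here's a minimal sketch of an encrypted claim. The claim structure is invented, and I use simple symmetric encryption (Fernet) to keep the example short; a real SIS would more likely encrypt to the recipient's public key:

```python
# Hypothetical sketch: a bank writes an encrypted claim to the ledger that
# only the identity owner, and those she shares the key with, can read.
import json
from cryptography.fernet import Fernet

claim_key = Fernet.generate_key()   # held by the identity owner

claim = {"issuer": "sis:id:first-bank", "subject": "sis:id:alice-7f3a",
         "claim": {"card": "4111 0000 0000 1111", "expires": "2018-06"}}

# The ciphertext is what actually gets written on the ledger.
ciphertext = Fernet(claim_key).encrypt(json.dumps(claim).encode())

# Sharing the claim means sharing claim_key (or, more realistically,
# re-encrypting the claim for the merchant's public key).
merchant_view = json.loads(Fernet(claim_key).decrypt(ciphertext))
```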

An Internet for Identity

The Internet was created without any way for people, organizations, and other entities to be identified. On the Internet, only machines get identities in the form of IP numbers. This is understandable given what the creators of the Internet were trying to achieve. But the lack of a decentralized, heterarchical, and interoperable identity system has created an environment where the services most people use online are a lot more centralized than the Internet they exist upon.

We're finally at a point where that failing can be rectified. Multiple players are working on sovereign identity systems that use distributed ledgers to create identity systems that are owned by no one, can be used by everyone, and can be improved by anyone. These systems will result in increased flexibility for people, businesses, and others as well as enabling new innovations in online services.


Credits

Thanks to Timothy Ruff of Evernym for the big dot/little dot figures.


Decentralization and Distributed Ledgers

DNS

Last week, I referenced an article in American Banker on the responsibilities of blockchain developers. I focused mainly on the governance angle, but the article makes several pokes at the "decentralization charade" and that's been bothering me. The basic point being that (a) there's no such thing as a blockchain without governance (whether ad hoc or deliberate) and (b) governance means that the ledger isn't truly decentralized.

In Re-imagining Decentralized and Distributed, I make the distinction between distributed and decentralized by stating that decentralized systems are composed of pieces that are not under the control of any single entity. By that definition, DNS, for example, is a pretty good example of a decentralized service since it's composed of servers run by millions of separate organizations around the world, cooperating to map names to IP numbers. There are others including email, the Web, and the Internet itself.

But DNS is clearly subject to some level of governance. The protocol is determined by a standards body. Most of the DNS servers in the world run an open-source DNS server called BIND that is managed by the Internet Systems Consortium. Domain names themselves are governed by rules put in place by ICANN. There is a group of people who control, for better or worse, what DNS is and how it works.

So, is DNS decentralized? I maintain that DNS is decentralized, despite a relatively small set of people who, together, govern it. Here's why:

First, we have to recognize that decentralization is a continuum, not a binary proposition. Could we imagine a system for mapping names into IP numbers that is more decentralized? Probably. Could we imagine one less decentralized? Most certainly. And given how DNS is governed, there are a multitude of entities who have to agree to make significant changes to the overall operation of the DNS system.

Second, and more important, the governance of the DNS system is open. Structurally, it's difficult for those who govern DNS to make any large-scale change without everyone knowing about them and, if they choose, objecting.

Third, the kinds of decisions that can be made by the governance bodies are limited, in practice, by the structure of the system, the standards, and the traditions of practice that have grown up around it. For example, there is a well-defined process for handling domain name disputes. Not everyone will be happy with it, but at least it exists and is understood. Dispute resolution, as one example, is not ad hoc, arbitrary, or secret.

Lastly, the DNS system may be governed by a relatively small set of people and organizations, but it's run by literally millions. People running DNS servers have a choice about what server software they run. If enough of them decided to freeze at a particular version because they objected to changes, or to fork the code, they could effectively derail an unpopular decision.

Distributed ledgers will have varying levels of decentralization depending on their purpose and their governance model and how that model is made operational. The standard by which they should be judged is not "does any human ever make a decision affecting the ledger" but rather:

  1. Is the ledger as decentralized as we can make it while achieving the ends for which the ledger was created?
  2. Is the governance process open? Who can participate? How are the governing entities chosen?
  3. How light is the governance? Are the kinds of decisions the governing bodies can make limited by declared process?
  4. Is the operation of the system dependent on the voluntary participation of entities outside the governing bodies?

Distributed ledgers are young and the methods and modes of governance, along with those entities participating in their governance, are in flux. There are many decisions yet to be made. What's more, there's not one distributed ledger, but many. We're still experimenting with what will work and what won't.

While a perfectly decentralized system may be beyond our reach and even undesirable for many reasons, we can certainly do better than the centralized systems that have grown up on the Web to date. Might we come up with even more decentralized systems in the future? Yes. But that shouldn't stop us from creating the most decentralized systems we can now. And for now, we've seen that governance is necessary. Let's keep it light and open and move forward.


Governance for Distributed Ledgers

Fiduciary Trust Building

This article by Angela Walch from American Banker makes the (excessively snarky) case that distributed ledger developers and miners ought to be held accountable as fiduciaries.

Non-permissioned distributed ledgers like Ethereum will continue to serve important needs, but organizations like banks, insurance companies, credit unions, and others who act as fiduciaries and must meet regulatory requirements, will prefer permissioned ledgers that can provide explicit governance. See Properties of Permissioned and Permissionless Blockchains for more on this.

Governance models for permissioned ledgers should strike a careful balance between what’s in the code and what’s decided by humans. Having everything in code isn’t necessarily the answer. But having humans too heavily involved can open the system up to interference and meddling—both internal and external.

Permissioned ledgers also need to be very clear about what the procedures are for adjudicating problems with the ledger. They can’t be seen as ad hoc or off the cuff. We must have clear dispute resolution procedures and know what disputes the governance system will handle and those it won't.

Governance in permissioned distributed ledgers provides a real solution to some of the ad hoc machinations that have occurred recently with non-permissioned blockchains.


Service Integration Via a Distributed Ledger

ledger

Consider a distributed ledger that provides people (among other principals) with an identity and a place to read and write, securely and privately, various claims. As a distributed ledger, it's not controlled by any single organization and is radically decentralized and distributed.

In the following diagram, the Department of Motor Vehicles has written a driver's license record on the distributed ledger. Later, John is asked to prove his age at Walmart. John is involved in permissioning both the writing and reading of the record. Further, the record is written so that John doesn't have to disclose the entire driver's license, just the fact that he's over 18.

A Distributed-Ledger Integration

Walmart and the DMV are interacting despite the lack of explicit integration of their systems. They are interacting via a distributed ledger that provides secure and private claim presentment. Further, John (the person they're talking about) is structurally part of the conversation. I call this sovereign-source integration since it's based on sovereign-source identity.

Even if there were 20 different distributed ledger systems that Walmart had to integrate with, that's still less work than integrating with every DMV. And they can now write receipts when you shop or read transcripts when you apply for a job—all with your permission, of course.

Security and privacy are ensured by the proper application of cryptography, including public-private key pairs, digital signatures, and cryptographic hashes. This isn't easy, but it's doable. There's nothing about the scenario I'm painting that is waiting on some technology revolution. Everything we need is available now.
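
As one simple illustration of the over-18 example, the sketch below has the DMV sign a minimal attestation instead of sharing the whole license. Zero-knowledge proofs are a more sophisticated route to selective disclosure; the identifiers and claim format here are invented:

```python
# Illustrative only: the DMV signs a minimal "over 18" attestation that John
# can present at Walmart without revealing the rest of his license.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

dmv_key = Ed25519PrivateKey.generate()

license_record = {"name": "John", "birthdate": "1971-04-01",
                  "license_no": "UT-1234567"}   # never leaves John's control

attestation = json.dumps({"subject": "sis:id:john", "over_18": True},
                         sort_keys=True).encode()
signature = dmv_key.sign(attestation)

# John presents only (attestation, signature); Walmart verifies against the
# DMV's public key and never sees the license itself.
dmv_key.public_key().verify(signature, attestation)
print("age verified without disclosing the license")
```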

I wrote a post a few weeks ago about how sovereign-source integration helps solve the problems of building a virtual university. In that article, the student profile (including an LRS) is the distributed, personally controlled integration point. The information in the student profile might all be written as claims on a distributed ledger, but it could also live in some off-ledger system that the distributed ledger just points to. Either way, once the student has provided the various institutions participating in the virtual university with their integration point, the various university systems are able to work together through the integration point instead of needing point-to-point integrations.

The Virtual University

The world is too big and varied to imagine that we can scale point-to-point integrations to cover every imaginable use case. The opportunities for this architecture in finance, healthcare, egovernment, education, and other areas of human interaction boggle the mind. Sovereign-source integration is a way to cut the Gordian knot.


Pico Labs at Open West

The students in my lab at BYU are running a booth at OpenWest this year. OpenWest is one of the great open source conferences in the US; there are 1400 people here this year. When the call for papers came out, I missed the deadline. Not to worry: I decided to sponsor a booth. That way my students can speak for three days instead of an hour. Here's what they're demoing at OpenWest this week.

A while back, I wrote a blog post about my work with the ESProto sensors from Wovyn. Johannes Ernst responded with an idea he'd had for a little control project in his house. He has a closet with computers in it that sometimes gets too hot. He wanted to automatically control some fans and turn them on when the closet was too hot. I asked my students—Adam Burdett, Jesse Howell, and Nick Angell—to mock up that situation in an old equipment box.

Physically, the box has two pancake fans on the top, a light bulb as a heat source, an ESProto temperature sensor inside the box, and another outside the box. There's a Raspberry Pi that controls the light and fans. The RPi presents an API.

We could just write a little script on the RPi that reads the temperatures and turns fans on or off. But that wouldn't be much fun. And it wouldn't give us an excuse to work on our vision for using picos to create communities of things that cooperate. Granted, this example is small, but we've got to start somewhere.

The overall design uses picos to represent spimes for the physical devices: two fans and two temperature sensors. There is also a pico to represent the community of fans and one to represent the closet, the overall community to which all of these belong. The following diagram illustrates these relationships.

Pico Structure for the Closet Demo

The Fan Collection is an important part of the overall design because it abstracts and encapsulates the individual fans so that the closet can just indicate it wants more or less airflow without knowing the details of how many fans there are, how fans are controlled, whether they're single or variable speed, and so on. The Fan Collection manages those details.

That's not to say that the Fan Collection knows the details of the fans themselves. Those details are abstracted by the Fan picos. The Fan picos present a fairly straightforward representation of the fan and its capabilities.
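
In code, the layering might look something like the sketch below. The actual picos are KRL rulesets exchanging events, so these Python classes and method names are just my shorthand for the relationships, not Pico Labs code:

```python
# Shorthand sketch of the abstraction layers; the real picos are KRL
# rulesets exchanging events, not Python objects.
class FanPico:
    """Represents one fan and hides how it's controlled (pin, speed, etc.)."""
    def __init__(self, pin: int):
        self.pin, self.running = pin, False

    def set_running(self, on: bool) -> None:
        self.running = on   # the real version would call the Raspberry Pi API

class FanCollectionPico:
    """Lets the closet ask for more or less airflow without knowing how many
    fans there are or how each one is controlled."""
    def __init__(self, fans: list):
        self.fans = fans

    def request_airflow(self, level: float) -> None:
        # Simple policy: run enough fans to meet the requested level (0..1).
        needed = round(level * len(self.fans))
        for i, fan in enumerate(self.fans):
            fan.set_running(i < needed)

closet_fans = FanCollectionPico([FanPico(17), FanPico(27)])
closet_fans.request_airflow(0.5)   # the closet pico just asks for airflow
```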

This demo provides us with a project to use Wrangler. Wrangler is the pico operating system that Pico Labs has been working on for the last year. Wrangler is a follow-on to CloudOS, a pico control system that we built at Kynetx and that was the code underlying Fuse, the connected-car platform we built. Wrangler improves on CloudOS by taking its core concepts and extending and normalizing them.

The primary purpose of Wrangler is pico life cycle management. While the pico engine provides methods for creating and destroying picos, installing rulesets, and creating channels, those operations are low-level—using them is a lot of work.

As an example of how Wrangler improves on the low-level functions in the pico engine, consider pico creation. Creating a useful child pico involves the following steps:

  1. create the child
  2. name the child
  3. install rulesets in the child
  4. initialize the child
  5. link the child to other picos using subscriptions

Wrangler uses the concept of prototypes to automate most of this work. For example, a developer can define a prototype for a temperature sensor pico. Then using Wrangler, temperature sensor picos, with the correct configuration, can be created with a single action. This not only reduces the code a developer has to write, but also reduces configuration errors.
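
Here's a rough, Python-flavored sketch of the prototype idea. Wrangler itself is written in KRL and its actual API differs; the prototype fields and functions below are invented to show how one action can cover all five steps:

```python
# Hypothetical illustration of prototypes, not Wrangler's actual API: a
# prototype bundles the five steps so a configured child pico is one call.
class Pico:
    """Stand-in for a pico; the engine exposes these as low-level operations."""
    def __init__(self, name: str = "root"):
        self.name, self.rulesets, self.config, self.subscriptions = name, [], {}, []

    def create_child(self) -> "Pico":
        return Pico()

PROTOTYPES = {
    "temperature_sensor": {
        "rulesets": ["io.picolabs.temperature", "io.picolabs.safety"],
        "config": {"threshold": 75},
        "subscriptions": ["sensor_community"],
    }
}

def create_child_from_prototype(parent: Pico, name: str, prototype: str) -> Pico:
    proto = PROTOTYPES[prototype]
    child = parent.create_child()                        # 1. create the child
    child.name = name                                    # 2. name the child
    child.rulesets.extend(proto["rulesets"])             # 3. install rulesets
    child.config.update(proto["config"])                 # 4. initialize the child
    child.subscriptions.extend(proto["subscriptions"])   # 5. link via subscriptions
    return child

sensor = create_child_from_prototype(Pico(), "closet-temp", "temperature_sensor")
```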

The great thing about going to a conference—as a speaker or an exhibitor—is that it gives you a deadline for things you're working on. OpenWest provided just such an excuse for us. The demo drove thinking and implementation. If you're at OpenWest this week, stop by and see what we've done and ask some questions.


A System Ruleset for the Pico Engine

I have a problem: a long time ago, Kynetx built a ruleset management tool called AppBuilder. There are some important rulesets in AppBuilder. I'd like to shut down AppBuilder, but first I need to migrate all the important rulesets to the current ruleset registry. There's just one tiny thing standing in my way: I don't know which rulesets are the important ones.

Sure, I could guess and get most of them. Then I'd just wait for things to break to discover the rest. But that's inelegant.

My first thought was to write some code to instrument the pico engine. I'd increment a counter each time it loads a ruleset. That way I'd see what's being used. No guessing. I'd need some way to get stuff into the database and get it out.

But then I had a better idea. Why not write instrumentation data into the persistent variable space of a system ruleset? The system ruleset can access and modify any of these variables. And it's flexible: rather than making changes to the engine and rolling to production each time I change the monitoring, I update the system ruleset.

Right now, there's just one variable: rid_usage. The current system ruleset is simple. But it's a start. All the pieces are in place now to use this connection for monitoring, controlling, and configuring the pico engine.
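
Conceptually, the instrumentation amounts to something like the sketch below. The real version lives in the pico engine and a KRL system ruleset with persistent entity variables; the function and variable names here are invented for illustration:

```python
# Invented sketch of the idea: bump a counter in the system ruleset's
# persistent variable space on every ruleset load.
from collections import Counter

SYSTEM_RULESET_VARS = {"rid_usage": Counter()}

def load_ruleset(rid: str) -> None:
    # ... fetch, parse, and register the ruleset ...
    SYSTEM_RULESET_VARS["rid_usage"][rid] += 1   # instrument every load

for rid in ["a16x66", "b17x22", "a16x66"]:
    load_ruleset(rid)

# After running a while, the counts show which AppBuilder rulesets matter.
print(SYSTEM_RULESET_VARS["rid_usage"].most_common())
```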

I like this idea a lot because KRL is being used to implement important services on the platform that implements KRL. Very meta... And when systems start to be defined in their own language, that's a good thing.


Failure and the Internet of Things

Summer Sprinkler

I'm now on my second Internet-connected sprinkler controller. The first, a Lono, worked well enough although there were some features missing. Last week, I noticed that the program wasn't running certain zones. I wasn't sure what to do and I couldn't find help from Lono, so I decided I'd try a second one. My second purchase, based on both friends' recommendations and reviews on Amazon, was a Rachio. I installed it on Saturday.

As I was working on setting up the programs and experimenting with them I noticed that the new sprinkler controller had stopped working. When I went to check on it, I discovered that it was completely dead: no lights, no response.

I rebooted the controller and started over. It got to the same point and the sprinkler controller died again. A little thought showed that the Rachio sprinkler controller was dying at exactly the same point that the Lono was failing to complete its program. The problem? A short in one of the circuits.

The Lono and the Rachio both fail at handling failure. The old controller, an Irritrol, just dealt with it and kept right on going. None of them, including the Irritrol, did a good job of telling me that I had a short-circuit.

Building sprinkler controllers is a tough job. The environment is dirty and wet. The valves and sensors are numerous and varied. I don't know about you, but it's a rare year I don't replace valve solenoids or rewire something. A sprinkler controller has to roll with this environment to pass muster. To be excellent, it has to help with debugging and solving the problems.

Fancy water-saving features, cool Web sites, and snazzy notifications are fine. But they're like gold-plated bathroom fixtures in a hotel room with dirty sheets if the controller doesn't do its basic job: run the sprinklers reliably.


Fitbit as Gizmo

Fitbit

In the taxonomy of Bruce Sterling's Shaping Things, the Fitbit is a Gizmo.

"Gizmos" are highly unstable, user-alterable, baroquely multifeatured objects, commonly programmable, with a brief lifespan. Gizmos offer functionality so plentiful that it is cheaper to import features into the object than it is to simplify it. Gizmos are commonly linked to network service providers; they are not stand-alone objects but interfaces. People within an infrastructure of Gizmos are "End-Users."

People buy Fitbits believing that they're buying a thing, but in fact, they're buying a network service. The device is merely the primary interface to that service. The Fitbit is useless without the service. Just a chunk of worthless plastic and silicon.

The device is demanding. We buy Fitbits and then fiddle with them incessantly. Again, to quote Bruce:

...Gizmos have enough functionality to actively nag people. Their deployment demands extensive, sustained interaction: upgrades, grooming, plug-ins, plug-outs, unsought messages, security threats, and so forth.

Sometimes we're messing with them because we're bored and relieve the boredom with a little configuration. Often we're forced to configure and reconfigure because it's not working. We feel guilt over buying something we're not using. Usually, the Fitbit ends up unused in a drawer after the guilt wears off and the pain of configuration overwhelms the perceived benefit.

Fitbit isn't selling things. They probably fancy themselves selling better health or fitness. But, Fitbit is really selling a way to measure, and perhaps analyze, some aspect of your life. They package it up like a traditional product and put it on store shelves, but the thing you buy isn't a traditional product. Without the service and the account underlying it, you have nothing.

Of course, I'm not talking about Fitbit alone. Fitbit is just a well-known example. Everything I've said applies to every current product in the so-called Internet of Things. They are all just interfaces to the real product: a networked service. I say "so-called" because a more appropriate name for the Gizmo ecosystem is CompuServe of Things.

Bruce's book is a trail guide to what comes after Gizmos: something he calls a "spime." Spimes are material instantiations of an immaterial system. They begin and end with data. Spimes will have a different architecture than the CompuServe of Things. To work, they will cooperate and interact in true Internet fashion.


Notes from Gluecon 2016

I took the following notes during various sessions at Gluecon 2016 at the Omni Interlocken in Broomfield Colorado. Notes were live tweeted during the event on @windley using Kevin Marks' Noter Live tool.

Melody Meckfessel:

starting off the day with how cloud accelerates innovation in software development

Three waves of cloud tools: Colocation, virtualized data centers, and 3rd wave: actual, global, flexible cloud

Goal is NoOps: auto everything. No need to manage or spin up servers. Write code, rather than manage servers

Kubernetes manages containers, supports multiple envs & container runtimes, 100% open source

Speaking of developers: "We keep raising the bar on ourselves"

Goal is to let developers focus on code. PaaS (e.g. appEngine) needs to evolve

Now: PaaS is a walled garden; Future: choice of tools, more complex apps, global scale

31% of time spent troubleshooting; often in time-critical situations; need better tools: trace, error reporting, prod debug

Duncan Johnson-Watt:

Up next: building apps on the blockchain

Hyperledger is new project from Linux Foundation.

Business is increasingly interested in permissioned blockchains rather than promiscuous or permissionless blockchains

Governance of blockchain will be incredibly important if we're going to bet on this technology

Requirements for blockchains vary greatly across different use cases

"This is too important to be owned by a single entity" speaking of distributed ledger tech

Hyperledger has ~50 members, >$6M in funding, 2300 membership requests

Brian Behlendorf is now executive director

Showing how to deploy a blockchain application with Cloudsoft AMP

Key concepts: shared ledger, smart contract, consensus network, membership, events, management, wallet, integration

Live demo of asset transfer. #brave Challenge is speed at the moment. Proof of asset ownership controls transfer

Mary Scotton:

want to be diverse? Start by diversifying your twitter feed.

Jen Tong:

Crash course in electrical engineering at start of #IoT session

Done with signals; moving on to components. "kind of like legos except sometimes they catch on fire"

bread boards, perf boards, jumpers, resistors, LEDs, push button, capacitor, servo motors

The recipe: put components on bread board, arduino uno converts component signals, feeds to RaspPI over USB

Johnny-Five is a JS framework for #IoT on the Arduino Uno

Firebase is a real-time database; "data that is best served fresh"

Where's the Bus? as an example of real-time data. I care where the bus is now, not yesterday.

Collaborative drawing is another example where real-time matters. Much less interesting with several second lag

When you're working on your Arduino always do it unplugged or you'll be sad.

After getting the button and LED hooked up: we have a thing, but no Internet yet. Let's add Firebase

why live code when you can live copy/paste

Firebase shows button presses in FB console. "Now, let's go through the Internet to change the LED status"

Connects button to her slides. Button adds rick-roll to the slide.

Celebrate the first time something catches on fire

Slides are here: http://mimming.com/presos/internet-of-nodebots/index.html#/

Alex Balazs:

speaking on how Intuit is breaking up the monolith

Intuit moving everything to AWS; shutting down local data centers

TurboTax is a $2 Billion business; product managers don't want to touch it (other than updating tax logic)

our vision is to make tax prep obsolete; this makes TurboTax irrelevant

20yo tech stack; terrible, horrible, monolith. Written by tax specialists who became programmers.

Going beyond the interview to personal experience. Why ask a 20yo barista in NYC if they get California RR Retirement?

Can't replace TurboTax by creating something complex to replace it. #FAIL'd twice already. #gallslaw

2nd problem: trying to create better TurboTax instead of creating product to kill TurboTax

Breaking up the TurboTax monolith: everything as a service; quickly create frictionless experiences

Teams work at their own speed; teams are decoupled; services built for other teams.

path forward to create a pirate ship. Everyone wants to be a pirate.

pirate ship means "this was not a sanctioned project"

took the narrowest part of TurboTax: vehicle registration, 3 screens; TurboTax interview has 53K screens

hardest problem to solve in TurboTax: what does back button do.

back button takes you back a screen; should it save the data or not?

Built vehicle registration in 4 weeks and pushed to production; Sr leadership then sanctioned project;

Old stack: changing user experience took 3 months. New stack: 1 week.

Now 14 most common topics in TurboTax are running on new stack

In a world with 50K interview screens, you can't build them manually. Intuit has a "tax player" for tax content

Intuit's ability to enter markets on new devices skyrocketed

Old product had 6 different "beaconing" libraries

In three years Intuit will have eliminated every line of code from the monolith and be completely service based

1. Everything as a service 2. Attack the monolith; 3. Build common application fabric (prescriptive on standards)

Rajesh Raman:

Time is Hard: Doing Meaningful Things with Data

Doing meaningful things with fast data

fast data continually reflects changing state; enables real time decision making

Time data often implies big data

Ex: sentiment analysis on Twitter; seismic sensor networks; data fusing from distributed sensors (phones in cars)

individual records are small; all have timestamps; repeated measurements yield time series

Value of time series data diminishes over time; 2 strategies: store nothing & store everything

tiered storage: recent data at high fidelity; older data at low fidelity; store analysis not raw sensor data

Batch processing is common, but reduces responsiveness

Alternative is stream processing; stream processor is stateful and incremental; typically using O(1) algorithm

stream processing: read-once, write-once. No do overs.

stream processing is not only far more timely, but also more efficient than batch processing.

BUT: you lose ability to rewind and do a redo.

Good news: simple primitives take you a long way; bad news: dealing with time is hard

merging streams by timestamp; skewed, irregular, bursty, laggy, jittery, lossy

skew: data from different time series arrive at different timestamps

irregular: aperiodic or unsteady periodicity

bursty: no activity for while, then all arrive at once

laggy: difference between generation and receipt

skew happens all the time. must logically align data within each period; requires understanding data

skew requires aligning the data with the periodicity at which it is arriving

Burstiness and lag require deciding how long to wait

The longer you wait the more likely data will appear; but computation is less timely

wait time must be bounded in some way because of finite resources

Types of clocks: measurement time, receipt time, analytic/processing time

Deadlines exclude data. Two schemes: static guarantees timeliness while dynamic adapts to changing conditions

dynamic deadlines can be set based on how much data is being excluded by deadline

Elliot Turner:

Morning starts off with cognitive computing

Arthur Samuel wrote the first game-playing computer program (checkers) on the IBM 701

IBM's Deep Blue (chess) was 10M times faster than the IBM701; massively parallel with specialized chess playing ASICs

In 1997 Deep Blue was in top 500 super computers. Today the same compute is available on a $400 graphics card

IBM Watson was a 3-5 year project; team of 15 people. Within a year, it could regularly beat some champions.

Problem domain: broad domain; complex language; high precision, accurate response

Real Jeopardy champions buzz in 80% of the time & answer correctly 80% of the time. That's incredible performance

Can't be solved with a lookup table. In 20,000 questions there are over 2500 types; the biggest bucket is 3%

Cognitive has evolved from systems that play games to multi-modal understanding of speech, emotions; all API driven

Cognitive system is a partnership between humans & computers

Cognitive computing depends on understanding, reasoning, and learning.

Cognitive systems are trained, not programmed. They work with humans to develop their capabilities.

Turning cognitive computing loose on Internet leads to interesting results. For example, it learned dogs are people.

The problem is that there's not "one truth." In some contexts people equate their dogs with people. But dogs aren't people

Now, understanding human speech is an API call away

Three REST calls: speech understanding -> translation -> text-to-speech yields a speech translator

Eric Norlin:

thanks Kim, Brian, and rest of the staff. They make the conference work

Brendan Burns:

Introspection: find out what went wrong; Insight is finding out things you didn't know

Introspection requirements: specification, status, events, attribution

Audit requirements: transparency, immutability, verification, restrictions & limits, automation (APIs)

Insight requirements: dynamic organization, interactive exploration, visualization

At first blush, IaaS checks a lot of these boxes, but not all of them (eg. immutability, monitoring APIs are wrong)

Containers are the right API object, can be immutable, and can be verifiable

Cluster management has organization, introspection, specification, status, events

Demo of KSQL for querying Kubernetes. Here's the repo: https://github.com/brendandburns/ksql

Kubernetes API server enables immutability by limiting actions on containers (eg ensure code checked in)

ABAC in cluster management allows policies to control actions (eg access, deployment)

Policy ex: Only allow certain people to create containers that come from specific registries/repos

Admission control policies can control resource use (eg auto-approve if resource overage explained in issue ticket)

Mosquito checks for things that haven't changed in 3 months. Finds dead resources.

Or find services or machines that restart the most.

Benjamin Hindman:

@alexwilliams interviewing

VMs didn't change things; containers and cluster management did.

Rosanna Myers:

Robots are lots cheaper; robots are safer;

A collaborative robot or cobot

KRL is Kuka Robot Language (or Kynetx Rule Language)

A 3D printer is a blind robot. Having vision is a good thing.

Big breakthrough is cloud robotics. Off load processing to the cloud. Ex: self driving cars

Big advantages: all robots learn from experiences of the others

Cloud robotics provides designing freedom, collaborative learning and application development

Manufacturing is still an area for robotics; only 10% of manufacturing is automated. Barrier: robots are hard to use

Research is another. 90% of research projects not repeatable. And the pipetting by hand for hours isn't fun.

Another: 12M people require full time care. Eg. attach robot arm to wheelchair

Joe Beda:

I made the term "production identity" up

Google systems have largely been in production for > 10yrs & are highly integrated

GOOG doesn't have all the answers, but they do have all the problems.

Solutions aren't as important as understanding how to breakdown and frame problem

Question: how do we identify production services

Trend: Manual -> Automation

We have more things we're dealing with and they change more often than they have in the past.

Evolving security: (1) network problem; lock down network (2) application security both operations and code analysis

Micro segmentation: surround any piece of hardware with its own policy. chroot for your network

Does reachability imply authorization? Doesn't sound very secure

Microservices -> many connections between components. When a microservice has 100 connections, reachability doesn't cut it

Devops is therapy for large organizations

identity is a lower-level function than authn or authz

We can come up with a one-size-fits-all solution for production ID, whereas for authn and authz, not so much

Many applications have their own idea of a user. So secret stores become key translators

GOOG has LOAS: stuff in production has an identity that is transported ambiently

SPIFFE: Secure Production Identity Framework for Everyone

SPIFFE is dialtone for identity

SPIFFE ID: urn:spiffe:example.com:alpaca-service

Developer experience: get SPIFFE ID, give it key pair with certificate chain & root certs to trust

Cert usage: TLS verification and message signing

When we talk about message signing, think JWT

SPIFFE could be integrated in microservice and RPC frameworks, in smart "side car" proxies, & off-the-shelf systems

Future directions for SPIFFE: federation, authorization, delegation, capability tokens

See https://spiffe.io for more information

Chris Richardson:

speaking on patterns languages for microservices

Successful software development depends on architecture, process & organization

Organization should be small, autonomous teams

Process should be agile

There's no silver bullet for architecture (reference to Fred Brooks)

Architecture patterns are a reusable solution to a problem in a particular context

Patterns force you to consider tradeoffs: benefits, drawbacks, issues to resolve

Patterns force you to consider other patterns: alternative solutions, and solutions to the problems the pattern introduces

Microservices pattern available at http://microservices.io

Infrastructure patterns include deployment and communication patterns

Core patterns include cross-cutting concerns

Application patterns include database architectures and data consistency

Monolithic architectures are relatively simple to develop, test, deploy, & scale (in certain contexts)

Problem is that successful applications keep growing; adding code day-after-day; you end up with a "ball of mud"

Monolithic architectures break the process goals of agile and continuous delivery & the org goal of autonomous teams

Microservices architecture functionally decomposes app into many services intermediated by API gateways

Microservice architecture drawbacks: complexity, IPC, partial failure, TxN span multiple services; testing is hard

Issues: deployment; communication, partitioning; distributed data management

Shared databases lead to tight coupling between services; each service needs its own data store

Data store per service -> services communicating via API only

Event-driven, eventually consistent architecture is solution to data store per service downsides

dual write problems traditionally solved using TxNs. Instead must reliably publish events; use Event sourcing

2nd problem: queries are no longer easy across several services; pattern is CQRS and materialized views

There are many more patterns for deployment, communication, etc.

Mark VanderWiele:

Connect and control #IoT devices in minutes using voice commands

Architecture uses MQTT to publish & subscribe data from device; processing in cloud; connecting HomeKit & TI devices

Learned: make devices more self-describable; allows generic UIs that devices plug into and work

Voice is the last mile in device interaction

Doing demos: monitor and control a device with voice commands

Demo using services from Bluemix services catalog

"ambient computing at your disposal"

Node-RED used to get commands from speech application; program processes keywords; sends JSON using MQTT to robot

Using an iPod touch as the HomeKit gateway; using another iPod touch as gateway for bluetooth spheros

Created composite applications from multiple device types and the IoT foundation

Capabilities unfortunately change when manufacturers send firmware updates

John Musser:

APIs can be great, but not always... API Ops is the answer

APIs go down; have unversioned changes; API Ops to the rescue

API Ops is like DevOps for APIs

API Ops should build, test, and deploy APIs more reliably.

API Ops and Dev Ops are similar, but different in subtle ways

We're seeing more and more stories about API failures.

API Ops: design, build, test & release APIs more rapidly, frequently, & reliably

Elephant in the room is micro services; DevOps necessary for managing all these services.

Use of API specification has exploded. So has the number of API tools

Why all the tools? The API Lifecycle. 1st gen focused on operation. 2nd gen focused on the rest

API Lifecycle: requirements, design, development, test, deployment, and operations

DevOps is about looking at entire lifecycle. API Ops is similarly focused on entire lifecycle

Going meta: APIs for API Ops

The entire API lifecycle can be controlled with APIs