Sovrin Web of Trust

The Web of Trust model for Sovrin is still being developed, but differs markedly from the trust model used by the Web.

The Web (specifically TLS/SSL) depends on a hierarchical certificate authority model called the Public Key Infrastructure (PKI) to determine which certificates can be trusted. When your browser determines that the domain name of the site you're on is associated with the public key being used to encrypt HTTP transmissions (and maybe that they're controlled by a specific organization), it uses a certificate it downloads from the Website itself. How then can this certificate be trusted? Because it was cryptographically signed by some other organization that issued the certificate and presumably checked the credentials of the company buying the certificate for the domain.

But that raises the question "how can we trust the organization that issued the certificate?" By checking its certificate, and so on up the chain until we get to the root (this process is called “chain path validation”). The root certificates are distributed with browsers and we trust the browser manufacturers to only include trustworthy root certificates. These root certificates are called “trust anchors” because their trustworthiness is assumed, rather than derived.
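
To make chain path validation concrete, here's a minimal sketch in Python. The certificate structure and signature check are hypothetical stand-ins, not a real TLS library; real validation also checks expiration, revocation, name constraints, and more.

    from __future__ import annotations
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Cert:
        subject: str
        public_key: str            # stand-in for real key material
        signature: str             # stand-in for a real signature value
        issuer: Optional[Cert]     # next certificate up the chain (None for a root)

    def verify_signature(cert: Cert, signer_key: str) -> bool:
        # Hypothetical stand-in for real cryptographic signature verification.
        return cert.signature == f"signed-by:{signer_key}"

    def validate_chain(cert: Cert, trusted_roots: set) -> bool:
        """Chain path validation: walk up the issuer chain until we hit a trust anchor."""
        while cert.subject not in trusted_roots:
            if cert.issuer is None:
                return False                          # ran out of chain without reaching a root
            if not verify_signature(cert, cert.issuer.public_key):
                return False                          # a signature along the chain doesn't check out
            cert = cert.issuer                        # keep climbing toward the root
        return True                                   # reached a root the browser already trusts

    # Example chain: root CA -> issuing CA -> site certificate
    root = Cert("Example Root CA", "root-key", "self-signed", None)
    ca   = Cert("Example Issuing CA", "ca-key", "signed-by:root-key", root)
    site = Cert("www.example.com", "site-key", "signed-by:ca-key", ca)
    print(validate_chain(site, {"Example Root CA"}))  # True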

Webs of Trust

In contrast, a Web of Trust model uses webs of interlocking certificates, signed by multiple parties to establish the veracity of a certificate (see Web of Trust on Wikipedia for more detail). An entity can be part of many overlapping webs of trust because its certificate can be signed by many parties.

Sovrin uses a heterarchical, decentralized Web of Trust model, not the hierarchical PKI model, to establish the trustworthiness of certificates. The methodology for doing this is built into the Sovrin system, not layered on top of it, by combining a decentralized ledger, decentralized identifiers, and verifiable claims. While Sovrin Foundation will establish some trust anchors1 (usually the same Stewards who operate validator nodes on the Sovrin network), these are merely for bootstrapping the system and aren't necessary for trusting it once it is up and running. Sovrin uses this trust system to protect itself from DDoS attacks, among other things.

By allowing an identifier to be part of multiple webs of trust, Sovrin allows for reputational context to be taken into account. For example, being a good plumber doesn't guarantee that a person will be a good babysitter, but a person who was a good plumber and a trustworthy babysitter could be part of different webs of trust that take these two contexts into account. This makes the overall trust model much more flexible and adaptable to the circumstances within which it will be used. PKI is good for one thing on the Web: showing that the public key used to secure HTTP transmissions is correct. In contrast, Sovrin's decentralized web of trust model is good for whatever trust contexts people need.

The goal of Sovrin is to provide the infrastructure upon which these overlapping webs of trust can be built for various applications. Lyft, Airbnb, and countless other sharing economy businesses are essentially specialized trust frameworks. Sovrin provides the means of creating similar trust frameworks without the need to build the trust infrastructure over and over.

Building Trust

Trust anchors are not the only way one could build trust in an identifier for a given purpose. I can think of the following ways that an identifier could come to be trusted:

  1. Personal knowledge — The identifier is personally known to you through interactions outside Sovrin. People close to me would be in this category. This category also includes identifiers like those for my employer or other entities who I interact with in the physical world and can thus establish a trust connection through means outside of Sovrin. If I know and trust the Web site of an entity, a well-known discovery scheme could let me know what identifiers they claim and consequently I’d gain trust in those identifiers. Such a system could piggyback on the PKI used for Web certificates.
  2. Verifiable claims — The identifier is verified by reliance on other trustworthy claims. This is analogous to how banks use KYC to establish trust in a person opening an account. They check other documents that they can trust (like a driver license or passport). They trust those documents by relying on trust established via the method described in (1).
  3. Web of trust — The identifier is introduced to me by someone I trust or who can be transitively associated with someone I trust. This category most closely follows the PGP Web of Trust model described in the Wikipedia article I reference above. Various entities have signed the certificate associated with the identifier and I can trace those signatures back to other entities I trust (a sketch of this transitive check follows the list).
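
To make (3) concrete, here's a minimal sketch in Python of the transitive idea: starting from identifiers I trust directly, follow signatures until I reach the identifier in question. Real PGP-style trust calculations also weight signatures and limit path length; this only illustrates the graph walk, and the names are made up.

    from collections import deque

    def transitively_trusted(target, directly_trusted, signed_by):
        """signed_by[x] is the set of identifiers whose certificates x has signed."""
        seen = set(directly_trusted)
        queue = deque(directly_trusted)
        while queue:
            current = queue.popleft()
            if current == target:
                return True                          # found a signature path to the target
            for signee in signed_by.get(current, set()):
                if signee not in seen:
                    seen.add(signee)
                    queue.append(signee)
        return False                                 # no path from anyone I trust to the target

    # I trust Alice directly; Alice signed Bob's certificate; Bob signed Carol's.
    signed_by = {"alice": {"bob"}, "bob": {"carol"}}
    print(transitively_trusted("carol", {"alice"}, signed_by))   # True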

Trust anchors, as the term is used in the Sovrin documents, have been designated by Sovrin or other trust anchors as a trustworthy source. But they are not the only source of trust.

In conclusion, trust on Sovrin works much the same way as it does in the physical world. This has advantages and disadvantages. The chief advantage, as I've pointed out, is that Sovrin is flexible and adaptable to various circumstances.

The chief disadvantage is that it relies on people being responsible for determining whether or not to trust something. There are lots of clues that people can use. How well it ultimately works depends on the user experience and whether people understand what’s being asked of them. But there will be cases where people mistakenly trust things they shouldn’t. Of course that happens today, online and off, as well. I’m hopeful that a richer set of trust clues will lead to less fraud than we currently see. Whether that’s true or not depends on design decisions yet to be made. Fortunately those can be made with 15 years of experience from identity on the Web to guide us.


Endnotes:

  1. Because the term “trust anchor” is heavily associated with PKI, it might make sense to use another term to avoid confusion.

Photo Credit: Spiderweb with Frost from Yintan (CC BY 4.0)


Sovrin In-Depth Technical Review

The Sovrin Foundation and Engage Identity announced a new partnership today. Security experts from Engage Identity will be completing an in-depth technical review of the Sovrin Foundation’s entire security architecture.

The Sovrin Foundation is committed to ensuring that the technology relied upon by everyone depending on Sovrin is secure. That technology protects many valuable assets, including private personal information and essential business data. As a result, we wanted to be fully aware of the risks and vulnerabilities in Sovrin. In addition, the Sovrin Foundation will benefit from having a roadmap for future security investment opportunities.

We're very happy to be working with Engage Identity, a leader in the security and identity industry. Established and emerging cryptographic identity protocols are one of their many areas of expertise. They have extensive experience providing security analysis and recommendations for identity frameworks.

The Engage Identity team is led by Sarah Squire, who has worked on user-centric open standards for many organizations including NIST, Yubico, and the OpenID Foundation. Sarah will be joined by Adam Migus and Alan Viars, both experienced authorities in the fields of identity and security.

The final report will be released this summer, and will include a review of the current security architecture, as well as opportunities for future investment. We intend to make the results public. Anticipated subjects of in-depth research are:

  • Resilience to denial of service attacks
  • Key management
  • Potential impacts of a Sovrin-governed namespace
  • Minimum technical requirements for framework participants
  • Ongoing risk management processes

Sovrin Foundation is excited to take this important step forward with Engage Identity to ensure that the future of self-sovereign identity management can thrive and grow.


Life-Like Anonymity

At VRM-day prior to Internet Identity Workshop last week, Joyce Searls commented that she wants the same kind of natural anonymity in her digital life that she has in real life.

In real life, we often interact with others—both people and institutions—with relative anonymity. For example, if I go to the store and buy a Coke with cash, there is no exchange of identity information necessary. Even if I use a credit card, it's rarely the case that the entire transaction happens under the administrative authority of the identity system inherent in the credit card. Only the financial part of the transaction takes place in that identity system. This is true of most interactions in real life.

In contrast, in the digital world, very few meaningful transactions are done outside of some administrative identity system. There are several reasons why identity is so important in the digital world:

  • Continuity—Because of the stateless nature of HTTP, building a working shopping cart, for example, requires using some kind of token for correlation of independent HTTP transactions. These tokens are popularly known as cookies. While they can be pseudonymous, they are often correlated across multiple independent sessions using an authenticated identifier. This allows, for example, the customer to have a shopping cart that persists across time on different devices.
  • Convenience—So long as the customer is authenticating, we might as well further store additional information like addresses and credit card numbers for their convenience, to extend the shopping example. Storing these allows the customer to complete transactions without having to enter the same information over and over.
  • Trust—There are some actions that should only be taken by certain people, or people in certain roles, or with specific attributes. Once a shopping site has stored my credit card, for example, I ought to be the only one who can use it. Identity systems provide authentication mechanisms as the means of knowing who is at the other end of the wire so that we know what actions they're allowed to take. This places identifiers in context so they can be trusted.
  • Surveillance—Identity systems provide the means of tracking individuals across transactions for purposes of gathering data about them. This data gathering may be innocuous or nefarious, but there is no doubt that it is enabled by identity systems in use on the Internet.

In real life, we do without identity systems for most things. You don't have to identify yourself to the movie theater to watch a movie or log into some system to sit in a restaurant and have a private conversation with friends. In real life, we act as embodied, independent agents. Our physical presence and the laws of physics have a lot to do with our ability to function with workable anonymity across many domains.

So, how did we get surveillance and its attendant effects on natural anonymity as an unintended, but oft-exploited, feature of administrative digital identity systems? Precisely because they are administrative.

Legibility

Legibility is a term used to describe how administrative systems make things governable by simplifying, inventorying, and rationalizing things around them. Venkatesh Rao nicely summarized James C. Scott's seminal book on legibility and its unintended consequences: Seeing Like a State.

Identity systems make people legible in order to offer continuity, convenience, and trust. But that legibility also allows surveillance. In some respects, this is the trade-off we always get with administrative systems. By creating legibility, they threaten privacy.

Administrative systems are centralized. They are owned. They are run for the purposes of their owners, not the purposes of the people or things being administered. They are bureaucracies for governing something. They rely on rules, procedures, and formal interaction patterns. Need a new password? Be sure to follow the password rules of whatever administrative system you're in.

Every interaction you have online happens under the watchful eye of a bureaucracy built to govern the system and the people using it. The bureaucracy may be benevolent, benign, or malevolent but it controls the interaction.

Real Life is Decentralized

On the other hand, in real life we interact as peers. We do interact with administrative systems of various sorts, but no one would describe that as real life. When I go to a store, I don't think about shopping within their administrative system. Rather, I walk in, look at stuff, talk to people, put things in a cart, and check out. The administrative system is there, but it's for governing the store, not the customers.

We can't achieve Joyce's vision of anonymous online interactions until we redecentralize the Internet. The Internet started out decentralized. The early Web was decentralized. But the need for continuity, convenience, and trust led more and more interactions to happen within someone's administrative system.

Most online administrative systems make themselves as unobtrusive as they can. But there's no getting around the fact that every move we make is within a system that knows who we are and monitors what we're doing. In real life, I don't rely on the administrative system of the restaurant to identify the people I'm having dinner with. The restaurant doesn't need to check the IDs of me and the others in my party or surveil us in order to create an environment where we can talk.

The good news is that we're finally developing the tools necessary to create decentralized online experiences. What if you could interact with your friends on the basis of an identity that they bring to you directly—one that you could recognize and trust? You wouldn't need Facebook or Snapchat to identify and track your friends for you. You could do it yourself.

One of the reasons I jumped at the chance to help get Sovrin up and running is that I believe decentralized identity is the foundation for a decentralized Web—a Web that flexibly supports the kind of ad hoc interactions people have with each other all the time in real life. We'll never get an online world that mirrors real life and its natural anonymity until we do.


Photo Credit: Network from Pezibear (CC0 Public Domain)


Hyperledger Welcomes Project Indy

We’re excited to announce Indy, a new Hyperledger project for supporting independent identity on distributed ledgers. Indy provides tools, libraries, and reusable components for providing digital identities rooted on blockchains or other distributed ledgers so that they are interoperable across administrative domains, applications, and any other silo.

Why Indy?

Internet identity is broken. There are too many anti-patterns and too many privacy breaches. Too many legitimate business cases are poorly served by current solutions. Many have proposed distributed ledger technology as a solution; however, building decentralized identity on top of distributed ledgers that were designed to support something else (cryptocurrency or smart contracts, for example) leads to compromises and shortcuts. Indy provides Hyperledger projects and other distributed ledger systems with a first-class decentralized identity system.

Indy’s Features

The most important feature of a decentralized identity system is trust. As I wrote in A Universal Trust Framework, Indy provides accessible provenance for trust to support user-controlled exchange of verifiable claims about an identifier. Indy also has a rock-solid revocation model for cases where those claims are no longer true. Verifiable claims are a key component of Indy’s ability to serve as a universal platform for exchanging trustworthy claims about transactions. Provenance is the foundation of accountability through recourse.

Another vital feature of decentralized identity—especially for a public ledger—is privacy. Privacy by Design is baked deep into Indy architecture as reflected by three fundamental features:

  • First, identifiers on Indy are pairwise unique and pseudonymous by default to prevent correlation. Indy is the first DLT to be designed around Decentralized Identifiers (DIDs) as the primary keys on the ledger. DIDs are a new type of digital identifier that were invented to enable long-term digital identities that don't require centralized registry services. DIDs on the ledger point to DID Descriptor Objects (DDOs), signed JSON objects that can contain public keys and service endpoints for a given identifier (see the sketch after this list). DIDs are a critical component of Indy's pairwise identifier architecture.
  • Second, personal data is never written to the ledger. Rather all private data is exchanged over peer-to-peer encrypted connections between off-ledger agents. The ledger is only used for anchoring rather than publishing encrypted data.
  • Third, Indy has built-in support for zero-knowledge proofs (ZKP) to avoid unnecessary disclosure of identity attributes—privacy preserving technology that has been long pursued by IBM Research (Idemix) and Microsoft (UProve), but which a public ledger for decentralized identity now makes possible at scale.
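
For illustration, here's roughly what a DDO for one pairwise DID might contain, written as a Python dict. The field names follow early DID specification drafts and the values are made up; treat this as indicative rather than as the normative Sovrin record format.

    # Illustrative DID Descriptor Object (DDO) for a single pairwise DID.
    ddo = {
        "id": "did:sov:3k9dg356wdcj5gf2k9bw8k",                # the DID this DDO describes (example value)
        "publicKey": [{
            "id": "did:sov:3k9dg356wdcj5gf2k9bw8k#key-1",
            "type": "Ed25519VerificationKey",
            "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJ",  # example key material
        }],
        "service": [{
            "type": "agent",                                    # endpoint for this identity's agent
            "serviceEndpoint": "https://agents.example.com/8377464",
        }],
    }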

Indy is all about giving identity owners independent control of their personal data and relationships. Indy is built so that the owner of the identity is structurally part of transactions made about that identity. Pairwise identifiers stop third parties from talking behind the identity owner’s back, since the identity owner is the only place pairwise identifiers can be correlated.

Sovereign Identity (figure)

Indy is based on open standards so that it can interoperate with other distributed ledgers. These start, of course, with public-key cryptography standards. Other important standards cover things like the format of Decentralized Identifiers, what they point to, and how agents exchange verifiable claims. Indy also supports a system of attribute and claim schemas that are written to the ledger for dynamic discovery of previously unseen claim types. Relying parties can make their own entitlement decisions based on schemas with publicly known identifiers.
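
As a rough illustration of what a publicly discoverable claim schema might carry (the field names here are indicative; the exact on-ledger format is defined by Indy):

    # Illustrative claim schema a relying party could look up by its public identifier.
    employee_schema = {
        "name": "employee",
        "version": "1.0",
        "attr_names": ["employee_id", "department", "title", "hire_date"],
    }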

The result is a new way of doing systems integration on the Internet that is much less costly while also being more trustworthy. As I wrote in When People Can Share Verifiable Attributes, Everything Changes, owner-provided attributes are a powerful driver that will push decentralized identity systems well beyond the current uses of federation and social login. Organizations can reduce, or even eliminate, costly manual verification processes and API integrations, and instead trust the identity claims presented to them, precisely because these claims can be verified. People and organizations become the source of what's true about them.

Indy Shares the Internet’s Virtues

As I wrote in An Internet for Identity, Indy shares three important virtues with the Internet:

  1. No one owns it.
  2. Everyone can use it.
  3. Anyone can improve it.

Launching Indy as a Hyperledger Project is a critical component of allowing anyone to improve how Indy works.

These virtues are supported by Indy’s permissioned-validation ledger model and open-source code base. This has important consequences for scale and cost. But unlike other permissioned ledgers like R3’s Corda, CULedger, or SecureKey, Indy is designed for global public access. Even though Indy is permissioned, anyone can access Indy’s features.

Validation is performed by a set of validator nodes running a modified, redundant Byzantine fault tolerant protocol called Plenum that is part of the Indy project. With Plenum, network nodes come to collective agreement about the validity and order of events.

This diagram shows how these different dimensions of validation and access play across different distributed ledger systems.

Validation and Access (figure)

What is the Relationship of Indy and Sovrin?

Sovrin Foundation is contributing the Indy code base to the Hyperledger Project. Established in September 2016, the Sovrin Foundation is an international non-profit foundation created to govern a global public utility for decentralized identity. The trustees of the Sovrin Foundation believe the public, permissioned quadrant above is the only one that can achieve both high trust and global adoption for a decentralized identity system.

The Sovrin Foundation developed the Sovrin Trust Framework to govern how trusted institutions, called stewards, will operate validator nodes of the Sovrin Network. All stewards will run an instance of Project Indy. However the Sovrin Network is only one network designed to run Indy; any number of other networks may be created to run their own instances.

How Will Hyperledger Enhance Indy?

The Indy code base, originally developed by Evernym, was donated to the Sovrin Foundation to establish a strong open source foundation for the Sovrin Network. The Sovrin Foundation has been building a global community of developers who are passionate about independent identity and the economic and social benefits it brings to both individuals and enterprises.

With the contribution of Indy to Hyperledger, we take the next step in that process, opening up Project Indy to the entire Hyperledger family of developers. Our hope is to attract even more developers who want to unleash the transformative power of digital identity that is decentralized, self-sovereign, and independent of any silo. We would also like to explore direct synergies with the identity management goals and requirements of the other Hyperledger projects.

Learn More about Indy

To learn more or contribute to the Indy project:


Pico Programming Lesson: Modules and External APIs

Apollo Modules

I recently added a new lesson to the Pico Programming Lessons on Modules and External APIs. KRL (the pico programming language) has parameterized modules that are great for incorporating external APIs into a pico-based system.

This lesson shows how to define actions that wrap API requests, put them in a module that can be used from other rulesets, and manage the API keys. The example (code here) uses the Twilio API to create a send_sms() action. Of course, you can also use functions to wrap API requests where appropriate (see the Actions and Functions section of the lesson for more detail on this).

KRL includes a key pragma in the meta block for declaring keys. The recommended way to use it is to create a module just to hold keys. This has several advantages:

  • The API module (Twilio in this case) can be distributed and used without worrying about key exposure.
  • The API module can be used with different keys depending on who is using it and for what.
  • The keys module can be customized for a given purpose. A given use will likely include keys for multiple modules being used in a given system.
  • The pico engine can manage keys internally so the programmer doesn't have to worry (as much) about key security.
  • The key module can be loaded from a file or password-protected URL to avoid key loss.

The combination of built-in key management and parameterized modules is a powerful abstraction that makes it easy to build easy-to-use KRL SDKs for APIs.
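
For readers who don't write KRL, here's a rough Python analogue of the same pattern; the actual KRL code is linked above. The keys live in their own module, and the API wrapper is parameterized with them when it's loaded, much like a parameterized KRL module. The Twilio endpoint shown is the standard SMS resource, but the names and values are illustrative.

    # keys_module.py — analogue of a KRL ruleset that only declares keys (placeholder values)
    TWILIO_KEYS = {"account_sid": "ACxxxxxxxxxxxx", "auth_token": "xxxxxxxxxxxx"}

    # twilio_module.py — analogue of a parameterized KRL module wrapping the API
    import requests

    def make_send_sms(account_sid, auth_token):
        """Return a send_sms action bound to the supplied keys."""
        def send_sms(to, frm, message):
            url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
            return requests.post(url, auth=(account_sid, auth_token),
                                 data={"To": to, "From": frm, "Body": message})
        return send_sms

    # using_ruleset.py — analogue of using the module with keys supplied at load time
    send_sms = make_send_sms(**TWILIO_KEYS)
    # send_sms("+15551230000", "+15559870000", "Hello from a pico!")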

Going Further

The pico lessons have all been recently updated to use the new pico engine. If you're interested in learning about reactive programming and the actor model with picos, walk through the Quickstart and then dive into the lessons.


Photo Credit: Blue Marble Geometry from Eric Hartwell (CC BY-NC-SA 3.0)


The New Pico Engine Is Ready for Use

The mountains, the lake and the cloud

A little over a year ago, I announced that I was starting a project to rebuild the pico engine. My goal was to improve performance, make it easier to install, and support small deployments, while retaining the best features of picos, specifically being Internet first.

Over the past year we've met that goal and I'm quite excited about where we're at. Matthew Wright and Bruce Conrad have reimplemented the pico engine in NodeJS. The new engine is easy to install and quite a bit faster than the old engine. We've already got most of the important features of picos. My students have redone large portions of our supporting code to run on the new engine. As a result, the new engine is sufficiently advanced that I'm declaring it ready for use.

We've updated the Quickstart and Pico Programming Lessons to use the new engine. I'm also adding new lessons to help programmers understand the most important features of Picos and KRL.

My Large-Scale Distributed Systems class (CS462) is using the new pico engine for their reactive programming assignments this semester. I've got 40 students going through the pico programming lessons as well as reactive programming labs from the course. The new engine is holding up well. I'm planning to expand its use in the course this spring.

Adam Burdett has redone the closet demo we showed at OpenWest last summer using the new engine running on a Raspberry Pi. One of the things I didn't like about using the classic pico engine in this scenario was that it made the solution overly reliant on a cloud-based system (the pico engine) and consequently was not robust under network partitioning. If the goal is to keep my machines cool, I don't want them overheating because my network was down. Now the closet controller can run locally with minimal reliance on the wider Internet.

Bruce was able to use the new engine on a proof of concept for BYU's priority registration. This was a demonstration of the ability for the engine to scale and handle thousands of picos. The engine running on a laptop was able to handle 44,504 add/drop events in over 8000 separate picos in 35 minutes and 19 seconds. The throughput was 21 registration events per second or 47.6 milliseconds per request.
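
Those throughput figures check out; a quick back-of-the-envelope verification:

    events = 44_504
    seconds = 35 * 60 + 19                 # 35 minutes, 19 seconds = 2,119 seconds
    print(events / seconds)                # ≈ 21.0 registration events per second
    print(1000 * seconds / events)         # ≈ 47.6 milliseconds per request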

We've had several lunch and learn sessions with developers inside and outside BYU to introduce the new engine and get feedback. I'm quite pleased with the reception and interactions we've had. I'm looking to expand those now that the lessons are completed and we have had several dozen people work them. If you're interested in attending one, let me know.

Up Next

I've hired two new students, Joshua Olson and Connor Grimm, to join Adam Burdett and Nick Angell in my lab. We're planning to spend the summer getting Manifold, our pico-based Internet of Things platform, running on the new engine. This will provide a good opportunity to improve the new pico engine and give us a good IoT system for future experiments, supporting our idea around social things.

I'm also contemplating a course on reactive programming with picos on Udemy or something like it. This would be much more focused on practical aspects of reactive programming than my BYU distributed system class. Picos are a great way to do reactive programming because they implement an actor model. That's one reason they work so well for the Internet of Things.

Going Further

If you'd like to explore the pico engine and reactive programming with picos, you can start with the Quickstart and move on to the pico programming lessons.

We'd also love help with the open source implementation of the pico engine. The code is on GitHub and there's a well-maintained set of issues that need to be worked. Bruce is the coordinator of these efforts.

Any questions on picos or using them can be directed to the Pico Labs forum and there's a pretty good set of documentation.


Photo Credit: The mountains, the lake and the cloud from CameliaTWU (CC BY-NC-ND 2.0)


What Use is a Master of Science in CS?

Graduation

Recently a friend of mine, Eric Wadsworth, remarked in a Facebook post:

My perspective, in my field (admittedly limited to the tech industry) has shifted. I regularly interview candidates who are applying for engineering positions at my company. Some of them have advanced degrees, and to some of those I give favorable marks. But the degree is not really a big deal, doesn't make much difference. The real question is, "Can this person do the work?" Having more years of training doesn't really seem to help. Tech moves so fast, maybe it is already stale when they graduate. Time would be better spent getting actual experience building real software systems.

I read this, and the comments that followed, with a degree of interest because most of them reflect a gross misunderstanding of what an advanced degree indicates. The assumption appears to be that people who get a BS in Computer Science are learning to program and therefore getting an MS means you're learning more about how to program. I understand why this can be confusing. We don't often hire a plumber when we need a mechanical engineer, but Computer Science and programming are still relatively young and we're still working out exactly what the differences are.

The truth is that CS programs are not really designed to teach people to code, except as a necessary means to learning computer science, which is not merely programming. That's doubly true of a master's degree. There are no courses in a master's program that are specifically designed to teach anyone to program anything. You can learn to code at any of a thousand Web sites. A CS degree includes topics like computational theory and practices, algorithms, database design, operating system design, networking, security, and many others, all presented in a way designed to create well-rounded professionals. The ACM Curriculum Guidelines (PDF) are a good place to see some of the detail in a program-independent way.

Most of what one learns in a Computer Science program has a long shelf life—by design. For example, I design the modules in my Large Scale Distributed Programming class to teach principles that have been important for 30 years and are likely to be important for 30 more. Preventing Byzantine failure, for example, has recently become the latest fad with the emergence of distributed ledgers. I learned about it in 1988. If your interview questions are asking people what they know about the latest JavaScript framework, you're unlikely to distinguish the person with a CS degree from someone who just completed a coding bootcamp.

What does one learn getting an advanced degree in Computer Science? People who've completed a master's degree have proven their ability to find the answers to large, complex, open-ended problems. This effort usually lasts at least six months and is largely self-directed. They have shown that they can explore scientific literature to find answers and synthesize that information into a unique solution. This is very different than, say, looking at Stack Overflow to find the right configuration parameters for a new framework. Often, their work is only part of some larger problem being explored by a team of fellow students under the direction of someone recognized as an expert in a specific field in Computer Science.

If these kinds of skills aren't important to your project, then you're wasting your time and money hiring someone with an advanced degree. As Eric points out, the holder of an MS won't necessarily be a better programmer, especially as measured by your interview questions and tests. And if they are important, you're unlikely to uncover a candidate's abilities in an interview. Luckily, someone else has spent a lot of time and money certifying that the person sitting in front of you has them. All free to you. That's what the letters MSCS on their resume mean.

Obviously, every position comes with an immediate need. Sometimes that can be filled by a candidate with good programming skills and a narrow education. Sometimes you want something more. But don't hire poorly because you misunderstand the credentials you're evaluating.


Photo Credit: Graduation from greymatters (CC0 Public Domain)


Verifying Constituency: A Sovrin Use Case

Jason Chaffetz Town Hall Meeting

Recently, my representative, Jason Chaffetz, held a town hall that didn't go so well. Rep. Chaffetz claims that "the protest crowd included people brought in from outside his district specifically to be disruptive." I'm not here to debate the veracity of that claim, but to make a proposal.

First, let's recognize that members of Congress are more apt to listen when they know they are hearing from constituents. Second, this problem is exacerbated online. They wonder, "Are all the angry tweets coming from voters in my district?" and likely conclude they're not. Britt Blaser's been trying to solve this problem for a while.

Suppose that I had four verified claims in my Sovrin agent:

  1. Address Claim—A claim that I live at a certain address, issued by someone that we can trust to not lie about this (e.g. my bank, utility company, or a third party address verification service).
  2. Constituency Claim—A claim written by the NewGov Foundation or some other trusted third party, based on the Address Claim, that I'm a constituent of Congressional District 3.
  3. Voter Claim—A claim that says I'm a registered voter. Ideally this would be written by the State of Utah Election Office, but might need to be done by someone like NewGov based on voter rolls for now.
  4. Twitter Claim—A claim that proves I own a particular Twitter handle. Again, this would ideally be written by Twitter, but could be the work of a third party for now.1

Given these claims, Sovrin can be used to create a proof that @windley belongs to a verified voter in Congressional District 3. More generally, the proof shows a given social media account belongs to a constituent who lives in a specific political jurisdiction.

Anyone would be able to validate that proof and check the claims that it is based on. This proof doesn't need to disclose anything beyond the Twitter handle and congressional district. No other personally identifying information need be disclosed by the proof.
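
To illustrate how little needs to be revealed, here's a sketch of what the disclosed portion of such a proof might look like, as a Python dict. The structure is illustrative, not the actual Sovrin proof format.

    # Illustrative constituency proof: only the facts the verifier needs are disclosed.
    constituency_proof = {
        "disclosed": {
            "twitter_handle": "@windley",
            "congressional_district": "UT-3",
            "registered_voter": True,
        },
        # References that let a verifier check the underlying claims on the ledger
        # without learning the address or other personal data behind them.
        "claim_references": ["address", "constituency", "voter", "twitter"],
        "proof": "<cryptographic proof binding the disclosed values to the claims>",
    }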

How would we use this proof? Imagine, for example, a Website that publishes all tweets that contain the hashtag #UTCD3, but only if they are from twitter handles that are certified to have come from people who live in Congressional District 3.

A more ambitious use case would merge these verifications with the NewGov GEOvoter API to place the tweets on interactive maps to show where hotspots are. Combined with sentiment analysis, the constituency proof could be used to show political sentiment across the country, across the state, or within the local water district.

Sovrin provides a trusted infrastructure for issuing and using the verified claims. NewGov or someone else would provide the reason for trusting the claims written about verified voters. Eventually these claims should be written by the Elections Office or Twitter directly, providing even more trust in the system.


Photo Credit: Jason Chaffetz Town Hall Meeting in American Fork, Utah on August 10, 2011 from Michael Jolley (CC BY 2.0).

This post originated in a conversation I had with Britt Blaser.

Notes:

  1. Twitter is simply one example. We could have claims for Facebook, Instagram, other social media, email, or any online tool.


Student Profiles: A Proof of Concept

Students in Class

In Sovrin Use Cases: Education, I broadly outlined how a decentralized identity ledger, Sovrin, could provide the tools necessary to build a decentralized university. This post takes the next step by laying out a three phase project to provide a proof of concept.

Background

Teaching students to be life-long learners in a digital age includes giving them tools and techniques they can use after they leave the university. We do students a disservice when we supply them with only enterprise-level tools that they lose access to once they've graduated. Learning management systems are fine, but they're not a personal tool that supports life-long learning. BYU has been exploring personal learning environments and operates a thriving Domain of One's Own program in support of this ideal.

Architected correctly, personal learning environments provide additional, important benefits. For example, we're exploring the use of decentralized personal student profiles to create a virtual university using programs, certifications, and courses from several different institutions.

A Proof of Concept

In Sovrin Use Cases: Education, I wrote:

The idea is to create a decentralized system of student profiles that contain both profile data as well as learning records. These learning records can attest to any kind of learning activity from the student having read a few paragraphs to the completion of her degree. By making the student the point of integration, we avoid having to do heavy, expensive point-to-point integrations between all the student information systems participating in the educational initiative.

The architecture relies on being able to write verifiable claims about learning activities and other student attributes. A verifiable claim allows the student to be the source of information about themselves that other parties can trust. Without this property, the decentralized approach doesn't work. I describe the details here: A Universal Trust Framework.

The proof of concept has three phases that are described below. When finished, we will have a prototype system that demonstrates all the technology and interactions required to put this architecture into use. The things we learn in the proof of concept will guide us as we roll the architecture out globally.

Phase 0: A Learning Record Store

The goal of Phase 0 is to build a basic student profile that has an API manager, learning record store, and some system for storing basic profile information.

Basic Student Profile (figure)

The system produced in Phase 0 is foundational. The student profile and learning record store (I'll use "student profile" to inclusively talk about the entire system from now on) provide the repository of student data. The student profile provides an API that supports event-based notification (mostly through the xAPI).
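
For context, xAPI statements have an actor/verb/object shape. Something like the following (values are illustrative) is what activity notifications flowing into the LRS would carry:

    # Illustrative xAPI statement the learning record store might receive and store.
    statement = {
        "actor":  {"name": "Alice Student", "mbox": "mailto:alice@example.edu"},
        "verb":   {"id": "http://adlnet.gov/expapi/verbs/completed",
                   "display": {"en-US": "completed"}},
        "object": {"id": "https://learning.example.edu/courses/cs462/module-3",
                   "definition": {"name": {"en-US": "CS 462 Module 3"}}},
        "timestamp": "2017-04-10T16:20:00Z",
    }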

The requirements for the system built in Phase 0 include the following:

  • API Manager—the student profile will include a simple API manager.
  • Profile data—the student profile will be capable of storing and providing (through an API) basic profile data.
  • Learning record store—the student profile will include an xAPI-compatible LRS.
  • xAPI notifications from Canvas—The student profile should accept xAPI calls from the University's test instance of Canvas and make, as necessary, University API calls to other campus systems.
  • Permissioned Access—The student profile should support OAuth-based access to data.
  • Open source—Components of the student profile should be open source so that they can be modified to meet the needs of the proof of concept.
  • Hostable—The overall student profile should be built so as to allow it to be run in a variety of cloud environments.

Phase 1: Creating Claims

The goal of Phase 1 is to introduce Sovrin agents for both the student and BYU, and use those agents to create claims about some of the learning records in the student's LRS.

Making a Claim (figure)

The system produced in Phase 1 uses the Sovrin identity network to manage the claim creation process. Both BYU and the student profile will use what Sovrin calls an agent to interact with the network. The agent represents the entity (in our case either BYU or the student) to the Sovrin identity network. The agent is also responsible for managing any claims that the entity makes or possesses.
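
As a rough sketch, a claim that BYU's agent issues to the student's agent might carry something like the following. The field names and values are illustrative, not the exact Sovrin claim format.

    # Illustrative verifiable claim issued by BYU's agent about an LRS learning record.
    claim = {
        "issuer": "did:sov:byu-example",            # BYU's DID (made-up value)
        "subject": "did:sov:student-pairwise",      # the DID the student uses with BYU
        "schema": {"name": "course-completion", "version": "1.0"},
        "attributes": {"course": "CS 462", "grade": "A", "term": "Winter 2017"},
        "signature": "<issuer's signature over the attributes>",
    }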

The requirements for the system built in Phase 1 include the following:

  • Identity on the network—The system should support entities creating identities in the form of Decentralized Identifiers (DIDs).
  • Student profile makes claim requests (2)—The student profile (through its agent) should be able to make claim requests of BYU's agent about any statement in the LRS.
  • Claim requests are validated (3a)—BYU's agent validates claim requests before issuing the claim.
  • Claims are issued (3b)—BYU's agent issues claims to the student's agent.
  • Claims are stored—The student agent stores claims it receives.
  • Claims are backed by pre-registered schema—Any claim issued by BYU will be based on claim schemas pre-registered in the Sovrin ledger. They can be BYU's or others' claim schemas, but the actual registering of the schema in the ledger is out of scope.

In Phase 1, any interactions with the student will be stubbed out with a default response.

Phase 2: Using Claims

The goal of Phase 2 is to provide proofs to another party about claims in the student profile.

Using a Claim (figure)

The requirements for the system built in Phase 2 include the following:

  • Relying Party with an Agent—A relying party, meant to simulate another school, uses an agent to interact with the ledger and other agents.
  • Student client—A student uses a client capable of interacting with the student's agent.
  • Student with multiple DIDs—The student uses different DIDs for interacting with BYU than she does for interacting with the relying party.
  • Relying Party asks for proof of some assertion (4)—The relying party can ask the student agent for proof of some assertion they have a claim for.
  • The student's agent asks the student client for permission (5 & 6)—The agent interacts with the client to get the student's permission to create a proof from the claim.
  • Agent creates proof (7)—The student agent creates a proof from an existing claim and returns it to the relying party.
  • Relying party validates the proof (8)—The relying party's agent uses the ledger to validate the proof. For the proof of concept, we can assume the relying party knows BYU (by its DID) and trusts BYU.
  • Generalizable—The system is capable of supporting multiple relying parties and BYU's agent can accept claims from them. The foundational evidence for the claim at the relying party can be mocked up; it doesn't need to come from Canvas or some other system.

Bonus Phase: Self-Issued Claims as a Personal API

Matthew Hailstone made an interesting point on a draft of this post: proof requests amount to a very flexible personal API. This bonus phase explores that idea.

Self-Issued Claims (figure)

The requirements for the system built in the Bonus Phase include the following:

  • Student can self-issue claims (9)—The student can create claims based on information in the student profile.
  • Relying parties use claims (10-14)—Relying parties can use self-issued claims in the same way they can in Phase 2.
  • Self-issued claims are backed by pre-registered schema—Any claim issued by the student will be based on claim schemas pre-registered in the Sovrin ledger.

Sovrin proofs based on claims can be thought of as a very flexible personal API where any claim schemas the student profile supports become valid requests.

Future Work

The following is reserved for future work and is outside the scope of the proof of concept.

  • Using public keys for authentication—An entity's DID is linked not only to its agent, but also to a public key that has been created specifically for that identifier. These public keys could be very useful for authenticating the various entities in the system. The proof of concept won't do that unless specifically required for issuing and using claims.
  • Social Login—the profile doesn't have to support OAuth-based login or OpenID Connect for use as an authentication platform.
  • Domains—The proof of concept does not have to be hostable on a Cpanel-based hosting system like BYU's Domains system.


Photo Credit: Student in Class from Albert Herring (CC BY 2.0)


A Universal Trust Framework

Lorimerlite Structure as a Framework

In We've stopped trusting institutions and started trusting strangers, Rachel Botsman talks about the "trust gap" that separates a place of certainty from something that is unknown. Some force has to help us "make the leap" from certainty to uncertainty and that force is trust.

Traditionally, we've relied on local trust that is based on knowing someone—acquired knowledge or reputation. In a village, I know the participants and can rely on personal knowledge to determine who to trust. As society got larger, we began to rely on institutions to broker trust. Banks, for example, provide institutional trust by brokering transactions—we rely on the bank to transfer trust. I don't need to know you to take your credit card.

But lately, as Botsman says, "we've learned that institutional trust isn't meant for the digital age." In the digital age, we have to learn how to trust strangers. Botsman discusses sharing platforms like AirBnB and BlaBlaCar. You might argue that they're just another kind of institution. But there's a key difference: those platforms are bidirectional. For example, AirBnB lets guests rate their hosts, but also lets hosts rate guests. These platforms give more information about the individual in order to establish trust.

But beyond platforms like AirBnB lies distributed trust based on blockchains and distributed ledgers. Botsman makes the point that distributed trust provides a system wherein you don't have to trust the individual, only the idea (e.g. distributed cash transactions) and the platform (e.g. Bitcoin). You can do this when the system itself makes it difficult for the person to misrepresent themselves or their actions. When I send you Bitcoin, you don't have to trust me because the system provides provenance of the transaction and ensures that it happens correctly. I simply can't cheat.

At a fundamental level, trust leads us to believe what people say. Online this is difficult because identifiers lack the surrounding trustworthy context necessary to provide the clues we need to establish trust. Dick Hardt said this back in 2007. The best way to create context around an identifier is to bind it to other information in a trustworthy way. Keybase does this, for example, to create context for a public key. Keybase creates a verifiable context for a public key by allowing its owner to cryptographically prove she is also in control of certain domain names, Twitter accounts, and Github accounts, among others. Keybase binds those proofs—the context—to the key. Potential users of the public key can validate for themselves the cryptographic proofs and decide whether or not to trust that the public key belongs to the person they wish to communicate with.

Another key idea in reputation and trust is reciprocity. Accountability and a means of recourse when something goes wrong create an environment where trust can thrive. This is one of the secrets to sharing economy platforms like AirBnB. Botsman makes the point that she never leaves the towel on the floor of an AirBnB because the host "knows" her. She is accountable and there is the possibility for recourse (a bad guest rating).

Trust Frameworks and Trust Transactions

The phrase we use to describe the platforms of AirBnB, BlaBlaCar, and other sharing economy companies is trust framework. Simply put, a trust framework provides the structure necessary to leap between the known and unknown.

For example, social login presents a trust leap for the Web sites relying on the social media site that's authenticating the user. When a user logs into a Web site using Facebook, trust is transferred between Facebook and the site they're logging into. Facebook establishes that the user is the same person who created the account based on the fact that she knows things like the username and password. The relying Web site trusts that Facebook will do a good job of this and thus is willing to accept Facebook's authentication in lieu of its own. This transfer of trust from Facebook is a trust transaction.

Trust frameworks generally rely on technologies, business processes, and legal agreements. All of these are important. For example, how much recourse a relying party has against Facebook is unclear, so social login has been limited to identity providers who relying parties trust. I could become an identity provider, but few Web sites will add me to their login process because they can't trust me.

Trust frameworks are all around us, but they are one-offs, too specialized to be universally applicable. In the case of AirBnB, the platform can only be used by AirBnB for trust transactions between hosts and guests. In the case of social login, the framework is open and non-proprietary, but limited to authentication transactions. Furthermore, only a few identity providers are trusted due to insufficient business process and legal structures.

Sovrin as a Universal Trust Framework

All of which brings me to Sovrin. If you've been following my blog, you probably get that Sovrin is a decentralized identity system based on a distributed ledger. But Sovrin's killer feature is verifiable claims1. The combination of decentralized identifiers (DIDs), verifiable claims, and a ledger that is available to all make Sovrin a universal trust framework.

Let's unpack these to see why they're all necessary:

  • Decentralized Identifiers—DIDs allow anyone to create identifiers for anything. Furthermore, they are in a standard, interoperable format. People will have hundreds or thousands of DIDs representing all of the various digital relationships to which they're a party. These relationships might be with organizations they do business with, friends they interact with, or things they own. Organizations and many people will have public DIDs that represent their public digital presence. For example, I might have a DID that represents me to my employer, BYU, and another that represents me to my bank.

  • Verifiable claims—verifiable claims allow trustworthy assertions to be made about anything that has an identifier. These claims are standard and interoperable. Furthermore, they're based on strong cryptography to bind the claim issuer, the claim subject, and the claim itself. For example, BYU might issue a claim that says I'm an employee. My bank might issue a claim saying I have an account balance of $X. Issuing a claim is a trust transaction that is recorded on the ledger.

  • Sovrin ledger—the ledger provides the means of discovering the keys and endpoints associated with a particular DID. The ledger also records information about claims (although not the claims themselves). Consequently, Sovrin creates provenance about trust transactions and their constituent parts. For example, the claim that BYU makes that I'm an employee would reference BYU's public DID, the DID by which BYU knows me, a claim schema (for employees), and the assertions BYU is making within that schema. These would be packaged up and cryptographically signed. I'd hold the claim, but it's existence would be recorded on the ledger, as would the DIDs it references and the claim schema.

    When I need to prove to the bank that I'm employed by BYU, I don't give them the claim. Instead I generate a proof—an incontrovertible certification of some fact—from the claim. The proof discloses only the information the bank needs. Further, the proof uses the DID that represents the relationship I have with the bank, not the one I have with BYU (since the bank doesn't know about that one). All this is done cryptographically2 so that no party to the transaction has any doubt whether or not the information is correct. A rough sketch of what each party ends up holding follows this list.
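
Here's that sketch, as Python dicts. The structures are illustrative, not Sovrin's actual record formats.

    # On the ledger: DIDs and the claim schema, but never the claim itself.
    on_ledger = {
        "dids": ["did:sov:byu-public", "did:sov:me-to-byu", "did:sov:me-to-bank"],
        "claim_schema": {"name": "employee", "version": "1.0",
                         "attr_names": ["employee_id", "department", "title"]},
    }

    # Held by me: the signed claim BYU issued about the DID I use with BYU.
    my_claim = {
        "issuer": "did:sov:byu-public",
        "subject": "did:sov:me-to-byu",
        "attributes": {"employee_id": "000123", "department": "Computer Science", "title": "Professor"},
        "signature": "<BYU's signature>",
    }

    # Given to the bank: a proof under the DID I use with the bank, disclosing
    # only that I'm employed by BYU, derived from the claim I hold.
    proof_for_bank = {
        "subject": "did:sov:me-to-bank",
        "disclosed": {"employed_by": "did:sov:byu-public"},
        "proof": "<cryptographic proof derived from my_claim>",
    }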

Properties of a Universal Trust Framework

DIDs, verifiable claims, and the Sovrin ledger give our trust framework several important properties.

First, Sovrin scales in applicability as well as raw transaction power. The use of a decentralized ledger and standards like decentralized identifiers and verifiable claims mean that anyone can make use of Sovrin for any kind of trust transaction. As I've discussed in detail before, Sovrin shares important virtues with the Internet: No one owns it, everyone can use it, and anyone can improve it. The use of a permissioned decentralized ledger allows Sovrin to scale to meet the needs of a global trust network with billions of users.

Second, Sovrin is general purpose. Where other platforms like AirBnB or BlaBlaCar are aimed at a specific problem, Sovrin can be used for any type of trust transaction. This means that you can use it for whatever is important to you. Sovrin is a tool that anyone can use to fill the trust gap. In this way it's more like the Internet or a programming language.

Third, Sovrin provides accessible provenance for trust transactions. Provenance is the foundation of accountability through recourse. In my previous example, my bank can look up the claim that is the basis of the proof, the claim schema, the DID BYU uses in the claim about my employment on the ledger, and the DID I use with them. They can cryptographically check that these are all correct. Further, they can determine whether to trust BYU based on the public claims recorded about its DID. Sovrin provides irrefutable evidence of trust transactions. If BYU's claim about my employment is wrong, my bank can track that down, and BYU knows this. This possibility encourages good behavior by all parties to the trust transaction.

Universal solutions solve previously intractable problems and make new applications more broadly available. A trust framework with the three properties listed above changes how we conduct business online. The Internet changed the world because it provided a universal means of communicating. Sovrin changes the world by providing a universal means of trusting. Sovrin can be used by anyone to solve their online trust problems. I've outlined a number of use cases for Sovrin, but these only scratch the surface because the world is full of use cases that share the problem Rachel Botsman describes—filling the trust gap so people can move from the known to the unknown.



Photo Credit: Lorimerlite Framework from Astris1 (CC BY-SA 3.0)

This post originated in and was made better through discussions with Craig Burton and Steve Fulling.

Notes:

  1. In identity circles, a claim is an assertion about a digital subject that is open to doubt. Thus a verifiable claim is an assertion that can be validated by the recipient.
  2. Technically, this is a zero knowledge proof.