What Use is a Master of Science in CS?


Recently a friend of mine, Eric Wadsworth, remarked in a Facebook post:

My perspective, in my field (admittedly limited to the tech industry) has shifted. I regularly interview candidates who are applying for engineering positions at my company. Some of them have advanced degrees, and to some of those I give favorable marks. But the degree is not really a big deal, doesn't make much difference. The real question is, "Can this person do the work?" Having more years of training doesn't really seem to help. Tech moves so fast, maybe it is already stale when they graduate. Time would be better spent getting actual experience building real software systems.

I read this, and the comments that followed, with a degree of interest because most of them reflect a gross misunderstanding of what an advanced degree indicates. The assumption appears to be that people who get a BS in Computer Science are learning to program and therefore getting an MS means you're learning more about how to program. I understand why this can be confusing. We don't often hire a plumber when we need a mechanical engineer, but Computer Science and programming are still relatively young and we're still working out exactly what the differences are.

The truth is that CS programs are not really designed to teach people to code, except as a necessary means to learning computer science, which is not merely programming. That's doubly true of a master's degree. There are no courses in a master's program that are specifically designed to teach anyone to program anything. You can learn to code at a thousand web sites. A CS degree includes topics like computational theory and practices, algorithms, database design, operating system design, networking, security, and many others, all presented in a way designed to create well-rounded professionals. The ACM Curriculum Guidelines (PDF) are a good place to see some of the detail in a program-independent way.

Most of what one learns in a Computer Science program has a long shelf life, by design. For example, I design the modules in my Large Scale Distributed Programming class to teach principles that have been important for 30 years and are likely to be important for 30 more. Preventing Byzantine failure, for example, has recently become the latest fad with the emergence of distributed ledgers. I learned about it in 1988. If your interview questions are asking people what they know about the latest JavaScript framework, you're unlikely to distinguish the person with a CS degree from someone who just completed a coding bootcamp.

What does one learn getting an advanced degree in Computer Science? People who've completed a master's degree have proven their ability to find the answers to large, complex, open-ended problems. This effort usually lasts at least six months and is largely self-directed. They have shown that they can explore scientific literature to find answers and synthesize that information into a unique solution. This is very different than, say, looking at Stack Overflow to find the right configuration parameters for a new framework. Often, their work is only part of some larger problem being explored by a team of fellow students under the direction of someone recognized as an expert in a specific field in Computer Science.

If these kinds of skills aren't important to your project, then you're wasting your time and money hiring someone with an advanced degree. As Eric points out, the holder of an MS won't necessarily be a better programmer, especially as measured by your interview questions and tests. And if they are important, you're unlikely to uncover a candidate's abilities in an interview. Luckily, someone else has spent a lot of time and money certifying that the person sitting in front of you has them. All free to you. That's what the letters MSCS on their resume mean.

Obviously, every position comes with an immediate need. Sometimes that can be filled by a candidate with good programming skills and a narrow education. Sometimes you want something more. But don't hire poorly because you misunderstand the credentials you're evaluating.

Photo Credit: Graduation from greymatters (CC0 Public Domain)

Verifying Constituency: A Sovrin Use Case

Jason Chaffetz Town Hall Meeting

Recently, my representative held a town hall that didn't go so well. Rep. Chaffetz claims that "the protest crowd included people brought in from outside his district specifically to be disruptive." I'm not here to debate the veracity of that claim, but to make a proposal.

First, let's recognize that members of Congress are more apt to listen when they know they are hearing from constituents. Second, this problem is exacerbated online. They wonder, "Are all the angry tweets coming from voters in my district?" and likely conclude they're not. Britt Blaser's been trying to solve this problem for a while.

Suppose that I had four verified claims in my Sovrin agent:

  1. Address Claim—A claim that I live at a certain address, issued by someone that we can trust to not lie about this (e.g. my bank, utility company, or a third party address verification service).
  2. Constituency Claim—A claim written by the NewGov Foundation or some other trusted third party, based on the Address Claim, that I'm a constituent of Congressional District 3.
  3. Voter Claim—A claim that says I'm a registered voter. Ideally this would be written by the State of Utah Election Office, but might need to be done by someone like NewGov based on voter rolls for now.
  4. Twitter Claim—A claim that proves I own a particular Twitter handle. Again, this would ideally be written by Twitter, but could be the work of a third party for now.[1]

Given these claims, Sovrin can be used to create a proof that @windley belongs to a verified voter in Congressional District 3. More generally, the proof shows that a given social media account belongs to a constituent who lives in a specific political jurisdiction.

Anyone would be able to validate that proof and check the claims that it is based on. This proof doesn't need to disclose anything beyond the Twitter handle and congressional district. No other personally identifying information need be disclosed by the proof.
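As a rough illustration of the selective disclosure described above, here is a minimal Python sketch. It only shows the data flow: the `Claim` class, the `build_constituency_proof` function, the issuer names other than NewGov, and the placeholder address are all hypothetical, and nothing here uses Sovrin's actual API or any cryptography.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    kind: str         # "address", "constituency", "voter", or "twitter"
    issuer: str       # who wrote the claim (illustrative names below)
    attributes: dict  # what the claim asserts


def build_constituency_proof(claims):
    """Combine the four claims into a proof that reveals only handle and district."""
    by_kind = {c.kind: c for c in claims}
    missing = {"address", "constituency", "voter", "twitter"} - by_kind.keys()
    if missing:
        raise ValueError(f"missing claims: {sorted(missing)}")
    if not by_kind["voter"].attributes.get("registered"):
        raise ValueError("holder is not a registered voter")
    # Only these two values leave the agent; the address stays private.
    return {
        "twitter_handle": by_kind["twitter"].attributes["handle"],
        "district": by_kind["constituency"].attributes["district"],
    }


claims = [
    Claim("address", "Example Utility Co.", {"address": "123 Example St"}),
    Claim("constituency", "NewGov Foundation", {"district": "UT-3"}),
    Claim("voter", "NewGov Foundation", {"registered": True}),
    Claim("twitter", "Example Verifier", {"handle": "@windley"}),
]
proof = build_constituency_proof(claims)
print(proof)
```

In a real system the proof would be cryptographically verifiable against the issuers' identifiers; here the returned dictionary simply stands in for it.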

How would we use this proof? Imagine, for example, a Website that publishes all tweets that contain the hashtag #UTCD3, but only if they are from twitter handles that are certified to have come from people who live in Congressional District 3.

A more ambitious use case would merge these verifications with the NewGov GEOvoter API to place the tweets on interactive maps to show where hotspots are. Combined with sentiment analysis, the constituency proof could be used to show political sentiment across the country, across the state, or within the local water district.

Sovrin provides a trusted infrastructure for issuing and using the verified claims. NewGov or someone else would provide the reason for trusting the claims written about verified voters. Eventually these claims should be written by the Elections Office or Twitter directly, providing even more trust in the system.

Photo Credit: Jason Chaffetz Town Hall Meeting in American Fork, Utah on August 10, 2011 from Michael Jolley (CC BY 2.0).

This post originated in a conversation I had with Britt Blaser.


  1. Twitter is simply one example. We could have claims for Facebook, Instagram, other social media, email, or any online tool.

Student Profiles: A Proof of Concept

Students in Class

In Sovrin Use Cases: Education, I broadly outlined how a decentralized identity ledger, Sovrin, could provide the tools necessary to build a decentralized university. This post takes the next step by laying out a three-phase project to provide a proof of concept.


Teaching students to be life-long learners in a digital age includes giving them tools and techniques they can use after they leave the university. We do students a disservice when we supply them with only enterprise-level tools that they lose access to once they've graduated. Learning management systems are fine, but they're not a personal tool that supports life-long learning. BYU has been exploring personal learning environments and operates a thriving Domain of One's Own program in support of this ideal.

Architected correctly, personal learning environments provide additional, important benefits. For example, we're exploring the use of decentralized personal student profiles to create a virtual university using programs, certifications, and courses from several different institutions.

A Proof of Concept

In Sovrin Use Cases: Education, I wrote:

The idea is to create a decentralized system of student profiles that contain both profile data as well as learning records. These learning records can attest to any kind of learning activity from the student having read a few paragraphs to the completion of her degree. By making the student the point of integration, we avoid having to do heavy, expensive point-to-point integrations between all the student information systems participating in the educational initiative.

The architecture relies on being able to write verifiable claims about learning activities and other student attributes. A verifiable claim allows the student to be the source of information about themselves that other parties can trust. Without this property, the decentralized approach doesn't work. I describe the details here: A Universal Trust Framework.

The proof of concept has three phases that are described below. When finished, we will have a prototype system that demonstrates all the technology and interactions required to put this architecture into use. The things we learn in the proof of concept will guide us as we roll the architecture out globally.

Phase 0: A Learning Record Store

The goal of Phase 0 is to build a basic student profile that has an API manager, learning record store, and some system for storing basic profile information.

Basic Student Profile

The system produced in Phase 0 is foundational. The student profile and learning record store (I'll use "student profile" to inclusively talk about the entire system from now on) provide the repository of student data. The student profile provides an API that supports event-based notification (mostly through the xAPI).

The requirements for the system built in Phase 0 include the following:

  • API Manager—the student profile will include a simple API manager.
  • Profile data—the student profile will be capable of storing and providing (through an API) basic profile data.
  • Learning record store—the student profile will include an xAPI-compatible LRS.
  • xAPI notifications from Canvas—The student profile should accept xAPI calls from the University's test instance of Canvas and make, as necessary, University API calls to other campus systems.
  • Permissioned Access—The student profile should support OAuth-based access to data.
  • Open source—Components of the student profile should be open source so that they can be modified to meet the needs of the proof of concept.
  • Hostable—The overall student profile should be built so as to allow it to be run in a variety of cloud environments.
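To make the learning record store requirement more concrete, here is a small Python sketch of an in-memory store that accepts xAPI-style statements (an actor, a verb, and an object). The `InMemoryLRS` class, the email address, and the course URL are hypothetical and only for illustration. A real xAPI-compatible LRS exposes an HTTP API and validates far more than this.

```python
import json
import uuid
from datetime import datetime, timezone


class InMemoryLRS:
    """Toy learning record store: keeps xAPI-style statements in a list."""

    def __init__(self):
        self._statements = []

    def store(self, statement):
        # Require the three core parts of an xAPI statement.
        for key in ("actor", "verb", "object"):
            if key not in statement:
                raise ValueError(f"statement is missing '{key}'")
        record = dict(statement)
        record.setdefault("id", str(uuid.uuid4()))
        record.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
        self._statements.append(record)
        return record["id"]

    def statements_for(self, actor_mbox):
        # Return every stored statement whose actor has this mailbox.
        return [s for s in self._statements if s["actor"].get("mbox") == actor_mbox]


lrs = InMemoryLRS()
statement_id = lrs.store({
    "actor": {"mbox": "mailto:student@example.edu", "name": "Example Student"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-US": "completed"},
    },
    "object": {"id": "https://example.edu/courses/example-course/module-1"},
})
print(json.dumps(lrs.statements_for("mailto:student@example.edu"), indent=2))
```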

Phase 1: Creating Claims

The goal of Phase 1 is to introduce Sovrin agents for both the student and BYU, and use those agents to create claims about some of the learning records in the student's LRS.

Making a Claim

The system produced in Phase 1 uses the Sovrin identity network to manage the claim creation process. Both BYU and the student profile will use what Sovrin calls an agent to interact with the network. The agent represents the entity (in our case either BYU or the student) to the Sovrin identity network. The agent is also responsible for managing any claims that the entity makes or possesses.

The requirements for the system built in Phase 1 include the following:

  • Identity on the network—The system should support entities creating identities in the form of Decentralized Identifiers (DIDs).
  • Student profile makes claim requests (2)—The student profile (through its agent) should be able to make claim requests of BYU's agent about any statement in the LRS.
  • Claim requests are validated (3a)—BYU's agent validates claim requests before issuing the claim.
  • Claims are issued (3b)—BYU's agent issues claims to the student's agent.
  • Claims are stored—The student agent stores claims it receives.
  • Claims are backed by pre-registered schema—Any claim issued by BYU will be based on claim schemas pre-registered in the Sovrin ledger. They can be BYU's or other's claim schemas, but the actual registering of the schema in the ledger is out of scope.

In Phase 1, any interactions with the student will be stubbed out with a default response.
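The request, validate, issue, and store steps in Phase 1 might be sketched in Python as follows. This is not Sovrin's API: the class names, the `did:example` identifiers, and the schema name are hypothetical, and an HMAC over the claim body stands in for the real digital signature an issuer would produce.

```python
import hashlib
import hmac
import json


class IssuerAgent:
    """Stand-in for BYU's agent."""

    def __init__(self, did, secret, records, schemas):
        self.did = did
        self._secret = secret    # stand-in for a real signing key
        self._records = records  # statements the issuer can confirm, per subject DID
        self._schemas = schemas  # stand-in for schemas pre-registered on the ledger

    def handle_claim_request(self, subject_did, schema, statement):
        # (3a) Validate the request before issuing anything.
        if schema not in self._schemas:
            raise ValueError("unknown claim schema")
        if statement not in self._records.get(subject_did, []):
            raise ValueError("statement cannot be validated")
        # (3b) Issue the claim, bound to issuer, subject, and schema.
        body = {
            "issuer": self.did,
            "subject": subject_did,
            "schema": schema,
            "statement": statement,
        }
        payload = json.dumps(body, sort_keys=True).encode()
        signature = hmac.new(self._secret, payload, hashlib.sha256).hexdigest()
        return {**body, "signature": signature}


class StudentAgent:
    """Stand-in for the student's agent: requests claims and stores them."""

    def __init__(self, did):
        self.did = did
        self.claims = []

    def request_claim(self, issuer, schema, statement):
        claim = issuer.handle_claim_request(self.did, schema, statement)
        self.claims.append(claim)
        return claim


byu = IssuerAgent(
    did="did:example:byu",
    secret=b"demo-only-secret",
    records={"did:example:student": ["completed module-1"]},
    schemas={"course-progress"},
)
student = StudentAgent("did:example:student")
claim = student.request_claim(byu, "course-progress", "completed module-1")
print(claim["issuer"], claim["statement"])
```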

Phase 2: Using Claims

The goal of Phase 2 is to provide proofs to another party about claims in the student profile.

Using a Claim

The requirements for the system built in Phase 2 include the following:

  • Relying Party with an Agent—A relying party, meant to simulate another school, uses an agent to interact with the ledger and other agents.
  • Student client—A student uses a client capable of interacting with the student's agent.
  • Student with multiple DIDs—The student uses different DIDs for interacting with BYU than she does for interacting with the relying party.
  • Relying Party asks for proof of some assertion (4)—The relying party can ask the student agent for proof of some assertion they have a claim for.
  • The student's agent asks the student client for permission (5 & 6)—The agent interacts with the client to get the student's permission to create a proof from the claim.
  • Agent creates proof (7)—The student agent creates a proof from an existing claim and returns it to the relying party.
  • Relying party validates the proof (8)—The relying party's agent uses the ledger to validate the proof. For the proof of concept, we can assume the relying party knows BYU (by its DID) and trusts BYU.
  • Generalizable—The system is capable of supporting multiple relying parties and BYU's agent can accept claims from them. The foundational evidence for the claim at the relying party can be mocked up; it doesn't need to come from Canvas or some other system.
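The Phase 2 interaction (proof request, permission check, proof creation, validation) can be sketched in Python as below. Everything here is a simplified assumption: the class and function names are hypothetical, the identifiers are random `did:example` strings, and the "proof" is a plain dictionary rather than a cryptographic proof checked against the ledger.

```python
import secrets


class StudentClient:
    """Stands in for the app the student uses to approve or deny requests."""

    def __init__(self, approve):
        self._approve = approve

    def ask_permission(self, relying_party, assertion):
        return self._approve(relying_party, assertion)


class StudentAgent:
    def __init__(self, client):
        self._client = client
        self._dids = {}    # one identifier per relationship
        self._claims = {}  # assertion name -> claim

    def did_for(self, relationship):
        # A different DID is used for each party the student deals with.
        if relationship not in self._dids:
            self._dids[relationship] = "did:example:" + secrets.token_hex(8)
        return self._dids[relationship]

    def store_claim(self, assertion, claim):
        self._claims[assertion] = claim

    def handle_proof_request(self, relying_party, assertion):
        claim = self._claims.get(assertion)
        if claim is None:
            return None
        # (5 & 6) Ask the student before disclosing anything.
        if not self._client.ask_permission(relying_party, assertion):
            return None
        # (7) Disclose only the assertion, its issuer, and the pairwise DID.
        return {
            "assertion": assertion,
            "issuer": claim["issuer"],
            "holder": self.did_for(relying_party),
        }


def relying_party_accepts(proof, trusted_issuers):
    # (8) The relying party trusts proofs backed by issuers it already knows.
    return proof is not None and proof["issuer"] in trusted_issuers


client = StudentClient(approve=lambda rp, assertion: assertion == "enrolled")
agent = StudentAgent(client)
agent.store_claim("enrolled", {"issuer": "did:example:byu"})
proof = agent.handle_proof_request("other-school", "enrolled")
print(relying_party_accepts(proof, {"did:example:byu"}))
```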

Bonus Phase: Self-Issued Claims as a Personal API

Matthew Hailstone made an interesting point on a draft of this post: proof requests amount to a very flexible personal API. This bonus phase explores that idea.

Self-Issued Claims

The requirements for the system built in the Bonus Phase include the following:

  • Student can self-issue claims (9)—The student can create claims based on information in the student profile.
  • Relying parties use claims (10-14)—Relying parties can use self-issued claims in the same way they can in Phase 2.
  • Self-issued claims are backed by pre-registered schema—Any claim issued by the student will be based on claim schemas pre-registered in the Sovrin ledger.

Sovrin proofs based on claims can be thought of as a very flexible personal API where any claim schemas the student profile supports become valid requests.
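One way to picture the personal API idea is a profile whose valid "endpoints" are exactly the claim schemas it can satisfy. The Python sketch below assumes a hypothetical `REGISTERED_SCHEMAS` table standing in for schemas pre-registered on the ledger; the schema names, field names, and `StudentProfile` class are invented for illustration.

```python
# Hypothetical stand-in for claim schemas pre-registered on the ledger.
REGISTERED_SCHEMAS = {
    "preferred-name": ("name",),
    "contact": ("email",),
    "mailing-address": ("street", "city"),
}


class StudentProfile:
    def __init__(self, did, data):
        self.did = did
        self._data = data

    def supported_requests(self):
        # A schema is a valid request only if the profile has every field it needs.
        return sorted(
            schema
            for schema, fields in REGISTERED_SCHEMAS.items()
            if all(field in self._data for field in fields)
        )

    def self_issue(self, schema):
        if schema not in self.supported_requests():
            raise KeyError(f"schema '{schema}' is not supported by this profile")
        fields = REGISTERED_SCHEMAS[schema]
        # The student is both issuer and subject of a self-issued claim.
        return {
            "issuer": self.did,
            "subject": self.did,
            "schema": schema,
            "attributes": {field: self._data[field] for field in fields},
        }


profile = StudentProfile(
    "did:example:student",
    {"name": "Example Student", "email": "student@example.edu"},
)
print(profile.supported_requests())
print(profile.self_issue("contact"))
```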

Future Work

The following is reserved for future work and is outside the scope of the proof of concept.

  • Using public keys for authentication—An entity's DID is linked not only to its agent, but also to a public key that has been created specifically for that identifier. These public keys could be very useful for authenticating the various entities in the system. The proof of concept won't do that unless specifically required for issuing and using claims.
  • Social Login—the profile doesn't have to support OAuth-based login or OpenID Connect for use as an authentication platform.
  • Domains—The proof of concept does not have to be hostable on a cPanel-based hosting system like BYU's Domains system.

Photo Credit: Student in Class from Albert Herring (CC BY 2.0)

A Universal Trust Framework

Lorimerlite Structure as a Framework

In We've stopped trusting institutions and started trusting strangers, Rachel Botsman talks about the "trust gap" that separates a place of certainty from something that is unknown. Some force has to help us "make the leap" from certainty to uncertainty and that force is trust.

Traditionally, we've relied on local trust that is based on knowing someone—acquired knowledge or reputation. In a village, I know the participants and can rely on personal knowledge to determine who to trust. As society got larger, we began to rely on institutions to broker trust. Banks, for example, provide institutional trust by brokering transactions—we rely on the bank to transfer trust. I don't need to know you to take your credit card.

But lately, as Botsman says, "we've learned that institutional trust isn't meant for the digital age." In the digital age, we have to learn how to trust strangers. Botsman discusses sharing platforms like AirBnB and BlaBlaCar. You might argue that they're just another kind of institution. But there's a key difference: those platforms are bidirectional. For example, AirBnB lets guests rate their hosts, but also lets hosts rate guests. These platforms give more information about the individual in order to establish trust.

But beyond platforms like AirBnB lies distributed trust based on blockchains and distributed ledgers. Botsman makes the point that distributed trust provides a system wherein you don't have to trust the individual, only the idea (e.g. distributed cash transactions) and the platform (e.g. Bitcoin). You can do this when the system itself makes it difficult for the person to misrepresent themselves or their actions. When I send you Bitcoin, you don't have to trust me because the system provides provenance of the transaction and ensures that it happens correctly. I simply can't cheat.

At a fundamental level, trust leads us to believe what people say. Online this is difficult because identifiers lack the surrounding trustworthy context necessary to provide the clues we need to establish trust. Dick Hardt said this back in 2007. The best way to create context around an identifier is to bind it to other information in a trustworthy way. Keybase does this, for example, to create context for a public key. Keybase creates a verifiable context for a public key by allowing its owner to cryptographically prove she is also in control of certain domain names, Twitter accounts, and Github accounts, among others. Keybase binds those proofs (the context) to the key. Potential users of the public key can validate for themselves the cryptographic proofs and decide whether or not to trust that the public key belongs to the person they wish to communicate with.

Another key idea in reputation and trust is reciprocity. Accountability and a means of recourse when something goes wrong create an environment where trust can thrive. This is one of the secrets to sharing economy platforms like AirBnB. Botsman makes the point that she never leaves the towel on the floor of an AirBnB because the host "knows" her. She is accountable and there is the possibility for recourse (a bad guest rating).

Trust Frameworks and Trust Transactions

The phrase we use to describe the platforms of AirBnB, BlaBlaCar, and other sharing economy companies is trust framework. Simply put, a trust framework provides the structure necessary to leap between the known and unknown.

For example, social login presents a trust leap for the Web sites relying on the social media site that's authenticating the user. When a user logs into a Web site using Facebook, trust is transferred between Facebook and the site they're logging into. Facebook establishes that the user is the same person who created the account based on the fact that she knows things like the username and password. The relying Web site trusts that Facebook will do a good job of this and thus is willing to accept Facebook's authentication in lieu of its own. This transfer of trust from Facebook is a trust transaction.

Trust frameworks generally rely on technologies, business processes, and legal agreements. All of these are important. For example, how much recourse a relying party has against Facebook is unclear, so social login has been limited to identity providers who relying parties trust. I could become an identity provider, but few Web sites will add me to their login process because they can't trust me.

Trust frameworks are all around us, but they are one-offs, too specialized to be universally applicable. In the case of AirBnB, the platform can only be used by AirBnB for trust transactions between hosts and guests. In the case of social login, the framework is open and non-proprietary, but limited to authentication transactions. Furthermore, only a few identity providers are trusted due to insufficient business process and legal structures.

Sovrin as a Universal Trust Framework

All of which brings me to Sovrin. If you've been following my blog, you probably get that Sovrin is a decentralized identity system based on a distributed ledger. But Sovrin's killer feature is verifiable claims[1]. The combination of decentralized identifiers (DIDs), verifiable claims, and a ledger that is available to all makes Sovrin a universal trust framework.

Let's unpack these to see why they're all necessary:

  • Decentralized Identifiers—DIDs allow anyone to create identifiers for anything. Furthermore, they are in a standard, interoperable format. People will have hundreds or thousands of DIDs representing all of the various digital relationships to which they're a party. These relationships might be with organizations they do business with, friends they interact with, or things they own. Organizations and many people will have public DIDs that represent their public digital presence. For example, I might have a DID that represents me to my employer, BYU, and another that represents me to my bank.

  • Verifiable claims—verifiable claims allow trustworthy assertions to be made about anything that has an identifier. These claims are standard and interoperable. Furthermore, they're based on strong cryptography to bind the claim issuer, the claim subject, and the claim itself. For example, BYU might issue a claim that says I'm an employee. My bank might issue a claim saying I have an account balance of $X. Issuing a claim is a trust transaction that is recorded on the ledger.

  • Sovrin ledger—the ledger provides the means of discovering the keys and endpoints associated with a particular DID. The ledger also records information about claims (although not the claims themselves). Consequently, Sovrin creates provenance about trust transactions and their constituent parts. For example, the claim that BYU makes that I'm an employee would reference BYU's public DID, the DID by which BYU knows me, a claim schema (for employees), and the assertions BYU is making within that schema. These would be packaged up and cryptographically signed. I'd hold the claim, but its existence would be recorded on the ledger, as would the DIDs it references and the claim schema.

    When I need to prove to the bank that I'm employed by BYU, I don't give them the claim. Instead I generate a proof (an incontrovertible certification of some fact) from the claim. The proof discloses only the information the bank needs. Further, the proof uses the DID that represents the relationship I have with the bank, not the one I have with BYU (since the bank doesn't know about that one). All this is done cryptographically[2] so that no party to the transaction has any doubt whether or not the information is correct.

Properties of a Universal Trust Framework

DIDs, verifiable claims, and the Sovrin ledger give our trust framework several important properties.

First, Sovrin scales in applicability as well as raw transaction power. The use of a decentralized ledger and standards like decentralized identifiers and verifiable claims means that anyone can make use of Sovrin for any kind of trust transaction. As I've discussed in detail before, Sovrin shares important virtues with the Internet: No one owns it, everyone can use it, and anyone can improve it. The use of a permissioned decentralized ledger allows Sovrin to scale to meet the needs of a global trust network with billions of users.

Second, Sovrin is general purpose. Where other platforms like AirBnB or BlaBlaCar are aimed at a specific problem, Sovrin can be used for any type of trust transaction. This means that you can use it for whatever is important to you. Sovrin is a tool that anyone can use to fill the trust gap. In this way it's more like the Internet or a programming language.

Third, Sovrin provides accessible provenance for trust transactions. Provenance is the foundation of accountability through recourse. In my previous example, my bank can look up the claim that is the basis of the proof, the claim schema, the DID BYU uses in the claim about my employment on the ledger, and the DID I use with them. They can cryptographically check that these are all correct. Further, they can determine whether to trust BYU based on the public claims recorded about its DID. Sovrin provides irrefutable evidence of trust transactions. If BYU's claim about my employment is wrong, my bank can track that down, and BYU knows this. This possibility encourages good behavior by all parties to the trust transaction.

Universal solutions solve previously intractable problems and make new applications more broadly available. A trust framework with the three properties listed above changes how we conduct business online. The Internet changed the world because it provided a universal means of communicating. Sovrin changes the world by providing a universal means of trusting. Sovrin can be used by anyone to solve their online trust problems. I've outlined a number of use cases for Sovrin, but these only scratch the surface because the world is full of use cases that share the problem Rachel Botsman describes—filling the trust gap so people can move from the known to the unknown.

Photo Credit: Lorimerlite Framework from Astris1 (CC BY-SA 3.0)

This post originated in and was made better through discussions with Craig Burton and Steve Fulling.


  1. In identity circles, a claim is an assertion about a digital subject that is open to doubt. Thus a verifiable claim is an assertion that can be validated by the recipient.
  2. Technically, this is a zero knowledge proof.

Every Computer is Distributed

LG 5K Ultrafine

I have one of the new 13" Macbook Pros with four USB C ports. I also have one of the new LG Ultrafine 5K monitors to use with it. Ever since I got the monitor, I've had all kinds of problems. The display will work for 10 minutes and then start resetting (acting like no display is connected, then reconnecting, then repeating) until the laptop gives up, freezes, and eventually restarts. I was hopeful the latest MacOS release might fix it, but it just kept on acting up, even after installing the latest.

I was about to give up on the monitor and just go back to my old reliable 27" Thunderbolt display. I tried disconnecting everything, rebooting the computer, using different ports...basically everything I could think of. Then I remembered one computer I hadn't rebooted: the monitor. Sure enough, unplugging the monitor from power so it was forced to reboot solved the problem.

My theory is that the monitor had gotten itself into a weird state that merely disconnecting it from the computer couldn't reset. The monitor is one of the computers in a distributed system. There are others, of course. We're already used to things like printers being computers in their own right because they have a network connection. But the monitor, keyboard, mouse, and even cables all have computers in them that have to cooperate to provide the personal computing experience.

So, next time your computer isn't working, you might just need to reboot one of the cables.

As an aside, the LG 5K Ultrafine is an incredible monitor: bright and clear. But it's not much of a hub because it has three spare USB C ports, so you still have to have all kinds of dongles. What's more, the 13" Macbook Pro can only drive one of these monitors. And even with a 15" Macbook Pro you can't daisy chain the monitors because there isn't enough bandwidth.

Using Picos for BYU's Priority Registration

Maeser Building

Every semester over 30,000 BYU students register for two to six sections out of about 6,000 sections being offered. During the priority registration period this results in a surge as students try to register for the classes they need.

One proposed solution for handling the surge is to create a microservice representing sections that can spawn copies of itself as needed. Thinking reactively, I concluded that we might want to just have an independent actor to represent each section. Actors are easy to create and the underlying system handles the scaling for you.

Coincidentally, I happen to have an actor-style programming system. Picos are an actor-model programming system that supports reactive programming. As such they are great for building microservices.

Bruce Conrad and Matthew Wright are reimplementing the KRE pico platform using Node JS. We call the new system the "pico engine." I suggested to Bruce that this class registration problem would be a fun way to test the capabilities of the new pico engine. What follows are the results of Bruce's experiment.

Experimental Setup

We were able to get the add and drop requests for the first two days of priority registration for Fall 2016. This data included 6,675 students making 44,505 registration requests against 5,393 sections.

The root actor in a pico system is called the Owner Pico. The Owner Pico is the ancestor of all other picos. For this experiment, Bruce created two child picos for the Owner Pico to represent the collection of sections and the collection of students. The Section Collection Pico is the parent of a pico for each section. The Student Collection Pico is the parent of a pico for each student. The resulting pico system looked like the following diagram from the Pico Engine's developer console:

Overall Setup for Registration System

Bruce chose not to create picos for each student for this experiment. He programmatically created picos for each section. The following figure shows the resulting pico graph:

Picos for Every Section Taught in Fall 2016

Each section pico has a unique identity and knows things like its name, section id, and seating capacity. The pico also keeps a roster of students who have registered for the course. The picos instantiated themselves with this data by calling a microservice with the data.

Using this configuration, Bruce replayed the add/drop requests as events to the pico engine. Events were replayed by sending a section:add or section:drop event to the Section Collection Pico (which routed requests to the individual section picos).
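The routing pattern described above can be sketched as a pair of plain Python classes. This is not pico engine code (picos are not programmed in Python), and the capacity value, student ids, and the rule that an add to a full section is rejected are assumptions made for the example.

```python
class SectionPico:
    """One actor per section: holds its own capacity and roster."""

    def __init__(self, section_id, capacity):
        self.section_id = section_id
        self.capacity = capacity
        self.roster = set()

    def receive(self, event_type, student_id):
        if event_type == "section:add":
            # Assumed rule: reject the add when the section is full.
            if student_id not in self.roster and len(self.roster) >= self.capacity:
                return False
            self.roster.add(student_id)
            return True
        if event_type == "section:drop":
            self.roster.discard(student_id)
            return True
        raise ValueError(f"unknown event type: {event_type}")


class SectionCollectionPico:
    """Parent actor: routes each event to the section it names."""

    def __init__(self):
        self.sections = {}

    def create_section(self, section_id, capacity):
        self.sections[section_id] = SectionPico(section_id, capacity)

    def receive(self, event_type, section_id, student_id):
        return self.sections[section_id].receive(event_type, student_id)


collection = SectionCollectionPico()
collection.create_section("STDEV 150-3", capacity=2)  # capacity is illustrative

events = [
    ("section:add", "STDEV 150-3", "student-1"),
    ("section:add", "STDEV 150-3", "student-2"),
    ("section:add", "STDEV 150-3", "student-3"),
    ("section:drop", "STDEV 150-3", "student-1"),
]
results = [collection.receive(*event) for event in events]
print(results, sorted(collection.sections["STDEV 150-3"].roster))
```

Because each section keeps its own state and only reacts to events addressed to it, sections never contend with one another, which is what makes this style easy to spread across engines.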

After the requests have been replayed, we can query individual section picos to ensure that the results are as expected. This figure shows the JSON result of querying the section pico for STDEV 150-3. Queries were done programmatically to verify all results.

Querying a section pico shows it has the right state


Bruce conducted two tests using a pico engine running on a Macbook Pro (2.6 GHz Intel Core i5).

The first replay, done with per-event timing enabled, required just under an hour for the 44,504 requests. The average time per request was about 80 milliseconds (of which approximately 30 milliseconds was overhead relating to measuring the time).

A second replay, done with the per-event timing disabled, processed the same 44,504 add/drops in 35 minutes and 19 seconds. The throughput was 21 registration events per second or 47.6 milliseconds per request.
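The figures above are easy to check by hand. Here is a quick sketch that recomputes them from the request count and elapsed times quoted in the text; the variable names are mine, not part of the experiment.

```python
# Figures quoted in the text for the two replays.
total_requests = 44_504

# Second replay (per-event timing disabled): 35 minutes 19 seconds.
elapsed_seconds = 35 * 60 + 19

throughput = total_requests / elapsed_seconds          # events per second
latency_ms = elapsed_seconds / total_requests * 1000   # milliseconds per event
print(f"{throughput:.1f} events/s, {latency_ms:.1f} ms/event")

# First replay (per-event timing enabled): about 80 ms per request.
first_replay_minutes = total_requests * 0.080 / 60
print(f"{first_replay_minutes:.1f} minutes")
```

The first replay works out to a little over 59 minutes, which matches "just under an hour."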

For reference, the legacy system, running on several large servers, sustains a peak rate of 80 requests per second, while averaging one request about every 3.9 seconds.

These tests show that a single pico engine running on a laptop can work through tens of thousands of events at a sustained rate of about 21 events per second.

Conclusions and Going Further

Based on the results of this experiment, I believe we're closing in on the performance goal we set for the Pico Engine rewrite. I believe we could achieve better results by running picos across multiple engines. Since every pico is independent, this is easily done and would avoid resource contention by better parallelizing pico operation.

The current pico engine does not support dynamic pico assignment where any free engine in a group can reconstitute and operate a pico as necessary. KRE, the original pico platform, does this to make use of multiple servers for parallelizing requests. The engineering to achieve this is straightforward.

If you're interested in programming with picos, the quickstart will guide you through getting it set up (about a five minute operation) and the lessons will introduce you to reactive programming with picos.

Photo Credit: Maesar Building from Ken Lund (CC BY-SA 2.0)

Sovrin Use Cases: GPG as a Sovrin Client

Lately, I’ve been thinking a lot about use cases for self-sovereign identity. This series of blog posts discusses how Sovrin can be used in different industries. In this article I discuss Sovrin and GPG.

As I read "I’m throwing in the towel on PGP, and I work in security" by Filippo Valsorda, I couldn't help but think that it illustrates a real problem that Sovrin solves. Valsorda says:

But the real issues, I realized, are more subtle. I never felt confident in the security of my long-term keys. The more time passed, the more I would feel uneasy about any specific key. Yubikeys would get exposed to hotel rooms. Offline keys would sit in a far away drawer or safe. Vulnerabilities would be announced. USB devices would get plugged in.

A long-term key is as secure as the minimum common denominator of your security practices over its lifetime. It's the weak link.

To see how Sovrin can help, let's talk about DIDs and DDOs.

DIDs and DDOs

A distributed ledger like Sovrin is a key-value store with entries that are temporally ordered and immutable. Decentralized Identifiers (DIDs) are intended to be one kind of key for a ledger. A DID is similar to a URN (uniform resource name) and has a similar syntax. Here's an example DID:

did:sov:21tDAKCERh95uGgKbJNHYp

The ternary structure includes a scheme identifier (did), a namespace identifier (sov in this case) and an identifier within that namespace (21tDAKCERh95uGgKbJNHYp) separated by colons. DIDs are meant to be permanently assigned and not reused. DIDs can be used to identify anything: person, organization, place, thing, or even concept.

DIDs point to DDOs, or DID Descriptor Objects. A DDO is a JSON-LD-formatted data structure that links the DID to public keys, signatures, and service endpoints. We can use the signatures to validate that the DDO has not been tampered with, the service endpoint to discover more information about the DID, and the public key to secure communication with the entity identified by the DID.

Sovrin is designed to support a different key pair for each DID. Consequently, a DID represents an identifier that can be specific to a particular relationship. Say I give 20 DIDs to friends I want to communicate with. Each associated DDO would contain a public key that I generated just for that relationship. I would sign these with a key that is well known and associated with social media and other well-known online personas I maintain.[1]

The keys associated with these DIDs can be rotated as frequently as necessary because people never store my key; they store only the DID I give them for communicating with me. The ledger ensures that the most recent public key for a given DID can always be validated by checking the signature against the key associated with my well-known DID.
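To make the rotation idea concrete, here is a minimal sketch of an append-only ledger where the most recent entry for a DID wins. This is a toy model, not Sovrin's API: the Ledger class, the DDO layout, and the key strings are all placeholders I made up, and signature checking is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy append-only ledger: entries are ordered and never modified."""
    entries: list = field(default_factory=list)

    def write(self, did: str, ddo: dict) -> None:
        # New entries supersede older ones; nothing is overwritten or deleted.
        self.entries.append((did, ddo))

    def resolve(self, did: str) -> dict:
        # Walk backwards so the most recent DDO for the DID wins.
        for entry_did, ddo in reversed(self.entries):
            if entry_did == did:
                return ddo
        raise KeyError(did)


ledger = Ledger()
did = "did:sov:21tDAKCERh95uGgKbJNHYp"       # example DID from the text
ledger.write(did, {"public_key": "key-v1"})  # placeholder key material
ledger.write(did, {"public_key": "key-v2"})  # rotation: append a new DDO

current_key = ledger.resolve(did)["public_key"]
print(current_key)  # key-v2
```

A correspondent who stored only the DID picks up the rotated key on the next lookup, without ever having held the old one.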

Of course, I've also stored DIDs for my friends and can check communications from them in the same way.

These features, taken together, do away with the need for long-term keys and ease the burden of knowing the public key for someone you want to communicate with. So long as you know the DID for the person, you can encrypt messages they can read.[2]

A Proposal: GPG and Sovrin

Which brings me to my proposal. Sometime in the first part of 2017, the Sovrin identity network will go live. The sandbox is available now. Many of the most advanced features of Sovrin will not be available in the MVP, but DIDs, DDOs, and public-private key pairs will be.

Could GPG be modified to perform as a Sovrin Client? I believe the following would be required:

  1. Create DIDs with valid public keys on the Sovrin ledger [3]
  2. Store and manage the private keys associated with those public keys
  3. Store DIDs and associate them with contacts on the user's computer
  4. Look up the DDO and public key for a given DID on the Sovrin ledger
  5. Check signatures on the DDOs
  6. Use the keys in the DDO for cryptographic operations

I'll admit to being a naive user of GPG, so perhaps there are problems in what I'm proposing that I can't see. Please let me know. But if GPG could be made to work with Sovrin, it would seem to solve some of the problems that have long plagued PGP-based message exchange and present a great use case for the Sovrin ledger.

  1. This feature makes DID-based public keys superior to other solutions where the user is identified everywhere by a single identifier (say a phone number) since it prevents correlation.
  2. I recognize that this proposal doesn't solve the very real issue of private key management on the client.
  3. Key creation would be aided by an agency run by the Sovrin Foundation to save GPG from having to do the heavy lifting.

Photo credit: The Enigma Machine from j. (CC BY 2.0)

Sovrin Use Cases: Portable Picos

Lately, I’ve been thinking a lot about use cases for self-sovereign identity. This series of blog posts discusses how Sovrin can be used in different industries. In this article I discuss Sovrin and Picos.

Anyone who reads this blog regularly knows that I've been working on a system for creating online agents using an event-driven, actor-model programming system called "persistent compute objects" or "picos" for short. Picos support a personal, decentralized model for the Internet of Things. Fuse, a connected-car platform, was designed to show how this model works.

Picos support a hosting model with the same properties that the Internet itself has, namely decentralization, heterarchy, and interoperability. Central to these features is the idea of portability: a pico should be movable from one hosting platform (something we call a "pico engine") to another without loss of functionality.

I believe that the future Internet of Things will require architectures where online "device shadows" (as Amazon calls them) are not only under the control of the owner, but hostable in a variety of places, not just on the manufacturer's servers. For example, I might buy a toaster that has a device shadow hosted by Black and Decker and later decide to move that device shadow (along with its interaction history) to a hosting platform I control. Picos are a way of imagining portable device shadows that are under owner control.


The most difficult problem standing in the way of easily moving picos between hosting engines is that picos receive messages using a URL that follows the format defined by the Sky Event Protocol. Because it's a URL, it's a location. Moving a pico from one engine to another requires updating all the places that refer to that pico, an impossible task. HTTP redirects could potentially be used, but they have scaling limitations I'd rather stay away from. Specifically, frequent moves might create long chains of redirects that have to be maintained since you can never be sure when every service with an interest in the pico has seen the redirect and updated its records.

The more general solution is to have picos refer to each other by name and have some way of resolving a name to the current location of the pico. In the past this always led me to an uncomfortable reliance on some kind of centralized registry, but recent developments in distributed ledgers have made name resolution possible without reliance on a central registry.

DIDs and DDOs

A distributed ledger like Sovrin is a key-value store with entries that are temporally ordered and immutable. Decentralized Identifiers (DIDs) are intended to be one kind of key for a ledger. A DID is similar to a URN (uniform resource name) and has a similar syntax. Here's an example DID:

did:sov:21tDAKCERh95uGgKbJNHYp

The ternary structure includes a scheme identifier (did), a namespace identifier (sov in this case) and an identifier within that namespace (21tDAKCERh95uGgKbJNHYp) separated by colons. DIDs are meant to be permanently assigned and not reused. DIDs can be used to identify anything: person, organization, place, thing, or even concept.

DIDs point to DDOs, or DID Descriptor Objects. A DDO is a JSON-LD-formatted data structure that links the DID to public keys, signatures, and service endpoints. We can use the signatures to validate that the DDO has not been tampered with, the service endpoint to discover more information about the DID, and the public key to secure communication with the entity identified by the DID.

Using Sovrin with Picos

DIDs on the Sovrin ledger provide a convenient way to name picos with an identifier that can be resolved to the current location of the pico without relying on a centralized registry. The figure below shows how this could work.

Initial State

The figure shows two pico engines and five picos. Two of the picos are acting as decentralized directories. The Sovrin ledger is associating a DID for each pico with a DDO that names the directory as the service endpoint. The database representations on the right show the persistent state of the directory picos and map DIDs to the current location of the pico (which engine).

We use a directory pico, rather than writing the engine directly in the DDO, to allow for frequent updates. Suppose we wanted to move the pico between engines frequently. A directory can be easily updated, multiple times per second if necessary. There can be many directories. The location of the directories is arbitrary and unimportant. You'll notice I put them both in the same engine, but they could be anywhere because they're located using the DDO.

Finding a pico, given its DID, is simply a matter of looking up the DDO for the DID on the ledger, retrieving the directory from the service endpoint entry, consulting with the directory to find the engine (and any other important information) and constructing a location pointer (what we call an ESL or Event Service Location).
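The lookup chain just described can be sketched in a few lines. Everything here is a stand-in: the dictionaries play the role of the ledger and a directory pico, the directory name, host name, and ECI value are invented, and the URL layout of the ESL is illustrative rather than the actual Sky Event Protocol format.

```python
# Hypothetical in-memory stand-ins for the ledger and one directory pico.
PICO_DID = "did:sov:21tDAKCERh95uGgKbJNHYp"  # example DID from the text

LEDGER = {
    PICO_DID: {"service_endpoint": "directory-1"},   # DDO names the directory
}
DIRECTORIES = {
    "directory-1": {
        PICO_DID: {"engine": "engine-b.example.com", "eci": "eci-abc123"},
    },
}


def discover(did: str) -> str:
    """Resolve a DID to an ESL: ledger -> DDO -> directory -> engine."""
    ddo = LEDGER[did]                                  # look up the DDO on the ledger
    directory = DIRECTORIES[ddo["service_endpoint"]]   # retrieve the directory
    record = directory[did]                            # consult it for the engine
    # Construct a location pointer; this URL layout is an assumption.
    return f"https://{record['engine']}/sky/event/{record['eci']}"


esl = discover(PICO_DID)
print(esl)
```

Note that the ledger entry never mentions the engine. Only the directory knows where the pico currently lives.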

Multiple Identifiers

This model easily supports a pico having multiple identifiers (an important feature of picos that prevents correlation). This is supported in the Sky Event Protocol by what are called Event Channels. Each event channel has a unique event channel identifier, or ECI. In addition to preventing correlation, having separate event channels allows the pico to manage, permission, and respond to messages independently. The figure below shows how this is supported:

Supporting Multiple DIDs

In the figure, Pico 2 has two DIDs. Each points, arbitrarily, to a different directory. The directories not only contain the location of the pico engine, but a unique ECI. The information in the directory can be used to construct an ESL from each DID that can't be correlated with others.

Creating a New Pico

The process for creating and registering the name for a new pico is straightforward as shown in the following figure. The new pico is created and appropriate entries are added to the directory. A new entry is written to the ledger associating the DID for the pico with a DDO pointing to the directory.

Creating a New Pico


Pico discovery is simple using the DID. Suppose Pico 2 has the DID for the newly created Pico 4. The following figure shows the discovery process.

Discovering a Pico

  1. Pico 2 uses the DID to discover the directory using the DDO written on the ledger.
  2. Pico 2 queries the directory.
  3. The directory returns the data about Pico 4.
  4. Pico 2 constructs an ESL.

When a pico wants to send a message to another pico, it may have to repeat the discovery process to find out where it is. The ESL that is generated from the discovery process can be cached and expired based on time or failure.
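Here is one way the caching could look, as a sketch only. The EslCache class, its method names, and the five-minute default are my own choices for illustration; the discovery function is passed in so the cache doesn't care how resolution actually happens.

```python
import time


class EslCache:
    """Caches ESLs and drops them after a TTL or a delivery failure."""

    def __init__(self, discover, ttl_seconds=300.0, clock=time.monotonic):
        self._discover = discover   # callable: DID -> ESL
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # DID -> (ESL, time it was cached)

    def get(self, did):
        cached = self._entries.get(did)
        if cached is not None and self._clock() - cached[1] < self._ttl:
            return cached[0]
        esl = self._discover(did)   # missing or expired: repeat discovery
        self._entries[did] = (esl, self._clock())
        return esl

    def report_failure(self, did):
        # A failed send suggests the pico moved, so forget the stale ESL.
        self._entries.pop(did, None)


# Usage with a fake discovery function and a controllable clock.
lookups = []
now = [0.0]


def fake_discover(did):
    lookups.append(did)
    return f"esl-{len(lookups)}"


cache = EslCache(fake_discover, ttl_seconds=10, clock=lambda: now[0])
first = cache.get("did:sov:21tDAKCERh95uGgKbJNHYp")
second = cache.get("did:sov:21tDAKCERh95uGgKbJNHYp")  # served from the cache
```

Calling report_failure after a failed send forces the next get to go back through discovery, which is how a moved pico gets found again.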

Creating a Subscription

Picos create relationships with each other that are called subscriptions. The primary task in creating a subscription is key exchange. DIDs allow this to happen easily. In the following figure, assume that Pico 2 has already gone through the discovery process outlined above and has an ESL for Pico 4.

Creating a Subscription

  1. Pico 2 sends a signed subscription request that includes a DID to Pico 4.
  2. Pico 4 uses the discovery process to get Pico 2’s public key and other information from the DDO in the ledger.
  3. Pico 4 uses the public key it gathers in (2) to validate that the request in (1) came from Pico 2.
  4. Pico 4 stores Pico 2’s public key, name, and other relevant data; caches the directory and engine; and sends a subscription response.
  5. Pico 2 validates the response and stores subscription information for Pico 4.

Moving Picos

As shown in the following figure, when a pico is moved from one engine to another, only the directory need be updated.

Moving a Pico

Changing Directories

There will be reasons to change what directory a pico is using for a given DID. That requires updating the ledger as shown in the following figure.

Updating the Directory

The crossed out entry in the ledger is meant to indicate that it's been superseded, not that it's been deleted. Ledger entries are immutable.
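The difference between moving a pico and changing its directory can be shown with a toy model. The list-based ledger, the directory names, and the engine names below are all invented for illustration; the point is only which data structure gets touched in each case.

```python
ledger = []                                # append-only list of (DID, DDO) entries
directories = {"dir-1": {}, "dir-2": {}}   # directory name -> {DID: record}


def current_ddo(did):
    # The most recent ledger entry for a DID supersedes earlier ones.
    return next(ddo for d, ddo in reversed(ledger) if d == did)


def locate(did):
    directory = directories[current_ddo(did)["service_endpoint"]]
    return directory[did]["engine"]


did = "did:sov:21tDAKCERh95uGgKbJNHYp"     # example DID from the text
ledger.append((did, {"service_endpoint": "dir-1"}))
directories["dir-1"][did] = {"engine": "engine-a"}

# Moving the pico: only the directory record changes; the ledger is untouched.
directories["dir-1"][did]["engine"] = "engine-b"
entries_after_move = len(ledger)

# Changing directories: copy the record over, then append a new ledger entry.
directories["dir-2"][did] = directories["dir-1"].pop(did)
ledger.append((did, {"service_endpoint": "dir-2"}))

print(locate(did))  # engine-b
```

The original ledger entry is still there after the directory change. It has simply been superseded by the newer one.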


The preceding scenarios describe the methods picos can use to make them portable while maintaining maximal decentralization. The use of names (DIDs) recorded on the Sovrin distributed ledger that reference directories for pico location provides an efficient, flexible, and complete means of maintaining connectivity between picos even if they are moved often. There are no centralized services that could create a single point of failure.

Photo Credit: sunrise_sequence_01w by Jan (Arny) Messersmith (CC BY-NC-ND 2.0)

Sovrin Use Cases: Education

University Lecture hall

Lately, I’ve been thinking a lot about use cases for self-sovereign identity. This series of blog posts discusses how Sovrin can be used in different industries. In this article I discuss Sovrin and education.

Last spring I wrote about how deconstructing the student information system at BYU is allowing us to create a virtual university. The idea is to create a decentralized system of student profiles that contain both profile data and learning records. These learning records can attest to any kind of learning activity from the student having read a few paragraphs to the completion of her degree. By making the student the point of integration, we avoid having to do heavy, expensive point-to-point integrations between all the student information systems participating in the educational initiative.

A virtual university using a student profile

This diagram shows a number of players in the virtual university (VU) system (labeled CES Initiative). They have no direct connections to each other, but present their own APIs, as does the student profile. As students interact with the various institutions, they present data in their profile (using an API). That's the only point of integration for these institutions, including VU itself.

One of the design requirements for the student profile is that it be hostable wherever the student desires. We also put the student in control of authorizations to use data. These features mitigate regulatory requirements around privacy of student data. With millions of potential students spread among 188 different countries, this saves significant cost and headache trying to meet various regulatory hurdles. When the student is an active participant in sharing their data, the hurdles can be more easily overcome.

The Role of Self-Sovereign Identity

Self-sovereign identity can simplify many of the most difficult challenges of building the student profile. One of the key features of Sovrin is support for verifiable claims shared by their subject. Nearly everything in the student profile looks like a verifiable claim. Sovrin provides the infrastructure for issuing claims and using them in disclosures. Sovrin enables the receiver of the disclosure to verify the claims in it.

Verifiable claims also significantly ease the integration burden between the various systems involved in the virtual university. Say Alice wants to be a bookkeeper, but the bookkeeping microcourse has requirements for English language proficiency and secondary school completion that she lacks. There could be four institutions involved in this scenario: Virtual University (VU), the English language certifier (ELC), the secondary education provider (SEP), and the college offering bookkeeping (BK). These last three institutions all have agreements with VU, but no technical integration with VU or each other.

Alice has been admitted to VU and paid for the bookkeeping microcourse along with the required prerequisites.[1] She goes to ELC. Alice can prove she's a VU student and has paid for the English language certification by disclosing that information from the verifiable claims that VU wrote to her student profile. ELC can validate that the disclosure is true without talking to VU directly. Alice registers for a course and, after she completes it, ELC writes a claim that she passed to her student profile.

Alice, guided by the Student Profile, heads to SEP and discloses that she's a VU student. SEP validates her student status and that she's paid for the SEP program. They ask her to prove that she has the required level of English competency and she provides a disclosure based on the claim that ELC wrote. She completes the secondary education program and SEP writes a claim stating such.[2]

The same thing happens for Alice's bookkeeping course at BK. One notable difference is that the microcourse that Alice signed up for might be made up of courses offered at different universities. You might be shocked to know that even in 2016, transcripts are still largely shared as PDFs, or even by fax, and processed by teams of people who transcribe them. Verifiable claims provide a solution to this problem that is decentralized, open, standards-based, and non-proprietary. Because the claims are anchored in the Sovrin ledger, they can be hosted anywhere in the world and still be usable by any participant in the system.

Alice may have the option of choosing to take the same course from several institutions. Alice can share her current status with VU and they can give her recommendations based on the possibilities. Alice can go to any available institution. They simply read disclosures she's created about the past courses she's taken to ensure that she meets the prerequisites and write claims each time she completes a course. Once all the courses are complete, VU issues a claim for the bookkeeping certificate that Alice can use when she applies for jobs or if she decides to take additional courses.

Student Profiles as Sovrin Agencies

One of the interesting factors of Virtual University is that many participants will be subject to different legal jurisdictions. Trust frameworks can be built on Sovrin to ensure that they can trust each other regardless. For example, Sovrin's built-in system of consent receipts allows a student's consent regarding data use to be recorded and notarized by a system outside the University. So long as Sovrin can be trusted to record and notarize these consent receipts properly, they can be trusted by participants regardless of where they reside.

The preceding case study was a little vague about where things are stored, merely talking about "writing claims" and "creating disclosures." As a rule, it's not a good idea to write claims, disclosures, and other data directly to the Sovrin ledger. Instead, these are generally written to some kind of data store and then anchored in the ledger by writing a signature of the data on the ledger. The student profile and learning record store functions as a Sovrin Agent. Agents are Sovrin-enabled systems that store and process information on their owner's behalf.
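A rough sketch of the anchoring idea follows. To keep it runnable with only the standard library, it writes a plain SHA-256 digest to the ledger rather than a real signature, so it shows tamper detection but not who issued the claim. The claim fields, the claim id, and both stores are invented for illustration and do not reflect Sovrin's actual formats.

```python
import hashlib
import json


def digest(record: dict) -> str:
    # Canonical JSON so the same claim always hashes to the same value.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


ledger = []          # stand-in for the ledger: holds only digests
profile_store = {}   # stand-in for the student profile: holds the data

claim = {
    "subject": "Alice",
    "issuer": "ELC",
    "statement": "passed English language certification",
}
profile_store["claim-1"] = claim
ledger.append({"claim_id": "claim-1", "digest": digest(claim)})


def is_anchored(claim_id: str) -> bool:
    """Check the stored claim against the digest recorded on the ledger."""
    expected = next(e["digest"] for e in ledger if e["claim_id"] == claim_id)
    return digest(profile_store[claim_id]) == expected


untouched = is_anchored("claim-1")
```

The claim itself can live wherever the student chooses to host her profile. Anyone holding a copy can recompute the digest and compare it with the ledger entry.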

As education is increasingly deconstructed, a trustworthy system for issuing and relying on verifiable claims and consent receipts enables decentralized, interoperable education ecosystems. Sovrin significantly decreases the number of heavy integrations necessary to realize the virtual university. Universities exist, to some degree, because they are trustworthy issuers of credentials. Sovrin provides a foundation for extending that trust outside these traditional institutions.

  1. She could, of course, pay for them as she takes them, but this makes the case study a little simpler.
  2. Alice isn't necessarily involved in the details of any of this. The Student Profile could be acting as her agent to automatically disclose this information to institutions she trusts based on policy.

Notes from Defrag 2016

Defrag logo

The following are my live tweets from Defrag at the Omni Interlocken in Denver, November 16-17, 2016.

Eric Norlin:

Defrag is 10 years old. Happy birthday.

Announcing that this is the last time Defrag exists independently. Combining with Glue. #sad

Tim Wagner:

Going serverless with AWS Lambda

Does serverless mean no servers? No, but there are no software servers (web server framework, etc.)

Serverless goes beyond VMs and containers so that functions are the unit of abstraction

Serverless enforces good design: small individual units of code, persistence separated from compute, one way to do things

Triggers by events or calls from APIs. Easy to do real-time processing, event processing

Bring your own code, Lambda is the web server, uses simple resource model, processes are stateless

Economic proposition for lambda is "never pay for idle"

Lambda economic model makes building a service simple by removing need to understand servers, availability zones, scaling

Would you rather carry a giant beachball or 700 marbles without a bag? Both suck. Monoliths and microservices...

Lambda provides the bag for the marbles.

Open spec and tools to package and deploy entire serverless model #comingsoon

Serverless means bring your own frameworks. No need to learn a new framework

Lambda is a universal webhook receiver platform.

Lambda reduces the need for understanding complex distributed computing concepts

Future is a hybrid of containers and serverless to bridge these two worlds

Prediction: we go from abstracting compute infrastructure to abstracting computational systems

Lorinda Brandon:

Marketing person at a tech conference gets "you have stickers, right?"

Marketing smells funny: they always have good news, they're perky. Makes you think they're hiding something

Marketing thinks developers are mean and scary.

Developers are great conversationalists. Developer marketing has to take the bad with the good. Get feedback

Tips for developer marketing: Start with GOOD; respond with truth; actively listen; be flexible; get out of the way

API == assume positive intent

Duncan Johnston-Watt:

Hyperledger is a project of the Linux Foundation; lots of fancy sponsors

A world of many chains. There will not be a chain of all chains. There will be many public chains and millions of private chains

Blockchain allows everyone to agree on who owns what (for example) rather than having central registry for the asset

Hyperledger Fabric is a permissioned blockchain network written in Go, chaincode in Go or Java

Cloudsoft's interest in Hyperledger is deploying and managing the consensus network and deploying applications on it.

For example: Sotheby's want to run an auction. Cloudsoft could manage the deployment and management of Sotheby's auction code

Mike Zaccardo:

Now Mike Zaccardo is demoing the Sotheby's auction demo

Hyperledger Fabric has a notion of containers that allow multiple blockchain apps to run on the Hyperledger Fabric

demo shows transferring lots from one client to another, ensuring ownership

Duncan Johnston-Watt:

Hyperledger is too big to be owned by a single entity. We should own it in common.

See http://www.cloudsoft.io/gethlf for more info