Sovrin Status: Alpha Network Is Live


Sovrin is based on a permissioned distributed ledger. Permissioned means that there are known validators that achieve consensus on the distributed ledger. The validators are configured to achieve Byzantine fault tolerance, but because they are known, the network doesn't have to deal with Sybil attacks. This has several implications:

  1. The nodes are individually unable to commit transactions, but collectively they work together to create a single record of truth. Individual nodes are run by organizations called "Sovrin Stewards."
  2. Someone or something has to choose and govern the Stewards. In the case of Sovrin, that is the Sovrin Foundation. The nodes are governed according to the Sovrin Trust Framework.

The Sovrin Network has launched in alpha. The purpose of the Alpha Network is to allow Founding Stewards to do everything necessary to install and test their validator nodes before we collectively launch the Provisional Network. It’s our chance to do a dry-run to work out any kinks that we may find before the initial launch.

Here’s what we want to accomplish as part of this test run:

  • Verify technical readiness of the validator nodes
  • Verify security protocols and procedures for the network
  • Test emergency response protocols and procedures
  • Test the distributed, coordinated upgrade of the network
  • Get some experience running the network as a community
  • Work out any kinks and bugs we may find

With these steps complete, Sovrin will become a technical reality. It’s an exciting step. We currently have nine stewards running validator nodes and expect more to come online over the next few weeks. Because the Alpha Network is for conducting tests, we anticipate that the genesis blocks on the ledger will be reset once the testing is complete.

Once the Alpha Network has achieved its goals, it will transition to the Provisional Network. The Sovrin Technical Governance Board (TGB) chose to operate the network in a provisional stage as a beta period where all transactions are real and permanent, but the network still operates under a limited load. This will enable the development team and Founding Stewards to do performance, load, and security testing against a live network before the Board of Trustees declares it generally available.

After many months of planning and work to get the network live, we're finally on our way. Congratulations and gratitude to the team at Evernym doing the heavy lifting, the Founding Stewards who are leading the way, and the many volunteers who sacrifice their time to build a new reality for online identity.


Photo Credit: Sunrise from dannymoore1973 (CC0 Public Domain)


Correlation Identifiers

Let's talk for a bit about correlation identifiers. Correlation IDs are used to identify the different parts of a computation when they happen concurrently. Picos are single threaded, so you never have to worry about this in a single evaluation cycle (response to one event) in a given pico. But as soon as there is more than one actor involved in a computation, you run the risk that things might not happen in order and parts of your computation will come back differently than you expect.

For example, in Fuse, creating a fleet report was initiated by an event to the fleet pico. This, in turn, caused the fleet pico to send events to the vehicle picos asking them to do part of the computation. The vehicle picos responded asynchronously and the fleet pico combined the results to create the completed report. If the initiating event is received twice in quick succession, the vehicle picos will get multiple requests. It's possible (due to network delays, for example) that the responses will come back out of order. This picture shows the situation:

Asynchronous responses can lead to out-of-order reports (click to enlarge)

The response pattern on the left is what we want, but the one on the right is one of many possible combinations that could occur in practice. The colors of the two events and the events they spawn (red or blue) are a pictorial correlation identifier. We can easily save the various responses and then combine them into the right report later if we know all the red responses go together and all the blue responses go together.

To make correlation IDs work in KRL, you wouldn't use colors, of course, but rather a random string for each initiating event. So long as all the participants pass this ID along as an event attribute in the various computations that occur, the fleet pico will be able to use the right pieces to create each report.
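Here's a minimal sketch of the pattern in Python rather than KRL (the function names and the send callback are illustrative, not part of any pico API): generate a random correlation ID for each initiating event, pass it along with every request, and file each response under that ID so the pieces of one report are never mixed with another's.

    import uuid
    from collections import defaultdict

    pending = defaultdict(dict)   # correlation_id -> {vehicle_id: partial_result}
    expected = {}                 # correlation_id -> set of vehicle_ids we asked

    def request_report(vehicles, send):
        """Start a report: tag every outgoing request with a fresh correlation ID."""
        correlation_id = str(uuid.uuid4())
        expected[correlation_id] = set(vehicles)
        for vehicle_id in vehicles:
            # send() stands in for raising an event to a vehicle pico
            send(vehicle_id, {"correlation_id": correlation_id, "type": "report_part"})
        return correlation_id

    def handle_response(response):
        """File a response by its correlation ID; assemble the report when all parts are in."""
        cid = response["correlation_id"]
        pending[cid][response["vehicle_id"]] = response["data"]
        if set(pending[cid]) == expected[cid]:
            report = dict(pending.pop(cid))
            expected.pop(cid)
            return report          # all red (or all blue) pieces, never a mix
        return None

In KRL the same idea amounts to passing the random string along as an event attribute, exactly as described above.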


Updated Pico Programming Workflow

Pico toolchain and programming workflow

I just got done updating the page in the Pico documentation that talks about the pico programming workflow. I use the idea of a toolchain as an organizing principle. I think it turned out well. If you program picos, it might be of some help.


Sovrin Web of Trust

The Web of Trust model for Sovrin is still being developed, but differs markedly from the trust model used by the Web.

The Web (specifically TLS/SSL) depends on a hierarchical certificate authority model called the Public Key Infrastructure (PKI) to determine which certificates can be trusted. To determine that the domain name of the site you're on is associated with the public key being used to encrypt HTTP transmissions (and maybe that both are controlled by a specific organization), your browser uses a certificate it downloads from the website itself. How then can this certificate be trusted? Because it was cryptographically signed by some other organization that issued the certificate and presumably checked the credentials of the company buying it for the domain.

But that raises the question "how can we trust the organization that issued the certificate?" By checking its certificate, and so on up the chain until we get to the root (this process is called “chain path validation”). The root certificates are distributed with browsers and we trust the browser manufacturers to only include trustworthy root certificates. These root certificates are called “trust anchors” because their trustworthiness is assumed, rather than derived.
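As a rough sketch of what chain path validation does (a simplified model in Python, not real X.509 handling; verify_signature stands in for the cryptographic check a browser performs):

    def chain_is_valid(chain, trust_anchors, verify_signature):
        """Simplified chain path validation. `chain` runs from the site's
        certificate up toward the root; `trust_anchors` are the roots shipped
        with the browser. Each link must be signed by the next one up, and
        the walk must end at a trust anchor. Just the shape of the idea,
        not real X.509."""
        for cert, issuer in zip(chain, chain[1:]):
            if not verify_signature(cert, issuer):   # issuer vouches for cert
                return False
        return chain[-1]["subject"] in trust_anchors  # trust is assumed at the root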

Webs of Trust

In contrast, a Web of Trust model uses webs of interlocking certificates, signed by multiple parties to establish the veracity of a certificate (see Web of Trust on Wikipedia for more detail). An entity can be part of many overlapping webs of trust because its certificate can be signed by many parties.

Sovrin uses a heterarchical, decentralized Web of Trust model, not the hierarchical PKI model, to establish the trustworthiness of certificates. The methodology for doing this is built into the Sovrin system, not layered on top of it, by combining a decentralized ledger, decentralized identifiers, and verifiable claims. While the Sovrin Foundation will establish some trust anchors[1] (usually the same Stewards who operate validator nodes on the Sovrin network), these are merely for bootstrapping the system and aren’t necessary for trusting the system once it is up and running. Sovrin uses this trust system to protect itself from DDoS attacks, among other things.

By allowing an identifier to be part of multiple webs of trust, Sovrin allows reputational context to be taken into account. For example, being a good plumber doesn’t guarantee that a person will be a good babysitter, but a person who was a good plumber and a trustworthy babysitter could be part of different webs of trust that take these two contexts into account. This makes the overall trust model much more flexible and adaptable to the circumstances within which it will be used. PKI is good for one thing on the Web: showing that the public key used to secure HTTP transmissions is correct. In contrast, Sovrin’s decentralized web of trust model is good for anything people need.

The goal of Sovrin is to provide the infrastructure upon which these overlapping webs of trust can be built for various applications. Lyft, Airbnb, and countless other sharing economy businesses are essentially specialized trust frameworks. Sovrin provides the means of creating similar trust frameworks without the need to build the trust infrastructure over and over.

Building Trust

Trust anchors are not the only way one could build trust in an identifier for a given purpose. I can think of the following ways that an identifier could come to be trusted:

  1. Personal knowledge — The identifier is personally known to you through interactions outside Sovrin. People close to me would be in this category. This category also includes identifiers like those for my employer or other entities who I interact with in the physical world and can thus establish a trust connection through means outside of Sovrin. If I know and trust the Web site of an entity, a well-known discovery scheme could let me know what identifiers they claim and consequently I’d gain trust in those identifiers. Such a system could piggyback on the PKI used for Web certificates.
  2. Verifiable claims — The identifier is verified by reliance on other trustworthy claims. This is analogous to how banks use KYC (Know Your Customer) processes to establish trust in a person opening an account. They check other documents that they can trust (like a driver's license or passport). They trust those documents by relying on trust established via the method described in (1).
  3. Web of trust — The identifier is introduced to me by someone I trust or who can be transitively associated with someone I trust. This category most closely follows the PGP web of trust model described in the Wikipedia article I referenced above. Various entities have signed the certificate associated with the identifier and I can trace those signatures back to other entities I trust.
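A toy sketch of case (3) in Python: trust an identifier if a path of signatures leads back to someone I already trust. Real webs of trust weight signatures and bound these paths more carefully; this just shows the transitive idea, and the signed_by map is an illustrative stand-in for the signatures on a certificate.

    from collections import deque

    def transitively_trusted(identifier, my_trusted, signed_by, max_hops=3):
        """Return True if `identifier` can be reached from something in
        `my_trusted` by following signatures. `signed_by[x]` is the set of
        identifiers that have signed x's certificate. A toy model of a
        PGP-style web of trust."""
        seen = {identifier}
        frontier = deque([(identifier, 0)])
        while frontier:
            current, hops = frontier.popleft()
            if current in my_trusted:
                return True
            if hops >= max_hops:
                continue
            for signer in signed_by.get(current, set()):
                if signer not in seen:
                    seen.add(signer)
                    frontier.append((signer, hops + 1))
        return False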

Trust anchors, as the term is used in the Sovrin documents, are entities that Sovrin or other trust anchors have designated as trustworthy sources. But they are not the only source of trust.

In conclusion, trust on Sovrin works much the same way as it does in the physical world. This has advantages and disadvantages. The chief advantage, as I’ve pointed out, is that Sovrin is flexible and adaptable to various circumstances.

The chief disadvantage is that it relies on people being responsible for determining whether or not to trust something. There are lots of clues that people can use. How well it ultimately works depends on the user experience and whether people understand what’s being asked of them. But there will be cases where people mistakenly trust things they shouldn’t. Of course that happens today, online and off, as well. I’m hopeful that a richer set of trust clues will lead to less fraud than we currently see. Whether that’s true or not depends on design decisions yet to be made. Fortunately those can be made with 15 years of experience from identity on the Web to guide us.


Endnotes:

  1. Because the term “trust anchor” is heavily associated with PKI, it might make sense to use another term to avoid confusion.

Photo Credit: Spiderweb with Frost from Yintan (CC BY 4.0)


Sovrin In-Depth Technical Review

The Sovrin Foundation and Engage Identity announced a new partnership today. Security experts from Engage Identity will be completing an in-depth technical review of the Sovrin Foundation’s entire security architecture.

The Sovrin Foundation cares deeply that the advanced technology relied on by everyone depending on Sovrin is secure. That technology protects many valuable assets, including private personal information and essential business data. As a result, we want to be fully aware of the risks and vulnerabilities in Sovrin. In addition, the Sovrin Foundation will benefit from having a roadmap for future security investment opportunities.

We're very happy to be working with Engage Identity, a leader in the security and identity industry. Established and emerging cryptographic identity protocols are among their many areas of expertise. They have extensive experience providing security analysis and recommendations for identity frameworks.

The Engage Identity team is led by Sarah Squire, who has worked on user-centric open standards for many organizations including NIST, Yubico, and the OpenID Foundation. Sarah will be joined by Adam Migus and Alan Viars, both experienced authorities in the fields of identity and security.

The final report will be released this summer and will include a review of the current security architecture, as well as opportunities for future investment. We intend to make the results public. Anticipated subjects of in-depth research are:

  • Resilience to denial of service attacks
  • Key management
  • Potential impacts of a Sovrin-governed namespace
  • Minimum technical requirements for framework participants
  • Ongoing risk management processes

Sovrin Foundation is excited to take this important step forward with Engage Identity to ensure that the future of self-sovereign identity management can thrive and grow.


Life-Like Anonymity

At VRM Day, prior to the Internet Identity Workshop last week, Joyce Searls commented that she wants the same kind of natural anonymity in her digital life that she has in real life.

In real life, we often interact with others—both people and institutions—with relative anonymity. For example, if I go to the store and buy a Coke with cash, there is no exchange of identity information necessary. Even if I use a credit card, it's rarely the case that the entire transaction happens under the administrative authority of the identity system inherent in the credit card. Only the financial part of the transaction takes place in that identity system. This is true of most interactions in real life.

In contrast, in the digital world, very few meaningful transactions are done outside of some administrative identity system. There are several reasons why identity is so important in the digital world:

  • Continuity—Because of the stateless nature of HTTP, building a working shopping cart, for example, requires using some kind of token for correlation of independent HTTP transactions. These tokens are popularly known as cookies. While they can be pseudonymous, they are often correlated across multiple independent sessions using an authenticated identifier. This allows, for example, the customer to have a shopping cart that persists across time on different devices (a minimal sketch of the mechanism follows this list).
  • Convenience—So long as the customer is authenticating, we might as well store additional information like addresses and credit card numbers for their convenience, to extend the shopping example. Storing these allows the customer to complete transactions without having to enter the same information over and over.
  • Trust—There are some actions that should only be taken by certain people, or people in certain roles, or with specific attributes. Once a shopping site has stored my credit card, for example, I ought to be the only one who can use it. Identity systems provide authentication mechanisms as the means of knowing who is at the other end of the wire so that we know what actions they're allowed to take. This places identifiers in context so they can be trusted.
  • Surveillance—Identity systems provide the means of tracking individuals across transactions for purposes of gathering data about them. This data gathering may be innocuous or nefarious, but there is no doubt that it is enabled by identity systems in use on the Internet.
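Here is a minimal sketch of the continuity mechanism in Python (the names are illustrative, not any real framework's API): a pseudonymous token correlates otherwise independent HTTP requests, and linking it to an authenticated identifier is what lets the cart follow the customer around.

    import secrets

    carts = {}        # token -> list of items; state the stateless protocol can't carry
    sessions = set()

    def get_or_create_token(request_cookies):
        """Reuse the shopping-cart token from the request, or mint a new one.
        On its own the token is pseudonymous; correlating it with an
        authenticated account is what makes the cart persist across devices."""
        token = request_cookies.get("cart")
        if token not in sessions:
            token = secrets.token_urlsafe(16)
            sessions.add(token)
            carts[token] = []
        return token

    def add_to_cart(token, item):
        carts[token].append(item)
        return carts[token]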

In real life, we do without identity systems for most things. You don't have to identify yourself to the movie theater to watch a movie or log into some system to sit in a restaurant and have a private conversation with friends. In real life, we act as embodied, independent agents. Our physical presence and the laws of physics have a lot to do with our ability to function with workable anonymity across many domains.

So, how did we get surveillance and its attendant effects on natural anonymity as an unintended, but oft-exploited feature of administrative digital identity systems? Precisely because they are administrative.

Legibility

Legibility is a term used to describe how administrative systems make things governable by simplifying, inventorying, and rationalizing things around them. Venkatesh Rao nicely summarized James C. Scott's seminal book on legibility and its unintended consequences: Seeing Like a State.

Identity systems make people legible in order to offer continuity, convenience, and trust. But that legibility also allows surveillance. In some respects, this is the trade-off we always get with administrative systems. By creating legibility, they threaten privacy.

Administrative systems are centralized. They are owned. They are run for the purposes of their owners, not the purposes of the people or things being administered. They are bureaucracies for governing something. They rely on rules, procedures, and formal interaction patterns. Need a new password? Be sure to follow the password rules of whatever administrative system you're in.

Every interaction you have online happens under the watchful eye of a bureaucracy built to govern the system and the people using it. The bureaucracy may be benevolent, benign, or malevolent but it controls the interaction.

Real Life is Decentralized

On the other hand, in real life we interact as peers. We do interact with administrative systems of various sorts, but no one would describe that as real life. When I go to a store, I don't think about shopping within their administrative system. Rather, I walk in, look at stuff, talk to people, put things in a cart, and check out. The administrative system is there, but it's for governing the store, not the customers.

We can't achieve Joyce's vision of anonymous online interactions until we redecentralize the Internet. The Internet started out decentralized. The early Web was decentralized. But the need for continuity, convenience, and trust led more and more interactions to happen within someone's administrative system.

Most online administrative systems make themselves as unobtrusive as they can. But there's no getting around the fact that every move we make is within a system that knows who we are and monitors what we're doing. In real life, I don't rely on the administrative system of the restaurant to identify the people I'm having dinner with. The restaurant doesn't need to check the IDs of me and the others in my party or surveil us in order to create an environment where we can talk.

The good news is that we're finally developing the tools necessary to create decentralized online experiences. What if you could interact with your friends on the basis of an identity that they bring to you directly—one that you could recognize and trust? You wouldn't need Facebook or Snapchat to identify and track your friends for you. You could do it yourself.

One of the reasons I jumped at the chance to help get Sovrin up and running is that I believe decentralized identity is the foundation for a decentralized Web—a Web that flexibly supports the kind of ad hoc interactions people have with each other all the time in real life. We'll never get an online world that mirrors real life and its natural anonymity until we do.


Photo Credit: Network from Pezibear (CC0 Public Domain)


Hyperledger Welcomes Project Indy

We’re excited to announce Indy, a new Hyperledger project for supporting independent identity on distributed ledgers. Indy provides tools, libraries, and reusable components for providing digital identities rooted on blockchains or other distributed ledgers so that they are interoperable across administrative domains, applications, and any other silo.

Why Indy?

Internet identity is broken. There are too many anti-patterns and too many privacy breaches. Too many legitimate business cases are poorly served by current solutions. Many have proposed distributed ledger technology as a solution; however, building decentralized identity on top of distributed ledgers that were designed to support something else (cryptocurrency or smart contracts, for example) leads to compromises and shortcuts. Indy provides Hyperledger projects and other distributed ledger systems with a first-class decentralized identity system.

Indy’s Features

The most important feature of a decentralized identity system is trust. As I wrote in A Universal Trust Framework, Indy provides accessible provenance for trust to support user-controlled exchange of verifiable claims about an identifier. Indy also has a rock-solid revocation model for cases where those claims are no longer true. Verifiable claims are a key component of Indy’s ability to serve as a universal platform for exchanging trustworthy claims about transactions. Provenance is the foundation of accountability through recourse.

Another vital feature of decentralized identity—especially for a public ledger—is privacy. Privacy by Design is baked deep into Indy architecture as reflected by three fundamental features:

  • First, identifiers on Indy are pairwise unique and pseudonymous by default to prevent correlation. Indy is the first DLT to be designed around Decentralized Identifiers (DIDs) as the primary keys on the ledger. DIDs are a new type of digital identifier that were invented to enable long-term digital identities that don’t require centralized registry services. DIDs on the ledger point to DID Descriptor Objects (DDOs), signed JSON objects that can contain public keys and service endpoints for a given identifier. DIDs are a critical component of Indy’s pairwise identifier architecture (a rough sketch follows this list).
  • Second, personal data is never written to the ledger. Rather all private data is exchanged over peer-to-peer encrypted connections between off-ledger agents. The ledger is only used for anchoring rather than publishing encrypted data.
  • Third, Indy has built-in support for zero-knowledge proofs (ZKP) to avoid unnecessary disclosure of identity attributes—privacy-preserving technology that has long been pursued by IBM Research (Idemix) and Microsoft (U-Prove), but which a public ledger for decentralized identity now makes possible at scale.
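As a rough illustration of the first point, here is the general shape of a pairwise identifier and its DDO in Python. The did:sov prefix and field names follow the broad outline of the DID work; the exact formats are defined by the specs and the Indy code, not by this sketch, and the random placeholders stand in for real keypairs.

    import base64
    import os

    def new_pairwise_did(relationship, endpoint):
        """Mint a distinct, pseudonymous DID for one relationship so that two
        relying parties can't correlate the identifiers they each see.
        Illustrative shape only; real DIDs and DDOs follow the DID spec."""
        did = "did:sov:" + base64.urlsafe_b64encode(os.urandom(16)).decode().rstrip("=")
        public_key = base64.urlsafe_b64encode(os.urandom(32)).decode()  # placeholder key
        ddo = {
            "id": did,
            "publicKey": [{"id": did + "#keys-1", "publicKeyBase64": public_key}],
            "service": [{"type": "agent", "serviceEndpoint": endpoint}],
        }
        return relationship, did, ddo   # a different (did, ddo) per relationship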

Indy is all about giving identity owners independent control of their personal data and relationships. Indy is built so that the owner of the identity is structurally part of transactions made about that identity. Pairwise identifiers stop third parties from talking behind the identity owner’s back, since the identity owner is the only place pairwise identifiers can be correlated.

Sovereign Identity (click to enlarge)

Indy is based on open standards so that it can interoperate with other distributed ledgers. These start, of course, with public-key cryptography standards. Other important standards cover things like the format of Decentralized Identifiers, what they point to, and how agents exchange verifiable claims. Indy also supports a system of attribute and claim schemas that are written to the ledger for dynamic discovery of previously unseen claim types. Relying parties can make their own entitlement decisions based on schemas with publicly known identifiers.
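A rough sketch of how a relying party might use a publicly known schema to make an entitlement decision (Python, illustrative names only; this is not the Indy API):

    # A schema with a publicly known identifier, as it might be anchored on the ledger.
    AGE_SCHEMA = {"id": "schema:example:age-over-18:1.0", "attrs": ["over_18"]}

    ledger_schemas = {AGE_SCHEMA["id"]: AGE_SCHEMA}   # stand-in for a ledger lookup

    def accept_claim(claim, required_schema_id, verify_issuer_signature):
        """A relying party's entitlement check: the claim must reference the
        schema the verifier expects, carry the attributes that schema defines,
        and be verifiably signed by its issuer. Sketch only."""
        schema = ledger_schemas.get(claim["schema_id"])
        if schema is None or claim["schema_id"] != required_schema_id:
            return False
        if not all(attr in claim["attrs"] for attr in schema["attrs"]):
            return False
        return verify_issuer_signature(claim)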

The result is a new way of doing systems integration on the Internet that is much less costly while also being more trustworthy. As I wrote in When People Can Share Verifiable Attributes, Everything Changes, owner-provided attributes are a powerful driver that will push decentralized identity systems well beyond the current uses of federation and social login. Organizations can reduce, or even eliminate, costly manual verification processes and API integrations, and instead trust the identity claims presented to them, precisely because these claims can be verified. People and organizations become the source of what's true about them.

Indy Shares the Internet’s Virtues

As I wrote in An Internet for Identity, Indy shares three important virtues with the Internet:

  1. No one owns it.
  2. Everyone can use it.
  3. Anyone can improve it.

Launching Indy as a Hyperledger Project is a critical component of allowing anyone to improve how Indy works.

These virtues are supported by Indy’s permissioned-validation ledger model and open-source code base. This has important consequences for scale and cost. But unlike other permissioned ledgers like R3’s Corda, CULedger, or SecureKey, Indy is designed for global public access. Even though Indy is permissioned, anyone can access Indy’s features.

Validation is performed by a set of validator nodes running a modified, redundant Byzantine fault tolerant protocol called Plenum that is part of the Indy project. With Plenum, network nodes come to collective agreement about the validity and order of events.
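For a sense of what Byzantine fault tolerance buys, the classic bound says a pool of n validators can tolerate at most f faulty or malicious nodes where n is at least 3f + 1. This is the general result, not a description of Plenum's specific protocol:

    def max_faulty_validators(n):
        """Classic BFT bound: n validators tolerate at most f faults
        where n >= 3f + 1. General result, not Plenum-specific."""
        return (n - 1) // 3

    # A pool of 10 validator nodes tolerates 3 faulty nodes,
    # and at least 4 nodes are needed to tolerate a single fault.
    assert max_faulty_validators(10) == 3
    assert max_faulty_validators(4) == 1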

This diagram shows how these different dimensions of validation and access play across different distributed ledger systems.

Validation and Access (click to enlarge)

What is the Relationship of Indy and Sovrin?

Sovrin Foundation is contributing the Indy code base to the Hyperledger Project. Established in September 2016, the Sovrin Foundation is an international non-profit foundation created to govern a global public utility for decentralized identity. The trustees of the Sovrin Foundation believe the public, permissioned quadrant above is the only one that can achieve both high trust and global adoption for a decentralized identity system.

The Sovrin Foundation developed the Sovrin Trust Framework to govern how trusted institutions, called stewards, will operate validator nodes of the Sovrin Network. All stewards will run an instance of Project Indy. However, the Sovrin Network is only one network designed to run Indy; any number of other networks may be created to run their own instances.

How Will Hyperledger Enhance Indy?

The Indy code base, originally developed by Evernym, was donated to the Sovrin Foundation to establish a strong open source foundation for the Sovrin Network. The Sovrin Foundation has been building a global community of developers who are passionate about independent identity and the economic and social benefits it brings to both individuals and enterprises.

With the contribution of Indy to Hyperledger, we take the next step in that process, opening up Project Indy to the entire Hyperledger family of developers. Our hope is to attract even more developers who want to unleash the transformative power of digital identity that is decentralized, self-sovereign, and independent of any silo. We would also like to explore direct synergies with the identity management goals and requirements of the other Hyperledger projects.

Learn More about Indy

To learn more or contribute to the Indy project:


Pico Programming Lesson: Modules and External APIs

Apollo Modules

I recently added a new lesson to the Pico Programming Lessons on Modules and External APIs. KRL (the pico programming language) has parameterized modules that are great for incorporating external APIs into a pico-based system.

This lesson shows how to define actions that wrap API requests, put them in a module that can be used from other rulesets, and manage the API keys. The example (code here) uses the Twilio API to create a send_sms() action. Of course, you can also use functions to wrap API requests where appropriate (see the Actions and Functions section of the lesson for more detail on this).

KRL includes a key pragma in the meta block for declaring keys. The recommended way to use it is to create a module just to hold keys. This has several advantages:

  • The API module (Twilio in this case) can be distributed and used without worrying about key exposure.
  • The API module can be used with different keys depending on who is using it and for what.
  • The keys module can be customized for a given purpose. A given system will likely need keys for the multiple modules it uses.
  • The pico engine can manage keys internally so the programmer doesn't have to worry (as much) about key security.
  • The key module can be loaded from a file or password-protected URL to avoid key loss.

The combination of built-in key management and parameterized modules is a powerful abstraction that makes it easy to build easy-to-use KRL SDKs for APIs.
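The lesson's code is in KRL; as a language-neutral sketch of the same separation, here is the pattern in Python. The module and variable names are illustrative, the endpoint follows Twilio's documented Messages API, and the point is simply that keys live in their own module while the API wrapper takes them as parameters.

    # twilio_keys.py -- a module that exists only to hold keys (placeholder values).
    # In KRL the same separation is done with the key pragma in a keys ruleset.
    ACCOUNT_SID = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    AUTH_TOKEN = "your_auth_token_here"

    # twilio_module.py -- wraps the API; takes its keys as parameters so the
    # module itself can be shared without exposing anyone's credentials.
    import requests

    def send_sms(account_sid, auth_token, to, from_, body):
        """Wrap Twilio's Messages endpoint as a single action."""
        url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
        resp = requests.post(url,
                             data={"To": to, "From": from_, "Body": body},
                             auth=(account_sid, auth_token))
        resp.raise_for_status()
        return resp.json()

    # caller.py -- the only place the keys and the module come together:
    # import twilio_keys, twilio_module
    # twilio_module.send_sms(twilio_keys.ACCOUNT_SID, twilio_keys.AUTH_TOKEN,
    #                        "+15551234567", "+15557654321", "Hello from the lesson")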

Going Further

The pico lessons have all been recently updated to use the new pico engine. If you're interested in learning about reactive programming and the actor model with picos, walk through the Quickstart and then dive into the lessons.


Photo Credit: Blue Marble Geometry from Eric Hartwell (CC BY-NC-SA 3.0)


The New Pico Engine Is Ready for Use


A little over a year ago, I announced that I was starting a project to rebuild the pico engine. My goal was to improve performance, make it easier to install, and support small deployments, while retaining the best features of picos, specifically being Internet-first.

Over the past year we've met that goal and I'm quite excited about where we're at. Matthew Wright and Bruce Conrad have reimplemented the pico engine in NodeJS. The new engine is easy to install and quite a bit faster than the old engine. We've already got most of the important features of picos. My students have redone large portions of our supporting code to run on the new engine. As a result, the new engine is sufficiently advanced that I'm declaring it ready for use.

We've updated the Quickstart and Pico Programming Lessons to use the new engine. I'm also adding new lessons to help programmers understand the most important features of Picos and KRL.

My Large-Scale Distributed Systems class (CS462) is using the new pico engine for their reactive programming assignments this semester. I've got 40 students going through the pico programming lessons as well as reactive programming labs from the course. The new engine is holding up well. I'm planning to expand its use in the course this spring.

Adam Burdett has redone the closet demo we showed at OpenWest last summer using the new engine running on a Raspberry Pi. One of the things I didn't like about using the classic pico engine in this scenario was that it made the solution overly reliant on a cloud-based system (the pico engine) and consequently was not robust under network partitioning. If the goal is to keep my machines cool, I don't want them overheating because my network was down. Now the closet controller can run locally with minimal reliance on the wider Internet.

Bruce was able to use the new engine on a proof of concept for BYU's priority registration. This was a demonstration of the engine's ability to scale and handle thousands of picos. The engine, running on a laptop, was able to handle 44,504 add/drop events in over 8000 separate picos in 35 minutes and 19 seconds. The throughput was 21 registration events per second, or 47.6 milliseconds per request.
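The arithmetic behind those numbers, for reference:

    events = 44504
    seconds = 35 * 60 + 19            # 35 minutes and 19 seconds
    print(events / seconds)           # ~21.0 registration events per second
    print(1000 * seconds / events)    # ~47.6 milliseconds per request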

We've had several lunch and learn sessions with developers inside and outside BYU to introduce the new engine and get feedback. I'm quite pleased with the reception and interactions we've had. I'm looking to expand those now that the lessons are completed and we have had several dozen people work through them. If you're interested in attending one, let me know.

Up Next

I've hired two new students, Joshua Olson and Connor Grimm, to join Adam Burdett and Nick Angell in my lab. We're planning to spend the summer getting Manifold, our pico-based Internet of Things platform, running on the new engine. This will provide a good opportunity to improve the new pico engine and give us a good IoT system for future experiments, supporting our idea around social things.

I'm also contemplating a course on reactive programming with picos on Udemy or something like it. This would be much more focused on practical aspects of reactive programming than my BYU distributed system class. Picos are a great way to do reactive programming because they implement an actor model. That's one reason they work so well for the Internet of Things.

Going Further

If you'd like to explore the pico engine and reactive programming with picos, you can start with the Quickstart and move on to the pico programming lessons.

We'd also love help with the open source implementation of the pico engine. The code is on GitHub and there's a well-maintained set of issues that need to be worked. Bruce is the coordinator of these efforts.

Any questions on picos or using them can be directed to the Pico Labs forum and there's a pretty good set of documentation.


Photo Credit: The mountains, the lake and the cloud from CameliaTWU (CC BY-NC-ND 2.0)


What Use is a Master of Science in CS?


Recently a friend of mine, Eric Wadsworth, remarked in a Facebook post:

My perspective, in my field (admittedly limited to the tech industry) has shifted. I regularly interview candidates who are applying for engineering positions at my company. Some of them have advanced degrees, and to some of those I give favorable marks. But the degree is not really a big deal, doesn't make much difference. The real question is, "Can this person do the work?" Having more years of training doesn't really seem to help. Tech moves so fast, maybe it is already stale when they graduate. Time would be better spent getting actual experience building real software systems.

I read this, and the comments that followed, with a degree of interest because most of them reflect a gross misunderstanding of what an advanced degree indicates. The assumption appears to be that people who get a BS in Computer Science are learning to program and therefore getting an MS means you're learning more about how to program. I understand why this can be confusing. We don't often hire a plumber when we need a mechanical engineer, but Computer Science and programming are still relatively young and we're still working out exactly what the differences are.

The truth is that CS programs are not really designed to teach people to code, except as a necessary means to learning computer science, which is not merely programming. That's doubly true of a master's degree. There are no courses in a master's program that are specifically designed to teach anyone to program anything. You can learn to code at a thousand websites. A CS degree includes topics like computational theory and practice, algorithms, database design, operating system design, networking, security, and many others, all presented in a way designed to create well-rounded professionals. The ACM Curriculum Guidelines (PDF) are a good place to see some of the detail in a program-independent way.

Most of what one learns in a Computer Science program has a long shelf life—by design. For example, I design the modules in my Large Scale Distributed Programming class to teach principles that have been important for 30 years and are likely to be important for 30 more. Preventing Byzantine failure, for example, has recently become the latest fad with the emergence of distributed ledgers. I learned about it in 1988. If your interview questions are asking people what they know about the latest JavaScript framework, you're unlikely to distinguish the person with a CS degree from someone who just completed a coding bootcamp.

What does one learn getting an advanced degree in Computer Science? People who've completed a master's degree have proven their ability to find the answers to large, complex, open-ended problems. This effort usually lasts at least six months and is largely self-directed. They have shown that they can explore the scientific literature to find answers and synthesize that information into a unique solution. This is very different from, say, looking at Stack Overflow to find the right configuration parameters for a new framework. Often, their work is only part of some larger problem being explored by a team of fellow students under the direction of someone recognized as an expert in a specific field in Computer Science.

If these kinds of skills aren't important to your project, then you're wasting your time and money hiring someone with an advanced degree. As Eric points out, the holder of an MS won't necessarily be a better programmer, especially as measured by your interview questions and tests. And if they are important, you're unlikely to uncover a candidate's abilities in an interview. Luckily, someone else has spent a lot of time and money certifying that the person sitting in front of you has them. All free to you. That's what the letters MSCS on their resume mean.

Obviously, every position comes with an immediate need. Sometimes that can be filled by a candidate with good programming skills and a narrow education. Sometimes you want something more. But don't hire poorly because you misunderstand the credentials you're evaluating.


Photo Credit: Graduation from greymatters (CC0 Public Domain)