Decentralization and Distributed Ledgers


Last week, I referenced an article in American Banker on the responsibilities of blockchain developers. I focused mainly on the governance angle, but the article makes several pokes at the "decentralization charade" and that's been bothering me. The basic point is that (a) there's no such thing as a blockchain without governance (whether ad hoc or deliberate) and (b) governance means that the ledger isn't truly decentralized.

In Re-imagining Decentralized and Distributed, I make the distinction between distributed and decentralized by stating that decentralized systems are composed of pieces that are not under the control of any single entity. By that definition, DNS, for example, is a pretty good example of a decentralized service since it's composed of servers run by millions of separate organizations around the world, cooperating to map names to IP numbers. There are others including email, the Web, and the Internet itself.

But DNS is clearly subject to some level of governance. The protocol is determined by a standards body. Most of the DNS servers in the world are running an open-source DNS server called BIND that is managed by the Internet Systems Consortium. Domain names themselves are governed by rules put in place by ICANN. There is a group of people who control, for better or worse, what DNS is and how it works.

So, is DNS decentralized? I maintain that DNS is decentralized, despite a relatively small set of people who, together, govern it. Here's why:

First, we have to recognize that decentralization is a continuum, not a binary proposition. Could we imagine a system for mapping names into IP numbers that is more decentralized? Probably. Could we imagine one less decentralized? Most certainly. And given how DNS is governed, there are a multitude of entities who have to agree to make significant changes to the overall operation of the DNS system.

Second, and more important, the governance of the DNS system is open. Structurally, it's difficult for those who govern DNS to make any large-scale change without everyone knowing about it and, if they choose, objecting.

Third, the kinds of decisions that can be made by the governance bodies are limited, in practice, by the structure of the system, the standards, and the traditions of practice that have grown up around it. For example, there is a well-defined process for handling domain name disputes. Not everyone will be happy with it, but at least it exists and is understood. Dispute resolution, as one example, is not ad hoc, arbitrary, or secret.

Lastly, the DNS system may be governed by a relatively small set of people and organizations, but it's run by literally millions. People running DNS servers have a choice about what server software they run. If enough of them decided to freeze at a particular place because they objected to changes or to fork the code, they could effectively derail an unpopular decision.

Distributed ledgers will have varying levels of decentralization depending on their purpose and their governance model and how that model is made operational. The standard by which they should be judged is not "does any human ever make a decision affecting the ledger" but rather:

  1. Is the ledger as decentralized as we can make it while achieving the ends for which the ledger was created?
  2. Is the governance process open? Who can participate? How are the governing entities chosen?
  3. How light is the governance? Are the kinds of decisions the governing bodies can make limited by declared process?
  4. Is the operation of the system dependent on the voluntary participation of entities outside the governing bodies?

Distributed ledgers are young and the methods and modes of governance, along with those entities participating in their governance, are in flux. There are many decisions yet to be made. What's more, there's not one distributed ledger, but many. We're still experimenting with what will work and what won't.

While a perfectly decentralized system may be beyond our reach and even undesirable for many reasons, we can certainly do better than the centralized systems that have grown up on the Web to date. Might we come up with even more decentralized systems in the future? Yes. But that shouldn't stop us from creating the most decentralized systems we can now. And for now, we've seen that governance is necessary. Let's keep it light and open and move forward.

Governance for Distributed Ledgers


This article by Angela Walch from American Banker makes the (excessively snarky) case that distributed ledger developers and miners ought to be held accountable as fiduciaries.

Non-permissioned distributed ledgers like Ethereum will continue to serve important needs, but organizations like banks, insurance companies, credit unions, and others who act as fiduciaries and must meet regulatory requirements will prefer permissioned ledgers that can provide explicit governance. See Properties of Permissioned and Permissionless Blockchains for more on this.

Governance models for permissioned ledgers should strike a careful balance between what’s in the code and what’s decided by humans. Having everything in code isn’t necessarily the answer. But having humans too heavily involved can open the system up to interference and meddling—both internal and external.

Permissioned ledgers also need to be very clear about what the procedures are for adjudicating problems with the ledger. They can’t be seen as ad hoc or off the cuff. We must have clear dispute resolution procedures and know what disputes the governance system will handle and those it won't.

Governance in permissioned distributed ledgers provides a real solution to some of the ad hoc machinations that have occurred recently with non-permissioned blockchains.

Service Integration Via a Distributed Ledger


Consider a distributed ledger that provides people (among other principals) with an identity and a place to read and write, securely and privately, various claims. As a distributed ledger, it's not controlled by any single organization and is radically decentralized and distributed.

In the following diagram, the Department of Motor Vehicles has written a driver's license record on the distributed ledger. Later, John is asked to prove his age at Walmart. John is involved in permissioning both the writing and reading of the record. Further, the record is written so that John doesn't have to disclose the entire driver's license, just the fact that he's over 18.

A Distributed-Ledger Integration

Walmart and the DMV are interacting despite the lack of explicit integration of their systems. They are interacting via a distributed ledger that provides secure and private claim presentment. Further, John (the person they're talking about) is structurally part of the conversation. I call this sovereign-source integration since it's based on sovereign-source identity.

Even if there were 20 different distributed ledger systems that Walmart had to integrate with, that's still less work than integrating with every DMV. And they can now write receipts when you shop or read transcripts when you apply for a job, all with your permission, of course.

Security and privacy are ensured by the proper application of cryptography, including public-private key pairs, digital signatures, and cryptographic hashes. This isn't easy, but it's doable. There's nothing about the scenario I'm painting that is waiting on some technology revolution. Everything we need is available now.
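
To make the "over 18 without showing the whole license" idea concrete, here is a minimal sketch using salted hash commitments. This is only one possible approach and not the scheme of any particular ledger. The function names, the attribute names, and the license number are hypothetical, and the issuer's digital signature over the commitments is deliberately left out.

```python
from __future__ import annotations

import hashlib
import secrets


def commit(name: str, value: str, salt: bytes) -> str:
    """Hash one attribute together with a random salt."""
    return hashlib.sha256(salt + f"{name}={value}".encode()).hexdigest()


def issue_record(attributes: dict[str, str]) -> tuple[dict[str, str], dict[str, bytes]]:
    """Issuer (the DMV here): commitments go on the ledger, salts go to the holder.

    A real issuer would also sign the commitments; that step is omitted.
    """
    salts = {name: secrets.token_bytes(16) for name in attributes}
    on_ledger = {
        name: commit(name, value, salts[name]) for name, value in attributes.items()
    }
    return on_ledger, salts


def verify_disclosure(
    on_ledger: dict[str, str], name: str, value: str, salt: bytes
) -> bool:
    """Verifier (Walmart here): check a single disclosed attribute."""
    expected = on_ledger.get(name)
    if expected is None:
        return False
    return secrets.compare_digest(expected, commit(name, value, salt))


# Hypothetical license: only "over_18" is ever revealed to the verifier.
license_attrs = {"name": "John", "license_no": "HYPOTHETICAL-123", "over_18": "true"}
ledger_record, johns_salts = issue_record(license_attrs)
over_18_ok = verify_disclosure(
    ledger_record, "over_18", "true", johns_salts["over_18"]
)
```

John discloses one value and its salt. The other attributes stay hidden behind their hashes, which is the sense in which he is part of permissioning the read.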

I wrote a post a few weeks ago about how sovereign-source integration helps solve the problems of building a virtual university. In that article, the student profile (including an LRS) is the distributed, personally controlled integration point. The information in the student profile might all be written as claims on a distributed ledger, but it could also be in some off-ledger system that the distributed ledger just points to. Either way, once the student has provided the various institutions participating in the virtual university with their integration point, the various university systems are able to work together through the integration point instead of needing point-to-point integrations.

The Virtual University

The world is too big and vast to imagine that we can scale point-to-point integrations to cover every imaginable use case. The opportunities for this architecture in finance, healthcare, egovernment, education, and other areas of human interaction boggle the mind. Sovereign-source integration is a way to cut the Gordian knot.

Pico Labs at Open West


The students in my lab at BYU are running a booth at OpenWest this year. OpenWest is one of the great open source conferences in the US. There are 1400 people here this year. When the call for papers came out this year, I missed the deadline. Not to worry, I decided to sponsor a booth. That way my students can speak for three days instead of an hour. Here's what they're demoing at OpenWest this week.

A while back, I wrote a blog post about my work with the ESProto sensors from Wovyn. Johannes Ernst responded with an idea he'd had for a little control project in his house. He has a closet with computers in it that sometimes gets too hot. He wanted to automatically control some fans and turn them on when the closet was too hot. I asked my students—Adam Burdett, Jesse Howell, and Nick Angell—to mock up that situation in an old equipment box.

Physically, the box has two pancake fans on the top, a light bulb as a heat source, an ESProto temperature sensor inside the box, and one outside the box. There's a Raspberry Pi that controls the light and fans. The RPi presents an API.

We could just write a little script on the RPi that reads the temperatures and turns fans on or off. But that wouldn't be much fun. And it wouldn't give us an excuse to work on our vision for using picos to create communities of things that cooperate. Granted, this example is small, but we've got to start somewhere.
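
For comparison, the "little script" version might look something like the sketch below. It is not code from the demo. The function name and the threshold values are assumptions, and reading the sensors and switching the fans through the RPi's API are left out.

```python
def fans_should_run(inside_c, outside_c, fans_on, on_delta=5.0, off_delta=2.0):
    """Decide the fan state from the inside/outside temperature difference.

    on_delta and off_delta are made-up thresholds in degrees Celsius. The gap
    between them keeps the fans from toggling rapidly near a single threshold.
    """
    delta = inside_c - outside_c
    if delta >= on_delta:
        return True       # closet is much hotter than the room: run the fans
    if delta <= off_delta:
        return False      # closet has cooled down: stop the fans
    return fans_on        # in between: keep doing whatever we were doing
```

Everything here is hard-wired: two temperatures in, one on/off decision out. That is exactly the coupling the pico design is meant to avoid.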

The overall design uses picos to represent spimes for the physical devices: two fans and two temperature sensors. There is also a pico to represent the community of fans and one to represent the closet, the overall community to which all of these belong. The following diagram illustrates these relationships.

Pico Structure for the Closet Demo

The Fan Collection is an important part of the overall design because it abstracts and encapsulates the individual fans so that the closet can just indicate it wants more or less airflow without knowing the details of how many fans there are, how fans are controlled, whether they're single or variable speed, and so on. The Fan Collection manages those details.

That's not to say that the Fan Collection knows the details of the fans themselves. Those details are abstracted by the Fan picos. The Fan picos present a fairly straightforward representation of the fan and its capabilities.
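
The layering can be sketched in plain Python. Picos are not Python objects, and the real demo uses rulesets and events, so treat the class and method names below as hypothetical stand-ins that only show who knows what.

```python
class Fan:
    """Stands in for a Fan pico: knows one fan and its capabilities."""

    def __init__(self, name, max_level=1):
        self.name = name
        self.max_level = max_level  # 1 means a single-speed fan
        self.level = 0

    def set_level(self, level):
        # Clamp to what this particular fan can actually do.
        self.level = max(0, min(self.max_level, level))


class FanCollection:
    """Stands in for the Fan Collection pico: the closet only talks to this."""

    def __init__(self, fans):
        self.fans = list(fans)

    def capacity(self):
        return sum(fan.max_level for fan in self.fans)

    def airflow(self):
        return sum(fan.level for fan in self.fans)

    def set_airflow(self, target):
        # Spread the requested airflow across fans, filling each in turn.
        remaining = max(0, min(self.capacity(), target))
        for fan in self.fans:
            step = min(fan.max_level, remaining)
            fan.set_level(step)
            remaining -= step

    def more_airflow(self):
        self.set_airflow(self.airflow() + 1)

    def less_airflow(self):
        self.set_airflow(self.airflow() - 1)
```

The closet would only ever call more_airflow or less_airflow. How many fans exist, and whether they are single or variable speed, stays inside the collection and the fans.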

This demo provides us with a project to use Wrangler. Wrangler is the pico operating system that Pico Labs has been working on for the last year. Wrangler is a follow-on to CloudOS, a pico control system that we built at Kynetx and that was the code underlying Fuse, the connected-car platform we built. Wrangler improves on CloudOS by taking its core concepts and extending and normalizing them.

The primary purpose of Wrangler is pico life cycle management. While the pico engine provides methods for creating and destroying picos, installing rulesets, and creating channels, those operations are low-level—using them is a lot of work.

As an example of how Wrangler improves on the low-level functions in the pico engine, consider pico creation. Creating a useful child pico involves the following steps:

  1. create the child
  2. name the child
  3. install rulesets in the child
  4. initialize the child
  5. link the child to other picos using subscriptions

Wrangler uses the concept of prototypes to automate most of this work. For example, a developer can define a prototype for a temperature sensor pico. Then using Wrangler, temperature sensor picos, with the correct configuration, can be created with a single action. This not only reduces the code a developer has to write, but also reduces configuration errors.
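
As a rough illustration of what a prototype buys you, the sketch below folds the five steps into one call. It is ordinary Python, not Wrangler or KRL. The Pico and Prototype classes, the create_child function, and the ruleset name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pico:
    name: str = ""
    rulesets: list = field(default_factory=list)
    state: dict = field(default_factory=dict)
    subscriptions: list = field(default_factory=list)
    children: list = field(default_factory=list)


@dataclass
class Prototype:
    """Everything a new pico of some kind needs, declared once."""
    rulesets: tuple = ()
    initial_state: dict = field(default_factory=dict)
    subscribe_to: tuple = ()


def create_child(parent, name, prototype):
    child = Pico()                                  # 1. create the child
    child.name = name                               # 2. name the child
    child.rulesets.extend(prototype.rulesets)       # 3. install rulesets
    child.state.update(prototype.initial_state)     # 4. initialize (own copy)
    child.subscriptions.extend(prototype.subscribe_to)  # 5. link to other picos
    parent.children.append(child)
    return child


# Hypothetical prototype for a temperature sensor pico.
temperature_sensor = Prototype(
    rulesets=("hypothetical.temperature",),
    initial_state={"threshold_c": 30},
    subscribe_to=("closet",),
)
closet = Pico(name="closet")
inside = create_child(closet, "inside-sensor", temperature_sensor)
outside = create_child(closet, "outside-sensor", temperature_sensor)
```

Both sensors come out configured the same way, so there is no chance of forgetting a ruleset or a subscription on one of them.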

The great thing about going to a conference—as a speaker or an exhibitor—is that it gives you a deadline for things you're working on. OpenWest provided just such an excuse for us. The demo drove thinking and implementation. If you're at OpenWest this week, stop by and see what we've done and ask some questions.

A System Ruleset for the Pico Engine


I have a problem: a long time ago, Kynetx built a ruleset management tool called AppBuilder. There are some important rulesets in AppBuilder. I'd like to shut down AppBuilder, but first I need to migrate all the important rulesets to the current ruleset registry. There's just one tiny thing standing in my way: I don't know which rulesets are the important ones.

Sure, I could guess and get most of them. Then I'd just wait for things to break to discover the rest. But that's inelegant.

My first thought was to write some code to instrument the pico engine. I'd increment a counter each time it loads a ruleset. That way I see what's being used. No guessing. I'd need some way to get stuff in the database and get it out.

But then I had a better idea. Why not write instrumentation data into the persistent variable space of a system ruleset? The system ruleset can access and modify any of these variables. And it's flexible. Rather than making changes to the engine and rolling to production each time I change the monitoring, I update the system ruleset.

Right now, there's just one variable: rid_usage. The current system ruleset is simple. But it's a start. All the pieces are in place now to use this connection for monitoring, controlling, and configuring the pico engine.
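
The bookkeeping behind rid_usage amounts to a counter keyed by ruleset ID. The sketch below is plain Python rather than KRL and does not reflect how the engine is actually wired. Apart from rid_usage itself, the class, its methods, and the ruleset IDs are hypothetical.

```python
from collections import Counter


class SystemRuleset:
    """Stand-in for the system ruleset's persistent variable space."""

    def __init__(self):
        self.rid_usage = Counter()

    def ruleset_loaded(self, rid):
        # The engine would report each ruleset load here.
        self.rid_usage[rid] += 1

    def unused(self, candidate_rids):
        # Rulesets that were never loaded are candidates to leave behind.
        return sorted(set(candidate_rids) - set(self.rid_usage))


system = SystemRuleset()
for rid in ["a16x88", "b16x10", "a16x88"]:   # made-up ruleset IDs
    system.ruleset_loaded(rid)
never_loaded = system.unused(["a16x88", "b16x10", "c16x01"])
```

After enough traffic, anything that still shows up as never loaded is a ruleset I probably don't need to migrate.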

I like this idea a lot because KRL is being used to implement important services on the platform that implements KRL. Very meta... And when systems start to be defined in their own language, that's a good thing.

Failure and the Internet of Things


I'm now on my second Internet-connected sprinkler controller. The first, a Lono, worked well enough although there were some features missing. Last week, I noticed that the program wasn't running certain zones. I wasn't sure what to do and I couldn't find help from Lono, so I decided I'd try a second one. My second purchase, based on both friends' recommendations and reviews on Amazon, was a Rachio. I installed it on Saturday.

As I was working on setting up the programs and experimenting with them I noticed that the new sprinkler controller had stopped working. When I went to check on it, I discovered that it was completely dead: no lights, no response.

I rebooted the controller and started over. It got to the same point and the sprinkler controller died again. A little thought showed that the Rachio sprinkler controller was dying at exactly the same point that the Lono was failing to complete its program. The problem? A short in one of the circuits.

The Lono and the Rachio both fail at handling failure. The old controller, an Irritrol, just dealt with it and kept right on going. None of them, including the Irritrol, did a good job of telling me that I had a short-circuit.

Building sprinkler controllers is a tough job. The environment is dirty and wet. The valves and sensors are numerous and varied. I don't know about you, but it's a rare year I don't replace valve solenoids or rewire something. A sprinkler controller has to roll with this environment to pass muster. To be excellent, it has to help with debugging and solving the problems.
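
In code terms, "rolling with it" might look like the sketch below: a fault in one zone is recorded and reported, and the rest of the program still runs. This is not how any of these controllers actually work. The ShortCircuit exception, the run_program function, and the idea that the hardware layer can detect a short are all assumptions.

```python
class ShortCircuit(Exception):
    """Raised by the (hypothetical) hardware layer when a zone is shorted."""


def run_program(zones, run_zone):
    """Water each zone in order, surviving faults and reporting them."""
    report = {"completed": [], "faults": {}}
    for zone in zones:
        try:
            run_zone(zone)
            report["completed"].append(zone)
        except ShortCircuit as err:
            # Don't die and don't silently skip: remember what went wrong.
            report["faults"][zone] = str(err)
    return report


def fake_run_zone(zone):
    # Simulated hardware: zone 3 has the short.
    if zone == 3:
        raise ShortCircuit("short detected on zone 3")


watering_report = run_program([1, 2, 3, 4], fake_run_zone)
```

The report is the part none of my three controllers managed: it says which zone failed and why.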

Fancy water saving features, cool Web sites, and snazzy notifications are fine. But they're like gold-plated bathroom fixtures in a hotel room with dirty sheets if the controller doesn't do its basic job: run the sprinklers reliably.

Fitbit as Gizmo


In the taxonomy of Bruce Sterling's Shaping Things, the Fitbit is a Gizmo.

"Gizmos" are highly unstable, user-alterable, baroquely multifeatured objects, commonly programmable, with a brief lifespan. Gizmos offer functionality so plentiful that it is cheaper to import features into the object than it is to simplify it. Gizmos are commonly linked to network service providers; they are not stand-alone objects but interfaces. People within an infrastructure of Gizmos are "End-Users."

People buy Fitbits believing that they're buying a thing, but in fact, they're buying a network service. The device is merely the primary interface to that service. The Fitbit is useless without the service. Just a chunk of worthless plastic and silicon.

The device is demanding. We buy Fitbits and then fiddle with them incessantly. Again, to quote Bruce:

...Gizmos have enough functionality to actively nag people. Their deployment demands extensive, sustained interaction: upgrades, grooming, plug-ins, plug-outs, unsought messages, security threats, and so forth.

Sometimes we're messing with them because we're bored and relieve it with a little configuration. Often we're forced to configure and reconfigure because it's not working. We feel guilt over buying something we're not using. Usually, the Fitbit ends up in a drawer unused after the guilt wears off and the pain of configuration overwhelms the perceived benefit.

Fitbit isn't selling things. They probably fancy themselves selling better health or fitness. But, Fitbit is really selling a way to measure, and perhaps analyze, some aspect of your life. They package it up like a traditional product and put it on store shelves, but the thing you buy isn't a traditional product. Without the service and the account underlying it, you have nothing.

Of course, I'm not talking about Fitbit alone. Fitbit is just a well-known example. Everything I've said applies to every current product in the so-called Internet of Things. They are all just interfaces to the real product: a networked service. I say "so-called" because a more appropriate name for the Gizmo ecosystem is CompuServe of Things.

Bruce's book is a trail guide to what comes after Gizmos: something he calls a "spime." Spimes are material instantiations of an immaterial system. They begin and end with data. Spimes will have a different architecture than the CompuServe of Things. To work, they will cooperate and interact in true Internet fashion.

Notes from Gluecon 2016


I took the following notes during various sessions at Gluecon 2016 at the Omni Interlocken in Broomfield, Colorado. Notes were live tweeted during the event on @windley using Kevin Marks' Noter Live tool.

Melody Meckfessel:

starting off the day with how cloud accelerates innovation in software development

Three waves of cloud tools: Colocation, virtualized data centers, and 3rd wave: actual, global, flexible cloud

Goal is NoOps: auto everything. No need to manage or spin up servers. Write code, rather than manage servers

Kubernetes manages containers, supports multiple envs & container runtimes, 100% open source

Speaking of developers: "We keep raising the bar on ourselves"

Goal is to let developers focus on code. PaaS (e.g. appEngine) needs to evolve

Now: PaaS is a walled garden; Future: choice of tools, more complex apps, global scale

31% time spent troubleshooting; often in time-critical situations; need better tools: trace, error reporting, prod debug

Duncan Johnson-Watt:

Up next: building apps on the blockchain

Hyperledger is new project from Linux Foundation.

Business is increasingly interested in permissioned blockchains rather than promiscuous or permissionless blockchains

Governance of blockchain will be incredibly important if we're going to bet on this technology

Requirements for blockchains vary greatly across different use cases

"This is too important to be owned by a single entity" speaking of distributed ledger tech

Hyperledger has ~50 members, >$6M in funding, 2300 membership requests

Brian Behlendorf is now executive director

Showing how to deploy a blockchain application with Cloudsoft AMP

Key concepts: shared ledger, smart contract, consensus network, membership, events, management, wallet, integration

live demo of asset transfer demo. #brave Challenge is speed at the moment. Proof of asset ownership controls transfer

Mary Scotton:

want to be diverse? Start by diversifying your twitter feed.

Jen Tong:

Crash course in electrical engineering at start of #IoT session

Done with signals; moving on to components. "kind of like legos except sometimes they catch on fire"

bread boards, perf boards, jumpers, resistors, LEDs, push button, capacitor, servo motors

The recipe: put components on bread board, arduino uno converts component signals, feeds to RaspPI over USB

Johnny-Five is a JS library for #IoT on Arduino Uno

Firebase is a real-time database; "data that is best served fresh"

Where's the Bus? as an example of real-time data. I care where the bus is now, not yesterday.

Collaborative drawing is another example where real-time matters. Much less interesting with several second lag

When you're working on your Arduino always do it unplugged or you'll be sad.

After getting the button and LED hooked up: we have a thing, but no Internet yet. Let's add Firebase

why live code when you can live copy/paste

Firebase shows button presses in FB console. "Now, let's go through the Internet to change the LED status"

Connects button to her slides. Button adds rick-roll to the slide.

Celebrate the first time something catches on fire

Slides are here:

Alex Balazs:

speaking on how Intuit is breaking up the monolith

Intuit moving everything to AWS; shutting down local data centers

TurboTax is a $2 Billion business; product managers don't want to touch it (other than updating tax logic)

our vision is to make tax prep obsolete; this makes TurboTax irrelevant

20yo tech stack; terrible, horrible, monolith. Written by tax specialists who became programmers.

Going beyond the interview to personal experience. Why ask a 20yo barista in NYC if they get California RR Retirement?

Can't replace TurboTax by creating something complex to replace it. #FAIL'd twice already. #gallslaw

2nd problem: trying to create better TurboTax instead of creating product to kill TurboTax

Breaking up the TurboTax monolith: everything as a service; quickly create frictionless experiences

Teams work at their own speed; teams are decoupled; services built for other teams.

path forward to create a pirate ship. Everyone wants to be a pirate.

pirate ship means "this was not a sanctioned project"

took the narrowest part of TurboTax: vehicle registration, 3 screens; TurboTax interview has 53K screens

hardest problem to solve in TurboTax: what does back button do.

back button takes you back a screen; should it save the data or not?

Built vehicle registration in 4 weeks and pushed to production; Sr leadership then sanctioned project;

Old stack: changing user experience took 3 months. New stack: 1 week.

Now 14 most common topics in TurboTax are running on new stack

In a world with 50K interview screens, you can't build them manually. Intuit has a "tax player" for tax content

Intuit's ability to enter markets on new devices skyrocketed

Old product had 6 different "beaconing" libraries

In three years Intuit will have eliminated every line of code from the monolith and be completely service based

1. Everything as a service 2. Attack the monolith; 3. Build common application fabric (prescriptive on standards)

Rajesh Raman:

Time is Hard: Doing Meaningful Things with Data

Doing meaningful things with fast data

fast data continually reflects changing state; enables real time decision making

Time data often implies big data

Ex: sentiment analysis on Twitter; seismic sensor networks; data fusing from distributed sensors (phones in cars)

individual records are small; all have timestamps; repeated measurements yield time series

Value of time series data diminishes over time; 2 strategies: store nothing & store everything

tiered storage: recent data at high fidelity; older data at low fidelity; store analysis not raw sensor data

Batch processing is common, but reduces responsiveness

Alternative is stream processing; stream processor is stateful and incremental; typically using O(1) algorithm

stream processing: read-once, write-once. No do overs.

stream processing is not only far more timely, but also more efficient than batch processing.

BUT: you lose ability to rewind and do a redo.
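
To illustrate "stateful and incremental" with O(1) work per record, here is a small sketch using Welford's online algorithm for mean and variance. It is a generic example and not something shown in the talk. The class name is an assumption.

```python
class RunningStats:
    """Mean and variance over a stream, updated one record at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Constant time and memory per record; the record is never stored,
        # so there is no way to go back and reprocess it.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self._m2 / self.count if self.count else 0.0


stats = RunningStats()
for reading in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(reading)
```

The state is three numbers no matter how long the stream runs, which is where the efficiency comes from and also why there are no do overs.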

Good news: simple primitives take you a long way; bad news: dealing with time is hard

merging streams by timestamp; skewed, irregular, bursty, laggy, jittery, lossy

skew: data from different time series arrive at different timestamps

irregular: aperiodic or unsteady periodicity

bursty: no activity for while, then all arrive at once

laggy: difference between generation and receipt

skew happens all the time. must logically align data within each period; requires understanding data

skew requires alignment between alignment and the periodicity the data is arriving at

Burstiness and lag require deciding how long to wait

The longer you wait the more likely data will appear; but computation is less timely

wait time must be bounded in some way because of finite resources

Types of clocks: measurement time, receipt time, analytic/processing time

deadlines exclude data: schemes: static guarantees timeliness while dynamic adapts to changing conditions

dynamic deadlines can be set based on how much data is being excluded by deadline
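
A dynamic deadline of this kind could be sketched as a small feedback rule. The talk gave no formula, so the function name, the 5% target, the step factor, and the bounds below are all assumptions.

```python
def adjust_deadline(deadline_s, late, total,
                    target_excluded=0.05, step=1.25, min_s=0.1, max_s=60.0):
    """Return the next deadline given how much data missed the current one.

    late:  records that arrived after the deadline in the last period
    total: all records for the last period
    """
    if total == 0:
        return deadline_s
    excluded = late / total
    if excluded > target_excluded:
        deadline_s *= step        # too much data excluded: wait longer
    elif excluded < target_excluded / 2:
        deadline_s /= step        # plenty of margin: be more timely
    # Waiting must stay bounded because resources are finite.
    return max(min_s, min(max_s, deadline_s))
```

A static deadline is the special case where this function is never called: timeliness is guaranteed, but the amount of excluded data floats with conditions.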

Elliot Turner:

Morning starts off with cognitive computing

Arthur Samuel wrote the first game-playing program (checkers) on the IBM 701

IBM's Deep Blue (chess) was 10M times faster than the IBM701; massively parallel with specialized chess playing ASICs

In 1997 Deep Blue was in top 500 super computers. Today the same compute is available on a $400 graphics card

IBM Watson was a 3-5 year project; team of 15 people. Within a year, it could regularly beat some champions.

Problem domain: broad domain; complex language; high precision, accurate response

Real Jeopardy champions buzz in 80% of the time & answer correctly 80% of the time. That's incredible performance

Can't be solved with lookup table. In 20,000 questions there are over 2500 types; biggest bucket is 3%

Cognitive has evolved from systems that play games to multi-modal understanding: speech, emotions; all API driven

Cognitive system is a partnership between humans & computers

Cognitive computing depends on understanding, reasoning, and learning.

Cognitive systems are trained, not programmed. They work with humans to develop their capabilities.

Turning cognitive computing loose on Internet leads to interesting results. For example, it learned dogs are people.

The problem is that there's not "one truth" In some contexts people equate their dogs with people. But dogs aren't people

Now, understanding human speech is an API call away

Three REST calls: speech understanding -> translation -> txt-to-speech yields a speech translator

Eric Norlin:

thanks Kim, Brian, and rest of the staff. They make the conference work

Brendan Burns:

Introspection: find out what went wrong; Insight is finding out things you didn't know

Introspection requirements: specification, status, events, attribution

Audit requirements: transparency, immutability, verification, restrictions & limits, automation (APIs)

Insight requirements: dynamic organization, interactive exploration, visualization

At first blush, IaaS checks a lot of these boxes, but not all of them (eg. immutability, monitoring APIs are wrong)

Containers are the right API object, can be immutable, and can be verifiable

Cluster management has organization, introspection, specification, status, events

Demo of KSQL for querying Kubernetes. Here's the repo:

Kubernetes API server enables immutability by limiting actions on containers (eg ensure code checked in)

ABAC in cluster management allows policies to control actions (eg access, deployment)

Policy ex: Only allow certain people to create containers that come from specific registries/repos
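
The shape of such a policy check is simple enough to sketch. This is a generic attribute-based check and not the Kubernetes ABAC policy format. The function names, users, and registry hosts are made up.

```python
def registry_of(image):
    """Return the registry part of 'registry/repo:tag', or None if absent.

    Simplification: the first path component is treated as the registry.
    """
    if "/" not in image:
        return None
    return image.split("/", 1)[0]


def may_create(user, image, policy):
    """Allow creation only if the user is cleared for the image's registry."""
    allowed = policy.get(user, set())
    registry = registry_of(image)
    return registry is not None and registry in allowed


# Hypothetical policy: which users may deploy from which registries.
policy = {
    "alice": {"registry.example.com"},
    "bob": {"registry.example.com", "staging.example.com"},
}
alice_ok = may_create("alice", "registry.example.com/web:1.4", policy)
alice_blocked = may_create("alice", "staging.example.com/web:1.4", policy)
```

Images with no explicit registry are denied here, on the assumption that the policy wants every image traceable to a known source.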

Admission control policies can control resource use (eg auto-approve if resource overage explained in issue ticket)

Mosquito checks for things that haven't changed in 3 months. Finds dead resources.

Or find services or machines that restart the most.

Benjamin Hindman:

@alexwilliams interviewing

VMs didn't change things; containers and cluster management did.

Rosanna Myers:

Robots are lots cheaper; robots are safer;

A collaborative robot or cobot

KRL is Kuka Robot Language (or Kynetx Rule Language)

A 3D printer is a blind robot. Having vision is a good thing.

Big breakthrough is cloud robotics. Off load processing to the cloud. Ex: self driving cars

Big advantages: all robots learn from experiences of the others

Cloud robotics provides designing freedom, collaborative learning and application development

Manufacturing is still an area for robotics; only 10% of manufacturing is automated. Barrier: robots are hard to use

Research is another. 90% of research projects not repeatable. And the pipetting by hand for hours isn't fun.

Another: 12M people require full time care. Eg. attach robot arm to wheelchair

Joe Beda:

I made the term "production identity" up

Google systems have largely been in production for > 10yrs & are highly integrated

GOOG doesn't have all the answers, but they do have all the problems.

Solutions aren't as important as understanding how to break down and frame the problem

Question: how do we identify production services

Trend: Manual -> Automation

We have more things we're dealing with and they change more often than they have in the past.

Evolving security: (1) network problem; lock down network (2) application security both operations and code analysis

Micro segmentation: surround any piece of hardware with its own policy. chroot for your network

Does reachability imply authorization? Doesn't sound very secure

Microservices -> many connections between components. When a micro service has 100 connections, reachability doesn't cut it

Devops is therapy for large organizations

identity is a lower level function than authn or authz

We can come up with a one-size-fits-all solution for production ID; for authn and authz, not so much

Many applications have their own idea of a user. So secret stores become key translators

GOOG has LOAF: stuff in production has an identity that is transported ambiently

SPIFFE: Secure Production Identity Framework for Everyone

SPIFFE is dialtone for identity


Developer experience: get SPIFFE ID, give it key pair with certificate chain & root certs to trust

Cert usage: TLS verification and message signing

When we talk about message signing, think JWT

SPIFFE could be integrated in micro service and RPC frameworks, in smart "side car" proxies, & off-the-shelf systems

Future directions for SPIFFE: federation, authorization, delegation, capability tokens

See for more information

Chris Richardson:

speaking on pattern languages for microservices

Successful software development depends on architecture, process & organization

Organization should be small, autonomous teams

Process should be agile

There's no silver bullet for architecture (reference to Fred Brooks)

Architecture patterns are a reusable solution to a problem in a particular context

Patterns force you to consider tradeoffs: benefits, drawbacks, issues to resolve

Patterns force you to consider other patterns: alternative solution and solution introduced to the pattern

Microservices pattern available at

Infrastructure patterns include deployment and communication patterns

Core patterns include cross-cutting concerns

Application patterns include database architectures and data consistency

Monolithic architectures are relatively simple to develop, test, deploy, & scale (in certain contexts)

Problem is that successful applications keep growing; adding code day-after-day; you end up with a "ball of mud"

Monolithic architectures break process goals of agile and continuous delivery & the org goal of autonomous teams

Microservices architecture functionally decomposes app into many services intermediated by API gateways

Microservice architecture drawbacks: complexity, IPC, partial failure, TxN span multiple services; testing is hard

Issues: deployment; communication, partitioning; distributed data management

Shared databases lead to tight coupling between services; each service needs its own data store

Data store per service -> services communicating via API only

Event-driven, eventually consistent architecture is solution to data store per service downsides

Dual-write problems traditionally solved using TxNs. Instead must reliably publish events; use event sourcing

2nd problem: queries are no longer easy across several services; pattern is CQRS and materialized views
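To make the last two notes concrete, here is a small in-process sketch of event sourcing feeding a CQRS-style materialized view. It is not from the talk: the order events, class names, and the synchronous publish loop are all assumptions. A real system would publish through a message broker and update views asynchronously, which is where the eventual consistency comes from.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str        # e.g. "OrderPlaced" (hypothetical event names)
    order_id: str
    customer: str
    amount: int = 0


class EventStore:
    """Append-only log; the log, not a table row, is the source of truth."""

    def __init__(self):
        self._events = []
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def append(self, event):
        # Storing the event and publishing it happen in one step here,
        # which is how event sourcing sidesteps the dual-write problem.
        self._events.append(event)
        for handler in self._subscribers:
            handler(event)


class CustomerTotalsView:
    """Query side (CQRS): a materialized view kept current from events."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, event):
        if event.kind == "OrderPlaced":
            self.totals[event.customer] += event.amount
        elif event.kind == "OrderCancelled":
            self.totals[event.customer] -= event.amount


store = EventStore()
view = CustomerTotalsView()
store.subscribe(view.apply)

store.append(Event("OrderPlaced", "o1", "alice", 30))
store.append(Event("OrderPlaced", "o2", "alice", 12))
store.append(Event("OrderCancelled", "o2", "alice", 12))
print(dict(view.totals))
```

Queries read from the view rather than joining across services, so the cross-service query problem goes away at the cost of the view lagging the log.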

There are many more patterns for deployment, communication, etc.

Mark VanderWiele:

Connect and control #IoT devices in minutes using voice commands

Architecture uses MQTT to publish & subscribe data from device; processing in cloud; connecting homekit & TI devs

Learned: make devices more self-describable; allows generic UIs that devices plug into and work

Voice is the last mile in device interaction

Doing demos: monitor and control a device with voice commands

Demo using services from Bluemix services catalog

"ambient computing at your disposal"

Node-RED used to get commands from speech application; program processes keywords; sends JSON using MQTT to robot
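The keyword-to-JSON step is simple enough to sketch. The vocabulary, payload fields, and topic name below are my assumptions (the talk didn't list them), and the sketch stops short of the actual MQTT publish, which would hand the topic and payload to whatever MQTT client is in use.

```python
import json

# Hypothetical keyword table; the talk did not list the actual vocabulary.
COMMANDS = {
    "forward": {"action": "roll", "heading": 0},
    "back": {"action": "roll", "heading": 180},
    "stop": {"action": "stop"},
}


def transcript_to_message(transcript: str):
    """Return (topic, JSON payload) for the first known keyword, else None."""
    for raw_word in transcript.lower().split():
        word = raw_word.strip(".,!?")
        if word in COMMANDS:
            # The topic name is an assumption made for this sketch.
            return "robot/commands", json.dumps(COMMANDS[word], sort_keys=True)
    return None


message = transcript_to_message("Please roll FORWARD now")
print(message)
```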

Using an iPod touch as the HomeKit gateway; using another iPod touch as gateway for Bluetooth Spheros

Created composite applications from multiple device types and the IoT foundation

Capabilities unfortunately change when manufacturers send firmware updates

John Musser:

APIs can be great, but not always... API Ops is the answer

APIs go down; have unversioned changes; API Ops to the rescue

API Ops is like DevOps for APIs

API Ops should build, test, and deploy APIs more reliably.

API Ops and Dev Ops are similar, but different in subtle ways

We're seeing more and more stories about API failures.

API Ops: design, build, test & release APIs more rapidly, frequently, & reliably

Elephant in the room is micro services; DevOps necessary for managing all these services.

Use of API specification has exploded. So has the number of API tools

Why all the tools? The API Lifecycle. 1st gen focused on operation. 2nd gen focused on the rest

API Lifecycle: requirements, design, development, test, deployment, and operations

DevOps is about looking at entire lifecycle. API Ops is similarly focused on entire lifecycle

Going meta: APIs for API Ops

The entire API lifecycle can be controlled with APIs

I Am Sybil

Split personalities

Online, I am Sybil. So are you. You have no digital representation of your individual identity. Rather, you have various identities, disconnected and spread out among the administrative domains of the various services you use.

An independent identity is a prerequisite to being able to act independently. When we are everywhere, we are nowhere. We have no independent identity and are thus constantly subject to the intervening administrative identity systems of the various service providers we use.

Building a self-sovereign identity system changes that. It allows individuals to act and interact as themselves. It allows individuals to have more control over the way they are represented and thus seen online. As the number of things that intermediate our lives explodes, having a digital identity puts you at the center of those interchanges. We gain the power to act instead of being acted upon.

This is why I believe the discussion of online privacy sells us short. Being self-sovereign is about much more than controlling how my personal data is used. That's playing defense and is a cheap substitute for being empowered to act as an individual. Privacy is a mess of pottage compared to the vast opportunities that being an autonomous digital individual enables.

Technically, there are several choices for implementing a self-sovereign identity system. Most come down to one of three choices:

  • a public, permissionless distributed ledger (blockchain)
  • a public, permissioned distributed ledger
  • a private, permissioned distributed ledger1

Public or private refers to who can join: anyone can join a public ledger. A public system allows anyone to get an identity on the ledger. Private systems restrict who can join. I owe this categorization to Jason Law.

Permissioned and permissionless refers to how the ledger's validators are chosen. As I discussed in Properties of Permissioned and Permissionless Blockchains, these two types of ledgers place different emphasis on the importance of protection from censorship and protection from deletion. People of a more libertarian bent will prefer permissionless because of its emphasis on protection from censorship while those who need to work within regulatory regimes will prefer permissioned.
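The two axes give four combinations, of which three make sense. As a quick sketch (the class and field names are mine, purely illustrative), you can enumerate them and drop the one the footnote calls an oxymoron:

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class LedgerType:
    public: bool         # True: anyone can join and get an identity
    permissioned: bool   # True: validators are chosen rather than open to all

    @property
    def coherent(self) -> bool:
        # The post's footnote treats private + permissionless as an oxymoron,
        # so that single combination is excluded.
        return self.public or self.permissioned

    def label(self) -> str:
        access = "public" if self.public else "private"
        validation = "permissioned" if self.permissioned else "permissionless"
        return f"{access}, {validation}"


choices = [
    t.label()
    for t in (LedgerType(p, v) for p, v in product([True, False], repeat=2))
    if t.coherent
]
print(choices)
```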

We could debate the various benefits of each of these types of self-sovereign identity systems, but in truth they are all preferable to what we have today, as each allows individuals to create and control identities independent of the various administrative domains with which people interact. In fact, I suspect that one or more instantiations of each of these three types will exist in parallel to serve different needs. Unlike the physical world where we live in just one place, online, we can have a presence in many different worlds. People will use all of these systems and more.

Regardless of the choices we make, the principle that ought to guide the design of self-sovereign identity systems is respect for people as individuals and ensuring they have the ability to act as such.

In my discussion on the CompuServe of Things, I said:

"On the Net today we face a choice between freedom and captivity, independence and dependence."

I don't believe this is overstated. As more and more of our lives are intermediated by software-based systems, we will only be free if we are free to act as peers of these services. An independent identity is the foundation for that freedom to act.

  1. A private, permissionless ledger is an oxymoron.

Building a Virtual University

Imagine you wanted to create a virtual university (VU)1. VU admits students and allows them to take courses in programs that lead to certificates, degrees, and other credentials. But VU doesn't have any faculty or even any courses of its own. VU's programs are constructed from courses at other universities. VU's students take courses at whichever university offers them. In an extreme version of this model, VU doesn't even credential students. Rather, credentials come from participating institutions who have agreed, on a program-by-program basis, to accept certain transfer credits from other participating universities to fulfill program requirements.

Tom Goodwin writes

Uber, the world’s largest taxi company, owns no vehicles. Facebook, the world’s most popular media owner, creates no content. Alibaba, the most valuable retailer, has no inventory. And Airbnb, the world’s largest accommodation provider, owns no real estate.

These companies are thin layers sitting on an abundance of supply. They connect customers to that supply. VU follows a similar pattern. VU has no faculty, no campus, no buildings, no sports teams. VU doesn't have any classes of its own. Moreover, as we'll see, VU doesn't even have much student data. VU provides a thin layer of software that connects students anywhere in the world with a vast supply of courses and degree programs available.

There are a lot of questions about how VU would work, but what I'd like to focus on in this post is how we could construct (or, as we'll see later, deconstruct) IT systems that support this model.

Traditional University IT System

Before we consider how VU can operate, let's look at a simple block model of how traditional university IT systems work.

Figure: Traditional University IT System Architecture

Universities operate three primary IT systems in support of their core business: a learning management system (LMS), a student information system (SIS), and a course management system.2

The LMS is used to host courses and is the primary place that students use course material, take quizzes, and submit assignments. Faculty build courses in the LMS and use it to evaluate student work and assign grades.

The SIS is the system of record for the university and tracks most of the important data about students. The SIS handles student admissions, registrations, and transcripts. The SIS is also the system that a university uses to ensure compliance with various university and government policies, rules, and regulations. The SIS works hand-in-hand with the course management system that the university uses to manage its offerings.

The SIS tells the LMS who's signed up for a particular course. The LMS tells the SIS what grades each student got. The course management system tells the SIS and LMS what courses are being offered.

Students usually interact with the LMS and SIS through Web pages and dedicated mobile apps.

VU presents some challenges to the traditional university IT model. Since these university IT systems are monoliths, you might be able to do some back-end integrations between VU's systems and the SIS and LMS of each participating university. The student would then have to use VU's systems and those of each participating university.

The Personal API and Learning Records

I've written before about BYU's efforts to build a university API. A university API exposes resources that are directly related to the business of the university such as /students, /instructors, /classes, /enrollments, and so on. Using a standard, consistent API developers can interact with any relevant university system in a way that protects university and student data and ensures that university processes are followed.

We've also been exploring how a personal API functions in a university setting. For purposes of this discussion, let's imagine a personal API that provides an interface to two primary repositories of personal data: the student profile and the learning record store (LRS). The profile is straightforward and contains personal information that the student needs to share with the university in order to be admitted, register for and take courses, and complete program requirements.

Figure: A Student Profile

The LRS stores the stream of all learning activities by the student. These learning activities include things like being admitted to a program, registering for a class, completing a reading assignment, taking a quiz, attending class, getting a B+ in a class, or completing a program of study. In short there is no learning activity that is too small or too large to be recorded in the LRS. The LRS stores a detailed transcript of learning events.3

One significant contrast between the traditional SIS/LMS that students have used and the LRS is this: the SIS/LMS is primarily a record of the current status of the student that records only coarse-grained achievements whereas the LRS represents the stream of learning activities, large and small. The distinction is significant. My last book was called The Live Web because it explored the differences between systems that make dynamic queries against static data (the traditional SIS/LMS) and those that perform static queries on dynamic streams of data. The LRS is decidedly part of the live web.
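Here's a minimal sketch of that contrast, assuming a simple in-memory event list. The class names, verbs, and the course "CS 462" are hypothetical. The LRS keeps every event, and the traditional transcript falls out as just one view computed over the stream.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LearningEvent:
    student: str
    verb: str      # e.g. "registered", "completed", "earned-grade"
    obj: str       # what the verb applies to
    detail: str = ""


@dataclass
class LearningRecordStore:
    events: list = field(default_factory=list)

    def record(self, event: LearningEvent) -> None:
        self.events.append(event)   # append-only: nothing is overwritten

    def transcript(self, student: str) -> dict:
        """Derive the coarse-grained view (course -> grade) from the stream."""
        return {
            e.obj: e.detail
            for e in self.events
            if e.student == student and e.verb == "earned-grade"
        }


lrs = LearningRecordStore()
lrs.record(LearningEvent("phillip", "registered", "CS 462"))
lrs.record(LearningEvent("phillip", "completed", "reading assignment 10"))
lrs.record(LearningEvent("phillip", "took", "quiz 3", "8/10"))
lrs.record(LearningEvent("phillip", "earned-grade", "CS 462", "B+"))
print(lrs.transcript("phillip"))
```

The SIS-style record can always be rebuilt from the stream, but the reverse isn't true: the reading assignment and the quiz never show up in a transcript.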

The personal API, as its name suggests, may provide an interface to any data that the person owns, but right now, we're primarily interested in the profile and LRS data. For purposes of this discussion, we'll refer to the combination profile and LRS as the "student profile."

We can construct the student profile such that it can be hosted. By hosted, I mean that the student profile is built in a way that each profile could, potentially, be run on different machines in different administrative domains, without loss of functionality. One group of students might be running their profiles inside their university's Domain of One's Own system, another group might be using student profiles hosted by their school, other students might be using a commercial product, and some, intrepid students might choose to self-host. Regardless, the API provides the same functionality independent of the domain in which the student profile operates.

Even when the profile is self hosted, the information can still be trusted because institutions can digitally sign accomplishments so others can be assured they're legitimate.
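As a rough illustration of why a self-hosted record can still be trusted, the sketch below tags an accomplishment and then detects tampering. Be aware of the big assumption: it uses an HMAC from Python's standard library as a stand-in, which is a shared-secret scheme and not a true digital signature. A real deployment would use public-key signatures so that anyone holding the institution's public key could verify a record without being able to forge one. The key, record fields, and function names are all hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in for the institution's signing key. A real system would sign with
# a private key and publish the matching public key for verifiers.
INSTITUTION_KEY = b"demo-key-not-for-production"


def canonical(record: dict) -> bytes:
    """Serialize deterministically so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def sign(record: dict, key: bytes = INSTITUTION_KEY) -> str:
    return hmac.new(key, canonical(record), hashlib.sha256).hexdigest()


def verify(record: dict, tag: str, key: bytes = INSTITUTION_KEY) -> bool:
    return hmac.compare_digest(sign(record, key), tag)


accomplishment = {"student": "phillip", "course": "CS 462", "grade": "B+"}
tag = sign(accomplishment)
tampered = dict(accomplishment, grade="A")

print(verify(accomplishment, tag), verify(tampered, tag))
```

The point is only that the record and its tag can live anywhere, including a machine the student controls, and an altered grade no longer checks out.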

Deconstructing the SIS

With the personal-API-enabled student profile, we're in a position to deconstruct the University IT systems we discussed above. As shown in the following diagram, the student profile can be separated from the university system. They interact via their APIs. Students interact with both of them through their respective APIs using applications and Web sites.

Figure: Interactions of the University and Student Profile

The university API and the student profile portions of the personal API are interoperable. Each is built so that it knows about and can use the other. For example, the university API knows how to connect to a student profile API, can understand the schema within, respects the student profile's permissioning structures, and sends accomplishments to the LRS along with other important updates.

For its part, the student profile API knows how to find classes, see what classes the student is registered for, receive university notifications, check grades, connect with classmates, and send important events to the university API.

VU can use both the university systems and the student profile. Students can access all three via their APIs using whatever applications are appropriate.

Figure: Adding a Virtual University

The VU must manage programs made from courses that other universities teach and keep track of who is enrolled in what (the traditional student records function). But VU can rely on the university's LMS, major functions of its SIS, and information in the student profile to get its job done. For example, if VU trusted that the student profile would be consistently available, it would need to know who its students are, but could evaluate student progress using transcript records written by the university to the student profile.

Building the Virtual University

With this final picture in mind, it's easy to see how multiple universities and different student profile systems could work together as part of VU's overall offerings.

Figure: The Virtual University

With these systems in place, VU can build programs from courses taught at many universities, relying on them to do much of the work in teaching students and certifying student work. Here is what VU must do:

  • VU still has the traditional college responsibility of determining what programs to offer, selecting courses that make up those programs, and offering those to its students.
  • VU must run a student admissions process to determine who to admit to which programs.
  • VU has the additional task of coordinating with the various universities that are part of the consortium to ensure they will accept each other's courses as pre-requisites and, if necessary, as equivalent transfer credits.
  • VU must evaluate student completion of programs and either issue certifications (degrees, certificates of completion, etc.) itself or through one of its member institutions.
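The last two responsibilities can be sketched together: given an agreed table of course equivalences and the transcript records in a student profile, VU can work out which program requirements remain. Everything named here (the program, universities, courses, and the remaining_requirements function) is invented for illustration.

```python
# Hypothetical program definition and equivalence table, for illustration.
PROGRAM = {"intro-programming", "data-structures", "statistics"}

# (university, course) -> program requirement it satisfies, as agreed
# program-by-program among the participating institutions.
EQUIVALENCES = {
    ("univ-a", "CS 101"): "intro-programming",
    ("univ-b", "COMP 210"): "data-structures",
    ("univ-a", "STAT 121"): "statistics",
    ("univ-c", "MATH 221"): "statistics",
}


def remaining_requirements(transcript):
    """transcript: iterable of (university, course, passed) records
    read from the student profile."""
    satisfied = {
        EQUIVALENCES[(univ, course)]
        for univ, course, passed in transcript
        if passed and (univ, course) in EQUIVALENCES
    }
    return PROGRAM - satisfied


records = [
    ("univ-a", "CS 101", True),
    ("univ-b", "COMP 210", True),
    ("univ-c", "MATH 221", False),   # attempted but not passed
    ("univ-b", "ART 100", True),     # not part of this program
]
print(remaining_requirements(records))
```

When the result is empty, the student has completed the program and VU, or a member institution, can issue the credential.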

Universities aren't responsible for anything more than they already do. Their IT systems are architected differently to have an API and to interact with the student profile, but otherwise they are very similar in functionality to what is in place now. Accomplishments at each participating institution can be recorded in the student profile.

VU students apply to and are admitted by VU. They register for classes with VU. They look to VU for program guidance. When they take a class, they use the LMS at the university hosting the course. The LMS posts calendar and notifications to their student profile. The student profile becomes the primary system the student uses to interact with both VU and the various university LMSs. They have little to no interaction with the SIS of the respective universities.

One of the advantages of the hosted model for student repositories is that they don't have to be centrally located or administered. As a result student data can be located in different political domains in accordance with data privacy laws.

Note that the student profile is more than a passive repository of data that has limited functionality. The student profile is an active participant in the student's experience, managing notifications and other communications, scheduling calendar items, and even managing student course registration and progress evaluation. The student profile becomes a personal learning environment working on the student's behalf in conjunction with the various learning systems the student uses.

Since the best place to integrate student data is in the student profile, it ought to exist long before college. Then students could use their profile to create their application. There's no reason high school activities and results from standardized testing shouldn't be in the LRS. Student-centric learning requires student-centric information management.

We can imagine that this personal learning environment would be useful outside the context of VU and provide the basis for the student's learning even after she graduates. By extending it to allow programs of learning to be created by the student or others, independent of VU, the student profile becomes a tool that students can use over a lifetime.

The design presented here follows the simple maxim that the student is the best place to integrate information about the student. By deconstructing the traditional centralized university systems, we can create a system that supports a much more flexible model of education. APIs provide the means of modularizing university IT systems and creating a student-centric system that sits at the heart of a new university experience.


  1. Don't construe this post to be "anti-university." In fact, I'm very pro-university and believe that there is great power in the traditional university model. Students get more when they are face-to-face with other learners in a physical space. But that is not always feasible and once students leave the university, their learning is usually much less structured. The tools developed in this post empower students to be life-long learners by making them more responsible for managing their own learning environment.
  2. Universities, like all large organizations, also use things like financial systems, HR systems, and customer management systems, along with a host of other smaller and support IT systems. But I'm going to ignore those in this post since they're really boring and not usually specialized from those used by other businesses.
  3. The xAPI is the proposed way that LRSs interact with each other and other learning systems. xAPI is an event-based protocol that communicates triples that have a subject, verb, and object. Using xAPI, systems can communicate information such as "Phillip completed reading assignment 10." In my more loose interpretation, an LRS might also store information from Activity Streams or other ways that event-like information can be conveyed.
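For a sense of what such a triple looks like on the wire, here is a sketch that builds an xAPI-style statement for the example above. The actor/verb/object layout follows my reading of the xAPI specification, and the verb IRI is meant to be the ADL vocabulary entry for "completed"; the email address, activity URL, and make_statement helper are made up, and the spec should be checked for required fields before relying on this.

```python
import json


def make_statement(actor_name, actor_email, verb_id, verb_label,
                   object_id, object_name):
    """Build an xAPI-style statement: actor (subject), verb, object."""
    return {
        "actor": {"name": actor_name, "mbox": f"mailto:{actor_email}"},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {
            "id": object_id,
            "definition": {"name": {"en-US": object_name}},
        },
    }


# The email address and activity URL are made up for this example.
statement = make_statement(
    "Phillip",
    "phillip@example.edu",
    "http://adlnet.gov/expapi/verbs/completed",
    "completed",
    "https://example.edu/courses/cs462/readings/10",
    "reading assignment 10",
)
print(json.dumps(statement, indent=2))
```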