Authorization, Workflow, and HATEOAS

APIs present different authorization challenges than a Web site or other service that people access directly. Typically, API access is granted using what are called "developer keys" but are really an application-specific identifier and password (secret). That allows the API to track who's making what call for purposes of authorization, throttling, or billing.

Often, more fine-grained permissioning is needed. If the desired access control is for data associated with a user, the API might use OAuth. OAuth is sometimes called "Alice-to-Alice sharing" because it's a way for a user on one service to grant access to their own account at some other service.

For more fine-grained authorization than just user-data control, I'm a proponent of policy-engine-based access control to resources. A policy engine works in concert with the API manager to answer questions like "Can Alice perform action X on resource Y?" The big advantages of a policy engine are as follows:

  • A policy engine allows access control policy to be specified as pattern-based declarations rather than in algorithms embedded deep in the code.
  • A policy engine stops access at the API manager, saving resources below the manager from being bothered with requests that will eventually be thrown out.
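To make "pattern-based declarations" concrete, here is a minimal sketch in Python. The roles, actions, resource paths, and the first-match-wins, default-deny evaluation order are all assumptions made for the example, not features of any particular policy engine.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """One declarative policy entry; each field is a glob pattern."""
    role: str
    action: str
    resource: str
    allow: bool


# Hypothetical policy table. The first matching rule wins; no match means deny.
POLICY = [
    Rule("registrar", "*", "/courses/*", True),
    Rule("faculty", "read", "/courses/*", True),
    Rule("*", "delete", "/courses/*", False),
    Rule("student", "read", "/courses/*/syllabus", True),
]


def is_allowed(role, action, resource, policy=POLICY):
    """Answer "can this role perform this action on this resource?"."""
    for rule in policy:
        if (fnmatchcase(role, rule.role)
                and fnmatchcase(action, rule.action)
                and fnmatchcase(resource, rule.resource)):
            return rule.allow
    return False  # default deny


print(is_allowed("faculty", "read", "/courses/101"))
print(is_allowed("faculty", "delete", "/courses/101"))
```

The policy lives in a table that can be read and changed without touching the code that evaluates it, which is the point of the first bullet above.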

Recently, Joel Dehlin got me thinking of another pattern for API access control that relies on workflow.

Consider an API for course management at a university. The primary job of the course management API is to serve as a system of record for courses that the university teaches. There are lots of details about how courses relate to each other, how they're associated with programs, assigned to departments, expected learning outcomes, and so on. But we can ignore that for now. Let's just focus on how a course gets added.

The university doesn't let just anyone add classes. In fact, other than for purposes of importing data in bulk, no one has authority to simply add a class. Only proposals that have gone through a certain workflow and received approvals required by the university's procedure can be considered bona fide courses.

So the secretary for the University Curriculum Committee (UCC) might only be allowed to add the class if it's been proposed by a faculty member, approved by the department, reviewed by the college, and, finally, accepted by the UCC. That is, the secretary's authorization is dependent on the current state of the proposal and that state includes all the required steps.

This is essentially the idea of workflow as authorization. The authorization is dependent on being at the end of a long line of required steps. There could be alternative paths or exceptions along the way. At each step along the way, authorization to proceed is dependent on both the current state and the attributes of the person or system taking action.

In the same way that we'd use a policy engine to normalize the application of policy for access control, we can consider the use of a workflow engine for many of the same reasons:

  • A general-purpose workflow engine makes the required workflow declarative rather than algorithmic.
  • Workflow can be adjusted as procedures change without changing the code.
  • Declarative workflow specifications are more readable than workflow hidden in the code.
  • A workflow engine provides a standard way for developers to create workflow rather than requiring every team to make it up.
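As a rough sketch of what a declarative workflow might look like, the course-proposal steps described earlier can be written as a transition table. The state names, action names, and role names here are hypothetical, chosen only to mirror the faculty, department, college, and UCC steps in the example.

```python
# Hypothetical workflow for the course-proposal example. Each entry reads:
# (current state, action) -> (role allowed to act, resulting state).
WORKFLOW = {
    ("draft", "propose"): ("faculty", "proposed"),
    ("proposed", "approve"): ("department_chair", "department_approved"),
    ("proposed", "reject"): ("department_chair", "rejected"),
    ("department_approved", "review"): ("college", "college_reviewed"),
    ("college_reviewed", "accept"): ("ucc", "accepted"),
    ("accepted", "add_course"): ("ucc_secretary", "course_added"),
}


class NotAuthorized(Exception):
    """Raised when the state or the actor does not permit the action."""


def advance(state, action, role, workflow=WORKFLOW):
    """Return the next state, or raise if this step is not authorized."""
    rule = workflow.get((state, action))
    if rule is None:
        raise NotAuthorized(f"{action!r} is not allowed in state {state!r}")
    required_role, next_state = rule
    if role != required_role:
        raise NotAuthorized(f"{role!r} may not {action!r} in state {state!r}")
    return next_state


# Walk one proposal through every required step.
state = "draft"
for action, role in [
    ("propose", "faculty"),
    ("approve", "department_chair"),
    ("review", "college"),
    ("accept", "ucc"),
    ("add_course", "ucc_secretary"),
]:
    state = advance(state, action, role)
print(state)
```

The secretary's authority to add the course exists only in the "accepted" state. Changing the procedure means editing the table, not the code in advance().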

One of our principles for designing the University API at BYU is to keep workflow below the API since we can't rely on clients to enforce workflow requirements. What's more, developers writing the clients don't want that burden. As we contemplated how best to put the workflow into the API, we determined that HATEOAS links were the best option.

If you're not familiar with HATEOAS, it's an awkward acronym for "hypertext as the engine of application state." The idea is straightforward conceptually: your API returns links, in addition to data, that indicate the best ways to make progress from the current state. There can be more than one since there might be multiple paths from a given state. Webber et al.'s How to GET a Cup of Coffee is a pretty good introduction to the concept.

HATEOAS is similar to the way web pages work. Pages contain links that indicate the allowed or recommended next places to go. Users of the web browser determine from the context of the link what action to take. And thus they progress from page to page.

In the API, the data returned from the API contains links that are the allowed or recommended next actions and the client code uses semantic information in rel tags associated with each link to present the right choices to the user. The client code doesn't have to be responsible for determining the correct actions to present to the user. The API does that.

Consider the application of HATEOAS to the course management example from above. Suppose the state of a course is that it's just been proposed by a faculty member. The next step is that it needs approval by the department chair. GETting the course proposal via the course management API returns the data about the proposed course, as expected, regardless of who the GET is for. What's different are the HATEOAS links that are also returned:

  • For the faculty member, the links might allow for updating or deleting the proposal.
  • For the department chair, the links might be for approving or rejecting the course proposal.
  • For anyone else, the only link might be to return a collection of proposals.
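The three cases above can be sketched as a function that returns the same data to everyone but varies the links by who is asking. This is a minimal illustration in Python. The URL layout, rel names, and user fields are assumptions, as is the choice to give every caller the collection link.

```python
def proposal_links(proposal, user):
    """Build the links a given user may follow from this proposal's state."""
    base = f"/proposals/{proposal['id']}"
    # Everyone gets a link back to the collection of proposals.
    links = [{"rel": "collection", "href": "/proposals", "method": "GET"}]
    if proposal["state"] != "proposed":
        return links
    if user["id"] == proposal["proposed_by"]:
        # The proposing faculty member may still change or withdraw it.
        links.append({"rel": "update", "href": base, "method": "PUT"})
        links.append({"rel": "delete", "href": base, "method": "DELETE"})
    elif user["role"] == "department_chair":
        # The chair sees the next workflow steps instead.
        links.append({"rel": "approve", "href": f"{base}/approval", "method": "POST"})
        links.append({"rel": "reject", "href": f"{base}/rejection", "method": "POST"})
    return links


def get_proposal(proposal, user):
    """Same data for every caller; only the links differ."""
    return {"data": proposal, "links": proposal_links(proposal, user)}


proposal = {"id": 42, "title": "Intro to Distributed Systems",
            "state": "proposed", "proposed_by": "alice"}
alice = {"id": "alice", "role": "faculty"}
chair = {"id": "bob", "role": "department_chair"}
guest = {"id": "carol", "role": "student"}

for user in (alice, chair, guest):
    response = get_proposal(proposal, user)
    print(user["id"], [link["rel"] for link in response["links"]])
```

A client only has to render whatever rels it receives. It never needs to know the approval procedure itself.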

Seen this way, a workflow engine is a natural addition to an API management system in the same way a policy engine is. And HATEOAS becomes something that can be driven from the management tool rather than being hard coded in the underlying application. I'm interested in seeing how this plays out.

Social Things, Trustworthy Spaces, and the Internet of Things


Humans and other gregarious animals naturally and dynamically form groups. These groups have frequent changes in membership and establish trust requirements based on history and task. Similarly, the Internet of Things (IoT) will be built from devices that must be able to discover other interesting devices and services, form relationships with them, and build trust over time based on those interactions. One way to think about this problem is to envision things as social and imagine how sociality can help solve some of the hard problems of the IoT.

Previously I've written about a Facebook of Things and a Facebook for My Stuff that describe the idea of social products. This post expands that idea to take it beyond the commercial.

As I mentioned above, humans and other social animals have created a wide variety of social constructs that allow us to not only function, but thrive in environments where we encounter and interact with other independent agents, even when those agents are potentially harmful or even malicious. We form groups and, largely, we do it without some central planner putting it all together. Individuals in these groups learn to trust each other, or not, on the basis of social constructions that have evolved over time. Things do fail and security breaks down from time to time, but those are exceptions, not the rule. We're remarkably successful at protecting ourselves from harm and dealing with anomalous behavior from other group members or the environment, while getting things done.

There is no greater example of this than a city. Cities are social systems that grow and evolve. They are remarkably resilient. I've referenced Geoffrey West's remarkable TED talk on the surprising math of cities and corporations before. As West says, "you can drop an atom bomb on a city and it will survive."

The most remarkable thing about city planning is perhaps the fact that cities don't really need planning. Cities happen. They are not only dynamic, but spontaneous. The greatness of a city is that it isn't planned. Similarly, the greatness of IoT will be in spontaneous interactions that no one could have foreseen.

My contention is that we want device collections on the Internet of Things to be more like cities, where things are heterarchical and spontaneous, than corporations, where things are hierarchical and planned. Where we've built static networks of devices with a priori determined relationships in the past, we have to create systems that support dynamic group forming based on available resources and goals. Devices on the Internet of Things will often be part of temporary, even transient, groups. For example, a meeting room will need to be constantly aware of its occupants and their devices so it can properly interact with them. I'm calling these groups of social things "trustworthy spaces."

My Electric Car

As a small example of this, suppose I buy an electric car. The car needs to negotiate charging times with the air conditioner, home entertainment system, and so on. The charging time might change every day. There are several hard problems in that scenario, but the one I want to focus on is group forming. Several things need to happen:

  • The car must know that it belongs to me. Or, more generally, it has to know its place in the world.
  • The car must be able to discover that there's a group of things that also belong to me and care about power management.
  • Other things that belong to me must be able to dynamically evaluate the trustworthiness of the car.
  • Members of the group (including the car) must be able to adjust their interactions with each other on the basis of their individual calculations of trustworthiness.
  • The car may encounter other devices that misrepresent themselves and their intentions (whether due to fault or outright maliciousness).
  • Occasionally, unexpected, even unforeseen events will happen (e.g. a power outage). The car will have to adapt.

We could extend this situation to a group of devices that don't all belong to the same owner too. For example, I'm at my friend's house and want to charge the car.

The requirements outlined above imply several important principles:

  • Devices in the system interact as independent agents. They have a unique identity and are capable of maintaining state and running programs.
  • Devices have a verifiable provenance that includes significant events from their life-cycle, their relationships with other devices, and a history of their interactions (i.e. a transaction record).
  • Devices are able to independently calculate and use the reputation of other actors in the system.
  • Devices rely on protecting themselves from other devices rather than a system preventing bad things from happening.

I'm also struck that other factors, like allegiance, might be important, but I'm not sure how at the moment. Provenance and reputation might be general enough to take those things into account.

Trustworthy Spaces

A trustworthy space is an abstract extent within which a group of agents interact, not a physical room or even geographic area. It is trustworthy only to the extent an individual agent deems it so.

In a system of independent agents, trustworthiness is an emergent property of the relationships among a group of devices. Let's unpack that.

When I say "trustworthiness," that doesn't imply a relationship is trustworthy. The trustworthiness might be zero, meaning it's not trusted. When I say "emergent," I mean that this is a property that is derived from other attributes of the relationship.

Trustworthy spaces don't prevent bad things from happening, any more than we can keep every bad thing from happening in social interactions. I think it's important to distinguish safety from security. We are able to evaluate security in relatively static, controlled situations. But usually, when discussing interactions between independent agents, we focus on safety.

There are several properties of trustworthy spaces that are important to their correct functioning. They are decentralized, event-driven, robust, and antifragile.


By definition, a trustworthy space is decentralized because the agents are independent. They may be owned, built, and operated by different entities and their interactions cross those boundaries.


A trustworthy space is populated by independent agents. Their interactions with one another will be primarily event-driven. Event-based systems are more loosely coupled than other interaction methodologies. Events create a networked pattern of interaction with decentralized decision making. Because new players can enter the event system without others having to give permission, be reconfigured, or be reprogrammed, event-based systems grow organically.


Trustworthy spaces are robust. That is, they don't break under stress. Rather than trying to prevent failure, systems of independent agents have to accept failure and be resilient.

In designed systems we rely on techniques such as transactions to ensure that the system remains in a consistent state. Decentralized systems rely on retries, compensating actions, or just plain giving up when something doesn't work as it should. We have some experience with this in distributed systems that are eventually consistent, but that's just a start at what the IoT needs.
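The "retry, compensate, or give up" approach can be sketched in a few lines of Python. This is only an illustration under simple assumptions: steps run in order, every failure is worth retrying, and each step has an undo action. The charger and thermostat functions are hypothetical stand-ins for real devices.

```python
def run_steps(steps, max_attempts=3):
    """Run (action, compensate) pairs in order.

    Each action is retried up to max_attempts times. If one still fails,
    the compensating actions of the steps already completed are run in
    reverse order and the function gives up, returning False.
    """
    completed = []
    for action, compensate in steps:
        for _ in range(max_attempts):
            try:
                action()
            except Exception:  # broad on purpose: any failure triggers a retry
                continue
            completed.append(compensate)
            break
        else:
            # Retries exhausted: undo what was done, newest first.
            for undo in reversed(completed):
                undo()
            return False
    return True


log = []
attempts = {"charger": 0}


def reserve_charger():
    attempts["charger"] += 1
    if attempts["charger"] < 2:  # fails once, then succeeds on the retry
        raise ConnectionError("charger did not answer")
    log.append("charger reserved")


def release_charger():
    log.append("charger released")


def lower_thermostat():
    raise RuntimeError("thermostat offline")  # never succeeds


def restore_thermostat():
    log.append("thermostat restored")


ok = run_steps([(reserve_charger, release_charger),
                (lower_thermostat, restore_thermostat)])
print(ok, log)
```

No transaction spans the two devices. The system reaches a sensible state again only because each agent knows how to undo its own part.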

Inconsistencies will happen. Self-healing is the process of recognizing inconsistent states and taking action to remediate the problem. Internal monitoring by the system of anything that might be wrong and then taking corrective action has to be automatic.


More than robustness, antifragility is the property systems exhibit when they don't just cope with anomalies, but instead thrive in their presence. Organic systems exhibit antifragility; they get better when faced with random events.

IoT devices will operate in environments that are replete with anomalies. Most anomalies are not bad or even errors. They're simply unexpected. Antifragility takes robustness to the next level by not merely tolerating anomalous activity, but using it to adapt and improve.

I don't believe we know a lot about building systems that exhibit antifragility, but I believe that we'll need to develop these techniques for a world with trillions of connected things.

Trust Building

Trust building will be an important factor in trustworthy spaces. Each agent must learn what other agents to trust and to what level. These calculations will be constantly adjusted. Trust, reputation, and reciprocity (interaction) are linked in some very interesting ways. Consider the following diagram from a paper by Mui et al. entitled A Computational Model of Trust and Reputation:

The relationship between reputation, trust, reciprocity, and social benefit

We define reputation as the perception about an entity's intentions and norms that it creates through past actions. Trust is a subjective expectation an entity has about another's future behavior based on the history of their encounters. Reciprocity is a mutual exchange of deeds (such as favor or revenge). Social benefit or harm derives from this mutual exchange.

If you want to build a system where entities can trust one another, it must support the creation of reputations since reputation is the foundation of trust. Reputation is based on several factors:

  • Provenance: the history of the agent, including a "chain of custody" that says where it's been and what it's done in the past, along with attributes of the agent, verified and unverified.
  • Reciprocity: the history of the agent's interaction with other agents. A given agent knows about its interactions and the outcomes. To the extent they are visible, interactions between other agents can also be used.

Reputation is not static and it might not be a single value. Moreover, reputation is not a global value, but a local one. Every agent continually calculates and evaluates the reputation of every other agent. Transparency is necessary for the creation of reputation.
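To illustrate reputation as a local, continually updated value, here is a minimal sketch in Python. The scoring rule (an exponentially weighted average of good and bad outcomes), the neutral prior of 0.5, and the trust threshold are assumptions chosen for simplicity. This is not the model from the Mui et al. paper.

```python
class Agent:
    """An agent that keeps its own, private view of everyone else's reputation."""

    def __init__(self, name, decay=0.8, prior=0.5):
        self.name = name
        self.decay = decay      # weight given to history versus the newest outcome
        self.prior = prior      # reputation assumed for agents never met before
        self._reputation = {}   # other agent's name -> score in [0, 1]

    def record(self, other, good):
        """Fold the outcome of one interaction into our view of `other`."""
        old = self._reputation.get(other, self.prior)
        outcome = 1.0 if good else 0.0
        self._reputation[other] = self.decay * old + (1 - self.decay) * outcome

    def reputation_of(self, other):
        return self._reputation.get(other, self.prior)

    def trusts(self, other, threshold=0.6):
        """Trust is this agent's own judgment, derived from its own records."""
        return self.reputation_of(other) >= threshold


car = Agent("car")
thermostat = Agent("thermostat")

# The car has three good interactions with the charger; the thermostat has one bad one.
for _ in range(3):
    car.record("charger", good=True)
thermostat.record("charger", good=False)

print(round(car.reputation_of("charger"), 3), car.trusts("charger"))
print(round(thermostat.reputation_of("charger"), 3), thermostat.trusts("charger"))
```

The same charger ends up with two different reputations because each agent computes its own from its own history. There is no global score to consult.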

A Platform for Exploring Social Things

A few weeks ago I wrote about persistent compute objects, or picos. In the introduction to that piece, I wrote:

Persistent Compute Objects, or picos, are tools for modeling the Internet of Things. A pico represents an entity—something that has a unique identity and a long-lived existence. Picos can represent people, places, things, organizations, and even ideas.

The motivation for picos is to design infrastructure to support the Internet of Things that is decentralized, heterarchical, and interoperable. These three characteristics are essential to a workable solution and are sadly lacking in our current implementations.

Without these three characteristics, it's impossible to build an Internet of Things that respects people's privacy and independence.

Picos are a perfect platform for exploring social products. They come with all the necessary infrastructure built in. Their programmability makes them flexible enough and powerful enough to demonstrate how social products can interact through reputation to create trustworthy spaces.

Benefits of Social Things

Social things, interacting with each other in trustworthy spaces, offer significant advantages over static networks of devices:

  • Less configuration and set up time since things discover each other and set up mutual interactions on their own.
  • More freedom for people to buy devices from different manufacturers and have them work together.
  • Better overall protection from anomalies, perhaps even systems of devices that thrive in their presence.

Social things are a way of building a true Internet of Things instead of a CompuServe of Things.

My thoughts on this topic were influenced by a CyDentity workshop I attended last week put on by the Department of Homeland Security at Rutgers University. In particular, some of the terminology, such as "provenance" and "trustworthy spaces," were things I heard there that gelled with some of my thinking on reputation and social things.

Choosing a Car for its Infotainment System


Recently when I've rented cars I've increasingly asked for a Ford. Usually a Ford Fusion.

It's true that I like Fords, but that's not why I ask for them when renting. I'm more concerned about a consistent user experience in the car's infotainment system.

I have a 2010 F-150 that has been a great truck. I wrote about the truck and its use as a big iPhone accessory when I first got it. The truck is equipped with Microsoft Sync and I use it a lot.

I don't know if Sync is the best in-car infotainment system or not. First, I've not extensively tried others. Second, car companies haven't figured out that they're really software companies, so they don't regularly update their systems. I've reflashed the firmware in my truck a few times, but I never saw any significant new features.

Even so, when faced with a rental car, I'd rather get something that I know how to use. Sync is familiar, so I prefer to rent cars that have it. I get a consistent, known user experience that allows me to get more out of the vehicle.

What does this portend for the future? Will we become more committed to the car's infotainment system than we are to the brand itself? Ford is apparently ditching Sync for something else. Others use Apple's system. At CES this past January there were a bunch of them. I'm certain there's a big battle opening up here and we're not likely to see resolution anytime soon.

Car manufacturers don't necessarily get that they're being disrupted by the software in the console. And those that do aren't necessarily equipped to compete. Between the competition in self-driving cars, electric vehicles, and infotainment systems, car manufacturers are in a pinch.

API Management and Microservices


At Crazy Friday (OIT's summer developer workshop) we were talking about using the OAuth client-credential flow to manage access to internal APIs. An important part of enabling this is to use standard API management tools (like WSO2) to manage the OAuth credentials as well as access.

We got down a rabbit hole with people trying to figure out ways to optimize that. "Surely we don't need to check each time?!?," "Can we cache the authorization??," and so on. We talked about not trying to optimize too soon, but that misses the bigger point.

The big idea here isn't using OAuth for internal APIs instead of some ad hoc solution. The big idea is to use API management for everything. We have to ensure that the only way to access service APIs is through the managed API endpoint. API management is about more than just authentication and authorization (the topic of our discussion on Friday).

API management handles discovery, security, identity, orchestration, interface uniformity, versioning, traffic shaping, monitoring, and metering. Internal APIs, even those between microservices, need those just as badly as external APIs. No service should have anything exposed that isn't managed. Otherwise we'll never succeed.
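As a toy illustration of "the only way in is through the managed endpoint," the sketch below puts an authorization check and a metering counter in front of every service call. It is written in Python and is not how WSO2 or any real API manager works. The class, the scope names, and the issue_token() stand-in for an OAuth client-credential exchange are all hypothetical.

```python
import secrets
from collections import Counter


class ApiManager:
    """Single managed entry point: every call is authorized and metered here."""

    def __init__(self):
        self._services = {}     # path -> (handler, required scope)
        self._tokens = {}       # access token -> (client id, granted scopes)
        self.calls = Counter()  # (client id, path) -> number of calls

    def register(self, path, handler, scope):
        self._services[path] = (handler, scope)

    def issue_token(self, client_id, scopes):
        # Stand-in for a client-credential exchange: a real flow would
        # verify the client's secret before granting a token.
        token = secrets.token_hex(16)
        self._tokens[token] = (client_id, frozenset(scopes))
        return token

    def call(self, token, path, **params):
        if token not in self._tokens:
            raise PermissionError("unknown token")
        client_id, scopes = self._tokens[token]
        handler, required = self._services[path]
        if required not in scopes:
            raise PermissionError(f"{client_id} lacks scope {required!r}")
        self.calls[(client_id, path)] += 1  # metering happens on every call
        return handler(**params)


manager = ApiManager()
manager.register("/courses", lambda dept: [f"{dept} 101"], scope="courses:read")

# An internal microservice gets credentials exactly as an external client would.
token = manager.issue_token("enrollment-service", ["courses:read"])
print(manager.call(token, "/courses", dept="CS"))
print(manager.calls[("enrollment-service", "/courses")])
```

Because the internal caller goes through the same call() path as anyone else, it is checked and counted in the same way. That uniformity is the point, whatever the eventual performance tuning looks like.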

I can hear the hue and cry: "This is a performance nightmare!!" Of course, many of us said the same thing about object-relational mapping, transaction monitors, and dozens of other tools that we accept as best practices today. We were wrong then and we'd be wrong now to throw out all the advantages of API management for what are, at present, hypothetical performance problems. We'll solve the performance problems when they happen, not before.

But what about building frameworks to do the things we need without the overhead of the management platform? Two big problems: First, we want to use multiple languages and systems for microservices. This isn't a monolith and each team is free to choose their own. We can't build the framework for every language that comes along and we don't want to lose the flexibility of teams using the right tools for the job.

Second, and more importantly, if we use a standard API management tool any performance problems we experience will also be seen by other customers. There will be dozens, even hundreds of smart people trying to solve the problem. Using a standard tool gives us the advantage of having all the smart people who don't work for us worried about it too.

If there's anything we should have learned from the last 15 years, it's that standard tooling gives us tremendous leverage to do things we'd never be able to do otherwise. Consequently, regardless of any potential performance problems, we need to use API management between microservices.

Errors and Error Handling in KRL


Errors are events that say "something bad happened." Conveniently, KRL is event-driven. Consequently, using and handling errors in KRL feels natural. Moreover, it is entirely consistent with the rest of the language rather than being something tacked on. Even so, error handling features are not used often enough. This post explores how error events work in KRL and describes how I used them in building Fuse.

Built-In Error Processing

KRL programs run inside a pico and are executed by KRE, the pico engine. KRE automatically raises system:error events when certain problems happen during execution of a ruleset. These events are raised differently than normal explicit events. Rather than being raised on the pico's event bus by default, they are only raised within the current ruleset.

Because developers often want to process all errors from several rulesets in a consistent way, KRL provides a way of automatically routing error events from one ruleset to another. In the meta section of a ruleset, developers can declare another ruleset that is the designated error handler using the errors to pragma.

Developers can also raise error events explicitly using an error statement in the rule postlude.

Handling Errors in Practice

I used KRL's built-in error handling in building Fuse, a connected-car product. The result was a consistent notification of errors and easier debugging of run-time problems.

Responding to Errors

I chose to create a single ruleset for handling errors, fuse_error.krl, and refer all errors to it. This ruleset has a single rule, handle_error, that selects on a system:error event, formats the error, and emails it to me using the SendGrid module.

Meanwhile all of the other rulesets in Fuse use the errors to pragma in their meta block to tell KRE to route all error events to fuse_error.krl like so:

meta {
  errors to v1_fuse_errors
}

This ensures that all errors in the Fuse rulesets are handled consistently by the same ruleset. A few points about generalizing this:

  • There's no reason to have just one rule. You could have multiple rules for handling errors and use the select statement to determine which rules execute based on attributes on the error like the level or genus.
  • There's no requirement that the error be emailed. That was convenient for me, but the rule could send them to online error management systems, log them, whatever.

Raising Errors

As mentioned above, the system automatically raises errors for certain things like type mismatches, undefined functions, invalid operators, and so on. These are great for alerting you that something is wrong, although they don't always contain enough information to fix the problem. More on that below.

I also use explicit error statements in the rule postlude to pass on erroneous conditions in the code. For example, Fuse uses the Carvoyant API. Consequently, the Fuse rulesets make numerous HTTP calls that sometimes fail. KRL's HTTP actions can automatically raise events upon completion. An http:post() action, for example, will raise an http:post event with attributes that include the response code (as status_code) when the server responds.

Completion events are useful for processing the response on success and handling the error when there is a problem. For example, the following rule handles HTTP responses when the status code is 4XX or 5XX:

rule carvoyant_http_fail {
  select when http post status_code re#([45]\d\d)# setting (status)
           or http put status_code re#([45]\d\d)# setting (status)
           or http delete status_code re#([45]\d\d)# setting (status)
  pre {
    ... // all the processing code
  }
  event:send({"eci": owner}, "fuse", "vehicle_error") with
    attrs = {
          "error_type": returned{"label"},
          "reason": reason,
          "error_code": errorCode,
          "detail": detail,
          "field_errors": error_msg{["error","fieldErrors"]},
          "set_error": true
         };
  always {
    error warn msg
  }
}

I've skipped the processing that the prelude does to avoid too much detail. Note three things:

  1. The select statement handles failures for several HTTP methods as a group. If there were reasons to treat them differently, you could have different rules do different things depending on the HTTP method that failed, the status code, or even the task being performed.
  2. The action sends the fuse:vehicle_error event to another pico (in this case the fleet) so the fleet is informed.
  3. The postlude raises a system:error event that will be picked up and handled by the handle_error rule we saw in the last section.

This rule has proven very useful in debugging connection issues that tend to be intermittent or specific to a single user.

Using Explicit Errors to Debug

I ran into a type mismatch error for some users when a fuse:new_trip event was raised. I would receive, automatically, an error message that said "[hash_ref] Variable 'raw_trip_info' is not a hash" when the system tried to pull a new trip from the Carvoyant API. The error message doesn't have enough detail to track down what was really wrong. The message could be a little better (tell me what type it is, rather than just saying it is not a hash), but even that wouldn't have helped much.

My first thought was to dig into the system and see if I could enrich the error event with more data about what was happening. You tend to do that when you have the source code for the system. But after thinking about it for a few days, I realized that just wasn't possible to do in a generalized way. There are too many possibilities.

The answer was to raise an explicit error in the postlude to gather the right data. I added this statement to the rule that was generating the error:

error warn "Bad trip pull (tripId: #{tid}): " + raw_trip_info.encode() 
   if raw_trip_info.typeof() neq "hash";

This information was enlightening because I found out that rather than being an HTTP failure disguised as success, the problem was that the trip data was being pulled without a trip ID and as a consequence the API was giving me a collection rather than the item—as it should.

This pointed back to the rule that raises the fuse:new_trip event. That rule, ignition_status_changed, fires whenever the vehicle is turned on or off. I figured that the trip ID wasn't getting lost in transmission, but rather never getting sent in the first place. Adding this statement to the postlude of that rule confirmed my suspicions:

error warn "No trip ID " + trip_data.encode()  if not tid;

When this error occurred, I got an email with this trip data:

{
  "accountId": "4",
  "eventTimestamp": "20150617T130419+0000",
  "ignitionStatus": "OFF",
  "notificationPeriod": "STATECHANGE",
  "minimumTime": null,
  "subscriptionId": "4015",
  "vehicleId": "13",
  "dataSetId": "25857188",
  "timestamp": "20150617T135901+0000",
  "id": "3530587",
  "creatorClientId": "",
  "httpStatusCode": null
}

Note that there's no tripId, so the follow-on code never saw one either, causing the problem. This wasn't happening universally, just occasionally for a few users.

I was able to add a guard to ignition_status_changed so that it didn't raise a fuse:new_trip event if there were no trip ID. Problem solved.


One of the primary tools developers use for debugging is logging. In KRL, the Pico Logger and built-in language primitives like the log statement and the klog() operator make that easy to do and fairly fruitful if you know what you're looking for.

Error handling is primarily about being alerted to problems you may not know to look for. In the case I discuss above, built-in errors alerted me to a problem I didn't know about. And then I was able to use explicit errors to see intermittent problems and capture the relevant data to easily determine the real problem and solve it. Without the error primitives in KRL, I'd have been left to guess, make some changes, and see what happens.

Being able to raise explicit errors allows the developer, who knows the context, to gather the right data and send it off when appropriate. KRL gave me all the tools I needed to do this surgically and consistently.

Picos: Persistent Compute Objects

Persistent Compute Objects, or picos, are tools for modeling the Internet of Things. A pico represents an entity—something that has a unique identity and a long-lived existence. Picos can represent people, places, things, organizations, and even ideas.

The motivation for picos is to design infrastructure to support the Internet of Things that is decentralized, heterarchical, and interoperable. These three characteristics are essential to a workable solution and are sadly lacking in our current implementations.

Without these three characteristics, it's impossible to build an Internet of Things that respects people's privacy and independence.


Picos are:

  • persistent: They exist from when they are created until they are explicitly deleted. Picos retain state based on past operations. 
  • unique: They have an identity that is immutable. While attributes of the pico, its state, may change, its identity does not. 
  • online: They are available on the Internet and respond to events and queries. 
  • concurrent: They operate independently of one another and process events and queries asynchronously. 
  • event-driven: They respond to events by changing state and sending new events. 
  • rule-based: Their behavior is expressed as rules that pattern-match against incoming events. Put another way, rules listen for events on the pico's internal event bus. 

Collections of picos are used to create models of interacting entities in the Internet of Things. Picos communicate by sending events to or making requests of each other in an Actor-like manner. These communications are point-to-point, and every pico can have a unique address, shared by no one else, for each pico with which it communicates. Collections of picos were used to significant advantage in architecting the Fuse connected-car system.
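
To make the addressing idea concrete, here is a minimal Python sketch, not the KRE implementation. The names `Pico`, `open_channel`, and `receive` are hypothetical, and a random UUID stands in for whatever channel identifier a real engine would issue. It shows one pico handing out a distinct address to each peer, so an incoming event can be tied to the relationship it arrived on.

```python
import uuid


class Pico:
    """Toy pico: an immutable identity plus one private channel per peer."""

    def __init__(self, name):
        self.name = name
        self._id = uuid.uuid4().hex      # identity never changes after creation
        self._channels = {}              # channel address -> peer name
        self.inbox = []                  # (peer name, event) pairs received so far

    @property
    def id(self):
        return self._id

    def open_channel(self, peer_name):
        """Issue a new address that only this peer is told about."""
        address = uuid.uuid4().hex
        self._channels[address] = peer_name
        return address

    def receive(self, address, event):
        """Accept an event only if it arrives on a known channel."""
        peer = self._channels.get(address)
        if peer is None:
            return False
        self.inbox.append((peer, event))
        return True


car = Pico("car")
owner_addr = car.open_channel("owner")
shop_addr = car.open_channel("repair shop")
car.receive(owner_addr, {"type": "ping"})
```

Because each relationship has its own address, the car pico in this sketch could drop the repair shop's channel without affecting the owner's.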

Pico Building Blocks

Picos are part of a system that supports programming them. While you can imagine different implementations that support the characteristics of picos enumerated in the previous section, this post will describe the implementation and surrounding ecosystem that I and others have been building for the past seven years.

The various pieces of the pico ecosystem and their relationships are shown in the following diagram.

pico system relationships

For people who've read this blog, many of the titles in these boxes will be familiar, but I suspect that the exact nature of how they relate to each other has been a mystery in many cases. Here are some brief descriptions of the primary components and some explanation of the relationships.

Event-Query API

The event-query API is a name I gave the style of interaction that picos support. Picos don't implement RESTful APIs. They aren't meant to. As I explain in Pico APIs: Events and Queries, picos are primarily event-driven but also support a query API for getting values from the pico. Each pico has an internal event bus. So while picos interact with each other and the world in a point-to-point Actor model, internally they distribute events with a publish-and-subscribe mechanism.
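
The split between the external point-to-point interface and the internal publish-subscribe bus can be sketched in Python. This is an illustration under assumptions, not how KRE works internally: `EventBus`, `subscribe`, `raise_event`, and `query` are made-up names, and events are reduced to a domain, a type, and a dictionary of attributes.

```python
from collections import defaultdict


class EventBus:
    """Internal publish-subscribe bus: handlers subscribe by (domain, type)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, domain, etype, handler):
        self._handlers[(domain, etype)].append(handler)

    def publish(self, domain, etype, attrs):
        for handler in self._handlers[(domain, etype)]:
            handler(attrs)


class Pico:
    """Externally: raise_event() and query(). Internally: an event bus."""

    def __init__(self):
        self.bus = EventBus()
        self.state = {}
        self.queries = {}

    def raise_event(self, domain, etype, attrs=None):
        # An outside caller addresses the pico directly; delivery to the
        # interested handlers happens on the internal bus.
        self.bus.publish(domain, etype, attrs or {})

    def query(self, name, **args):
        # Queries read values and are not expected to change state.
        return self.queries[name](**args)


pico = Pico()
pico.bus.subscribe("fuel", "low", lambda a: pico.state.update(last_level=a["level"]))
pico.bus.subscribe("fuel", "low", lambda a: pico.state.update(alerted=True))
pico.queries["fuel_status"] = lambda: dict(pico.state)

pico.raise_event("fuel", "low", {"level": 0.1})
print(pico.query("fuel_status"))
```

A single external event reaches both subscribers, while the caller only ever sees the two entry points, `raise_event` and `query`.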


Picos use the event-query API to communicate with each other. So do applications using a programming style called the pico application architecture, or PAA (formerly the personal cloud application architecture). The PAA is a variant on an architecture that is being promoted as unhosted web apps and remotestorage. PAA goes beyond those models by offering a richer API that includes not just storage, but other services that developers might need. In fact, the set of services is infinitely variable in each pico.


In the same way that operating systems provide more complex, more flexible services for developers than the bare metal of the machine, CloudOS provides pico programmers with important services that make picos easier to use and manage. For example, CloudOS provides services for creating new picos and creating communication channels between picos.

Note: I don't really like the name CloudOS, but it's all I've got for now. If you have ideas, I'm open to them so long as they are not "pico os" or "POS."


The basic module for programming picos is a ruleset. A ruleset is a collection of rules that respond to events. But a ruleset is more than that. Functions in the ruleset make up the queries that are available in the event-query API. Thus, the specific event-query API that a given pico presents to the world correlates exactly to the rulesets that are installed in the pico.

The following diagram shows the rules and functions in a pico presenting an event-query API to an application.

event-query model

CloudOS provides functionality for installing rulesets in a pico, and they can change over time just as the programs installed on a computer change over time. As the installed rulesets change, so does the pico's API.
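
Here is one way to picture how the installed rulesets determine a pico's API, sketched in Python. `Ruleset`, `install`, `uninstall`, and `api` are hypothetical names chosen for the illustration, and the ruleset IDs are invented; the real CloudOS services are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Ruleset:
    rid: str
    events: set = field(default_factory=set)       # (domain, type) pairs its rules select on
    functions: dict = field(default_factory=dict)  # query name -> callable


class Pico:
    def __init__(self):
        self.installed = {}

    def install(self, ruleset):
        self.installed[ruleset.rid] = ruleset

    def uninstall(self, rid):
        self.installed.pop(rid, None)

    def api(self):
        """The event-query API is the union of what the installed rulesets offer."""
        events, queries = set(), set()
        for rs in self.installed.values():
            events |= rs.events
            queries |= set(rs.functions)
        return {"events": sorted(events), "queries": sorted(queries)}


trips = Ruleset("example_trips", {("trip", "ended")}, {"last_trip": lambda: None})
fuel = Ruleset("example_fuel", {("fuel", "low")}, {"fuel_level": lambda: None})

pico = Pico()
pico.install(trips)
pico.install(fuel)
before = pico.api()
pico.uninstall("example_fuel")
after = pico.api()
```

In this sketch, removing the fuel ruleset removes both the event it listened for and the query it offered, which is the sense in which the API tracks the installed rulesets.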


KRL is the language in which rulesets are programmed. Picos run KRL using the event evaluation cycle. Rules in KRL are "event-condition-action" rules because they tie together an event expression, a condition, and an action. Event expressions are how rules subscribe to specific events on the pico's event bus. KRL supports complex, declarative event expressions. KRL also supports persistent variables, which is how developers access the pico's state. KRL developers do not need a database to store attributes for the pico because of persistent variables.
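
The event-condition-action shape can be sketched in Python rather than KRL. Nothing below is KRL syntax: `Rule`, `run`, and the `ent` dictionary (standing in for persistent variables) are assumptions made for the illustration, and the event expression is reduced to a single domain and type match instead of the richer expressions KRL supports.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    select: tuple                            # event expression: (domain, type) to match
    condition: Callable[[dict, dict], bool]  # guard evaluated against state and attributes
    action: Callable[[dict, dict], None]     # effect applied when the guard holds


def run(rules, ent, domain, etype, attrs):
    """Evaluate one event against every rule; return how many rules fired."""
    fired = 0
    for rule in rules:
        if rule.select == (domain, etype) and rule.condition(ent, attrs):
            rule.action(ent, attrs)
            fired += 1
    return fired


def count_alert(ent, attrs):
    ent["low_fuel_count"] = ent.get("low_fuel_count", 0) + 1


rules = [
    Rule(
        select=("fuel", "reading"),
        condition=lambda ent, attrs: attrs["level"] < 0.15,
        action=count_alert,
    )
]

ent = {}  # survives between events, the way persistent variables would
run(rules, ent, "fuel", "reading", {"level": 0.40})
run(rules, ent, "fuel", "reading", {"level": 0.10})
run(rules, ent, "fuel", "reading", {"level": 0.05})
```

The rule selects on every fuel reading, but the condition lets the action run only for the two low readings, and the count accumulates in `ent` across events.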


KRE is a container for picos. A given instance of KRE can host any number of picos. KRE is the engine that makes picos work. KRE is an open source project, hosted on GitHub.

KRL rulesets are hosted online. Developers register the URL with KRE to create a ruleset ID or RID. The RID is what is installed in the pico. When the pico runs, it gets the ruleset source, parses it, optimizes it, and executes it.

The diagram below shows an important property of pico hosting. Picos can have communication relationships with each other even though they are hosted on different instances of KRE. The KRE instances need have no specific relationship with each other for picos to interact.

hosting and pico space

This hosting model is important because it helps ensure that picos can run everywhere, not only in one organization's infrastructure.


Picos present a powerful model for how a decentralized, heterarchical, interoperable Internet of Things can be built. Picos are built on open-source software and support an unbiased hosting model for deployment. They have been used to build and deploy several production systems, including the Fuse connected-car system. They provide the means for giving people direct, unintermediated control of their personal data and the devices that are generating it.

I invite your questions and participation.

University API and Domains Workshop


BYU is hosting a face-to-face meeting for university people interested in APIs and related topics (see below) on June 3 and 4 in Salt Lake City. Register here.

We've been working on a University API at BYU for almost a year now. The idea is to put a consistent API on top of the 950-some-odd services that BYU has in its registry. The new API uses resources that will be understandable to anyone familiar with how a university works.

As we got into this effort, we found other universities had similar initiatives. Troy Martin started some discussions with various people involved in these efforts. There's a lot of excitement about APIs right now at universities big and small. That's why we thought a workshop would be helpful and fun.

The University API & Domains (UAD) workshop covers topics on developing APIs, implementing DevOps practices, deploying Domain of One’s Own projects, improving the use of digital identity technologies, and framing digital fluency on university campuses. This workshop is focused on addressing current issues and best practices experienced in building out conceptual models and example real-life use cases. Attendees include IT architects, educational technologists, faculty, and software engineers from many universities.

Kin Lane, the API Evangelist, will be with us to get things kicked off on the first morning. After that, the agenda will be up to the attendees because UAD is an unconference. It has no assigned speakers or panels, so it's about getting stuff done. We will have a trained open space facilitator at the workshop to run the show and make sure we are properly organized.

Because UAD is an unconference, you’re invited to speak. If you have an idea for a session now or even get one in the middle of the conference, you’re welcome to propose and run a session. At the beginning of each day we will meet in opening circle and allow anyone who wants to run a session that day to make a proposal and pick a time. There is no voting or picking other than with your feet when you choose to go.

Whether you're working at a university or just interested in APIs and want to get together with a bunch of smart folk who are solving big, hairy API problems, you'll enjoy being at this workshop and we'd love to have you: Register here.

New in Fuse: Notifications


I've recently released changes to the Fuse system that support alert notifications via email and text.

This update is in preparation for maintenance features that will remind you of maintenance that needs to be done and use these alerts to help you schedule maintenance items.

Even now, however, this update will be useful since you will receive emails or SMS messages when your car issues a diagnostic trouble code (DTC), low fuel alert, or low battery alert. I found I had a battery going bad recently because of several low battery alerts from my truck.

In addition, I've introduced new events for the device being connected or disconnected that will also be processed as alerts. Several people have reported their device stopped working when in fact what had happened was the device became unplugged. Now you will get an email when your device gets unplugged.

Also, I've updated the system so that vehicle data will be processed approximately once per minute while the vehicle is in motion. Previously Fuse only updated on "ignition on" and "ignition off". This means that when one of your vehicles is in motion and you check either the app or the Fuse management console (FMC), you'll see the approximate current position as last reported by the device.

These changes don't get installed in your account automatically. You can activate them by going to the Fuse management console (FMC), opening your profile, and saving it.

If your profile has a phone number, you will receive Fuse alerts via text. Otherwise they will come via email. More flexibility in this area is planned for future releases.
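
The routing rule described here is simple enough to state as a few lines of Python. This is a sketch of the described behavior, not the Fuse ruleset; the `profile` keys and the `choose_channel` name are assumptions.

```python
def choose_channel(profile):
    """Pick how an alert is delivered: text if a phone number is set, else email."""
    phone = (profile.get("phone") or "").strip()
    if phone:
        return ("sms", phone)
    return ("email", profile.get("email"))


print(choose_channel({"phone": "801-555-0100", "email": "me@example.com"}))
print(choose_channel({"phone": "", "email": "me@example.com"}))
```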

This update also reflects several important advances for picos and CloudOS that underlie Fuse. The new notification ruleset that is being installed is the progenitor of a new notification system for CloudOS. The updates are happening via a rudimentary "pico schema and initialization" system that I've developed and that will be incorporated into CloudOS this summer as part of making it easier for developers to work with collections of picos.

What's New With KRL


In The End of Kynetx and a New Beginning, I described the shutdown of Kynetx and said that the code supporting KRL has been assigned to a thing I created called Pico Labs. Here's a little more color.

All of the code that Kynetx developed, including the KRL Rules Engine and the KRL language, is open source (and has been for several years). My intention is to continue exploring how KRL and the computational objects that KRL runs in (called picos) can be used to build decentralized solutions to problems in the Internet of Things and related spaces.

I have some BYU students helping me work on all this. This past semester they built and released a KRL developer tools application. This application is almost complete in terms of features and will provide a firm foundation for future KRL work.

Our plan is to rewrite the CloudOS code and the JavaScript SDK that provides access to it over the summer. The current version of CloudOS is an accretion of stuff from various eras, has lots of inconsistencies, and is missing some significant features. I used CloudOS extensively as part of writing the Fuse API which is based on KRL and picos. I found lots of places we can improve it. I'm anxious to clean it up.

As a motivating example for the CloudOS rewrite, we'll redo some of the SquareTag functionality in the PCAA architectural style. SquareTag makes extensive use of picos and will provide a great application of CloudOS for testing and demonstration.

I also continue to work (slowly) on a Docker instance of KRE so that others can easily run KRE on their own servers.

I'm hopeful that these developments will make picos, KRL, and CloudOS easier to use, and that between Fuse, Guard Tour, and SquareTag we'll have some interesting examples that demonstrate why all of this is relevant to building the Internet of Things.

Why is Blockchain Important


The world is full of directories, registries, and ledgers—mappings from keys to values. We have traditionally relied on some central authority (whoever owns the ledger) to ensure its consistency and availability. Blockchain is a global-scale, practical ledger system that demonstrates consistency and availability without a central authority or owner. This is why blockchain matters.

This is not to say that blockchain is perfect and solves every problem. Specifically, while the blockchain is easy to query, it is computationally expensive to update. That is the price for consistency (and is likely not something that can be overcome). But there is also a real cost for creating centrally controlled ledgers. DNS, banks, and so on aren't free. Cost limits applicability, but doesn't change the fact that a new thing exists in the world, a distributed ledger that works at global scale. That new thing will disrupt many existing institutions and processes that have traditionally had no choice but to use a central ledger.
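
The asymmetry between cheap queries and expensive updates can be illustrated with a toy hash-linked ledger in Python. This is a teaching sketch, not Bitcoin's actual block format or consensus protocol: a single process stands in for the whole network, and the `DIFFICULTY` of four leading zero hex digits is an arbitrary choice that keeps the example fast.

```python
import hashlib
import json

DIFFICULTY = 4  # leading zero hex digits required; arbitrary for the demo


def block_hash(prev_hash, key, value, nonce):
    payload = json.dumps([prev_hash, key, value, nonce]).encode()
    return hashlib.sha256(payload).hexdigest()


def append(chain, key, value):
    """Updating is costly: search for a nonce whose hash meets the target."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    nonce = 0
    while True:
        h = block_hash(prev_hash, key, value, nonce)
        if h.startswith("0" * DIFFICULTY):
            break
        nonce += 1
    chain.append({"prev": prev_hash, "key": key, "value": value,
                  "nonce": nonce, "hash": h})


def lookup(chain, key):
    """Querying is cheap: scan for the latest value recorded for the key."""
    for block in reversed(chain):
        if block["key"] == key:
            return block["value"]
    return None


def verify(chain):
    """Anyone can recheck every link without trusting a central owner."""
    prev_hash = "0" * 64
    for b in chain:
        expected = block_hash(b["prev"], b["key"], b["value"], b["nonce"])
        if b["prev"] != prev_hash or b["hash"] != expected:
            return False
        if not expected.startswith("0" * DIFFICULTY):
            return False
        prev_hash = b["hash"]
    return True


ledger = []
append(ledger, "example.com", "192.0.2.1")
append(ledger, "example.org", "192.0.2.2")
```

Each `append` has to try many nonces before one works, while `lookup` and `verify` compute at most one hash per block. Changing any recorded value after the fact makes `verify` fail, which is what lets readers trust the ledger without trusting whoever handed it to them.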