Persistence, Programming, and Picos

Phil Windley // Mon Feb 8 14:49:00 2021 // identity persistence picos programming

Summary

Picos show that image-based development can be done in a manner consistent with the best practices we use today without losing the important benefits it brings.

Jon Udell introduced me to a fascinating talk by the always interesting r0ml. In it, r0ml argues that Postgres as a programming environment feels like a Smalltalk image (at least that's the part that's germane to this post). Jon has been working this way in Postgres for a while. He says:

For over a year, I’ve been using Postgres as a development framework. In addition to the core Postgres server that stores all the Hypothesis user, group, and annotation data, there’s now also a separate Postgres server that provides an interpretive layer on top of the raw data. It synthesizes and caches product- and business-relevant views, using a combination of PL/pgSQL and PL/Python. Data and business logic share a common environment. Although I didn’t make the connection until I watched r0ml’s talk, this setup hearkens back to the 1980s when Smalltalk (and Lisp, and APL) were programming environments with built-in persistence.
From The Image of Postgres
Referenced 2021-02-05T16:44:56-0700

Here's the point in r0ml's talk where he describes this idea:

As I listened to the talk, I was a little bit nostalgic for my time using Lisp and Smalltalk back in the day, but I was also excited because I realized that the model Jon and r0ml were talking about is very much alive in how one goes about building a pico system.

Picos and Persistence

Picos are persistent compute objects. Persistence is a core feature of how picos work. Picos exhibit persistence in three ways. Picos have¹:

Persistent identity—Picos exist, with a single identity, continuously from the moment of their creation until they are destroyed.
Persistent state—Picos have state that programs running in the pico can see and alter.
Persistent availability—Picos are always on and ready to process queries and events.

Together, these properties give pico programming a different feel than what many developers are used to. I often tell my students that programmers write static documents (programs, configuration files, SQL queries, etc.) that create dynamic structures—the processes that those static artifacts create when they're run. Part of being a good programmer is being able to envision those dynamic structures as you program. They come alive in your head as you imagine the program running.

With picos, you don't have to imagine the structure. You can see it. Figure 1 shows the current state of the picos in a test I created for a collection of temperature sensors.

Figure 1: Network of Picos for Temperatures Sensors (click to enlarge)

In this diagram, the black lines show the parent-child hierarchy and the dotted pink lines show the peer-to-peer connections between picos (called "subscriptions" in current pico parlance). Parent-child hierarchies are primarily used to manage the picos themselves whereas the heterarchical connections between picos is used for programmatic communication and represent the relationships between picos. As new picos are created or existing picos are deleted, the diagram changes to show the dynamic computing structure that exists at any given time.

Clicking on one of the boxes representing a pico opens up a developer interface that enables interaction with the pico according to the rulesets that have been installed. Figure 2 shows the Testing tab for the develop interface of the io.picolabs.wovyn.router ruleset in the pico named sensor_line after the lastTemperature query has been made. Because this is a live view into the running system, the interface can be used to query the state and raise events in the pico.

Figure 2: Interacting with a Pico (click to enlarge)

A pico's state is updated by rules running in the pico in response to events that the pico sees. Pico state is made available to rules as persistent variables in KRL, the ruleset programming language. When a rule sets a persistent variable, the state is persisted after the rule has finished execution and is available to other rules that execute later². The Testing tab allows developers to raise events and then see how that impacts the persistent state of the pico.

Programming Picos

As I said, when I saw r0ml's talk, I was immediately struck by how much programming picos felt like using the Smalltalk or Lisp image. In some ways, it's like working with Docker images in a Fargate-like environment since it's serverless (from the programmer's perspective). But there's far less to configure and set up. Or maybe, more accurately, the setup is linguistically integrated with the application itself and feels less onerous and disconnected.

Building a system of picos to solve some particular problem isn't exactly like using Smalltalk. In particular, in a nod to modern development methodologies, the rulesets are installed from URLs and thus can be developed in the IDE the developer chooses and versioned in git or some other versioning system. Rulesets can be installed and managed programmatically so that the system can be programmed to manage its own configuration. To that point, all of the interactions in developer interface are communicated to the pico via an API installed in the picos. Consequently, everything the developer interface does can be done programmatically as well.

Figure 3 shows the programming workflow that we use to build production pico systems.

The developer may go through multiple iterations of the Develop, Build, Deploy, Test phases before releasing the code for production use. What is not captured in this diagram is the interactive feel that the pico engine provides for the testing phase. While automated tests can test the unit and system functionality of the rules running in the pico, the developer interface provides a visual tool for envisioning the interaction of the picos that are animated by dynamic interactions. Being able to query the state of the picos and see their reaction to specific events in various configurations is very helpful in debugging problems.

A pico engine can use multiple images (one at a time). And an image can be zipped up and shared with another developer or checked into git. By default the pico engine stores the state of the engine, including the installed rulesets, in the ~/.pico-engine/ directory. This is the image. Developers can change this to a different directory by setting the PICO_ENGINE_HOME environment variable. By changing the PICO_ENGINE_HOME environment variable, you can keep different development environments or projects separate from each other, and easily go back to the place you left off in a particular pico application.

For example, you could have a different pico engine image for a game project and an IoT project and start up the pico engine in either environment like so:

# work on my game project
PICO_ENGINE_HOME=~/.dnd_game_image pico-engine

# work on IoT project
PICO_ENGINE_HOME=~/.iot_image pico-engine

Images and Modern Development

At first, the idea of using an image or the running system and interacting with it to develop an application may see odd or out of step with modern development practices. After all, developers have had the idea of layered architectures and separation of concerns hammered into them. And image-based development in picos seems to fly in the face of those conventions. But it's really not all that different.

First, large pico applications are not generally built up by hand and then pushed into production. Rather, the developers in a pico-based programming project create a system that comes into being programmatically. So, the production image is separate from the developer's work image, as one would like. Another way to think about this, if you're familiar with systems like Smalltalk and Lisp is that programmers don't develop systems using a REPL (read-eval-print loop). Rather they write code, install it, and raise events to cause the system to talk action.

Second, the integration of persistence into the application isn't all that unusual when one considers the recent move to microservices, with local persistence stores. I built a production connected-car service called Fuse using picos some years ago. Fuse had a microservice architecture even though it was built with picos and programmed with rules.

Third, programming in image-based systems requires persistence maintenance and migration work, just like any other architecture does. For example, a service for healing API subscriptions in Fuse was also useful when new features, requiring new APIs, were introduced since the healing worked as well for new, as it did existing, API subscriptions. These kinds of rules allowed the production state to migrate incrementally as bugs were fixed and features added.

Image-based programming in picos can be done with all the same care and concern for persistence management and loose coupling as in any other architecture. The difference is that developers and system operators (these days often one and the same) in a pico-based development activity are saved the effort of architecting, configuring, and operating the persistence layer as a separate system. Linguistically incorporating persistence in the rules provides for more flexible use of persistence with less management overhead.

Stored procedures will not likely soon lose their stigma. Smalltalk images, as they were used in the 1980's, are unlikely to find a home in modern software development practices. Nevertheless, picos show that image-based development can be done in a manner consistent with the best practices we use today without losing the important benefits it brings.

Future Work

There are some improvements that should be made to the pico-engine to make image-based development better.

Moving picos between engines is necessary to support scaling of pico-based system. It is still too hard to migrate picos from one engine to another. And when you do, the parent-child hierarchy is not maintained across engines. This is a particular problem with systems of picos that have varied ownership.
Managing images using environment variables is clunky. The engine could have better support for naming, creating, switching, and deleting images to support multiple project.
Bruce Conrad has created a command-line debugging tool that allows declarations (which don't affect state) to be evaluated in the context of a particular pico. This needs functionality could be better integrated into the developer interface.

If you're intrigued and want to get started with picos, there's a Quickstart along with a series of lessons. If you want support, contact me and we'll get you added to the Picolabs Slack.

The pico engine is an open source project licensed under a liberal MIT license. You can see current issues for the pico engine here. Details about contributing are in the repository's README.

Notes

These properties are dependent on the underlying pico engine and the persistence of picos is subject to availability and correct operation of the underlying infrastructure.
Persistent variables are lexically scoped to a specific ruleset to create a closure over the variables. But this state can be accessed programmatically by other rulesets installed in the same pico by using the KRL module facility.

Please leave comments using the Hypothes.is sidebar.

Last modified: Tue Feb 9 10:16:11 2021.