
September 29, 2010

KRL Patterns: Building Event Intermediaries


The Silver Jubilee Bridge

Recently we somewhat quietly added a BIG new feature to KRL: explicit events. Using an explicit event, one rule can raise an event for another rule. Explicit events are raised in the rule postlude like so:

raise explicit event foo [for <ruleset>]
  with fizz = "bazz" 
   and fozz = 4 + x;

If the optional for clause isn't given, the event is raised for the current ruleset; otherwise it's raised for the named ruleset. The with clause allows the developer to add event parameters to the explicit event. The right-hand side of the individual bindings in the with clause can be any KRL expression. Like any other postlude statement, explicit events can be guarded:

raise explicit event foo 
    with fizz = "bazz" 
     and fozz = 4 + x
  if (flipper == "two");

The event in the preceding example will only be raised if the variable flipper has the value "two".

Explicit events allow KRL programmers to chain rules together. Rule chaining is good for modularization, preprocessing, and abstraction as we'll show in the following sections. We'll first discuss event intermediary patterns in general and then go through several example patterns.
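The chaining idea can also be sketched outside KRL. The Python below is a minimal, hypothetical model, not how KNS is implemented: the `EventBus` class, its method names, and the `web pageview` trigger are all assumptions made for illustration. One rule raises a guarded explicit event with parameters and a second rule selects on it.

```python
from collections import defaultdict, deque


class EventBus:
    """Toy stand-in for a rule engine (hypothetical; not the KNS design)."""

    def __init__(self):
        self.rules = defaultdict(list)   # (domain, name) -> handlers
        self.queue = deque()

    def select_when(self, domain, name):
        """Register a handler for events with this domain and name."""
        def register(handler):
            self.rules[(domain, name)].append(handler)
            return handler
        return register

    def raise_event(self, domain, name, **params):
        self.queue.append((domain, name, params))

    def run(self):
        """Process queued events until no rule raises anything further."""
        while self.queue:
            domain, name, params = self.queue.popleft()
            for handler in self.rules[(domain, name)]:
                handler(params)


bus = EventBus()
seen = []


@bus.select_when("web", "pageview")
def first_rule(params):
    x = params["x"]
    # Guarded raise, mirroring: raise explicit event foo ... if (flipper == "two")
    if params["flipper"] == "two":
        bus.raise_event("explicit", "foo", fizz="bazz", fozz=4 + x)


@bus.select_when("explicit", "foo")
def second_rule(params):
    seen.append((params["fizz"], params["fozz"]))


bus.raise_event("web", "pageview", flipper="two", x=1)
bus.raise_event("web", "pageview", flipper="one", x=1)
bus.run()
print(seen)  # [('bazz', 5)]
```

Only the first pageview passes the guard, so the second rule runs once.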

KRL Event Patterns

KRL is a rule language, a style that's unfamiliar to most programmers. Consequently, it's useful to see patterns and idioms for common operations. Additionally, KRL is an event processing language. As such, events are at the core of what happens inside a KRL program. That means that understanding how to process and manipulate events is important.

One thing that most event intermediary patterns have in common is that they usually take no action. You'll see that the noop() action is prevalent in the examples below. There is a (complex) event expression, sometimes some data manipulation in the prelude, and finally one or more events raised in the postlude.

As currently implemented, events in KRL have two limitations that restrict the ability of KNS to serve as an event intermediary in certain situations:

  1. KRL currently limits an explicit event to be raised for one ruleset--either the current ruleset (the default) or one given in the for clause.
  2. There is no way at present to pass all of the event parameters from the event expression to an explicit event when it is raised. The developer must explicitly (no pun intended) pass any event parameters that are necessary in following steps.

We will be taking steps to remove these limitations in future releases of KRL.

Event Logging

One of the simplest intermediary patterns is the event logging pattern. The intermediary rule looks for the expected event scenario, calls a logging statement (either using the built-in log command in KRL or by making an HTTP post) and then passes the event on using an explicit event.

Here's an example:

rule logger_rule is active {
   select when phone outboundconnected

   http:post("http://example.com/mylogger.cgi") with
      number = event:param("phonenumber");

   always {
     raise explicit event outboundconnected with
        phonenumber = event:param("phonenumber") and
        time = event:param("time");
   }
}

rule use_phone {
  select when explicit outboundconnected 
  ...
}

In this example, the rule is logging the event and some data from it before passing the event on (as an explicit event). This might be useful for debugging or billing. Note that Kynetx terms of service explicitly disallow the use of calls to other systems for purposes of smuggling people's private data.
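For readers who want to see the shape of the pattern without KRL, here is a rough Python sketch. Everything in it is an assumption for illustration: `raise_explicit`, the handler table, and the in-memory `log_lines` list (standing in for the HTTP post) are not KRL or KNS features.

```python
log_lines = []   # stand-in for an HTTP POST to a logging service
handled = []


def use_phone(params):
    """Downstream rule: selects on the explicit event."""
    handled.append(params["phonenumber"])


EXPLICIT_HANDLERS = {"outboundconnected": [use_phone]}


def raise_explicit(name, **params):
    """Deliver an explicit event to every rule that selects on it."""
    for handler in EXPLICIT_HANDLERS.get(name, []):
        handler(params)


def logger_rule(event):
    """Intermediary: log, then pass the event on as an explicit event."""
    log_lines.append(f"outbound call to {event['phonenumber']}")
    # Parameters are forwarded by hand; nothing is passed on implicitly.
    raise_explicit(
        "outboundconnected",
        phonenumber=event["phonenumber"],
        time=event["time"],
    )


logger_rule({"phonenumber": "555-0100", "time": "09:38"})
```

The intermediary takes no action of its own beyond logging; the downstream rule sees only the parameters that were forwarded.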

Abstract Event Expressions

Sometimes you will have a complex event expression (one that uses compound event expressions) that you need to use in more than one rule. Good programming practice dictates that you abstract that complex event expression so that if it changes, you don't have multiple places to remember to update it. Additionally, giving a complex event expression a name can facilitate program readability. Explicit events give us the means to accomplish complex event expression abstraction.

Here's an example:

rule called_first is active {
  select when phone outboundconnected
       before mail received from "@apple.com"
  noop();
  always {
    raise explicit event called_first with msg = event:param("msg");
  }
}
...
rule use_called_first_1 is active {
  select when explicit called_first
  ...
}
...
rule use_called_first_2 is active {
  select when explicit called_first
  ...
}

Notice that the first rule raises an explicit event with the name "called_first" whenever it sees a particular event pattern. Two later rules use the "called_first" explicit event. If the complex event expression is changed or updated, the two rules will both respond appropriately. When used like this, we call "called_first" an abstract event expression.
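The same abstraction can be modeled in Python as a single matcher with several subscribers. This is a sketch under assumptions: the `CalledFirst` class is hypothetical, `before` is read loosely as "earlier in the stream", and the `from` filter is approximated with a substring test.

```python
class CalledFirst:
    """Matches 'phone outboundconnected' followed later by mail from @apple.com."""

    def __init__(self):
        self.saw_call = False
        self.subscribers = []

    def feed(self, domain, name, params):
        if (domain, name) == ("phone", "outboundconnected"):
            self.saw_call = True
        elif (domain, name) == ("mail", "received") and self.saw_call:
            if "@apple.com" in params.get("from", ""):
                self.saw_call = False            # reset for the next match
                for rule in self.subscribers:    # the named, abstract event
                    rule({"msg": params.get("msg")})


fired = []
pattern = CalledFirst()
pattern.subscribers.append(lambda p: fired.append(("use_called_first_1", p["msg"])))
pattern.subscribers.append(lambda p: fired.append(("use_called_first_2", p["msg"])))

pattern.feed("mail", "received", {"from": "a@apple.com", "msg": "too early"})
pattern.feed("phone", "outboundconnected", {})
pattern.feed("mail", "received", {"from": "b@apple.com", "msg": "hello"})
```

The pattern is written once; changing it changes what both subscribers see.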

Event Preprocessing

Sometimes the data from an event (in the event parameters) will need to be preprocessed before it is used. Based on the results of that preprocessing, you may want to do different things.

Here's an example:

rule pinentered is active {
  select when mail received
  pre {
    msg = event:param("msg");
    from = event:param("from");
    item = datasource:pds({"key":from});
    relevant_data = msg.query("li[type=#{item}]");
  }
  noop();
  always {
    raise explicit event mail_received with
      from = event:param("from") and
      to = event:param("to") and
      msg = relevant_data
  }
}
 

In this example, the message from a received mail is preprocessed using the query operator to retrieve just those portions that are HTML <li> elements whose type attribute equals a value retrieved from a datasource using the message's from address.

This is an example of a complex mapping step that might need to be done for several rules. Using explicit events we can pull it out into a single place where it can be more easily maintained and tested.
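As a rough illustration of that mapping step, the Python below filters a message down to the matching <li> elements before building the parameters of the outgoing event. It is only a sketch: the `LiByType` parser, the `preprocess_mail` function, and the `lookup` dictionary (a stand-in for the datasource) are assumptions, and KRL's query operator is not modeled in detail.

```python
from html.parser import HTMLParser


class LiByType(HTMLParser):
    """Collects the text of <li> elements whose type attribute matches."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.capturing = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("type") == self.wanted:
            self.capturing = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.items[-1] += data


def preprocess_mail(event, lookup):
    """Reduce the message to the relevant items before passing it on."""
    parser = LiByType(lookup[event["from"]])
    parser.feed(event["msg"])
    parser.close()
    return {"from": event["from"], "to": event["to"], "msg": parser.items}


lookup = {"pat@example.com": "order"}   # hypothetical stand-in for the datasource
event = {
    "from": "pat@example.com",
    "to": "me@example.com",
    "msg": '<ul><li type="order">Widget</li><li type="ad">Buy now</li></ul>',
}
explicit_event = preprocess_mail(event, lookup)
```

Downstream rules receive only the filtered items, so the parsing logic lives in one place.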

Event Stream Splitting

Related to the idea of event preprocessing is the notion of event stream splitting. The previous example shows event parameter preprocessing. We can use the event parameters to split the event stream and send it in two different directions depending on the result of a test on the data. Often preprocessing will be done in support of splitting the event stream.

Here's an example:

rule pinentered is active {
  select when webhook pinentered
  pre {
    pinattempt = event:param("Digits");
    phone = datasource:pds({"key":"phone"});
    pin = phone.pick("$..value.pin");
  }
  if pinattempt == pin then
    noop();
  fired {
    raise explicit event correct_pin
  } else {
    raise explicit event bad_pin
  }
}
  
rule badpin is active {
  select when explicit bad_pin
  ...
}
  
rule correctpin is active {
  select when explicit correct_pin
  ...
}

In this example the data in the event parameter Digits is compared with data retrieved from another data source (datasource:pds). If they're equal, the explicit event correct_pin is raised; otherwise the explicit event bad_pin is raised. The rule that selects on each of these events then continues processing as necessary. In this case none of the event data from the original event is passed on with the new events, but that need not be the case.
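Stripped of KRL syntax, the split is a test that chooses which event to raise. A minimal Python sketch, in which the function signature and the `routes` dictionary are illustrative assumptions:

```python
def pinentered(event, stored_pin):
    """Return the name of the explicit event to raise for this attempt."""
    if event["Digits"] == stored_pin:
        return "correct_pin"      # the `fired` branch
    return "bad_pin"              # the `else` branch


# Each explicit event name collects the attempts routed to it.
routes = {"correct_pin": [], "bad_pin": []}
for attempt in ("1234", "0000", "1234"):
    routes[pinentered({"Digits": attempt}, stored_pin="1234")].append(attempt)
```

One incoming stream becomes two, and each can be handled by its own rule.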

App Controller Ruleset

An app, in Kynetx nomenclature, consists of one or more rulesets. Complex apps consist of multiple rulesets. We've built some that use a dozen or so and I expect to see apps that use many more than this. One of the problems in building complex apps that comprise multiple rulesets is keeping track of the control points in the app--what events are causing what to behave. Developers often want a single place in the code where they manage control flow.

Using the patterns outlined above, you can create a controller ruleset that is the main entry point for the app and controls the rules that get executed in other rulesets. Here are a few of the advantages of using a controller ruleset in your app:

  • routing - Each complex event pattern that the app responds to is represented in the controller ruleset. Each of these event patterns raises an explicit event that other rules in the app respond to (event abstraction).
  • authentication - Kynetx Marketplace offers developers a way to charge for their apps. But only one ruleset can be listed in the Marketplace as the "app." An event controller solves this problem by being the one point of control and thus the place where authentication is controlled as well.
  • normalization - preprocessing event parameters in the controller app provides a normalized version of data and can serve to insulate the rest of the app from changes in outside event sources and endpoints.
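The three roles listed above can be sketched together in a few lines of Python. This is a hypothetical model only: the routing table, the ruleset name `billing_rules`, the event name `call_started`, and the parameter keys are invented for illustration.

```python
def normalize(params):
    """Normalization: hide differences between endpoints (keys are hypothetical)."""
    number = params.get("phonenumber") or params.get("From", "")
    return {"phonenumber": "".join(ch for ch in number if ch.isdigit())}


# Routing table: external event -> (target ruleset, explicit event name).
ROUTES = {
    ("phone", "outboundconnected"): ("billing_rules", "call_started"),
    ("webhook", "callanswered"): ("billing_rules", "call_started"),
}


def controller(domain, name, params, authorized=True):
    """Single entry point: authenticate, normalize, then route."""
    if not authorized or (domain, name) not in ROUTES:
        return None
    ruleset, explicit_name = ROUTES[(domain, name)]
    return ruleset, explicit_name, normalize(params)


routed = controller("webhook", "callanswered", {"From": "+1 555 0100"})
blocked = controller("phone", "outboundconnected", {"phonenumber": "555-0100"},
                     authorized=False)
```

Two differently shaped external events end up as the same explicit event with the same parameter format, and unauthorized events go nowhere.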

Conclusion

Explicit events open up the use of event intermediaries in KRL and significantly expand the viability of complex apps built from multiple rulesets. Explicit intermediary rulesets like a controller greatly reduce the cognitive complexity of large apps. The examples above are just a few of the interesting patterns I've noticed. As you notice others, I hope you'll let me know so I can collect and share them.

Posted at 9:38 AM

September 27, 2010

Starting a High-Tech Business: Hiring Your First Engineer

I'm starting a new business called Kynetx. As I go through some of the things I do, I'm planning to blog them. The whole series will be here. This is the twenty-sixth installment. You may find my efforts instructive. Or you may know a better way---if so, please let me know!

Allan Carroll asked me a question on LinkedIn that I thought I would answer more publicly:

What's most important in hiring the first engineer at a startup and how do you find that person?

On one hand, I'm probably not the right guy to answer this question because I cheat. At both iMall and at Kynetx, the first engineer I hired was a (barely) former student. Kelly Hall in the case of iMall and Sam Curren in the case of Kynetx. Being a professor gives me the opportunity to get to know, over the course of months or years, the character, work habits, and innovative abilities of a number of great programmers and I simply cherry pick the best of the best.

On the other hand, that doesn't mean that I didn't have some clear objectives. A startup's first engineering hire comes at a special time. You've likely been writing code solo for several months. You've picked many of the architectural features and implementation directions. You need additional horsepower in several different areas:

  1. You need someone to help flesh out the parts you don't have time to do alone.
  2. You need new skills for some things you can't do or aren't particularly good at.
  3. You need innovation around ideas that haven't even surfaced yet.

Because you're fiscally constrained, you can only hire one person. In the short run, you need (1) and (2) badly. In the long run (3) is more important. In the case of Kynetx, I was looking for someone to build other pieces of the architecture while I continued to work on the core engine (which was written in Perl). So I was able to relax requirement (1).

One of the things I've written about before is my desire to hire people I know. People have pointed out that that idea doesn't scale, but in the early stages of a startup, it doesn't have to. The early stages of a startup are when you can least afford a hiring mistake. Hiring the wrong person can be deadly. So, I simply wouldn't trust it to someone I've met in an interview and exchanged a few emails with. I want to know that I'm hiring someone who is "smart and gets things done" in Joel Spolsky's words. If you can't hire someone you know well, then you'd better find someone who will give you honest references for whoever you're considering.

As you can see from my penchant for hiring students, I like to hire engineers early in their careers. That has the advantage of being able to get them into your culture with as little baggage as possible. I'm a big believer in culture. When you hire someone who's worked other places you need to assess whether their cultural experience and expectations will match your own. That doesn't mean I won't hire engineers further along in their careers, although most of them are people I've worked with before, so I know what baggage they bring along and they know what they're getting into.

Ideally, the first engineer you hire will be more than a code monkey. You need a leader and someone who can jump into things that they're not familiar with. You need someone who has their own ideas and isn't afraid to express them. At the same time, they need to be able to withstand the blow to their ego when not all of their ideas are immediately taken up and paraded around as the salvation of the company.

Finally, I think that the first engineering hire has to be "startup compatible." Startups frequently have lower pay and fewer benefits than a job with an established firm. The other side of that coin is that your first engineering hire should get (and expect) equity of some form as an offset. The hours are long and unpredictable. They might be asked to do things beyond coding like speaking, teaching, or doing tech support. Heck, the junior engineer at Kynetx even has to stock the fridge. All these things can be a problem for an engineer (or spouse) who isn't used to them--even when disclosed up front.

Posted at 2:21 PM

September 23, 2010

Static Queries, Dynamic Data: Enabling the Real Time Web

Notice

If you're not up on event systems and event programming, no worries. Most programmers aren't. But I'm here to tell you that they're cool, they're accessible, and they're not just for enterprise kinds of problems anymore. Event processing enables the real-time Web. Most of the cool things we want the Web to do will be easier when we no longer rely exclusively on time and request-response based architectures and look to event based architectures to solve more problems.

One of the unique things about the Kynetx architecture is that there's no database in the traditional sense. We have a ruleset repository and a session store for persistent context information, but the architecture is very different from what people consider a traditional Web service architecture that, almost always, has a database at its core. When I created the architecture I wanted to stay away from traditional databases for scaling reasons and to ease the operations burden.

What I've never had a good answer for was "why can you get away with that?" Yesterday it hit me. In a traditional database, the data is (relatively) static and the queries are dynamic. An event processing system is the dual of that. In an event processing system, the queries are static and the data is dynamic. Since KNS is an event processing network, or a system of programmable event loops, and KRL is an event processing language, it has provision for expressing static queries that it continuously applies to streams of events.

There are several well-known examples of this concept. When you create a Google alert, for example, you're giving Google a standing, static query that it continuously runs over the stream of new stuff that it sees. When there's a match, you get an email. Twitter searches are similar. Once you've done the search, Twitter continues to apply it to the stream of tweets looking for matches. Static queries, dynamic data.

Here's a KRL example. In KRL, queries on an event stream are specified using the select statement in a rule, like so:

select when pageview "www.google.com/calendar"
       then mail received subject "schedule change" 

This query, or event expression, says "select this rule when the user has been on Google calendar and then receives an email with a subject that contains the words 'schedule change'". When a rule containing this event expression is active, the system continuously watches the stream of event data looking for this pattern. Whenever the pattern is seen, the rule engine selects the rule and evaluates it. Static query, dynamic data.
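To make "static query, dynamic data" concrete, here is a small Python sketch of a standing query applied to a stream. It is an approximation under assumptions: the tuple format of the events is invented, and `then` is treated loosely as "later in the stream", so KRL's exact handling of intervening events is not modeled.

```python
import re


def standing_query(stream):
    """Yield each matching mail subject that follows a visit to Google Calendar."""
    visited = False
    for domain, name, value in stream:
        if (domain, name) == ("web", "pageview"):
            if re.search(r"www\.google\.com/calendar", value):
                visited = True
        elif (domain, name) == ("mail", "received") and visited:
            if re.search(r"schedule change", value):
                visited = False      # start looking for the next occurrence
                yield value


events = [
    ("mail", "received", "schedule change: too early to match"),
    ("web", "pageview", "http://www.google.com/calendar/render"),
    ("mail", "received", "lunch?"),
    ("mail", "received", "Re: schedule change for Friday"),
]
matches = list(standing_query(events))
```

The query never changes; only the data flowing past it does.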

Making the real-time Web real will require more systems to turn the traditional Web service inside-out and handle dynamic data. Events are the way to do that because they enable loosely coupled systems that are always watching for scenarios that matter to the user. Without events, this is hard and expensive to pull off.

Posted at 10:15 AM

CTO Breakfast Tomorrow!

We'll be holding September's CTO Breakfast tomorrow morning at 8am at the cafeteria on the Novell campus. Come join us for a free ranging discussion of technology and high-tech business. You don't have to be a CTO to come--just someone who's interested in high-tech products and businesses. I hope you can make it.

Also, we're going to do a special CTO Breakfast meetup in conjunction with the Utah Open Source Conference on Thursday, Oct 7th. We'll have a room at the Miller Campus of Salt Lake Community College and be done before the morning's talks begin. More details to follow, but add this to your calendar now. And don't forget to register for the conference. The schedule is full of great talks.

Posted at 8:30 AM

September 16, 2010

Beyond the API: The Event Driven Internet

Summary: There's no question that APIs are hot and generating a lot of buzz and excitement. In this article, I'll review why APIs are causing so much excitement, make an argument for why APIs are not enough, and finally propose a model that significantly extends the power of an API: an event-driven view of the Internet. Extending your API with events will make your APIs much more able to compete and make your business more competitive. After reviewing event models, I discuss webhooks as an event model that complements an API strategy and then briefly talk about how Kynetx extends the webhook idea into something that is truly powerful. Using Kynetx, you can give your API an instant developer program. Let Kynetx help you with your API strategy.

APIs are Hot

A recent article at Forbes by Dan Woods talked about a "multi-dimensional gold rush [happening on the Internet]--with APIs at the center."

To understand the business value of something as technical sounding as an "application programming interface"--the almost never-used expansion of the acronym API--look at several powerful forces that are converging to make a programming tool for developers an engine for sales, marketing and customer lock in. APIs are rapidly going to be vitally important for every business, not just the Silicon Valley technology giants.

APIs allow developers to create new applications that incorporate the service underneath the API. For example on eBay, a programmer can use the API to create a gateway to move all the items in a company's catalog onto the auction site. But it works the other way as well. Items that are being auctioned off on eBay or that are for sale on Amazon can find their way to new audiences when programmers use APIs to include the product listings in their own applications.

...

If this sounds a bit obscure, think of it this way: Google and Facebook have 5 billion API calls a day, according to John Musser, editor of ProgrammableWeb, the leading publication covering internet-accessible APIs. Twitter has 3 billion calls a day that amount to 75% of its traffic. Salesforce.com has 50% of its traffic flowing through APIs. Don't think APIs, think billions of dollars of money sloshing through obscure programming methods.

Fred Wilson, a partner in venture capital firm Union Square Ventures, put it this way in his note about investment strategy at the beginning of 2010: "Developers are the new power users. If you cater to them, you can build a large user base with significant network effects." To cater to developers you must offer them an API to play with.

From What's Fueling The API Gold Rush - Forbes.com
Referenced Wed Sep 15 2010 16:23:02 GMT-0600 (MDT)

I can't help repeating Fred's comment "Developers are the new power users. If you cater to them, you can build a large user base with significant network effects."

If this is still not resonating with you, spend a little time with Sam Ramji's talk on Darwin's Finches, 20th Century Business, and APIs. Here's the summary from Sam's blog:

There is a perspective some people apply to evolution, social theory, and language change called punctuated equilibrium (credit goes to Jess Ruefli for pointing this out). It suggests that change is not gradual, but that change comes in sudden punctuated bursts between stretches of relative stasis or equilibrium. The Web from 1995-2000 was certainly a surge like this as every business "went online" in order to continue to function in a newly competitive economy. I believe that we're going through such a surge right now as the early versions of the web - designed for people using browsers - gives way to the next version: using APIs to design the web for people using applications that communicate on their behalf in complex ways to the services that make up the world's businesses. If we look to evolution and to the last similar shift - the move from direct to indirect channels for business in the 20th century, we can apply old lessons to this new world in order to succeed.

The primary point Sam makes is that as the offline world went from direct to indirect selling in the mid-20th century, successful businesses were those that sold through successful retailers. So too, the Web is going indirect. Building a Web site for people to visit with their browser is "direct." Building an API where other Web sites can use your data to succeed is "indirect." Good APIs take your business with them all over the Web as they get used. But for that to happen your API has to appeal to the businesses that will use it. You have to make them successful for your business to succeed. Sam gives some principles that help you help your partners succeed:

  1. Realize that developers are your channel
  2. Be recombinant and easily mixed
  3. Unlock your legacy data into open APIs
  4. Drive new data into your system via open APIs
  5. Support your application ecosystem

When you hear people clamouring about cloud computing, you'll see a lot of smoke around infrastructure cloud providers like Amazon, Google, and Rackspace, but that's not where the fire (energy) is. The energy, and the real money, will be in APIs. That's what has Dan Woods so excited in the Forbes article I reference above.

Beyond the API: Events

I'm a huge fan of APIs. I think they change the game and will open up numerous new services on the Internet. But with all the excitement over APIs, I think they only get us part way to where we want to be. An API can only respond when it receives a request. Many interesting services will also need to make requests of their own. That pattern is broadly called an event architecture.

The need for events isn't an idea that's surprising to most people who've worked with computers. We've all contemplated the uses of interrupts in computing systems to avoid the need for constant polling by one device or system of another. If an I/O device isn't able to interrupt the CPU, the CPU has to constantly check with the I/O device to determine if it's got something new.

Moving up the stack a bit, one of the disadvantages of RSS is that there's no standard way for an RSS feed to notify interested parties when it's got a new item. The only way to get new items is to constantly poll the system hosting the RSS feed to see if it's changed. Pubsubhubbub is a protocol that rectifies that by defining a way for systems to register their interest in a particular feed. When the feed publishes a new item, subscribers are notified. More generally, an event is raised telling subscribers there are new items.
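The difference between polling and notification is easy to see in a toy model. The Python below is a sketch only; the `Feed` class is hypothetical and does not implement RSS or the Pubsubhubbub protocol.

```python
class Feed:
    """Hypothetical feed that supports both polling and subscription."""

    def __init__(self):
        self.items = []
        self.subscribers = []
        self.polls = 0

    def poll(self, since):
        """Polling: the consumer must keep asking, even when nothing is new."""
        self.polls += 1
        return self.items[since:]

    def subscribe(self, callback):
        """Notification: the consumer registers interest once."""
        self.subscribers.append(callback)

    def publish(self, item):
        self.items.append(item)
        for callback in self.subscribers:
            callback(item)


feed = Feed()
pushed = []
feed.subscribe(pushed.append)

polled = []
for tick in range(5):
    if tick == 3:
        feed.publish("new post")
    polled.extend(feed.poll(len(polled)))
```

Both consumers end up with the one new item, but the polling consumer made five requests to get it while the subscriber was called exactly once.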

But the need for event-driven architectures in combination with APIs goes beyond the standard "polling vs. interrupts" argument. Most of us associate that too closely with the "I've got something for you" kind of occurrence. This feels too much like message passing and implies a tighter coupling than is necessary or desirable for most event systems. In general, events can be associated with several types of occurrences:

  • A specified action is taken (e.g. a new item is published to a feed)
  • A spontaneous act of nature too complicated to be fully understood is detected (e.g. your computer just crashed)
  • One or more conditions are determined to have been met (e.g. the temperature of the oven reaches 450 degrees)

In these examples the system raising the event has no knowledge or understanding of the downstream systems that are seeing the event or what they might do. Event architectures can be characterized by extreme loose coupling.

Event Systems

Event systems can be characterized as "simple" or "complex." Simple event systems look for and react to one kind of event. In contrast, complex event systems can monitor scenarios made up of more than one event, reacting only after certain patterns of events have been detected. The events in such a scenario may come from multiple event domains and be correlated in space, time, or causality. Responding to complex event scenarios requires sophisticated event interpretation, event pattern definition notations and matching engines, and event correlation techniques.

Event systems have several components:

  • event generators - the event generator raises the initial event. There may be translation steps to get the event into the right format for use by the event processor.
  • channels - this is the messaging protocol that's used to transfer the event from the generator to the processor. The channel can take many forms and a given event system might support multiple protocols such as HTTP, XMPP, SMTP, and so on.
  • event engine - event engines match event patterns and initiate action. Simple event engines respond to each event separately and are sometimes referred to as "handlers" or "listeners." In complex event scenarios, the event engine processes the events and only initiates action when the simple events in the scenario match the required correlation pattern.
  • responders - responders take directives from the event engine and take desired actions. A given event scenario match in the engine may result in zero or more activities on a variety of responders.
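The four components listed above can be lined up as a small pipeline. The Python below is an illustrative sketch, not a real event system: the event format, the directive format, and the oven example values are assumptions.

```python
def generator():
    """Event generator: raises events in an agreed format."""
    yield {"domain": "oven", "name": "temperature", "degrees": 300}
    yield {"domain": "oven", "name": "temperature", "degrees": 450}


def channel(events):
    """Channel: carries events to the engine (HTTP, XMPP, SMTP, and so on)."""
    for event in events:
        yield dict(event)          # copy, as a transport would


def engine(events):
    """Simple event engine: reacts to each event separately."""
    for event in events:
        if event["name"] == "temperature" and event["degrees"] >= 450:
            yield {"responder": "notifier", "text": "Oven is ready"}


def respond(directives):
    """Responders: carry out the directives issued by the engine."""
    return [f"{d['responder']}: {d['text']}" for d in directives]


output = respond(engine(channel(generator())))
```

The generator knows nothing about the engine's condition or the responder's action, which is the loose coupling the model is meant to provide.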

The overall power of a given event system is proportional to the flexibility it has with regard to the number and type of generators it supports, the variety of channels, the complexity of the acceptable event scenarios, and the number and types of activities it can initiate. Of course, the greater the power of the event system the more complicated it can be to configure.

The loose coupling properties of event systems follow from the fact that the event scenarios and follow-on activities are defined according to the needs and desires of the interested parties, not the organizations and systems generating the events. Once the event has been generated, anyone who sees it chooses to respond however they like.

Webhooks

The webhook concept popularized by Jeff Lindsay is an example of a simple event system built on top of the existing Web infrastructure.

  • generators - in the webhooks model, anything that can call a URL can be a generator and events are raised by performing an HTTP method on a URL
  • channel - the channel in webhooks is the HTTP protocol
  • event engine and responders - the engine and responder are the web application that the URL points to.

Webhooks overlay an event model on the Web. There's no "system" per se, just a usage pattern for enabling user-defined callbacks on the Web. Jeff points out that webhooks can be used for everything from simple notifications to more sophisticated service chaining and even Web application plugins.
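A webhook needs nothing beyond HTTP, which the following Python sketch tries to show using only the standard library. The listener, the `/hook` path, and the JSON payload are assumptions made for the example; they do not reflect any particular service's callback format.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []


class Listener(BaseHTTPRequestHandler):
    """The web application a webhook URL points to (engine and responder)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):   # keep the example quiet
        pass


server = HTTPServer(("127.0.0.1", 0), Listener)   # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The generator: anything that can perform an HTTP method on the URL.
url = f"http://127.0.0.1:{server.server_port}/hook"
request = urllib.request.Request(
    url,
    data=json.dumps({"event": "payment_completed", "amount": "9.99"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    reply = response.read()

server.shutdown()
server.server_close()
```

The service raising the event only needs the URL; what the listener does with the payload is entirely up to its author.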

There are already a number of Web applications that support webhooks (some, I'm sure, without really knowing they were doing so--true of all great, natural patterns). Perhaps the most familiar is Paypal's instant payment notification. The idea is simple. Paypal lets users provide a URL that Paypal calls whenever any of a number of transactions are made. As the developer of the application at the other end of the URL, you can do whatever you like when Paypal calls your listener (i.e. the CGI program you write) using the URL you supply. Your program can ignore the Paypal data, store it in a database, or whatever.

Paypal IPN

This is an example of a simple notification webhook. Amazon payments has a merchant callback API that functions as a webhook plugin. When Amazon gets an order on your behalf, they will call the webhook you give them. They expect to receive the taxes and shipping cost. You return those and they put them in the checkout for your customer. Again, your listener can do anything you like when the callback URL (webhook) is called as long as you return the right data. Amazon is using webhooks to create a flexible plugin architecture for their service.

Twilio is a cloud-based telephony platform that uses webhooks to do service chaining. For example, when you use Twilio to place a call, you can give it a webhook that it calls when the call is answered. You return the XML payload that defines what happens next. Twilio will then call a webhook for the next action. Twilio and the webhooks it calls form a chain of executions that achieve the ultimate purpose.

Webhooks provide a useful pattern for extending APIs beyond the traditional interactions that have defined them. Webhooks are easy to implement and very flexible because they require no special tooling or software to make work. Any application that speaks HTTP can generate or process webhook-style events.

Beyond Webhooks: Kynetx

Webhooks have several limitations:

  • webhooks' lack of a formal framework means that programmers are responsible for managing event scenarios. For simple listeners this isn't a problem, but as event scenarios become more complicated, this places a large burden on the developer.
  • webhooks are largely about the web (go figure). This means that interacting with event generators or responders in other domains (e.g. email) requires the creation of a gateway between that target domain and the Web. This isn't a big deal except that the lack of standards around how things should work makes reuse of these gateways difficult.
  • webhooks don't supply any special facility for managing user identity. Each service defines its own method for managing identity.

A more formal event framework can overcome these problems by defining notation for the engine and standards around interactions. Kynetx has developed an event service for the Internet that provides a more formal structure. The Kynetx Network Service (KNS) maps onto the event model given above as follows:

  • generators - in KNS programs called endpoints raise events using an agreed upon format. Endpoints can use the KNS APIs to determine which events are salient to limit communication traffic.
  • channel - generators communicate with the engine over HTTP. Protocol translators can be used to accommodate other protocols.
  • event engine - the Kynetx Rule Engine (KRE) is the event processor. KRE supports complicated event scenarios. Handlers are scripted using the Kynetx Rule Language (KRL), a domain specific language for Internet events.
  • responders - endpoints are also responsible for responding to directives from KRE and taking appropriate action.

Endpoints serve two functions in this architecture, but there's no need for any given endpoint to do both, although most do. Endpoints that merely raise events or just respond to directives are supported.

KNS Overview

The Kynetx Rule Language provides a unifying notation for reinforcing the conceptual event framework implemented in KNS as well as providing all the traditional benefits of a notation. In particular, KRL provides a convenient notation for specifying complex event scenarios. KNS automatically builds state machines that correlate multiple events across event domains using that event scenario specification. As each new event is received, KRE evaluates it in the context of past events and the event specification to determine when activity should be initiated.
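The idea of building a state machine from a scenario specification can be illustrated with a very small Python sketch. This is not how KNS compiles event expressions; `compile_sequence` is a hypothetical helper that handles only a fixed sequence of events and simply ignores anything that doesn't advance the match.

```python
def compile_sequence(steps):
    """Build a matcher for a list of (domain, name) steps that must occur in order."""
    state = {"index": 0}

    def feed(domain, name):
        """Advance on a matching event; return True when the scenario completes."""
        if (domain, name) == steps[state["index"]]:
            state["index"] += 1
            if state["index"] == len(steps):
                state["index"] = 0       # reset, ready for the next match
                return True
        return False

    return feed


match = compile_sequence([("web", "pageview"), ("mail", "received")])
results = [
    match("mail", "received"),    # out of order: no match yet
    match("web", "pageview"),     # first step seen
    match("phone", "ring"),       # unrelated event, ignored
    match("mail", "received"),    # scenario complete
]
```

Each incoming event is judged in the context of what has already been seen, and the rule fires only when the whole scenario has occurred.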

As mentioned, KRL provides a domain specific language for specifying event handlers. Each rule in a ruleset has an associated event specification that determines whether the rule is selected or not. Whenever a rule is selected, the rule body is evaluated. The ultimate effect of this evaluation is a set of directives that inform endpoints of the actions they should take. Along the way data sources and persistent data about the entity raising the event are consulted and used in calculations to compute the appropriate set of directives.

For more details on how Kynetx works, check out our free white paper: The Kynetx Rule Language - The First Internet Application Platform.

It's All About the Individual

One of the key features of KNS that makes it uniquely built for programming interactions on the Internet is that it's got identity built-in. Every event is raised on behalf of a particular entity (see footnote 1). Even if you and I have the same apps installed (and thus are using the same rulesets), the behavior you see might be radically different than the behavior I see based on our context.

Endpoints, events, and directives

This contrasts to other business rule languages where the process is the fulcrum around which the ruleset executes. KRL primitives understand the context of the individual and take it into account when they execute. Even persistent variables store their values on behalf of a specific entity.
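A small Python sketch shows what it means for persistent variables to belong to an entity. The `EntityStore` class and `visit_rule` function are hypothetical names for this example only; KRL has its own syntax for persistent variables, which this does not attempt to reproduce.

```python
from collections import defaultdict


class EntityStore:
    """Hypothetical persistent store where every value is keyed by entity."""

    def __init__(self):
        self._data = defaultdict(dict)

    def get(self, entity, key, default=None):
        return self._data[entity].get(key, default)

    def put(self, entity, key, value):
        self._data[entity][key] = value


store = EntityStore()


def visit_rule(entity):
    # The same rule runs for everyone, but reads and writes only the
    # state that belongs to the entity the event was raised for.
    count = store.get(entity, "visits", 0) + 1
    store.put(entity, "visits", count)
    if count == 1:
        return "welcome"
    return "welcome back (visit %d)" % count


messages = [visit_rule("alice"), visit_rule("bob"), visit_rule("alice")]
print(messages)
```

Both entities run identical code, yet the second call for `alice` behaves differently from the first call for `bob` because each has its own history.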

By putting entities--the individual--at the center of KNS, we have created an event system that is aimed squarely at creating apps that are user-centric. This is why Kynetx has been so interested in the personal data store conversations, because it is a natural way to script the interactions that individuals have with various services around the Internet.

KRL and KNS Benefits

As we've seen, KNS and KRL are unique in the use of events as a unifying abstraction for creating Internet apps. They are also unique in their focus on the individual and support of user-centric applications. In addition to these key differences, KNS and KRL provide a number of important benefits in creating an event-driven Web:

  • Cross domain--Kynetx apps can work across domains so that user purpose can be advanced regardless of online location. KRL is designed to cross the silos that have sprung up, as standalone Web applications, so developers can create applications that mash-up data from all across the Internet regardless of location or protocol.
  • Cross protocol--Kynetx apps easily work across Internet protocols such as the Web, email, and so on. KNS is easily extensible by developers to support any protocol.
  • Data and context driven--KRL and KNS are designed to easily and naturally work with the burgeoning array of data and APIs available online. Correlated data provides context about users. Using KRL and KNS, developers can create applications that respond to user context for a more compelling experience.
  • Cloud based--because Kynetx apps are cloud based, they work consistently and ubiquitously. They can be accessed from multiple platforms while providing the same context, identity, and experience. Cloud based programming means that programs always work because they are updated without user interaction in response to changing conditions.
  • Browser independent--Kynetx apps work in all the major browsers without modification. The browser has become a sort of universal application platform, but browser differences make programming on them difficult. KRL provides a unifying framework for easily working with all of the popular browsers.
  • Internet app centric language and design--KRL provides a powerful notation for creating apps that run across the Internet. KNS provides the platform that makes that possible.
  • Security and privacy are built-in--the architecture of KNS is designed to limit nefarious activity structurally. In addition, operating in the cloud makes it easy to turn off apps that are misbehaving. User control provides the means to create privacy respecting apps.
  • Late binding--Kynetx apps run at the exact moment that the user needs them. They bind to data and functionality that is appropriate for the user's current context. In contrast, conventional Web applications exist at a single location and operate without the benefit of user context.
  • Multi-endpoint--KNS provides application program endpoints that work with Web browsers, email servers, and other Internet systems. Kynetx plans to provide endpoints for popular and important Internet protocols and applications as part of its ongoing development roadmap. Developers can easily extend KNS to include endpoints for any Internet protocol.
  • Developer friendly--KRL is designed to provide developers with a powerful and easy to use abstraction layer for apps. KRL provides a notation that lets programmers easily complete Internet programming tasks that previously took many lines of code. Event expressions, datasets, and data sources are just a few examples. Because Kynetx apps are hosted, developers are spared operational and maintenance headaches that come with servers.

Using KNS as part of your API Strategy

As I pointed out at the beginning, there's a lot of energy surrounding APIs right now. If you're building a company and considering an API, I'd ask you to consider going beyond the simple API. Consider how an event strategy can complement your API to provide even greater functionality. We'd be happy to consult on that.

If you do decide to incorporate events into your API strategy, there are two ways to use Kynetx to make it even more snappy:

  1. Implement a straight webhook strategy of Web callbacks and a defined interaction protocol
  2. Build a Kynetx endpoint into your product so that raising events that Kynetx understands--and taking actions based on those events--is easy for your developers.

Personally, I think you ought to do both because it's relatively easy. If you go with a straight webhook implementation, we supply a webhook translation service so that your webhooks can be used by Kynetx developers. Using Kynetx as part of your API strategy gives you an instant developer program and makes your API available to Kynetx developers as well. We're happy to help you develop your strategy and work out how Kynetx can help you make your API program a success. Just let us know.

Footnotes:

  1. Actually, you can raise events without specifying an entity, but this is usually done for reasons of configuring an application or some general system. Think of entity-less events as similar to class variables in an object-oriented language.

3:09 PM

September 15, 2010

Come to Kynetx Impact Dev Day this Saturday!

Kynetx

Don't forget that the Kynetx Impact Dev Day is this Saturday, Sept. 18th. A full day of intensive training, brainstorming and app-building for developers. And it's FREE!

If you haven't signed up yet, make sure you sign up today so we can save you a seat.

Can't make it, but want to watch? We'll be streaming the main sessions on the Kynetx Ustream channel.

If you are coming, we'll be having an open sign-up for the App Showcase at 4:00. Have a cool app? We want to see it, so come prepared to show & tell.

3:12 PM

September 14, 2010

The Cost of Fighting Illegal Immigration

Chihuahua

This week NPR is running a story on the unknown price of border convictions. The question: we're "getting tough on illegal immigration," but what is it costing? Turns out, no one knows. But we can guess:

But even tripling the number of Operation Streamline defendants wouldn't come close to meeting the program's stated goal of zero tolerance: prosecuting everyone caught crossing illegally. In the Tucson sector, that would currently be nearly 1,000 prosecutions every weekday -- a quarter-million people a year.

The presiding federal judge for Arizona, John Roll, says it's his job to carry out policy, not to make it. But, Roll says, prosecuting everyone is not possible.

"You can't prosecute all 250,000 people in Arizona. We would have more cases than the rest of the entire country. You would take the resources now for the entire country and just double it and put them in Arizona," he says.

In other words, to prosecute these misdemeanors, Arizona would need to have a federal criminal justice system twice the size of the rest of the country. No one has contemplated what that would cost. There is one estimate of how much it would cost just to detain and hire a lawyer for every illegal immigrant caught entering the Tucson sector: close to $1 billion a year. That estimate was done by the Warren Institute at the University of California, Berkeley law school.

From Border Convictions: High Stakes, Unknown Price : NPR
Referenced Tue Sep 14 2010 07:50:34 GMT-0600 (MDT)

Did you catch that? To prosecute the illegal immigrants in Arizona alone would require a judiciary twice the size of the entire judiciary of the United States. This is to prosecute misdemeanors by people who are just coming into the US to find work and make a better life for themselves. We're not talking about dangerous felons here. We could spend tens of billions of dollars "getting tough."

The problem facing the US is real, but the solution isn't getting tougher on illegal immigration. That is just a money pit--probably more expensive than just letting them come and live with us. The problem is that the US, a wealthy country, shares a long, impossible-to-protect border with a relatively poorer country. Short of moving Mexico somewhere else, we can't make this problem go away. So what to do?

Let me give an analogy. Suppose you're getting on a plane and it's clear, because they're closing the door, that you're the last person. You get to the aisle and see that the plane is completely full. Only one seat left--yours. And that seat is right next to someone who's very overweight. They're spilling into your seat. Not a good situation. What are your choices? Not take the flight, get mad, or snuggle in.

The first doesn't get you where you want to go. The second does nothing productive. The third, while not pleasant, gets you where you want to be with the least fuss. The US can't "not take the flight". Mexico is our neighbor and will remain so. We are getting mad right now and all it's doing is costing lots of money and not solving the problem.

The third choice is our only viable alternative: embrace Mexico and its citizens. Many won't want to, but it's the only real solution. Ultimately the US and Mexico will have to become one. The way to solve the problems of Mexico is to extend to them the solutions of the United States. They need our help fighting drugs and growing their economy. Unless they are really part of the US, we will continue to ignore them. Over time, I think we need to consider how we can make the Mexican states part of these united states. Statehood for Chihuahua!!

8:01 AM

September 10, 2010

PDX Principles

Locked Folder

There was a lot of discussion around Personal Data Stores (PDS) and Personal Data Lockers at IIW East. Every time slot on both days had at least one and sometimes two sessions on the subject. (As an aside, if you're not familiar with IIW, the agenda is created in real time, by the participants, not months in advance by a program committee, so it represents more fully the interests of the participants than a normal conference agenda might.) I'm confident that this will also be a major theme at the upcoming IIW in Mountain View CA in November.

The term itself is a problem. When you say "store" or "locker" people assume that this is a place to put things (not surprisingly). While there will certainly be data stored in the PDS, that really misses its primary purposes: acting as a broker for all the data you've got stored all over the place and managing the metadata about that data. That is, it is a single place, but a place of indirection not storage. The PDS is the place where services that need access to your data will come for permission, metadata, and location. Similarly for services that need to give you data.

Consequently, some have taken to calling it a PDX, where "x" stands for the "variable x." That is, we don't know what to call the last thing, so we'll say "x" and leave it at that.

In the discussions, I started to tease out a few principles that define the PDX and make it something different from just a database where my stuff is. We all have lots of places where data about us is stored and since it's personal data, we might think of them as "personal data stores" but when people at IIW (and elsewhere) use the term, they're talking about something larger and more capable than just a passive database.

Here's a list of a few things that I think distinguish a PDX from just places where your personal data is stored:

  • user-controlled - the user needs to be in control of the data, who has access, and how it is used. Once that data is in my PDX, I make decisions about it. That doesn't mean the data might not also be somewhere else. For example, data about my purchases from Amazon will certainly be stored at Amazon and not under my control. But I might also be emailing the receipts to a service that parses them and puts the data in my PDX for my use.
  • federated - there isn't one place where your data is stored, but multiple places that the data needs to be able to flow between, in a permissioned way. There's no center, just a lot of cooperating systems with my PDX orchestrating the interactions. While Amazon might not give my PDX access to and control over my transactions, my phone company might provide a PDX-capable contact service where I choose to store my contact information.
  • interoperable - various PDX services and brokers have to be able to operate together according to standards to perform their roles. When I take money out of my account at Wells Fargo and deposit it at Chase, I don't lose part of the value because Chase doesn't know how to handle some part of the transaction. The monetary system is interoperable with standards and, sometimes, shims that connect it all together.
  • semantic - a PDX knows more about the data that it holds than existing data stores do. Consider Dropbox. I can put all kinds of things in my Dropbox, but it's syntactic, not semantic. By that I mean that if I want to put healthcare data in Dropbox and control who uses it, I create a folder and put the data in it with specific permissions. The fact that there is a folder with a certain name located at a particular place in the folder hierarchy is purely syntactic. In a semantic world, the data itself is tagged as healthcare data and no matter where it is, it's protected according to the policies I've put in place.
  • portability - a PDX doesn't trap data in proprietary formats. If my phone company is storing my contact data in the cloud and I decide that I want to move it to my own server or another service, I can--from a technical as well as a policy standpoint. Note that this doesn't mean we have to wait until thousands upon thousands of data format specifications get hammered out. Semantic metadata can provide a means of translating from one format to another.
  • metadata management - one of the primary roles of the PDX is managing data about my data. What are the roles I've created? What permissions have I granted as exceptions to the defaults? What semantics surround the various data fields? What data sharing, encoding, and encrypting policies have I created? All of this has to be kept and managed on my behalf in the PDX.
  • broker services - the PDX is a place where the user manages a federated network of data stores. As an example of why this is important, consider the shortcomings of OAuth. If I use an application that needs access to four OAuth mediated APIs, I have to go through the OAuth ceremony with each API provider separately. Now consider that I might have dozens of apps that use a popular API. I have to go through the OAuth ceremony for each of them separately. In short a broker saves us from the N x M explosion of permissioning ceremonies. Similarly for various data services.
  • discoverable - a PDX should provide discoverability for its APIs and schemas so that any application I'm interested in knows how to interact with it. Discoverability protects users from having to completely specify addresses, mappings, and schemas to every application that comes along.
  • automatable and scriptable - a PDX without automation is worse than no PDX at all because it burdens the user rather than saving effort. A PDX will be a player in a larger ecosystem of services. I don't see it as a mere API that allows services and applications to GET and PUT data--it's not WebDAV on steroids. The PDX is an active participant in the greater ecosystem of services that are cooperating on the user's behalf.
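To put rough numbers on the N x M point in the broker services item, here is a back-of-the-envelope calculation in Python. The brokered count of N + M is an assumption made for this sketch (each app authorizes once with the broker, and the broker authorizes once with each API); a real broker could work differently.

```python
def ceremonies_without_broker(apps, apis):
    # Every app goes through the authorization ceremony with every API it uses.
    return apps * apis


def ceremonies_with_broker(apps, apis):
    # Assumed model: one ceremony per app with the broker,
    # plus one ceremony per API between the broker and the provider.
    return apps + apis


apps, apis = 12, 4  # "dozens of apps" and "four OAuth mediated APIs", scaled down
direct = ceremonies_without_broker(apps, apis)
brokered = ceremonies_with_broker(apps, apis)
print(direct, brokered)
```

With 12 apps and 4 APIs that is 48 ceremonies done directly versus 16 under the assumed broker model, and the gap widens as either number grows.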

Surely I've missed some, but this list is a good start. What would you add?

Update: Kaliya wrote up a vision and principles document for personal data stores a month ago. Not surprising to people who know us both, they differ radically in perspective, but are coherent in spirit.

12:22 PM

Referencing and Encoding Metadata

graph

We need data permissions to be as portable as the data itself. So too for all metadata. Over the course of IIW East, I had a revelation (for me) that there's real power in having metadata encoded in the same format as the data itself and, in a related way, allowing self-reference so that the metadata can be referenced from the document it describes. I think I've always believed that, but hadn't really articulated it to myself until yesterday.

Certainly, this idea isn't new. Just look at XML for the largest, recent example. Nearly everything about an XML document is encoded in XML and even, at times, embedded in the document itself--the schema, the signatures, policies, and so on are expressible in XML.

One of the things I like about XDI is that it is capable of encoding and embedding metadata in the same format as the data. For example, XDI link contracts are the permissions model of what can be shared with whom. Link contracts are just an XDI document. They can be referenced from the XDI document they describe (or anywhere else).

Theoretically, this makes sense. Self referential documents have a power that merely hierarchical document structures don't. Trees are a subset of graphs. Non-restricted graphs can, by definition, describe more than trees.
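A tiny Python sketch of the tree-versus-graph distinction: two documents modeled as adjacency lists, one purely hierarchical and one where the permissions node points back at the document it describes. The node names and the `is_tree` helper are invented for illustration and are not XDI syntax.

```python
def is_tree(edges, root):
    """Return True if every node reachable from root is reached exactly once."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            return False  # reached a second time: shared node or cycle
        seen.add(node)
        stack.extend(edges.get(node, []))
    return True


# Purely hierarchical: the contract hangs off the document and that's all.
hierarchical = {"doc": ["body", "contract"], "body": [], "contract": []}

# Self-referential: the contract also refers back to the document it governs.
self_referential = {"doc": ["body", "contract"], "body": [], "contract": ["doc"]}

print(is_tree(hierarchical, "doc"), is_tree(self_referential, "doc"))
```

The second structure says something the first cannot: which document the contract governs. It is no longer a tree, so it needs a format that can express general graphs.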

Practically, using the same encoding formats saves time in learning, building, and buying new tooling. Allowing self-reference lets you say more and create structures that you otherwise can't. Here's to metadata!

9:03 AM

September 9, 2010

Changes for IIW

IIW Logo

There are a couple of changes coming to IIW, one pragmatic and one philosophical. First the pragmatic...

Due to some scheduling snafus, the Computer History Museum is not available during the time we'd advertised for IIW XI (Nov 9-11). After much thought and discussion we've determined that the best course of action is to move it to another day rather than change the venue. CHM has many things to like and it's become the workshop's home. So, we're moving IIW XI to November 2-4.

We realize the 2nd is election day and hope you'll vote early. We also realize this causes some conflicts for people--personally, I'll reschedule or simply miss three things I was counting on. We're sorry. Refunds are available if you've already paid and can't make it now. Hopefully this is far enough out that it won't cause too many problems.

Now, for the philosophical. As I blogged after IIW X, it's clear that the conversation at IIW is continuing to evolve and there is a lot of interest around personal data, authorization, and related work. I view this as a natural outcome of people finally getting some basic technologies set up around wide-area identity. The conversation is moving up the stack, as it were.

IIW ultimately had a choice to make. We can stand still and serve as a workshop for introducing newbies to Internet identity or we can move forward as the conversation shifts to the interesting questions that Internet identity brings up. We're choosing the latter path. IIW can't serve both groups effectively and ultimately we'd rather be a smaller workshop closer to the "get your hands dirty" attitude that has characterized IIW from the start.

Consequently, we're going to stop calling IIW the "Internet Identity Workshop" and simply refer to it as IIW or the IIW Workshop. The conversation has outgrown the name, but not the underlying philosophy that the end user--people--are at the center of technologies which will propel the most important changes coming on the Internet. People centric technologies, like OpenID, Information Cards, OAuth, UMA, and others, are the heart of the discussion and will remain so.

Based on the discussions we had at the last IIW and the ones happening now at IIW East, I'm sure that personal data stores (PDS), vendor-relationship management (VRM), and related ideas are going to be a big topic at the upcoming IIW in November. I'm anxious to see that conversation develop and hope that you'll be able to join us. Please register for IIW XI here.

3:42 PM

New Twitter Spam Tactic?

Today, someone (or some bot) tweeted something that had nothing to do with me, but had my Twitter handle in it.

Twitter spam example

The interesting thing about this is that the URL shortener is smart and goes to Amazon when you first click it, but thereafter goes to another site (something about Tattoos for Geeks--not sure what the point is). If you just go to the URL shortener's base URL, you get redirected to bit.ly.

This seems to be a new tactic to keep Twitter from finding spam: disguise the links so that Twitter doesn't see the real target.

7:22 AM

September 2, 2010

Twitter and the OAuthalypse: A RESTful Misfire

Fail Whale

Yesterday was the OAuthalypse--the day when Twitter stopped accepting HTTP Basic authorizations on their API. I had a few apps break--like almost everything I've done with Twitter. To get them back working I'll have to spend some time on each moving them over to OAuth. For some that won't be hard--they're already using a library that supports OAuth. For others it will be more work. All of them are single user apps (like the UtahPolitics retweeter) and so will use the OAuth single token pattern.

The reason for moving to OAuth is so that apps won't need to ask users for their Twitter password or store it anymore. Twitter had a bad experience with this and that led to the decision to go nuclear on usernames and passwords on their API. This is a clear win for delegated authorization protocols like OAuth and the more capable ones that are sure to follow. What's more, it trains users to use a delegated authorization scheme. I love it.

But what's curious about the move is that in every case (except the retweeter) my apps are not updating information. These are read-only apps that simply read a friend timeline for a particular user. I can't figure out why any authorization is needed at all. Since who I follow is public information, it would be simple enough to reconstruct my friend timeline from available information. My theory is that Twitter uses authentication on read-only data as a substitute for a poorly designed API. That is, they use the authentication as a substitute for merely allowing me to specify whose timeline I want to see.

This is classic REST stuff and it seems that Twitter got it wrong. Thousands of apps are failing today because Twitter requires them to authorize when they don't really need to. Am I wrong?

9:55 AM