Looping in KRL



One of the design goals of the Kynetx Rule Language (KRL) is to make it easy to use online data sources to augment the user's experience in the browser. Using interesting data implies some kind of iteration. KRL supports both implicit and explicit looping. This article discusses looping in KRL and how looping in a rule language like KRL differs from how you might use it in an imperative language.

First, recognize that the ruleset itself is a loop. You should imagine the ruleset as a big switch statement inside a loop that is executed over and over again as the user visits pages. Persistent variables allow you to store data that can be used across separate evaluations of this loop. After each loop execution, the client is consulted for input before the loop is evaluated again.
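As a rough mental model only, here is that idea sketched in Python rather than KRL. Every name in it (evaluate_ruleset, persistent, the example domains) is hypothetical and chosen for illustration; it is not how the KRL engine is implemented.

```python
# Conceptual sketch only: Python standing in for the rules engine.
# All names and domains here are hypothetical.

def evaluate_ruleset(domain, persistent):
    """One pass through the 'big switch' for a single page visit."""
    # Persistent state survives from one evaluation to the next.
    persistent["visits"] = persistent.get("visits", 0) + 1

    actions = []
    if domain == "first.example":
        actions.append("notify on first.example")
    elif domain == "second.example":
        actions.append("notify on second.example")
    return actions


persistent = {}

# The outer loop: each page the user visits triggers one evaluation.
pages_visited = ["first.example", "other.example", "second.example"]
results = [evaluate_ruleset(domain, persistent) for domain in pages_visited]
```

The point of the sketch is that the loop lives outside the rules: the rules only decide what to do for the current page, and the persistent dictionary plays the role of persistent variables.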

In addition to this big, implicit loop of the ruleset, there are other ways to loop in KRL. With the recent addition of foreach, you can now loop over the elements of an array and fire a rule for each item. I'm sure many KRL programmers, having cut their teeth on imperative languages like Java, PHP, and so on, will be excited to use it. The problem is that it probably doesn't work exactly the way your experience with imperative languages might lead you to expect. To see how, consider the following example.

Suppose that you have a data source that lists a number of sites by URL and gives some data about each of them. Further, suppose you'd like to annotate those sites with the data out of the dataset so that you don't have to republish the ruleset each time it changes. Consider this data:

{count: 3
 value: {pubDate: "Fri, 11 Dec 2009 10:24:37 -0800"
         generator: http://pipes.yahoo.com/pipes/
         items: [
          {page: "baconsalt.com"
           content: "Hello World. Go Bacon."
           header: "Bacon Salt Test"
          }
          {page: "craigburton.com"
           content: "Hello World. Burtonian methods."
           header: "Craig Burton Test"
          }
          {page: "kynetx.com"
           content: "Hello World. The World According to Kynetx"
           header: "Kynetx Test"
          }
         ]
        }
}

Using this data, we'll place a notification box on the three sites listed in the page field. Each notification will use the content and header values that the dataset associates with that page.

For purposes of what follows, assume we've added this data feed to the globals section of the ruleset like so:

dataset site_data 
          <- "http://pipes.yahoo.com/p...2ec&_render=json";

items = site_data.pick("$.value.items");  

Note that we've also used the pick operator to pick out just the array of items in the dataset.

Your first attempt, using foreach, might look something like this:

rule using_foreach is active {
  select using "." setting () 
    foreach items setting (d)

  pre {
    h = d.pick("$.header") + " using foreach";
    c = d.pick("$.content");
    domain = page:url("domain");
  }

  if(domain eq d.pick("$.page")) then
    notify(h,c);
}

This does the job, looping through each item (binding its value to d) and using the premise of the rule to check that the current domain is applicable before placing the notification.
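To make the cost of that approach concrete, here is a Python analogue of the rule above. It is a sketch under stated assumptions, not KRL: the function name using_foreach and the checks counter are hypothetical additions for illustration, and the items list is copied from the sample data.

```python
# Conceptual sketch only: a Python analogue of the using_foreach rule.
# The checks counter is a hypothetical addition to show the work done.

items = [
    {"page": "baconsalt.com",
     "content": "Hello World. Go Bacon.",
     "header": "Bacon Salt Test"},
    {"page": "craigburton.com",
     "content": "Hello World. Burtonian methods.",
     "header": "Craig Burton Test"},
    {"page": "kynetx.com",
     "content": "Hello World. The World According to Kynetx",
     "header": "Kynetx Test"},
]


def using_foreach(domain, items):
    """Evaluate the rule once per item; fire only when the premise holds."""
    checks = 0
    fired = []
    for d in items:                      # foreach items setting (d)
        checks += 1
        h = d["header"] + " using foreach"
        c = d["content"]
        if domain == d["page"]:          # the premise of the rule
            fired.append((h, c))         # stands in for notify(h, c)
    return checks, fired


checks, fired = using_foreach("craigburton.com", items)
```

In this sketch the rule body runs three times for a single page view but produces only one notification, which is the overhead the foreach version carries.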

But there's a simpler, more efficient way to accomplish this same thing:

 
rule without_foreach is active {
  select using ".*" setting ()

   pre {
    dom = page:url("domain");
    content = 
     site_data.pick("$..items[?(@.page eq '"+dom+"')].content");
    header = 
     site_data.pick("$..items[?(@.page eq '"+dom+"')].header")
       + " without foreach";
  }

  notify(header,content);
}

Note that there's no foreach and no premise on the rule action. Why does it work?

First, remember that for both of these rules we had to set the dispatch section of the ruleset to say which sites it was applicable to:

dispatch {
  domain "baconsalt.com"
  domain "craigburton.com"
  domain "kynetx.com"
}

This is required and, at least for now, can't be done using a wildcard or driven from a dataset due to security and performance concerns. So, the sites that the ruleset fires on are already determined by the dispatch. That does away with the need for the check in the premise.

The second rule takes advantage of that by using the domain in constructing the pick pattern to select the right element of the array of items. The pick is the second secret to how the above rule works: the pick is doing an implicit loop over the data and only selecting the items where page matches the domain. Consequently, we don't need the foreach.
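If it helps to see the implicit loop spelled out, the filtered pick behaves roughly like a list comprehension. The following is a Python analogue, not KRL: pick_field is a hypothetical helper, and the sketch always returns a list of matches, so it does not model how KRL's pick shapes a single result.

```python
# Conceptual sketch only: a plain-Python analogue of the filtered pick
#   $..items[?(@.page eq '<dom>')].<field>
# pick_field is a hypothetical helper, not part of KRL.

items = [
    {"page": "baconsalt.com",
     "content": "Hello World. Go Bacon.",
     "header": "Bacon Salt Test"},
    {"page": "craigburton.com",
     "content": "Hello World. Burtonian methods.",
     "header": "Craig Burton Test"},
    {"page": "kynetx.com",
     "content": "Hello World. The World According to Kynetx",
     "header": "Kynetx Test"},
]


def pick_field(items, dom, field):
    """Return the given field of every item whose page equals dom."""
    # The loop is hidden inside the comprehension, just as it is
    # hidden inside the pick pattern.
    return [item[field] for item in items if item["page"] == dom]


header = pick_field(items, "kynetx.com", "header")
content = pick_field(items, "kynetx.com", "content")
```

The rule itself is evaluated once; the iteration happens inside the expression that selects the data.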

Now, let's look at an example where using foreach to loop explicitly based on data might be a good thing. The foreach causes the same rule to be fired multiple times in a single ruleset evaluation. In general, you're not going to do that with certain actions like notify, redirect, or alert. But you might with an append, replace, and so on. Suppose that I had a dataset that looked like this:

{"desc": "Data set to test foreach",
 "replacements": [
    {"selector":"#categories",
     "text":"This was the cloud tag"
    },
    {"selector":"#friends",
     "text":"This was a list of friends"
    },
    {"selector":".action-stream",
     "text":"This is where the action stream was"
    }
   ]
}

In this dataset, the selector is the jQuery selector pattern for elements on a Web page and the text is what we're going to use with the action. Given this, the following rule uses the items in this dataset to prepend the text in each item to the element on the page that matches the associated selector pattern:

rule prepend is active {
  select using "windley.com" setting ()
    foreach replacements.pick("$.replacements") setting (r)

  pre {
    sel = r.pick("$.selector");
    new_text = r.pick("$.text");
  }

  prepend(sel,new_text);
}

This changes the same page multiple times according to the contents of the dataset.
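Here is the same idea as a Python analogue, again a sketch rather than KRL. The function prepend_rule is hypothetical, and the tuples it returns merely stand in for the prepend actions that would be sent to the browser.

```python
# Conceptual sketch only: foreach fires the same rule once per element
# within a single ruleset evaluation. prepend_rule is hypothetical.

replacements = [
    {"selector": "#categories", "text": "This was the cloud tag"},
    {"selector": "#friends", "text": "This was a list of friends"},
    {"selector": ".action-stream",
     "text": "This is where the action stream was"},
]


def prepend_rule(r):
    """One firing of the rule, with r bound to a single array element."""
    sel = r["selector"]
    new_text = r["text"]
    return ("prepend", sel, new_text)    # stands in for prepend(sel, new_text)


# There is no premise here, so every element produces an action.
actions = [prepend_rule(r) for r in replacements]
```

Unlike the notification example, every firing does useful work, which is why explicit looping fits this case.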

Using data often requires loops. As we've seen, there are multiple ways to loop in KRL: a ruleset is a loop, each rule can loop explicitly using foreach, and implicit looping is accomplished using pick.