Using Schema.org Microdata in KRL


Summary

Schema.org microdata has been proposed as a way of encoding semantic data in HTML markup. Using semantic data from within KRL allows rules to be more general and loosely coupled. I recently used a jQuery library to gather Schema.org product microdata from a page and forward it to the KRL engine for processing. This blog post explains how I did it.

In my blog post, Anonymous eCommerce: Building a Real 4th Party Offer Application with Kynetx, I mentioned that I used HTML microdata, specifically the Schema.org product microdata, as the means of getting product data from the page to the app. A few people have asked for more details on that.

Microdata, like microformats and RDFa, is a means of encoding semantic mark-up inside HTML. I'm not going to try to take a stand on which is better for a particular job (see the results of a search for "microdata vs microformats vs rdfa" if you're curious). Suffice it to say that the Schema.org product microdata met my needs and I went with it.

The idea, for the app I was building, was that the app would place a "want" button on product pages and when the shopper clicked it, the product data would be sent with the click event as attributes. To do this in a general way required programmatic access to the product data. I could have assumed that everyone had a product API, but that's not very general--the app would only be able to work on sites that had been integrated with it. A more loosely coupled solution needs a standard way of getting product data and the Schema.org product microdata is a good answer. Any site can easily add it by updating their product templates and then it's immediately available. Given that Google and others have announced that they'll use microdata to determine relevancy for search results, it's not a stretch to imagine that ecommerce sites will mark up their pages with more semantic data.

I was fortunate to find a jQuery plugin for processing microdata that Philip Jagenstedt wrote. I modified it slightly for my purposes:

  1. I combined two files, jquery.microdata.js and jquery.microdata.json.js, by moving the function from the second into the first.
  2. I modified them to work with the KRL runtime by wrapping them in a closure that is applied to $K so that they extend the KRL runtime-included copy of jQuery.
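The second modification is the usual jQuery plugin closure pattern, just applied to $K instead of the global jQuery object. Here is a minimal sketch of the idea. The stub $K object and the placeholder body of the json function are assumptions so the snippet runs on its own; the real plugin walks the DOM and the real $K is supplied by the KRL runtime.

```javascript
// Stand-in for the KRL runtime's copy of jQuery (assumption for this sketch;
// inside a ruleset $K already exists and this line would not be needed).
var $K = { fn: {} };

(function ($) {
  // Inside the closure the plugin code can keep using `$` unchanged,
  // but everything it defines lands on the object passed in, i.e. $K.
  $.microdata = $.microdata || {};

  // Hypothetical placeholder for the function merged in from
  // jquery.microdata.json.js; it only echoes the selector here.
  $.microdata.json = function (selector) {
    return { items: [], selector: selector };
  };
})($K);
```

Because the plugin only ever sees the argument it was handed, it never touches any other copy of jQuery that the page itself might have loaded.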

The final result was included in the KRL ruleset by adding the following line to the meta block:

use javascript 
        resource "http://www.windley.com/want/jquery.microdata.js"

We place the "want" button on the page using a KRL action:

after("#buy_button", want_button);

But rather than using the watch action to attach a listener to the button, I emitted the following JavaScript:

emit <<
$K("#want_button").click(function(){
  var jsonText = 
       $K.microdata.json("[itemtype='http://schema.org/Product']");
  var prodprops = jsonText.items[0].properties;
  var offer = prodprops.offers[0].properties;
  var seller = offer.seller[0].properties;
  var app = KOBJ.get_application("a16x108");
  app.raise_event("product_found", 
                  {"prodname":prodprops.name[0],
                   "modelno":prodprops.model[0],
                   "produrl":prodprops.url[0],
                   "price":offer.price[0],
                   "shipping":offer.shipping[0],
                   "seller":JSON.stringify(seller.name[0], 
                                            undefined, 2)
                  });
});
>>

The final two statements in this JavaScript raise an event called product_found to the KRL engine when the "want" button is clicked. The event attributes are gathered using the microdata library. We get the JSON data using the $K.microdata.json function, passing a selector that matches the nodes in the DOM whose itemtype attribute has the value http://schema.org/Product.

The reference returns an array of items that have the right itemtype. In my case, I only cared about the first, so I'm referencing items[0]. Note that the data is arranged hierarchically in a manner that matches the Schema.org product microdata specification. I get the product properties from the first item, get the offer from that, and get the seller from the offer.
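To make the hierarchy concrete, here is a small sketch of the shape the click handler walks. The sample object is hand-written for illustration (the product, prices, and seller are made up), and productAttributes is a hypothetical helper that simply repeats the lookups from the handler; the real data comes from the microdata library reading the page.

```javascript
// Illustrative data only: items -> properties -> offers -> seller.
// Every property value is an array, which is why the handler indexes with [0].
var sample = {
  items: [{
    properties: {
      name: ["Example Widget"],
      model: ["EW-100"],
      url: ["http://example.com/widget"],
      offers: [{
        properties: {
          price: ["19.99"],
          shipping: ["4.00"],
          seller: [{
            properties: { name: ["Example Store"] }
          }]
        }
      }]
    }
  }]
};

// Hypothetical helper mirroring the lookups in the click handler.
function productAttributes(data) {
  var prodprops = data.items[0].properties;      // first product on the page
  var offer = prodprops.offers[0].properties;    // first offer for that product
  var seller = offer.seller[0].properties;       // seller of that offer
  return {
    prodname: prodprops.name[0],
    modelno: prodprops.model[0],
    produrl: prodprops.url[0],
    price: offer.price[0],
    shipping: offer.shipping[0],
    seller: seller.name[0]
  };
}

var attrs = productAttributes(sample);
```

Note that this sketch does no checking: a page that leaves out the offer or the seller would cause an error at the corresponding lookup, so a more defensive version would test each level before descending.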

The rule that responds to the product_found event merely pulls the data from the event attributes:

rule process_product {
  select when web product_found
  pre {
    price = event:param("price");
    shipping = event:param("shipping");
    final_price = price+shipping;
  ...
  }
...
}

If more sites start encoding semantic data in microdata, I'll be tempted to find a generalized way of allowing access inside KRL. The easiest way to do this would be to allow for a microdata specification in the watch action and then just return all the JSON for processing inside KRL using JSONPath expressions. I'd also like to enable discovery, so that an event is raised when microdata is found, indicating what kind it is. That way, for example, you might automatically place a "want" button on any page that has product microdata.