December 30, 2009

Build 384: Annotating Search Results with Large Datasets

Kynetx Logo

Recently Azigo and the Better Business Bureau launched the BBB app, which helps people locate BBB-accredited businesses:

The BBB's Accredited Business Locator is powered by Azigo's free, downloadable browser plug-in and displays the "BBB Accredited Business" seal whenever consumers search for products or services using popular search engines such as Google, Yahoo! and Bing. This seal indicates that a business is reputable and offers one-click access to the BBB report for that business. This provides maximum consumer protection and minimizes the risk of online fraud.

This app is powered by Kynetx and the techniques necessary to make it work have been factored into the Kynetx Rule Language (KRL) so that they're available to everyone. The BBB app required two things that KRL didn't support well:

  1. The BBB data set of accredited businesses was very large, running to millions of records
  2. The BBB was interested in annotating local search results

The problem with large data sets is that previous versions of KRL had an action called annotate_search_results that relied on preloaded data. The Kynetx Network Service (KNS) ensured that the data was available based on information from the KRL program. With millions of records, we could hardly load the entire data set, and there was no way to segment it, since you don't know what the user is going to search for (and thus which results are relevant) until they do it. Enter remote.

We've added a new option to annotate_search_results called remote (optional parameters to actions are specified using a with clause). The remote option specifies the URL of a remote data set. When it's used, the selector function is ignored and can be left out of the annotate_search_results call. Instead, the remote data URL is called after the search results are available and is passed the relevant data from the results. The remote service uses that data to query the database and returns a JSONP response that is evaluated to annotate the results.
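To make the remote flow a bit more concrete, here is a rough sketch, in Python rather than KRL, of what a remote service might do with the data it is handed. The lookup table, the field names, the callback name, and the function itself are all hypothetical; the actual BBB service and the exact data KNS passes to it aren't shown in this post.

```python
import json

# Hypothetical stand-in for the remote database of accredited businesses.
# The real service's schema is not described here.
ACCREDITED = {
    "example-plumbing.com": {
        "name": "Example Plumbing",
        "report": "https://example.org/report/1",
    },
}


def annotate_response(callback, domains):
    """Return a JSONP string covering only the domains found in the data set."""
    matches = {d: ACCREDITED[d] for d in domains if d in ACCREDITED}
    # JSONP is a JSON payload wrapped in a call to the named callback.
    return f"{callback}({json.dumps(matches, sort_keys=True)});"


print(annotate_response("cb123", ["example-plumbing.com", "unknown.example"]))
```

The point of the sketch is only that the lookup happens per search, against the handful of results on the page, so the full data set never has to be loaded ahead of time.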

Given that this runs after the search results page has loaded, you might think it would be slow, but our experience with BBB has convinced us that it works just fine.

We've also added an annotate_local_search_results action that works just like annotate_search_results, but is specifically tailored to annotate the local results in each search engine.

You can find more information about search annotation in KRL in the documentation. You can also download and try out the BBB app to see how it works for yourself. We've built a sample app so that you can see how these two new actions are used with remote data.

One other thing: build 384 included the let_it_snow() action because it's winter and snowy Web sites can be fun.

4:09 PM

December 24, 2009

Quit Your Job and Get a PhD

Every once in a while I run across this question from people wanting to get a PhD: "I am interested in going on to the PhD level but I have run into a wall. Most of the traditional schools I have looked into want me to quit my job and attend full-time. I can't do this because of my family and house payment."

There's a reason schools want you to be a full time student at the PhD level: it's the only way it will work. Getting a PhD isn't like getting a BS or MS; it's an apprenticeship to become a professor. One of the most important things you will learn is how to do research and how to publish your results. That's more than a full time job. You won't have time for anything else.

Remember, you're asking a professor to make a BIG commitment to you when he or she takes you on as a student. Understandably, they require that you make a commitment back. Speaking from experience, having a PhD student is like taking on another child. I wouldn't do it for anyone not willing to put some skin in the game.

Your major professor will be the most important factor in determining what you get out of your PhD. Pick your professor, not the school. The good thing about the top 40 schools (roughly the top 20% of PhD-granting institutions for Computer Science) is that they'll have multiple stars. You'll likely find someone there who you want to work with and who wants to work with you. But a school further down the list that has a professor doing work you're interested in shouldn't be overlooked.

Whereas undergraduate and, to a large extent, master's-level study are education on a mass-production basis, studying for a PhD is a completely individualized experience, customized to the person and their interests. The resources necessary to create a single PhD graduate can probably support dozens of undergraduates. You simply can't expect the department to commit those kinds of resources to someone who is unwilling to commit to the course of study on a full-time basis.

No online PhD program will give you what you want or need: the mentoring that comes from a PhD program. Skip them.

It comes down to this: you can't apprentice one place and work somewhere else full time. If you really want a PhD, then find a way to quit your job, sell your house, and go to school. Keep in mind that in school, you'll get paid as a TA or RA. That plus some faith is enough to get most people through. If you do otherwise, you will cheat yourself out of one of the most tremendous learning experiences that anyone can have. I've never been sorry I got a PhD. I heartily recommend it to the curious and committed.

11:27 AM

December 22, 2009

Service Oriented Architecture and Uncle Walter

This little slideshow from Michael Bell is an entertaining metaphor that introduces the concepts of service oriented architectures.

The idea is that Uncle Walter has a business that is set up as silos the way most organizations set up their business processes (via their IT systems). He solves his problem by applying SOA principles.

I think some people may object and say that Bell only mentions business processes--what does that have to do with architectures? These days, your business processes and your IT systems architecture are inseparable. You can't fix one without fixing the other.

10:05 AM

December 17, 2009

Burtonian Tutorials on the Kynetx Rule Language

Craig Burton has been busy the last few weeks cranking out tutorials on how to use KRL--the Kynetx Rule Language--in certain situations.

  1. What would a programming language introduction be without a "hello world" example? Craig delivers. The first tutorial anyone ought to watch is Hello World, which gives a video view of the instructions here.
  2. After that simple introduction, Craig goes right to the heart of the features that are important for creating useful Kynetx apps. The second tutorial is on External Data. In this tutorial, Craig extends the Hello World tutorial to use an external file as the source of the data displayed.
  3. The second tutorial's data is just HTML--non-structured--so the third tutorial, Using JSON Data, rectifies that by redoing the program with structured data. This tutorial contains some interesting techniques like using Yahoo! Pipes to convert CSV data into JSON. Craig put notifications on three Web sites based on the data in the JSON.
  4. The fourth tutorial, Pipes, Pick and Google Docs, shows how to place the data in a Google spreadsheet, turn it into JSON, and use KRL's support for JSONPath (think XPath for JSON) to grab data out of the JSON and use it in augmenting three sites.
  5. The third and fourth tutorials used three rules to put notifications on three Web sites. That's not really making good use of the data, so the fifth tutorial, Dynamic Data and Dynamic Rule, fixes that by using the pick operator in a more sophisticated way to pick the right data out of the JSON using the domain of the site that the rule is firing on. This way a single, data-driven rule does the whole job. I wrote more about how this works last week.

By the time you're done watching these five tutorials, each around 5 minutes, you'll have a good idea how to use structured data in augmenting Web pages. Craig plans on continuing the series and I'm anxious to see where he takes it. I'd love to see more detailed looks at actions, callbacks, postludes, and persistent variables.

10:02 AM

December 15, 2009

Free Pizza and Kynetx on Wednesday

On Wednesday we're going to have a little dev party at Kynetx for anyone who wants to stop by, ask questions, learn how to program in KRL, or just hang out. The Kynetx development team will be there along with other developers who are using KRL. Come by around 5pm and we'll stick around at least until 7, later if people want. Here's the address:

3098 Executive Parkway
Suite 280
Lehi, UT 84043

Suite 280 is in the southeast corner of the 2nd floor. Here's a link to a Google map. I hope you'll stop by, check out our cool new IKEA lamps, and eat some pizza with us.

9:52 AM

December 11, 2009

Looping in KRL

One of the design goals of the Kynetx Rule Language (KRL) is to make it easy to use online data sources to augment the user's experience in the browser. Using interesting data implies some kind of iteration. KRL supports both implicit and explicit looping. This article discusses looping in KRL and how looping in a rule language like KRL differs from how you might use it in an imperative language.

First, recognize that the ruleset itself is a loop. You should imagine the ruleset as a big switch statement inside a loop that is executed over and over again as the user visits pages. Persistent variables allow you to store data that can be used across separate evaluations of this loop. After each loop execution, the client is consulted for input before the loop is evaluated again.
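If it helps to see that mental model spelled out, here is a small Python sketch of it. This is only an analogy, not how KNS is implemented: the rule table, the patterns, and the visits counter are made up for illustration.

```python
import re

# Hypothetical rules as (select pattern, action) pairs; names are made up.
RULES = [
    (r"kynetx\.com", lambda state: f"notify: page view {state['visits']}"),
    (r"baconsalt\.com", lambda state: "notify: go bacon"),
]


def evaluate(url, state):
    """One pass through the ruleset for a single page view."""
    # The persistent variable survives from one evaluation to the next.
    state["visits"] = state.get("visits", 0) + 1
    return [action(state) for pattern, action in RULES if re.search(pattern, url)]


def browse(urls):
    """The implicit outer loop: each page the user visits triggers one pass."""
    state = {}
    return [evaluate(url, state) for url in urls]


print(browse(["http://kynetx.com/", "http://example.com/", "http://kynetx.com/blog"]))
```

Each page view is one trip around the loop, and the state dictionary plays the role of the persistent variables that carry information between trips.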

In addition to this big, implicit loop of the ruleset, there are other ways to loop in KRL. With the recent addition of foreach, you can now loop over elements of an array and fire a rule for each item. I'm sure many KRL programmers, having cut their teeth on imperative languages like Java, PHP, and so on, will be excited to use it. The problem is that it probably doesn't work exactly how your past experience with imperative programming languages might lead you to expect. To see how, consider the following example.

Suppose that you have a data source that lists a number of sites by URL and gives some data about each of them. Further, suppose you'd like to annotate those sites with the data out of the dataset so that you don't have to republish the ruleset each time it changes. Consider this data:

{"count": 3,
 "value": {"pubDate": "Fri, 11 Dec 2009 10:24:37 -0800",
           "generator": "http://pipes.yahoo.com/pipes/",
           "items": [
            {"page": "baconsalt.com",
             "content": "Hello World. Go Bacon.",
             "header": "Bacon Salt Test"
            },
            {"page": "craigburton.com",
             "content": "Hello World. Burtonian methods.",
             "header": "Craig Burton Test"
            },
            {"page": "kynetx.com",
             "content": "Hello World. The World According to Kynetx",
             "header": "Kynetx Test"
            }
           ]
          }
}

Using this data, we'll place a notification box on the three sites listed in the page field. The notification is to use the content and header data out of the dataset associated with the page.

For purposes of what follows, assume we've added this data feed to the globals section of the ruleset like so:

dataset site_data 
          <- "http://pipes.yahoo.com/p...2ec&_render=json";

items = site_data.pick("$.value.items");  

Note that we've also used the pick operator to pick out just the array of items in the dataset.

Your first attempt, using foreach, might look something like this:

rule using_foreach is active {
  select using "." setting () 
    foreach items setting (d)

  pre {
    h = d.pick("$.header") + " using foreach";
    c = d.pick("$.content");
    domain = page:url("domain");
  }

  if(domain eq d.pick("$.page")) then
    notify(h,c);
}

This does the job, looping through each item (binding its value to d) and using the premise of the rule to check that the current domain is applicable before placing the notification.

But there's a simpler, more efficient way to accomplish this same thing:

 
rule without_foreach is active {
  select using ".*" setting ()

   pre {
    dom = page:url("domain");
    content = 
     site_data.pick("$..items[?(@.page eq '"+dom+"')].content");
    header = 
     site_data.pick("$..items[?(@.page eq '"+dom+"')].header")
       + " without foreach";
  }

  notify(header,content);
}

Note that there's no foreach and no premise on the rule action. Why does it work?

First, remember that for both of these rules we had to set the dispatch section of the ruleset to say which sites it was applicable to:

dispatch {
  domain "baconsalt.com"
  domain "craigburton.com"
  domain "kynetx.com"
}

This is required and, at least for now, can't be done using a wild card or driven from a dataset due to security and performance concerns. So, the sites that the ruleset fires on are already determined by the dispatch. That does away with the need for the check in the premise.

The second rule takes advantage of that by using the domain in constructing the pick pattern to select the right element of the array of items. The pick is the second secret to how the above rule works: the pick is doing an implicit loop over the data and only selecting the items where page matches the domain. Consequently, we don't need the foreach.
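For readers more at home in imperative languages, the two rules above can be modeled in Python to show that they reach the same result. This is a sketch under my own assumptions: the pick helper here is a plain list filter standing in for the JSONPath pattern, returning the first match is a simplification, and the " using foreach" and " without foreach" header suffixes are left out so the outputs can be compared directly.

```python
ITEMS = [
    {"page": "baconsalt.com", "content": "Hello World. Go Bacon.",
     "header": "Bacon Salt Test"},
    {"page": "craigburton.com", "content": "Hello World. Burtonian methods.",
     "header": "Craig Burton Test"},
    {"page": "kynetx.com", "content": "Hello World. The World According to Kynetx",
     "header": "Kynetx Test"},
]


def using_foreach(domain):
    """Explicit loop: consider every item, let the premise filter."""
    notifications = []
    for d in ITEMS:                      # foreach items setting (d)
        if domain == d["page"]:          # the rule's premise
            notifications.append((d["header"], d["content"]))
    return notifications


def pick(items, domain, field):
    """Stand-in for the filtered pick; taking the first match is a simplification."""
    matches = [item[field] for item in items if item["page"] == domain]
    return matches[0] if matches else None


def without_foreach(domain):
    """Single evaluation: the filter inside pick does the looping."""
    return (pick(ITEMS, domain, "header"), pick(ITEMS, domain, "content"))


for domain in ("baconsalt.com", "craigburton.com", "kynetx.com"):
    assert using_foreach(domain) == [without_foreach(domain)]
```

The loop hasn't gone away in the second version; it has moved inside the filter, which is the sense in which pick loops implicitly.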

Now, let's look at an example where using foreach to loop explicitly based on data might be a good thing. The foreach causes the same rule to be fired multiple times in a single ruleset evaluation. In general, you're not going to do that with certain actions like notify, redirect, or alert. But you might with an append, replace, and so on. Suppose that I had a dataset that looked like this:

{"desc": "Data set to test foreach",
 "replacements": [
    {"selector":"#categories",
     "text":"This was the cloud tag"
    },
    {"selector":"#friends",
     "text":"This was a list of friends"
    },
    {"selector":".action-stream",
     "text":"This is where the action stream was"
    }
   ]
}

In this dataset, the selector is the jQuery selector pattern for elements on a Web page and the text is what we're going to use with the action. Given this, the following rule uses the items in this dataset to prepend the text in each item above to the element on the page that matches the associated selector pattern:

rule prepend is active {
  select using "windley.com" setting ()
    foreach replacements.pick("$.replacements") setting (r)

  pre {
    sel = r.pick("$.selector");
    new_text = r.pick("$.text");
  }

  prepend(sel,new_text);
}

This changes the same page multiple times according to the contents of the dataset.
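As a rough Python model of what that rule does (not KRL, and not how the browser runtime works), treat the page as a mapping from selector to a list of child text nodes. That page representation and the function name are assumptions made purely for illustration.

```python
REPLACEMENTS = [
    {"selector": "#categories", "text": "This was the cloud tag"},
    {"selector": "#friends", "text": "This was a list of friends"},
    {"selector": ".action-stream", "text": "This is where the action stream was"},
]


def prepend_rule(page, replacements):
    """Apply the same prepend action once per item in the dataset."""
    # Copy so the caller's page is left untouched.
    new_page = {sel: list(children) for sel, children in page.items()}
    for r in replacements:               # foreach ... setting (r)
        sel = r["selector"]
        new_text = r["text"]
        if sel in new_page:              # only elements present on the page change
            new_page[sel].insert(0, new_text)
    return new_page


page = {"#categories": ["tag cloud"], "#friends": ["alice", "bob"], "#footer": ["fine print"]}
print(prepend_rule(page, REPLACEMENTS))
```

Unlike the notification example, every iteration here does useful work on the same page, which is why an explicit loop earns its keep.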

Using data often requires loops. As we've seen, there are multiple ways to loop in KRL: a ruleset is a loop, each rule can loop explicitly using the foreach, and implicit looping is accomplished using pick.

7:27 PM

December 7, 2009

Ford Sync and the iPhone

Ford F-150 Lariat

On Black Friday I bought a big, expensive, mobile iPhone docking station called a Ford F-150 pickup. Ford, in an effort to compete with GM's OnStar system, co-developed a system called Sync with Microsoft. Sync is available as an option in most Ford vehicles and is standard in certain models (like my Lariat). My vehicle has the optional Sony music system as well, but not the $2500 touch-screen navigation system. That just seemed like something else to break.

Sync provides phone integration, support for external music players, vehicle status reporting, and navigation (turn-by-turn directions). The whole thing has an excellent voice interface that I prefer to the buttons. In fact there are some things (like directions) I don't even know how to do with the buttons. The voice recognition is on target and almost always gets things right. I was skeptical of it and thought I'd never use it, but it's very useful.

Pairing my iPhone with the truck was dirt simple. Tell it to pair, type in the code, and it's done. The subsequent pairing with my phone when I get in and out of the truck has always worked.

Once you've paired the iPhone, calling is easy. The system provides good audio, and people say my calls are nice and clear. The first time you pair the phone, it downloads the address book, so the truck understands commands like "call joe jackson on cell" and then uses my phone book to complete the call. You can also just dial numbers, if you like.

Sync supports texting from the vehicle--including canned responses--and will read your incoming texts to you, but this doesn't seem to work with the iPhone. That's disappointing because I'd love the ability to hear my text messages while I'm driving. The system won't let you send off a text--even a canned response--if the vehicle is moving.

There are three ways to listen to music: using the line-in, Bluetooth, or the USB port. The line-in is just what it sounds like, a traditional eighth-inch audio port into which you can plug anything that generates analog audio. With this option, Sync has no control over the music player at all.

Since iPhone 3.0, the iPhone has supported Bluetooth audio streaming via the A2DP profile. This works great--almost too well. I haven't figured out how to turn off auto-play and so every time I get in my truck, the phone pairs and music starts streaming from my iPhone. Of course, this is hard on the battery, but for short trips, it's absolutely the easiest option since it's mostly automatic. Note that the iPhone doesn't (yet?) support the Audio/Video Remote Control profile, so Sync can't control the playback of the audio other than pause/play. That means you still have to use the iPhone to select playlists, skip songs, etc.

The USB option treats the iPhone like any other dock connector would. Using the USB port and an iPhone cable, you can dock the iPhone, play music, and control it using the command system built into the vehicle--including voice and textual display of the song information. For this to work well, Sync needs to index the songs on the iPhone. This is one place I ran into problems.

Audio integration functions on Ford Sync display

For whatever reason, Sync won't index anything if there are any songs that don't have complete artist, album, and genre information. If you're a Windows user, Ford offers a piece of software called "Sync My iTunes" that will fix your library. I'm not a Windows user. I tried using a Windows VM to do this, but the software kept complaining that iTunes was busy and failed. Ugh. What I ended up doing was just filling them in by hand. This wasn't as big a job as I feared. I used the column headers to sort the songs and find ones with empty fields and just filled them in as best I could. It took less time than trying to get "Sync My iTunes" to work in a VM. It seems Sync could be more forgiving here and spare owners a lot of frustration.

After I'd fixed the library, I plugged the iPhone into the truck and it started indexing...and indexing...and indexing. And it kept indexing for about the next 30 minutes. This is not a speedy process--something to do with the squirrels that run the processor, I think. It seems that this indexing needs to happen anytime my library has changed (for shorter intervals, thankfully) and so usually happens each time I plug the phone into the USB port. That means it's mostly useful for longer trips, and I stick to Bluetooth audio streaming for most trips.

Sound source bar at the bottom of the iPod screen Selecting the audio source

One other problem you might run into is that at times the iPhone gets confused about where to send the audio stream. If you open the iPod app, you'll notice an "audio source" bar at the bottom and if you touch it, you can choose between the dock and Bluetooth when they're both active. This wasn't apparent to me at first.

The direction service doesn't have anything to do with directions on the iPhone--unfortunately. Still, it works fine. When you activate it, you're actually making a phone call that lets you select from places you've entered online at SyncMyRide.com or just say an address. Once you do that, it calculates the directions and downloads something to the Sync system in the vehicle. Then you get turn-by-turn directions from the vehicle itself based on the information that got downloaded. If you need an update, it calls back automatically and updates. Think of it as a modem-based cloud service. All in all, it's usable and works fine. I think I'll stick to my iPhone for directions though, since I'm a map guy.

Speaking of SyncMyRide.com, that's where you go to set up the system, look at vehicle status reports and so on. There's a pretty good help section and an "owner-to-owner" forum that I found to be quite helpful.

The first time I logged into SyncMyRide, it told me (based on my VIN, which I registered with) that my Sync system needed to be updated. The process is pretty easy. I downloaded the update file to a thumb drive and then plugged that into the USB port on the truck and selected the update function from the "advanced" menu. The system ran through an update process and told me when it was done. Afterwards you take the thumb drive back to your computer and log into SyncMyRide again to let it know how things went.

I went into this whole experience expecting lots of warts because of incompatibilities between the iPhone and Sync, but I've been pleasantly surprised. The system works well and I enjoy having it. What incompatibilities there have been have largely been on Apple's side. For example, I wish Apple would support more of the Bluetooth profiles. Combined with the premium sound of the Sony stereo, I've got a great way to listen to music from my iPhone and make calls from the road.

9:11 AM

December 2, 2009

Build 354: Control Statements in Postludes

This afternoon we released Build 354 of KNS, which adds a last control statement to postludes. In addition, we now allow guard conditionals on any statement in a postlude. These are relatively minor additions to KRL in anticipation of some larger features that are coming soon.

The use of a last statement in a postlude will halt the execution of the ruleset at that rule if it is executed. So, the following statement would halt execution after the current rule if the rule fired:

fired {
  last
}

This can be useful for rules that initiate action that must be completed before any of the remaining rules in the ruleset are meaningful (like authorization...hint, hint). That saves you from having to guard all of the remaining rules with a premise checking that the action has been completed.
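One way to picture the effect of last is the following Python sketch of a ruleset evaluation. It is a model only: the rule names, the premises, and the tuple layout are hypothetical and say nothing about how KNS actually evaluates rules.

```python
def run_ruleset(rules, context):
    """Evaluate rules in order; stop after a fired rule whose postlude says last."""
    fired = []
    for name, premise, has_last in rules:
        if premise(context):
            fired.append(name)
            if has_last:                 # models: fired { last }
                break
    return fired


# Hypothetical ruleset: an authorization rule followed by two ordinary rules.
RULES = [
    ("require_login", lambda ctx: not ctx["authorized"], True),
    ("show_dashboard", lambda ctx: True, False),
    ("show_offers", lambda ctx: True, False),
]

print(run_ruleset(RULES, {"authorized": False}))
print(run_ruleset(RULES, {"authorized": True}))
```

Notice that neither of the later rules needs a premise that checks authorization; the early exit takes care of it.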

As an example of using a guard condition on a postlude statement, consider this example:

notfired {
  ent:page_count += 2 from 1 if(not_available eq "yes")
}

This would only be applicable if the rule didn't fire, and in that case it would increment the page_count entity variable only if the value of the variable not_available was "yes".
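The two layers of conditions, the notfired block and the guard on the statement inside it, can be modeled in Python as below. One loud assumption: I'm treating from 1 as the starting value used when page_count has not been set yet, which this post does not spell out, so check the KRL documentation before relying on that reading.

```python
def notfired_postlude(rule_fired, entity_vars, not_available):
    """Model of: notfired { ent:page_count += 2 from 1 if(not_available eq "yes") }"""
    result = dict(entity_vars)
    if rule_fired:
        return result                    # a notfired block is skipped when the rule fired
    if not_available == "yes":           # the guard on this one statement
        if "page_count" in result:
            result["page_count"] += 2
        else:
            result["page_count"] = 1     # ASSUMED meaning of `from 1`
    return result


print(notfired_postlude(False, {"page_count": 5}, "yes"))
print(notfired_postlude(False, {"page_count": 5}, "no"))
print(notfired_postlude(True, {"page_count": 5}, "yes"))
```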

5:22 PM

December 1, 2009

Announcing the Kynetx Developer Exchange

Today we're releasing the Kynetx Developer Exchange. This is a forum based on the StackExchange service, which runs the same code as the StackOverflow, ServerFault, and SuperUser sites. The functionality is excellent and we're hoping that it provides a fruitful place for developers to interact with Kynetx programmers and each other.

Mike Grace, who is now working for Kynetx, has done a good job of seeding it with the questions he's had as he's gotten up to speed. I'll be participating and so will others on the Kynetx team. We hope you'll use it to answer questions about programming in KRL and making the KNS system suit your needs.

To get started, just go to the Dev Exchange homepage and log in using any OpenID, including Google, Yahoo, or AOL. The more you participate, the more cred you'll have.

6:01 PM

CTO Breakfast on Thursday

CTO Breakfast

This coming Thursday is the CTO Breakfast at 8am. This is the event for both November and December. The breakfast will occur in the usual place: Novell Cafeteria, Building G, Provo Campus (map). I have a few books from O'Reilly to give out this time.

You don't need to be a CTO to come, just interested in technology and high-tech products. The discussion will be open and free-form. Future breakfast schedules are shown on Google Calendar or on the CTO Breakfast page.

I hope to see you there.

5:39 PM