TaskTracer: Gathering Attention Data


Today's CS department colloquium featured Thomas Dietterich from Oregon State University. His research area is machine learning. One of the things he talked about is called TaskTracer. The basic idea of TaskTracer is to capture the coherence of a user's desktop activities.

As users use windowed desktops, they engage in seemingly random activities involving applications, documents, and events. But these are not really random--they are all associated with tasks of one sort or another. In the user's mind, these activities are all associated with the task that they're working on:

  • users choose resources associated with tasks
  • users work on resources and then deliver them via print, email, fax, etc.
  • users communicate with other people (or roles) involved in the activity
  • users attend meetings associated with the activity

To be effective, TaskTracer must

  • Capture events
  • Discover tasks
  • Associate events with tasks
  • Give users access to this information

One of the issues is that automatically associating events with tasks is made much more difficult by the fact that people routinely multitask. TaskTracer makes the unwarranted, but greatly simplifying, assumption that users are working on just one task at a time. TaskTracer uses machine learning technology to watch user actions and determine when the user changes tasks. The goal is to predict the tasks based on email activity and desktop activity (menus, etc.).
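To make the one-task-at-a-time assumption concrete, here is a minimal sketch of how events could be labeled once the task switches are known. The function name, the (timestamp, task) pairs, and the sample data are my own illustration, not anything shown in the talk.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def label_events(
    switches: List[Tuple[float, str]], event_times: List[float]
) -> List[Optional[str]]:
    """Label each event with the task that was active when it happened.

    `switches` is a time-ordered list of (timestamp, task) pairs marking
    when the user changed tasks (a hypothetical input format). Because only
    one task is assumed active at a time, the most recent switch at or
    before an event decides its label.
    """
    switch_times = [t for t, _ in switches]
    labels: List[Optional[str]] = []
    for ts in event_times:
        i = bisect_right(switch_times, ts)  # number of switches at or before ts
        labels.append(switches[i - 1][1] if i else None)
    return labels


# Hypothetical data: the user starts on "teaching", then moves to "grant".
switches = [(0.0, "teaching"), (100.0, "grant")]
labels = label_events(switches, [5.0, 100.0, 150.0])
```

An event that happens before the first recorded switch gets no label at all, which is one place where the simplifying assumption shows its limits.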

Here are some of the challenges:

  • The set of tasks is constantly changing
  • The distribution of task documents changes within a task over time. For example, in the task of "teaching a course," the activities would include creating a syllabus at the start of the semester, making and grading a midterm exam later on, and computing final course grades and archiving course materials at the end.
  • Real-time online learning and prediction is hard
  • Must achieve high accuracy to be acceptable

The email predictor works pretty well based on sender, recipients, and subject. The body of the message isn't that important. The predictor uses a hybrid approach. A naive Bayes classifier makes a yes-or-no determination of whether this message looks like messages seen in the past, and if so, a second algorithm, a Support Vector Machine, classifies the message.
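The shape of that two-stage predictor can be sketched in a few lines. To be clear about what is swapped in: this is not naive Bayes or an SVM. Both stages are replaced with simple token-count stand-ins so the example runs without any libraries, and every name here (`features`, `TwoStagePredictor`, the 0.5 threshold) is hypothetical. Only the control flow follows the talk: a gate decides whether the message looks familiar, and only then does a second stage pick a task.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set


def features(sender: str, recipients: List[str], subject: str) -> List[str]:
    """Turn header fields into prefixed tokens; the message body is ignored."""
    tokens = [f"from:{sender.lower()}"]
    tokens += [f"to:{r.lower()}" for r in recipients]
    tokens += [f"subj:{w}" for w in subject.lower().split()]
    return tokens


class TwoStagePredictor:
    """Gate-then-classify sketch with token-count stand-ins for both stages."""

    def __init__(self, familiarity_threshold: float = 0.5) -> None:
        self.threshold = familiarity_threshold  # hypothetical tuning knob
        self.token_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocabulary: Set[str] = set()

    def train(self, tokens: List[str], task: str) -> None:
        self.token_counts[task].update(tokens)
        self.vocabulary.update(tokens)

    def looks_familiar(self, tokens: List[str]) -> bool:
        # Stage 1 stand-in (the talk used naive Bayes here): is a large
        # enough share of the tokens already known from training?
        if not tokens:
            return False
        known = sum(1 for t in tokens if t in self.vocabulary)
        return known / len(tokens) >= self.threshold

    def predict(self, tokens: List[str]) -> Optional[str]:
        if not self.looks_familiar(tokens):
            return None  # abstain rather than guess on unfamiliar mail
        # Stage 2 stand-in (the talk used an SVM here): the task whose
        # training messages share the most tokens with this one wins.
        scores = {
            task: sum(counts[t] for t in tokens)
            for task, counts in self.token_counts.items()
        }
        return max(scores, key=scores.get)


predictor = TwoStagePredictor()
predictor.train(features("alice@x.edu", ["me@x.edu"], "CS101 midterm grades"), "teaching")
predictor.train(features("bob@nsf.example", ["me@x.edu"], "grant proposal budget"), "grant")
guess = predictor.predict(features("alice@x.edu", ["me@x.edu"], "midterm schedule"))
```

The gate matters because the set of tasks keeps changing: a message belonging to a task the system has never seen should produce no prediction rather than a confident wrong one.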

An interesting side note: running this on a faculty member, two post-docs, and five students showed that the professor and post-docs were working on 3-4 times as many tasks at any given time as the students.

The desktop task predictor is based on what they call "window document segments" (WDS). This is a time interval when one window is in focus for one document. So, switching windows or switching documents within a window constitutes a new segment. The goal is to predict a task for each WDS. Classification is done based on the window title, pathname, website name, and URL pathname of a Web page. The technique uses the same hybrid classifier as the email predictor. The bad news is that the accuracy isn't good enough yet.
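The segment definition is easy to express in code. This is a minimal sketch, assuming focus events arrive as time-ordered (timestamp, window, document) tuples; that event format and the names below are my own, not TaskTracer's.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical event shape: (timestamp_seconds, window_id, document)
FocusEvent = Tuple[float, str, str]


@dataclass
class Segment:
    window: str
    document: str
    start: float
    end: float


def segment_focus_events(events: Iterable[FocusEvent], end_time: float) -> List[Segment]:
    """Split a time-ordered focus log into window-document segments.

    A new segment starts whenever the focused window or the document shown
    in it changes. `end_time` closes the final segment.
    """
    segments: List[Segment] = []
    current: Optional[Tuple[str, str, float]] = None  # (window, document, start)
    for ts, window, document in events:
        if current is not None and (window, document) == current[:2]:
            continue  # same window and same document: the segment continues
        if current is not None:
            segments.append(Segment(current[0], current[1], current[2], ts))
        current = (window, document, ts)
    if current is not None:
        segments.append(Segment(current[0], current[1], current[2], end_time))
    return segments


events = [
    (0.0, "w1", "syllabus.doc"),
    (5.0, "w1", "syllabus.doc"),        # repeated focus event, same segment
    (8.0, "w1", "midterm.doc"),         # new document in the same window
    (12.0, "w2", "http://example.edu"), # switched windows
]
segments = segment_focus_events(events, end_time=20.0)
```

Each resulting segment is then the unit that gets a task label, using features such as the window title and pathname.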

One of the tools they're developing is a TaskExplorer that looks like the Windows Explorer, but aimed at tasks. Clicking on a task in the left-hand menu brings up a list of documents, emails, instant messages, Web pages, and so on associated with that task.

Another tool is the folder predictor. When you do a "Save As" or an "Open" the file dialog shows the top three folders that that document is predicted to be in. There's an optimization algorithm that tries to minimize the number of clicks. We all hate navigating file systems to get to the right place. Results show that folder predictor reduces clicks substantially over the default Windows dialog. As an aside, I've found I can do pretty well in reducing clicks associated with files using DefaultFolder on OS X.
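The talk didn't spell out the optimization, so the following is only one plausible way to frame it, under assumptions of my own: each candidate target folder has a predicted probability, one click moves up or down one level in the folder tree, and the dialog shows the set of three folders with the lowest expected click count. The function names, the POSIX-style paths, and the probabilities are all made up for illustration.

```python
from itertools import combinations
from pathlib import PurePosixPath
from typing import Dict, Iterable, Tuple


def clicks_between(shown: str, target: str) -> int:
    """Tree distance: steps up from `shown` to the common ancestor, then down."""
    a, b = PurePosixPath(shown).parts, PurePosixPath(target).parts
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def expected_clicks(shortcuts: Iterable[str], target_probs: Dict[str, float]) -> float:
    """Average clicks if the user starts from whichever shortcut is closest."""
    shortcuts = list(shortcuts)
    return sum(
        p * min(clicks_between(s, t) for s in shortcuts)
        for t, p in target_probs.items()
    )


def best_shortcuts(
    target_probs: Dict[str, float], candidates: Iterable[str], k: int = 3
) -> Tuple[str, ...]:
    """Brute-force search for the k folders that minimize expected clicks."""
    return min(
        combinations(sorted(candidates), k),
        key=lambda combo: expected_clicks(combo, target_probs),
    )


# Hypothetical predictions for where the file being saved belongs.
target_probs = {
    "/home/u/teaching/cs101/exams": 0.4,
    "/home/u/teaching/cs101/hw": 0.3,
    "/home/u/grants/nsf": 0.2,
    "/home/u/papers": 0.1,
}
# Ancestors are allowed as shortcuts too, since a parent can cover several targets.
candidates = list(target_probs) + ["/home/u/teaching/cs101"]
chosen = best_shortcuts(target_probs, candidates)
```

Brute force is fine for a handful of candidates; with a real file system the candidate set would have to be pruned first. Allowing ancestor folders as shortcuts is the interesting part: a parent folder is sometimes a better suggestion than any single likely target.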

The third tool is a notes file associated with each task. Timestamps are automatically inserted and the notes are automatically saved as you change tasks.

This work seems related to attention.xml and other attention efforts. I asked Dietterich if he was familiar with attention.xml, and he wasn't. They're calling it tracing tasks, but in reality, it's attention.