dg.o - NSF's Digital Government Research

The Digital Government Research Center at the University of Southern California maintains an informational Web site for the National Science Foundation Digital Government Research Program.

I wish they had an RSS feed (or several), but I can't find one. Sad experience tells me that no matter how much I promise I'll go look at a Web site regularly, I don't without a note in my aggregator about new things.

The center funds dozens of projects. One example is a project by the National Institute of Statistical Sciences (NISS) to build a toolkit for data swapping, a technique for safeguarding privacy in public records (see the project homepage). The technique swaps key identifying information between records to protect privacy. It raises two primary questions:

  • How much data must you swap to protect someone's identity?
  • How much can you swap before you have made the bulk of the data unreliable or worthless?

Alan Karr, the principal investigator, talks about these tradeoffs:

The trick to effective data-swapping is choosing the cells in a table that can be interchanged without ruining the core utility of the data or in some other way revealing private information, says Alan Karr, who is leading the project for NISS.

"A good choice would be one that creates a high level of protection, but a low level of distortion in the data," Karr says. "You want to avoid things where you create 4-year-olds with 10 children [but] essentially, it's always going to be the case that the more protection you have, the more distortion you have. There's just no way around that.

"But it turns out that in some examples we've looked at, some choices seem to be better than others, and no one before has had a tool to let you see that in a systematic way," he says.
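To make the tradeoff concrete, here is a minimal sketch of data swapping on made-up records. It is not the NISS toolkit; the function names (`make_toy_records`, `swap_attribute`, `implausible`), the toy data, and the "implausible record" rule are all my own assumptions for illustration. It exchanges the `age` field between randomly paired records at a chosen rate, then counts how many records have changed (a rough stand-in for protection) and how many now look like Karr's 4-year-olds with children (a rough stand-in for distortion).

```python
import random


def make_toy_records(n, seed=1):
    """Build hypothetical records in which minors never have children."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        age = rng.randint(0, 80)
        children = 0 if age < 18 else rng.randint(0, 4)
        records.append({"age": age, "children": children})
    return records


def swap_attribute(records, key, rate, seed=0):
    """Return a copy of `records` where roughly `rate` of the rows have
    exchanged their `key` value with another randomly chosen row."""
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]      # leave the input untouched
    n_pairs = int(len(swapped) * rate / 2)    # each swap touches two rows
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for i, j in zip(chosen[::2], chosen[1::2]):
        swapped[i][key], swapped[j][key] = swapped[j][key], swapped[i][key]
    return swapped


def implausible(records):
    """Count records that describe a minor with at least one child."""
    return sum(1 for r in records if r["age"] < 18 and r["children"] > 0)


if __name__ == "__main__":
    original = make_toy_records(1000)
    for rate in (0.0, 0.05, 0.2, 0.5):
        released = swap_attribute(original, "age", rate)
        changed = sum(a != b for a, b in zip(original, released))
        print(f"rate={rate:.2f} changed={changed} implausible={implausible(released)}")
```

Because values are only exchanged, the overall distribution of ages is unchanged; what degrades is the relationship between age and the other fields. A real tool would presumably choose which cells to swap far more carefully than this uniform random pairing does, which is the point Karr makes about some choices being better than others.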

One limitation of the technique is that its protection depends on keeping secret the details of how swapping was applied to a particular data set, such as the swapping rate.

But how much of published data would be subject to swapping? Seastrom declined to say, since it could give outsiders the tools they need to penetrate the veil of privacy the agency maintains so carefully around its data.

"Census, for example, has maybe half a dozen people who know what the swapping rate is," she says. "We are rather closemouthed about what we do and how much we do, for obvious reasons."

Last modified: Thu Oct 10 12:47:21 2019.