Tidy Extension for Scheme


Last Saturday I needed to convert some HTML (which I'd read into Scheme as a string) into valid XML for storing in Sleepycat's DbXml database. HTML Tidy is a great way to do this, so I put together a small, single-function PLT Scheme extension that wraps the Tidy library.

The library's easy to build and use. Here's an example:

(require (lib "tidy.ss" "tidy"))

(define bad-string "<p>Foo!<ul><li>first<li>second")

(display (tidy:string bad-string))

This displays:

<p>Foo!</p>
<ul>
<li>first</li>
<li>second</li>
</ul>

Over the last week I've run several hundred HTML snippets pulled from RSS feeds through the function, and it's worked great.
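
A batch run like that can be sketched roughly as follows. This is only an illustration: the `snippets` list is made-up sample data standing in for text pulled from feeds, and it assumes `tidy:string` returns the cleaned markup as a string, as the example above suggests.

```scheme
(require (lib "tidy.ss" "tidy"))

;; Hypothetical sample data; in practice these strings would come
;; from the item descriptions of RSS feeds.
(define snippets
  (list "<p>Foo!<ul><li>first<li>second"
        "<p>An unclosed <b>bold run"))

;; Clean each snippet in turn and print the result, one per line.
(for-each
 (lambda (snippet)
   (display (tidy:string snippet))
   (newline))
 snippets)
```

Using `map` in place of `for-each` (and dropping the `display`) would collect the cleaned strings into a list instead of printing them, which is probably closer to what you'd want before storing them in a database.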