Tail Optimized Mutual Recursion in Clojure

Phil Windley // Sat Nov 29 12:08:00 2008 // clojure lisp scheme tail+recursion trampolining

Clojure, a LISP-dialect dynamic language that targets the JVM has been generating some interest in the programming language community lately. By targeting the JVM, Clojure gets speedy performance in a cross platform way.

One of the problems with targeting the JVM with a dynamic language that relies heavily on recursion is that the JVM doesn't support tail recursion (also called tail call optimization). The idea is pretty simple: for some common patterns of recursion the function call can be removed and the recursive function be optimized to a loop. The result is not only programs that run faster, but in some cases where the recursion goes deep programs that run period.

So you might imagine that this is a problem for a language like Clojure. For simple recursion, the recur special form takes care of the problem, but wouldn't work for mutual recursion.

Via Lambda the Ultimate, I learned of Rich Hinkey's implementation of trampolining in Clojure to make tail optimized mutual recursion possible. He gives an example.

"Here's how it works. Normally, if you have mutual recursion (i.e. which can't be replaced with recur), you can blow the stack:"

(declare bar)

(defn foo [n]
  (if (pos? n)
      (bar (dec n))
          :done-foo))

(defn bar [n]
    (if (pos? n)
        (foo (dec n))
\t    :done-bar))

(foo 1000000)
-> java.lang.StackOverflowError

"To convert to a trampoline, simply return closures over your tail calls, rather than direct calls. This is as simple as prepending #"

(declare bar)

(defn foo [n]
  (if (pos? n)
    #(bar (dec n))
    :done-foo))

(defn bar [n]
  (if (pos? n)
    #(foo (dec n))
    :done-bar))

"Then make the top-level call via trampoline:"

(trampoline #(foo 1000000))
-> :done-foo

On LtU, someone asked:

That there is "buzz" around a non-TCO LISP dialect in 2008 is utterly incomprehensible to me. I'm all for a modern/practical LISP, but come on...

Is there something specific about the JVM that makes it impossible, or is it just a naive implementation?

To which James Iry gave a great response:

Native instruction sets often let you do whatever you want with the stack. C doesn't in the ANSI standard, but you can do it with a bit of assembly. .NET IL has an explicit instruction for tail calls. The JVM, on the other hand, is very strict about how you use its stack and has no tail call instruction.

There are full TCO implementations in a couple of Scheme's for the JVM. Kawa uses trampolining and SISC uses a heap based custom stack. Either solution has performance implications as well as implications regarding Java interoperability. Java interop is a major design goal for Clojure.

Hickey's choice to make the programmer indicate when the trade-off is desirable is a pragmatic workaround. There's been some talk in the Scala mailing list about offering trampolining in the standard library - I wouldn't be surprised to see it in 2.8.0.

Well said. In the meantime, if you're familiar with SICP, the you might find Chris Rathman's first chapter of SICP in Clojure a useful way to see how the language differs from Scheme.