Quantifying Recursion on the Java Platform

January 27, 2007

As most programmers know, the solutions to many problems can be expressed very elegantly using recursion. However, it is also true that any recursive algorithm can be converted into an iterative one, and traditional wisdom suggests that iterative algorithms are preferred from the standpoint of performance and scalability.

To quantify the speed of recursion under Java 6 on Windows XP, I wrote recursive and iterative solutions to a simple tree problem: summing the values of a specified node and all its descendants. Although the superior scalability of the iterative solution is a foregone conclusion, I had expected that the speed of the iterative solution would also be faster. However, to my surprise, the speed of recursion solution proved to be significantly faster than the iterative approach.

The source code for five different solutions are listed below. It is important to note that the recursive solutions presented below are not tail-recursive, which will preclude the compiler from optimizing out the recursive method call.

Five Solutions

The first solution is recursive and also uses a Java 5 for-loop to provide code clarity.

Elegant Recursion

  /**
   * Compute the sum of a tree node and its descendants recursively.
   * Use Java 5 syntax for more elegant code but more processing 
   * overhead.
   * 
   * @param root
   * @return
   */
  private static int computeSumRecursive_impl2(TreeNode root) {
    int result = root.getValue();

    for (TreeNode n: root.getChildren())
      result += computeSumRecursive_impl2(n);

    return result;
  }

The second solution does away with the Java 5 syntax sugar in order to achieve significantly better performance.

Fast Recursion

  /**
   * Compute the sum of a tree node and its descendants recursively.
   * Sacrifice code elegance to maximize efficiency.
   * 
   * @param root
   * @return
   */
  private static int computeSumRecursive_impl1(TreeNode root) {
    int result = root.getValue();
    ArrayList children = root.getChildren();

    int size = children.size();
    for (int j=0; j<size; ++j) {
      result += computeSumRecursive_impl1((TreeNode) children.get(j));
    }

    return result;
  }

The third solution is iterative. The elegance of recursion is clear: the iterative solution cannot be expressed as compactly and is somewhat harder to follow.

Elegant Iterative

  /**
   * A stack-based summation routine that focuses on readability.
   * @param root
   * @return
   */
  private static int computeSumStack_impl2(TreeNode root) {
    Stack toExamine = new Stack();
    toExamine.push(root);

    int result=root.getValue();
    while (!toExamine.isEmpty()) 
      for (TreeNode n: toExamine.pop().getChildren()) {
        result += n.getValue();
        toExamine.push(n);
      }
    return result;
  }

The fourth solution is also iterative but eliminates the Java 5 syntax to improve performance.

Fast Iterative

  /**
   * A stack-based summation routine that uses loop optimization 
   * like recursive implementation 1.
   * @param root
   * @return
   */
  private static int computeSumStack_impl1(TreeNode root) {
    Stack toExamine = new Stack();
    toExamine.push(root);

    int result=root.getValue();
    while (!toExamine.isEmpty()) {
      TreeNode node = toExamine.pop();
      ArrayList children = node.getChildren();

      int size = children.size();
      for (int j=0; j<size; ++j) {
        TreeNode child = (TreeNode) children.get(j);
        result += child.getValue();
        toExamine.push(child);
      }
    }
    return result;
  }

The final iterative solution sacrifices maintainability, readability, and scalability in order to achieve maximum performance. Solutions like this should be avoided in real-world situations.

Hardcore Iterative

  /**
   * A stack-based summation routine that sacrifices readability
   * and scalability to achieve maximum performance.
   * @param root
   * @return
   */
  private static int computeSumStack_impl3(TreeNode root) {
    final TreeNode[] nodeStack = new TreeNode[100];
    int start = 0;
    int end = 1;
    nodeStack[start] = root;

    int result=root.getValue();
    while (start != end) {
      ArrayList children = nodeStack[start].getChildren();
      if (++start == 100)
        start = 0;
			
      int size = children.size();
      for (int j=0; j<size; ++j) {
        TreeNode child = (TreeNode) children.get(j);
        result += child.getValue();
        if (--start < 0)
          start = 99;
        nodeStack[start] = child;
      }
    }
    return result;
  }

Performance Results and Analysis

All five solutions were tested on a balanced, 100,000 node tree with a branching factor of five. Each solution was run against the root of this tree 50 times. Timings were collected using nanoTime, and memory usage was captured using the methodology outlined in Java Tip 130: Do you know your data size? by Vladimir Roubtsov. Table 1 below summarizes the results.

Table 1: Results

Solution	Time (ns)	Heap (bytes)
Elegant Recursion	8,422,312	1,108,004
Fast Recursion	3,488,725	0
Elegant Iterative	16,213,494	1,107,893
Fast Iterative	10,181,091	26,707
Hardcore Iterative	4,178,745	26,728

What can we learn from these results? Most surprising is that recursive performance blows iterative performance out of the water. The fast recursive solution is roughly 3 times faster than the fast iterative one. Even the hardcore iterative solution takes roughly 120% the time of the recursive solution. Clearly Sun has aggressively optimized method calling and stack access, and, as a result, iterative solutions, which in theory should run faster than recursive ones, do not, because the recursive solutions rely on heavily optimized JVM code.

On the other hand, it is also clear that it is easy to inadvertently "give away" performance by, for example, improving readability using Java 5 for-loop iteration. Elegant recursion uses over 1MB of heap space and runs almost as slowly as the fast iterative solution. The elegant iterative solution uses 160% of the fast solution's time, and uses roughly 40 times more heap space.

Some readers have questioned "why memory usage of the fast recursive solution is 0?". The reason is because it uses no heap space, which is what is measured in the example above. Rather, it relies exclusively on stack space, which places serious limits on its scalability. Had the tree used in the test not been well-balanced, the recursive solutions would have failed to run.

Clarifications

I have received a great deal of feedback on this article, and have realized that some points need clarification.

My intention in including the "elegant" results was not to show that the Java 5 for-loop syntax is inherently slow. However, many readers came away with that impression from the article. As many of you know, the for-loop syntax in Java 5 creates iterators behind the scenes, but, in the example presented, we can take advantage of the excellent random access capability of the child ArrayList to achieve better performance. The situtation is rather similar to autoboxing: if you, as a developer, do not understand what the compiler is doing for you, you can inadvertantly hurt application performance.
A number of you have pointed to the fact that Stack is backed by Vector, a synchronized collection, as the reason for the slow performance of the iterative implementations. While it is definitely contributory, it does not account for the entire performance gap, since even the "hardcore" iterative version, which does not use any collection classes at all, is slower than the fast recursive version (at least on Windows XP. Apparently the results under Java 5 on Ubuntu Linux were different).

Conclusion

As always, comments are welcome. If you happen to run the program using a different platform configuration, I would appreciate the results.