I have a simple prime number calculator in Clojure (an inefficient algorithm, but I'm just trying to understand the behavior of recur for now). The code is:

(defn divisible [x y] (= 0 (mod x y)))

(defn naive-primes [primes candidates]
  (if (seq candidates)
    (recur (conj primes (first candidates))
           (remove (fn [x] (divisible x (first candidates))) candidates))
    primes))

This works as long as I am not trying to find too many numbers. For example

(print (sort (naive-primes [] (range 2 2000))))

works. For anything requiring more recursion, I get an overflow error.

    (print (sort (naive-primes [] (range 2 20000))))

will not work. In general, whether I use recur or call naive-primes again without the attempt at TCO doesn't appear to make any difference. Why am I getting errors for large recursions while using recur?

  • Does recur require loop to get tail recursion? I don't see loop in your code. I'd make this an answer, but I'm still learning Clojure. Commented Feb 8, 2012 at 22:42
  • Your code works for me in Clojure 1.2.1 and 1.3. The only error I eventually get is an OutOfMemoryError when finding primes up to 200,000. Commented Feb 8, 2012 at 23:02
  • @octopusgrabbus, no, recur can be used in this fashion (just within a function body) as well. See clojure.org/special_forms#recur. Commented Feb 8, 2012 at 23:03
  • @HansEngel running at the repl in 1.3, I get a stack overflow error when finding primes up to 200,000. Explanation below. Commented Feb 8, 2012 at 23:05
  • My apologies for not finding this before posting. My question was similar to stackoverflow.com/questions/2946764/… Commented Feb 9, 2012 at 5:20

1 Answer

recur always runs in constant stack space, regardless of whether it targets a loop or a function head, so recur is not the problem. The issue is the calls to remove. remove is lazy: it returns a lazy seq that, when asked for its first element, calls first on the underlying seq and checks whether that element should be kept. If the underlying seq was itself created by a call to remove, that triggers another call to first, and so on down the chain. Each pass through naive-primes wraps candidates in one more remove, so after thousands of passes a single call to first (or seq) has to descend through thousands of nested calls, none of which can be tail recursive. Hence the stack overflow error.
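To see the stacking in isolation, here is a minimal sketch that is separate from the prime code. The helper name stack-removes and the layer counts are made up for illustration, and the exact depth at which the stack runs out depends on the JVM's stack size.

```clojure
;; Hypothetical helper (not from the question): wrap coll in n lazy
;; `remove` layers without realizing anything in between.
(defn stack-removes [n coll]
  (reduce (fn [s _] (remove neg? s)) coll (range n)))

;; A handful of layers is harmless: `first` descends 10 levels.
(def shallow (first (stack-removes 10 (range 5))))

;; With very many layers, realizing the first element needs one nested
;; call chain per layer. On a default-sized stack this is expected to
;; overflow, so the error is caught here purely for demonstration.
(def deep
  (try
    (first (stack-removes 100000 (range 5)))
    (catch StackOverflowError _ :stack-overflow)))
```

Building the layers is cheap and never fails; the cost only appears when something finally asks for an element, which in naive-primes is the (seq candidates) check.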

Changing (remove ...) to (doall (remove ...)) fixes the problem, since it prevents the unbounded stacking of remove calls: each one is fully realized immediately, so the next pass works on an already-realized seq rather than on a chain of pending lazy seqs. I think this method only ever keeps one candidates list in memory at a time, though I am not positive about this. If so, it isn't too space-inefficient, and a bit of testing shows that it isn't actually much slower.
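Applied to the code from the question, the change might look like the sketch below. The let binding for p and the use of zero? are small tidy-ups that are not in the original; the only change that matters is the doall.

```clojure
(defn divisible [x y] (zero? (mod x y)))

(defn naive-primes [primes candidates]
  (if (seq candidates)
    (let [p (first candidates)]
      (recur (conj primes p)
             ;; doall realizes this layer now, so the next pass starts
             ;; from a realized seq instead of a nested chain of thunks
             (doall (remove #(divisible % p) candidates))))
    primes))

(naive-primes [] (range 2 30))
;; => [2 3 5 7 11 13 17 19 23 29]
```

Since primes is a vector and conj appends to the end, the result already comes back in ascending order, so the sort in the original call should not be needed.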

1 Comment

Thanks. It would have taken me forever to figure that out without your help. Even though doall seems a little magical to me, this was extremely helpful!
