Why is Clojure 10 times slower than Python for the equivalent solution of Euler 50?

Question

I recently started to learn Clojure and decided to practice on Euler problems to get a hang of the available data structures and practice recursion and looping.

I tried various approaches to Problem 50, but no matter what I did, finding the solution for 1000000 never finished. After I looked up what others did, I guessed what I was doing should not take forever either, so I typed in the equivalent algorithm in Python to see if the problem lies in my lack of understanding of some Clojure thing, or Java setting. Python finished in 10 seconds. For primes under 100000, the Python version finished in 0.5 sec, Clojure in 5.

I'm posting the Clojure version which was created specifically to match the Python code. Can you help me understand why there is such a difference in performance? Should I use unchecked-add, type hints, primitives (but where?) or what?

So here's Clojure:

(defn prime? [n] (let [r (int (Math/sqrt n))] (loop [d 2] (cond (= n 1) false (> d r) true (zero? (rem n d)) false :other (recur (inc d)))))) (defn primes [] (filter prime? (iterate inc 2))) (defn cumulative-sum [s] (reduce (fn [v, x] (conj v (+ (last v) x))) [(first s)] (rest s))) (defn longest-seq-under [n] "Longest prime seq with sum under n" (let [ps (vec (take-while #(< % n) (primes))) ; prime numbers up to n prime-set (set ps) ; set for testing of inclusion cs (cumulative-sum ps) cnt (count ps) max-len (count (take-while #(< % n) cs)) ; cannot have longer sequences sub-sum (fn [i j] ; sum of primes between the i-th and j-th (- (cs j) (get cs (dec i) 0))) seq-with-len (fn [m] ; try m length prime sequences and return the first where the sum is prime (loop [i 0] ; try with the lowest sum (if (> i (- cnt m)) ; there are no more elements for and m length sequence nil ; could not find any (let [j (+ i (dec m)) ; fix length s (sub-sum i j)] (if (>= s n) ; overshoot nil (if (prime-set s) ; sum is prime [i (inc j)] ; we just looked for the first (recur (inc i))))))))] ; shift window (loop [m max-len] ; try with the longest sequence (if (not (zero? m)) (let [[i j] (seq-with-len m) ] (if j (subvec ps i j) (recur (dec m)))))))) (assert (= [2 3 5 7 11 13] (longest-seq-under 100))) (let [s1000 (longest-seq-under 1000)] (assert (= 21 (count s1000))) (assert (= 953 (reduce + s1000)))) ; (time (reduce + (longest-seq-under 100000))) ; "Elapsed time: 5707.784369 msecs"

And here's the same in Python:

from math import sqrt from itertools import takewhile def is_prime(n) : for i in xrange(2, int(sqrt(n))+1) : if n % i == 0 : return False return True def next_prime(n): while not is_prime(n) : n += 1 return n def primes() : i = 1 while True : i = next_prime(i+1) yield i def cumulative_sum(s): cs = [] css = 0 for si in s : css += si cs.append( css ) return cs def longest_seq_under(n) : ps = list(takewhile( lambda p : p < n, primes())) pss = set(ps) cs = cumulative_sum(ps) cnt = len(ps) max_len = len(list(takewhile(lambda s : s < n, cs))) def subsum(i, j): return cs[j] - (cs[i-1] if i > 0 else 0) def interval_with_length(m) : for i in xrange(0, cnt-m+1) : j = i + m - 1 sij = subsum(i,j) if sij >= n : return None, None if sij in pss : # prime return i, j+1 return None, None for m in xrange(max_len, 0, -1) : f, t = interval_with_length(m) if t : return ps[f:t] assert longest_seq_under(100) == [2, 3, 5, 7, 11, 13] assert sum(longest_seq_under(1000)) == 953 # import timeit # timeit.Timer("sum(longest_seq_under(100000))", "from __main__ import longest_seq_under").timeit(1) # 0.51235757617223499

Thanks

I figured out what the main culprit was: the way the cumulative sum vector was calculated. I never checked what it did for larger vectors, I assumed that a last of a vector used array access, but (source last) showed that it is recursive. My code never reached the main part with 78000 primes in the vector. — Aa'Koshh
– Aa'Koshh, Commented Nov 9, 2011 at 14:49
The following versions would have worked: (defn cumulative-sum-2 [s] (loop [[x & xs] s ss 0 acc []] (if x (let [ssx (+ ss x)] (recur xs ssx (conj acc ssx))) acc))) or (defn cumulative-sum-3 [s] (reduce (fn [v, x] (conj v (+ (v (dec (count v))) x))) [(first s)] (rest s))) Using one of these the solution is still approximately 3 times slower than the Python equivalent, but that might be mitigated with transients or some techniques I have yet to master. — Aa'Koshh
– Aa'Koshh, Commented Nov 9, 2011 at 14:49

Sam Ritchie · Accepted Answer · 2011-11-09 11:51:11Z

I think the slowdown comes from the number of times you iterate through the sequences in longest-seq-under; each of those iterations takes its toll. Here's a smoking fast version, based on a combination of your code and the answer posted here. Note that primes is lazy, so we can bind it with def vs defn:

(defn prime? [n] (let [r (int (Math/sqrt n))] (loop [d 2] (cond (= n 1) false (> d r) true (zero? (rem n d)) false :else (recur (inc d)))))) (def primes (filter prime? (iterate inc 2))) (defn make-seq-accumulator [[x & xs]] (map first (iterate (fn [[sum [s & more]]] [(+ sum s) more]) [x xs]))) (def prime-sums (conj (make-seq-accumulator primes) 0)) (defn euler-50 [goal] (loop [c 1] (let [bots (reverse (take c prime-sums)) tops (take c (reverse (take-while #(> goal (- % (last bots))) (rest prime-sums))))] (or (some #(when (prime? %) %) (map - tops bots)) (recur (inc c))))))

This finished in about 6 ms on my machine:

user> (time (euler-50 1000000)) "Elapsed time: 6.29 msecs" 997651

Yes, I saw that solution here too, but did not spend enough time with it to fully understand the idea. But since I also found that for others this less clever approach worked too, albeit slower, I really wanted to understand why I just can't do the same in Clojure. Basically all I'm doing is looking up values in a vector and doing two nested for loops to change the indexes.
The reason I didn't define primes as def was that as far as I know Clojure than hangs on to the head of it, so if I consume the list it stays in memory after that. This way I can just discard what I do not need (not that its used here like that).

Aa'Koshh · Accepted Answer · 2011-11-10 11:35:39Z

I will accept my own comment as answer for the question why Python worked and Clojure did not: using last of a vector is a linear operation that prevented the cumulative sum from being computed the way I intended it to be.

Updating the function to use a transient vector like this:

(defn cumulative-sum-2 [s] (loop [[x & xs] s ss 0 acc (transient [])] (if x (let [ssx (+ ss x)] (recur xs ssx (conj! acc ssx))) (persistent! acc))))

results in the Clojure version to run only twice as long as Python, consistently. I kind of hoped Clojure would be faster then Python for the same operations, wonder if I still miss something. I'm using 1.2 by the way.

Thanks

There is also peek which is the same as last for vectors, but much more efficient.

Collectives™ on Stack Overflow

Why is Clojure 10 times slower than Python for the equivalent solution of Euler 50?

2 Answers 2

2 Comments

1 Comment

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

1 Comment

Related