I recently started to learn Clojure and decided to practice on Euler problems to get a hang of the available data structures and practice recursion and looping.
I tried various approaches to Problem 50, but no matter what I did, finding the solution for 1000000 never finished. After I looked up what others did, I guessed what I was doing should not take forever either, so I typed in the equivalent algorithm in Python to see if the problem lies in my lack of understanding of some Clojure thing, or Java setting. Python finished in 10 seconds. For primes under 100000, the Python version finished in 0.5 sec, Clojure in 5.
I'm posting the Clojure version which was created specifically to match the Python code. Can you help me understand why there is such a difference in performance? Should I use unchecked-add, type hints, primitives (but where?) or what?
So here's Clojure:
(defn prime? [n] (let [r (int (Math/sqrt n))] (loop [d 2] (cond (= n 1) false (> d r) true (zero? (rem n d)) false :other (recur (inc d)))))) (defn primes [] (filter prime? (iterate inc 2))) (defn cumulative-sum [s] (reduce (fn [v, x] (conj v (+ (last v) x))) [(first s)] (rest s))) (defn longest-seq-under [n] "Longest prime seq with sum under n" (let [ps (vec (take-while #(< % n) (primes))) ; prime numbers up to n prime-set (set ps) ; set for testing of inclusion cs (cumulative-sum ps) cnt (count ps) max-len (count (take-while #(< % n) cs)) ; cannot have longer sequences sub-sum (fn [i j] ; sum of primes between the i-th and j-th (- (cs j) (get cs (dec i) 0))) seq-with-len (fn [m] ; try m length prime sequences and return the first where the sum is prime (loop [i 0] ; try with the lowest sum (if (> i (- cnt m)) ; there are no more elements for and m length sequence nil ; could not find any (let [j (+ i (dec m)) ; fix length s (sub-sum i j)] (if (>= s n) ; overshoot nil (if (prime-set s) ; sum is prime [i (inc j)] ; we just looked for the first (recur (inc i))))))))] ; shift window (loop [m max-len] ; try with the longest sequence (if (not (zero? m)) (let [[i j] (seq-with-len m) ] (if j (subvec ps i j) (recur (dec m)))))))) (assert (= [2 3 5 7 11 13] (longest-seq-under 100))) (let [s1000 (longest-seq-under 1000)] (assert (= 21 (count s1000))) (assert (= 953 (reduce + s1000)))) ; (time (reduce + (longest-seq-under 100000))) ; "Elapsed time: 5707.784369 msecs" And here's the same in Python:
from math import sqrt from itertools import takewhile def is_prime(n) : for i in xrange(2, int(sqrt(n))+1) : if n % i == 0 : return False return True def next_prime(n): while not is_prime(n) : n += 1 return n def primes() : i = 1 while True : i = next_prime(i+1) yield i def cumulative_sum(s): cs = [] css = 0 for si in s : css += si cs.append( css ) return cs def longest_seq_under(n) : ps = list(takewhile( lambda p : p < n, primes())) pss = set(ps) cs = cumulative_sum(ps) cnt = len(ps) max_len = len(list(takewhile(lambda s : s < n, cs))) def subsum(i, j): return cs[j] - (cs[i-1] if i > 0 else 0) def interval_with_length(m) : for i in xrange(0, cnt-m+1) : j = i + m - 1 sij = subsum(i,j) if sij >= n : return None, None if sij in pss : # prime return i, j+1 return None, None for m in xrange(max_len, 0, -1) : f, t = interval_with_length(m) if t : return ps[f:t] assert longest_seq_under(100) == [2, 3, 5, 7, 11, 13] assert sum(longest_seq_under(1000)) == 953 # import timeit # timeit.Timer("sum(longest_seq_under(100000))", "from __main__ import longest_seq_under").timeit(1) # 0.51235757617223499 Thanks
lastof a vector used array access, but(source last)showed that it is recursive. My code never reached the main part with 78000 primes in the vector.(defn cumulative-sum-2 [s] (loop [[x & xs] s ss 0 acc []] (if x (let [ssx (+ ss x)] (recur xs ssx (conj acc ssx))) acc)))or(defn cumulative-sum-3 [s] (reduce (fn [v, x] (conj v (+ (v (dec (count v))) x))) [(first s)] (rest s)))Using one of these the solution is still approximately 3 times slower than the Python equivalent, but that might be mitigated with transients or some techniques I have yet to master.