Haskell performance tuning

Question

I'm quite new to Haskell, and to learn it better I started solving problems here and there and I ended up with this (project Euler 34).

145 is a curious number, as 1! + 4! + 5! = 1 + 24 + 120 = 145.

Find the sum of all numbers which are equal to the sum of the factorial >of their digits.

Note: as 1! = 1 and 2! = 2 are not sums they are not included.

I wrote a C and an Haskell brute force solution.

Could someone explain me the Haskell version is ~15x (~0.450 s vs ~6.5s )slower than the C implementation and how to possibly tune and speedup the Haskell solution?

unsigned int solve(){ unsigned int result = 0; unsigned int i=10; while(i<2540161){ unsigned int sumOfFacts = 0; unsigned int number = i; while (number > 0) { unsigned int d = number % 10; number /= 10; sumOfFacts += factorial(d); } if (sumOfFacts == i) result += i; i++; } return result; }

here the haskell solution

--BRUTE FORCE SOLUTION solve:: Int solve = sum (filter (\x-> sfc x 0 == x) [10..2540160]) --sum factorial of digits sfc :: Int -> Int -> Int sfc 0 acc = acc sfc n acc = sfc n' (acc+fc r) where n' = div n 10 r = mod n 10 --n-(10*n') fc 0 =1 fc 1 =1 fc 2 =2 fc 3 =6 fc 4 =24 fc 5 =120 fc 6 =720 fc 7 =5040 fc 8 =40320 fc 9 =362880

a very similar program compiled with -O2 runs in 0.55s on my computer — Random Dev
– Random Dev, Commented Apr 2, 2015 at 13:27

András Kovács · Accepted Answer · 2015-04-02 17:34:55Z

First, compile with optimizations. With ghc-7.10.1 -O2 -fllvm, the Haskell version runs in 0.54 secs for me. This is already pretty good.

If we want to do even better, we should first replace div with quot and mod with rem. div and mod do some extra work, because they handle the rounding of negative numbers differently. Since we only have positive numbers here, we should switch to the faster functions.

Second, we should replace the pattern matching in fc with an array lookup. GHC uses a branching construct for Int patterns, and uses binary search when the number of cases is large enough. We can do better here with a lookup.

The new code looks like this:

import qualified Data.Vector.Unboxed as V facs :: V.Vector Int facs = V.fromList [1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880] --BRUTE FORCE SOLUTION solve:: Int solve = sum (filter (\x-> sfc x 0 == x) [10..2540160]) --sum factorial of digits sfc :: Int -> Int -> Int sfc 0 acc = acc sfc n acc = sfc n' (acc + V.unsafeIndex facs r) where (n', r) = quotRem n 10 main = print solve

It runs in 0.095 seconds on my computer.

„GHC uses a branching construct for patterns“ – correct, but not for GHC HEAD, where it now uses a jump table: ghc.haskell.org/trac/ghc/ticket/10137
@Stefan I guess I could but it doesn't matter at this level of optimization.
@AndrásKovács great answer! I should have tried with compiler optimizations! Anyway your solution on my laptop (quite old Intel centrino) ghc 7.8.2 runs in 555ms (~100 ms slower than mine version) and using the optimization suggested by@stefan speedups at ~350. I will try (and update the question) on my workstation. Is someone aware of the type of optimizations that come in play using -O2 in this context?

Collectives™ on Stack Overflow

Haskell performance tuning

1 Answer 1

8 Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

8 Comments

Related