
I have two sieves that I wrote in Python and would like help optimizing them if at all possible. divisorSieve computes the divisors of every number up to n; index i of the returned list holds the list of divisors of i. numDivisorSieve only counts how many divisors each index has, without storing the divisors themselves. Both sieves work much like a Sieve of Eratosthenes used to find all primes up to n.

Note: divs[i * j].append(i) was changed from divs[i * j] += [i], giving a speed increase, thanks to a member over at Stack Overflow. I have updated the table below with the new times for divisorSieve. It was suggested that I post here instead, so I look forward to your input.

    def divisorSieve(n):
        divs = [[1] for x in xrange(0, n + 1)]
        divs[0] = [0]
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j].append(i)  # changed from += [i] with speed increase
        return divs

    def numDivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs

    # Timer test for function
    if __name__ == '__main__':
        from timeit import Timer
        n = ...
        t1 = Timer(lambda: divisorSieve(n))
        print n, t1.timeit(number=1)
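
For example, for n = 6 the two sieves return the following (index i holds the divisors of i, or their count):

    >>> divisorSieve(6)
    [[0], [1], [1, 2], [1, 3], [1, 2, 4], [1, 5], [1, 2, 3, 6]]
    >>> numDivisorSieve(6)
    [0, 1, 2, 2, 3, 2, 4]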

Results:

          n  | time divisorSieve (s) | time numDivisorSieve (s)
    ---------|-----------------------|-------------------------
         100,000 | 0.333831560615    | 0.187762331281
         200,000 | 0.71700566026     | 0.362314797537
         300,000 | 1.1643773714      | 0.55124339118
         400,000 | 1.63861821235     | 0.748340797412
         500,000 | 2.06917832929     | 0.959312993718
         600,000 | 2.52753840891     | 1.17777010636
         700,000 | 3.01465945139     | 1.38268800149
         800,000 | 3.49267338434     | 1.62560614543
         900,000 | 3.98145114138     | 1.83002270324
       1,000,000 | 4.4809342539      | 2.10247496423
       2,000,000 | 9.80035361075     | 4.59150618897
       3,000,000 | 15.465184114      | 7.24799900479
       4,000,000 | 21.2197508864     | 10.1484527586
       5,000,000 | 27.1910144928     | 12.7670585308
       6,000,000 | 33.6597508864     | 15.4226118057
       7,000,000 | 39.7509513591     | 18.2902677738
       8,000,000 | 46.5065447534     | 21.1247001928
       9,000,000 | 53.2574136966     | 23.8988925173
      10,000,000 | 60.0628718044     | 26.8588813211
      11,000,000 | 66.0121182435     | 29.4509693973
      12,000,000 | MemoryError       | 32.3228102258
      20,000,000 | MemoryError       | 56.2527237669
      30,000,000 | MemoryError       | 86.8917332214
      40,000,000 | MemoryError       | 118.457179822
      50,000,000 | MemoryError       | 149.526622815
      60,000,000 | MemoryError       | 181.627320396
      70,000,000 | MemoryError       | 214.17467749
      80,000,000 | MemoryError       | 246.23677614
      90,000,000 | MemoryError       | 279.53308422
     100,000,000 | MemoryError       | 314.813166014

The results are pretty good and I'm happy I was able to get this far, but I'd like to make it even faster. If at all possible, I'd like divisorSieve to handle n = 100,000,000 at a reasonable speed. This also raises a memory issue: anything over about 12,000,000 throws a MemoryError at divs = [[1] for x in xrange(0, n + 1)] in divisorSieve, whereas numDivisorSieve does run for the full 100,000,000. If you could also help me get past the MemoryError, that would be great.

I've tried replacing numDivisorSieve's divs = [1] * (n + 1) with both divs = array.array('i', [1] * (n + 1)) and divs = numpy.ones((n + 1), dtype='int'), but both resulted in a loss of speed (a slight difference for array, a much larger one for numpy). Since numDivisorSieve lost efficiency, I expect divisorSieve would too. Of course, there's always the chance I'm using one or both of these incorrectly, since I'm not used to either of them. The sketch below shows roughly what I tried.
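
Each attempt was a straight container swap: only the initialisation of divs changed and the loops stayed the same. Roughly (the function names here are just for illustration):

    import array
    import numpy

    def numDivisorSieve_array(n):
        # numDivisorSieve with array.array storage; the loops are unchanged.
        divs = array.array('i', [1] * (n + 1))
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs

    def numDivisorSieve_numpy(n):
        # numDivisorSieve with a numpy array; the loops are unchanged.
        divs = numpy.ones((n + 1), dtype='int')
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs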

I would appreciate any help you can give me. I hope I have provided enough details. Thank you.

  • What are you doing with the result? (Commented Nov 9, 2012 at 2:04)
  • If storing only prime factors counts as 'optimization', we can do ~3-4 times faster. (Commented Nov 9, 2012 at 2:31)
  • Have you tested the application using Python 64bit? (Commented Nov 9, 2012 at 10:48)
  • Thanks for the suggestion about Python 64bit. It's looking like it solves the memory issues. Will update once I've run all the tests. (Commented Nov 9, 2012 at 22:29)

3 Answers


You can use xrange's third parameter to do the stepping for you, which shaves off a little time (not a huge amount).

Changing:

    for j in xrange(1, n / i + 1):
        divs[i * j].append(i)

To:

    for j in xrange(i, n + 1, i):
        divs[j].append(i)

For n = 100000, I go from 0.522774934769 to 0.47496509552. The difference is bigger when the same change is made to numDivisorSieve (sketched below), but as I understand it, you're looking for speedups in divisorSieve.
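
Applied to numDivisorSieve, the same transformation would presumably look like this:

    def numDivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        for i in xrange(2, n + 1):
            # step through the multiples of i directly instead of computing i * j
            for j in xrange(i, n + 1, i):
                divs[j] += 1
        return divs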

  • This was a great optimization! For n = 10,000,000 divisorSieve's time is down to 9.56704298066, and for n = 100,000,000 numDivisorSieve's time is down to 67.1441108416, which are both great improvements. (Commented Nov 9, 2012 at 21:52)
  • Wow... that's better than I expected! Glad I could help. (Commented Nov 9, 2012 at 22:07)
  • Well... apparently the reason it did so well is that I had the range messed up. So while it's still an improvement, it's not quite as good as I thought. Doesn't make me appreciate your help any less, just makes me feel a little stupid. Guess that's what I get for not testing the output well enough. Will update the original post when I get a chance to recompute all the results. (Commented Nov 11, 2012 at 4:14)
  • @JeremyK That's fine. At least I know I'm not crazy now. :) (Commented Nov 11, 2012 at 4:28)
  • @JeremyK I was commenting on your OP about the erratic range before reading this. You should update the post, or at least change the code and say that the results are incorrect for now. (Commented Dec 13, 2012 at 7:03)

The following offers a very small improvement to divisorSieve and a good improvement to numdivisorSieve, but the factors will not be sorted within each list. For example, the factor list of 16 will be [4, 2, 8, 1, 16].

    from math import sqrt

    def divisorSieve(n):
        divs = [[] for j in xrange(n + 1)]
        nsqrt = int(sqrt(n))
        for i in xrange(1, nsqrt + 1):
            divs[i*i].append(i)
            for j in xrange(i, i*i, i):
                divs[j].append(j/i)  # If j/i is replaced by i, a good improvement is seen. Of course, that would be wrong.
                divs[j].append(i)
        for i in xrange(i+1, n+1):  # i is nsqrt here, so this continues from nsqrt + 1
            for j in xrange(i, n+1, i):
                divs[j].append(j/i)
                divs[j].append(i)
        return divs

    def numdivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        nsqrt = int(sqrt(n))
        for i in xrange(2, nsqrt + 1):
            divs[i*i] += 1
            for j in xrange(i, i*i, i):
                divs[j] += 2
        for i in xrange(i+1, n+1):  # i is nsqrt here, so this continues from nsqrt + 1
            for j in xrange(i, n+1, i):
                divs[j] += 2
        return divs

Unfortunately, modifying this definition to create two lists, divsmall ([4, 2, 1]) and divlarge ([8, 16]), and at the end doing divsmall[j].reverse(); divsmall[j].extend(divlarge[j]); return divsmall makes it slightly slower than the original (a sketch of that variant follows).
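
That split-list variant would presumably look something like the following; the function name is mine, and, as noted, it benchmarks slightly slower even though every list comes out sorted:

    from math import sqrt

    def divisorSieveSplit(n):
        # Divisors <= sqrt(j) are collected in divsmall (in decreasing order),
        # divisors > sqrt(j) in divlarge (in increasing order); reversing the
        # small list and appending the large one yields a fully sorted list,
        # e.g. the list for 16 becomes [1, 2, 4, 8, 16].
        divsmall = [[] for j in xrange(n + 1)]
        divlarge = [[] for j in xrange(n + 1)]
        nsqrt = int(sqrt(n))
        for i in xrange(1, nsqrt + 1):
            divsmall[i * i].append(i)      # i == sqrt(i*i)
            for j in xrange(i, i * i, i):
                divsmall[j].append(j / i)  # here j / i < sqrt(j) < i
                divlarge[j].append(i)
        for i in xrange(nsqrt + 1, n + 1):
            for j in xrange(i, n + 1, i):
                divsmall[j].append(j / i)
                divlarge[j].append(i)
        for j in xrange(n + 1):
            divsmall[j].reverse()
            divsmall[j].extend(divlarge[j])
        return divsmall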

Also, I think it makes more sense for divs[0] to be [] instead of [0].


EDIT: The following

    map(lambda s: s.append(i), [divs[ind] for ind in xrange(i, n + 1, i)])

originally seemed to be ~0.2% faster, but it is actually ~2 times slower than Adam Wagner's version (for n = 1,000,000).

The infamous 'test the unit test' problem.

  • Maybe I'm doing something incorrectly, but when I put this in there I get TypeError: list indices must be integers, not xrange (Commented Nov 9, 2012 at 21:52)
  • "~2 times slower" Must be due to the function call overhead of calling the lambda function. (Commented Dec 13, 2012 at 7:55)
