
I have two sieves that I wrote in Python and would like help optimizing them if at all possible. divisorSieve computes the divisors of every number up to n; index i of the returned list holds the list of divisors of i. numDivisorSieve only counts how many divisors each index has, without storing the divisors themselves. Both sieves work much like a Sieve of Eratosthenes used to find all primes up to n.

Note: divs[i * j].append(i) was changed from divs[i * j] += [i], giving a speed increase, thanks to a member over at Stack Overflow. I have updated the table below with the new times for divisorSieve. It was suggested that I post here instead, so I look forward to your input.

    def divisorSieve(n):
        divs = [[1] for x in xrange(0, n + 1)]
        divs[0] = [0]
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j].append(i)  # changed from += [i] with speed increase
        return divs

    def numDivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs

    # Timer test for function
    if __name__ == '__main__':
        from timeit import Timer
        n = ...
        t1 = Timer(lambda: divisorSieve(n))
        print n, t1.timeit(number=1)
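
For example, for n = 6 the two sieves return the following (index i holds the divisors of i, or their count):

    >>> divisorSieve(6)
    [[0], [1], [1, 2], [1, 3], [1, 2, 4], [1, 5], [1, 2, 3, 6]]
    >>> numDivisorSieve(6)
    [0, 1, 2, 2, 3, 2, 4]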

Results:

          n  | time divisorSieve (s) | time numDivisorSieve (s)
    ---------|-----------------------|-------------------------
         100,000 | 0.333831560615    | 0.187762331281
         200,000 | 0.71700566026     | 0.362314797537
         300,000 | 1.1643773714      | 0.55124339118
         400,000 | 1.63861821235     | 0.748340797412
         500,000 | 2.06917832929     | 0.959312993718
         600,000 | 2.52753840891     | 1.17777010636
         700,000 | 3.01465945139     | 1.38268800149
         800,000 | 3.49267338434     | 1.62560614543
         900,000 | 3.98145114138     | 1.83002270324
       1,000,000 | 4.4809342539      | 2.10247496423
       2,000,000 | 9.80035361075     | 4.59150618897
       3,000,000 | 15.465184114      | 7.24799900479
       4,000,000 | 21.2197508864     | 10.1484527586
       5,000,000 | 27.1910144928     | 12.7670585308
       6,000,000 | 33.6597508864     | 15.4226118057
       7,000,000 | 39.7509513591     | 18.2902677738
       8,000,000 | 46.5065447534     | 21.1247001928
       9,000,000 | 53.2574136966     | 23.8988925173
      10,000,000 | 60.0628718044     | 26.8588813211
      11,000,000 | 66.0121182435     | 29.4509693973
      12,000,000 | MemoryError       | 32.3228102258
      20,000,000 | MemoryError       | 56.2527237669
      30,000,000 | MemoryError       | 86.8917332214
      40,000,000 | MemoryError       | 118.457179822
      50,000,000 | MemoryError       | 149.526622815
      60,000,000 | MemoryError       | 181.627320396
      70,000,000 | MemoryError       | 214.17467749
      80,000,000 | MemoryError       | 246.23677614
      90,000,000 | MemoryError       | 279.53308422
     100,000,000 | MemoryError       | 314.813166014

The results are pretty good and I'm happy I was able to get this far, but I'd like to make it even faster. If at all possible, I'd like divisorSieve to handle n = 100,000,000 at a reasonable speed. This also raises a memory issue: anything over about 12,000,000 throws a MemoryError at divs = [[1] for x in xrange(0, n + 1)] in divisorSieve, whereas numDivisorSieve does run for the full 100,000,000. If you could also help me get past the MemoryError, that would be great.

I've tried replacing numDivisorSieve's divs = [1] * (n + 1) with both divs = array.array('i', [1] * (n + 1)) and divs = numpy.ones((n + 1), dtype='int'), but both resulted in a loss of speed (a slight difference for array, a much larger one for numpy). Since numDivisorSieve lost efficiency, I expect divisorSieve would too. Of course, there's always the chance I'm using one or both of these incorrectly, since I'm not used to either of them. The sketch below shows roughly what I tried.
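
Each attempt was a straight container swap: only the initialisation of divs changed and the loops stayed the same. Roughly (the function names here are just for illustration):

    import array
    import numpy

    def numDivisorSieve_array(n):
        # numDivisorSieve with array.array storage; the loops are unchanged.
        divs = array.array('i', [1] * (n + 1))
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs

    def numDivisorSieve_numpy(n):
        # numDivisorSieve with a numpy array; the loops are unchanged.
        divs = numpy.ones((n + 1), dtype='int')
        divs[0] = 0
        for i in xrange(2, n + 1):
            for j in xrange(1, n / i + 1):
                divs[i * j] += 1
        return divs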

I would appreciate any help you can give me. I hope I have provided enough details. Thank you.

  • What are you doing with the result? (Commented Nov 9, 2012 at 2:04)
  • If storing only prime factors counts as 'optimization', we can do ~3-4 times faster. (Commented Nov 9, 2012 at 2:31)
  • Have you tested the application using Python 64bit? (Commented Nov 9, 2012 at 10:48)
  • Thanks for the suggestion about Python 64bit. It's looking like it solves the memory issues. Will update once I've run all the tests. (Commented Nov 9, 2012 at 22:29)

3 Answers


You can use xrange's third parameter to do the stepping for you, which shaves off a little time (not a huge amount).

Changing:

    for j in xrange(1, n / i + 1):
        divs[i * j].append(i)

To:

    for j in xrange(i, n + 1, i):
        divs[j].append(i)

For n = 100000, I go from 0.522774934769 to 0.47496509552. The difference is bigger when the same change is made to numDivisorSieve (sketched below), but as I understand it, you're looking for speedups in divisorSieve.
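
Applied to numDivisorSieve, the same transformation would presumably look like this:

    def numDivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        for i in xrange(2, n + 1):
            # step through the multiples of i directly instead of computing i * j
            for j in xrange(i, n + 1, i):
                divs[j] += 1
        return divs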

  • This was a great optimization! For n = 10,000,000 divisorSieve's time is down to 9.56704298066, and for n = 100,000,000 numDivisorSieve's time is down to 67.1441108416, which are both great improvements. (Commented Nov 9, 2012 at 21:52)
  • Wow... that's better than I expected! Glad I could help. (Commented Nov 9, 2012 at 22:07)
  • Well... apparently the reason it did so well is that I had the range messed up. So while it's still an improvement, it's not quite as good as I thought. Doesn't make me appreciate your help any less, just makes me feel a little stupid. Guess that's what I get for not testing the output well enough. Will update the original post when I get a chance to recompute all the results. (Commented Nov 11, 2012 at 4:14)
  • @JeremyK That's fine. At least I know I'm not crazy now. :) (Commented Nov 11, 2012 at 4:28)
  • @JeremyK I was commenting on your OP about the erratic range before reading this. You should update the post, or at least change the code and say that the results are incorrect for now. (Commented Dec 13, 2012 at 7:03)

The following offers a very small improvement to divisorSieve and a good improvement to numdivisorSieve, but the factors will not be sorted within each list. For example, the factor list of 16 will be [4, 2, 8, 1, 16].

    from math import sqrt

    def divisorSieve(n):
        divs = [[] for j in xrange(n + 1)]
        nsqrt = int(sqrt(n))
        for i in xrange(1, nsqrt + 1):
            divs[i*i].append(i)
            for j in xrange(i, i*i, i):
                divs[j].append(j/i)  # If j/i is replaced by i, a good improvement is seen. Of course, that would be wrong.
                divs[j].append(i)
        for i in xrange(i+1, n+1):  # i is nsqrt here, so this continues from nsqrt + 1
            for j in xrange(i, n+1, i):
                divs[j].append(j/i)
                divs[j].append(i)
        return divs

    def numdivisorSieve(n):
        divs = [1] * (n + 1)
        divs[0] = 0
        nsqrt = int(sqrt(n))
        for i in xrange(2, nsqrt + 1):
            divs[i*i] += 1
            for j in xrange(i, i*i, i):
                divs[j] += 2
        for i in xrange(i+1, n+1):  # i is nsqrt here, so this continues from nsqrt + 1
            for j in xrange(i, n+1, i):
                divs[j] += 2
        return divs

Unfortunately, modifying this definition to create two lists, divsmall ([4, 2, 1]) and divlarge ([8, 16]), and at the end doing divsmall[j].reverse(); divsmall[j].extend(divlarge[j]); return divsmall makes it slightly slower than the original (a sketch of that variant follows).
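
That split-list variant would presumably look something like the following; the function name is mine, and, as noted, it benchmarks slightly slower even though every list comes out sorted:

    from math import sqrt

    def divisorSieveSplit(n):
        # Divisors <= sqrt(j) are collected in divsmall (in decreasing order),
        # divisors > sqrt(j) in divlarge (in increasing order); reversing the
        # small list and appending the large one yields a fully sorted list,
        # e.g. the list for 16 becomes [1, 2, 4, 8, 16].
        divsmall = [[] for j in xrange(n + 1)]
        divlarge = [[] for j in xrange(n + 1)]
        nsqrt = int(sqrt(n))
        for i in xrange(1, nsqrt + 1):
            divsmall[i * i].append(i)      # i == sqrt(i*i)
            for j in xrange(i, i * i, i):
                divsmall[j].append(j / i)  # here j / i < sqrt(j) < i
                divlarge[j].append(i)
        for i in xrange(nsqrt + 1, n + 1):
            for j in xrange(i, n + 1, i):
                divsmall[j].append(j / i)
                divlarge[j].append(i)
        for j in xrange(n + 1):
            divsmall[j].reverse()
            divsmall[j].extend(divlarge[j])
        return divsmall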

Also, I think it makes more sense for divs[0] to be [] instead of [0].


EDIT: The following

    map(lambda s: s.append(i), [divs[ind] for ind in xrange(i, n + 1, i)])

originally seemed to be ~0.2% faster, but it is actually ~2 times slower than Adam Wagner's version (for n = 1,000,000).

The infamous 'test the unit test' problem.

  • Maybe I'm doing something incorrectly, but when I put this in there I get TypeError: list indices must be integers, not xrange (Commented Nov 9, 2012 at 21:52)
  • "~2 times slower" Must be due to the function call overhead of calling the lambda function. (Commented Dec 13, 2012 at 7:55)
