2

I was writing a very simple program to examine if a number could divide another number evenly:

// use the divider squared to reduce iterations for(divider = 2; (divider * divider) <= number; divider++) if(number % divider == 0) print("%d can divided by %d\n", number, divider); 

Now I was curious if the task could be done by finding the square root of number and compare it to divider. However, it seems that sqrt() isn't really able to boost the efficiency. How was sqrt() handled in C and how can I boost the efficiency of sqrt()? Also, is there any other way to approach the answer with even greater efficiency?

Also, the

number % divider == 0 

is used to test if divider could evenly divide number, is there also a more efficient way to do the test besides using %?

11
  • @RSahu I don't believe this is a proper duplicate. Please reopen. Commented Feb 17, 2016 at 23:06
  • @FUZxxl, I was biased by the title. Commented Feb 17, 2016 at 23:07
  • With a little more research, you might rephrase your question as: "What is an efficient algorithm for finding all factors of an integer?" I think you will find that this is a rather deep question that has been tackled many times on this forum... Commented Feb 17, 2016 at 23:08
  • Searching for "why are floating point operations slower than integer operations" in google produces a lot of interesting links. Commented Feb 17, 2016 at 23:09
  • 1
    On the other hand, the remainder takes at least 10x more time than multiplication anyway, so event the ideal case you are not saving a lot of time. Commented Feb 17, 2016 at 23:24

4 Answers 4

3

I'm not going to address what the best algorithm to find all factors of an integer is. Instead I would like to comment on your current method.

There are thee conditional tests cases to consider

  1. (divider * divider) <= number
  2. divider <= number/divider
  3. divider <= sqrt(number)

See Conditional tests in primality by trial division for more detials.

The case to use depends on your goals and hardware.

The advantage of case 1 is that it does not require a division. However, it can overflow when divider*divider is larger than the largest integer. Case two does not have the overflow problem but it requires a division. For case3 the sqrt only needs to be calculated once but it requires that the sqrt function get perfect squares correct.

But there is something else to consider many instruction sets, including the x86 instruction set, return the remainder as well when doing a division. Since you're already doing number % divider this means that you get it for free when doing number / divider.

Therefore, case 1 is only useful on system where the division and remainder are not calculated in one instruction and you're not worried about overflow.

Between case 2 and case3 I think the main issue is again the instruction set. Choose case 2 if the sqrt is too slow compared to case2 or if your sqrt function does not calculate perfect squares correctly. Choose case 3 if the instruction set does not calculate the divisor and remainder in one instruction.

For the x86 instruction set case 1, case 2 and case 3 should give essentially equal performance. So there should be no reason to use case 1 (however see a subtle point below) . The C standard library guarantees that the sqrt of perfect squares are done correctly. So there is no disadvantage to case 3 either.

But there is one subtle point about case 2. I have found that some compilers don't recognize that the division and remainder are calculated together. For example in the following code

for(divider = 2; divider <= number/divider; divider++) if(number % divider == 0) 

GCC generates two division instruction even though only one is necessary. One way to fix this is to keep the division and reminder close like this

divider = 2, q = number/divider, r = number%divider for(; divider <= q; divider++, q = number/divider, r = number%divider) if(r == 0) 

In this case GCC produces only one division instruction and case1, case 2 and case 3 have the same performance. But this code is a bit less readable than

int cut = sqrt(number); for(divider = 2; divider <= cut; divider++) if(number % divider == 0) 

so I think overall case 3 is the best choice at least with the x86 instruction set.

Sign up to request clarification or add additional context in comments.

Comments

3

However, it seems that sqrt() isn't really able to boost the efficiency

That is to be expected, as the saved multiplication per iteration is largely dominated by the much slower division operation inside the loop.

Also, the number % divider = 0 is used to test if divider could evenly divide number, is there also a more efficient way to do the test besides using %?

Not that I know of. Checking whether a % b == 0 is at least as hard as checking a % b = c for some c, because we can use the former to compute the latter (with one extra addition). And at least on Intel architectures, computing the latter is just as computationally expensive as a division, which is amongst the slowest operations in typical, modern processors.

If you want significantly better performance, you need a better factorization algorithm, of which there are plenty. One particular simple one with runtime O(n1/4) is Pollard's ρ algorithm. You can find a straightforward C++ implementation in my algorithms library. Adaption to C is left as an exercise to the reader:

int rho(int n) { // will find a factor < n, but not necessarily prime if (~n & 1) return 2; int c = rand() % n, x = rand() % n, y = x, d = 1; while (d == 1) { x = (1ll*x*x % n + c) % n; y = (1ll*y*y % n + c) % n; y = (1ll*y*y % n + c) % n; d = __gcd(abs(x - y), n); } return d == n ? rho(n) : d; } void factor(int n, map<int, int>& facts) { if (n == 1) return; if (rabin(n)) { // simple randomized prime test (e.g. Miller–Rabin) // we found a prime factor facts[n]++; return; } int f = rho(n); factor(n/f, facts); factor(f, facts); } 

Constructing the factors of n from its prime factors is then an easy task. Just use all possible exponents for the found prime factors and combine them in each possible way.

11 Comments

If the input range is limited and I pre-compute all prime numbers in that range as a table, what would the complicity be? (or is such a table feasible for say int32?)
@user3528438 You can compute the primes in the range 1 to M in time O(M * log log M) using the Sieve of Eratosthenes. There are better algorithms for the job, but this is by far the simplest, and for M in the order of hundreds of millions to even billions it should be the fastest one if implemented efficiently.
@user3528438 And to answer your second question, there are about 200 million primes in that range (pi(n) ~= n / ln n), so the table would occupy somewhat around 800 megabytes. You can probably compute it within a couple of seconds on a single CPU, so yes, that might well be feasible depending on your constraints
Well, in the uint32 range, the largest factor must be below 65536, and below 65536 there are only 6542 prime numbers. So at least in the uint32 range, it would be quite efficient and feasible isn't it?
@user3528438 Oh you mean precomputing the primes up to sqrt(n) and using that table? Yes that's very feasible for your given range. However it's asymptotically "only" a logarithmic factor better than naive trivial division because of how dense primes are within the set of natural numbers. For 64 bits this is no longer feasible, but Pollard-Rho still "kinda" is (depending on your time constraints of course).
|
2

In C, you can take square roots of floating point numbers with the sqrt() family of functions in the header <math.h>.

Taking square roots is usually slower than dividing because the algorithm to take square roots is more complicated than the division algorithm. This is not a property of the C language but of the hardware that executes your program. On modern processors, taking square roots can be just as fast as dividing. This holds, for example, on the Haswell microarchitecture.

However, if the algorithmic improvements are good, the slightly slower speed of a sqrt() call usually doesn't matter.

To only compare up to the square root of number, employ code like this:

#include <math.h> /* ... */ int root = (int)sqrt((double)number); for(divider = 2; divider <= root; divider++) if(number % divider = 0) print("%d can divided by %d\n", number, divider); 

Comments

0

This is just my random thought, so please comment and critisize it if it's wrong.

The idea is to precompute all the prime numbers below a certain range and use it as a table.

Looping though the table, check if the prime number is a factor, if it is, then increament the counter for that prime number, if not then increment the index. Terminate when the index reaches the end or the prime number to check exceeds the input.

At end, the result is a table of all the prime factors of the input, and their counts. Then generating all natual factors should be trival, isn't it?

Worst case, the loop needs to go to the end, then it takes 6542 iterations.

Considering the input is [0, 4294967296] this is similar to O(n^3/8).

Here's MATLAB code that implements this method:

if p is generated by p=primes(65536); this method would work for all inputs between [0, 4294967296] (but not tested).

function [ output_non_zero ] = fact2(input, p) output_table=zeros(size(p)); i=1; while(i<length(p)); if(input<1.5) break; % break condition: input is divided to 1, % all prime factors are found. end if(rem(input,p(i))<1) % if dividable, increament counter and don't increament index % keep checking until not dividable output_table(i)=output_table(i)+1; input = input/p(i); else % not dividable, try next i=i+1; end end % remove all zeros, should be handled more efficiently output_non_zero = [p(output_table~=0);... output_table(output_table~=0)]; if(input > 1.5) % the last and largest prime factor could be larger than 65536 % hence would skip from the table, add it to the end of output % if exists output_non_zero = [output_non_zero,[input;1]]; end end 

test

p=primes(65536); t = floor(rand()*4294967296); b = fact2(t, p); % check if all prime factors adds up and they are all primes assert((prod(b(1,:).^b(2,:))==t)&&all(isprime(b(1,:))), 'test failed'); 

7 Comments

You may not need all the prime numbers. I suggest you check each for being a factor, as you generate them. Also, each prime may divide more than once, for example 45 = 3 * 3 * 5.
@WeatherVane Please see the test code. Running it with fact2(45, p) generates (3,2), (5,1). A difficult case is when the largest prime factor is greater than 65536, which is easy to hadle because there is at most one of them for each input.
Sorry I don't follow MATLAB, but in C it's still valid to generate the primes as you go, rather than all you might need.
@WeatherVane That's true but I don't know how (yet). It'll be useful when the input range is too large for the table to be feasible.
FWIW, the precise asymptotic complexity of this method is O(n^(1/2) / log n) per query (+ the preprocessing step obviously)
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.