0

I want my program to find the largest prime factor of number 600851475143. For example, the prime factors of 13195 are 5, 7, 13 and 29 and 29 is the largest one. While my code does work, the answer takes too long to appear even for much smaller inputs, such as 6kk (takes about 15 seconds. For 12kk, it takes 37 seconds, so the increments are even worse than linear), which is 100k times smaller than the number I should use as the input. Below is my code, any help concerning the increased efficiency of the code would be greatly appreciated.

#include <stdio.h> #include <math.h> int main() { long long int number=600851475143; int largest_prime_factor,i,j,k; for (i=1;i<number/2;i+=2){ k=0; j=3; for (j=3;j<=sqrt(i);j+=2){ if (i%j==0){ k++; break; } } if (k==0){ if (number%i==0) largest_prime_factor=i; } } printf("The largest prime factor of 600851475143 is: %d", largest_prime_factor); return 0; } 
12
  • What have you tried until now, to increase performance? How do you compile your code? Commented Dec 6, 2016 at 12:51
  • 4
    This question is fine and on-topic, but please note that when you have questions about how to improve working code, codereview.stackexchange.com might give you better and more detailed answers. Commented Dec 6, 2016 at 12:54
  • If your code is working and you want to optimise it, this post should be on code review. Lundin said it first. Commented Dec 6, 2016 at 12:54
  • 3
    "warning: iteration 1073741823u invokes undefined behavior [-Waggressive-loop-optimizations] largest_prime_factor and i surely cannot be a plain int, as they have to keep numbers of the same magnitude of number. Commented Dec 6, 2016 at 13:03
  • 1
    @MatteoItalia Do not say "surely" just because you have a 32 bit int. ;-). But of course for portable code it's essential to have a loop variable of the same type as the upper limit. Commented Dec 6, 2016 at 13:04

5 Answers 5

3

You don't need to go through the whole list. Once you find one prime factor, you factorize your number by that, and continue working with what's left.

For instance, take your example: 600,851,475,143. You can quickly find its first prime factor to be 71. If you divide 600,851,475,143 by 71 you get 8,462,696,833. Both these numbers share the same prime factors except 71. So now you can search the largest factor of the original number but with a search space reduced by 2 orders of magnitude.

Also, notice that your code will fail if the number itself is prime. To fix that, initialize your maximum number as

int largest_prime_factor = 1; 

and if it's still 1 in the end, return the number itself. (You could initialize with number, but you'll soon see why I chose 1)

So start by treating 2 as a special case:

 long long remain = number; while (remain % 2 == 0) { remain /= 2; largest_prime_factor = 2; } 

And then do it similarly inside your loop. Since for prime numbers we only need to check up to its square root, we'll limit our loop by two cases, depending on whether we still think the number could be prime.

  1. Still prime: test up to sqrt(number)
  2. No longer prime: test until the maximum factor exceeds what remains.

In the end, your modified code may look like this:

#include <stdio.h> int main() { long long int number=600851475143; long long largest_prime_factor = 1,i,j,k; long long remain = number; while (remain % 2 == 0) { remain /= 2; largest_prime_factor = 2; /* Uncomment to see the factors printf("2 ");*/ } for (i=3; (largest_prime_factor == 1 && i*i <= number) || (largest_prime_factor > 1&& i <= remain); i+=2){ k=0; j=3; for (j=3; j*j<=i;j+=2){ if (i%j==0){ k++; break; } } if (k==0 && remain%i==0) { largest_prime_factor=i; while (remain % i == 0) { /* Uncomment to see the factors printf("%d ", i); */ remain /= i; } } } printf("The largest prime factor of %Ld is: %Ld", number, largest_prime_factor); return 0; } 

Also notice that other variables should also be of type (long long).

The bottleneck will be checking if each number is prime, and the whole process will still be slow if the prime factors themselves are large. But you can get a much faster average case. For your example, this algorithm gets the factors 71, 839, 1471, and 6857 in less than a second.

Sign up to request clarification or add additional context in comments.

Comments

2

In general, there is no efficient way to do this: https://en.wikipedia.org/wiki/Integer_factorization#Difficulty_and_complexity

There are ways to increase your code's speed though. Have a look at some of the algorithms in that article.

3 Comments

@MichaelWalz Not sure. It may be the most valuable answer, actually, because it points out the root cause of the OP's dissatisfaction with performance. Plus it points to hints that are less obvious (I suppose) than "move computations out of the for loop".
@MichaelWalz Keep it for posteriority... discussion can be instructive. Other ppl will have the same idea. Ah, too late ;-).
@MichaelWalz I went back and forth with whether to comment or answer. I decided on answer because there is no real way to get an efficient solution as far as computational complexity is concerned and outright code review is a separate problem for a separate site. Good arguments can be made for commenting too, I agree.
2

You should move sqrt(i) out of the for loop. It is calculated in every loop. Also j*j <= i would be much faster than j <= sqrt(i).

There is an error in your code: if number is a long long int the other variables should be too or the condition i<number/2 is always true!

10 Comments

Modern compilers should be able to optimize that. But indeed, on older compilers you definitely wanted to move calculations out of the loop condition.
Will GCC do that without explicit optimization flags?
Most compilers would optimize it if they could be sure the sqrt() function has no side effects. This may be true for some compilers, but not for all. In my opinion it is generally a good idea to use squares instead of square roots if possible.
Also, number does not appear to be changed in the loop so why recalculate number/2 on every loop? Do number /= 2; before the first for
What troubles me, is that when I declared the variables as long long ints instead of just ints, as you correctly advised me to do, nothing changed. The results remained the same. Doesn't an int reach up to 32k? Using 13456769 as the input gave me 708251 in both cases (13.456.769 is exactly 19 times bigger than 708.251, so the answer should be correct) + when I used long long int to declare the variables, the code became about 2 times slower.
|
1

This code should be pretty fast and only takes a few milliseconds for the number 600851475143:

long long int primes[1000]; int primesSize = 0; long long int primeFactors[100]; int primeFactorsSize = 0; long long int number = 600851475143ll; for (long long int f = 2; f < number / 2; ++f) { // Check if f is a prime number int primesIndex = 0; while (primesIndex < primesSize && (f%primes[primesIndex]) != 0) ++primesIndex; if (primesIndex >= primesSize) { primes[primesSize++] = f; // Check if f is a prime factor of number while ((number % f) == 0) { primeFactors[primeFactorsSize++] = f; number /= f; } } } if (number != 1) primeFactors[primeFactorsSize++] = number; 

Creating a list of prime numbers already found speeds up the check of another possible factor.

If you find a prime factor of your number, you divide your number by the prime factor and continue with the division result. Perhaps this division has to be done multiple times. The final value in number also is the greatest prime factor.

Warning: My code has not been tested at all. I just made sure the result is correct for number = 600851475143ll. Also I am using a C++ compiler so you may have to make some minor changes.

For larger numbers you need to implement dynamic memory allocation at least for the primes array:

#include <stdlib.h> #include <stdio.h> int main() { long long int *primes = NULL; int primesSize = 0; int primesCapacity = 0; long long int *primeFactors = NULL; int primeFactorsSize = 0; int primeFactorsCapacity = 0; long long int number = 600851475143ll; number = 13456769ll; for (long long int f = 2; f < number / 2; ++f) { // Check if f is a prime number int primesIndex = 0; while (primesIndex < primesSize && (f%primes[primesIndex]) != 0) ++primesIndex; if (primesIndex >= primesSize) { if (primesSize == primesCapacity) { primesCapacity += 1000; primes = (long long int*)realloc(primes, primesCapacity * sizeof(long long int)); } primes[primesSize++] = f; // Check if f is a prime factor of number while ((number % f) == 0) { if (primeFactorsSize == primeFactorsCapacity) { primeFactorsCapacity += 1000; primeFactors = (long long int*)realloc(primeFactors, primeFactorsCapacity * sizeof(long long int)); } primeFactors[primeFactorsSize++] = f; number /= f; } } } if (number != 1) { if (primeFactorsSize == primeFactorsCapacity) { primeFactorsCapacity += 1000; primeFactors = (long long int*)realloc(primeFactors, primeFactorsCapacity * sizeof(long long int)); } primeFactors[primeFactorsSize++] = number; } printf("Last prime factor is %lld", primeFactors[primeFactorsSize-1]); return 0; } 

Comments

0

Using j*j<=i instead of j<=sqrt(i), depending on the magnitude of the numbers I used, made the code 5 to 10 times faster. ++k; instead of k++; had a slight impact as well. By about 0.5%. And I have not checked everything yet. Thank you everyone for your valuable inputs! There's lots of things to think and learn about thanks to them!

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.