Fast Algorithm To Factorize All Numbers Up To A Given Number
Solution 1:
You can use the first part of your script in order to do that!
Code:
from math import *
import time
MAX = 40000000
t = time.time()
# factors[i] = all the prime factors of i
factors = {}
# Running over all the numbers smaller than sqrt(MAX) since they can be the factors of MAXfor i inrange(2, int(sqrt(MAX) + 1)):
# If this number has already been factored - it is not primeif i notin factors:
# Find all the future numbers that this number will factorfor j inrange(i * 2, MAX, i):
if j notin factors:
factors[j] = [i]
else:
factors[j].append(i)
print(time.time() - t)
for i inrange(3, 15):
if i notin factors:
print(f"{i} is prime")
else:
print(f"{i}: {factors[i]}")
Result:
3: is prime 4: [2] 5: is prime 6: [2, 3] 7: is prime 8: [2] 9: [3] 10: [2, 5] 11: is prime 12: [2, 3] 13: is prime 14: [2, 7]
Explanation:
As mentioned in the comments it is a modification of the Sieve of Eratosthenes algorithm. For each number we find all the numbers it can factorize in the future. If the number does not appear in the result dictionary it is a prime since no number factorize it. We are using dictionary instead of list so the prime numbers will not need to be saved at all - which is a bit more memory friendly but also a bit slower.
Time:
According to a simple check for MAX = 40000000
with time.time()
: 110.14351892471313
seconds.
For MAX = 1000000
: 1.0785243511199951
seconds.
For MAX = 200000000
with time.time()
: Not finished after 1.5 hours... It has reached the 111th item in the main loop out of 6325 items (This is not so bad since the farther the loops go they become shorter).
I do believe however that a well written C code could do it in half an hour (If you are willing to consider it I might write another answer). Some more optimization that can be done is use multithreading and some Primality test like Miller–Rabin. Of course it is worth mentioning that these results are on my laptop and maybe on a PC or a dedicated machine it will run faster or slower.
Solution 2:
Edit:
I actually asked a question in code review about this answer and it has some cool graphs about the runtime!
Edit #2:
Someone answered my question and now the code can run in 2.5 seconds with some modifications.
Since the previous answer was written in Python
it was slow. The following code is doing the exact same but in C++
, it has a thread that is monitoring to which prime it got every 10 seconds.
#include<math.h>#include<unistd.h>#include<list>#include<vector>#include<ctime>#include<thread>#include<iostream>#include<atomic>#ifndef MAX#define MAX 200000000#define TIME 10#endif
std::atomic<bool> exit_thread_flag{false};
voidtimer(int *i_ptr){
for (int i = 1; !exit_thread_flag; i++) {
sleep(TIME);
if (exit_thread_flag) {
break;
}
std::cout << "i = " << *i_ptr << std::endl;
std::cout << "Time elapsed since start: "
<< i * TIME
<< " Seconds" << std::endl;
}
}
intmain(int argc, charconst *argv[]){
int i, upper_bound, j;
std::time_t start_time;
std::thread timer_thread;
std::vector< std::list< int > > factors;
std::cout << "Initiallizating" << std::endl;
start_time = std::time(nullptr);
timer_thread = std::thread(timer, &i);
factors.resize(MAX);
std::cout << "Initiallization took "
<< std::time(nullptr) - start_time
<< " Seconds" << std::endl;
std::cout << "Starting calculation" << std::endl;
start_time = std::time(nullptr);
upper_bound = sqrt(MAX) + 1;
for (i = 2; i < upper_bound; ++i)
{
if (factors[i].empty())
{
for (j = i * 2; j < MAX; j += i)
{
factors[j].push_back(i);
}
}
}
std::cout << "Calculation took "
<< std::time(nullptr) - start_time
<< " Seconds" << std::endl;
// Closing timer thread
exit_thread_flag = true;
std::cout << "Validating results" << std::endl;
for (i = 2; i < 20; ++i)
{
std::cout << i << ": ";
if (factors[i].empty()) {
std::cout << "Is prime";
} else {
for (int v : factors[i]) {
std::cout << v << ", ";
}
}
std::cout << std::endl;
}
timer_thread.join();
return0;
}
It needs to be compiled with the line:
g++ main.cpp -std=c++0x -pthread
If you do not want to turn your entire code to C++ you can use the subprocess library in Python.
Time:
Well I tried my best but it still runs in over an hour... it has reached 6619
which is the 855th prime (Much better!) in 1.386111 hours (4990 seconds). So it is an improvement but there is still some way to go! (It might be faster without another thread)
Post a Comment for "Fast Algorithm To Factorize All Numbers Up To A Given Number"