-2

I am writing a python program to find and return a list of Twin primes between two numbers.

This is my code:

#prime selector
def is_prime(num):
    if num < 2 or num % 2 == 0: 
        return False
    if num == 2: 
        return True
    upr_lmt = int((num ** 0.5) + 1) 
    for number in range(3, upr_lmt, 2): 
        if num % number == 0:
            return False
    return True


def primeLister(end_point, st_point = 2):
    primeList = []
    for i in range(st_point, end_point + 1):
        if is_prime(i):
            primeList.append(i)
    return primeList

It is saved as prime_selector.py in a folder.

This is the testcase that I have written:

# Test case
from prime_selector import primeLister

def log_newline(file_writer):
    file_writer.write("\n\n")

def log_testcase(case_num, case_desc, result, file_writer):
    file_writer.write(f"================= Test Case {case_num + 1} =================\n")
    print((f"================= Test Case {case_num + 1} ================="))
    file_writer.write(case_desc)
    print(case_desc, end = "")
    file_writer.write(result)
    print(result)

num_list = [99991,999999937]

with open("./primeLister_testcase","w") as file_writer:
    for case_num, i in enumerate(num_list):
        case_desc = f"Listing all prime numbers till {i}:\n"
        result = "".join([f" > {prime}\n" for prime in primeLister(i)])

        log_testcase(case_num, case_desc, result, file_writer)

        if case_num + 1 != len(num_list):
            log_newline(file_writer)    

The testcase is also saved in the same folder locally. The program stops abruptly during the execution of primeLister testcase after writing the primes up to 99991 in the file.

I guess that it stops during the process of listing primes between 99991 and 999999937. I've got no idea why. My computer has an Intel i5-12450H processor with 16GB RAM. If that's relevant.

I have already tried implementing an infinite loop, threading and also using exception handling. Nothing has worked so far. What am I doing wrong?

I was expecting the program to keep executing until it found all the primes within the provided range.

15
  • 3
    It's unclear what happens exactly and what is surprising about that. There's likely way more code here than is needed to reproduce the issue. Commented Aug 16, 2024 at 12:28
  • 2
    Please read minimal reproducible example and concentrate on the word Minimal. By stripping out the code not relevant you may even solve it yourself. Commented Aug 16, 2024 at 12:31
  • 2
    Why do you have a while True loop to repeat everything from scratch again? Why creating a thread? How does that help to test your code? Please remove everything that is not contributing to showcase the problem. Only run with one input, and only once, and only testing what fails. Commented Aug 16, 2024 at 12:32
  • Please qualify "stops abruptly". Is there an exception or does it just, literally, stop? Commented Aug 16, 2024 at 14:09
  • 1
    There are still features in this code that seem unrelated to your problem: case 99991 apparently runs fine, so you can leave it out. Only one failing case is needed. The File I/O seems unrelated. Can you confirm whether you get the problem also when you remove all file I/O? Making one long string as output: is that related to your problem? Does the problem also occur when you don't create that string? If not, please remove that. This way continue to reduce the code to only get the minimal code that still produces the issue. Commented Aug 16, 2024 at 17:40

2 Answers 2

2

The problem is that your process is aiming to produce about 50 million primes for the input 999999937, accumulating them in a list, and does this in an non-optimal way, as for each integer 𝑘 between 1 and 999999937 you call is_prime, which executes √𝑘/2 iterations when 𝑘 is prime.

Then, after a long running time, you concatenate those 50 million numbers with their decimal representation into one, long string, which will be about 500 million characters long. That is half a GB if we assume a 1-byte character representation. It may be that your program runs out of memory... if it ever finishes producing the primes in the first place.

Running your code with a small adaptation, turning primeLister into a generator, and then printing the primes one by one (well, I only printed one per 100000 to save time), I could monitor the progress, I got that generating the first million of primes took about 2 minutes, and it continued like this:

number of primes time needed
1,000,000 2 minutes
2,000,000 5 minutes
3,000,000 9 minutes
4,000,000 14 minutes
.... ...

I stopped there, but extrapolating, I estimate that it would take about 22 hours to finish the call to primeLister with input 999999937, and only then the string of 0.5 GB would be created from it. I suppose your system just kills the process either because of memory or time constraints. In my test it didn't get killed for the 15 minutes I waited for it. Then I killed it myself.

But it is clear that this is not practical. You should use a sieve approach.

Sign up to request clarification or add additional context in comments.

3 Comments

How would any of this exhaust their 16GB RAM, though?
@nocomment The one byte per number estimate is very conservative; Python realistically uses many more bytes per object.
Sure, I estimate 40 bytes per int object and 72 bytes per str object (including 8 bytes per object for the reference in the list). But 50 million of that plus the big string is still only ~6.3 GB. And they said it stopped after an hour, so most likely they weren't at the strings stage yet. Maybe they had ~10 million ints and thus were using only ~0.4 GB.
0

If your goal is to generate primes up to a certain number, using Sieve or Eratosthenes would be a much more efficient approach, which might also solve the issue of the program abruptly stopping:

def sieve_of_eratosthenes(end_point):
    sieve = [True] * (end_point+ 1)
    sieve[0] = sieve[1] = False  # 0 and 1 are not primes
    for start in range(2, int(end_point**0.5) + 1):
        if sieve[start]:
            for multiple in range(start*start, end_point+ 1, start):
                sieve[multiple] = False
    return [num for num, is_prime in enumerate(sieve) if is_prime]

2 Comments

So I should use Sieve of Eratosthenes instead of primeLister() and is_prime() in my code? I mean I get that this is easier and also a more efficient way than just going through odd numbers and eliminating them if they are not prime numbers. But I still don't know what the problem with my code is. And chances are that I will run into this error again.
It's hard to diagnose the problem with your code without a stack trace or a minimal reproducible example, so I suggested a possible fix that would greatly speed up your code. The sieve would replace primelister. As for the cause, my best guess is the process being shut down due to using up too much of your machine's memory.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.