Generating All 32-Bit Primes (Part I) (hnlyman.github.io)

by hnlyman 28 comments 85 points
Read article View on HN

28 comments

[−] senfiaj 62d ago
There is also the segmented Sieve of Eratosthenes. It has a simlar performance but uses much less memory: the number of prime numbers from 2 to sqrt(n). For example, for n = 1000000, the RAM has to store only 168 additional numbers.

I use this algorithm here https://surenenfiajyan.github.io/prime-explorer/

[−] dahart 62d ago
The Pseudosquares Sieve will drop the memory requirements much further from sqrt(n) to log^2(n). https://link.springer.com/chapter/10.1007/11792086_15
[−] adgjlsfhk1 62d ago
IMO that algorithm is barely a sieve. It is technically pareto optimal, but it seems like in practice it would just end up worse than either an optimized segmented sieve (it is ~log(n)^2 slower) or an O(1) memory method (probable prime test followed by a deterministic prime test) depending on how large and how wide the numbers you're testing are.
[−] hnlyman 62d ago
Yep, this is the natural way to go, especially considering the possibility of parallel computing and the importance of cache locality, etc.
[−] mark-r 62d ago
You can combine the Sieve and Wheel techniques to reduce the memory requirements dramatically. There's no need to use a bit for numbers that you already know can't be prime. You can find a Python implementation at https://stackoverflow.com/a/62919243/5987
[−] tromp 62d ago
Or a C implementation at https://tromp.github.io/pearls.html#sieve which runs in well under 10s.
[−] hnlyman 62d ago
I'd be interested in seeing an explanation of the code, since it looks pretty incomprehensible to me. Per the arbitrary rules I set for myself, I'm not allowed to precompute/hardcode the wheel (looks like this implementation uses a hardcoded wheel of size 2x3x5=30). I wonder if/by how much the performance would suffer by computing and storing the coprime remainders in memory instead of handing them directly to the compiler.
[−] tromp 62d ago
I wrote this in a semi obfuscated style to make it fit on one screen. It's indeed a hardcoded 2x3x5 wheel; but I suspect computing all those constants would have made the program significantly longer.
[−] forinti 62d ago
If you take all 53 8 bit primes, you can use modular arithmetic with a residue base to work with numbers up to

64266330917908644872330635228106713310880186591609208114244758680898150367880703152525200743234420230

This would require 334 bits.

[−] marxisttemp 62d ago
Why include writing the primes to a file instead of, say, standard output? That increases the optimization space drastically and the IO will eclipse all the careful bitwise math

Does having the primes in a file even allow faster is-prime lookup of a number?

[−] hnlyman 62d ago
No real reason. It's just an arbitrary task I made for myself. I might have to adjust the goal if writing to the file becomes the lion's share of the runtime, but I'll be pretty happy with myself if that's the project's biggest problem.
[−] ojciecczas 62d ago
Do you know the https://en.wikipedia.org/wiki/Sieve_of_Atkin? It's mind-blowing.
[−] dahart 62d ago
This got me through many of the first 100 problems on Project Euler:

    n = 1000000 # must be even
    sieve = [True] * (n/2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i/2]: sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    …
    # x is prime if x%2 and sieve[x/2]
Edit: I guess I irked someone. :/ Yes this is a memory hog, but to me beautiful because it’s so tiny and simple. I never tried very hard, but I wonder if it could be made a real one-liner.
[−] logicallee 62d ago
there are also very fast primality tests that work statistically. It's called Miller-Rabin, I tested in the browser here[1] and it can do them all in about three minutes on my phone.

[1] https://claude.ai/public/artifacts/baa198ed-5a17-4d04-8cef-7...

[−] amelius 62d ago
What are the false positive/negative rates?
[−] logicallee 62d ago
for the way this one was done, this witness set has been proven to produce no false positives or negatives for n < 2³⁷.
[−] _alternator_ 62d ago
Nice. Notably with Miller-Rabin, you can also iterate the test cheaply and get exponentially low false positive/negative rates. I believe that this is how prime factors for RSA keys are usually chosen; choose an error rate below 2^-1000 and sleep extremely soundly knowing that the universe is more likely to evaporate in the next second than that you’ve got a false positive prime.
[−] Cold_Miserable 62d ago
Heh. 1.Create fast modulus quad M for dword D for the first 2000? 200000? (xM)D 2.Eliminate 0b,101b 3.Divide using vrcp14ss/vdivss with correction. Use fast square root too using rsqrt14.
[−] reader9274 62d ago
Very well written
[−] ZyanWu 62d ago

> There is a long way to go from here. Kim Walisch's primesieve can generate all 32-bit primes in 0.061s (though this is without writing them to a file)

Oh, come on, just use a bash indirection and be done with it. It takes 1 minute and you had another result for comparison

[−] shablulman 62d ago
[dead]