A fast prime gap searching suite.
Table of contents generated with markdown-toc
This is a suite of tools (combined_sieve
, gap_stats
, gap_test.py
and gap_test_gpu
) which are useful for searching for prime gaps
centered around m * P#/d
.
The flow is to sieve many intervals with combined_sieve --save -u <filename>
Then calculate statistics about these intervals with gap_stats --save -u <filename>
Then test some portion of these intervals with gap_test.py -u <filename>.txt --prp-top-percent 25
To quickly test a known 30 merit prime gap (Setup must already be complete)
$ make
$ sqlite3 test.db < schema.sql
$ FN=809_1841070_11244000_1000_s15000_l200M.txt
$ ./combined_sieve --save --search-db test.db -u $FN
$ ./gap_stats --save --search-db test.db -u $FN --min-merit 20
$ ./gap_test.py --search-db test.db -u $FN
...
23142 30.1485 11244911 * 809#/1841070 -5788 to +17354 RECORD!
...
see [EXTENDED.md](for labourious notes and details)
See detailed instructions on each combined_sieve
, gap_stats
, gap_test
, and more in
EXTENDED.md tools.
combined_sieve
- Takes a set of parameters (
[m_start, m_start + m_inc), P#, d, sieve-length, max-prime
)- Use
combined_sieve --mstart 1 --minc 100 --max-prime 1 -d 4 -p <P>
for some insight on choose ofd
- Use
- Performs combined sieve algorithm
- Produces
<PARAMS>.txt
, list of all numbers nearm * P#/d
with no small factors.- each row is
m: -count_low, +count_high | -18 -50 -80 ... | +6 +10 +12 ...
- each row is
- Writes initial
range
entry (withtime_sieve
) intorange
table.
- Takes a set of parameters (
gap_stats
- Calculates some statistics over each range in the interval
- Expected gap in both directions
- Probability of being a record gap, of being a missing gap, of being gap >
--min-merit
- Store statistics in
m_stats
table - Compute prob(gap) over all m and save in
range_stats
table
- Calculates some statistics over each range in the interval
gap_test_simple
/gap_test.py
--unknown-filename <PARAM>.txt
- Runs PRP tests on most likely m's from
<PARAM>.txt
- Stores results (as they are calculated in
result
&m_stats
- Allows for saving of progress and graceful restart
gap_test.py
can produce graphs of distribution (via--plots
)
- Runs PRP tests on most likely m's from
Often I run combined_sieve
and gap_stats
for a BUNCH of ranges not I never end up testing.
misc/show_ranges.sh
updates and shows ranges from prime-gap-search.db
.
misc/drop_range.sh
deletes ranges when none of them have ever been tested
misc/record_check.py
searches through prime-gap-search.db
for records (over gaps.db
)
misc/finaliz.py -p <P> -d <D>
dumps m_stats
to <UNKNOWN>.csv
and <UNKNOWN>.stats.txt
files.
It also DELETES DATA from prime-gap-search.db
use it when you are done searching that P#/d
misc/run.sh <UNKNOWN_FILENAME>
runs combined_sieve
, gap_stats
, and prints the gap_test.py
command
misc/status.py
to check if a filename has been processed 100% (and can be deleted)
See EXTENDED.md benchmark and slightly out of date BENCHMARKS.md
Theory and justification for some calculations in present in THEORY.md
#$ sudo apt install libgmp10 libgmp-dev
#$ sudo apt install mercurial build-essential automake autoconf bison make libtool texinfo m4
# Building gmp from source is required until 6.3
$ hg clone https://gmplib.org/repo/gmp/
$ cd gmp
$ ./.bootstrap
$ mkdir build
$ cd build
$ ../configure
$ make
$ make check
$ make install
$ sudo apt install sqlite3
$ sudo apt install libmpfr-dev libmpc-dev libomp-dev libsqlite3-dev libbenchmark-dev
# Make sure this is >= python3.7 and not python2
$ python -m pip install --user gmpy2==2.1.0b5 primegapverify
# For misc/double_check.py
$ sudo apt install gmp-ecm
$ sudo apt install sympy
$ git clone https://github.com/sethtroisi/prime-gap.git
$ cd prime-gap
$ mkdir -p logs/ unknowns/
$ sqlite3 prime-gap-search.db < schema.sql
# There are a handful of constants (MODULE_SEARCH_SECS, PRIME_RANGE_SEC, ...)
# in gap_common.cpp `combined_sieve_method2_time_estimate` that can be set for
# more accurate `combined_sieve` timing.
- prime-gap-list
$ git clone --depth 10 https://github.com/primegap-list-project/prime-gap-list.git $ sqlite3 gaps.db < prime-gap-list/allgaps.sql
gap_test_gpu
is experimental using https://github.com/NVlabs/CGBN- It's much faster for <2,048 bits.
- Clang (
CC=clang++-12
) might speed upcombined_sieve
slightly. - There is some support for using OpenPFGW PrimeWiki SourceForge
- Unpack somewhere and link binary as
pfgw64
- may also need to
chmod a+x pfgw64
- modify
is_prime
in gap_utils.py
- Unpack somewhere and link binary as
- Multiple layers of verification of
combined_sieve
- Can compare
--method1
output with--method2
- Can add use
make clean combined_sieve VALIDATE_FACTORS=1
ormake clean combined_sieve VALIDATE_LARGE=1
misc/double_check.py
double checks usingecm
, to check that small factor aren't found for numbers in unknown-file.txtpython misc/double_check.py --unknown-filename <unknown_file.txt> -c 10
skipped PRP/s
is checked in THEORY.md
- Can compare
- Compression
- method1 doesn't support compression at this time so it also makes verifying
combined_sieve
slightly harder.misc/convert_rle.py
can convert existing files to RLE if you want to double check / shrink old files.
- Run Length Encoding
can be forced with
--rle` saves ~60% of space
--bitcompress
- Saves ~95% of space, but makes code very hard
- Stores 7 bits per byte (to avoid ascii control characters) of is_composite array
- Reduces size of is_composite by removing any number coprime to K# or d
- method1 doesn't support compression at this time so it also makes verifying
This is somewhat automated with ./misc/tests.sh
which should print a green Success!
quick test commands and output
$ PARAMS="-p 907 -d 2190 --mstart 1 --minc 200 --max-prime 100 --sieve-length 11000"
$ make combined_sieve gap_stats gap_test_simple
$ time ./combined_sieve --method1 -qqq --save-unknowns $PARAMS
$ time ./combined_sieve -qqq --save-unknowns $PARAMS
$ md5sum unknowns/907_2190_1_200_s11000_l100M.{txt,m1.txt}
15a5cbff7301262caf047028c05f0525 unknowns/907_2190_1_200_s11000_l100M.txt
15a5cbff7301262caf047028c05f0525 unknowns/907_2190_1_200_s11000_l100M.m1.txt
$ ./gap_stats --unknown-filename 907_2190_1_200_s11000_l100M.txt
# Verify RECORD @ 1,7,13,101,137
# Verify "avg missing prob : 0.0000000"
# Verify "RECORD : top 50% ( 26) sum(prob) = 1.50e-05 (avg: 5.77e-07)"
# Verify "RECORD : top 100% ( 53) sum(prob) = 2.15e-05 (avg: 4.06e-07)"
# Optionally validate combined_sieve results with ecm
$ python misc/double_check.py --seed=123 --unknown-filename unknowns/907_2190_1_200_s11000_l100M.txt -c 5 --B1 1000
1 * 907# / 2190 unknowns 218 + 212 = 430
+1423 had trivial factor of 2 skipping
+6570 should have small factor
ecm: 2099 (1*907#/2190+6570)/2099
+1373 had trivial factor of 2 skipping
-1439 had trivial factor of 2 skipping
+7789 had trivial factor of 2 skipping
-2306 had trivial factor of 3 skipping
+4377 had trivial factor of 2 skipping
-4222 should have small factor
ecm: 3325199 (1*907#/2190+-4222)/3325199
factor 3325199: 1319 2521
...
17 * 907# / 2190 unknowns 216 + 208 = 424
+2722 shouldn't have small factor
ecm: 17*907#/2190+2722
...
4150 trivially composite, 86 unknowns, 179 known composites
ecm found 178 composites: known 171 composite, 7 were unknown
$ ./gap_test_simple --unknown-filename 907_2190_1_200_s11000_l100M.txt -q --min-merit 8
Testing m * 907#/2190, m = 1 + [0, 200)
Min Gap ~= 7734 (for merit > 9.0)
Reading unknowns from '907_2190_1_200_s11000_l100M.txt'
53 tests M_start(1) + mi(0 to 198)
7750 9.0185 1 * 907#/2190 -450 to +7300
m=1 218 <- unknowns -> 212 450 <- gap -> 7300
tests 1 (15.29/sec) 0 seconds elapsed
unknowns 430 (avg: 430.00), 98.05% composite 50.70% <- % -> 49.30%
prp tests 144 (avg: 144.00) (2201.5 tests/sec)
7462 8.6548 17 * 907#/2190 -6982 to +480
6938 8.0460 19 * 907#/2190 -6722 to +216
m=37 214 <- unknowns -> 234 1296 <- gap -> 250
tests 10 (34.35/sec) 0 seconds elapsed
unknowns 4262 (avg: 426.20), 98.06% composite 48.97% <- % -> 51.03%
prp tests 643 (avg: 64.30) (2208.7 tests/sec)
8520 9.8689 53 * 907#/2190 -4084 to +4436
m=113 207 <- unknowns -> 215 40 <- gap -> 972
tests 30 (41.87/sec) 1 seconds elapsed
unknowns 12676 (avg: 422.53), 98.08% composite 49.31% <- % -> 50.69%
prp tests 1550 (avg: 51.67) (2163.0 tests/sec)
8382 9.6979 143 * 907#/2190 -1800 to +6582
7678 8.8820 163 * 907#/2190 -432 to +7246
m=199 216 <- unknowns -> 205 192 <- gap -> 30
tests 53 (43.88/sec) 1 seconds elapsed
unknowns 22377 (avg: 422.21), 98.08% composite 49.97% <- % -> 50.03%
prp tests 2590 (avg: 48.87) (2144.5 tests/sec)
$ python gap_test.py --unknown-filename 907_2190_1_200_s11000_l100M.txt --min-merit 8
...
7750 9.0185 1 * 907#/2190 -7300 to +450
1 218 <- unknowns -> 212 7300 <- gap -> 450
7462 8.6548 17 * 907#/2190 -480 to +6982
6938 8.0460 19 * 907#/2190 -216 to +6722
37 214 <- unknowns -> 234 250 <- gap -> 1296
tests 10 (35.7/sec) 0 seconds elapsed
unknowns 4262 (avg: 426.20), 98.06% composite 48.97% <- % -> 51.03%
prp tests 643 (avg: 64.30) (2294.854 tests/sec)
best merit this interval: 9.018 (at m=1)
8520 9.8689 53 * 907#/2190 -4436 to +4084
113 207 <- unknowns -> 215 972 <- gap -> 40
tests 30 (43.1/sec) 1 seconds elapsed
unknowns 12676 (avg: 422.53), 98.08% composite 49.31% <- % -> 50.69%
prp tests 1550 (avg: 51.67) (2226.529 tests/sec)
best merit this interval: 9.869 (at m=53)
8382 9.6979 143 * 907#/2190 -6582 to +1800
7678 8.8820 163 * 907#/2190 -7246 to +432
199 216 <- unknowns -> 205 30 <- gap -> 192
tests 53 (45/sec) 1 seconds elapsed
unknowns 22377 (avg: 422.21), 98.08% composite 49.97% <- % -> 50.03%
prp tests 2590 (avg: 48.87) (2199.432 tests/sec)
best merit this interval: 9.698 (at m=143)