Domain Discovery / Minimal Certificate Transparency Log solution #434
Just for documentation purposes: scanning CT logs is a huge step forward. However, note that this way the dashboard will not discover:
Thanks @baknu, very true, this mainly benefits the web test, not the mail test.
The issue called 'Limit max domains via certificate transparency' can be merged into this one, as it is something to keep in mind when working on this. Getting 1000 subdomains for each of 1000 domains in your list is fun, but not supported and requires several other optimizations. Knowing in advance how many subdomains might be found, limiting that number, or allowing users to cherry-pick subdomains would make this feature more 'workable'.
Extra note about crt.sh: https://crt.sh/atom is pretty nice since it is XML instead of HTML.
New notes about crt.sh: they allow direct PostgreSQL read access to their database¹: `$ psql -h crt.sh -p 5432 -U guest certwatch`. The schema can be found at https://github.com/crtsh/certwatch_db/. Some rate limits apply: it's limited to 5 connections per IP and still regularly gives:
Therefore it's probably best to create some daily dump with newly seen Precertificates & Leaf certificates (note: see some stats about the crt.sh fill ratio of known certificate serials; because of this, both should be parsed). So maybe an idea would be to have a daily job execute the psql command with
¹ It seems to be a hot-standby, because of the errors (see Stack Overflow):
Why did I not know about this? It has been available like forever (at least more than 5 years). Maybe I would have discovered it earlier if I port scanned the hostnames I visit by default ;)
Currently crt.sh is too unstable to use (500 errors), which means we have to push it to the background and cannot show the direct impact of adding the CT log subdomains.
Are there solutions to monitor a CT log server and just log the domain names (not all the crypto etc.)?
Cloning a CT log server would be 1+TB, so that's a bit large (reference: https://letsencrypt.org/2019/11/20/how-le-runs-ct-logs.html#database).
Some links for how the CT log API works: https://security.stackexchange.com/a/167373
e.g.
https://oak.ct.letsencrypt.org/2023/ct/v1/get-sth
https://oak.ct.letsencrypt.org/2023/ct/v1/get-entries?start=1000&end=1014
Some CT log servers:
E.g. something like:
I queried from 256000000 to 256256000, so 1000 requests and 256000 entries. This resulted in 541100 domains (13.28MiB/3.32MiB), and 402560 unique entries (9.84MiB/2.33MiB). My main issue was CPU usage in `jq`! 'It seems to work' for some sample cases, although this should not be used in production*.
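The batching behind that query can be sketched in Python as follows. This is only an illustration under assumptions: the helper names `entry_ranges` and `get_entries_url` are made up here, and the batch size of 256 is inferred from 256000 entries over 1000 requests. Per RFC 6962 both `start` and `end` are inclusive, and a log may return fewer entries than requested, so a real client would have to check how many entries came back.

```python
from typing import Iterator, Tuple


def entry_ranges(first: int, last: int, batch_size: int = 256) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) index pairs covering first..last."""
    start = first
    while start <= last:
        end = min(start + batch_size - 1, last)
        yield start, end
        start = end + 1


def get_entries_url(log_url: str, start: int, end: int) -> str:
    """Build a get-entries URL for a log base URL (hypothetical helper)."""
    return f"{log_url.rstrip('/')}/ct/v1/get-entries?start={start}&end={end}"


if __name__ == "__main__":
    # Print the first few request URLs for the range mentioned above.
    base = "https://oak.ct.letsencrypt.org/2023"
    for i, (start, end) in enumerate(entry_ranges(256000000, 256255999)):
        if i >= 3:
            break
        print(get_entries_url(base, start, end))
```

No request is sent here; fetching, retrying and handling short responses are left out on purpose.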
Other tools:
Todo:
Are there already tools to transform CT logs into some parse-able data stream?
Find out the max records `get-entries` supports per CT log (certificate-transparency groups 2020 discussion). Also need to align: https://community.letsencrypt.org/t/enabling-coerced-get-entries/114436
Check retry/fail logic
Find out monitoring certificates to ignore
* Note that this is quite hacky code, since `jq` is not the best tool for binary data (chars != bytes, since jq has unicode support). The `leaf_input` is of the `MerkleTreeLeaf` structure. So: byte 0 is version, byte 1 is MerkleLeafType, bytes 2..9 are the timestamp, bytes 10..11 are the LogEntryType and should be `\x00\x00` for an x509_entry. Both `leaf_input` and `extra_data` then have a 3 byte length field that can be skipped over. Because it aligns on 15 bytes, and 3 bytes × 8 bit / 6 bit base64 => 15×8/6=20 base64 chars, 3×8/6=4 base64 chars, we can directly operate on the base64 string to skip these bytes. One can also use `dd bs=4096 skip=15 iflag=skip_bytes status=none` for the X509 entries and `dd bs=4096 skip=3 iflag=skip_bytes status=none` for the PreCertificates. For debug: `openssl asn1parse -inform der -i`. See https://datatracker.ietf.org/doc/html/rfc6962#section-3.4
Structure of the Merkle Tree input:
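As an illustration of the byte layout from the note above, here is a minimal Python sketch. The function names are hypothetical and it only extracts the certificate for an x509_entry; precert entries are recognised but not unpacked. The second function shows the base64 shortcut: dropping the first 20 base64 characters equals dropping the first 15 bytes, although the result then still carries whatever follows the certificate in the leaf (the extensions field).

```python
import base64
import struct

X509_ENTRY = 0
PRECERT_ENTRY = 1


def parse_leaf_input(leaf_input_b64: str) -> dict:
    """Decode the fixed 12-byte header of a MerkleTreeLeaf (RFC 6962, 3.4)."""
    raw = base64.b64decode(leaf_input_b64)
    # byte 0: version, byte 1: leaf type, bytes 2..9: timestamp (ms),
    # bytes 10..11: log entry type; all big-endian, no padding.
    version, leaf_type, timestamp_ms, entry_type = struct.unpack(">BBQH", raw[:12])
    leaf = {
        "version": version,
        "leaf_type": leaf_type,
        "timestamp_ms": timestamp_ms,
        "entry_type": entry_type,
        "cert_der": None,
    }
    if entry_type == X509_ENTRY:
        # 3-byte length field, then the DER certificate itself.
        length = int.from_bytes(raw[12:15], "big")
        leaf["cert_der"] = raw[15:15 + length]
    return leaf


def skip_header_in_base64(leaf_input_b64: str) -> bytes:
    """Skip 15 bytes by cutting 20 base64 chars (15 * 8 / 6 = 20)."""
    return base64.b64decode(leaf_input_b64[20:])
```

The `cert_der` bytes can be fed to `openssl asn1parse -inform der -i` for inspection.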
Ideally we would just have a compressed / suffix trie data structure with the (reversed) fully qualified domain name (FQDN):
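A rough Python sketch of that idea, under assumptions: the class name `DomainTrie` and its methods are invented for illustration, and this is a plain, uncompressed trie of nested dicts keyed on the labels in reversed order (`nl` → `example` → `www`). It is not the compressed structure one would want for hundreds of thousands of names, but it shows why reversing helps: all subdomains of a domain share one subtree.

```python
class DomainTrie:
    """Trie over reversed FQDN labels; '' marks the end of a stored name."""

    _END = ""

    def __init__(self):
        self._root = {}

    @staticmethod
    def _labels(fqdn):
        # "www.example.nl." -> ["nl", "example", "www"]; empty labels dropped
        return [l for l in reversed(fqdn.lower().rstrip(".").split(".")) if l]

    def add(self, fqdn):
        node = self._root
        for label in self._labels(fqdn):
            node = node.setdefault(label, {})
        node[self._END] = True

    def __contains__(self, fqdn):
        node = self._root
        for label in self._labels(fqdn):
            node = node.get(label)
            if node is None:
                return False
        return self._END in node

    def subdomains(self, suffix):
        """Yield every stored name equal to or below the given suffix."""
        labels = self._labels(suffix)
        node = self._root
        for label in labels:
            node = node.get(label)
            if node is None:
                return
        stack = [(labels, node)]
        while stack:
            path, current = stack.pop()
            for label, child in current.items():
                if label == self._END:
                    yield ".".join(reversed(path))
                else:
                    stack.append((path + [label], child))


if __name__ == "__main__":
    trie = DomainTrie()
    for name in ("www.example.nl", "mail.example.nl", "example.com"):
        trie.add(name)
    print(sorted(trie.subdomains("example.nl")))
```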