CSI file is BGZF compressed but this is not mentioned in the CSIv1 spec #765
Comments
While I agree adding this would be beneficial, it's the least problematic bit about the spec! It certainly would be good if the original authors could add more about it. One thing that confused me a lot is the "Auxiliary data", which changes format depending on the thing being indexed. (IIRC it's tabix data for VCF and some BAI-related format for BCF.) I assume it's meant to be generic, but that also makes it largely unparseable without custom knowledge. Ping @lh3 @pd3: is there any more information on CSI somewhere else? It looks like it arrived with this commit and subsequent commits. This appears to be where the original minimal spec documentation came from too.
See also #70, a long-standing issue noting this.
I used bcftools 1.19 to index a BCF file and tried to parse the CSI index file according to the spec https://github.com/samtools/hts-specs/blob/26347448cadff3cf40982d60fe2a97f20d2543ea/CSIv1.tex#L20C28-L20C33. It did not work as expected. After running
hexdump -C
on the CSI file, I realized it is not a plain binary file as described in the CSIv1 spec. However, it seems consistent with the spec after decompressing it:
bgzip -cd test.bcf.csi | hexdump -C
Could we add a sentence to the spec to point this out for future readers? Or is this not part of the spec?
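For anyone hitting the same problem, here is a minimal Python sketch of the workaround described above: decompress first, then read the leading header fields. It assumes the field layout given in the CSIv1 spec (magic, min_shift, depth, l_aux, aux, n_ref, all little-endian), and it relies on BGZF being a series of concatenated gzip members, so the standard gzip module is enough. The name read_csi_header is made up for illustration and is not part of htslib or any other library.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip/BGZF member
CSI_MAGIC = b"CSI\x01"     # magic string from the CSIv1 spec


def read_csi_header(raw: bytes) -> dict:
    """Parse the leading CSI header fields (hypothetical helper, not htslib API)."""
    # BGZF blocks are ordinary gzip members, so gzip.decompress can
    # handle the whole file, including the empty EOF block.
    data = gzip.decompress(raw) if raw[:2] == GZIP_MAGIC else raw

    if data[:4] != CSI_MAGIC:
        raise ValueError("not a CSI index: bad magic")

    # Three little-endian int32 values follow the 4-byte magic.
    min_shift, depth, l_aux = struct.unpack_from("<iii", data, 4)

    # The auxiliary data is kept opaque here because its layout depends
    # on the indexed file type.
    aux = data[16:16 + l_aux]
    (n_ref,) = struct.unpack_from("<i", data, 16 + l_aux)

    return {
        "min_shift": min_shift,
        "depth": depth,
        "l_aux": l_aux,
        "aux": aux,
        "n_ref": n_ref,
    }
```

To use it, read the index file in binary mode (for example test.bcf.csi) and pass the bytes to read_csi_header. The per-reference bin and chunk data that follows n_ref is not parsed in this sketch.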