Saving SAM read alignment with match chromosome/position, not the kme… #24
Earlier versions of `guidescan enumerate --format sam ...` assumed that the kmer chromosome and position could be ascertained directly from the incoming kmer file. While this is a reasonable assumption, the information is often not available beforehand, and blank/dummy values for chromosome/position are filled in only to satisfy `guidescan enumerate`.

In these cases, the output of `guidescan enumerate --format csv ...` is correct, since it obtains the match positions directly from the off-targets (including those at distance 0), but the output of `guidescan enumerate --format sam ...` is incorrect w.r.t. the reference name and reference position fields in the SAM file (because that information was unavailable to begin with).

This PR fixes this issue by getting that information directly from the off-targets at distance 0.
Note that this means that multiple matches at distance 0 will end up producing multiple lines in the SAM file, with identical off-target hex information. This is already happening in the CSV file generation, so this also makes the behavior consistent.
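
To illustrate the resulting behavior, here is a minimal Python sketch of the idea, not the actual guidescan implementation. The `Match` type, the `sam_lines` function, the `of:H:` tag name, and the fixed MAPQ/FLAG values are all assumptions made for illustration. It emits one SAM line per distance-0 match, taking the reference name and position from the match rather than from the kmer file, and repeats the same off-target hex string on every line.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    """Hypothetical record for one match of a kmer in the reference."""
    chrom: str     # reference sequence name of the match
    pos: int       # 1-based leftmost position of the match
    distance: int  # number of mismatches relative to the kmer


def sam_lines(name: str, seq: str, matches: List[Match], offtarget_hex: str) -> List[str]:
    """Build one SAM alignment line per distance-0 match.

    RNAME and POS come from the match itself, so dummy chromosome/position
    values in the input kmer file never reach the output.
    Strand handling is omitted here: FLAG is always 0.
    """
    lines = []
    for m in matches:
        if m.distance != 0:
            continue  # only exact matches become alignment records
        fields = [
            name,             # QNAME
            "0",              # FLAG (assumed forward strand)
            m.chrom,          # RNAME, taken from the match
            str(m.pos),       # POS, taken from the match
            "255",            # MAPQ (255 = unavailable)
            f"{len(seq)}M",   # CIGAR
            "*", "0", "0",    # RNEXT, PNEXT, TLEN
            seq,              # SEQ
            "*",              # QUAL
            f"of:H:{offtarget_hex}",  # assumed tag name; identical on every line
        ]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    example = [Match("chr1", 100, 0), Match("chr2", 250, 0), Match("chr3", 75, 1)]
    for line in sam_lines("kmer1", "ACGT", example, "0a1b"):
        print(line)
```

With two distance-0 matches in the input, this sketch produces two SAM lines that differ only in the reference name and position fields, which mirrors what the CSV output already does.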