Commit
Fix sam_hdr_dup to cope with long refs.
The h->sdict hash is used to track references that are > 4Gb in size. The dup code didn't copy this, which manifested itself as CRAM SQ headers being truncated (read SAM hdr, dup, write as CRAM hdr).

To fix this, a function was written that creates or updates the sdict from the hrecs parsed header structs. It's possible this should be called directly from the sam_hdr_create function (part of the SAM format parser) instead of manually keeping track of sdict itself; however, doing so would require initialising the new header structs, so I haven't done this.

This is a general utility, so perhaps it should be made a public part of the header API. However, IMO the new header API should hide this nuance away and just return the correct data, also ensuring that header updates work correctly and honour the text form.

Since c83c9e2 the header API was also using the 32-bit capped target_len in preference to the parsed text from SQ LN fields when they differed. I am assuming this was a decision about what takes priority in BAM, where the sequence names and lengths exist in both text and binary form. This commit reverses this and makes the text form always take priority. As this is required in at least some scenarios (long references), it seems easier to simply make it apply in all scenarios.
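The "text form takes priority" rule can be illustrated with a minimal, standalone sketch. It does not use htslib and is not the code from this commit; the function names sq_text_len and resolve_ref_len are hypothetical, and it assumes the binary length is a 32-bit value that cannot represent long references while the @SQ line carries the true length in its LN field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an htslib function): return the LN value of a
 * tab-separated @SQ header line, or -1 if there is no usable LN field. */
int64_t sq_text_len(const char *sq_line)
{
    assert(sq_line != NULL);
    const char *p = sq_line;
    while ((p = strchr(p, '\t')) != NULL) {
        p++;                                  /* start of the next field */
        if (strncmp(p, "LN:", 3) == 0) {
            char *end;
            long long v = strtoll(p + 3, &end, 10);
            if (end == p + 3 || v <= 0)
                return -1;                    /* empty or invalid length */
            return (int64_t)v;
        }
    }
    return -1;
}

/* Hypothetical helper: choose the reference length to report.  The parsed
 * text length wins whenever it is present; the 32-bit binary value is only
 * a fallback, because it cannot hold lengths above UINT32_MAX. */
int64_t resolve_ref_len(const char *sq_line, uint32_t binary_len)
{
    int64_t text_len = sq_text_len(sq_line);
    return text_len > 0 ? text_len : (int64_t)binary_len;
}
```

With an @SQ line containing LN:5000000000 and a binary length capped at UINT32_MAX, resolve_ref_len returns 5000000000; with no LN field it falls back to the binary value.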
1 parent 7e3234e, commit a914023
Showing 3 changed files with 65 additions and 28 deletions.