Description of the issue:
When the SAMFileWriterFactory writes a .cram file with the "create index" default toggled on, it creates a .bai file for the index rather than a .crai. As a result, running e.g. gatk MergeSamFiles --CREATE_INDEX… with a .cram output produces an output.cram.bai file instead of output.cram.crai.
Your environment:
version of htsjdk: 3.0.1
version of java: 17
which OS: MacOS
Steps to reproduce
Run gatk MergeSamFiles as described above.
Expected behaviour
You should get a .crai file.
Actual behaviour
You get a .bai file.
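To confirm which index actually landed next to the output, a small check along these lines may help. It is only a sketch: `CramIndexCheck` and `siblingIndexes` are hypothetical names, not part of htsjdk or GATK, and it assumes the index is named by appending `.crai` or `.bai` to the full output file name, as in the `output.cram.bai` case above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CramIndexCheck {
    // Hypothetical helper (not an htsjdk API): lists which index files
    // exist next to the given output, checking <name>.crai then <name>.bai.
    static List<String> siblingIndexes(Path output) {
        List<String> found = new ArrayList<>();
        for (String ext : new String[] {".crai", ".bai"}) {
            // e.g. output.cram -> output.cram.crai / output.cram.bai
            Path candidate = output.resolveSibling(output.getFileName() + ext);
            if (Files.exists(candidate)) {
                found.add(candidate.getFileName().toString());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Defaults to the file name used in this report.
        Path output = Path.of(args.length > 0 ? args[0] : "output.cram");
        System.out.println(siblingIndexes(output));
    }
}
```

Running it against the merged output should show whether only a `.bai` file is present. It looks at file names only and says nothing about the format of the index contents.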
There are a few very old issues surrounding .crai files in the repo. According to this issue it seems like support was added for this but kept off for reasons discussed here. Perhaps it's too much to resurrect the project of getting these indices sorted out, but at the moment it seems GATK just silently writes .cram.bai files because of this, which can be pretty confusing. I don't know enough about CRAM vs BAM to know how bad it might be to use one index for the other, but at least GATK seems to work just fine doing random access on CRAMs with the .bai file produced as described above. I'm also not sure whether this issue should be pushed up to GATK or kept down here in htsjdk. At the very least it'd be nice if the library could be updated to use the proper file extension for the index.
@rickymagner It's actually producing a bai index, not a crai. So it would be equally wrong to rename it to crai. It would be great to fix it to make a crai index but I think it's a bit of a project.