BQSR: avoid throwing an error when read group is missing in the recal table, and some refactoring. #9020

Open
wants to merge 8 commits into
base: master

takutosato
Contributor

Addresses #6242.

Current behavior: when all the reads in a read group are filtered out in the base recalibration step, the read group is not recorded in the recal table. When ApplyBQSR later encounters reads from that read group, it can't find the read group in the recal table and throws an error.

New behavior: if the --allow-read-group flag is set to true, ApplyBQSR outputs the original qualities (after quantizing) for reads whose read group is missing from the recal table.
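
To make the fallback concrete, here is a rough sketch of the control flow described above. It is not the code in this PR: the class, the method names, the per-read-group "shift" standing in for the recal table, and the bins-of-ten quantization map are all hypothetical simplifications chosen only to illustrate the idea.

```java
import java.util.Map;

// Hypothetical sketch, not the GATK implementation.
final class MissingReadGroupSketch {
    static final int MAX_QUAL = 93; // highest Phred score representable in SAM

    // Assumed toy quantization map: rounds each quality down to a multiple of 10.
    static byte[] binsOfTen() {
        byte[] map = new byte[MAX_QUAL + 1];
        for (int q = 0; q <= MAX_QUAL; q++) {
            map[q] = (byte) ((q / 10) * 10);
        }
        return map;
    }

    // recalTable is a stand-in for the real recal table: read group ID -> quality shift.
    // allowMissing plays the role of the flag described in this PR.
    static byte[] recalibrate(String readGroup, byte[] originalQuals,
                              Map<String, Integer> recalTable,
                              byte[] quantizationMap, boolean allowMissing) {
        Integer shift = recalTable.get(readGroup);
        if (shift == null) {
            if (!allowMissing) {
                // Old behavior: a read group absent from the table is an error.
                throw new IllegalStateException(
                        "Read group " + readGroup + " not found in the recal table");
            }
            // New behavior: keep the original qualities, but still quantize them.
            shift = 0;
        }
        byte[] out = new byte[originalQuals.length];
        for (int i = 0; i < originalQuals.length; i++) {
            int q = Math.min(MAX_QUAL, Math.max(0, originalQuals[i] + shift));
            out[i] = quantizationMap[q];
        }
        return out;
    }
}
```

In this sketch, a read from an unknown read group with qualities {2, 17, 30, 39} comes back as {0, 10, 30, 30} when the flag is on, and raises an exception when it is off. The real recalibration conditions on more covariates than a single per-read-group shift; only the missing-read-group branch is meant to mirror the description.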

I avoided the alternative approach of collapsing (marginalizing) across the read groups, mostly because it would require a complete overhaul of the code. I also think that using recal data from other read groups might not be a good idea. In any case, using OQ should be good enough; I assume that these "missing" read groups are low enough quality to be filtered out and are likely to be thrown out by downstream tools.

I also refactored the BQSR code, mostly to update the variable and class names to be more accurate and descriptive. For instance:

ReadCovariates.java -> PerReadCovariateMatrix.java
EstimatedQReported -> ReportedQuality
