Report all chromosomes/regions even though they have zero coverage #222

Open
xsoleacha opened this issue Feb 12, 2024 · 2 comments

@xsoleacha

Hi Brent,

I have a suggestion for mosdepth that would make its behavior more consistent regardless of the input data. When a whole chromosome has zero coverage (for whatever reason), it is not reported in the summary file. Likewise, chromosomes with zero coverage are not reported at all in the per-base file.

Since the per-base and summary files are supposed to report coverage for the whole genome, my suggestion is to also report regions and chromosomes with zero coverage, so that the structure of mosdepth's output is the same whatever the coverage of the input data.

Example of the summary file that I get right now (chr8 and chr9 are missing):
chr7 159138663 751 0.00 0 2
chr7_region 10152 231 0.02 0 1
chr10 135534747 594 0.00 0 2
chr10_region 3672 224 0.06 0 2

Suggested output:
chr7 159138663 751 0.00 0 2
chr7_region 10152 231 0.02 0 1
chr8 <length_of_chr8> 0 0.00 0 0
chr8_region <length_of_chr8_region> 0 0.00 0 0
chr9 <length_of_chr9> 0 0.00 0 0
chr9_region <length_of_chr9_region> 0 0.00 0 0
chr10 135534747 594 0.00 0 2
chr10_region 3672 224 0.06 0 2
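
Until something like this is built in, the missing summary rows can be filled in downstream. The following is only a minimal sketch, not mosdepth's own behavior: `pad_summary` and `contig_lengths` are hypothetical names, and `contig_lengths` is assumed to be an ordered mapping of chromosome name to length that the caller supplies (for example, read from the reference index). It assumes six whitespace-separated columns as in the rows above, keeps existing `<name>_region` rows next to their chromosome, and does not try to add zero rows for regions (their lengths are not known here) or to preserve any header or total line.

```python
from typing import Dict, List


def pad_summary(lines: List[str], contig_lengths: Dict[str, int]) -> List[str]:
    """Return summary rows in contig order, adding zero rows for absent contigs.

    `contig_lengths` is an assumption of this sketch: name -> length, in the
    order the contigs should appear in the output.
    """
    # Index existing rows by their first column (chromosome or region name).
    seen: Dict[str, str] = {}
    for line in lines:
        fields = line.split()
        if fields:
            seen.setdefault(fields[0], line.rstrip("\n"))

    out: List[str] = []
    for name, length in contig_lengths.items():
        if name in seen:
            out.append(seen[name])
        else:
            # Zero-coverage row: name, length, bases, mean, min, max.
            out.append(f"{name}\t{length}\t0\t0.00\t0\t0")
        # Keep an existing "<name>_region" row right after its chromosome.
        region = f"{name}_region"
        if region in seen:
            out.append(seen[region])
    return out
```

Rows whose name is neither in `contig_lengths` nor a matching `_region` entry are dropped by this sketch, so it would need extending before use on a full summary file.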

Example of per-base file that I get right now:
chr7 0 27184209 0
chr7 27184209 27184271 1
chr7 27184271 55242388 0
chr7 55242388 55242389 1
chr7 55242389 55242412 2
chr7 55242412 56604537 0
chr7 56604537 56604678 1
chr7 56604678 110828782 0
chr7 110828782 110828949 1
chr7 110828949 140453080 0
chr7 140453080 140453247 1
chr7 140453247 140481371 0
chr7 140481371 140481538 1
chr7 140481538 159138663 0
chr10 0 89717480 0
chr10 89717480 89717616 1
chr10 89717616 89720664 0
chr10 89720664 89720764 1
chr10 89720764 89720773 2
chr10 89720773 89720872 1
chr10 89720872 107340261 0
chr10 107340261 107340313 1
chr10 107340313 118073651 0
chr10 118073651 118073840 1
chr10 118073840 135534747 0

Suggested output:
chr7 0 27184209 0
chr7 27184209 27184271 1
chr7 27184271 55242388 0
chr7 55242388 55242389 1
chr7 55242389 55242412 2
chr7 55242412 56604537 0
chr7 56604537 56604678 1
chr7 56604678 110828782 0
chr7 110828782 110828949 1
chr7 110828949 140453080 0
chr7 140453080 140453247 1
chr7 140453247 140481371 0
chr7 140481371 140481538 1
chr7 140481538 159138663 0
chr8 0 <length_chr8> 0
chr9 0 <length_chr9> 0
chr10 0 89717480 0
chr10 89717480 89717616 1
chr10 89717616 89720664 0
chr10 89720664 89720764 1
chr10 89720764 89720773 2
chr10 89720773 89720872 1
chr10 89720872 107340261 0
chr10 107340261 107340313 1
chr10 107340313 118073651 0
chr10 118073651 118073840 1
chr10 118073840 135534747 0
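
The same padding can be sketched for the per-base file. Again, this is an illustration under assumptions rather than anything mosdepth provides: `pad_per_base` and `contig_lengths` are hypothetical names, `contig_lengths` is a caller-supplied ordered mapping of chromosome name to length, and the input is assumed to be already-decompressed text lines with four columns (chrom, start, end, depth) as shown above.

```python
from typing import Dict, Iterable, Iterator, List


def pad_per_base(lines: Iterable[str], contig_lengths: Dict[str, int]) -> Iterator[str]:
    """Yield per-base rows in contig order, with one zero-depth row
    spanning each contig that has no rows in the input.

    `contig_lengths` is an assumption of this sketch: name -> length,
    in the desired output order.
    """
    # Group existing rows by chromosome, preserving their original order.
    by_chrom: Dict[str, List[str]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if line:
            by_chrom.setdefault(line.split()[0], []).append(line)

    for name, length in contig_lengths.items():
        if name in by_chrom:
            yield from by_chrom[name]
        else:
            # Whole contig at depth 0: chrom, start, end, depth.
            yield f"{name}\t0\t{length}\t0"
```

This version holds every row in memory for simplicity; for a large per-base file a streaming approach would be preferable.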

This change would make the output of mosdepth easier to parse when some chromosomes have no coverage, since every chromosome would always be present.

Thanks as always for all your effort!

XS.

@brentp
Owner

brentp commented Feb 14, 2024

Hi Xavier, I agree this would be useful, and it is suggested in another open issue. I just haven't had the time to implement it. I'll try to get to it soon.

@xsoleacha
Author

Dear Brent,

Thank you for your quick reply and for your efforts maintaining the tool. It is very useful for us!

Best regards,

Xavi.
