Added long read walkers from long-running branch. #8799

Open
wants to merge 3 commits into
base: master

jonn-smith
Collaborator

Fixed warnings in long read tools that were pulled in.

@gatk-bot

Github actions tests reported job failures from actions build 8883008459
Failures in the following jobs:

| Test Type | JDK | Job ID | Logs |
| --- | --- | --- | --- |
| integration | 17.0.6+10 | 8883008459.11 | logs |

Collaborator

@jamesemery jamesemery left a comment

A bunch of issues. I mostly didn't touch the actual logic of these tools, but rather looked at their documentation and conformity to the GATK style. In general, you should update the docs for each of these tools, as there were a bunch of copy-paste errors. Furthermore, most of your String outputs can become GATKPath arguments for maximum robustness.

The elephant in the room is that none of this has tests. That's a broader discussion, but I don't want to hold up these experimental tools by asking for lots and lots of test code, especially when a bunch of them are pretty simple (if not necessarily unambiguously named in all cases).

* </ul>
*
* <h3>Usage Example</h3>
* <h4>Quickly count errors</h4>

Wrong docs. It looks like this picked up elements from multiple tools.

@Override
public void apply(GATKRead read, ReferenceContext referenceContext, FeatureContext featureContext) {
    byte[] quals = new byte[read.getBases().length];
    Arrays.fill(quals, (byte) 40);

Shouldn't the specific base quality be configurable?
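
For illustration, here is a minimal sketch of what a configurable value could look like. It is not the PR's code: the class and method names are hypothetical, and it uses plain Java in place of the GATK read and argument types. In the tool itself the value would presumably be exposed as a command-line argument, with 40 kept as the default. The upper bound of 93 is assumed from the SAM text encoding, where a quality is printed as the character with code `value + 33`.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the quality-overwriting step; not the PR's code. */
final class FixedQualitySketch {
    /** Default taken from the snippet under review (Phred 40). */
    static final byte DEFAULT_QUAL = 40;

    /** Largest Phred score printable in SAM text, since value + 33 must not exceed '~' (126). */
    static final byte MAX_QUAL = 93;

    /** Builds a quality array of the given length filled with one validated Phred score. */
    static byte[] fixedQuals(int readLength, int qual) {
        if (qual < 0 || qual > MAX_QUAL) {
            throw new IllegalArgumentException("Base quality must be in [0, 93], got " + qual);
        }
        byte[] quals = new byte[readLength];
        Arrays.fill(quals, (byte) qual);
        return quals;
    }

    public static void main(String[] args) {
        // An optional first argument overrides the default quality.
        int qual = args.length > 0 ? Integer.parseInt(args[0]) : DEFAULT_QUAL;
        System.out.println(Arrays.toString(fixedQuals(5, qual)));
    }
}
```

Validating the value up front means a bad setting fails before any reads are processed, not partway through writing the output.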

* gatk ShardLongReads \
* -I input.bam \
* -O outputDirectory \
* --num-reads-per-split 10000

Include a comment about what the output directory structure should look like.

@DocumentedFeature
@ExperimentalFeature
@CommandLineProgramProperties(
summary = "Quickly replace read quals with fixed value (@).",

Also the wrong documentation here.

oneLineSummary = "Quickly replace read quals with fixed value (@).",
programGroup = ReadDataManipulationProgramGroup.class
)
public final class ReplaceQuals extends ReadWalker {

Maybe name it something like OverwriteReadBaseQualities to be a little more explicit.

doc="Path to which per-read information should be written")
public String perReadPathName;

@Argument(fullName = "outputIntervalInfo",

This is a required output that is not documented.

@ExperimentalFeature
@CommandLineProgramProperties(
summary = "Count indels",
oneLineSummary = "Count indels",

Count indels by interval.
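
To make "by interval" concrete, the sketch below tallies insertions and deletions across all reads overlapping one interval. It assumes indels are counted as CIGAR `I` and `D` elements, which may not match the tool's actual logic since that is not shown in this thread. The class and method names are hypothetical, and CIGARs are passed as plain strings in place of GATK read objects.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of per-interval indel counting from CIGAR strings. */
final class IndelCountSketch {
    // One CIGAR element: a length followed by one of the SAM operators.
    private static final Pattern CIGAR_ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Returns {insertions, deletions} for a single read's CIGAR string. */
    static int[] countIndelsInRead(String cigar) {
        int insertions = 0;
        int deletions = 0;
        Matcher m = CIGAR_ELEMENT.matcher(cigar);
        while (m.find()) {
            char op = m.group(2).charAt(0);
            if (op == 'I') {
                insertions++;
            } else if (op == 'D') {
                deletions++;
            }
        }
        return new int[] {insertions, deletions};
    }

    /** Sums {insertions, deletions} over every read overlapping one interval. */
    static int[] countIndelsInInterval(List<String> cigars) {
        int[] total = new int[2];
        for (String cigar : cigars) {
            int[] perRead = countIndelsInRead(cigar);
            total[0] += perRead[0];
            total[1] += perRead[1];
        }
        return total;
    }
}
```

Note that this counts indel events, not inserted or deleted bases. Whichever the tool reports is worth stating in the summary text.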

}

@Override
public void apply(SimpleInterval interval, ReadsContext readsContext, ReferenceContext referenceContext, FeatureContext featureContext) {

This reads context has the potential to be absurdly huge and crash the tool if the intervals are too big. This should probably be warned about somewhere.
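
One possible shape for such a warning is sketched below, with plain Java standing in for the GATK types. The threshold, class name, and message are all hypothetical. A sensible cutoff would depend on coverage and available heap, so the value here is only a placeholder. Intervals are treated as 1-based and closed, matching SimpleInterval.

```java
/** Hypothetical guard that flags intervals likely to pull in too many reads at once. */
final class IntervalSizeGuardSketch {
    /** Placeholder threshold in bases; a real value would need tuning. */
    static final long WARN_INTERVAL_LENGTH = 1_000_000L;

    /** Length of a 1-based, closed interval [start, end]. */
    static long intervalLength(int start, int end) {
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("Invalid interval: " + start + "-" + end);
        }
        return (long) end - start + 1;
    }

    /** True when the interval is longer than the placeholder threshold. */
    static boolean shouldWarn(int start, int end) {
        return intervalLength(start, end) > WARN_INTERVAL_LENGTH;
    }

    /** Builds the warning text; the caller decides where to log it. */
    static String warningFor(String contig, int start, int end) {
        return "Interval " + contig + ":" + start + "-" + end + " spans "
                + intervalLength(start, end)
                + " bases; all overlapping reads are held in memory at once.";
    }
}
```

A check like this could run once per interval before the reads are touched, or the same caveat could simply go in the tool's documentation.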

@CommandLineProgramProperties(
summary = "Count indels",
oneLineSummary = "Count indels",
programGroup = LongReadProgramGroup.class

This is one of the tools in here that could probably belong in one of the other program groups if you included much more detailed documentation about what it does.

oneLineSummary = "Count indels",
programGroup = LongReadProgramGroup.class
)
public final class CountIndels extends IntervalWalker {

Probably CountIndelsPerInterval.

3 participants