add subsample_reads #138

antgonza · 2025-07-10T22:17:19Z

No description provided.

AmandaBirmingham

One request for change to an error message, a couple of questions

AmandaBirmingham · 2025-07-11T18:38:05Z

src/qp_klp/Protocol.py

+                           f'| gzip > {f}')
+                    _, se, rv = system_call(cmd)
+                    if rv != 0 or se:
+                        raise ValueError(f'Error during mv: {cmd}. {se}')


This error message seems misleading since it (still) says "Error during mv"; I think it should be changed to say it is reporting on errors that occur during seqtk.
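A minimal sketch of the requested change, assuming `system_call` returns a `(stdout, stderr, return_code)` tuple as the diff suggests. The stand-in `system_call` and the `run_seqtk` wrapper below are hypothetical and only illustrate the reworded message; they are not the code in `Protocol.py`.

```python
import subprocess


def system_call(cmd):
    """Hypothetical stand-in for the project's helper.

    Runs cmd through the shell and returns (stdout, stderr, return_code).
    """
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode


def run_seqtk(cmd):
    """Run a seqtk pipeline and name seqtk (not mv) if it fails."""
    _, se, rv = system_call(cmd)
    # Same failure condition as in the diff: non-zero exit or any stderr.
    if rv != 0 or se:
        raise ValueError(f'Error during seqtk: {cmd}. {se}')
```

With this wording, a failed subsampling command is reported as a seqtk error rather than as a failed `mv`.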

AmandaBirmingham · 2025-07-11T18:39:38Z

src/qp_klp/Protocol.py

+                for f in files:
+                    dn = dirname(f)
+                    bn = basename(f)
+                    nbn = join(dn, bn.replace('fastq.gz', 'subsampled.gz'))


Is it going to cause any problems later that the subsampled reads file doesn't end with 'fastq.gz'? I'm not familiar with the workflow but I know I've seen regexes around that expect this suffix ...

I wanted to keep a backup of the original file in the working directory, just in case we need to debug it. This moves the original problematic file to a new name ending in 'subsampled.gz'; the next command then subsamples it into a new (smaller) file with the original name. In fact, I'm relying on those regexes to ignore the subsampled.gz file. I'll add a comment about this.

Ah, that makes sense :) However, I wonder if we could name them something other than "subsampled.gz"? That name makes me think that the file with that name IS the one that was subsampled, rather than being the original one. Could we name it "not_subsampled.gz" or something?

Yes, while writing the comment I realized the same thing so I called it "full".
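The naming scheme discussed in this thread can be sketched as follows. The exact suffix used after the rename is not shown here, so `full.gz` is an assumption based on the reply above, and both `backup_name` and the `FASTQ_RE` pattern are hypothetical illustrations of "downstream regexes that expect `fastq.gz`".

```python
import re
from os.path import basename, dirname, join

# Hypothetical example of a downstream pattern that only accepts fastq.gz.
FASTQ_RE = re.compile(r'\.fastq\.gz$')


def backup_name(f, suffix='full.gz'):
    """Return the path the original (unsubsampled) file is moved to.

    Assumption: the backup keeps the same directory and swaps the
    'fastq.gz' ending for `suffix`, mirroring the replace() in the diff.
    """
    dn = dirname(f)
    bn = basename(f)
    return join(dn, bn.replace('fastq.gz', suffix))


original = '/work/sample_R1.fastq.gz'
backup = backup_name(original)

# The subsampled output reuses the original name, so it still matches;
# the backup does not match and is skipped by later steps.
assert FASTQ_RE.search(original)
assert not FASTQ_RE.search(backup)
```

The design choice is that the smaller file takes over the original name, so later steps need no changes, while the full-size backup is deliberately invisible to them.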

AmandaBirmingham · 2025-07-11T18:40:18Z

src/qp_klp/Assays.py

            self.convert_raw_to_fastq()
            self.integrate_results()
            self.generate_sequence_counts()
+            self.subsample_reads()


Does this mean that we will now ALWAYS subsample?

Yes, but really we will always check whether subsampling is needed and only run it when necessary.

Well, we will always subsample every fastq that has more than the max number of reads, right?
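The behavior agreed on in this exchange (the step always runs, but only files over the limit are touched) might look like the sketch below. The function name, the `read_counts` mapping and the `max_reads` parameter are hypothetical; the thread does not show how the counts or the threshold are actually stored.

```python
def needs_subsampling(read_counts, max_reads):
    """Return the files whose read count exceeds max_reads.

    read_counts: dict mapping a fastq path to its number of reads
    (hypothetical structure). Files at or below the limit are left alone.
    """
    return sorted(f for f, n in read_counts.items() if n > max_reads)


counts = {
    'a_R1.fastq.gz': 1_000,
    'b_R1.fastq.gz': 5_000_000,
    'c_R1.fastq.gz': 2_000_000,
}

# Only b_R1.fastq.gz is over the (made-up) limit of 2,000,000 reads.
to_subsample = needs_subsampling(counts, max_reads=2_000_000)
```

If no file exceeds the limit, the list is empty and the step does nothing.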

add subsample_reads

856e9b4

antgonza changed the title ~~add subsample_reads~~ [WIP] add subsample_reads Jul 10, 2025

antgonza added 7 commits July 11, 2025 06:33

support the 2 different seq counts

ab54d4a

index_col

6c6ebd6

self.assay_type == 'Amplicon'

19e53a3

pack warnings

8be711a

so -> _

cb6848b

self.warnings -> self.assay_warnings

a5b6ae8

"w"

c2ee439

AmandaBirmingham requested changes Jul 11, 2025

antgonza added 3 commits July 11, 2025 14:48

addressing @AmandaBirmingham comments

98c294f

self.warnings -> self.assay_warnings

13ffd2b

flake 8

94e885d

antgonza changed the title ~~[WIP] add subsample_reads~~ add subsample_reads Jul 11, 2025

antgonza merged commit 2b7832e into qiita-spots:main Jul 13, 2025
2 checks passed
