A high-performance tool for filtering single-end or paired-end FASTQ reads based on k-mer entropy using the KRIS index.
- Automatically determines k-mer size from read length
- Removes reads that score below a specified KRIS entropy threshold
- Handles plain and gzipped FASTQ
- Outputs summary statistics and SVG visualizations
- Docker support for reproducibility
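
The idea behind the first two features can be sketched in a few lines of Python. This is only an illustration, not the tool's code: the exact KRIS index definition and the rule for choosing k are not stated here, so the sketch substitutes plain Shannon entropy over k-mer counts, and `choose_k`, `kmer_entropy`, and `passes` are hypothetical names.

```python
import math
from collections import Counter


def choose_k(read_length: int) -> int:
    """Hypothetical rule: largest k (capped at 8) with 4**k <= read_length."""
    k = 1
    while k < 8 and 4 ** (k + 1) <= read_length:
        k += 1
    return k


def kmer_entropy(seq: str, k: int) -> float:
    """Shannon entropy in bits of the k-mer distribution of seq.

    A stand-in for the KRIS index, which may be defined differently.
    """
    n = len(seq) - k + 1  # number of overlapping k-mers
    if n <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n))
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def passes(seq: str, threshold: float) -> bool:
    """Keep a read when its score is at or above the threshold."""
    return kmer_entropy(seq.upper(), choose_k(len(seq))) >= threshold


if __name__ == "__main__":
    for read in ("A" * 100, "ACGT" * 25):
        print(read[:12], round(kmer_entropy(read, choose_k(len(read))), 3))
```

A homopolymer read scores 0 under this stand-in and would be dropped at any positive threshold, while reads with a more varied k-mer content score higher.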

Install from the repository root:

    pip install .

Or build the Docker image:

    docker build -t kris-read-complexity-filter .

Run the filter on paired-end reads:

    kris-read-complexity-filter \
      --in1 input_R1.fastq.gz \
      --in2 input_R2.fastq.gz \
      --out1 output_R1.filtered.fastq \
      --out2 output_R2.filtered.fastq \
      --threshold 1.8 \
      --report-dir reports/

For single-end data, omit --in2 and --out2.

Each run produces:
- Filtered FASTQ files
- summary.json: total reads, filtered count, retained reads, filter rate
- kris_score_histogram.svg: histogram of KRIS scores
- kris_score_boxplot.svg: boxplot comparing retained vs total scores
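
The four summary fields relate to each other in a simple way, which the following Python sketch makes explicit. The key names and the `summarize` helper are hypothetical (the actual layout of summary.json is not documented here), and the sketch assumes the filter rate is the fraction of reads removed.

```python
import json


def summarize(scores, threshold):
    """Derive summary counts from per-read scores (hypothetical key names)."""
    total = len(scores)
    filtered = sum(1 for s in scores if s < threshold)  # reads removed
    return {
        "total_reads": total,
        "filtered_reads": filtered,
        "retained_reads": total - filtered,
        # Assumed definition: removed reads divided by all reads.
        "filter_rate": filtered / total if total else 0.0,
    }


if __name__ == "__main__":
    print(json.dumps(summarize([0.4, 1.9, 2.3, 1.2], threshold=1.8), indent=2))
```

Check the real summary.json for the actual field names before parsing it in a pipeline.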

Filter paired-end reads with a minimum KRIS index of 2.0:

    kris-read-complexity-filter \
      --in1 sample_R1.fastq.gz \
      --in2 sample_R2.fastq.gz \
      --out1 clean_R1.fastq \
      --out2 clean_R2.fastq \
      --threshold 2.0