Zymo-Research/krisReadFilter

KRIS Read Complexity Filter

A high-performance tool for filtering single-end or paired-end FASTQ reads based on k-mer entropy using the KRIS index.

Key Features

  • Automatically determines k-mer size from read length
  • Filters reads below a specified KRIS entropy threshold
  • Handles plain and gzipped FASTQ
  • Outputs summary statistics and SVG visualizations
  • Docker support for reproducibility
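As a rough illustration of the entropy-based filtering described above, the sketch below scores a read by the Shannon entropy of its k-mer distribution and keeps it when the score meets a threshold. This is an assumption-laden example, not the tool's actual code: the README does not give the KRIS index formula or the rule for choosing k from read length, so plain Shannon entropy stands in for the score, `k` is passed explicitly, and the names `kmer_entropy` and `passes_filter` are hypothetical.

```python
import math
from collections import Counter


def kmer_entropy(seq: str, k: int) -> float:
    """Shannon entropy (in bits) of the k-mer distribution of seq.

    Stand-in for the KRIS score; the real formula is not given in the README.
    """
    n = len(seq) - k + 1  # number of overlapping k-mers
    if n <= 0:
        return 0.0
    counts = Counter(seq[i:i + k] for i in range(n))
    # Sum p * log2(1/p) over the observed k-mers
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def passes_filter(seq: str, k: int, threshold: float) -> bool:
    """Keep a read when its score is at or above the threshold (assumed rule)."""
    return kmer_entropy(seq, k) >= threshold


print(passes_filter("ACGT", 1, 1.8))      # four distinct 1-mers
print(passes_filter("AAAAAAAA", 1, 1.8))  # homopolymer
```

A homopolymer read has a single repeated k-mer and therefore scores zero, which is the kind of low-information read a threshold such as 1.8 would remove under this stand-in definition.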

Installation

Pip (local use)

pip install .

Docker

docker build -t kris-read-complexity-filter .

Usage

Command-line interface

kris-read-complexity-filter \
  --in1 input_R1.fastq.gz \
  --in2 input_R2.fastq.gz \
  --out1 output_R1.filtered.fastq \
  --out2 output_R2.filtered.fastq \
  --threshold 1.8 \
  --report-dir reports/

For single-end data, omit --in2 and --out2.
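The command above takes gzipped input, and the tool is stated to handle plain FASTQ as well. One common way to support both is to check the gzip magic bytes before opening the file, as in this minimal sketch. The helper names `open_fastq` and `read_fastq` are hypothetical, and the sketch assumes well-formed four-line FASTQ records; the tool's own reader may work differently.

```python
import gzip
from typing import Iterator, TextIO, Tuple


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, detecting gzip by its two magic bytes."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) for each four-line record."""
    with open_fastq(path) as fh:
        while True:
            header = fh.readline().rstrip("\n")
            if not header:  # end of file
                break
            seq = fh.readline().rstrip("\n")
            fh.readline()  # skip the '+' separator line
            qual = fh.readline().rstrip("\n")
            yield header, seq, qual
```

Detecting the format from the file content rather than the extension means a gzipped file without a .gz suffix is still read correctly.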

Output

  • Filtered FASTQ files
  • summary.json: total reads, filtered count, retained reads, filter rate
  • kris_score_histogram.svg: histogram of KRIS scores
  • kris_score_boxplot.svg: boxplot comparing retained vs total scores
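To make the summary fields concrete, the sketch below derives them from two counts. The JSON key names and the definition of filter rate as filtered reads divided by total reads are assumptions for illustration; the actual layout of summary.json is not documented here.

```python
import json


def build_summary(total_reads: int, filtered_reads: int) -> dict:
    """Assemble summary statistics (key names are hypothetical)."""
    retained = total_reads - filtered_reads
    # Assumed definition: fraction of input reads that were removed
    rate = filtered_reads / total_reads if total_reads else 0.0
    return {
        "total_reads": total_reads,
        "filtered_reads": filtered_reads,
        "retained_reads": retained,
        "filter_rate": rate,
    }


print(json.dumps(build_summary(1000, 250), indent=2))
```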

Example

Filter paired-end reads with a minimum KRIS index of 2.0:

kris-read-complexity-filter \
  --in1 sample_R1.fastq.gz \
  --in2 sample_R2.fastq.gz \
  --out1 clean_R1.fastq \
  --out2 clean_R2.fastq \
  --threshold 2.0

About

Tool for filtering out low-information reads from sequencing data
