Hi,
as discussed, here are the proposed enhancements:
locityper v1.2.0 @ 2025-10-07 19:33:08
parameterization:
locityper recruit --input HIFI_READS --seqs-all TARGETS --distinct --output OUTPUT --minimizer 21 15 --chunk-size 500 --match-len 10000 --threads 12
...
Collected 311391945 minimizers across 43380 loci and 43380 sequences
...
Cgroup mem limit exceeded ...
# fails with ~350 GB of available memory
The HiFi reads are a single SMRT cell dataset (Revio). Other runs with the same or similar read input finish successfully, which points to the number of target sequences as the root cause.
Suggested enhancements:
- no solution, but a heads-up for users:
  - mention the scaling behavior in the docs / CLI help; if it is confirmed that the number of target sequences is the problem, recommend a maximum number of targets per target file, so that users know right away how to split the input (divide and conquer)
- desired solution: implement chunking for processing target sequences
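Until chunking exists in the tool itself, the manual split could look roughly like the sketch below. It assumes that the targets file is a plain-text FASTA file and that records can be processed independently of each other (here there is one sequence per locus, so that should hold). The function names, the `targets.partNNNN.fa` naming and the `max_records` value are made up for illustration; this is not part of locityper.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header = None
    seq_lines: List[str] = []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(seq_lines)
                header, seq_lines = line[1:], []
            elif line:
                seq_lines.append(line)
    if header is not None:
        yield header, "".join(seq_lines)


def split_fasta(path: Path, out_dir: Path, max_records: int) -> List[Path]:
    """Write consecutive batches of at most max_records records to out_dir.

    Hypothetical helper: the output file naming is an assumption.
    """
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs: List[Path] = []
    out = None
    for i, (header, seq) in enumerate(read_fasta(path)):
        # Start a new part file every max_records records.
        if i % max_records == 0:
            if out is not None:
                out.close()
            part = out_dir / f"targets.part{len(outputs):04d}.fa"
            outputs.append(part)
            out = open(part, "w")
        out.write(f">{header}\n{seq}\n")
    if out is not None:
        out.close()
    return outputs
```

Each part file would then be passed as `--seqs-all` to its own `locityper recruit` run with its own `--output`. The downside is that the reads are scanned once per part, and I have not checked whether the outputs of separate runs can simply be combined afterwards. A sensible value for `max_records` is exactly the recommendation asked for above.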
Best,
Peter