how to parallelize tsv2exprofiledb? #686

Open
saro2-a opened this issue Feb 21, 2025 · 0 comments

saro2-a commented Feb 21, 2025

I'm observing that tsv2exprofiledb (which triggers mmseqs tsv2db) does not seem to use all of the available resources. CPU utilization is only 1%.

Machine specs:
100 GB+ RAM
10 GB/s disk read
1-2 GB/s disk write
96 cores

current CPU utilization: 1%

This makes the step very slow (many hours). Is it possible to parallelize it in any way?

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   1112   384 ?        Ss   10:04   0:00 /sbin/docker-init -- /app/entrypoint.sh
root          13  0.0  0.0   4496  3080 ?        S    10:04   0:00 /bin/bash /app/entrypoint.sh
root          23  0.0  0.0   4364  2712 ?        S    10:04   0:00 /bin/bash -ex /app/setup_database.sh /workspace/db
root          42  0.0  0.0   2892  1548 ?        S    10:58   0:00 /bin/sh -e /workspace/db/uniref30_2302_db.sh uniref30_2302 uniref30_2302_db
root          71  0.0  0.0   4628  3472 pts/0    Ss   11:16   0:00 /bin/bash
root          94 56.8  0.0 226220 39616 ?        R    11:31   0:26 mmseqs tsv2db uniref30_2302_h.tsv uniref30_2302_db_seq_h --output-dbtype 12 -v 3
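
As a rough sanity check on the numbers above, here is a minimal sketch of the arithmetic. It assumes the 1% figure is machine-wide utilization averaged over all 96 cores, and that the %CPU column in the ps output is relative to a single core. Whether tsv2db actually runs on one thread is an assumption here, not something confirmed from the MMseqs2 source. The function name is hypothetical.

```python
def machine_wide_percent(busy_cores: float, total_cores: int) -> float:
    """Percent of total CPU capacity used when `busy_cores` cores' worth of work is running."""
    return 100.0 * busy_cores / total_cores


TOTAL_CORES = 96  # from the machine specs above

# One fully busy core, expressed as a share of the whole machine.
one_core = machine_wide_percent(1.0, TOTAL_CORES)

# The tsv2db process at 56.8% of one core (ps output above), same conversion.
observed = machine_wide_percent(0.568, TOTAL_CORES)

print(f"one busy core: {one_core:.2f}% of the machine")
print(f"tsv2db at 56.8% of a core: {observed:.2f}% of the machine")
```

Under those assumptions, a machine-wide reading of about 1% is what a single busy process would look like on 96 cores, which would suggest the step is running serially rather than being starved of resources.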