IMG 5.3 update - databases #37

Open
Tracked by #249
aclum opened this issue Sep 16, 2024 · 12 comments · May be fixed by #44

@aclum
Collaborator

aclum commented Sep 16, 2024

Placeholder ticket for updating

  • KEGG db
  • KEGG product lookup tables
  • Pfam database
  • Pfam product lookup tables

Marcel to provide finalized info.

cc @kaijli

@ssarrafan

Hmm I'll assign this one to you for now @aclum

@aclum
Collaborator Author

aclum commented Oct 21, 2024

Natalia is still reviewing the KEGG results so will bump this a sprint.

@aclum
Collaborator Author

aclum commented Nov 20, 2024

Pfam will be version v37.0

@aclum
Collaborator Author

aclum commented Dec 2, 2024

We should be getting finalized databases, product lookup files, and updated LAST parameters from Marcel this sprint.

@aclum
Collaborator Author

aclum commented Dec 3, 2024

database updates-

  • Pfam version 37.0
    /global/dna/projectdirs/microbial/omics/databases/Pfam/Pfam-A/37.0

  • img-nr
    /global/dna/projectdirs/microbial/omics/databases/IMG-NR/20240916

  • product lookup files
    /global/cfs/cdirs/m3408/refdata/img/Product_Name_Mappings/20250123/pfam.tsv
    /global/cfs/cdirs/m3408/refdata/img/Product_Name_Mappings/20250123/kegg.tsv
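A minimal sketch of how the paths listed above could be sanity-checked before a run. The dictionary keys and the `missing_refdata` helper are hypothetical names chosen for illustration; only the paths themselves come from this comment.

```python
from pathlib import Path

# Paths copied from the list above. The keys are illustrative labels,
# not names used by the workflow.
IMG_5_3_REFDATA = {
    "pfam_db": "/global/dna/projectdirs/microbial/omics/databases/Pfam/Pfam-A/37.0",
    "img_nr_db": "/global/dna/projectdirs/microbial/omics/databases/IMG-NR/20240916",
    "pfam_lookup": "/global/cfs/cdirs/m3408/refdata/img/Product_Name_Mappings/20250123/pfam.tsv",
    "kegg_lookup": "/global/cfs/cdirs/m3408/refdata/img/Product_Name_Mappings/20250123/kegg.tsv",
}


def missing_refdata(paths=IMG_5_3_REFDATA):
    """Return the sorted labels whose path does not exist on this filesystem."""
    return sorted(name for name, p in paths.items() if not Path(p).exists())


if __name__ == "__main__":
    # Print one line per path that is not visible from the current host.
    for name in missing_refdata():
        print(f"missing: {name} -> {IMG_5_3_REFDATA[name]}")
```

A check like this only confirms that each path exists; it does not check read permissions or file contents.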

software updates-

  • LAST
  1. Update container to use LAST version 1584
  2. Add -m 180 to the lastal argument in the ko_ec task (line 198). This is separate from the -m argument for lastal_img_nr_ko_ec_gene_phylo_hit_selector.py
  3. Memory should be 256 GB to get half a Perlmutter node, based on Marcel's testing
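As a rough illustration of item 2, the sketch below builds a `lastal` argument list with `-m 180`. The function name, the `extra_args` parameter, and the example database and query names are assumptions for illustration; the real ko_ec task passes other options that are not shown in this ticket.

```python
def build_lastal_cmd(db_prefix, query_fasta, max_initial_matches=180, extra_args=()):
    """Build a lastal argument list with the -m value requested in this ticket.

    db_prefix and query_fasta are placeholders; lastal expects its options
    first, then the lastdb name, then the query file.
    """
    return [
        "lastal",
        "-m", str(max_initial_matches),  # separate from the hit selector script's own -m
        *extra_args,                     # any other options the task already uses
        db_prefix,
        query_fasta,
    ]


if __name__ == "__main__":
    # Hypothetical database and query names, printed as a shell-style command.
    print(" ".join(build_lastal_cmd("img_nr", "proteins.faa")))
```

Keeping the value as a named parameter makes it harder to confuse with the separate `-m` argument of lastal_img_nr_ko_ec_gene_phylo_hit_selector.py.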

@aclum
Collaborator Author

aclum commented Jan 8, 2025

Hit permission issues copying over md5Hash2Data.tsv and taxonOId2Taxonomy.tsv.

@aclum
Collaborator Author

aclum commented Jan 9, 2025

Permissions have been resolved, the databases have been copied over. @kaijli this is ready for work.

@aclum aclum assigned kaijli and unassigned aclum Jan 9, 2025
@kaijli
Contributor

kaijli commented Jan 9, 2025

Attaching to the genomad branch because I just need to work out file structures before it's complete, and it's all part of the 5.3 update.

@aclum
Collaborator Author

aclum commented Jan 24, 2025

Marcel tested the new product lookup table and there was 1 ID missing, so Natalia needs to provide an updated table. Actively in progress, moving to the next sprint.

@aclum
Collaborator Author

aclum commented Feb 4, 2025

@kaijli final product lookup files are above.

@kaijli
Contributor

kaijli commented Feb 11, 2025

Status update: genomad is working, a new image with the LAST update has been created, running tests on the full workflow, some snags with ko_ec task testing on an interactive node

@kaijli
Contributor

kaijli commented Feb 12, 2025

successful run at jaws id 98507

@kaijli kaijli linked a pull request Feb 12, 2025 that will close this issue
@kaijli kaijli moved this from In Progress to In Review in 2025 - Sprint 56 - Feb 10-21,2025 Feb 12, 2025