
[Question]: Creating a community model for 6-7 species using pre-binned MAGs obtained from elsewhere #169

shikidj16 opened this issue Sep 27, 2024 · 1 comment

@shikidj16

Dear Francisco

I'm working on a project where I wish to use MAGs to create a community GEM (6 species). The MAGs will be obtained from somewhere else, and I want to use that data to create the model using meta-GEM.

  1. Under the FAQs, it says the pipeline is flexible enough to accommodate pre-binned MAGs. Could you describe in more detail how I do this? I can't find documentation for this case.

  2. Since it's only 6 species, can I create the model on my PC?

Thanks
Charan

@shikidj16 shikidj16 added the question Further information is requested label Sep 27, 2024
@franciscozorrilla
Owner

franciscozorrilla commented Sep 28, 2024

Hi Charan,

> Since it's only 6 species, can I create the model on my PC?

Yes, model reconstruction should only take a few minutes per MAG and requires far fewer computational resources than the assembly-based analysis (green block in the figure below).

> Under the FAQs, it says the pipeline is flexible enough to accommodate pre-binned MAGs. Could you describe in more detail how I do this? I can't find documentation for this case.

Please have a look at the workflow diagram:

[metaGEM workflow diagram]

As you can see, you will need to run CarveMe to generate an individual GEM for each MAG, and then use SMETANA to predict nutritional dependencies within your community. Since you already have your MAGs and it's only 6 species in a single community, setting up and running the full metaGEM pipeline might be overkill at the moment. For comparison, in the metaGEM publication we carry out an analysis of ~137 samples including ~4000 GEMs.
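For a community this small you could even run the two tools by hand instead of through Snakemake. Here is a minimal sketch, assuming CarveMe and SMETANA are installed in your active environment and that your ORF-annotated protein fastas sit in a hypothetical `protein_bins/` directory (the flags mirror the carveme and smetana Snakefile rules):

```shell
# Carve one genome-scale model (GEM) per MAG from its protein fasta
mkdir -p GEMs
for faa in protein_bins/*.faa; do
    bin=$(basename "$faa" .faa)
    carve -v --fbc2 -o "GEMs/${bin}.xml" "$faa"
done

# Predict metabolic interactions across the 6-member community
smetana -o community --flavor fbc2 --detailed GEMs/*.xml
```

If you want gap filling or simulation on a defined medium, add the `-g`/`--mediadb` options to `carve` and the `-m`/`--mediadb` options to `smetana`, as the Snakefile rules do.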

Please have a look at the tutorials section for more details and examples of metabolic model reconstruction and simulation:

metaGEM can be used to explore your own gut microbiome sequencing data from at-home test-kit services such as unseen bio. The following tutorial showcases the metaGEM workflow on two unseenbio samples.

Tutorial

For an introductory metabolic modeling tutorial, refer to the resources compiled for the EMBOMicroCom: Metabolite and species dynamics in microbial communities workshop in 2022.

Tutorial3

For a more advanced tutorial, check out the resources we put together for the SymbNET: from metagenomics to metabolic interactions course in 2022.

Tutorial2

I also suggest having a look at the corresponding Snakefile rules and individual tool documentation:

  • CarveMe

    metaGEM/workflow/Snakefile

    Lines 1275 to 1317 in 8609ad6

    ```python
    rule carveme:
        input:
            bin = f'{config["path"]["root"]}/{config["folder"]["proteinBins"]}/{{binIDs}}.faa',
            media = f'{config["path"]["root"]}/{config["folder"]["scripts"]}/{config["scripts"]["carveme"]}'
        output:
            f'{config["path"]["root"]}/{config["folder"]["GEMs"]}/{{binIDs}}.xml'
        benchmark:
            f'{config["path"]["root"]}/{config["folder"]["benchmarks"]}/{{binIDs}}.carveme.benchmark.txt'
        message:
            """
            Make sure that the input files are ORF annotated and preferably protein fasta.
            If given raw fasta files, Carveme will run without errors but each contig will be treated as one gene.
            """
        shell:
            """
            # Activate metagem environment
            set +u;source activate {config[envs][metagem]};set -u;
            # Make sure output folder exists
            mkdir -p $(dirname {output})
            # Make job specific scratch dir
            binID=$(echo $(basename {input})|sed 's/.faa//g')
            echo -e "\nCreating temporary directory {config[path][scratch]}/{config[folder][GEMs]}/${{binID}} ... "
            mkdir -p {config[path][scratch]}/{config[folder][GEMs]}/${{binID}}
            # Move into tmp dir
            cd {config[path][scratch]}/{config[folder][GEMs]}/${{binID}}
            # Copy files
            cp {input.bin} {input.media} .
            echo "Begin carving GEM ... "
            carve -g {config[params][carveMedia]} \
                -v \
                --mediadb $(basename {input.media}) \
                --fbc2 \
                -o $(echo $(basename {input.bin}) | sed 's/.faa/.xml/g') $(basename {input.bin})
            echo "Done carving GEM. "
            [ -f *.xml ] && mv *.xml $(dirname {output})
            """
    ```
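Note the warning in the rule's message block: CarveMe expects ORF-annotated protein fastas. If the pre-binned MAGs you obtain are nucleotide fastas, you would first need to predict and translate ORFs. A minimal sketch, assuming prodigal is available and using hypothetical directory names:

```shell
# Predict ORFs and write protein translations (.faa) for each MAG,
# so that CarveMe sees genes rather than whole contigs
mkdir -p protein_bins
for fa in nucleotide_bins/*.fa; do
    bin=$(basename "$fa" .fa)
    prodigal -i "$fa" -a "protein_bins/${bin}.faa"
done
```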

  • SMETANA

    metaGEM/workflow/Snakefile

    Lines 1422 to 1457 in 8609ad6

    ```python
    rule smetana:
        input:
            f'{config["path"]["root"]}/{config["folder"]["GEMs"]}/{{IDs}}'
        output:
            f'{config["path"]["root"]}/{config["folder"]["SMETANA"]}/{{IDs}}_detailed.tsv'
        benchmark:
            f'{config["path"]["root"]}/{config["folder"]["benchmarks"]}/{{IDs}}.smetana.benchmark.txt'
        shell:
            """
            # Activate metagem env
            set +u;source activate {config[envs][metagem]};set -u
            # Make sure output folder exists
            mkdir -p $(dirname {output})
            # Make job specific scratch dir
            sampleID=$(echo $(basename {input}))
            echo -e "\nCreating temporary directory {config[path][scratch]}/{config[folder][SMETANA]}/${{sampleID}} ... "
            mkdir -p {config[path][scratch]}/{config[folder][SMETANA]}/${{sampleID}}
            # Move to tmp dir
            cd {config[path][scratch]}/{config[folder][SMETANA]}/${{sampleID}}
            # Copy media db and GEMs
            cp {config[path][root]}/{config[folder][scripts]}/{config[scripts][carveme]} {input}/*.xml .
            # Run SMETANA
            smetana -o $(basename {input}) --flavor fbc2 \
                --mediadb media_db.tsv -m {config[params][smetanaMedia]} \
                --detailed \
                --solver {config[params][smetanaSolver]} -v *.xml
            # Copy results to output folder
            cp *.tsv $(dirname {output})
            """
    ```
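Once the rule finishes you can quickly rank the strongest predicted cross-feeding interactions from the detailed TSV. A sketch assuming the SMETANA score sits in column 9 of a hypothetical `community_detailed.tsv`; check the header of your own output first, since the column layout may differ between versions:

```shell
# Show interactions sorted by SMETANA score (highest first),
# assuming the score is in column 9 of the detailed output
(head -n 1 community_detailed.tsv; tail -n +2 community_detailed.tsv | sort -t$'\t' -k9,9gr) | column -t
```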

Best wishes,
Francisco
