mgnifams-site

Dev setup

pip install -r requirements.txt
python manage.py migrate
python manage.py collectstatic
python manage.py runserver 8000

Demo deployment (Kubernetes)

A basic Kubernetes configuration is provided for deploying the site to EBI's Web Production K8s clusters.

Deployment requires a quay.io "pull secret" (as a K8s secret YAML) and a K8s cluster admin configuration.

With those in place:

docker build -f Dockerfile -t quay.io/microbiome-informatics/mgnifams_site:ebi-wp-k8s-hl --load .
docker push quay.io/microbiome-informatics/mgnifams_site:ebi-wp-k8s-hl
kubectl apply -f deployment/ebi-wp-k8s-hl.yaml

bin scripts to produce data

extract_rf.py

This script was only used to produce the rf files of earlier versions, which link each HMM to its MSA. That step is now part of the family generation pipeline.

Biome distribution

python3 bin/get_biome_distribution.py bin/db_config.ini data/families/updated_refined_families.tsv tmp/ data/biome_sunburst/tmp/

Query the PostgreSQL proteins database for biome data related to the sequences in each family

Arguments:

config_file: Path to the configuration file for the database secrets

edge_list_file: Path to the edge list file with two columns (family-sequence)

tmp_dir: Path to the tmp directory (intermediate)

result_dir: Path to the result directory (intermediate)
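The edge list described above can be loaded into a family-to-sequences mapping with a few lines of Python. This is only a minimal sketch, not the code used by the bin scripts: it assumes the file is tab-separated (suggested by the .tsv extension), has no header row, and puts the family ID in the first column and the sequence ID in the second. The function name read_edge_list is hypothetical.

```python
import csv
from collections import defaultdict


def read_edge_list(path):
    """Map each family ID to the list of sequence IDs assigned to it.

    Assumption: tab-separated, no header, columns are (family, sequence).
    """
    families = defaultdict(list)
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            family_id, sequence_id = row[0], row[1]
            families[family_id].append(sequence_id)
    return dict(families)
```

For example, read_edge_list("data/families/updated_refined_families.tsv") would return a dictionary keyed by family ID, provided the file follows the assumed layout.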

python3 bin/parse_biome_sunburst.py bin/db_config.ini data/biome_sunburst/tmp/ data/biome_sunburst/result/

Query the PostgreSQL proteins database for biome names and parse into the final sunburst format

config_file: Path to the configuration file for the database secrets

counts_dir: Path to the folder with biomes ids and counts per family

out_dir: Path to the results directory (final)
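The layout of bin/db_config.ini is not documented here. As a rough illustration of how an INI file holding database secrets might be read, the sketch below uses Python's standard configparser module. The section name postgresql and the helper name load_db_settings are assumptions made for this example, not the names the scripts actually use.

```python
import configparser


def load_db_settings(config_file, section="postgresql"):
    """Return the key/value pairs of one INI section as a dict.

    Assumption: the section name "postgresql" is hypothetical; adjust it
    to match the real config file.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read returns the list of files it managed to parse.
    if not parser.read(config_file):
        raise FileNotFoundError(f"Could not read config file: {config_file}")
    if not parser.has_section(section):
        raise KeyError(f"Section [{section}] not found in {config_file}")
    return dict(parser[section])
```

The returned dictionary could then be passed to whichever PostgreSQL client the scripts use. Note that configparser lowercases option names by default.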

Domain architecture

python3 bin/get_pfams.py bin/db_config.ini data/families/updated_refined_families.tsv data/pfams/tmp

Query the PostgreSQL proteins database for each family to get the Pfam domains of all its underlying sequences

config_file: Path to the configuration file for the database secrets

edge_list_file: Path to the edge list file with two columns (family-sequence)

result_dir: Path to the result directory (intermediate)

python3 bin/parse_pfams.py data/families/updated_refined_families.tsv data/pfams/tmp/ data/pfams/result/

Parse pfam ids into domain architecture format

edge_list_file: Path to the edge list file with two columns (family-sequence)

read_dir: Path to the folder with pfam ids per family and sequence

out_dir: Path to the results directory (intermediate)

python3 bin/translate_pfams.py bin/db_config.ini data/pfams/result/ data/pfams/translated/

Query a PostgreSQL database and translate pfam ids into clickable names

config_file: Path to the configuration file for the database secrets

read_dir: Path to the folder with the domain architecture json files

out_dir: Path to the translated results directory (final)
