
Nextflow ProteinFunction: support for SQLite db #1152

Open
wants to merge 6 commits into
base: postreleasefix/115

nakib103
Contributor

ENSVAR-6567

Extends the ProteinFunction Nextflow pipeline so that it can write results to a SQLite db. This PR introduces the following params:

params.offline - if set, no connection to an Ensembl database is made, so no data is stored in the MySQL db
params.sqlite - if set, a SQLite db is created with the results (if params.offline is set, params.sqlite is set automatically, as otherwise there would be no output)
params.sqlite_dir - directory where the SQLite db should be stored. Defaults to params.outdir. The db name is set to ${params.species}_PolyPhen_SIFT.db
params.sqlite_db - full path of the SQLite db, including the file name. An alternative way to specify the db name.
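
The way these params interact can be sketched as below. This is an illustrative Python model, not the pipeline's Nextflow code: the function name `resolve_sqlite_settings` is hypothetical, and the rule that params.sqlite_db takes precedence over params.sqlite_dir is an assumption, since the description does not state which one wins when both are given.

```python
from pathlib import Path


def resolve_sqlite_settings(params):
    """Return (use_sqlite, db_path) for a dict of pipeline-style params.

    Hypothetical helper that mirrors the behaviour described in the PR text.
    """
    offline = bool(params.get("offline"))
    # An offline run has no MySQL target, so SQLite output is forced on.
    use_sqlite = bool(params.get("sqlite")) or offline
    if not use_sqlite:
        return False, None

    # Assumption: an explicit full path wins over the directory setting.
    if params.get("sqlite_db"):
        return True, Path(params["sqlite_db"])

    # sqlite_dir falls back to outdir; the file name is derived from species.
    db_dir = Path(params.get("sqlite_dir") or params["outdir"])
    return True, db_dir / f"{params['species']}_PolyPhen_SIFT.db"
```

Calling it with `{"offline": 1, "outdir": "temp", "species": "felis_catus"}` yields a path ending in `felis_catus_PolyPhen_SIFT.db` under `temp`, matching the db name used in the test below.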

Test:

nextflow run $ENSEMBL_ROOT_DIR/ensembl-variation/nextflow/ProteinFunction \
-profile $profile \
--sift_run_type FULL \
--outdir $PWD/temp \
--species felis_catus \
--offline 1 \
--gtf <gtf file> \
--fasta <fasta file>

Compare the generated SQLite db ($PWD/temp/felis_catus_PolyPhen_SIFT.db) against the data in the Ensembl database.
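
As a first step in that comparison, it can help to see which tables the generated file contains and how many rows each holds. The sketch below uses only Python's standard `sqlite3` module and makes no assumption about the schema the pipeline writes; the function name `table_row_counts` is hypothetical.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Map each user table in a SQLite file to its row count."""
    # sqlite3.connect() would silently create a missing file, so check first.
    if not Path(db_path).is_file():
        raise FileNotFoundError(db_path)

    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        counts = {}
        for (name,) in rows:
            # Skip SQLite's internal bookkeeping tables.
            if name.startswith("sqlite_"):
                continue
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Running `table_row_counts("temp/felis_catus_PolyPhen_SIFT.db")` after the test command returns a dict of table names to row counts, which can then be checked against the corresponding counts in the Ensembl MySQL database.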
