From 5167ea5388a9ab7929cbf622729d0c20052b513b Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:20:48 +0100 Subject: [PATCH 1/6] fix minor typos and link in documentation --- docs/BRC4_genome_loader.md | 4 ++-- docs/cicd_gitlab.md | 16 ++++++++-------- docs/trf_split_run.md | 6 +++--- 3 files changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/BRC4_genome_loader.md b/docs/BRC4_genome_loader.md index bd1cef9e8..2c0b51355 100644 --- a/docs/BRC4_genome_loader.md +++ b/docs/BRC4_genome_loader.md @@ -1,9 +1,9 @@ ## BRC4_genome_loader -### Module: [Bio::EnsEMBL::Pipeline::PipeConfig::BRC4_genome_loader_conf](../src/perl/Bio/EnsEMBL/Pipeline/PipeConfig/BRC4_genome_loader_conf.pm) +### Module: [Bio::EnsEMBL::Pipeline::PipeConfig::BRC4_genome_loader_conf](https://github.com/Ensembl/ensembl-genomio/blob/main/src/perl/Bio/EnsEMBL/Pipeline/PipeConfig/BRC4_genome_loader_conf.pm) Creates an [Ensembl core database](http://www.ensembl.org/info/docs/api/core/index.html) from a set of flat files -or adds ad-hoc (ie organellas) sequences to the existing core. +or adds ad-hoc (i.e. organellas) sequences to the existing core. ### Prerequisites A registry file with the locations of the core database server(s) and the production database (or `-production_db $PROD_DB_URL` specified). diff --git a/docs/cicd_gitlab.md b/docs/cicd_gitlab.md index c3a582f98..23564bdb9 100644 --- a/docs/cicd_gitlab.md +++ b/docs/cicd_gitlab.md @@ -1,5 +1,5 @@ ## [Gitlab CI/CD](https://docs.gitlab.com/ee/ci/) pipelines -### Location [gitlab/cicd](../gitlab/cicd) +### Location [cicd/gitlab](https://github.com/Ensembl/ensembl-genomio/blob/main/cicd/gitlab) Some Gitlab based CI/CD helper pipelines @@ -14,17 +14,17 @@ See this [Gitlab documentation](https://docs.gitlab.com/ee/user/project/import/g 2) Get repo 3) Create (if you need) cicd/gitlab folders, add configs. 
-We use [cicd/gitlab/dot.gitlab-ci.yml](../cicd/gitlab/dot.gitlab-ci.yml) for this repo instead the default one (see below). +We use [cicd/gitlab/dot.gitlab-ci.yml](https://github.com/Ensembl/ensembl-genomio/blob/main/cicd/gitlab/dot.gitlab-ci.yml) for this repo instead the default one (see below). 4) Edit settings for CI/CD [General pipelines] expand -set "CI/CD configuration file" to [cicd/gitlab/dot.gitlab-ci.yml](../cicd/gitlab/dot.gitlab-ci.yml) -click [Save changes] at the end of this small section +Set "CI/CD configuration file" to [cicd/gitlab/dot.gitlab-ci.yml](https://github.com/Ensembl/ensembl-genomio/blob/main/cicd/gitlab/dot.gitlab-ci.yml) +Click [Save changes] at the end of this small section -5) enable runners -General pipelines] expand +5) Enable runners +[General pipelines] expand [Runners] expand -pick "Shared runners" +Pick "Shared runners" 6) Customize email settings Like it's stated on the [official documentation](https://docs.gitlab.com/ee/user/project/integrations/pipeline_status_emails.html): @@ -38,7 +38,7 @@ Like it's stated on the [official documentation](https://docs.gitlab.com/ee/user ## A few notes on style * We suggest separating logic for running various parts into separate pipelines and using different `trigger` jobs to invoke these pipelines. -As for now we have [cicd/gitlab/parts/](../cicd/gitlab/parts/) folder for these needs. +As for now we have [cicd/gitlab/parts/](https://github.com/Ensembl/ensembl-genomio/blob/main/cicd/gitlab/parts/) folder for these needs. * Feel free to use external pipelines (as `trigger` jobs) or other parts from [GitLab templates](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/ci/templates). 
Though for pipelines, please, add ``` diff --git a/docs/trf_split_run.md b/docs/trf_split_run.md index 48224304f..962b1aefe 100644 --- a/docs/trf_split_run.md +++ b/docs/trf_split_run.md @@ -1,7 +1,7 @@ -# [trf_split_run.bash](scripts/trf_split_run.bash) +# [trf_split_run.bash](https://github.com/Ensembl/ensembl-genomio/blob/main/scripts/trf_split_run.bash) A trf wrapper with chunking support to be used with -[ensembl-production-imported DNAFeatures pipeline](https://github.com/Ensembl/ensembl-production-imported/tree/main/src/perl/Bio/EnsEMBL/EGPipeline/PipeConfig/DNAFeatures_conf.pm) (see [docs](docs/trf_split_run.md)) +[ensembl-production-imported DNAFeatures pipeline](https://github.com/Ensembl/ensembl-production-imported/tree/main/src/perl/Bio/EnsEMBL/EGPipeline/PipeConfig/DNAFeatures_conf.pm) Compatible compatible with input/output format of trf invocation from [Bio::EnsEMBL::Analysis::Runnable::TRF](https://github.com/Ensembl/ensembl-analysis/blob/main/modules/Bio/EnsEMBL/Analysis/Runnable/TRF.pm). And can be used as a hack to allow TRF stage to be accomplished at the cost of splitting long repeat into several adjacent ones (with possible losses). 
@@ -14,7 +14,7 @@ python -c 'from Bio import SeqIO' || echo "no biopython" >> /dev/stderr ``` ## Options -Use environment variable to control scipt run +Use environment variable to control script run * `DNA_FEATURES_TRF_SPLIT_NO_SPLITTING` -- set to `YES` to skip splitting stage * `DNA_FEATURES_TRF_SPLIT_NO_TRF` -- set to `YES` to skip trf stage * `DNA_FEATURES_TRF_SPLIT_SPLITTER_CHUNK_SIZE` -- chunk size [`1_000_000`] From ae421a0f4778fdd26de991815e32babe0adcc3e2 Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:21:28 +0100 Subject: [PATCH 2/6] update (simplify) mkdocs script --- docs/scripts/gen_ref_pages.py | 45 ++++++++++------------------------- 1 file changed, 13 insertions(+), 32 deletions(-) diff --git a/docs/scripts/gen_ref_pages.py b/docs/scripts/gen_ref_pages.py index c5aab2319..ed25450c5 100644 --- a/docs/scripts/gen_ref_pages.py +++ b/docs/scripts/gen_ref_pages.py @@ -22,47 +22,28 @@ nav = mkdocs_gen_files.Nav() -root = Path("src/python/ensembl/brc4") -for py_path in sorted(root.rglob("*.py")): - # Drop "src/python" from the path components - parts = py_path.parts[2:] - - if parts[-1] == "__main__.py": - continue - - doc_path = py_path.relative_to(root).with_suffix(".md") +root = Path("src/python").resolve() +ensembl_path = root / "ensembl" +for py_path in sorted(ensembl_path.rglob("*.py")): + # Get the relative module path and corresponding documentation paths + module_path = py_path.relative_to(root) + doc_path = module_path.with_suffix(".md") full_doc_path = Path("reference", doc_path) - - if parts[-1] == "__init__.py": + # Get all the parts of the module path without the ".py" extension + parts = tuple(module_path.with_suffix("").parts) + # Drop "__init__" file from the path components as well (if present) + if parts[-1] == "__init__": parts = parts[:-1] doc_path = doc_path.with_name("index.md") full_doc_path = full_doc_path.with_name("index.md") - - nav[parts] = doc_path.as_posix() - - with 
mkdocs_gen_files.open(full_doc_path, "w") as fd: - identifier = ".".join(parts).replace(".py", "") - fd.write(f"::: {identifier}\n") - - mkdocs_gen_files.set_edit_path(full_doc_path, Path("../") / py_path) - -root = Path("src/python/ensembl/io") -num_parents = len(root.parents) - 2 -for init_path in sorted(root.rglob("__init__.py")): - # Get the relative module path - module_path = init_path.relative_to(root).parent - doc_path = module_path.with_suffix(".md") - full_doc_path = Path("reference", doc_path) - # Drop all the parents of the namespace and "__init__.py" file from the path components - parts = init_path.parts[num_parents:-1] # Add markdown file path with its index tree nav[parts] = doc_path.as_posix() - # Populate the markdown file with the doc stub of this Python module - with mkdocs_gen_files.open(full_doc_path, "a") as fd: + # Populate the markdown file with the doc stub of this module + with mkdocs_gen_files.open(full_doc_path, "w") as fd: identifier = ".".join(parts) fd.write(f"::: {identifier}\n") # Correct the path - mkdocs_gen_files.set_edit_path(full_doc_path, Path("../") / init_path) + mkdocs_gen_files.set_edit_path(full_doc_path, module_path) with mkdocs_gen_files.open("reference/summary.md", "w") as nav_file: nav_file.writelines(nav.build_literate_nav()) From 6b9cc7073a4c1f4cd564acceb6996facf5a36551 Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:21:47 +0100 Subject: [PATCH 3/6] revert some style changes to previous config --- mkdocs.yml | 3 --- 1 file changed, 3 deletions(-) diff --git a/mkdocs.yml b/mkdocs.yml index a6af7919a..85c75736f 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -33,9 +33,6 @@ theme: code: IBM Plex Mono features: - content.tooltips - - navigation.expand - - navigation.tabs - - navigation.tabs.sticky - navigation.top - search.highlight - search.suggest From 5767f4d66bcceddad6ebb0b045d88c7d8d1d62d9 Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:22:16 +0100 Subject: 
[PATCH 4/6] update minimum version for ensembl-utils to v0.2.0 --- pyproject.toml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pyproject.toml b/pyproject.toml index 0e6a9c953..4023b3d58 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -53,7 +53,7 @@ dependencies = [ "biopython == 1.81", "ensembl-hive @ git+https://github.com/Ensembl/ensembl-hive.git", "ensembl-py @ git+https://github.com/Ensembl/ensembl-py.git", # minimum v2.0.0 - "ensembl-utils >= 0.1.3", + "ensembl-utils >= 0.2.0", "jsonschema >= 4.6.0", "intervaltree >= 3.1.0", "mysql-connector-python >= 8.0.29", From b61e01c7e2ced6a8ed4f13057ee1985a01418f19 Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:22:42 +0100 Subject: [PATCH 5/6] add missing __init__.py file for data folder --- src/python/ensembl/io/genomio/data/__init__.py | 15 +++++++++++++++ 1 file changed, 15 insertions(+) create mode 100644 src/python/ensembl/io/genomio/data/__init__.py diff --git a/src/python/ensembl/io/genomio/data/__init__.py b/src/python/ensembl/io/genomio/data/__init__.py new file mode 100644 index 000000000..55b0824c8 --- /dev/null +++ b/src/python/ensembl/io/genomio/data/__init__.py @@ -0,0 +1,15 @@ +# See the NOTICE file distributed with this work for additional information +# regarding copyright ownership. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+"""Data files.""" \ No newline at end of file From a27dbd517a14b46ebb779745001a2e5050dd0028 Mon Sep 17 00:00:00 2001 From: Jorge Alvarez Jarreta Date: Tue, 4 Jun 2024 14:34:35 +0100 Subject: [PATCH 6/6] make pylint happy --- src/python/ensembl/io/genomio/data/__init__.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/python/ensembl/io/genomio/data/__init__.py b/src/python/ensembl/io/genomio/data/__init__.py index 55b0824c8..866de855b 100644 --- a/src/python/ensembl/io/genomio/data/__init__.py +++ b/src/python/ensembl/io/genomio/data/__init__.py @@ -12,4 +12,4 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""Data files.""" \ No newline at end of file +"""Data files."""
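
Note on PATCH 2/6: the rewritten loop in `docs/scripts/gen_ref_pages.py` derives the navigation entry, the markdown path and the `:::` identifier for each file from its path relative to `src/python`. The sketch below reproduces only that path arithmetic with `pathlib`, so it can be checked without `mkdocs_gen_files`. The helper name `doc_entry` and the example module `utils/json_utils.py` are hypothetical and used for illustration only; neither is part of the patch.

```python
from pathlib import PurePosixPath


def doc_entry(py_path: PurePosixPath, root: PurePosixPath):
    """Return (nav parts, doc path, identifier) for one Python file.

    Hypothetical helper mirroring the path handling in the patched loop;
    it does not write any files or touch mkdocs_gen_files.
    """
    # Path of the module relative to the source root, e.g. ensembl/io/...
    module_path = py_path.relative_to(root)
    doc_path = module_path.with_suffix(".md")
    # All path components without the ".py" extension
    parts = tuple(module_path.with_suffix("").parts)
    # A package's __init__.py is documented as the package's index page
    if parts[-1] == "__init__":
        parts = parts[:-1]
        doc_path = doc_path.with_name("index.md")
    return parts, doc_path.as_posix(), ".".join(parts)


root = PurePosixPath("src/python")
# File added in PATCH 5/6
init_entry = doc_entry(root / "ensembl/io/genomio/data/__init__.py", root)
# Hypothetical regular module, for comparison
module_entry = doc_entry(root / "ensembl/io/genomio/utils/json_utils.py", root)
print(init_entry)
print(module_entry)
```

Under these assumptions, a package `__init__.py` maps to an `index.md` page named after the package, which is presumably why PATCH 5/6 adds the missing `__init__.py` to the `data` folder: the new loop globs `*.py` files, so a folder containing none produces no entry.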