Fix bugs in documentation generation workflow #21
Description
This fixes four issues with the documentation generation:
- `backfill.py` tries to push to `ghpages` and not `gh-pages` 🤦
- Headings with inline formatting get anchors like `please-see-list-of-contributorshttpsgithubcomuclahs-cdspipeline-call-mtsnvgraphscontributors-at-github` instead of `please-see-list-of-contributors-at-github`
The first problem was a one-character fix.
I walked right into the second problem (#22) by assuming that, because headings get anchor links, they of course must always be unique. Obviously untrue, and GitHub handles that by appending numbers to repeated headings (see this README for an example). The generated docs will now behave the same way, with anchor links `repeat`, `repeat-1`, `repeat-2`, etc.

For the last two problems, I added a `strip_markdown` function that uses some custom rules for a markdown-it renderer to strip away all inline formatting. That function is used for both the anchor links and the page titles. That ensures that the links work as expected and the table of contents looks correct, while the formatting persists on the actual pages.

Finally, anchor links that are actually incorrect are just left broken. Instead of being rewritten to point to the correct page, like http://localhost:8000/latest/pipeline-steps/#3-Call-mtSNV-with-mitoCaller, they are left as bad anchors on the root page, like http://localhost:8000/latest/#2-Align-mt-Reads-with-MToolBox. A warning annotation will show up on the workflow run's summary, but nobody will notice unless they specifically go looking for it.
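
The anchor behaviour described above (strip inline formatting, then number repeated headings the way GitHub does) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `strip_inline_markdown` is a regex stand-in for the markdown-it based `strip_markdown`, `slugify` and `unique_anchors` are hypothetical helper names, and the example heading text is an assumption reconstructed from the broken anchor quoted earlier.

```python
import re
from collections import Counter


def strip_inline_markdown(text: str) -> str:
    """Regex stand-in (assumption) for the PR's markdown-it based strip_markdown."""
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)  # links/images -> label only
    text = re.sub(r"`([^`]*)`", r"\1", text)                # inline code -> its contents
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)         # **bold** / *italic* -> text
    return text


def slugify(text: str) -> str:
    """Simplified GitHub-style slug: lowercase, drop punctuation, spaces to hyphens."""
    text = re.sub(r"[^\w\- ]", "", text.strip().lower())
    return text.replace(" ", "-")


def unique_anchors(headings: list[str]) -> list[str]:
    """Return one anchor per heading, numbering repeats as base, base-1, base-2, ..."""
    seen: Counter[str] = Counter()
    anchors = []
    for heading in headings:
        base = slugify(strip_inline_markdown(heading))
        count = seen[base]
        anchors.append(base if count == 0 else f"{base}-{count}")
        seen[base] += 1
    return anchors


# Hypothetical heading, reconstructed from the bad anchor in the description.
heading = (
    "Please see [list of contributors]"
    "(https://github.com/uclahs-cds/pipeline-call-mtSNV/graphs/contributors) at GitHub"
)
print(slugify(heading))                 # slug of the raw Markdown (the bug)
print(unique_anchors([heading]))        # ['please-see-list-of-contributors-at-github']
print(unique_anchors(["Repeat", "Repeat", "Repeat"]))  # ['repeat', 'repeat-1', 'repeat-2']
```

The sketch does not handle a literal heading such as `Repeat-1` colliding with a generated suffix, nor underscore emphasis; a real renderer-based approach would not need the regexes at all.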
Closes #19, closes #22
Checklist
This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem[^1]. Even a small leak can be costly[^2].
This PR does NOT contain germline genetic data[^3], RNA-Seq, DNA methylation, microbiome or other molecular data[^4].
This PR does NOT contain images (`.png`, `.jpeg`), `.pdf`, `.RData`, `.xlsx`, `.doc`, `.ppt`, or other output files.
To automatically exclude such files using a `.gitignore` file, see here for example.
I have read the code review guidelines and the code review best practice on GitHub check-list.
I have set up or verified the `main` branch protection rule following the GitHub standards before opening this pull request.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have added the major changes included in this pull request to the `CHANGELOG.md` under the next release version or unreleased, and updated the date.
Footnotes
[^1]: UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records.
[^2]: The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.
[^3]: Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs.
[^4]: RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.