Fix bugs in documentation generation workflow #21

nwiltsie · 2024-03-07T00:11:43Z

Description

This fixes four issues with the documentation generation:

1. backfill.py tries to push to ghpages instead of gh-pages 🤦
2. Repeated headings trip an assertion failure
3. Bad anchor links cause the process to crash
   - Sub-issue #19 (Handle headings with embedded links): the parsed anchor for a heading with an embedded link is something like please-see-list-of-contributorshttpsgithubcomuclahs-cdspipeline-call-mtsnvgraphscontributors-at-github instead of please-see-list-of-contributors-at-github
   - Older versions of call-mtSNV just have plain ol' broken links in them, like https://github.com/uclahs-cds/pipeline-call-mtSNV/tree/2.0.0?tab=readme-ov-file#2-Align-mt-Reads-with-MToolBox
4. Headings with Markdown formatting (links, emphasis, whatever) have that formatting show up in the table of contents.

The first problem was a one-character fix.

I walked right into the second problem (#22) by assuming that, because headings get anchor links, they of course must always be unique. Obviously untrue, and GitHub handles that by appending numbers to repeated headings (see this README for an example). The generated docs will now behave the same way, with anchor links repeat, repeat-1, repeat-2, etc.
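As a rough illustration of the numbering scheme described above, here is a minimal sketch. The names `slugify` and `unique_anchors` are hypothetical and are not taken from this PR, and the slug rule is only an approximation of GitHub's behavior.

```python
import re
from collections import Counter


def slugify(heading: str) -> str:
    """Approximate a GitHub-style anchor (assumption: not GitHub's exact rule)."""
    slug = heading.strip().lower()
    # Drop everything except word characters, hyphens, and spaces.
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


def unique_anchors(headings):
    """Return one anchor per heading, suffixing repeats with -1, -2, ..."""
    seen = Counter()
    anchors = []
    for heading in headings:
        base = slugify(heading)
        count = seen[base]
        # The first occurrence keeps the bare slug; later ones get a counter.
        anchors.append(base if count == 0 else f"{base}-{count}")
        seen[base] += 1
    return anchors
```

For example, `unique_anchors(["Repeat", "Repeat", "Repeat"])` yields `repeat`, `repeat-1`, `repeat-2`. This sketch does not guard against a literal heading such as "Repeat 1" colliding with a generated suffix.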

For the last two problems, I added a strip_markdown function that uses some custom rules for a markdown-it renderer to strip away all inline formatting. That function is used for both the anchor links and the page titles, which ensures that the links work as expected and the table of contents looks correct, while the formatting persists on the actual pages.
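To show the intended input and output of that stripping step, here is a simplified stand-in. It deliberately uses regular expressions from the standard library rather than the markdown-it renderer rules used in this PR, and the name `strip_inline_markdown` is hypothetical. It handles only links, images, code spans, and simple emphasis.

```python
import re


def strip_inline_markdown(text: str) -> str:
    """Remove common inline Markdown, keeping only the visible text.

    Regex-based approximation; a real Markdown parser handles nesting
    and escapes far more reliably.
    """
    # Links and images: keep the bracketed text, drop the target.
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Code spans: keep the contents.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Strong, then emphasis, written with asterisks.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    # Underscore emphasis, only at word boundaries so snake_case survives.
    text = re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"\1", text)
    return text
```

Applied to the heading from #19, `Please see list of [contributors](https://github.com/uclahs-cds/pipeline-call-mtSNV/graphs/contributors) at GitHub` becomes `Please see list of contributors at GitHub`, which then produces the expected anchor.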

Finally, anchor links that are genuinely incorrect are just left broken. That means that instead of being properly re-written to point to the correct page, like http://localhost:8000/latest/pipeline-steps/#3-Call-mtSNV-with-mitoCaller, they are left as bad anchors on the root page, like http://localhost:8000/latest/#2-Align-mt-Reads-with-MToolBox. A warning annotation will show up on the workflow run's summary, but nobody will notice unless they specifically go looking for it.
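The fallback behavior can be sketched roughly as follows. The function name `resolve_anchor`, the `anchor_to_page` mapping, and the `root` parameter are all hypothetical, and emitting the annotation by printing a GitHub Actions `::warning` workflow command is an assumption about the mechanism rather than a description of this PR's code.

```python
def resolve_anchor(anchor: str, anchor_to_page: dict, root: str = "/latest/") -> str:
    """Rewrite a README anchor to point at the generated page that holds it.

    Unknown anchors are left on the root page and reported as a warning
    instead of raising an exception.
    """
    page = anchor_to_page.get(anchor)
    if page is None:
        # GitHub Actions turns lines of this form into warning annotations.
        print(f"::warning title=Broken anchor::No heading matches #{anchor}")
        return f"{root}#{anchor}"
    return f"{root}{page}/#{anchor}"
```

With a mapping of `{"3-Call-mtSNV-with-mitoCaller": "pipeline-steps"}`, the known anchor resolves to `/latest/pipeline-steps/#3-Call-mtSNV-with-mitoCaller`, while `2-Align-mt-Reads-with-MToolBox` falls back to `/latest/#2-Align-mt-Reads-with-MToolBox` and prints a warning.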

Closes #19, closes #22

Checklist

This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem¹ - Even a small leak can be costly².
This PR does NOT contain germline genetic data³, RNA-Seq, DNA methylation, microbiome or other molecular data⁴.

This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.

To automatically exclude such files using a .gitignore file, see here for example.

I have read the code review guidelines and the code review best practice on GitHub check-list.
I have set up or verified the main branch protection rule following the github standards before opening this pull request.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

¹ UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records
² The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.
³ Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs.
⁴ RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.

yashpatel6

Looks good!

Changes made

yashpatel6

Still looks good!

nwiltsie added 3 commits March 6, 2024 15:24

Small bugfixes

961ce49

Fix broken links and markdown headers

f5cccbb

Update CHANGELOG

eea8526

nwiltsie requested a review from a team March 7, 2024 00:11

yashpatel6 previously approved these changes Mar 7, 2024


nwiltsie added 2 commits March 7, 2024 08:56

Handle repeated headings

beb9474

Silence pylint warning

26e6825

yashpatel6 approved these changes Mar 7, 2024


nwiltsie merged commit 1d81954 into main Mar 7, 2024
1 check passed

nwiltsie deleted the nwiltsie-fix-broken-links branch March 7, 2024 17:05
