Fix bugs in documentation generation workflow #21

Merged
nwiltsie merged 5 commits into main from nwiltsie-fix-broken-links
Mar 7, 2024

Conversation

nwiltsie
Member

@nwiltsie nwiltsie commented Mar 7, 2024

Description

This fixes four issues with the documentation generation:

  1. backfill.py tries to push to ghpages and not gh-pages 🤦
  2. Repeated headings trip an assertion failure
  3. Bad anchor links cause the process to crash
  4. Headings with Markdown formatting (links, emphasis, whatever) have that formatting show up in the table of contents:
[Screenshot, 2024-03-06 3:43:41 PM: table of contents with heading formatting showing]

The first problem was a one-character fix.

I walked right into the second problem (#22) by assuming that, because headings get anchor links, they of course must always be unique. Obviously untrue, and GitHub handles that by appending numbers to repeated headings (see this README for an example). The generated docs will now behave the same way, with anchor links repeat, repeat-1, repeat-2, etc.
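As a rough illustration of that numbering scheme, here is a minimal sketch. It assumes the heading slugs have already been computed, and the name `unique_anchors` is hypothetical rather than anything taken from this PR's code:

```python
def unique_anchors(slugs):
    """Return one anchor per slug, suffixing repeats with -1, -2, ...

    Hypothetical helper: `slugs` are heading slugs in document order.
    """
    occurrences = {}  # anchor -> number of suffixed copies generated so far
    anchors = []
    for slug in slugs:
        candidate = slug
        # Keep bumping the counter until the candidate is unused, so a
        # literal "repeat-1" heading cannot collide with a generated one.
        while candidate in occurrences:
            occurrences[slug] += 1
            candidate = f"{slug}-{occurrences[slug]}"
        occurrences[candidate] = 0
        anchors.append(candidate)
    return anchors


print(unique_anchors(["repeat", "repeat", "repeat"]))
# ['repeat', 'repeat-1', 'repeat-2']
```

Tracking every emitted anchor, not just the base slug, is what keeps a generated `repeat-1` from clashing with a heading that is literally titled that way.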

For the last two problems, I added a strip_markdown function that uses some custom rules for a markdown-it renderer to strip away all inline formatting. That function is used for both the anchor links and the page titles, which ensures that the links work as expected and the table of contents looks correct, while the formatting persists on the actual pages:

[Screenshot, 2024-03-06 3:47:26 PM: rendered page with heading formatting intact]
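To give a feel for what stripping inline formatting means, here is a dependency-free sketch. It is not the markdown-it renderer approach described above: it swaps in plain regular expressions, handles only links, images, code spans, and emphasis, and the name `strip_inline_markdown` is hypothetical. A real Markdown parser covers many cases this does not, such as emphasis characters inside code spans.

```python
import re


def strip_inline_markdown(text):
    """Approximate plain-text version of a heading (regex sketch only)."""
    # Links [label](url) and images ![alt](url) collapse to their label.
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Code spans `code` keep their content.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Asterisk emphasis: strong first, then regular emphasis.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    # Underscore emphasis only at word edges, so snake_case survives.
    text = re.sub(r"(?<!\w)__(.+?)__(?!\w)", r"\1", text)
    text = re.sub(r"(?<!\w)_(.+?)_(?!\w)", r"\1", text)
    return text


print(strip_inline_markdown("Call *mtSNV* with [mitoCaller](https://example.com)"))
# Call mtSNV with mitoCaller
```

The stripped text would feed the anchor and the table-of-contents title, while the original heading text is left alone for the page body.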

Finally, anchor links that are genuinely incorrect are just left broken. Instead of being properly rewritten to point to the correct page, like http://localhost:8000/latest/pipeline-steps/#3-Call-mtSNV-with-mitoCaller, they are left as bad anchors on the root page, like http://localhost:8000/latest/#2-Align-mt-Reads-with-MToolBox. A warning annotation will show up on the workflow run's summary, but nobody will notice unless they specifically go looking for it.
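The behavior described above could be sketched roughly as follows. The function name `rewrite_anchor` and the `anchor_pages` mapping are assumptions for illustration, not the PR's actual code; the only real interface used is the GitHub Actions `::warning::` workflow command, which produces an annotation when printed to the job log.

```python
def rewrite_anchor(anchor, anchor_pages):
    """Return (link, warning) for an in-page link such as '#some-heading'.

    Hypothetical helper: `anchor_pages` maps anchor names (without '#')
    to the generated page that contains that heading.
    """
    name = anchor.lstrip("#")
    page = anchor_pages.get(name)
    if page is None:
        # Unknown anchor: leave the link untouched and report it
        # instead of crashing the documentation build.
        return anchor, f"::warning::Unresolved anchor link {anchor}"
    return f"{page}/#{name}", None


pages = {"3-Call-mtSNV-with-mitoCaller": "pipeline-steps"}
for link in ["#3-Call-mtSNV-with-mitoCaller", "#2-Align-mt-Reads-with-MToolBox"]:
    rewritten, warning = rewrite_anchor(link, pages)
    if warning:
        print(warning)  # picked up by GitHub Actions as a warning annotation
    print(rewritten)
```

Returning the warning rather than raising keeps one bad link from stopping the whole run, at the cost of the link staying broken until someone reads the annotations.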

Closes #19, closes #22

Checklist

  • This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
    Disclosing PHI is a major problem [1]. Even a small leak can be costly [2].

  • This PR does NOT contain germline genetic data [3], RNA-Seq, DNA methylation, microbiome or other molecular data [4].

  • This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.

  To automatically exclude such files using a .gitignore file, see here for example.

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have set up or verified the main branch protection rule following the github standards before opening this pull request.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

Footnotes

  1. UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records

  2. The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.

  3. Genetic information is considered PHI.
    Forensic assays can identify patients with as few as 21 SNPs

  4. RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.

@nwiltsie nwiltsie requested a review from a team March 7, 2024 00:11
yashpatel6 previously approved these changes Mar 7, 2024
Contributor

@yashpatel6 yashpatel6 left a comment

Looks good!

@yashpatel6 yashpatel6 dismissed their stale review March 7, 2024 16:58

Changes made

Contributor

@yashpatel6 yashpatel6 left a comment

Still looks good!

@nwiltsie nwiltsie merged commit 1d81954 into main Mar 7, 2024
1 check passed
@nwiltsie nwiltsie deleted the nwiltsie-fix-broken-links branch March 7, 2024 17:05
Development

Successfully merging this pull request may close these issues.

  • Handle docs issue with pipeline-call-sCNA
  • Handle headings with embedded links
2 participants