Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load the glossary from the Octez docs #417

Merged
merged 16 commits into from
Aug 1, 2024
Merged

Conversation

timothymcmackin
Copy link
Collaborator

@timothymcmackin timothymcmackin commented Jun 28, 2024

Grab the glossary from the Octez docs, use the script they provide to extract the content, and overwrite the existing glossary.md file.

I tried editing the output HTML file after the build, but in that case Docusaurus was caching the old version of the file. Based on discussion with some colleagues, it appears to be better to do this manipulation before the build.

@timothymcmackin timothymcmackin self-assigned this Jun 28, 2024
Copy link

vercel bot commented Jun 28, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
docs-staging ✅ Ready (Inspect) Visit Preview Jul 25, 2024 8:40pm

@NicNomadic
Copy link
Collaborator

Looks great! I generated the glossary and built the doc locally, and everything seems fine.
The only little thing I noticed is that external links such as https://tezos.gitlab.io/paris/consensus.html#slashing-paris are not open in a separate tab, nor do they show up with the little weblink icon. But maybe that's not important.

@timothymcmackin
Copy link
Collaborator Author

Looks great! I generated the glossary and built the doc locally, and everything seems fine. The only little thing I noticed is that external links such as https://tezos.gitlab.io/paris/consensus.html#slashing-paris are not open in a separate tab, nor do they show up with the little weblink icon. But maybe that's not important.

Made external links open in a new tab and show the external link icon.

@NicNomadic
Copy link
Collaborator

Made external links open in a new tab and show the external link icon.

Strangely, I see now an empty page with just a link to the Octez glossary, which indeed opens in a new tab. So this could also work, but I think it's not what you're trying to do, right?

@timothymcmackin
Copy link
Collaborator Author

Made external links open in a new tab and show the external link icon.

Strangely, I see now an empty page with just a link to the Octez glossary, which indeed opens in a new tab. So this could also work, but I think it's not what you're trying to do, right?

There's a glitch in docusaurus; when I edit the page after the build, the docusaurus serve command shows the old page -- it caches it somehow. When we deploy to AWS, only the built output gets pushed, not the cache, so I think this will work. I'm investigating the cache issue.

@timothymcmackin timothymcmackin marked this pull request as ready for review July 2, 2024 11:18
@timothymcmackin timothymcmackin requested a review from a team as a code owner July 2, 2024 11:18
beatalipska
beatalipska previously approved these changes Jul 2, 2024
@timothymcmackin
Copy link
Collaborator Author

@NicNomadic

Strangely, I see now an empty page with just a link to the Octez glossary,

I've reworked this PR to edit the glossary file before the build to avoid the docusaurus caching issue. Now you can see the imported glossary in the preview: https://docs-staging-git-include-html-trili-tech.vercel.app/overview/glossary

@NicNomadic
Copy link
Collaborator

Beware, you import the glossary for Alpha, but you should import the active protocol's instead (from https://tezos.gitlab.io/active/glossary.html).

@NicNomadic
Copy link
Collaborator

The terms no longer appear in bold, it was nicer before!

@NicNomadic
Copy link
Collaborator

Clicking on a term goes a bit after its definition (both in Safari and Firefox), is this hard to fix?

@NicNomadic
Copy link
Collaborator

Links to the pages within the Octez, such as "Slashing" or "application time" don't work. Their URL contains "...//..." instead of ".../active/...".

@timothymcmackin
Copy link
Collaborator Author

The terms no longer appear in bold,

Fixed, thx for catching that regression.

import the active protocol's instead

Fixed.

Their URL contains "...//..." instead of ".../active/...".

I see this in the output of the extract_content script -- could be a bug on that side?

Clicking on a term goes a bit after its definition

Investigating...

@NicNomadic
Copy link
Collaborator

Their URL contains "...//..." instead of ".../active/...".
I see this in the output of the extract_content script -- could be a bug on that side?

This issue should be fixed when merging https://gitlab.com/tezos/tezos/-/merge_requests/13987. Or use already the script in that branch: https://gitlab.com/nomadic-labs/tezos/-/raw/nic@global-glossary-doc/docs/scripts/extract_content.

@timothymcmackin
Copy link
Collaborator Author

Clicking on a term goes a bit after its definition

I don't know why this is happening so I added a custom script to add some space after the user clicks an anchor on the page. It won't work if the user comes to the page directly via a URL.

@timothymcmackin
Copy link
Collaborator Author

This issue should be fixed...

Yes, that new script seems to work. It leaves only a few URLs like this, which look odd but work:
https://tezos.gitlab.io/active/../shell/validation.html#shell-header

@NicNomadic
Copy link
Collaborator

Yes, that new script seems to work.

I wanted to validate your last version, but when I run npm run build it stops with an error:

Used Octez docs script to pull the content from its glossary
/Users/nic/github/tezos-developer-docs/src/scripts/process_downloaded_glossary.js:86
  const h1 = trimmed.querySelector('h1');
                     ^
TypeError: Cannot read properties of null (reading 'querySelector')
    at process_glossary (/Users/nic/github/tezos-developer-docs/src/scripts/process_downloaded_glossary.js:86:22)

Any suggestion?

@timothymcmackin
Copy link
Collaborator Author

Any suggestion?

@NicNomadic An element changed in the glossary. Load my updated version and it should work now.

@NicNomadic
Copy link
Collaborator

I checked that most of the issues I raised have been solved, including:

  • The terms no longer appear in bold,
  • import the active protocol's instead [of Alpha]
  • Clicking on a term goes a bit after its definition

However, it remains the following one:

Links to the pages within the Octez, such as "Slashing" or "application time" don't work. Their URL contains "...//..." instead of ".../active/...".

Now links no more contain "...//...", but they don't have the right prefix, which should be https://tezos.gitlab.io/active/ instead of http://localhost:3000/overview/. As a result, clicking on those links results in a 404 error page.

@timothymcmackin
Copy link
Collaborator Author

As a result, clicking on those links results in a 404 error page.

This build process runs the conversion script at https://gitlab.com/tezos/tezos/-/raw/master/docs/scripts/extract_content?ref_type=heads&inline=false, which should fix the links. I'll investigate.

@timothymcmackin
Copy link
Collaborator Author

timothymcmackin commented Jul 25, 2024

@NicNomadic This seems to work: I've taken the two find-replace commands from your extract_content script and handled them myself so I don't have to bother calling awk. Now the external and internal links seem to work. This is starting to seem like a fragile solution, though. I wonder if it would be better if I got the raw RST and ran it through pandoc or something to get markdown.

https://docs-staging-iibre089k-trili-tech.vercel.app/overview/glossary

@NicNomadic
Copy link
Collaborator

@NicNomadic This seems to work [...]. This is starting to seem like a fragile solution, though. I wonder if it would be better if I got the raw RST and ran it through pandoc or something to get markdown.

I agree that this is a low-cost and somewhat fragile solution, but I tried with Pandoc and it didn't handle the advanced RST combinations used in the glossary, so for the least this other solution requires more work to set up. What if we start with the current simple solution as long as it works (modulo simple adaptations), and if this becomes too difficult later on, we can explore further the Pandoc-based track?

Copy link
Collaborator

@NicNomadic NicNomadic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@timothymcmackin timothymcmackin merged commit 2de16fe into staging Aug 1, 2024
4 checks passed
@timothymcmackin timothymcmackin deleted the include-html branch August 1, 2024 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants