Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial Feedback #1

Closed
chrisjsewell opened this issue Apr 2, 2021 · 33 comments
Closed

Initial Feedback #1

chrisjsewell opened this issue Apr 2, 2021 · 33 comments

Comments

@chrisjsewell
Copy link
Member

chrisjsewell commented Apr 2, 2021

This implements executablebooks/meta#292

Everything should be documented in the README (and what is left TODO)

The _toc.yml differs slightly from the one in https://jupyterbook.org/customize/toc.html, to streamline it a bit (a conversion function can be written for migration):

  • defaults are specified under a separate defaults top-level key, rather under the first document

  • titlesonly is not set to True by default

  • The toc is always heirarchical, i.e.

    - file: intro
    - file: doc1
    - file: doc2

    would now be:

    format: jb-article
    root: intro
    sections:
    - file: doc1
    - file: doc2

    Its a little more verbose, for these very simple tocs, but removes ambiguity, maps more closely to sphinx toctrees and is easier to explain/understand.

  • removed chapters key, as it seems redundant, i.e. only parts (-> toctree directives in a document) and sections (-> lines in a toctree) are necessary to generate any sphinx site map.

  • changed file to doc (changed this back, so as not to deviate unnecessarily)

cc @choldgraf, @AakashGfude, @mmcky, @ericholscher, @pradyunsg, @hukkinj1 for comment

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 3, 2021

@ericholscher you might be able to help me with this: the RTD docs are building (https://readthedocs.org/projects/sphinx-external-toc/builds/) but the site says they do not exist: https://sphinx-external-toc.readthedocs.io 🤔

@chrisjsewell
Copy link
Member Author

also cc @mgeier, since its an alternative to https://nbsphinx.readthedocs.io/en/0.8.2/subdir/toctree.html

@mgeier
Copy link

mgeier commented Apr 3, 2021

Thanks for cc-ing!

I like having the toctree links in the notebooks themselves, because then they can also be clicked in JupyterLab.

But I guess if somebody wants to use it, the external TOC extension would work just as well with Jupyter notebooks and nbsphinx as it would work with "normal" Sphinx source files, right?

@chrisjsewell
Copy link
Member Author

But I guess if somebody wants to use it, the external TOC extension would work just as well with Jupyter notebooks and nbsphinx as it would work with "normal" Sphinx source files, right?

Yes indeed, the core extension is input format agnostic (modifying the parsed AST) 👍

@AakashGfude
Copy link
Member

AakashGfude commented Apr 7, 2021

Thank you @chrisjsewell

root:
  file: intro
  sections:
  - file: doc1
  - file: doc2

Is there a case when main won't be used in a toc? If it is used every time along with the master doc, then might be redundant?

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 7, 2021

yeh you could potentially have:

root: intro
sections:
- file: doc1
- file: doc2

which is also equivalent to:

root: intro
parts:
- sections:
  - file: doc1
  - file: doc2

but would just want to think if this "clutters" the top-level namespace or restricts addition of later keys

@chrisjsewell
Copy link
Member Author

yeh you could potentially have ...

now changed and the initial comment edited

@choldgraf
Copy link
Member

choldgraf commented Apr 10, 2021

A couple quick thoughts:

  • This is awesome, much better implementation than our hacky "create rST and MD from scratch and append to documents" approach. I'm looking forward to removing all of that nasty code!
  • I like the fact that we're forcing hierarchy. I regret the "magic" in our current TOC logic to assume the first file is the root, because it's more brittle. I think this is a better decision.
  • Is there a way that we can specify global defaults for the toc without it being in _toc.yml? E.g., if we were to use this in jupyter book, we'd probably want titlesonly / hidden to be on by default for all tocs, no? In the jupyter book PR we manually added that, but I suspect all JB users will expect this by default, no?
  • I think that in our docs for this, we should start off with simpler examples, since beginning with multi-part TOCs might be a lot to take in for others. Mind if I add a PR w/ something like a beginner tutorial? This will also be a way for me to figure out if there's anything in the spec that feels awkward.
  • On removing the chapter key - I don't have strong opinions about this one. Agreed that it's redundant w/ sections, though I think we originally kept it in order to give authors a mental model of their book's structure (even though chapters is basically just an alias for sections as you point out). That said, I also see a benefit in keeping the syntax people need to remember relatively small. Curious if @mmcky and @jstac have any thoughts on that?
  • (future idea, not blocking on a release) - if people wanted to use tableofcontents directives to put toctrees on multiple parts of a page, I wonder if we could let them do so by providing a unique id: my-id for each toctree in _toc.yml and then referring to it with their tableofcontents directive.
  • I see the word main: used here, but the word root: used in the README, which is up-to-date? (I think that root is more technically correct, while main is more human-readable)

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 11, 2021

Thanks for the comments @choldgraf

Is there a way that we can specify global defaults for the toc without it being in _toc.yml? E.g., if we were to use this in jupyter book, we'd probably want titlesonly / hidden to be on by default

Note hidden is already True by default, then for titlesonly; firstly can you clarify what the rationale is to have this True?
The defaults key is already a global default, so it would be a bit odd to have multiple levels of global defaults. If we do decide for jupyter-book to keep this, then I would likely look to have a "jupyter-book special" code that changed this default to True (e.g. by checking an environmental variable)

On removing the chapter key - I don't have strong opinions about this one. Agreed that it's redundant w/ sections, though I think we originally kept it in order to give authors a mental model of their book's structure (even though chapters is basically just an alias for sections as you point out).

Exactly we should choose one, otherwise its misleading in that it implies a difference between the two, and is more for users to understand/remember.

I see the word main: used here, but the word root: used in the README, which is up-to-date? (I think that root is more technically correct, while main is more human-readable)

No its root, IMO main is too generic whereas root perfectly describes that it is the document at the root of the sitemap

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 11, 2021

the RTD docs are building (readthedocs.org/projects/sphinx-external-toc/builds) but the site says they do not exist: sphinx-external-toc.readthedocs.io 🤔

Ah I figured out this issue; it is because the root/master_doc is set as intro and so there is no index.html that RTD can find.

Should/has this been opened as a bug on sphinx, that if master_doc is not set to index then it should probably handle it?

Anyhow, I guess this relates to my comment here: https://github.com/executablebooks/jupyter-book/pull/1293/files#r610150939, such that this logic should be moved here, and if root is not index then an index.html should be created (given a sphinx html build) if it does not already exist (i.e. due to a single file build) and also a warning should be generated if a non-root document is called index (and also search and genindex from #16).
The index page could either be a redirect to the real root, or simply a copy of the actual root file I guess.

@chrisjsewell
Copy link
Member Author

Should/has this been opened as a bug on sphinx

Found an RTD issue for it: readthedocs/readthedocs.org#1800

@mmcky
Copy link
Member

mmcky commented Apr 12, 2021

thanks @chrisjsewell on this work. This is a great idea. I just had a few comments

removing the chapter key - I don't have strong opinions about this one. Agreed that it's redundant w/ sections, though I think we originally kept it in order to give authors a mental model of their book's structure (even though chapters is basically just an alias for sections as you point out). That said, I also see a benefit in keeping the syntax people need to remember relatively small.

Are chapter and section absolute equivalents here? If they are I see no downside to allowing either key as a lot of authors would prefer to write the toc in terms of chapters. A section is should be a sub component of a chapter.

Alternatively @chrisjsewell would this extension allow:

  1. a general syntax for setting up the toc, and
  2. a book style syntax that uses chapters, parts and sections.

We could then develop a mapping from the general to the book style of toc.

root

Just one comment re root. I completely agree it technically describes the purpose perfectly. However a lot of non-technical users won't know what root mean. If we want to be author friendly perhaps we allow synonyms for this key to include main?

@hukkin
Copy link

hukkin commented Apr 12, 2021

Regarding chapters/sections/main/root, I personally think there are downsides to aliases

  • Requires more documentation, more to internalize
  • Seeing a ToC with mixed chapters and sections makes me think Sphinx thinks they're different. The _toc.yaml is not shown to a reader, it is parsed by Sphinx. The exact term used is not as important as the need to understand what structure I'm creating in Sphinx, IMO.
  • Moving across projects where different terms used can be confusing

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 12, 2021

Yes I'm completely against aliases as well, it just makes things more confusing. When I tried to read the documentation on the toc on jupyter-book, it was honestly super-confusing and not clear at all when to use sections or chapter/chapters.

A section is a sub component of a chapter.

See @mmcky even you don't understand the current structure lol, because this is definitely incorrect/outdated: there is no section or chapter keys; chapter was already deprecated to part: https://github.com/executablebooks/jupyter-book/blob/8be06d8b8920d4967737d7ffec91af691d7100b1/jupyter_book/toc.py#L67-L68, then sections or chapters (which are currently just aliases) are sub-components of part

However a lot of non-technical users won't know what root means.

If I am over-ruled by consensus fair enough, but I think it is very easy/clear to explain that root is the "root of a tree". In fact, I would put something like this in the documentation:

image

which I think would make it much clearer, and actually give users this mental model that you guys keep alluding to but never showing 😜

@chrisjsewell
Copy link
Member Author

In fact, I would put something like this in the documentation

I could even add this toc.yml -> tree generation to the CLI (using https://graphviz.readthedocs.io as an optional dependency) if people think it would be helpful.

@hukkin
Copy link

hukkin commented Apr 12, 2021

Looking at @chrisjsewell excellent visualization, what comes to my mind is that if there's a naming change that would help me get the right mental model, it would be renaming parts as tocs by the way. Isn't that what parts means to Sphinx? A list of one or more table of contents.

EDIT: On second thought, I don't like this idea. Don't listen to me 😄

@chrisjsewell
Copy link
Member Author

On second thought, I don't like this idea

haha yeh I was just typing:

it would be renaming parts as tocs

I would be hesitant, because (a) it takes it further away from the current jupyter-book toc.yml and (b) maybe it fells a bit weird to have a _toc.yml that then has separate toc keys in it

@mmcky
Copy link
Member

mmcky commented Apr 12, 2021

hey @chrisjsewell this _toc.yml uses the chapters key under part and is a good way to record a collection of chapters that form a part. It maps to how you think of a book.

I am all for a general structure for organising the toc, and I think this syntax is great for me to use. But I do think we need to spend a bit of time looking at it from the non technical end user perspective. I suspect your response will be if it is in the documentation then they should just follow up -- that is fine (and in part that is valid) -- but the general syntax takes a cohort of users out of their writing model which makes it harder to write a book.

FWIW - I am also against aliases unless there is a super clear 1:1 mapping and that isn't the case in these toc schema's.

I think the way forward would be to enable a domain based approach. We have the base schema (as you're proposing) and then allow different structures (that are probably more restrictive subsets -- such as for writing books, articles). That way we can take the book schema and ensure support in LaTeX for example. It is harder to support general structures that are more suitable for html then a more restrictive model that is more tightly tied to writing books. For example, on the LaTeX side it is harder to figure out if a nested section belongs to a chapter etc. (whereas it is a page in the html version). A domain based approach would also then enable preface:, chapter:, appendix:. etc. and this would then (under the hood) map to the more general structure using root etc.

@chrisjsewell
Copy link
Member Author

one quick note

uses the chapter key under part

It doesn't use the chapter key it uses the chapters key, this is an important distinction, since there has previously been a chapter key which was different

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 12, 2021

Allow different structures (that are probably more restrictive subsets -- such as for writing books, articles).

Then you/we need to come up with some code/logic to deal with this.

Thinking about it now, as you allude to, the current toc.yml structure has essentially been written with the assumption that it will map to a LaTeX theme that converts:

index.rst

Index Title
===========

.. toctree::
   :caption: Caption

   doc1

+ doc1.rst

Doc 1 Title
===========

.. toctree::

   doc2

+ doc2.rst

Doc 2 Title
===========

to

\part{Caption}
\chapter{Doc 1 Title}
\section{Doc 2 Title}

(see https://github.com/executablebooks/jupyterbook-latex/blob/master/tests/roots/test-partsToc -> https://github.com/executablebooks/jupyterbook-latex/blob/master/tests/test_toc/test_toc.tex)

This is most definitely not always the case:

  1. Different LaTeX themes can map this differently, and in fact the user can change it with: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-latex_toplevel_sectioning

If I use the default sphinx LaTeX theme, then it creates:

\addto\captionsenglish{\renewcommand{\contentsname}{Caption}}
\chapter{Doc 1 Title}
\section{Doc 2 Title}

or if I set latex_toplevel_sectioning = 'part' you get:

\addto\captionsenglish{\renewcommand{\contentsname}{Caption}}
\part{Doc 1 Title}
\chapter{Doc 2 Title}

(I assume @mmcky/@AakashGfude have added something in jupyterbook-latex to turn the toctree captions into parts?)

  1. It can change if you have multiple header levels e.g. in the index/root file, i.e. if index.rst was
Index Title
===========

Index Sub-Header
----------------

.. toctree::
   :caption: Caption

   doc1

and latex_toplevel_sectioning = 'part' you end up with:

\addto\captionsenglish{\renewcommand{\contentsname}{Caption}}
\part{Index Sub-Header}
\chapter{Doc 1 Title}
\section{Doc 2 Title}

So actually I would be in favour of moving away from using these key names for the general structure, which can be misleading, to something like:

  • parts -> subtrees
  • sections -> items (or nodes, elements, children)

Then OK you can think of having some kind of "theme specific mapping" (maybe specified under a theme/flavor key), e.g. for "jupyterbook"

  • (top level only) subtrees -> parts
  • no nested subtrees allowed?
  • (top level only) items - > chapters
  • (second level only) items - > sections
  • (third level only) items - > subsections
  • (fourth level only) items -> subsubsections
  • no other nesting levels allowed

Ideally here, you should also warn of things that would make this mapping incorrect e.g. if the root file had more than one heading level

A domain based approach would also then enable preface:, appendix:

appendix (and prefix) are already planned (see https://github.com/executablebooks/sphinx-external-toc#development-notes) these would be separate top-level keys in the toc.yml and nothing in the current structure precludes them

@mmcky
Copy link
Member

mmcky commented Apr 13, 2021

I assume @mmcky/@AakashGfude have added something in jupyterbook-latex to turn the toctree captions into parts?

jupyterbook-latex uses transforms to modify the toc to shape it into a suitable format to get an output we would like from the the generic sphinx latex writer. The part option is turned on if part is detected in the toc and is used to generate the latex output. If a part is in the _toc.yml then the text associated with it will be used in the latex to name the part.

Also the part option through the latex config is documented in the jupyterbook docs but it didn't provide a completely correct mapping for the two structures offered by jupyterbook so we needed to alter the sphinx.ast through a transform:

  1. Files
  2. Parts/Chapters
  • So I guess @AakashGfude we would need to modify part recognition to check for a text label definition for translating to a part in LaTeX otherwise ignore and follow your nesting rules for including children sections into a chapter etc.

Effectively any nested documents is a subtree to sphinx right?

So

  1. part -> chapters is a tree relationship (as you have mentioned), but also
  2. file -> sections is also a subtree relationship.

In jupyterbook-latex any children added as section are folded into the parent file for the pdf.


Then OK you can think of having some kind of "theme specific mapping"

@chrisjsewell I like the idea of having a core schema for the toc that directly relates to how sphinx works (in particular fordevelopers). I think that will really help with the cognitive issues when translating between how sphinx works (i.e. tree structures and flexible) and jupyter book table of contents layouts (easy to use).

But I think the user friendly layer that is more restrictive (and sits on top of the core schema) is essential to put some structure around the assumptions that are needed to be made to map to books etc. in the translation process.


When working on LaTeX we have been using a mixture of docutils / sphinx docs and test configurations such as this ebp-test-projectstructure repo to look at various configurations and how they related to the current structures:

  1. Files
  2. Parts/Chapters

Thanks @chrisjsewell -- I see a spec starting to form around this work.

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 13, 2021

thanks @mmcky

Effectively any nested documents is a subtree to sphinx right?

Well no, not if it does not contain a toctree.
@mmcky, @AakashGfude note you should no longer be loading _toc.yml anywhere in your code (or even assuming it exists). You should always be using app.env.external_site_map, which will be an instance of:

class SiteMap(MutableMapping):

This construct is also completely independent of the key names in the actual YAML, e.g.
to check for "parts" (i.e. multiple toctrees in the root document) you could just do something like:

if len(env.external_site_map.root.subtrees) > 1:

and I would emphasize again that your interpretation of a part and a chapter in jupyterbook-latex are predicated on the assumption that there are no sub-headings in the root/master document (otherwise everything gets "pushed down"), so I would certainly check for and warn about additional section nodes in that document.

@mmcky
Copy link
Member

mmcky commented Apr 14, 2021

hey @chrisjsewell I have come to realise this is primarily a sphinx extension so your aim here is to make whatever sphinx can represent as an external toc. I have been commenting primarily from the jupyterbook viewpoint which is a subset of what sphinx toctree objects can represent.

So we will need to either:

  1. Add an ability for domain support here that jupyter book can plug into, OR
  2. Add the ability for jupyterbook to convert to general yaml configurations (that are constructed here) from current _toc.yml files.

Is that right?

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 16, 2021

Is that right?

Kind of, there's really no difference to what jupyter-book already does, it's just the decision about key-names, which is essentially entirely superficial as it does not affect the underlying API/AST

Add the ability for jupyterbook to convert to general yaml configurations (that are constructed here) from current _toc.yml files.

again I would emphasise that jupyter-book, or its components (like jupyterbook-latex) should not even assume the existence of a _toc.yml any more. The parsing is all dealt with here and stored as env.external_site_map: SiteMap

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 18, 2021

Ok this is good to go in my eyes:

See the README (and commit history) for all details.
For the "default" file format: parts -> subtrees and sections -> items, so that these keys do not have misleading connotation for LaTeX builds.

Then for @mmcky's request for "domains" there is the format key: https://github.com/executablebooks/sphinx-external-toc#using-different-key-mappings

You can see the jupyter-book migrations in the test files:
from https://github.com/executablebooks/sphinx-external-toc/tree/main/tests/_jb_migrate_toc_files
to https://github.com/executablebooks/sphinx-external-toc/tree/main/tests/test_tools

and also you can see the diffs in https://github.com/executablebooks/jupyter-book/pull/1293/files

and so basically for jupyter-book _toc.yml, they will look like either:

format: jb-article
root: index
sections:
- file: subfolder/index
  sections:
  - file: subfolder/asubpage1
- file: content1

or

format: jb-book
root: index
chapters:
- file: subfolder/index
  sections:
  - file: subfolder/asubpage1
- file: content1

or

format: jb-book
root: index
parts:
- caption: A part
  chapters:
  - file: subfolder/index
    sections:
    - file: subfolder/asubpage1
- caption: Another part
  chapters:
  - file: content1

@mmcky
Copy link
Member

mmcky commented Apr 19, 2021

This is looking really neat @chrisjsewell - thanks for your work on support for domains.

I just had a few follow up comments.

  1. I have noticed you have used root: index but isn't the index (from sphinx perspective) defined by the _toc.yml for jupyter-book? The root file in the jupyterbook context has the same role as the current first file right? It is essentially preamble / frontmatter (kind off in the PDF context) or the front page in the HTML context.

  2. In the future do you think we could specify other domains such as such as a much simpler article representation that map into the more general syntax you have put together (with stricter assumptions), something like:

format: article
title: <text>
introduction: <file>
section: <file>
section: <file>
section: <file>
conclusion: <file>

and for books

format: book
title: <text>
frontmatter: <file>
toc: true
chapter: <file>
chapter: <file>
chapter: <file>
appendix: <file>

I think these domains will be super helpful in controlling what is (and isn't supported) across the various document output types. (i.e. pdf support is provided for book, article)

@chrisjsewell
Copy link
Member Author

have noticed you have used root: index but isn't the index (from sphinx perspective) defined by the _toc.yml for jupyter-book?

I'm not quite sure I quite understand this sentence lol.
Put simply root == root_doc (master_doc is actually deprecated); in sphinx you cannot build any documentation without one, so it has to be present in some form.
This "entry point document" maps to HTML, where it is generally expected for an index.html to be present, and LaTeX, where you can specify% !TeX root = main.tex.

In the future do you think we could specify other domains such as ...

Well first you would need to define how such formats would be implemented by sphinx?
Presumably you would have to auto-generate the root document in some way

@choldgraf
Copy link
Member

choldgraf commented Apr 19, 2021

Hey all - I gave the docs another read-through, and had a few ideas / comments. Will append to this list as I think of more. In general this is super close though, it looks really nice:

@chrisjsewell
Copy link
Member Author

we should note that master_doc is becoming root_doc in a future Sphinx

yeh it was introduced as an alias in v4: sphinx-doc/sphinx#9116

@choldgraf
Copy link
Member

ok those issues are the main ones that I can think of - I opened them up as separate issues because this mega-thread was becoming difficult to follow. I welcome thoughts and feedback from folks 👍

@mmcky
Copy link
Member

mmcky commented Apr 20, 2021

I'm not quite sure I quite understand this sentence lol.
Put simply root == root_doc (master_doc is actually deprecated); in sphinx you cannot build any documentation without one, so it has to be present in some form.

Right. I just mean in the sphinx context this document typically contains the main toctree but that work is now taken care of by the _toc.yml or yml representation of the toc. So in effect the root_doc is now really the _toc.yml (in a way) as it defines the relationships between files. So the main function of the root_doc (for jupyter-book) is now to host the title and the frontmatter (in the pdf context) and the landing page content (in the html context)?

Well first you would need to define how such formats would be implemented by sphinx?
Presumably you would have to auto-generate the root document in some way

Right -- that's what I am thinking (or replacing the root_doc requirement with the presence of the _toc.yml and generating a synthetic root_doc for sphinx internals)

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Apr 20, 2021

So the main function of the root_doc (for jupyter-book) is now to host the title and the frontmatter (in the pdf context) and the landing page content (in the html context)?
replacing the root_doc requirement with the presence of the _toc.yml and generating a synthetic root_doc for sphinx internals

exactly, so this is obviously functionality that does not currently exist in sphinx or jupyter-book, but could certainly be investigated and likely the code would reside in this package and be accomodated by specifying a new format that may not require the root key.

But yeh definitely a "version 2" consideration 😉

@chrisjsewell
Copy link
Member Author

closing this, since the package is now out of development

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants