Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Titles for footnotes #390

Open
wants to merge 5 commits into
base: lib
Choose a base branch
from
Open

Titles for footnotes #390

wants to merge 5 commits into from

Conversation

edent
Copy link

@edent edent commented Mar 26, 2023

Re #387 - I've created a function which adds truncated title text to footnote links, and I've added some tests to show what it does.

Very happy for any comments or corrections.

Copy link
Owner

@michelf michelf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about the performance of this approach for get_footnote_title. Maybe it's fast and I'm worrying too much, but I've always tried to keep the parser extremely fast so the library can be used without caching in many situations. To me, running the HTML parser to create a DOM, searching the DOM to replace various things, then rewriting that DOM to HTML, stripping the HTML tags, splitting on every space and reassembling those words, and calling mb_strlen in a loop, is not very efficient. So I'm wary and I guess I'll want to see some before/after performance benchmarks, especially with long footnotes, before accepting it to be sure this is not a regression in speed.

I also see tests are failing in CI. Are you sure the output is well-formed XHTML?

@@ -82,20 +100,76 @@ three lines, with some span-level markup like
</li>

<li id="fn:reference" role="doc-endnote">
<p>This footnote has a footnote of its own.<sup id="fnref:nested"><a href="#fn:nested" class="footnote-ref" role="doc-noteref">12</a></sup>&#160;<a href="#fnref:reference" class="footnote-backref" role="doc-backlink">&#8617;&#xFE0E;</a></p>
<p>This footnote has a footnote of its own.<sup id="fnref:nested"><a href="#fn:nested" class="footnote-ref" title="This footnote should appear even though it is referenced from another footnote. But Ffn:reference: should be literal since the footnote with that name has already been used." role="doc-noteref">20</a></sup>&#160;<a href="#fnref:reference" class="footnote-backref" role="doc-backlink">&#8617;&#xFE0E;</a></p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "But Ffn:reference: should" isn't the intended result here.


<p>Some paragraph with a footnote<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">5</a></sup>, and another<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">6</a></sup>.</p>
<p>Some paragraph with a footnote<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" title="Content for fifth footnote." role="doc-noteref">5</a></sup>, and another<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" title="Content for sixth footnote spanning on three lines, with some span-level markup like emphasis, a link." role="doc-noteref">6</a></sup>.</p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this footnote be on three lines?


<p>This paragraph links to a footnote with plenty of
block-level content.<sup id="fnref:block"><a href="#fn:block" class="footnote-ref" role="doc-noteref">8</a></sup></p>
block-level content.<sup id="fnref:block"><a href="#fn:block" class="footnote-ref" title="Paragraph. * List item > Blockquote Code block" role="doc-noteref">8</a></sup></p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really make sense to flatten block-level markup like this? At a minimum it should take three lines (using newline characters). But personally, my inclination would be to truncate after the first paragraph instead of doing a textual approximation of lists, blockquotes or other block-level markup.


<p>This paragraph host the footnote reference within a
footnote test<sup id="fnref:reference"><a href="#fn:reference" class="footnote-ref" role="doc-noteref">9</a></sup>.</p>
footnote test<sup id="fnref:reference"><a href="#fn:reference" class="footnote-ref" title="This footnote has a footnote of its own.Ffn:nested:" role="doc-noteref">9</a></sup>.</p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "Ffn:nested:" is what we want here. (What do we want for nested footnotes is a good question.)


<p>Very long footnotes are truncated at word boundaries<sup id="fnref:long"><a href="#fn:long" class="footnote-ref" title="Sensors indicate no shuttle or other ships in this sector. According to coordinates, we have travelled 7,000 light years and are located near the system J-25. Tractor beam released, sir. Force field…" role="doc-noteref">13</a></sup>.</p>

<p>Although some titles may be split-mid emoji if they use ZWJ<sup id="fnref:zwj"><a href="#fn:zwj" class="footnote-ref" title="👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍🦲👨‍…" role="doc-noteref">14</a></sup>.</p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting characters is unfortunate. And I suppose the same thing could happen to any unicode character composed of multiple code points since the string is sliced on a code point boundary. So this is most likely not only a problem for emoji.


<p>Also Emoji<sup id="fnref:🥰"><a href="#fn:🥰" class="footnote-ref" title="🥰" role="doc-noteref">17</a></sup> compatible.</p>

<p>All the text present in a table should be in the title tooltip<sup id="fnref:table"><a href="#fn:table" class="footnote-ref" title="The table header The table body with two columns Footer A demo table" role="doc-noteref">18</a></sup>.</p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better approximation would be to show the table with | separators between columns and a newline after each row. Personally, my choice would be to not offer a title attribute here, forcing people to click the link to see a proper table.

@@ -1760,8 +1842,10 @@ protected function _appendFootnotes_callback($matches) {
$class = $this->encodeAttribute($class);
$attr .= " class=\"$class\"";
}
if ($this->fn_link_title !== "") {
$title = $this->fn_link_title;
if ($this->fn_link_title == "") {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't make much sense. If a title was specified by fn_link_title it shouldn't be ignored. There should be an else clause.

if ($this->fn_link_title == "") {
// Decode any markdown in the footnote
$title = $this->get_footnote_title( $this->formParagraphs( $this->footnotes[$node_id] ) );
// Format it to be suitable for a title tool-tip
$title = $this->encodeAttribute($title);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure encodeAttribute will handle correctly your generated footnote title here when the no_entities configuration variable is set to true. I haven't checked, but I think it could double encode & and < into &amp;amp; and &lt;lt;.

$title = $this->fn_link_title;
if ($this->fn_link_title == "") {
// Decode any markdown in the footnote
$title = $this->get_footnote_title( $this->formParagraphs( $this->footnotes[$node_id] ) );
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate we're calling `formParagraph twice for every footnote now. Once for the footnote reference, and once for the actual footnote at the end of the document. More than twice if there is more than one references to the same footnote. I think this requires more strategy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants