Support Unicode superscripts for HTML note markers #9437
Since HTML doesn't have semantic "footnote" elements, Pandoc has historically used the <sup> tag to mark the numeric reference to footnotes. In some fonts, depending on line-spacing, the common default <sup> style of "font-size: smaller; vertical-align: super;" doesn't look very good, spilling beyond the font's cap height and making browsers add extra space at the top of the text line.

Many fonts include characters from the Unicode superscripts and subscripts block (https://unicode.org/charts/nameslist/n_2070.html) which are designed to function as footnote markers. Using these characters to render note marks, instead of a <sup> tag, yields better typographical results in these cases without additional CSS. The <sup> tag is purely typographical, so losing it from the output doesn't cost anything semantically.

This diff adds a --note-style option to pandoc, taking the values "sup-tag" (the default and hitherto only method) and "unicode-superscript" (print marks using superscript chars, no surrounding tag).

Due to the nature of Note output in the HTML writer, a Lua filter cannot really customize how footnote marks are printed, justifying a writer option here. An alternative to adding this feature to Pandoc would be for authors to use CSS like 'a.footnote-ref sup { font-size: inherit; vertical-align: inherit; font-feature-settings: "sups"; }', which would work for fonts where the "sups" OpenType feature replaces digits with their superscript forms. That solution only works for fonts encoding that feature, though; Times New Roman on my system has the superscript characters but does not support the "sups" OpenType feature.

Future work could extend support for this writer option to plain output and possibly other formats where note marks are emitted by Pandoc rather than the renderer of the output document. (The present author has not studied whether there are such writer formats.)
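For readers who want to see what "print marks using superscript chars" amounts to, here is a minimal Python sketch of the digit mapping. It is an illustration only, not the Haskell code in this diff; the names `SUPERSCRIPT_DIGITS` and `to_superscript` are made up for the example. One detail worth noting is that the superscript forms of 1, 2 and 3 are encoded in the Latin-1 Supplement block rather than in the U+2070 block, so a font can cover some of these digits and not others.

```python
# Superscript 1, 2 and 3 are U+00B9, U+00B2 and U+00B3 (Latin-1 Supplement).
# Superscript 0 and 4-9 are U+2070 and U+2074-U+2079
# (Superscripts and Subscripts block).
SUPERSCRIPT_DIGITS = str.maketrans(
    "0123456789",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079",
)


def to_superscript(number: int) -> str:
    """Render a note number using Unicode superscript digits.

    Assumes a non-negative integer; any non-digit character
    (such as a minus sign) would pass through unchanged.
    """
    return str(number).translate(SUPERSCRIPT_DIGITS)
```

Under this mapping, note 12 would be emitted as the two characters U+00B9 U+00B2 with no surrounding tag.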
PR is missing changes to
Interesting idea! The only part I have reservations about is the new option; I really like to avoid adding options if at all possible. Hence I'm wondering how widely supported the superscripted characters are in fonts. E.g. are they found in the standard "web fonts"? If they are very widely supported, perhaps we could get away with just making this the standard behavior?
My comment in the commit message that "Times New Roman on my system has the superscript characters" was misleading… I didn't check all the numbers. Here's a test page styling these characters with the classic "web-safe fonts", plus

I agree that adding a command-line option for this subtle thing that only affects one output format is undesirable. One can imagine a filter-like callback in Lua like

One other non-actionable idea after thinking about all this is a future breaking release could axe the dedicated command-line flags for a bunch of Pandoc's less common knobs in favor of something like

The least intrusive thing I can do for myself is write something tiny to pipe Pandoc's HTML output into and just replace the

Hope this is all food for thought; I completely understand and can't really disagree if you just want to close this as too niche to support with an option and not immediately tractable in any other way.
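The "something tiny to pipe Pandoc's HTML output into" could look roughly like the Python sketch below. It is a guess at such a script, not something taken from this thread. It assumes that each note reference is an `<a>` element carrying `class="footnote-ref"` whose only content is `<sup>N</sup>`, which matches the `a.footnote-ref sup` selector quoted in the PR description; the exact attribute layout, and the names `replace_note_marks` and `main`, are assumptions.

```python
import re
import sys

# Digits 0-9 mapped to their Unicode superscript forms.
_SUPERSCRIPTS = str.maketrans(
    "0123456789",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079",
)

# Assumed shape of a note reference: <a ... class="footnote-ref" ...><sup>N</sup></a>
_NOTE_REF = re.compile(
    r'(<a\b[^>]*\bclass="footnote-ref"[^>]*>)<sup>(\d+)</sup>(</a>)'
)


def replace_note_marks(html: str) -> str:
    """Swap <sup>N</sup> inside footnote-ref links for superscript digits."""

    def _swap(match: re.Match) -> str:
        digits = match.group(2).translate(_SUPERSCRIPTS)
        return match.group(1) + digits + match.group(3)

    return _NOTE_REF.sub(_swap, html)


def main() -> None:
    """Filter HTML from stdin to stdout."""
    sys.stdout.write(replace_note_marks(sys.stdin.read()))
```

To use it as a pipe, call `main()` at the bottom of the file and run something like `pandoc input.md -t html | python3 supnotes.py`, where `supnotes.py` is a hypothetical file name. Other `<sup>` elements in the document are left alone because the pattern only matches inside footnote-ref links.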
Ah, too bad. One option could be a custom writer that just calls the normal writer and then does a pattern substitution on the formatted footnote references. This would avoid the need for piping into an external script, so it might be just slightly nicer.