Loss of style for titles and/or subtitles in .docx output document #10282
Comments
Have you checked the changelog to see what relevant changes were made between 3.1.12 and 3.5? |
Yes, I have read the release notes and found some possible clues, but they don’t help me resolve the issue as a user. I have spent several hours researching and testing, but the documentation did not provide any insights to resolve my issue. My understanding was that Pandoc’s reference file could be used to customize the layout of a Word document as long as only the Pandoc-supported styles were modified or used. This approach had always worked fine in the past. I also generated a new reference file using Pandoc 3.5 and ran targeted tests with it. However, when I modify the Title and Subtitle layouts (such as color or font), Word now uses the Standard style instead. This should not be happening. I found the following relevant notes in the release history:
pandoc 3.2.1: “Clean up Abstract Title and Subtitle in default reference docx. Center Subtitle, remove color.”
pandoc 3.2: “Use current standard Word theme (#7280). This includes using the sans-serif font Aptos instead of the serif font Cambria, and default colors for headings. Remove duplicate DefaultParagraphFont in styles.xml.”
pandoc 3.1.12.2: Here’s one relevant note:
I suspect that my issue with Title and Subtitle formatting might be related to a change in how style names are detected in Pandoc. In earlier versions, Pandoc always processed the
It would be helpful if Pandoc could provide a way to standardize style name recognition independently of localization. This would prevent issues for users working with different language versions of Word. |
What |
lang: de
I had also tried it, but there was no improvement. Hm, my problem should actually be easy to recreate, right? |
Can you post files necessary to reproduce the issue? Your reference.docx and a sample markdown input, plus the command line you used? |
Note that in the 3.2 revisions, we added a "Title Char" style (standard for Word). |
There is also "Subtitle Char". These are character styles. |
Sorry for the delay, I've been very busy. Here is an example that reproduces the problem for version 3.5 of Pandoc.
$ pandoc --version
pandoc 3.5
$ pandoc -o reference-3_5.docx --print-default-data-file reference.docx
Title: I have set the font colour to white, the spacing from the top to 80 pt and coloured a background frame in blue. Subtitle: Font colour also white, spacing adjusted to 4 pt from the top and background colour in a lighter blue
---
title: "Resource Template"
subtitle: "Subtitle Here"
date: "\today"
author: "My Name"
toc: true
toc-depth: 2
lang: en
customer_logo: false
client_name: "Client Name"
security_label: "Confidential"
...
# Introduction
Provide a brief introduction here.
# Section 1: Overview
Include a general overview of the resource.
## Subsection 1.1: Details
Details about the resource, including any relevant information, such as objectives, target audience, or specifications.
# Section 2: Implementation
Instructions or steps for implementing the resource.
## Subsection 2.1: Steps
1. Step one details
2. Step two details
3. Step three details
# Section 3: Additional Information
Include any additional relevant information, like references or contact details.
# Appendix
Include any additional resources or appendices here.
The title and subtitle are displayed in the ‘Normal’ style. Likewise ‘author’ and ‘date’. The date is not displayed because \today is a LaTeX command and has no effect in Word output. If an actual date were entered, it would appear, but the formatting would be wrong there too. There is also a page break in the template, which is missing here as well. |
Thank you for pointing out the updates in the 3.2 revisions regarding the "Title Char" and "Subtitle Char" styles. However, I’m having trouble with these points because I haven’t been able to find any relevant information in the documentation. I’m not sure how to access or adjust these character styles within the reference doc. Could you provide some guidance on how I can use these features? |
OK, this is very strange. Your reference doc has a Title style, but it doesn't get applied. |
But when I try the same thing you did -- same method of creating a reference.docx -- it works fine. |
OK, the issue is this. Your reference.docx has a style with w:styleId="para6", and this style has the name "Title". When you create your reference.docx, go to the Styles menu, find the already existing Title style, and modify that one. I'm not sure what you did differently to create the style you had. |
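To see which styleId a reference.docx assigns to each paragraph style name, something like the following sketch may help. It relies only on the facts that a .docx is a zip archive and that the styles live in word/styles.xml; the sample XML and the function names are made up for illustration, and the sample mirrors the para6/Title pairing mentioned above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by styles.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_styles(styles_xml: bytes):
    """Return (styleId, name) pairs for every paragraph style in styles.xml."""
    pairs = []
    for style in ET.fromstring(styles_xml).iter(f"{{{W}}}style"):
        if style.get(f"{{{W}}}type") != "paragraph":
            continue  # skip character, table and numbering styles
        name_el = style.find(f"{{{W}}}name")
        name = name_el.get(f"{{{W}}}val") if name_el is not None else None
        pairs.append((style.get(f"{{{W}}}styleId"), name))
    return pairs


def paragraph_styles_in_docx(docx):
    """Same as paragraph_styles, but reads word/styles.xml from a docx path or file object."""
    with zipfile.ZipFile(docx) as z:
        return paragraph_styles(z.read("word/styles.xml"))


# Hypothetical sample: a paragraph style named "Title" whose styleId is "para6".
SAMPLE = b"""<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="paragraph" w:styleId="para6"><w:name w:val="Title"/></w:style>
  <w:style w:type="character" w:styleId="TitleChar"><w:name w:val="Title Char"/></w:style>
</w:styles>"""

# Build a tiny in-memory "docx" so the example runs without a real file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("word/styles.xml", SAMPLE)
buf.seek(0)

for style_id, name in paragraph_styles_in_docx(buf):
    print(f"{style_id}\t{name}")
```

With a real file, `paragraph_styles_in_docx("my-reference-3_5.docx")` would be the call to try; a Title style whose styleId is not "Title" is the thing to look for.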
I will test it tomorrow. I still have a computer with an older version of Pandoc. I haven't had the problem on this system so far. I will check if the versions of Word are the same, which they should be. My system and Word versions are German. As far as I can remember, the style sheets that were generated from Pandoc were always in English, which did not cause any problems. In Word they were also displayed with the names in English. Now it looks as if Pandoc or Word translates the name of the template (in my case into German). The fact that I use macOS may also play a role. There are therefore several factors that can play a role:
and certainly also the user (in this case me). However, I have tried to rule out errors by creating a new test file including a reference file from scratch. The style id can probably change depending on the language, which is probably why the names are used. If these are now translated (by whatever means) or adapted to the system language, this would provide an explanation. |
If I recall, we use styleId and not the display name, because that is the thing that is constant across differently localized versions of Word. |
pandoc 3.1.12.2 Here’s one relevant note: see above. |
OK, I got it reversed then. I knew it was one way or the other! |
Here are my tests and the results:
System on which everything works: macOS 14.6.1 (Sonoma), Intel Core i7, Microsoft Word for Mac Version 16.89.1 (German)
$ pandoc --version
$ pandoc -o reference-3_1_12.docx --print-default-data-file reference.docx
The style Title is displayed in Word as ‘Title’ and Subtitle as ‘Subtitle’.
Open reference-3_1_12.docx with Word
$ pandoc my-markdown.md -o my-word-3_1_12.docx --reference-doc=my-reference-3_1_12.docx
Result: The Word file opens. 1st warning: ‘This document contains fields that may refer to other files. Do you want to update the fields in this document?’ (ok) Since Word has not yet been able to set the fields and the table of contents to current, valid values, this prompt is understandable. However, you should always open such a Word file and save it again before sending it to other people, because otherwise they often report that the file is damaged. The formatting is correct, everything is as it should be.
System on which the problem occurs: macOS 15.0.1 (Sequoia), M3 Pro (ARM), Microsoft Word for Mac Version 16.89.1 (German)
Open reference-3_5.docx with Word
$ pandoc my-markdown.md -o my-word-3_5.docx --reference-doc=my-reference-3_5.docx
Result: The Word file opens with the same warnings as before. However, the formatting is not correct. The title and subtitle text is correct, but the formatting falls back to the default for title, subtitle, and date. The author is correct (and its style has the English name). For ‘Table of Contents’ and the chapter and section headings, the style names are displayed in German, but the formatting is correct.
I actually suspect the issue is with Pandoc. For once, I would like to exclude Word as the source of the issue. Theoretically, the difference in macOS versions could still be a cause. If this is the case, then there is hardly anything you can do, and you are at the mercy of the folks at Apple. The different CPU architecture should not play a role.
I have now installed Pandoc 3.1.12 on the system and repeated the tests. There are no problems. The formatting appears to be correct. So it is probably not due to the different macOS versions. One last test: Since I had installed Pandoc 3.5 with brew, I did another installation with the version from the site https://github.com/jgm/pandoc/releases/tag/3.5. I half expected that this would solve the problem. Unfortunately not. Something has been changed somewhere in Pandoc that is causing the issues. That's the end of my ideas. But I'll save myself the trouble of trying to find out from which version the problem occurs. |
I can't see how the issue would be the OS version. However, I use pandoc 3.5 on ARM macOS (previous version), and I don't have any difficulties customizing the style. There are two factors that may be different in our cases:
One thing that is clearly different is the styleId of the style named Title in your reference docx. This may be relevant, though as you note, we claim to be looking up styles by name. When I have a chance I can look into this further. |
I don't think it's related to the OS version either; I just wanted to mention all the possibilities that came to mind. The problem will be the names of the styles. Word translates these, for example into German. If I make a change, the style is saved and the translated name is used. Then pandoc can no longer function properly. This is normally why you use IDs and not names. I don't know if Microsoft sees it differently. Anyway, I will try the following in the next few days: I will create a template with the current version and then use a text editor to search for the translated names and replace them with the English ones. This should be a workaround. Since templates are not constantly customized, that should be okay for me. I'll let you know when I've tried it. For a quick test, you can also rename Title in one of your reference files to the German translation, Titel. If the formatting is then lost and the style falls back to Standard, then that is the problem. |
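A search-and-replace workaround along those lines could be scripted instead of done by hand. The sketch below is only an illustration: it renames one styleId inside the text of styles.xml and updates the elements that refer to it. The function name, the list of referring elements, and the sample XML are assumptions, and whether Word and pandoc accept the edited file has not been verified here.

```python
import re

# Elements whose w:val attribute points at another style by its styleId (assumed list).
REF_TAGS = ("basedOn", "next", "link", "pStyle", "rStyle")


def rename_style_id(styles_xml: str, old: str, new: str) -> str:
    """Rename one styleId and the references to it by plain text substitution."""
    if f'w:styleId="{new}"' in styles_xml:
        raise ValueError(f"styleId {new!r} already exists")
    # The style definition itself.
    out = styles_xml.replace(f'w:styleId="{old}"', f'w:styleId="{new}"')
    # References such as <w:basedOn w:val="old"/>; <w:name> is deliberately left alone.
    pattern = re.compile(
        r'(<w:(?:%s)\s+w:val=")%s(")' % ("|".join(REF_TAGS), re.escape(old))
    )
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), out)


# Hypothetical fragment: "Title" lives under styleId para6, and another style is based on it.
SAMPLE = (
    '<w:style w:type="paragraph" w:styleId="para6"><w:name w:val="Title"/></w:style>'
    '<w:style w:type="paragraph" w:styleId="para7"><w:name w:val="Subtitle"/>'
    '<w:basedOn w:val="para6"/></w:style>'
)

print(rename_style_id(SAMPLE, "para6", "Title"))
```

Text substitution is used rather than an XML parser so that namespace prefixes and attribute order in styles.xml stay untouched. The edited word/styles.xml would still have to be put back into the .docx archive.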
Note that in my-reference-docx-3_5.docx, styles.xml has
and the name specified here is "Title", not "Titel". So any localization of that name must be happening somewhere outside the stylesheet. Since according to the commit comment you mentioned above, we are looking up styles by name and not styleId, we should be finding this style. |
PS. I tried manually changing the styleId from |
It looks like the linked commit may have been focused on just the table caption; maybe we didn't make a general change to looking up styles by name instead of styleId. |
It's quite counterintuitive that Word works this way -- it's the name, not the styleId, that stays constant across localized versions -- but such is MS. |
I have narrowed down the problem to a specific Pandoc version. When using Pandoc 3.2, titles and subtitles in the Word file are still displayed correctly according to my customised template. The changes made are properly applied. However, a deviation occurs as of version 3.2.1: titles and subtitles only appear according to Word's default settings, regardless of the Pandoc or custom templates. My adjustments to the style sheet are ignored, but the content remains correct. Titles and subtitles are displayed with the correct content, but without the intended formatting. I tested various Pandoc versions to investigate. The problem first appeared in version 3.2.1, while in version 3.2 the templates worked as expected. The tests were carried out with the binary versions of Pandoc, which I downloaded directly from the GitHub page (https://github.com/jgm/pandoc/releases). I hope this helps to narrow down the error and find a solution. I will stick with the older version for the time being until the problem is fixed. |
In the 3.2.1 changelog for docx writer we have two items that might be relevant: |
The OpenXML template contains:
+$if(title)$
+ <w:p>
+ <w:pPr>
+ <w:pStyle w:val="Title" />
+ </w:pPr>
+ $title$
+ </w:p>
+$endif$
+$if(subtitle)$
+ <w:p>
+ <w:pPr>
+ <w:pStyle w:val="Subtitle" />
+ </w:pPr>
+ $subtitle$
+ </w:p>
+$endif$ |
[EDITED] We produce a docx with
and the way Word deals with this is to look up the style with
<w:style w:type="paragraph" w:styleId="Title">
<w:name w:val="Title"/>
<w:qFormat/>
<w:basedOn w:val="para0"/>
<w:next w:val="para1"/>
<w:pPr>
<w:spacing w:before="1600" w:after="80"/>
<w:contextualSpacing/>
<w:jc w:val="center"/>
<w:pBdr>
<w:top w:val="nil" w:sz="0" w:space="3" w:color="000000" tmln="20, 20, 20, 0, 60"/>
etc.
So far so good, although it's odd that
Then, when you use this reference docx to create a new docx (my-word_3.5.docx), styles.xml contains:
<w:style w:styleId="para4" w:type="paragraph">
<w:name w:val="Title" />
<w:qFormat />
<w:basedOn w:val="para0" />
<w:next w:val="para1" />
<w:pPr>
<w:spacing w:after="80" w:before="1600" />
<w:contextualSpacing />
<w:jc w:val="center" />
<w:pBdr>
<w:top tmln="20, 20, 20, 0, 60" w:color="000000" w:space="3" w:sz="0" w:val="nil" />
etc.
I just don't get this. When I use pandoc to do the same thing you described, using your own my-reference_3.5.docx, I don't get this result. And although your Word may be localized, your pandoc is not. So that is not the issue. I ought to be able to use pandoc on the same inputs and get the same result as you. This has nothing to do with Word. So, I'm wondering whether we can repeat the entire process carefully. Take your file linked above, my-reference_3.5.docx, and do this exact command:
And then upload output.docx. |
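One way to check an output .docx for the mismatch described in this thread (a paragraph asking for pStyle "Title" while the style named Title sits under a different styleId) is to compare the pStyle values used in word/document.xml against the styleIds defined in word/styles.xml. This is a rough sketch with made-up sample XML and a hypothetical function name, not something taken from pandoc.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace shared by document.xml and styles.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag: str) -> str:
    """Qualify a tag or attribute name with the w: namespace."""
    return f"{{{W}}}{tag}"


def dangling_pstyles(document_xml: bytes, styles_xml: bytes):
    """Return the sorted pStyle values used in the body that match no defined styleId."""
    defined = {s.get(q("styleId")) for s in ET.fromstring(styles_xml).iter(q("style"))}
    used = {p.get(q("val")) for p in ET.fromstring(document_xml).iter(q("pStyle"))}
    return sorted(used - defined)


# Hypothetical samples: the body asks for "Title", but that name is filed under "para4".
DOC = b"""<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Title"/></w:pPr></w:p>
    <w:p><w:pPr><w:pStyle w:val="para1"/></w:pPr></w:p>
  </w:body>
</w:document>"""

STYLES = b"""<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="paragraph" w:styleId="para4"><w:name w:val="Title"/></w:style>
  <w:style w:type="paragraph" w:styleId="para1"><w:name w:val="Example Style"/></w:style>
</w:styles>"""

print(dangling_pstyles(DOC, STYLES))
```

Any value this reports is a paragraph style reference that has no matching styleId, which would be consistent with the title falling back to the default style.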
I am using a customised ‘reference.docx’, which I created with the command ‘pandoc -o custom-reference.docx --print-default-data-file reference.docx’ and added individual styles for titles and subtitles. This file worked perfectly in Pandoc version 3.1.12 and formatted the title and subtitle as desired (e.g. frame with background colour and custom text colour).
However, after upgrading to Pandoc 3.5, the problem arises that titles and subtitles lose their formatting and the default formatting is used instead. The content is output correctly, but without the assigned formatting. This suggests that Pandoc may no longer recognise the custom styles or that the assignment of metadata to styles has been changed.
I am using a German version of Microsoft Word, so the style names in my ‘reference.docx’ file may correspond to the localised German names. Since I updated directly from Pandoc 3.1.12 to 3.5, I can't say exactly from which version the problem occurred. However, downgrading to 3.1.12 fixes the problem, so there seems to be a change in the newer Pandoc versions that affects the style assignment for metadata.
Regards
Stefan