DOCX: MS Word incorrectly converts font size "12pt" in BIRT-produced DOCX documents #1905

Closed
speckyspooky opened this issue Sep 12, 2024 · 1 comment · Fixed by #1906
@speckyspooky

Following the discussion in #1868, I investigated font-size styling to compare
the resulting font sizes across HTML -> (PDF ->) DOCX in different situations.

The styling from HTML to PDF is fine, but the font size "12pt" is converted incorrectly in DOCX documents.

The problem is that the HTML content of the BIRT elements "text", "dynamic text", and "data" is not parsed into fully supported DOCX elements. Instead of a full parser, the archive/intermediate format MHT(ML) is used. These files are embedded into the DOCX like external documents, and Word then converts them into DOCX elements.

Example of the internal DOCX structure with MHT files created by BIRT:
(screenshot)
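
To see the embedded MHT files in a given report, the DOCX package can be opened as a ZIP archive and its parts listed. The following is a minimal sketch in Python, not part of BIRT. The function name `list_mht_parts` and the part names in the demo are hypothetical; the actual names BIRT writes may differ.

```python
import io
import zipfile


def list_mht_parts(docx):
    """Return the names of all MHT/MHTML parts inside a DOCX package.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.lower().endswith((".mht", ".mhtml"))
        )


if __name__ == "__main__":
    # Build a tiny stand-in package in memory; the part names are made up
    # for illustration and are not taken from a real BIRT output.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<w:document/>")
        package.writestr("word/example1.mht", "MIME-Version: 1.0")
    print(list_mht_parts(buffer))
```

For a real report, pass the path of the generated `.docx` file instead of the in-memory buffer.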

There also appears to be a consistent error on the MS Word side: the font size "12pt" is always converted to "10pt".
This affects the HTML tags "div" and "span", and the HTML 4.0 element "font" is likewise not converted correctly when the resulting size should be "12pt".
(BIRT produces the DOCX version WordprocessingML, version 2016.)

I tried replacing the "div" tag with the "p" tag (paragraph), but Word does not convert these two elements the same way,
so we would get styling side effects. In special cases, the style of a p tag overrides the style of the following elements.

Normally the best approach would be to implement a DOCX parser to avoid the MHT files, but that would require a large effort.
(Note: DOC files from BIRT 4.6 were created correctly. For that file format a kind of HTML parser is implemented, but it cannot be used for DOCX.)

The smallest workaround is to use a font size other than "12pt" to get a better result.
So I will provide a PR that changes the font size from "12pt" to "12.5pt".
I know this isn't the same font size, which is why I call it a workaround, but it is better than the current situation.
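
The idea of the workaround can be sketched as a rewrite of inline CSS before the HTML is handed to Word. This is an illustration in Python only, not the code from the PR (BIRT itself is written in Java). The function name `bump_12pt` and the regular expression are assumptions.

```python
import re

# Matches a "font-size" declaration whose value is exactly 12pt (or 12.0pt).
# Values such as 12.5pt, 112pt, or 10pt are left untouched.
_FONT_SIZE_12PT = re.compile(r"(font-size\s*:\s*)12(?:\.0+)?pt\b", re.IGNORECASE)


def bump_12pt(css, replacement="12.5pt"):
    """Replace a 12pt font-size value in a CSS string with `replacement`."""
    return _FONT_SIZE_12PT.sub(lambda match: match.group(1) + replacement, css)


if __name__ == "__main__":
    print(bump_12pt('<div style="font-size: 12pt; color: red">Text</div>'))
```

The sketch only covers CSS `font-size` declarations, as used on "div" and "span". It does not handle the `size` attribute of the HTML 4.0 "font" element.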

Note: I could not reproduce the behavior described in #1868 on my side.

Example 01 of the wrong conversion

(screenshot)

Example 02 of the wrong conversion

(screenshot)

@speckyspooky speckyspooky added the BugFix Change to correct issues label Sep 12, 2024
@speckyspooky speckyspooky added this to the 4.17 milestone Sep 12, 2024
speckyspooky added a commit to speckyspooky/birt that referenced this issue Sep 12, 2024
@speckyspooky speckyspooky linked a pull request Sep 12, 2024 that will close this issue
@speckyspooky

The new solution has been tested.
The following screenshots show the resulting font size in DOCX documents.

Example 01

(screenshot)

Example 02

(screenshot)

The solution is provided in PR #1906.
