UTF-16 Byte Order Mark (BOM) 0xFEFF reported as thorn ydieresis (þÿ) #1496

Open
myang-apryse opened this issue Dec 18, 2024 · 1 comment

@myang-apryse

myang-apryse commented Dec 18, 2024

Hello,

I'm seeing an issue where vera treats a text string consisting only of the UTF-16BE byte order mark (0xFEFF), which should decode to an empty string, as the non-empty string þÿ.

I ran into this during PDF/A-1b validation of the following file, whose Info /Title entry is exactly that string:
ANFPSE-2427-PDFA-1b-1.pdf

Validation reports this error:

The value of Title entry from the document Info dictionary and its matching XMP property "dc:title['x-default']" are not equivalent
(Info /Title = þÿ, XMP dc:title['x-default'] = "")

I believe this is an error on vera's part, as the PDF specification (7.9.2.2, Text string type) explicitly addresses this case in a note:

NOTE 3 This mechanism precludes beginning a string using PDFDocEncoding with the two characters thorn ydieresis,
which is unlikely to be a meaningful beginning of a word or phrase.
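
To illustrate the behavior I would expect, here is a minimal Python sketch of text string decoding. It is not vera's actual code: the function name `decode_text_string` is hypothetical, and Latin-1 is used only as a stand-in for PDFDocEncoding (the two agree on bytes 0xFE and 0xFF, but not on every code point).

```python
import codecs

# The UTF-16BE byte order mark: the two bytes 0xFE 0xFF.
UTF16BE_BOM = codecs.BOM_UTF16_BE


def decode_text_string(raw: bytes) -> str:
    """Decode the raw bytes of a PDF text string (hypothetical helper).

    A leading 0xFE 0xFF marks the string as UTF-16BE; the marker itself
    is not part of the text. Anything else is treated as PDFDocEncoding,
    approximated here with Latin-1 (an assumption made for this sketch).
    """
    if raw.startswith(UTF16BE_BOM):
        return raw[len(UTF16BE_BOM):].decode("utf-16-be")
    return raw.decode("latin-1")


title = decode_text_string(b"\xfe\xff")      # BOM only: empty title
naive = b"\xfe\xff".decode("latin-1")        # BOM misread as two characters
```

With this logic `title` is the empty string, which matches the empty XMP dc:title, while `naive` is the two-character string þÿ that appears in the error message.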

Do you think my interpretation is correct?

Thanks,
Michael

@MaximPlusov MaximPlusov self-assigned this Dec 19, 2024
@MaximPlusov MaximPlusov added this to the 1.28 milestone Dec 19, 2024
@MaximPlusov
Contributor

Thanks for reporting this issue. Fixed in the latest dev build 1.27.100
