Should BOM be allowed as first character? How to handle it? #8

Open
lueck opened this issue Sep 7, 2022 · 1 comment

Comments


lueck commented Sep 7, 2022

According to the production rules for the prolog in the XML 1.0 spec, nothing, not even whitespace, is allowed in front of the XML declaration. Appendix F (Autodetection of Character Encodings), which is non-normative, seems to allow a byte order mark (BOM) as the first character.

If a BOM is present: Is it part of the document, and should it be counted as a character when we compute the character offsets? Or is it only a bit of information for the parser?

@lueck lueck added the question label Sep 7, 2022
@lueck lueck added this to the implement XML 1.0 spec milestone Sep 7, 2022
@lueck lueck changed the title Should BOM as first character be allowed? How to handle it? Should BOM be allowed as first character? How to handle it? Sep 7, 2022

lueck commented Dec 15, 2022

How to handle it? According to RFC 5147, sec. 2.1.2, a leading BOM should not be counted when computing character offsets.
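To make the consequence concrete, here is a minimal Python sketch of offset counting that ignores a leading BOM. It is not this project's implementation; the names `strip_leading_bom` and `char_offset` are hypothetical, and it assumes the input has already been decoded to a string in which the BOM survives as U+FEFF.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark after decoding


def strip_leading_bom(text):
    """Return (text without a leading BOM, whether a BOM was present)."""
    if text.startswith(BOM):
        return text[len(BOM):], True
    return text, False


def char_offset(text, needle):
    """Character offset of the first occurrence of needle, not counting a leading BOM.

    Hypothetical helper: raises ValueError if needle does not occur.
    """
    body, _had_bom = strip_leading_bom(text)
    return body.index(needle)


with_bom = "\ufeff<?xml version='1.0'?><a/>"
without_bom = "<?xml version='1.0'?><a/>"

# Both documents report the same offset for the root element.
print(char_offset(with_bom, "<a/>"), char_offset(without_bom, "<a/>"))
```

Under this reading the BOM is information for the parser only, so offsets into a document are the same whether or not the file was saved with a BOM.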
