Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement std.parseXmlJsonml #704

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

rohitjangid
Copy link
Member

@rohitjangid rohitjangid commented May 31, 2023

Implement std.parseXmlJsonml as standard function

cpp-jsonnet PR: google/jsonnet#1092

@coveralls
Copy link

coveralls commented May 31, 2023

Coverage Status

coverage: 68.722% (+0.2%) from 68.568% when pulling ae9a67d on rohitjangid:feat/jsonml into fed90cd on google:master.

jsonml.go Outdated
b.currDepth++
case xml.CharData:
t := token.(xml.CharData)
s := strings.TrimSpace(string(t))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for trimming it? I think whitespace is actually meaningful in XML because of things like <b>foo</b>bar vs <b>foo</b> bar. If we're only supporting the (admittedly common) case of the XML file being whitespace agnostic (i.e. whoever consumes the XML doesn't care about the whitespace) then we can document it as not supporting that. However I think just not trimming the strings here will fix it, will it not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason behind trimming whitespaces is to simplify jsonml. Currently due to formatted xml, jsonml becomes bit complex to work with due to empty content. We want to ignore empty lines but I agree current implementation would also trim spaces around the content. I could take of it.

I am thinking of allowing users to control this behaviour. We can provide an argument trimWhitespaces = true to the function which would control if user wants to remove formatting whitespaces.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this which looks interesting. http://xml.coverpages.org/rfc-wshp19990416.html I wonder if the current behaviour of the manifester is wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I understand from these link is it is safe to remove boundary whitespaces as a default behaviour. We would provide an argument preserveWhitespaces = false to preserve them if anyone wants to do so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be some diversity in what other parsers are doing so I think anything we do is fine as long as it's documented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@rohitjangid
Copy link
Member Author

Hi @sparkprime
Can we merge this if there are no other concerns?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants