Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-123963: Expose GetCurrentByteCount from expat #123964

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

DelusionalLogic
Copy link

@DelusionalLogic DelusionalLogic commented Sep 11, 2024

When you're trying to preserve the format for the XML, it can be handy to know the length of the input that generated the event. Expat already has the feature (and seems to have had it for a long while), so why not expose it?

I've added a new condition to the test that was already added for the other GetCurrent* methods.


📚 Documentation preview 📚: https://cpython-previews--123964.org.readthedocs.build/

Copy link

cpython-cla-bot bot commented Sep 11, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app
Copy link

bedevere-app bot commented Sep 11, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@DelusionalLogic DelusionalLogic marked this pull request as ready for review September 11, 2024 16:41
Copy link
Member

@Eclips4 Eclips4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Please, add a few notes about this feature in Doc/whatsnew/3.14.rst.
Additionally, as far as I can see, there should be more changes, take a look at the XML_GetCurrentLineNumber. I think we should something similar for XML_GetCurrentByteCount

Doc/library/pyexpat.rst Outdated Show resolved Hide resolved
Doc/library/pyexpat.rst Outdated Show resolved Hide resolved
Doc/library/pyexpat.rst Outdated Show resolved Hide resolved
@@ -506,6 +506,7 @@ def EndElementHandler(self, name):
def check_pos(self, event):
pos = (event,
self.parser.CurrentByteIndex,
self.parser.CurrentByteCount,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a dedicated test as well for this one? The libexpact docs say:

Returns 0 if the event is inside a reference to an internal entity and for the end-tag event for empty element tags (the later can be used to distinguish empty-element tags from empty elements using separate start and end tags

Copy link
Author

@DelusionalLogic DelusionalLogic Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case already includes an example of an empty-element as defined in the XML spec1. What is missing is an example of an empty element in the long form (with a separate open and close tag), but that seems redundant.

I'll gladly add it if you'd like me to explicitly include it.

@DelusionalLogic
Copy link
Author

DelusionalLogic commented Sep 11, 2024

Please, add a few notes about this feature in Doc/whatsnew/3.14.rst.

Is the consensus that this change it large enough to warrant an item in the what's new? I don't mean to be a busybody, but the verbiage of the contribution guide says:

If a change is particularly interesting for end users (for example, new features, significant improvements, or backwards-incompatible changes)

Does this qualify as that? I'll gladly copy it over, but I want to make sure I'm not creating more work than I'm contributing here ;)

@DelusionalLogic
Copy link
Author

I went ahead and added the changes requested in the review. The test comment is still open, since I'm awaiting some feedback on that.

There also seems to be some problem with the Docs part of the build, something about the reference target of :attr:xmlparser.CurrentByteCount not being found. I don't think I understand enough about the documentation subsystem to really do anything intelligent here :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants