gh-130453: pygettext: Extend support for specifying custom keywords #130463

Merged (10 commits) on Feb 25, 2025

tomasr8 commented Feb 22, 2025

This addresses the first point in #130453

It is now possible to use the full keyword spec syntax (except for t, which will be added later) to specify keywords:

./python Tools/i18n/pygettext.py --keyword=foo:1
./python Tools/i18n/pygettext.py --keyword=foo:1,2
./python Tools/i18n/pygettext.py --keyword=foo:1c,2
./python Tools/i18n/pygettext.py --keyword=foo:1c,2,3

I tried to match the behaviour of xgettext and babel, but neither seems to do much validation of keyword specs.
xgettext, for instance, rejects foo:1c,2c (context specified twice) and foo:1,1c (msgid and msgctxt share an index), but (weirdly) accepts foo:1,1 (same index for msgid and msgid_plural), and it outright crashes with a double free on foo:1,1,2c.

This PR properly validates the keyword specs in order to be consistent and provide helpful error messages to the user.
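
As a rough illustration of the parsing and validation described above, here is a minimal sketch. It is not the code from this PR: the function name `parse_keyword_spec`, most of the error wording, the treatment of a bare name as `name:1`, and the choice to return 0-based argument indices are all assumptions made for the example.

```python
def parse_keyword_spec(spec):
    """Parse 'name' or 'name:ARG[,ARG[,ARG]]' into (name, {index: role}).

    Hypothetical sketch, not the pygettext implementation.
    Positions in the spec are 1-based; returned indices are 0-based.
    """
    name, sep, args = spec.partition(':')
    if not name:
        raise ValueError(f'Invalid keyword spec {spec!r}: '
                         'missing function name')
    if not sep:
        # Assumption: a bare name means the first argument is the msgid.
        return name, {0: 'msgid'}

    result = {}  # role -> 0-based argument index
    for arg in args.split(','):
        arg = arg.strip()
        is_context = arg.endswith('c')
        digits = arg[:-1] if is_context else arg
        if not (digits.isascii() and digits.isdigit()) or int(digits) < 1:
            raise ValueError(f'Invalid keyword spec {spec!r}: '
                             f'bad argument position {arg!r}')
        index = int(digits) - 1

        # A trailing 'c' marks the context; otherwise positions are
        # assigned to msgid first, then msgid_plural.
        if is_context:
            if 'msgctxt' in result:
                raise ValueError(f'Invalid keyword spec {spec!r}: '
                                 'msgctxt specified more than once')
            role = 'msgctxt'
        elif 'msgid' not in result:
            role = 'msgid'
        elif 'msgid_plural' not in result:
            role = 'msgid_plural'
        else:
            raise ValueError(f'Invalid keyword spec {spec!r}: '
                             'too many argument positions')

        if index in result.values():
            raise ValueError(f'Invalid keyword spec {spec!r}: '
                             f'position {index + 1} used more than once')
        result[role] = index

    if 'msgid' not in result:
        raise ValueError(f'Invalid keyword spec {spec!r}: '
                         'msgctxt cannot appear without msgid')

    return name, {index: role for role, index in result.items()}


name, roles = parse_keyword_spec('foo:1c,2,3')
print(name, roles)
```

Under these assumptions, every spec that the description lists as problematic for xgettext (foo:1c,2c, foo:1,1c, foo:1,1 and foo:1,1,2c) raises a ValueError instead of being silently accepted.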

Feedback welcome!

serhiy-storchaka (Member) left a comment

I do not think that it was necessary to go so far with detecting errors and generating error reports. Garbage in -- garbage out. The parsing code could be 2 or 3 times smaller without this. But if you already implemented this, it is fine.

LGTM.

        raise ValueError(f'Invalid keyword spec {spec!r}: '
                         'msgctxt cannot appear without msgid')

    return name, {v: k for k, v in result.items()}

Member

Would it be simpler to build result in that form from the beginning?
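
To make the two forms in this question concrete, here is a small sketch. The variable names `by_role` and `by_index` and the sample values are hypothetical; only the dict comprehension mirrors the return statement quoted above.

```python
# Form built up while parsing: role -> argument index (sample values).
by_role = {'msgctxt': 0, 'msgid': 1, 'msgid_plural': 2}

# Form returned to the caller: argument index -> role.
# Same comprehension shape as `{v: k for k, v in result.items()}`.
by_index = {index: role for role, index in by_role.items()}

print(by_index)
```

The inversion only preserves every entry when no two roles share an index, so it relies on that being checked beforehand.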

Member Author

I did that in d861c84. Let me know if you like it better that way.

Member

It was just a question. I am fine with both variants.

Member Author

I just wanted to let you see the difference :) I don't have a strong preference either, let's stick with the current version, then?

Member Author

Actually, I tried implementing some followup work on top of this PR (support for the t specifier, multiple keywords with the same funcname) and it's better to use the original representation because the diff in the followup PRs will be smaller.

Member

So I did the right thing by letting the PR lie down for two days. 😄

Member Author

Yes, good call 🙂 And thanks for your super thorough reviews! It's really appreciated

try:
    options.keywords = dict(parse_spec(spec) for spec in options.keywords)
except ValueError as e:
    raise SystemExit(e)

Member

Other errors cause print() + sys.exit(1).

Member Author

Indeed, though I believe raise SystemExit(s) is functionally equivalent to print(..., file=sys.stderr) + sys.exit(1). Since it's shorter and the intent is clearer, I thought I'd start using that instead.

Though if you prefer to be consistent, I can change it to print+sys.exit?
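
The equivalence claimed here can be checked with a small experiment. This is only a sketch: the helper `run_snippet` and the message text are made up for the example, and it runs each variant in a fresh interpreter to compare exit status and stderr output.

```python
import subprocess
import sys


def run_snippet(code):
    """Run `code` in a fresh interpreter; return (exit status, stderr)."""
    proc = subprocess.run([sys.executable, '-c', code],
                          capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


# Variant 1: raise SystemExit with a message.
via_exception = run_snippet("raise SystemExit('bad keyword spec')")

# Variant 2: print to stderr, then exit with status 1.
via_print = run_snippet(
    "import sys\n"
    "print('bad keyword spec', file=sys.stderr)\n"
    "sys.exit(1)\n"
)

print(via_exception, via_print)
```

When SystemExit carries an object that is neither None nor an integer, the interpreter prints it to stderr and exits with status 1, which is why both variants behave the same from the caller's point of view.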

Member

Yes, it is just for consistency (also, sys.exit() allows setting different return codes, but this is not used here).

Member Author

Updated in 18d29cb to use print+sys.exit

tomasr8 commented Feb 23, 2025

> I do not think that it was necessary to go so far with detecting errors and generating error reports. Garbage in -- garbage out. The parsing code could be 2 or 3 times smaller without this. But if you already implemented this, it is fine.

Honestly, if you prefer it without the detailed error messages, I am fine with removing them. Just let me know!

My thinking for adding them was that most people using this will not be that familiar with the syntax and for them, it's better to show a descriptive error rather than fail silently, but as I said, if you prefer to have simpler code, that's also ok!

serhiy-storchaka merged commit 44213bc into python:main on Feb 25, 2025 (39 checks passed)
tomasr8 deleted the pygettext-keywordspec branch on February 25, 2025, 10:38