Support underscores or other separators in numeric literals. #1155

Open
jeschkies opened this issue Apr 22, 2024 · 5 comments

Comments

@jeschkies

jeschkies commented Apr 22, 2024

It would be nice to have support for underscores in numeric literals to be able to write 500_000_000 instead of 500000000. This is a common feature for digit grouping in languages. See e.g. PEP-0515.

@seizethedave

This would be valuable. Tons of big numeric constants embedded in jsonnet out there.

This would be a deviation from JSON, but ECMAScript has digit separators. Would the mods look favorably upon this addition?

@sparkprime
Contributor

SGTM but there's quite a diversity in approaches across languages. Jsonnet typically tries to be compatible with Python or ECMAScript so what is the difference between PEP-0515 and the ECMAScript approach?

@jeschkies
Author

> what is the difference between PEP-0515 and the ECMAScript approach?

ECMAScript (since ES2021) and Python (since 3.6) both allow _ as a separator.

@seizethedave

seizethedave commented Jun 18, 2024

ECMA proposal, PEP515.

Differences between the two seem to be:

  • Python plumbs the support into the standard library, allowing you to do things like int('123_456') at runtime. ECMAScript doesn't.
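For reference, this is roughly what the Python runtime behavior looks like. It is only a small illustration of PEP 515's standard library support, not anything Jsonnet does today; the variable names are made up for the example.

```python
# Python accepts PEP 515 digit separators when parsing strings at runtime,
# not only in source-code literals.
as_int = int("123_456")
as_float = float("1_000.123_456")

# The separators are purely cosmetic: the parsed values match the plain forms.
print(as_int == 123456)
print(as_float == 1000.123456)
```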

Actually, I'll have to take a closer look later, but among the subset of ECMAScript things that Jsonnet supports (e.g., no octals), that's the only difference I've noticed so far.

The big thing they both have in common is that each underscore must appear between two numerals (0-9), and nothing else can be touching the underscore: not an e, not a ., not another _.
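A quick way to poke at the "underscore only between two digits" rule is Python's own float(), which applies it when parsing strings. This is a sketch of the Python side only; the helper name parses and the sample strings are invented for illustration.

```python
def parses(text):
    """Return True if Python's float() accepts the string, else False."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Every underscore sits between two digits.
accepted = ["1_0", "1_0.0_1", "1e1_0"]

# An underscore touches '.', 'e', another '_', or an end of the literal.
rejected = ["1_.0", "1._0", "1_e3", "1e_3", "1__0", "1_", "_1"]

print(all(parses(s) for s in accepted))
print(not any(parses(s) for s in rejected))
```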

@seizethedave

Since that's the only difference I can find, what if we just followed ECMAScript? To put it into words:

  • Jsonnet numeric literals should support digit separators.
  • We'll follow suit with ECMAScript and Python and use the _ character.
  • Separators may be used in whole numbers.
    • Example: 1_000, 123_456_789
  • Separators may be used in floating point numbers, both before and after the decimal point.
    • Example: 1_000.123_456, 3.141_592_653
  • Separators may be used in numbers with an exponent, both before and after the e.
    • Example: 1.23e4_567, 1_000e1_000
  • The implementation won't place any restriction on the size, number, or regularity of groups defined by digit separators. (So 1_1, 1_1_1_1_1_1, and 1_11_111_1111 are all valid literals.)
  • A _ may appear between any two numeric digits.
  • A _ may not appear next to any of the other non-digit characters encountered in numeric literals. (Including another _.)
    • Invalid: 1_000_.0, 1._123, 7_e3, 7e_3, 7e-_3, 7e+_3, 1__000
  • A _ may not appear at the beginning or end of a numeric literal.
    • Invalid: 123_, _123. (The latter would be lexed as a variable name.)
  • The addition of digit separators in Jsonnet will be backwards compatible with Jsonnet's existing numeric literals.
  • Digit separators will be implemented in the lexer. The lexer will simply discard separators as they're encountered, such that 1_000 and 1000 will have an identical representation in the parse tree and other Jsonnet internals.
  • Digit separators in Jsonnet will not be implemented in standard library functions such as std.parseInt.
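To make the rules above concrete, here is a minimal sketch of how a lexer could validate a numeric token and discard its separators. It is written in Python for brevity and is not the drafted PR. The names _DIGITS, _NUMBER and lex_number are hypothetical, and the grammar assumes a JSON-like number shape (integer part, optional fraction, optional exponent); a real lexer would scan a prefix of its input rather than match a whole string.

```python
import re

# A run of digits in which a single '_' may sit between any two digits.
_DIGITS = r"[0-9](?:_?[0-9])*"

# Assumed JSON-like number shape: integer part, optional fraction,
# optional exponent. Separators are only ever allowed inside digit runs.
_NUMBER = re.compile(
    rf"(?:0|[1-9](?:_?[0-9])*)(?:\.{_DIGITS})?(?:[eE][+-]?{_DIGITS})?"
)


def lex_number(text):
    """Return the numeric value of a literal, or None if it is malformed."""
    if _NUMBER.fullmatch(text) is None:
        return None
    # Discard separators so 1_000 and 1000 end up with the same value.
    return float(text.replace("_", ""))


valid = ["1_000", "123_456_789", "1_000.123_456", "1_1_1_1_1_1", "1_11_111_1111"]
invalid = ["1_000_.0", "1._123", "7_e3", "7e_3", "7e-_3", "7e+_3",
           "1__000", "123_", "_123"]

print(all(lex_number(s) is not None for s in valid))
print(all(lex_number(s) is None for s in invalid))
print(lex_number("1_000") == lex_number("1000"))
```

Putting the separator inside the digit-run pattern means the "not next to ., e, +, -, or another _" rules fall out of the grammar, so no separate checks are needed.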

Happy to put this in a doc or something more amenable to commenting.
I'm also keen to submit a PR. I've already drafted one in a fork.

David
