CLDR-16836 kbd: add EBNF to spec for transforms #4261
srl295 merged 7 commits into unicode-org:main

121aab0 to afa4af5

Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot

fyi @stasm and @aphillips

a461f03 to 308448c

Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot

- add keyboard abnf and sample files and automated tests. Temporarily skip non-BMP chars; see hildjj/node-abnf#25, which is being fixed.

308448c to d38f212

Hooray! The files in the branch are the same across the force-push. 😃
~ Your Friendly Jira-GitHub PR Checker Bot

macchiati left a comment

Looks good!
- The EBNF links should point to https://unicode.org/reports/tr35/#ebnf. (Since the LDML EBNF is a superset of the vanilla EBNF, that should work.)
- The validity / well-formedness constraints (e.g., no more than 9 capture groups) should use the W3C syntax.
Thanks!
I will add this: The following is the [LDML EBNF](./tr35.md#ebnf) format for the grammar:
The limit of 9 capture groups counts inner (nested) capture groups as well. So, for example, this is valid:
/(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)(?:And possibly, (ninth))?/
but this is invalid:
/(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)(?:And possibly, (ninth))?(?:But not, (tenth)!)?/
Because the groups can be nested, the limit may be challenging to express in the grammar; it's more like a resource (slot) limit.
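
As a rough illustration of the slot-limit reading, the sketch below counts capture groups for the two patterns quoted in this comment. It uses Python's `re` module purely as a stand-in: the keyboard transform syntax is defined by the spec's own grammar, not by Python, and `MAX_CAPTURE_GROUPS`, `count_capture_groups`, and `within_capture_limit` are hypothetical names, not part of any CLDR tooling.

```python
import re

# Assumed limit, taken from the "no more than 9 capture groups" constraint.
MAX_CAPTURE_GROUPS = 9


def count_capture_groups(pattern: str) -> int:
    """Return the number of capturing groups, nested ones included."""
    # Pattern.groups counts capturing groups only; (?:...) is not counted.
    return re.compile(pattern).groups


def within_capture_limit(pattern: str) -> bool:
    """True if the pattern stays within the assumed slot limit."""
    return count_capture_groups(pattern) <= MAX_CAPTURE_GROUPS


# The two example patterns from the comment, without the surrounding slashes.
valid = (r"(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)"
         r"(?:And possibly, (ninth))?")
invalid = valid + r"(?:But not, (tenth)!)?"

print(count_capture_groups(valid), within_capture_limit(valid))
print(count_capture_groups(invalid), within_capture_limit(invalid))
```

The first pattern has 9 capturing groups and the second has 10, because the non-capturing `(?:...)` wrappers take no slot while the groups nested inside them do. A check like this runs after parsing, which is why it fits a well-formedness constraint better than a grammar production.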

199f93e to c0e32b4

Hooray! The files in the branch are the same across the force-push. 😃
~ Your Friendly Jira-GitHub PR Checker Bot

I'd like to get some Java-based tooling, but that will be in a separate PR.

        | DIGIT
        | '_'
    ASCII-CTRLS
        ::= [#x1-#x8#xB-#xC#xE-#x1F]

These are not allowed in XML 1.0.
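
The point can be checked mechanically against the XML 1.0 `Char` production. The following is a minimal sketch; the helper name `is_xml10_char` and the range tables are illustrative assumptions, with the `ASCII-CTRLS` ranges copied from the production quoted in the diff.

```python
# XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML10_CHAR_RANGES = [
    (0x9, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]

# Ranges from the ASCII-CTRLS production: [#x1-#x8#xB-#xC#xE-#x1F]
ASCII_CTRLS_RANGES = [(0x1, 0x8), (0xB, 0xC), (0xE, 0x1F)]


def is_xml10_char(code_point: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return any(lo <= code_point <= hi for lo, hi in XML10_CHAR_RANGES)


# Expand the ASCII-CTRLS ranges into individual code points.
ascii_ctrls = [cp for lo, hi in ASCII_CTRLS_RANGES for cp in range(lo, hi + 1)]

# Collect any ASCII-CTRLS code point that XML 1.0 would accept.
allowed = [cp for cp in ascii_ctrls if is_xml10_char(cp)]

print(len(ascii_ctrls), allowed)
```

None of the 28 code points in `ASCII-CTRLS` match `Char`, so the `allowed` list comes out empty. The three C0 controls that XML 1.0 does accept (tab, line feed, carriage return) are exactly the gaps in the `ASCII-CTRLS` ranges.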

Yes, my comment about surrogates is redundant because they are already
covered in the BNF. As to errors, what I was trying to express is that an
implementation would reject a keyboard layout with an error… But I can
refer to that as a constraint for consistency.

macchiati left a comment

I requested changes for the constraints.

- remove illegal ctrls
- reorder members
- cleanup

    ```ebnf
    [ wfc: No more than 9 capture groups may be present. ]
    [ vc: all variables referenced must be defined in the <variables> element ]
    [ vc: The CLDR repository may define additional constraints on the repertoire, such as requiring all characters to be in a published Unicode version and disallowing private-use characters. ]

    - [ vc: The CLDR repository may define additional constraints on the repertoire, such as requiring all characters to be in a published Unicode version and disallowing private-use characters. ]
    + [ vc: If a keyboard definition is submitted to the CLDR repository, it must satisfy additional constraints on the character repertoire. For more information, see [CLDR keyboard repertoire constraints](#repertoire-constraints). ]

We have to point to where those constraints are documented. So add a little section header for that and point to it. Also make a change to a similar line below.
I suggest that the contents of that section be:
- No characters can have any of the following General_Category values in the latest version of the Unicode Standard:
  - Private Use (Co)
  - Surrogate (Cs)
  - Unassigned (Cn)
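
For illustration only, a check along the lines of the suggested list could look like the sketch below. It relies on Python's `unicodedata` module, whose data reflects the Unicode version bundled with the interpreter (not necessarily the latest version of the Unicode Standard), and `DISALLOWED_CATEGORIES` and `disallowed_chars` are hypothetical names rather than anything defined by CLDR.

```python
import unicodedata

# General_Category values named in the suggested constraint list.
DISALLOWED_CATEGORIES = {"Co", "Cs", "Cn"}


def disallowed_chars(text: str) -> list[tuple[str, str]]:
    """Return (code point, category) pairs for characters that violate the list."""
    found = []
    for ch in text:
        category = unicodedata.category(ch)
        if category in DISALLOWED_CATEGORIES:
            found.append((f"U+{ord(ch):04X}", category))
    return found


# The Unicode version this interpreter's data is based on.
print(unicodedata.unidata_version)

# U+E000 is a private-use character; U+FFFF is a noncharacter (category Cn).
print(disallowed_chars("a\ue000\uffff"))
```

Because the result for unassigned (Cn) code points depends on the Unicode version in use, a real check would need to pin that version explicitly, which is part of what the repertoire requirements would have to spell out.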

Not sure we can link from the EBNF to the spec.
Also, the repository requirements are a separate activity; I don't see a reason to specify them here in detail (though I'll take the list back to kbd-wg). Perhaps it's better to just say: see a future version of the spec.

What about this:

    [ vc: If a keyboard definition is submitted to the CLDR repository, it must satisfy additional constraints on the character repertoire. ]

The problem with that is there is no way for the reader to find out what those requirements are; we can't have undefined vc requirements.

Then I'd propose to drop it from the vc and just add a note. Or not even a note. We don't have the requirements yet.
Yes, there's no way to find out the requirements because they are future requirements.

@miloush how does it look?

CLDR-16836
ALLOW_MANY_COMMITS=true