Use character classes as word boundaries #34

bentley · 2015-06-23T04:41:08Z

In current nvi, ‘w’ is not that useful in Japanese text, because it considers a long sentence written without spaces as a single word.

In the old nvi-m17n, ‘w’ moved along katakana/hiragana/kanji boundaries (so, for instance, “本日は晴天なり” would be treated as words “本日” “は” “晴天” “なり”). This is also how Xterm handles word selection; see Xterm’s charclass.c. I’ve been told by a Japanese nvi user that this is behavior he misses from nvi-m17n.

It would be useful to break along character class boundaries like this.
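
As a rough illustration of the idea, here is a minimal sketch in C of a ‘w’-style motion that stops at character class boundaries. It is not code from nvi or nvi-m17n: the names char_class, classify and next_word_start are hypothetical, the text is assumed to be already decoded into Unicode code points, and the classes are approximated by the Hiragana, Katakana and CJK Unified Ideographs blocks only. A real implementation would also need vi's usual word/punctuation distinction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical character classes, for illustration only (not from nvi). */
enum char_class { CC_SPACE, CC_HIRAGANA, CC_KATAKANA, CC_KANJI, CC_OTHER };

/* Classify a code point by Unicode block; a deliberately coarse approximation. */
enum char_class classify(uint32_t cp)
{
    if (cp == ' ' || cp == '\t' || cp == '\n' || cp == 0x3000)
        return CC_SPACE;            /* ASCII blanks and ideographic space */
    if (cp >= 0x3041 && cp <= 0x309F)
        return CC_HIRAGANA;         /* Hiragana block */
    if (cp >= 0x30A0 && cp <= 0x30FF)
        return CC_KATAKANA;         /* Katakana block */
    if (cp >= 0x4E00 && cp <= 0x9FFF)
        return CC_KANJI;            /* CJK Unified Ideographs */
    return CC_OTHER;
}

/*
 * Return the index of the start of the next "word" after pos, where a
 * word is a maximal run of code points in the same class.  Returns len
 * when there is no further word.
 */
size_t next_word_start(const uint32_t *text, size_t len, size_t pos)
{
    assert(text != NULL);
    if (pos >= len)
        return len;

    enum char_class start = classify(text[pos]);

    /* Skip the rest of the run the cursor is in. */
    while (pos < len && classify(text[pos]) == start)
        pos++;
    /* Skip any whitespace between runs. */
    while (pos < len && classify(text[pos]) == CC_SPACE)
        pos++;
    return pos;
}
```

With “本日は晴天なり” stored as seven code points, repeated calls starting from index 0 stop at the beginning of “は”, “晴天” and “なり” in turn, and then at the end of the text.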

lichray · 2015-06-23T04:54:00Z

Yes. And by relying on the Unicode information, we can also eliminate the platform differences in locale behavior. Currently we have some code that gets the Unicode code point (nvi2/common/key.c, line 285 in 1d22313: int uc = -1;), but it is only used to display Unicode escape sequences.
