Conversation

Bormotoon

Kumir is a Russian algorithmic language primarily used for teaching programming in schools.

@teverett added the kumir and new-grammar labels May 4, 2025
@teverett
Member

Something is wrong: the mergeability check is not completing.

@teverett
Member

teverett commented Jun 3, 2025

@Bormotoon for some reason the merge check never finishes. Could you close and re-open this, please?

@teverett closed this Jun 17, 2025
@teverett reopened this Jun 17, 2025

@KvanTTT left a comment

I recommend simplifying the lexer as suggested in the comments below.

// --- Keywords (Core Language) ---
// Keywords are case-insensitive (both lowercase and uppercase Cyrillic are matched).
MODULE : 'модуль';
ENDMODULE : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля');

I recommend simplifying:

Suggested change
ENDMODULE : ('конец' WS 'модуля' | 'конецмодуля' | 'конец_модуля');
fragment WS_FRAGMENT : [ \t\r\n]+;
ENDMODULE : 'конец' (WS_FRAGMENT | '_')? 'модуля';
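The simplified rule is meant to accept the same three spellings as the original alternation. One quick way to check that is to model it as a regular expression. The sketch below uses Python's `re` as a rough stand-in for the ANTLR lexer rule; the helper name `is_endmodule` and the sample strings are illustrative, and regex matching is only an approximation of ANTLR tokenization.

```python
import re

# Regex model of: ENDMODULE : 'конец' (WS_FRAGMENT | '_')? 'модуля';
# An approximation for checking accepted spellings, not ANTLR itself.
ENDMODULE = re.compile(r"конец(?:[ \t\r\n]+|_)?модуля")


def is_endmodule(text: str) -> bool:
    """Return True if the whole string is a single ENDMODULE keyword."""
    return ENDMODULE.fullmatch(text) is not None


# The three spellings from the original rule, plus one that must be rejected.
for sample in ["конец модуля", "конецмодуля", "конец_модуля", "конец-модуля"]:
    print(repr(sample), is_endmodule(sample))
```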

POST_CONDITION : 'надо';
ASSERTION : 'утв';
LOOP : 'нц';
ENDLOOP_COND : ('кц' WS 'при' | 'кц_при');

Suggested change
ENDLOOP_COND : ('кц' WS 'при' | 'кц_при');
ENDLOOP_COND : 'кц' (WS_FRAGMENT | '_')? 'при';
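One difference is worth checking here: the original rule listed only the spaced and underscored spellings, while the optional group also admits the run-together form `кцпри`. Whether that is intended is the grammar author's call. The sketch below models both rules with Python's `re` (an approximation of the lexer, with illustrative variable names) and includes a hypothetical stricter variant that drops the `?`.

```python
import re

WS = r"[ \t\r\n]+"  # same character class as WS_FRAGMENT

# Original:  ('кц' WS 'при' | 'кц_при')
original = re.compile(rf"кц{WS}при|кц_при")
# Suggested: 'кц' (WS_FRAGMENT | '_')? 'при'
suggested = re.compile(rf"кц(?:{WS}|_)?при")
# Hypothetical stricter variant without '?': the separator is mandatory.
strict = re.compile(rf"кц(?:{WS}|_)при")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches the whole string."""
    return pattern.fullmatch(text) is not None


for sample in ["кц при", "кц_при", "кцпри"]:
    print(repr(sample),
          accepts(original, sample),
          accepts(suggested, sample),
          accepts(strict, sample))
```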

OR : 'или';
OUT_PARAM : 'рез';
IN_PARAM : 'арг';
INOUT_PARAM : ('аргрез' | 'арг' WS 'рез' | 'арг_рез');

Suggested change
INOUT_PARAM : ('аргрез' | 'арг' WS 'рез' | 'арг_рез');
INOUT_PARAM : 'арг' (WS_FRAGMENT | '_')? 'рез';

Comment on lines +66 to +70
INTEGER_ARRAY_TYPE : ('цел' WS? 'таб' | 'цел_таб');
REAL_ARRAY_TYPE : ('вещ' WS? 'таб' | 'вещ_таб');
CHAR_ARRAY_TYPE : ('сим' WS? 'таб' | 'сим_таб');
STRING_ARRAY_TYPE : ('лит' WS? 'таб' | 'лит_таб');
BOOLEAN_ARRAY_TYPE : ('лог' WS? 'таб' | 'лог_таб');

Suggested change
INTEGER_ARRAY_TYPE : ('цел' WS? 'таб' | 'цел_таб');
REAL_ARRAY_TYPE : ('вещ' WS? 'таб' | 'вещ_таб');
CHAR_ARRAY_TYPE : ('сим' WS? 'таб' | 'сим_таб');
STRING_ARRAY_TYPE : ('лит' WS? 'таб' | 'лит_таб');
BOOLEAN_ARRAY_TYPE : ('лог' WS? 'таб' | 'лог_таб');
INTEGER_ARRAY_TYPE : 'цел' (WS_FRAGMENT | '_')? 'таб';
REAL_ARRAY_TYPE : 'вещ' (WS_FRAGMENT | '_')? 'таб';
CHAR_ARRAY_TYPE : 'сим' (WS_FRAGMENT | '_')? 'таб';
STRING_ARRAY_TYPE : 'лит' (WS_FRAGMENT | '_')? 'таб';
BOOLEAN_ARRAY_TYPE : 'лог' (WS_FRAGMENT | '_')? 'таб';

// Color constants
PROZRACHNIY : 'прозрачный';
BELIY : 'белый';
CHERNIY : 'чёрный' | 'черный';

Suggested change
CHERNIY : 'чёрный' | 'черный';
fragment E_OR_YO : 'ё' | 'е';
CHERNIY : 'ч' E_OR_YO 'рный';

Comment on lines +83 to +84
ZELENIY : 'зелёный' | 'зеленый';
ZHELTIY : 'жёлтый' | 'желтый';

Suggested change
ZELENIY : 'зелёный' | 'зеленый';
ZHELTIY : 'жёлтый' | 'желтый';
ZELENIY : 'зел' E_OR_YO 'ный';
ZHELTIY : 'ж' E_OR_YO 'лтый';
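The `E_OR_YO` fragment factors the ё/е choice out of each color name. As a rough check that both spellings still map to the same token, the rules can be modelled with Python's `re`; the `COLOR_RULES` table and the `color_token` helper are illustrative names, not part of the grammar.

```python
import re

# Regex model of: fragment E_OR_YO : 'ё' | 'е';
E_OR_YO = "[ёе]"

# Each entry mirrors one suggested lexer rule.
COLOR_RULES = {
    "CHERNIY": re.compile(f"ч{E_OR_YO}рный"),
    "ZELENIY": re.compile(f"зел{E_OR_YO}ный"),
    "ZHELTIY": re.compile(f"ж{E_OR_YO}лтый"),
}


def color_token(word: str):
    """Return the rule name whose pattern matches the whole word, else None."""
    for name, pattern in COLOR_RULES.items():
        if pattern.fullmatch(word):
            return name
    return None


for word in ["чёрный", "черный", "зелёный", "зеленый", "жёлтый", "желтый"]:
    print(word, color_token(word))
```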

DOC_COMMENT : '#' ~[\r\n]* -> channel(HIDDEN);

// --- Whitespace ---
WS : [ \t\r\n]+ -> skip;

Use the previously introduced WS_FRAGMENT:

Suggested change
WS : [ \t\r\n]+ -> skip;
WS : WS_FRAGMENT -> skip;

Comment on lines +135 to +137
fragment DIGIT : [0-9];
fragment HEX_DIGIT : [0-9a-fA-F];
fragment LETTER : [a-zA-Zа-яА-ЯёЁ];

The uppercase ranges are not needed because the grammar sets caseInsensitive = true.

Suggested change
fragment DIGIT : [0-9];
fragment HEX_DIGIT : [0-9a-fA-F];
fragment LETTER : [a-zA-Zа-яА-ЯёЁ];
fragment DIGIT : [0-9];
fragment HEX_DIGIT : [0-9a-f];
fragment LETTER : [a-zа-яё];
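The lowercase-only class relies on case-insensitive matching to cover uppercase letters. The sketch below illustrates the idea with Python's `re.IGNORECASE`, on the assumption that ANTLR's `caseInsensitive` option behaves comparably for these ranges; `is_letter` is an illustrative helper. It also shows why `ё` stays listed explicitly: its code point lies outside the `а-я` range.

```python
import re

# Regex model of: fragment LETTER : [a-zа-яё];  with caseInsensitive = true
LETTER = re.compile(r"[a-zа-яё]", re.IGNORECASE)


def is_letter(ch: str) -> bool:
    """Return True if ch is a single character accepted by LETTER."""
    return LETTER.fullmatch(ch) is not None


# Uppercase Latin and Cyrillic letters match without explicit uppercase ranges.
for ch in ["q", "Q", "ж", "Ж", "ё", "Ё", "7"]:
    print(ch, is_letter(ch))

# 'ё' (U+0451) sits above 'я' (U+044F), so the а-я range alone misses it.
print(re.fullmatch(r"[а-я]", "ё") is None)
```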

fragment LETTER : [a-zA-Zа-яА-ЯёЁ];
fragment DecInteger : DIGIT+;
fragment HexInteger : '$' HEX_DIGIT+;
fragment ExpFragment: [eE] [+-]? DIGIT+;

Suggested change
fragment ExpFragment: [eE] [+-]? DIGIT+;
fragment ExpFragment: [e] [+-]? DIGIT+;

@teverett
Member

Is this ready to merge?

@KvanTTT
Member

KvanTTT commented Jun 26, 2025

There are some minor issues to fix, but generally yes.

@Bormotoon
Author

Sorry, not yet. There will be some more changes and optimizations as suggested here.

Labels
kumir new-grammar New grammar issue or pull request
4 participants