@janlindstrom

The problem is that FLUSH TABLES FOR EXPORT is a local operation (i.e., it is not replicated by Galera), but it still takes an MDL lock. This MDL lock can then conflict with an INSERT from another node, causing the INSERT to be BF aborted. Whether that happens depends on timing: if we have enough time to find that the INSERT is waiting for the MDL lock, we do UNLOCK TABLES fast enough and avoid the BF abort. If not, there will be a BF abort.

The test case is fixed so that it no longer queries the number of BF aborts, as that count is not stable. Furthermore, error printing is improved and a warning is added when a query is interrupted and there is an error in the wsrep layer.

@janlindstrom janlindstrom self-assigned this Dec 10, 2025

@hemantdangi-gc hemantdangi-gc left a comment

The problem is that FLUSH TABLES FOR EXPORT is a local operation (i.e.,
it is not replicated by Galera), but it still takes an MDL lock. This
MDL lock can then conflict with an INSERT from another node, causing
the INSERT to be BF aborted. Whether that happens depends on timing: if we
have enough time to find that the INSERT is waiting for the MDL lock,
we do UNLOCK TABLES fast enough and avoid the BF abort. If not,
there will be a BF abort.

The test case is fixed so that it no longer queries the number of BF aborts,
as that count is not stable.

The two changes above are optimizations and are not related to the error. The error is raised for the query 'SET SESSION wsrep_sync_wait = 0', not for the INSERT:
mysqltest: At line 19: query 'SET SESSION wsrep_sync_wait = 0' failed: ER_QUERY_INTERRUPTED (1317): Query execution was interrupted

The additional error printing and warning will give more information about the issue, but I don't think these changes resolve it.

The problem was in the wsrep_handle_mdl_conflict function, which compared
the thd->lex->sql_command variable of the thread holding the granted MDL lock.
In this test case, (1) FLUSH TABLES ... FOR EXPORT is executed first and takes
an MDL lock. However, thd->lex->sql_command is not stored in the acquired ticket.
In some cases the next statement executed is the INSERT from the other node, and
because thd->lex->sql_command in (1) is SQLCOM_FLUSH, the INSERT is forced to wait.
In other cases the next statement executed is SET, which changes
thd->lex->sql_command. When the INSERT is then executed, the granted thd's
sql_command is no longer SQLCOM_FLUSH, so the SET (and its preceding command,
FLUSH) is BF aborted. This is possible because SET does not cause an implicit
commit or take new MDL locks (note that the test had SET autocommit=OFF earlier).

The fix is to store thd->lex->sql_command in the MDL ticket and, in MDL conflict
handling, to use this stored value when deciding whether the requester or the
granted lock holder is aborted, or whether we wait for the granted ticket to be
released.
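
A minimal sketch of the difference between the two approaches, written as a toy model rather than the actual server code. Session, Ticket, acquired_by, Decision, and both decide_* functions are hypothetical names introduced only for illustration, and the two-outcome decision is a simplification of the conflict handling described above.

```cpp
// Toy stand-ins only; none of these are the real MariaDB definitions.
enum class SqlCommand { Flush, Set, Insert };

// Plays the role of THD: current_command mirrors thd->lex->sql_command
// and is overwritten by every statement the session runs.
struct Session {
    SqlCommand current_command;
};

// Plays the role of a granted MDL ticket.
struct Ticket {
    const Session* owner;
    SqlCommand acquired_by;  // hypothetical field: command that took the lock
};

enum class Decision { RequesterWaits, AbortHolder };

// Behaviour described as the bug: the decision reads the holder's *live*
// command, which may have changed since the lock was acquired.
Decision decide_from_live_command(const Ticket& granted) {
    return granted.owner->current_command == SqlCommand::Flush
               ? Decision::RequesterWaits
               : Decision::AbortHolder;
}

// Behaviour described as the fix: the decision reads the command recorded
// in the ticket at acquisition time, so a later SET cannot change it.
Decision decide_from_ticket(const Ticket& granted) {
    return granted.acquired_by == SqlCommand::Flush
               ? Decision::RequesterWaits
               : Decision::AbortHolder;
}
```

In this model, a session that runs FLUSH and then SET still holds a ticket with acquired_by set to Flush, so decide_from_ticket keeps making the requester wait, while decide_from_live_command switches to aborting the holder as soon as SET overwrites current_command.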
3 participants