-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PS-9238: Make MySQL 5.7 compatible with CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7 replication reliability #5362
base: 8.0
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
sql/sql_insert.cc
Outdated
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | ||
HTON_SUPPORTS_ATOMIC_DDL) && !opt_ctas_compatibility_mode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implicit conversion unsigned int
-> bool
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | |
HTON_SUPPORTS_ATOMIC_DDL) && !opt_ctas_compatibility_mode) { | |
if (((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | |
HTON_SUPPORTS_ATOMIC_DDL) != 0u) && !opt_ctas_compatibility_mode) { |
@@ -4292,6 +4292,13 @@ static Sys_var_int32 Sys_regexp_stack_limit( | |||
GLOBAL_VAR(opt_regexp_stack_limit), CMD_LINE(REQUIRED_ARG), | |||
VALID_RANGE(0, INT32_MAX), DEFAULT(8000000), BLOCK_SIZE(1)); | |||
|
|||
static Sys_var_bool Sys_ctas_compatibility_mode( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initializing non-local variable with non-const expression depending on uninitialized non-local variable opt_ctas_compatibility_mode
"ctas_compatibility_mode", | ||
"Execute and binlog CTAS in pre 8.0.21 way, i.e. with intermediate commit " | ||
"after the table creation.", | ||
READ_ONLY GLOBAL_VAR(opt_ctas_compatibility_mode), CMD_LINE(OPT_ARG), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not use C-style cast to convert between unrelated types
@@ -2363,7 +2363,8 @@ bool store_create_info(THD *thd, Table_ref *table_list, String *packet, | |||
This is done only while binlogging CREATE TABLE AS SELECT. | |||
*/ | |||
if (!thd->lex->query_block->field_list_is_empty() && | |||
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL)) { | |||
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implicit conversion unsigned int
-> bool
(create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) && | |
((create_info_arg->db_type->flags & HTON_SUPPORTS_ATOMIC_DDL) != 0u) && |
CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7 replication reliability https://perconadev.atlassian.net/browse/PS-9238 Problem: 5.7 does not understand the new syntax and complains about CREATE TABLE ... START TRANSACTION received in binlog. Cause: Commit 4fe3f2f (8.0.21) introduced changes that allow the execution of CREATE TABLE AS SELECT as a single transaction. Before that change, CREATE TABLE was the 1st transaction, then rows were inserted, which was unsafe. To achieve this we don't commit the transaction after CREATE TABLE. The commit is done after rows are inserted. For async replica to understand this protocol, START TRANSACTION clause was added to CREATE TABLE in binlog. When the replica sees this, it knows not to commit after CREATE TABLE. Solution: Introduced ctas_compatibility_mode global, read-only variable, OFF by default. If switched to ON, it enforces the behavior from before the above-mentioned commit. Implementation details: The fix contains 2 parts: 1. Query_result_create::binlog_show_create_table() - When SE supports atomic DDL, CREATE TABLE query was binlogged with direct=false / is_trans=true flags. This caused the event to be cached in trx_cache instead of stmt_cache. MYSQL_BIN_LOG::commit() logic skips trx_cache commit, if this is not the transaction commit (it does only stmt_cache commit). Then, when rows are inserted, it was detected that there is ongoing transaction (trx_cache not empty), so no new BEGIN statement was generated in binlog. Effectively CREATE TABLE and rows insertion were done in one transaction. The fix is to revert to the default behavior, i.e., Query_result_create::binlog_show_create_table() creates the event in stmt_cache, which is committed after table creation. When rows are being inserted, it is detected that trx_cache is empty, so BEGIN statement is added to binlog. 2. store_create_info()—When executing CTAS, the START TRANSACTION clause was added to the rewritten query to inform the replica about what was happening. The fix is not to add the START TRANSACTION clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | ||
HTON_SUPPORTS_ATOMIC_DDL) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implicit conversion unsigned int
-> bool
if ((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | |
HTON_SUPPORTS_ATOMIC_DDL) && | |
if (((get_default_handlerton(thd, thd->lex->create_info->db_type)->flags & | |
HTON_SUPPORTS_ATOMIC_DDL) != 0u) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
} | ||
} | ||
result = | ||
write_bin_log(thd, true, thd->query().str, query_length, is_trans); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implicit conversion int
-> bool
write_bin_log(thd, true, thd->query().str, query_length, is_trans); | |
(write_bin_log(thd, true, thd->query().str, query_length, is_trans) != 0); |
CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7 replication reliability https://perconadev.atlassian.net/browse/PS-9238 Part 2 of the fix. In the scenario source -> replica1 -> replica2, if replica1 is configured with ctas_compatibility_mode it should behave accordingly and produce the binglog compatible with version 8.0.20 (intermediate commit after CREATE TABLE and no START TRANSACTION clause) This part is sensitive to create_info->m_transactional_ddl flag which is set during statement parsing. If ctas_compatibility_mode is set, we force this flag not to be set. Additionally, replica side follows the path of regular CREATE TABLE which binlogs the query as-is. if ctas_compatibility_mode is set, CREATE TABLE is removed from the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
HA_CREATE_USED_START_TRANSACTION>(value), | ||
value(value) {} | ||
|
||
bool contextualize(Table_ddl_parse_context *pc) override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method contextualize
can be made const
bool contextualize(Table_ddl_parse_context *pc) override { | |
bool contextualize(Table_ddl_parse_context *pc) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
sql/log_event.h
Outdated
@@ -1471,6 +1472,12 @@ class Query_log_event : public virtual binary_log::Query_event, | |||
native_strncasecmp(query, STRING_WITH_LEN("ROLLBACK TO "))) || | |||
!strncmp(query, STRING_WITH_LEN("XA ROLLBACK")); | |||
} | |||
|
|||
virtual bool is_ctas() const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
virtual
is redundant since the function is already declared override
virtual bool is_ctas() const override { | |
bool is_ctas() const override { |
CREATE TABLE AS SELECT [...] START TRANSACTION to improve 8.0 -> 5.7 replication reliability https://perconadev.atlassian.net/browse/PS-9238 Part 2.1 of the fix. When ctas_compatibility_mode=OFF the binlog sequence is: BEGIN CREATE TABLE ... START TRANSACTION ROW EVENT COMMIT MTS sees the above as one group. When the source produces binlog with CREATE TABLE ... START TRANSACTION but the intermediate replica is configured with ctas_compatibility_mode=ON it has to execute and binlog CTAS in legacy mode (pre 8.0.20). It means the intermediate commit will be done after CREATE TABLE. This commit will call pre_commit hook, which will call Slave_worker::commit_positions() where we check the validity of ptr_group->checkpoint_seqno. checkpoint_seqno however is set when group end is detected (COMMIT event), but it is too late. We detect CTAS case and set checkpoint_seqno to be prepared for intermediate commit.
https://perconadev.atlassian.net/browse/PS-9238
Problem:
5.7 does not understand the new syntax and complains about CREATE TABLE ... START TRANSACTION received in binlog.
Cause:
Commit 4fe3f2f (8.0.21) introduced changes that allow the execution of CREATE TABLE AS SELECT as a single transaction. Before that change, CREATE TABLE was the 1st transaction, then rows were inserted, which was unsafe. To achieve this we don't commit the transaction after CREATE TABLE. The commit is done after rows are inserted. For async replica to understand this protocol, START TRANSACTION clause was added to CREATE TABLE in binlog. When the replica sees this, it knows not to commit after CREATE TABLE.
Solution:
Introduced ctas_compatibility_mode global, read-only variable, OFF by default. If switched to ON, it enforces the behavior from before the above-mentioned commit.
Implementation details:
The fix contains 2 parts:
direct=false / is_trans=true flags. This caused the event to be cached in trx_cache instead of stmt_cache.
MYSQL_BIN_LOG::commit() logic skips trx_cache commit, if this is not the transaction commit (it does only stmt_cache commit). Then, when rows are inserted, it was detected that there is ongoing transaction (trx_cache not empty), so no new BEGIN statement was generated in binlog.
Effectively CREATE TABLE and rows insertion were done in one transaction.
The fix is to revert to the default behavior, i.e., Query_result_create::binlog_show_create_table() creates the event in stmt_cache, which is committed after table creation. When rows are being inserted, it is detected that trx_cache is empty, so BEGIN statement is added to binlog.
The fix is not to add the START TRANSACTION clause.