Skip to content

Ps 9395 [8.0]: Merge percona-202401 MyRocks tag#5567

Closed
inikep wants to merge 25 commits intopercona:8.0from
inikep:PS-9395-8.0
Closed

Ps 9395 [8.0]: Merge percona-202401 MyRocks tag#5567
inikep wants to merge 25 commits intopercona:8.0from
inikep:PS-9395-8.0

Conversation

@inikep
Copy link
Copy Markdown
Collaborator

@inikep inikep commented Feb 28, 2025

No description provided.

laurynas-biveinis and others added 25 commits February 28, 2025 12:50
Upstream commit ID: facebook/mysql-5.6@52894d310ac4
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Since 5.7, all the THR_LOCK uses cases for InnoDB are fully covered by MDL locking and thus InnoDB removed THR_LOCK-based locking. The same MDL lock coverage applies to MyRocks as well, thus remove THR_LOCK-based locking there too.
- Remove fields Rdb_table_handler::m_thr_lock and ha_rocksdb::m_db_lock
- Add HA_NO_READ_LOCAL_LOCK to ha_rocksdb::table_flags to indicate that the SE does not have any specific support for LOCK TABLE READ LOCAL
- New overridden method ha_rocksdb::lock_count that always returns zero
- Remove all code in ha_rocksdb::store_lock that created THR_LOCK objects. Leave only the code that decides on the internal RocksDB locking.

References:
- https://dev.mysql.com/worklog/task/?id=6671
- https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-5.html
- 5c6171e

Pull Request resolved: facebook/mysql-5.6#1323

Differential Revision: D46773492
Upstream commit ID: facebook/mysql-5.6@83bb92de5a0a
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
This fixes the errors listed below. For the array argument, fix by converting the arrays to pointers as this is what gets passed around anyway.
```
storage/rocksdb/rdb_datadic.cc:5528:9: error: format specifies type 'unsigned int' but the argument has type 'uint16_t' (aka 'unsigned short') [-Werror,-Wformat]
        index_info->m_index_dict_version, index_info->m_index_type,
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
storage/rocksdb/rdb_datadic.cc:5528:43: error: format specifies type 'unsigned int' but the argument has type 'uchar' (aka 'unsigned char') [-Werror,-Wformat]
        index_info->m_index_dict_version, index_info->m_index_type,
                                          ^~~~~~~~~~~~~~~~~~~~~~~~
storage/rocksdb/rdb_datadic.cc:5529:9: error: format specifies type 'unsigned int' but the argument has type 'uint16_t' (aka 'unsigned short') [-Werror,-Wformat]
        index_info->m_kv_version, index_info->m_ttl_duration);
        ^~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.

storage/rocksdb/ha_rocksdb.cc:14077:34: error: argument 'buf' of type 'uchar[8]' (aka 'unsigned char[8]') with mismatched bound [-Werror,-Warray-parameter]
    const Rdb_key_def &kd, uchar buf[Rdb_key_def::INDEX_NUMBER_SIZE * 2]) {
                                 ^
storage/rocksdb/./ha_rocksdb.h:406:64: note: previously declared as 'uchar[]' (aka 'unsigned char[]') here
  static rocksdb::Range get_range(const Rdb_key_def &kd, uchar buf[]);
                                                               ^
storage/rocksdb/ha_rocksdb.cc:14086:24: error: argument 'buf' of type 'uchar[8]' (aka 'unsigned char[8]') with mismatched bound [-Werror,-Warray-parameter]
    const int i, uchar buf[Rdb_key_def::INDEX_NUMBER_SIZE * 2]) const {
                       ^
storage/rocksdb/./ha_rocksdb.h:391:47: note: previously declared as 'uchar[]' (aka 'unsigned char[]') here
  rocksdb::Range get_range(const int i, uchar buf[]) const;
                                              ^
```

Pull Request resolved: facebook/mysql-5.6#1364

Differential Revision: D49440205
…ate events (percona#1362)

Upstream commit ID: facebook/mysql-5.6@3c6a6e135e43
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Currently, the SQL command type is not checked when making the decision to forbid statement and mixed binlog formats, but some SQL commands may not generate row events anyway, thus it does not make sense to forbid them. Add this condition by calling thd_sqlcom_can_generate_row_events(thd). InnoDB has been using a similar check already.

The difference may not be substantial at the moment, but with MyRocks DDSE many SQL commands start satisfying this condition.

At the same time partially reorganize ha_rocksdb::external_lock code so that things are not checked if they are not needed (i.e. binlog format only needs to be read for F_WRLCK lock type), and do early error exit checks before performing actions such as creating the transaction object.

Pull Request resolved: facebook/mysql-5.6#1362

Differential Revision: D49323956
Upstream commit ID: facebook/mysql-5.6@62f4a46ee492
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
`GetCompactionReasonString()` will be exposed in
RocksDB 8.7, which includes PR facebook/rocksdb#11778 that exposes it. So update its definition here to be based on RocksDB version.

Differential Revision: D49755853
…solation level

Upstream commit ID: facebook/mysql-5.6@5eec7392ab81
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Currently, a concurrency race can take place with REPEATABLE READ isolation level, in the case of a locking read via the secondary key, if after a secondary key is read and before the corresponding row is read via the reconstructed primary key, a concurrent session updates the row.

This bug existed for READ COMMITTED isolation level, and was fixed here: D48740767. For the REPEATABLE READ case, the fix is different. Here, we want to be able to detect the snapshot conflict in the case of a locking read, and return this error back to the user.

Currently, a secondary key read is performed via a get() call and does not usually acquire locks at the DB level, and neither does it trigger any snapshot creation. In this scenario, the snapshot creation gets triggered only during the subsequent get_for_update() call when the row is read via the primary key (reconstructed from the just read secondary key). The objective is to delay snapshot creation till just before the DB->Get() call to prevent unnecessary snapshot conflicts.

Delaying the snapshot creation works for primary key reads, as the snapshot will be used for validation for subsequent operations, but for concurrent updates taking place around SK based locking reads, the snapshot creation takes place too late, and cannot be used for snapshot validation for this present race.

This iteration of the fix involves acquire a snapshot in get_row_by_sk() before the sk_iterator->get() call, which can then be used during the subsequent get_for_update() calls for snapshot validation. This may be a simpler alternative for the time being, pending a more detailed exploration of acquire_snapshot codepaths.

Other alternative fixes considered:
1. acquire snapshot in index_init(), the downside being that we want to delay snapshotting till just before the Get call, to minimize conflicts.
2. acquire snapshot by allowing iter->get() calls to do so: D49793443
3. Making the sk_iterator->get() calls as locking reads based on `m_lock_rows`, which may not make sense for SK get().

Differential Revision: D49927999
Upstream commit ID: facebook/mysql-5.6@313613d8f03a
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: Today, when we issue an alter to convert the unique key to non-unique key, we hit an assert while trying to populate the secondary key due to no indexes needing to be built in reality. In MyRocks, existing data doesn't have any specific information whether the key is unique or not. So other than marking flags which indicate that the data can be duplicate, we can exit without needing any rebuild.

Differential Revision: D49762388
Upstream commit ID: facebook/mysql-5.6@4235414d3ad1
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
If users set high value on rocksdb_max_row_locks session var
and updates large rows, replication threads stopped with
"Failed to acquire lock due to rocksdb_max_row_locks limit" error.
Since rocksdb_max_row_locks is a rocksdb specific session variable,
we can not write to binary log (otherwise different storage engines will
break).
To prevent replication error, this diff makes replication threads
ignore rocksdb_max_row_locks limit.

Differential Revision: D50285899
Upstream commit ID: facebook/mysql-5.6@7d2c7167cca4
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
rocksdb api change
* `Options::compaction_readahead_size` 's default value is changed from 0 to 2MB.

@update-submodule: rocksdb

Differential Revision: D50354704
Upstream commit ID: facebook/mysql-5.6@44b791724098
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Currently table stats/definition stored in myrocks internal DD use normalize table name.

if original table name contains special character, such as "2023-09-01", its normalized table name is "2023@002d09002d01",  myrocks won't find table stats/definition by using original table name.

The change is to mimic what innodb does(https://www.internalfb.com/code/mysql-5.6/[fb-mysql-8.0.28%3A88ca8e7604f66fcf25492c675cd2e669627216f1]/storage/innobase/handler/ha_innodb.cc?lines=17632) --- always convert to normailzed table name first,  then fetch table stats.

Differential Revision: D50110769
Upstream commit ID: facebook/mysql-5.6@c1f79a85e126
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
sometime rocksdb.partial_index_stress MTR fail especially under myrocks DD due
```
MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction (snapshot conflict)')
```

From https://dev.mysql.com/doc/refman/8.0/en/set-transaction.html, SET SESSION TRANSACTION doesn't apply to cur trx.

The fix is to commit Isolation change first to avoid snapshot conflict during first insert

Differential Revision: D50420444
Upstream commit ID: facebook/mysql-5.6@3366bd9d91b2
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
some MTR failed in ubsan due to num of rows/records calculated in records_in_range() is a negative values which is caused by m_actual_disk_size is a negative value

```
storage/rocksdb/ha_rocksdb.cc:: runtime error: -56.8272 is outside the range of representable values of type 'unsigned long long'
    #0  myrocks::ha_rocksdb::records_in_range_internal(unsigned int, key_range*, key_range*, long, long, unsigned long long*, unsigned long long*) /data/sandcastle/boxes/trunk-git-mysql/storage/rocksdb/ha_rocksdb.cc:14855:16
    #1  myrocks::ha_rocksdb::records_in_range(unsigned int, key_range*, key_range*) /data/sandcastle/boxes/trunk-git-mysql/storage/rocksdb/ha_rocksdb.cc:14760:3
    #2  handler::multi_range_read_info_const(unsigned int, RANGE_SEQ_IF*, void*, unsigned int, unsigned int*, unsigned int*, Cost_estimate*) /data/sandcastle/boxes/trunk-git-mysql/sql/handler.cc:6608:26
    #3  myrocks::ha_rocksdb::multi_range_read_info_const(unsigned int, RANGE_SEQ_IF*, void*, unsigned int, unsigned int*, unsigned int*, Cost_estimate*) /data/sandcastle/boxes/trunk-git-mysql/storage/rocksdb/ha_rocksdb.cc:19002:18
    #4  check_quick_select(THD*, RANGE_OPT_P
```

Due to m_actual_disk_size is an estimated value, always reset to 0 if it i becomes negative.

Differential Revision: D50531919
Upstream commit ID: facebook/mysql-5.6@797aaf807014
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
clang 15 comes with a few extra [formatting warnings](https://clang.llvm.org/docs/DiagnosticsReference.html#wformat) that we treat as
errors. This diff fixes them in preparation for the upgrade.

Differential Revision: D51122375
Upstream commit ID: facebook/mysql-5.6@af9623c56ab1
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: For these drop tables testcases, add drop_table_with_delete_range combination inside testcases

Differential Revision: D50632053
Upstream commit ID: facebook/mysql-5.6@67bc25d9949e
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
It was deprecated in 8.0.28.

Pull Request resolved: facebook/mysql-5.6#1397

Differential Revision: D51635623

fbshipit-source-id: 302a39f1eeecee75a838d8e5636358bd87755b94
Upstream commit ID: facebook/mysql-5.6@541869b68f39
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Its value is duplicated with row_info.skip_unique_check, which is already passed to the same methods. At the same time replace MY_ATTRIBUTE((__unused__)) with [[nodiscard]].

Pull Request resolved: facebook/mysql-5.6#1404

Differential Revision: D51872282

fbshipit-source-id: e94795e34535838082df39e385024683e14ec4f6
Upstream commit ID: facebook/mysql-5.6@8f13afd2a198
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: With these changes, we support instant alter for column default values.

Differential Revision: D49562448

fbshipit-source-id: 8cc0d802ac071695ad2046968944d5ea34875c66
Upstream commit ID: facebook/mysql-5.6@739820bcfb5f
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: With these changes, we support altering table comment instantly.

Differential Revision: D50081943

fbshipit-source-id: 1444927cbd42d48839756c7cadd25c5f290191c8
Upstream commit ID: facebook/mysql-5.6@9b8f4e5eaee0
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: The change to support column default changes also covers comment addition/removal but there was no test case to verify it.

Differential Revision: D50157267

fbshipit-source-id: cfad2b1dd4b463169f9b380d595d2b5df962eca2
Upstream commit ID: facebook/mysql-5.6@e5ce5c52f10b
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: Adding support to enable instantddl for drop index/unique index. Keeping changes separate for now. Will club the different instantddls together when there is enough confidence on the rollout.

Differential Revision: D50539296

fbshipit-source-id: 480795d9a46065a0a8f5d05986b971d47f5faa5d
Upstream commit ID: facebook/mysql-5.6@3c92f6107aad
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: Recording some new rocksdb variables added as part of instant DDL support

Differential Revision: D52452369

fbshipit-source-id: 9c145160cc094f0f515dd22626b7351f2e700406
Upstream commit ID: facebook/mysql-5.6@27bc0f8db765
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary: Wrapping RocksDB connection-creating tests in --source include/count_sessions.inc / wait_until_count_sessions.inc for stability.

Differential Revision: D51997562

fbshipit-source-id: 1b927ec1653579f932c96071e9d8821f8df75165
Upstream commit ID: facebook/mysql-5.6@e651eb791be2
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Updating the rocksdb submodule to 8.10.fb

update-submodule: rocksdb

Differential Revision: D52605733

fbshipit-source-id: f06d5d9b9d6903959e18aa16f808dc5ae2154edd
…ercona#1416)

Upstream commit ID: facebook/mysql-5.6@e5b18aae04c8
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
Rdb_tbl_def class has a move constructor-like two-arg constructor that takes the ownership of passed object's m_key_descr_arr field. Signify this moved-from state in the object by resetting m_pk_index-which is a related-field to MAX_INDEXES + 1, and add extra asserts that these and other related fields are not accessed in moved-from objects.

At the same time remove redundant "explicit" specifiers from two-argument constructors and clean them up to use initializer lists consistently.

Pull Request resolved: facebook/mysql-5.6#1416

Differential Revision: D52609969

fbshipit-source-id: 0aed03847f2e0f783e96191d2e966c4068cab4ad
Upstream commit ID: facebook/mysql-5.6@af8699c08190
PS-9395: Merge percona-202401 (https://jira.percona.com/browse/PS-9395)

Summary:
MyRocks has only one physical record storage organization, and does not internally have any notion of row format. But server layer does, and assigns the format setting to each table at CREATE TABLE time, and supports ALTER TABLE to different row formats even when the SE does not.

To match InnoDB better, make the default row format for MyRocks tables DYNAMIC. Continue accepting all the other values or InnoDB compatibility.

This change is required to convert SYSTEM table from INNODB to MyRocks. For example mysql.db table, without this change, INNODB mysql.db table and MyRocks mysql.db table will have different row_type value, thus checksum mismatch:
- INNODB default row_type value is DYNAMIC.
- Without this change, MyRocks will use SQL layer default row_type value which is Fixed. With this change, MyRocks will also use DYNAMIC as default row_type value.

Related: facebook/mysql-5.6#699

Pull Request resolved: facebook/mysql-5.6#1413

Differential Revision: D52259024

fbshipit-source-id: 94c71824656d70d7a02a89dcc0829d7165c38a37
Copy link
Copy Markdown
Collaborator

@percona-ysorokin percona-ysorokin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@inikep inikep closed this Mar 25, 2025
@inikep inikep deleted the PS-9395-8.0 branch August 11, 2025 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants