diff --git a/TOC-tidb-cloud.md b/TOC-tidb-cloud.md index 55eab66ef9789..51ebcb89b1532 100644 --- a/TOC-tidb-cloud.md +++ b/TOC-tidb-cloud.md @@ -625,6 +625,7 @@ - [Set Operations](/functions-and-operators/set-operators.md) - [List of Expressions for Pushdown](/functions-and-operators/expressions-pushed-down.md) - [Clustered Indexes](/clustered-indexes.md) + - [Global Indexes](/global-indexes.md) - [Constraints](/constraints.md) - [Generated Columns](/generated-columns.md) - [SQL Mode](/sql-mode.md) diff --git a/TOC.md b/TOC.md index d7b7464f48993..b1f13f4e29305 100644 --- a/TOC.md +++ b/TOC.md @@ -984,6 +984,7 @@ - [List of Expressions for Pushdown](/functions-and-operators/expressions-pushed-down.md) - [Comparisons between Functions and Syntax of Oracle and TiDB](/oracle-functions-to-tidb.md) - [Clustered Indexes](/clustered-indexes.md) + - [Global Indexes](/global-indexes.md) - [Vector Index](/vector-search/vector-search-index.md) - [Constraints](/constraints.md) - [Generated Columns](/generated-columns.md) diff --git a/basic-features.md b/basic-features.md index 9fca82a2cf497..40acbf1a9b18c 100644 --- a/basic-features.md +++ b/basic-features.md @@ -66,7 +66,7 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u | [Multi-valued indexes](/sql-statements/sql-statement-create-index.md#multi-valued-indexes) | Y | Y | Y | Y | N | N | N | | [Foreign key](/foreign-key.md) | Y | E | E | E | N | N | N | | [TiFlash late materialization](/tiflash/tiflash-late-materialization.md) | Y | Y | Y | Y | N | N | N | -| [Global index](/partitioned-table.md#global-indexes) | Y | N | N | N | N | N | N | +| [Global index](/global-indexes.md) | Y | N | N | N | N | N | N | | [Vector index](/vector-search/vector-search-index.md) | E | N | N | N | N | N | N | ## SQL statements @@ -174,7 +174,7 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u | [Range INTERVAL partitioning](/partitioned-table.md#range-interval-partitioning) | 
Y | Y | Y | Y | E | N | N | | [Convert a partitioned table to a non-partitioned table](/partitioned-table.md#convert-a-partitioned-table-to-a-non-partitioned-table) | Y | Y | Y | N | N | N | N | | [Partition an existing table](/partitioned-table.md#partition-an-existing-table) | Y | Y | Y | N | N | N | N | -| [Global index](/partitioned-table.md#global-indexes) | Y | N | N | N | N | N | N | +| [Global indexes](/global-indexes.md) | Y | N | N | N | N | N | N | ## Statistics diff --git a/best-practices/tidb-best-practices.md b/best-practices/tidb-best-practices.md index c7cca03df2d46..718e23f959167 100644 --- a/best-practices/tidb-best-practices.md +++ b/best-practices/tidb-best-practices.md @@ -81,7 +81,7 @@ Similarly, if all data is read from a focused small range (for example, the cont ### Secondary index -TiDB supports the complete secondary indexes, which are also global indexes. Many queries can be optimized by index. Thus, it is important for applications to make good use of secondary indexes. +TiDB supports the complete secondary indexes, which are also [global indexes](/global-indexes.md). Many queries can be optimized by index. Thus, it is important for applications to make good use of secondary indexes. Lots of MySQL experience is also applicable to TiDB. It is noted that TiDB has its unique features. The following are a few notes when using secondary indexes in TiDB. diff --git a/choose-index.md b/choose-index.md index 5770c1e58544c..753a4e01334ac 100644 --- a/choose-index.md +++ b/choose-index.md @@ -81,7 +81,7 @@ Skyline-pruning is a heuristic filtering rule for indexes, which can reduce the - Select whether the index satisfies a certain order. Because index reading can guarantee the order of certain column sets, indexes that satisfy the query order are superior to indexes that do not satisfy on this dimension. -- Whether the index is a [global index](/partitioned-table.md#global-indexes). 
In partitioned tables, global indexes can effectively reduce the number of cop tasks for a SQL compared to normal indexes, thus improving overall performance. +- Whether the index is a [global index](/global-indexes.md). In partitioned tables, global indexes can effectively reduce the number of cop tasks for a SQL compared to normal indexes, thus improving overall performance. For these preceding dimensions, if the index `idx_a` performs no worse than the index `idx_b` in all three dimensions and performs better than `idx_b` in one dimension, then `idx_a` is preferred. When executing the `EXPLAIN FORMAT = 'verbose' ...` statement, if skyline-pruning excludes some indexes, TiDB outputs a NOTE-level warning listing the remaining indexes after the skyline-pruning exclusion. diff --git a/global-indexes.md b/global-indexes.md new file mode 100644 index 0000000000000..8eb24d3f08930 --- /dev/null +++ b/global-indexes.md @@ -0,0 +1,320 @@ +--- +title: Global Indexes +summary: Learn the use cases, advantages, usage, working principles, and limitations of TiDB global indexes. +--- + +# Global Indexes + +Before the introduction of global indexes, TiDB created a local index for each partition, leading to [a limitation](/partitioned-table.md#partitioning-keys-primary-keys-and-unique-keys) that primary keys and unique keys had to include the partition key to ensure data uniqueness. Additionally, when querying data across multiple partitions, TiDB needed to scan the data of each partition to return results. + +To address these issues, TiDB introduces the global indexes feature in [v8.3.0](https://docs.pingcap.com/tidb/stable/release-8.3.0). A global index covers the data of the entire table with a single index, allowing primary keys and unique keys to maintain global uniqueness without including all partition keys. 
Moreover, global indexes can access index data across multiple partitions in a single operation instead of looking up the local index for each partition, significantly improving query performance for non-partitioning keys. + +## Advantages + +Global indexes significantly improve query performance, enhance indexing flexibility, and reduce the cost of data migration and application modification. + +### Improved query performance + +Global indexes greatly enhance the efficiency of queries involving non-partitioning columns. When a query involves a non-partitioning column, a global index can quickly locate the relevant data, avoiding full table scans across all partitions. This dramatically reduces the number of Coprocessor (cop) tasks, which is especially beneficial in scenarios with a large number of partitions. + +In benchmark tests using sysbench `select_random_points`, performance improves by up to 53 times when the table contains 100 partitions. + +### Enhanced indexing flexibility + +Global indexes remove the restriction that unique keys in partitioned tables must include all partitioning columns. This provides greater flexibility in index design. You can now create indexes based on actual query patterns and business logic, rather than being constrained by the partitioning scheme. This flexibility not only improves query performance but also supports a wider range of application requirements. + +### Reduced cost for data migration and modifying applications + +Global indexes significantly simplify data migration and application modification. Without global indexes, you might need to modify partitioning schemes or rewrite queries to work around indexing limitations. With global indexes, such changes are unnecessary, reducing both development and maintenance overhead. 
+ +For example, Oracle supports global indexes, so a table migrated from an Oracle database to TiDB might contain unique indexes that do not include partitioning columns. Before TiDB introduced global indexes, you had to modify the table schema to comply with TiDB's partitioning rules. Now that TiDB supports global indexes, you can simply define those indexes as global during migration, keeping schema behavior consistent with Oracle and greatly reducing migration costs. + +## Limitations of global indexes + +- If the `GLOBAL` keyword is not explicitly specified in the index definition, TiDB creates a local index by default. +- The `GLOBAL` and `LOCAL` keywords only apply to partitioned tables and do not affect non-partitioned tables. In other words, there is no difference between a global index and a local index in non-partitioned tables. +- DDL operations such as `DROP PARTITION`, `TRUNCATE PARTITION`, and `REORGANIZE PARTITION` also trigger updates to global indexes. These DDL operations need to wait for the global index updates to complete before returning results, which increases the execution time accordingly. This is particularly evident in data archiving scenarios, such as `DROP PARTITION` and `TRUNCATE PARTITION`. Without global indexes, these operations can typically complete immediately. However, with global indexes, the execution time increases as the number of indexes that need to be updated grows. +- Tables that contain global indexes do not support the `EXCHANGE PARTITION` operation. +- By default, the primary key of a partitioned table is a clustered index and must include the partition key. If you require the primary key to exclude the partition key, you can explicitly specify the primary key as a non-clustered global index when creating the table, for example, `PRIMARY KEY(col1, col2) NONCLUSTERED GLOBAL`. 
+- If a global index is added to an expression column, or a global index is also a prefix index (for example, `UNIQUE KEY idx_id_prefix (id(10)) GLOBAL`), you need to collect statistics manually for this global index. + +## Feature evolution + +- **Before v7.6.0**: TiDB only supports local indexes on partitioned tables. This means that unique keys on partitioned tables have to include all columns in the partition expression. Queries that do not use the partition key have to scan all partitions, resulting in degraded query performance. +- **[v7.6.0](https://docs.pingcap.com/tidb/stable/release-7.6.0)**: Introduces the [`tidb_enable_global_index`](/system-variables.md#tidb_enable_global_index-new-in-v760) system variable to enable global indexes. However, in this version the feature is still under development and is not recommended for production use. +- **[v8.3.0](https://docs.pingcap.com/tidb/stable/release-8.3.0)**: Global indexes are released as an experimental feature. You can explicitly create a global index using the `GLOBAL` keyword when defining an index. +- **[v8.4.0](https://docs.pingcap.com/tidb/stable/release-8.4.0)**: The global indexes feature becomes generally available (GA). You can create global indexes directly using the `GLOBAL` keyword without setting the `tidb_enable_global_index` system variable. From this version onward, the system variable is deprecated and fixed to `ON`, meaning global indexes are enabled by default. +- **[v8.5.0](https://docs.pingcap.com/tidb/stable/release-8.5.0)**: Global indexes support including all columns from the partition expression. + +## Global indexes vs. local indexes + +The following diagram shows the differences between global indexes and local indexes. + +![Global Index vs. Local Index](/media/global-index-vs-local-index.png) + +**Scenarios for global indexes**: + +- **Infrequent data archiving**: For example, in the healthcare industry, some business data must be retained for up to 30 years. 
Data is often partitioned monthly, creating 360 partitions at once, with very few `DROP` or `TRUNCATE` operations. In such scenarios, global indexes are more suitable, providing cross-partition consistency and improved query performance. +- **Queries that require cross-partition data**: When queries need to access data across multiple partitions, global indexes can avoid full scans across all partitions and enhance query efficiency. + +**Scenarios for local indexes**: + +- **Frequent data archiving**: If data archiving operations are frequent and queries are mostly confined to a single partition, local indexes can offer better performance. +- **Partition exchange requirements**: In industries like banking, processed data might first be written to a regular table and, after verification, exchanged into a partitioned table to minimize performance impact. In this case, local indexes are preferred, because enabling global indexes disables the partition exchange functionality on the table. + +## Global indexes vs. clustered indexes + +Due to the underlying principles of clustered indexes and global indexes, a single index cannot serve as both a clustered index and a global index. However, these two types of indexes provide different performance optimizations for different query scenarios. When you need to leverage the benefits of both, you can add the partitioning columns to the clustered index while also creating a global index that does not include the partitioning columns. + +Suppose you have the following table structure: + +```sql +CREATE TABLE `t` ( + `id` int DEFAULT NULL, + `ts` timestamp NULL DEFAULT NULL, + `data` varchar(100) DEFAULT NULL +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin +PARTITION BY RANGE (UNIX_TIMESTAMP(`ts`)) +(PARTITION `p0` VALUES LESS THAN (1735660800), + PARTITION `p1` VALUES LESS THAN (1738339200) + ...) +``` + +In the preceding `t` table, the values in the `id` column are unique. 
To optimize both point queries and range queries, you can define a clustered index in the table creation statement as `PRIMARY KEY(id, ts)` and a global index without the partitioning column as `UNIQUE KEY id(id) GLOBAL`. This way, point queries based on `id` will use the global index `id` and choose a `PointGet` execution plan, while range queries will use the clustered index. The clustered index requires one less table lookup compared to the global index, improving query efficiency. + +The modified table structure is as follows: + +```sql +CREATE TABLE `t` ( + `id` int NOT NULL, + `ts` timestamp NOT NULL, + `data` varchar(100) DEFAULT NULL, + PRIMARY KEY (`id`, `ts`) /*T![clustered_index] CLUSTERED */, + UNIQUE KEY `id` (`id`) /*T![global_index] GLOBAL */ +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin +PARTITION BY RANGE (UNIX_TIMESTAMP(`ts`)) +(PARTITION `p0` VALUES LESS THAN (1735660800), + PARTITION `p1` VALUES LESS THAN (1738339200) + ...) +``` + +This approach optimizes point queries based on `id` while improving the performance of range queries, and also ensures that the table's partitioning columns are effectively utilized in timestamp-based queries. + +## Usage + +To create a global index, add the `GLOBAL` keyword in the index definition. + +> **Note:** +> +> Global indexes affect partition management. Executing `DROP PARTITION`, `TRUNCATE PARTITION`, or `REORGANIZE PARTITION` operations will trigger updates to the table-level global indexes. This means that these DDL operations only return after the corresponding global index updates are completed, which might increase execution time. 
+ +```sql +CREATE TABLE t1 ( + col1 INT NOT NULL, + col2 DATE NOT NULL, + col3 INT NOT NULL, + col4 INT NOT NULL, + UNIQUE KEY uidx12(col1, col2) GLOBAL, + UNIQUE KEY uidx3(col3), + KEY idx1(col1) GLOBAL +) +PARTITION BY HASH(col3) +PARTITIONS 4; +``` + +In the preceding example, the unique index `uidx12` and the non-unique index `idx1` become global indexes, while `uidx3` remains a regular unique index. + +Note that a clustered index cannot be a global index. For example: + +```sql +CREATE TABLE t2 ( + col1 INT NOT NULL, + col2 DATE NOT NULL, + PRIMARY KEY (col2) CLUSTERED GLOBAL +) PARTITION BY HASH(col1) PARTITIONS 5; +``` + +``` +ERROR 1503 (HY000): A CLUSTERED INDEX must include all columns in the table's partitioning function +``` + +A clustered index cannot simultaneously serve as a global index. This is because if a clustered index were global, the table would no longer be partitioned. The keys of a clustered index are at the partition level, while global indexes operate at the table level, creating a conflict. 
If you need to set the primary key as a global index, you must explicitly define it as a non-clustered index, for example: + +```sql +PRIMARY KEY(col1, col2) NONCLUSTERED GLOBAL +``` + +You can identify global indexes by the `GLOBAL` option in the output of [`SHOW CREATE TABLE`](/sql-statements/sql-statement-show-create-table.md): + +```sql +SHOW CREATE TABLE t1\G +``` + +``` + Table: t1 +Create Table: CREATE TABLE `t1` ( + `col1` int NOT NULL, + `col2` date NOT NULL, + `col3` int NOT NULL, + `col4` int NOT NULL, + UNIQUE KEY `uidx12` (`col1`,`col2`) /*T![global_index] GLOBAL */, + UNIQUE KEY `uidx3` (`col3`), + KEY `idx1` (`col1`) /*T![global_index] GLOBAL */ +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin +PARTITION BY HASH (`col3`) PARTITIONS 4 +1 row in set (0.00 sec) +``` + +Alternatively, you can query the [`INFORMATION_SCHEMA.TIDB_INDEXES`](/information-schema/information-schema-tidb-indexes.md) table and check the `IS_GLOBAL` column in the output to identify global indexes. 
+ +```sql +SELECT * FROM information_schema.tidb_indexes WHERE table_name='t1'; +``` + +``` ++--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ +| TABLE_SCHEMA | TABLE_NAME | NON_UNIQUE | KEY_NAME | SEQ_IN_INDEX | COLUMN_NAME | SUB_PART | INDEX_COMMENT | Expression | INDEX_ID | IS_VISIBLE | CLUSTERED | IS_GLOBAL | ++--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ +| test | t1 | 0 | uidx12 | 1 | col1 | NULL | | NULL | 1 | YES | NO | 1 | +| test | t1 | 0 | uidx12 | 2 | col2 | NULL | | NULL | 1 | YES | NO | 1 | +| test | t1 | 0 | uidx3 | 1 | col3 | NULL | | NULL | 2 | YES | NO | 0 | +| test | t1 | 1 | idx1 | 1 | col1 | NULL | | NULL | 3 | YES | NO | 1 | ++--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ +4 rows in set (0.00 sec) +``` + +When partitioning a regular table or repartitioning a partitioned table, you can update indexes to be either global indexes or local indexes as needed. + +For example, the following SQL statement repartitions table `t1` based on column `col1` and updates the global indexes `uidx12` and `idx1` to local indexes, while updating the local index `uidx3` to a global index. `uidx3` is a unique index on column `col3`. To ensure the uniqueness of `col3` across all partitions, `uidx3` must be a global index. `uidx12` and `idx1` are indexes on column `col1` and can be either global or local indexes. + +```sql +ALTER TABLE t1 PARTITION BY HASH (col1) PARTITIONS 3 UPDATE INDEXES (uidx12 LOCAL, uidx3 GLOBAL, idx1 LOCAL); +``` + +## Working mechanism + +This section explains the working mechanism of global indexes, including their design concept and implementation. 
+ +### Design concept + +In TiDB partitioned tables, the key prefix of a local index is the partition ID, while the key prefix of a global index is the table ID. This design ensures that the data of a global index is stored continuously on TiKV, reducing the number of RPC requests when querying the index. + +```sql +CREATE TABLE `sbtest` ( + `id` int(11) NOT NULL, + `k` int(11) NOT NULL DEFAULT '0', + `c` char(120) NOT NULL DEFAULT '', + KEY idx(k), + KEY global_idx(k) GLOBAL +) partition by hash(id) partitions 5; +``` + +Take the preceding table structure as an example: `idx` is a local index, and `global_idx` is a global index. The data of `idx` is distributed across 5 different ranges, such as `PartitionID1_i_xxx` and `PartitionID2_i_xxx`. In contrast, the data of `global_idx` is concentrated in a single range (`TableID_i_xxx`). + +When executing a query related to `k`, such as `SELECT * FROM sbtest WHERE k > 1`, using the local index `idx` results in 5 separate ranges being constructed, while using the global index `global_idx` only constructs a single range. Each range corresponds to one or more RPC requests in TiDB. Therefore, using a global index can reduce the number of RPC requests by several times, improving index query performance. + +The following diagram illustrates the difference in RPC requests and data flow when executing `SELECT * FROM sbtest WHERE k > 1` using `idx` versus `global_idx`: + +![Mechanism of Global Indexes](/media/global-index-mechanism.png) + +### Encoding method + +In TiDB, index entries are encoded as key-value pairs. For partitioned tables, each partition is treated as an independent physical table at the TiKV layer, with its own `partitionID`. 
Therefore, the encoding of index entries in a partitioned table is as follows: + +``` +Unique key +Key: +- PartitionID_indexID_ColumnValues + +Value: +- IntHandle + - TailLen_IntHandle + +- CommonHandle + - TailLen_IndexVersion_CommonHandle + +Non-unique key +Key: +- PartitionID_indexID_ColumnValues_Handle + +Value: +- IntHandle + - TailLen_Padding + +- CommonHandle + - TailLen_IndexVersion +``` + +For global indexes, the encoding of index entries is different. To ensure compatibility with the current index key encoding, the new index encoding layout is as follows: + +``` +Unique key +Key: +- TableID_indexID_ColumnValues + +Value: +- IntHandle + - TailLen_PartitionID_IntHandle + +- CommonHandle + - TailLen_IndexVersion_CommonHandle_PartitionID + +Non-unique key +Key: +- TableID_indexID_ColumnValues_Handle + +Value: +- IntHandle + - TailLen_PartitionID + +- CommonHandle + - TailLen_IndexVersion_PartitionID +``` + +This encoding scheme places the `TableID` at the beginning of the global index key, while the `PartitionID` is stored in the value. The advantage of this design is that it achieves compatibility with the existing index key encoding. However, it also introduces some challenges. For example, when executing DDL operations such as `DROP PARTITION` or `TRUNCATE PARTITION`, extra handling is required because the index entries are not stored contiguously. + +## Performance test results + +The following tests are based on the `select_random_points` scenario in sysbench, primarily used to compare query performance under different partitioning strategies and indexing methods. 
+ +The table structure used in the tests is as follows: + +```sql +CREATE TABLE `sbtest` ( + `id` int(11) NOT NULL, + `k` int(11) NOT NULL DEFAULT '0', + `c` char(120) NOT NULL DEFAULT '', + `pad` char(60) NOT NULL DEFAULT '', + PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */, + KEY `k_1` (`k`) + /* Key `k_1` (`k`, `c`) GLOBAL */ +) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin +/* Partition by hash(`id`) partitions 100 */ +/* Partition by range(`id`) xxxx */ +``` + +The workload SQL is as follows: + +```sql +SELECT id, k, c, pad +FROM sbtest +WHERE k IN (xx, xx, xx) +``` + +Range Partition (100 partitions): + +| Table type | Concurrency 1 | Concurrency 32 | Concurrency 64 | Average RU | +| --------------------------------------------------------------------- | ------------- | -------------- | -------------- | ---------- | +| Clustered non-partitioned table | 225 | 19,999 | 30,293 | 7.92 | +| Clustered table range partitioned by PK | 68 | 480 | 511 | 114.87 | +| Clustered table range partitioned by PK, with Global Index on `k`, `c` | 207 | 17,798 | 27,707 | 11.73 | + +Hash Partition (100 partitions): + +| Table type | Concurrency 1 | Concurrency 32 | Concurrency 64 | Average RU | +| -------------------------------------------------------------------- | ------------- | -------------- | -------------- | ---------- | +| Clustered non-partitioned table | 166 | 20,361 | 28,922 | 7.86 | +| Clustered table hash partitioned by PK | 60 | 244 | 283 | 119.73 | +| Clustered table hash partitioned by PK, with Global Index on `k`, `c` | 156 | 18,233 | 15,581 | 10.77 | + +From the preceding tests, it is evident that in high-concurrency environments, global indexes can significantly improve the query performance of partitioned tables, with performance gains of up to 50 times. Additionally, global indexes substantially reduce resource (RU) consumption. As the number of partitions increases, the performance benefits of global indexes become even more obvious. 
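To check whether a query of the benchmark's shape can use a global index in your own environment, a minimal sketch along the following lines might help. It assumes a TiDB version in which non-unique global indexes are available. The table name `sbtest_demo` and the literal values in the `IN` list are hypothetical and chosen only for illustration. The plan output is not shown because it depends on the TiDB version, the data, and the statistics.

```sql
-- Hypothetical table modeled on the sysbench schema used in the tests above,
-- hash-partitioned on the primary key with a global index on (k, c).
CREATE TABLE sbtest_demo (
    id INT NOT NULL,
    k INT NOT NULL DEFAULT 0,
    c CHAR(120) NOT NULL DEFAULT '',
    pad CHAR(60) NOT NULL DEFAULT '',
    PRIMARY KEY (id) CLUSTERED,
    KEY k_1 (k, c) GLOBAL
) PARTITION BY HASH (id) PARTITIONS 100;

-- Inspect the execution plan for a query shaped like the workload SQL.
-- The values 1, 2, 3 are placeholders.
EXPLAIN SELECT id, k, c, pad FROM sbtest_demo WHERE k IN (1, 2, 3);

-- Confirm that k_1 is recorded as a global index (IS_GLOBAL = 1).
SELECT key_name, column_name, is_global
FROM information_schema.tidb_indexes
WHERE table_schema = DATABASE() AND table_name = 'sbtest_demo';
```

If the plan reads from `k_1`, the query filters on the non-partitioning column `k` through the table-level index instead of through one local index per partition, which is the behavior the preceding tests measure.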
diff --git a/media/global-index-mechanism.png b/media/global-index-mechanism.png new file mode 100644 index 0000000000000..3438b236f5893 Binary files /dev/null and b/media/global-index-mechanism.png differ diff --git a/media/global-index-vs-local-index.png b/media/global-index-vs-local-index.png new file mode 100644 index 0000000000000..a84c65a1214dd Binary files /dev/null and b/media/global-index-vs-local-index.png differ diff --git a/partitioned-table.md b/partitioned-table.md index 76c5a91fdf856..05c9ed5fe661b 100644 --- a/partitioned-table.md +++ b/partitioned-table.md @@ -1197,7 +1197,7 @@ ALTER TABLE member_level PARTITION BY RANGE(level) PARTITION pMax VALUES LESS THAN (MAXVALUE)); ``` -When partitioning a non-partitioned table or repartitioning an already partitioned table, you can update the indexes to be global or local as needed: +When partitioning a non-partitioned table or repartitioning an already partitioned table, you can update the indexes to be [global indexes](/global-indexes.md) or local indexes as needed: ```sql CREATE TABLE t1 ( @@ -1491,7 +1491,7 @@ This section discusses the relationship of partitioning keys with primary keys a > **Note:** > -> You can ignore this rule when using [global indexes](#global-indexes). +> You can ignore this rule when using [global indexes](/global-indexes.md). For example, the following table creation statements are invalid: @@ -1702,103 +1702,7 @@ ERROR 8264 (HY000): Global Index is needed for index 'a', since the unique index ### Global indexes -Before the introduction of global indexes, TiDB created a local index for each partition, leading to [a limitation](#partitioning-keys-primary-keys-and-unique-keys) that primary keys and unique keys had to include the partition key to ensure data uniqueness. Additionally, when querying data across multiple partitions, TiDB needed to scan the data of each partition to return results. - -To address these issues, TiDB introduces the global indexes feature in v8.3.0. 
A global index covers the data of the entire table with a single index, allowing primary keys and unique keys to maintain global uniqueness without including all partition keys. Moreover, global indexes can access index data across multiple partitions in a single operation instead of looking up the local index for each partition, significantly improving query performance for non-partitioned keys. Starting from v9.0.0, non-unique indexes can also be created as global indexes. - -To create a global index, you can add the `GLOBAL` keyword in the index definition. - -> **Note:** -> -> Global indexes affect partition management. `DROP`, `TRUNCATE`, and `REORGANIZE PARTITION` operations also trigger updates to table-level global indexes, meaning that these DDL operations will only return results after the global indexes of the corresponding tables are fully updated. - -```sql -CREATE TABLE t1 ( - col1 INT NOT NULL, - col2 DATE NOT NULL, - col3 INT NOT NULL, - col4 INT NOT NULL, - UNIQUE KEY uidx12(col1, col2) GLOBAL, - UNIQUE KEY uidx3(col3), - KEY idx1(col1) GLOBAL -) -PARTITION BY HASH(col3) -PARTITIONS 4; -``` - -In the preceding example, the unique index `uidx12` and non-unique index `idx1` are global indexes, while `uidx3` is a regular unique index. - -Note that a **clustered index** cannot be a global index, as shown in the following example: - -```sql -CREATE TABLE t2 ( - col1 INT NOT NULL, - col2 DATE NOT NULL, - PRIMARY KEY (col2) CLUSTERED GLOBAL -) PARTITION BY HASH(col1) PARTITIONS 5; -``` - -``` -ERROR 1503 (HY000): A CLUSTERED INDEX must include all columns in the table's partitioning function -``` - -The reason is that if the clustered index is a global index, the table will no longer be partitioned. This is because the key of the clustered index is also the record key at the partition level, but the global index is at the table level, which causes a conflict. 
If you need to set the primary key as a global index, you must explicitly define it as a non-clustered index, for example, `PRIMARY KEY(col1, col2) NONCLUSTERED GLOBAL`. - -You can identify a global index by the `GLOBAL` index option in the [`SHOW CREATE TABLE`](/sql-statements/sql-statement-show-create-table.md) output. - -```sql -SHOW CREATE TABLE t1\G -``` - -``` - Table: t1 -Create Table: CREATE TABLE `t1` ( - `col1` int NOT NULL, - `col2` date NOT NULL, - `col3` int NOT NULL, - `col4` int NOT NULL, - UNIQUE KEY `uidx12` (`col1`,`col2`) /*T![global_index] GLOBAL */, - UNIQUE KEY `uidx3` (`col3`), - KEY `idx1` (`col1`) /*T![global_index] GLOBAL */ -) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin -PARTITION BY HASH (`col3`) PARTITIONS 4 -1 row in set (0.00 sec) -``` - -Alternatively, you can query the [`INFORMATION_SCHEMA.TIDB_INDEXES`](/information-schema/information-schema-tidb-indexes.md) table and check the `IS_GLOBAL` column in the output. - -```sql -SELECT * FROM INFORMATION_SCHEMA.TIDB_INDEXES WHERE table_name='t1'; -``` - -``` -+--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ -| TABLE_SCHEMA | TABLE_NAME | NON_UNIQUE | KEY_NAME | SEQ_IN_INDEX | COLUMN_NAME | SUB_PART | INDEX_COMMENT | Expression | INDEX_ID | IS_VISIBLE | CLUSTERED | IS_GLOBAL | -+--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ -| test | t1 | 0 | uidx12 | 1 | col1 | NULL | | NULL | 1 | YES | NO | 1 | -| test | t1 | 0 | uidx12 | 2 | col2 | NULL | | NULL | 1 | YES | NO | 1 | -| test | t1 | 0 | uidx3 | 1 | col3 | NULL | | NULL | 2 | YES | NO | 0 | -| test | t1 | 1 | idx1 | 1 | col1 | NULL | | NULL | 3 | YES | NO | 1 | 
-+--------------+------------+------------+----------+--------------+-------------+----------+---------------+------------+----------+------------+-----------+-----------+ -3 rows in set (0.00 sec) -``` - -When partitioning a non-partitioned table or repartitioning an already partitioned table, you can update the indexes to be global indexes or local indexes as needed. - -For example, the following SQL statement repartitions table `t1` based on the `col1` column, updates the global indexes `uidx12` and `idx1` to local indexes, and updates the local index `uidx3` to a global index. Because `uidx3` is a unique index on the `col3` column, it must be a global index to ensure the uniqueness of `col3` across all partitions. `uidx12` and `idx1` are indexes on the `col1` column, which means they can be either global or local indexes. - -```sql -ALTER TABLE t1 PARTITION BY HASH (col1) PARTITIONS 3 UPDATE INDEXES (uidx12 LOCAL, uidx3 GLOBAL, idx1 LOCAL); -``` - -#### Limitations of global indexes - -- If the `GLOBAL` keyword is not explicitly specified in the index definition, TiDB creates a local index by default. -- The `GLOBAL` and `LOCAL` keywords only apply to partitioned tables and do not affect non-partitioned tables. In other words, there is no difference between a global index and a local index in non-partitioned tables. -- DDL operations such as `DROP PARTITION`, `TRUNCATE PARTITION`, and `REORGANIZE PARTITION` also trigger updates to global indexes. These DDL operations need to wait for the global index updates to complete before returning results, which increases the execution time accordingly. This is particularly evident in data archiving scenarios, such as `DROP PARTITION` and `TRUNCATE PARTITION`. Without global indexes, these operations can typically complete immediately. However, with global indexes, the execution time increases as the number of indexes that need to be updated grows. 
-- Tables with global indexes do not support the `EXCHANGE PARTITION` operation. -- By default, the primary key of a partitioned table is a clustered index and must include the partition key. If you require the primary key to exclude the partition key, you can explicitly specify the primary key as a non-clustered global index when creating the table, for example, `PRIMARY KEY(col1, col2) NONCLUSTERED GLOBAL`. -- If a global index is added to an expression column, or a global index is also a prefix index (for example `UNIQUE KEY idx_id_prefix (id(10)) GLOBAL`), you need to collect statistics manually for this global index. +For more information about global indexes, see [Global Indexes](/global-indexes.md). ### Partitioning limitations relating to functions diff --git a/placement-rules-in-sql.md b/placement-rules-in-sql.md index 8e45aff495fbe..d593821b5154a 100644 --- a/placement-rules-in-sql.md +++ b/placement-rules-in-sql.md @@ -312,7 +312,7 @@ PARTITION BY RANGE( YEAR(purchased) ) ( ); ``` -If no placement policy is specified for a partition in a table, the partition attempts to inherit the policy (if any) from the table. If the table has a [global index](/partitioned-table.md#global-indexes), the index will apply the same placement policy as the table. In the preceding example: +If no placement policy is specified for a partition in a table, the partition attempts to inherit the policy (if any) from the table. If the table has a [global index](/global-indexes.md), the index will apply the same placement policy as the table. In the preceding example: - The `p0` partition will apply the `storageforhistorydata` policy. - The `p4` partition will apply the `storagefornewdata` policy. 
diff --git a/releases/release-8.3.0.md b/releases/release-8.3.0.md index fe1154db4a85e..981050081fe26 100644 --- a/releases/release-8.3.0.md +++ b/releases/release-8.3.0.md @@ -119,7 +119,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.3/quick-start-with- Starting with v8.3.0, the global index feature is released as an experimental feature. You can explicitly create a global index for a partitioned table with the keyword `Global` to remove the restriction that the unique key must use every column in the table's partitioning expression, to meet flexible business needs. Global indexes also enhance the performance of queries that do not include partition keys. - For more information, see [documentation](/partitioned-table.md#global-indexes). + For more information, see [documentation](/global-indexes.md). ### Reliability diff --git a/releases/release-8.4.0.md b/releases/release-8.4.0.md index 6bffd4b467d21..08673baab6045 100644 --- a/releases/release-8.4.0.md +++ b/releases/release-8.4.0.md @@ -134,7 +134,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.4/quick-start-with- In v8.4.0, this feature becomes generally available (GA). You can use the keyword `GLOBAL` to create a global index, instead of setting the system variable [`tidb_enable_global_index`](/system-variables.md#tidb_enable_global_index-new-in-v760) to enable the global index feature. Starting from v8.4.0, this system variable is deprecated and is always `ON`. - For more information, see [documentation](/partitioned-table.md#global-indexes). + For more information, see [documentation](/global-indexes.md). 
* Improve query performance for cached tables in some scenarios [#43249](https://github.com/pingcap/tidb/issues/43249) @[tiancaiamao](https://github.com/tiancaiamao) @@ -282,7 +282,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.4/quick-start-with- |--------|------------------------------|------| | `log_bin` | Deleted | In v8.4.0, [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) is removed. This variable indicates whether TiDB Binlog is used, and is deleted starting from v8.4.0. | | `sql_log_bin` | Deleted | In v8.4.0, [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) is removed. This variable indicates whether to write changes to TiDB Binlog or not, and is deleted starting from v8.4.0. | -| [`tidb_enable_global_index`](/system-variables.md#tidb_enable_global_index-new-in-v760) | Deprecated | In v8.4.0, this variable is deprecated. Its value will be fixed to the default value `ON`, that is, [global index](/partitioned-table.md#global-indexes) is enabled by default. You only need to add the keyword `GLOBAL` to the corresponding column when executing `CREATE TABLE` or `ALTER TABLE` to create a global index. | +| [`tidb_enable_global_index`](/system-variables.md#tidb_enable_global_index-new-in-v760) | Deprecated | In v8.4.0, this variable is deprecated. Its value will be fixed to the default value `ON`, that is, [global index](/global-indexes.md) is enabled by default. You only need to add the keyword `GLOBAL` to the corresponding column when executing `CREATE TABLE` or `ALTER TABLE` to create a global index. | | [`tidb_enable_list_partition`](/system-variables.md#tidb_enable_list_partition-new-in-v50) | Deprecated | In v8.4.0, this variable is deprecated. Its value will be fixed to the default value `ON`, that is, [list partitioning](/partitioned-table.md#list-partitioning) is enabled by default. 
| | [`tidb_enable_table_partition`](/system-variables.md#tidb_enable_table_partition) | Deprecated | In v8.4.0, this variable is deprecated. Its value will be fixed to the default value `ON`, that is, [table partitioning](/partitioned-table.md) is enabled by default. | | [`tidb_analyze_partition_concurrency`](/system-variables.md#tidb_analyze_partition_concurrency) | Modified | Changes the value range from `[1, 18446744073709551615]` to `[1, 128]`. | diff --git a/releases/release-8.5.0.md b/releases/release-8.5.0.md index 64dee84f07025..40ac1608c264b 100644 --- a/releases/release-8.5.0.md +++ b/releases/release-8.5.0.md @@ -51,7 +51,7 @@ Compared with the previous LTS 8.1.0, 8.5.0 includes new features, improvements, Instance-level plan cache allows all sessions within the same TiDB instance to share the plan cache. Compared with session-level plan cache, this feature reduces SQL compilation time by caching more execution plans in memory, decreasing overall SQL execution time. It improves OLTP performance and throughput while providing better control over memory usage and enhancing database stability. - Global indexes for partitioned tables (GA in v8.4.0) + Global indexes for partitioned tables (GA in v8.4.0) Global indexes can effectively improve the efficiency of retrieving non-partitioned columns, and remove the restriction that a unique key must contain the partition key. This feature extends the usage scenarios of TiDB partitioned tables, improves the performance of partitioned tables, and reduces resource consumption in certain query scenarios. diff --git a/sql-statements/sql-statement-add-column.md b/sql-statements/sql-statement-add-column.md index 2c67d1fc2cea0..dc74af490d7f2 100644 --- a/sql-statements/sql-statement-add-column.md +++ b/sql-statements/sql-statement-add-column.md @@ -89,7 +89,7 @@ mysql> SELECT * FROM t1; * Adding a new column and setting it to the `PRIMARY KEY` is not supported. 
* Adding a new column and setting it to `AUTO_INCREMENT` is not supported. * There are limitations on adding generated columns, refer to: [generated column limitations](/generated-columns.md#limitations). -* Setting a [global index](/partitioned-table.md#global-indexes) by specifying `PRIMARY KEY` or `UNIQUE INDEX` as `GLOBAL` when you add a new column is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. +* Setting a [global index](/global-indexes.md) by specifying `PRIMARY KEY` or `UNIQUE INDEX` as `GLOBAL` when you add a new column is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. ## See also diff --git a/sql-statements/sql-statement-add-index.md b/sql-statements/sql-statement-add-index.md index cec85b275defa..9fdffc984f56d 100644 --- a/sql-statements/sql-statement-add-index.md +++ b/sql-statements/sql-statement-add-index.md @@ -105,7 +105,7 @@ mysql> EXPLAIN SELECT * FROM t1 WHERE c1 = 3; * Descending indexes are not supported (similar to MySQL 5.7). * Adding the primary key of the `CLUSTERED` type to a table is not supported. For more details about the primary key of the `CLUSTERED` type, refer to [clustered index](/clustered-indexes.md). -* Setting a `PRIMARY KEY` or `UNIQUE INDEX` as a [global index](/partitioned-table.md#global-indexes) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. +* Setting a `PRIMARY KEY` or `UNIQUE INDEX` as a [global index](/global-indexes.md) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. 
## See also diff --git a/sql-statements/sql-statement-create-index.md b/sql-statements/sql-statement-create-index.md index e5a4c832c7937..ae37ca6f15550 100644 --- a/sql-statements/sql-statement-create-index.md +++ b/sql-statements/sql-statement-create-index.md @@ -397,7 +397,7 @@ The system variables associated with the `CREATE INDEX` statement are `tidb_ddl_ * Expression indexes are incompatible with views. When a query is executed using a view, the expression index cannot be used at the same time. * Expression indexes have compatibility issues with bindings. When the expression of an expression index has a constant, the binding created for the corresponding query expands its scope. For example, suppose that the expression in the expression index is `a+1`, and the corresponding query condition is `a+1 > 2`. In this case, the created binding is `a+? > ?`, which means that the query with the condition such as `a+2 > 2` is also forced to use the expression index and results in a poor execution plan. In addition, this also affects the baseline capturing and baseline evolution in SQL Plan Management (SPM). * The data written with multi-valued indexes must exactly match the defined data type. Otherwise, data writes fail. For details, see [create multi-valued indexes](/sql-statements/sql-statement-create-index.md#create-multi-valued-indexes). -* Setting a `UNIQUE KEY` as a [global index](/partitioned-table.md#global-indexes) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. +* Setting a `UNIQUE KEY` as a [global index](/global-indexes.md) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. 
## See also diff --git a/sql-statements/sql-statement-create-table.md b/sql-statements/sql-statement-create-table.md index 6e8e3d355df09..0c0996be7755d 100644 --- a/sql-statements/sql-statement-create-table.md +++ b/sql-statements/sql-statement-create-table.md @@ -279,7 +279,7 @@ mysql> DESC t1; > > Currently, only {{{ .starter }}} and {{{ .essential }}} clusters in certain AWS regions support [`FULLTEXT` syntax and indexes](https://docs.pingcap.com/tidbcloud/vector-search-full-text-search-sql). -* Setting a `PRIMARY KEY` or `UNIQUE INDEX` as a [global index](/partitioned-table.md#global-indexes) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. +* Setting a `PRIMARY KEY` or `UNIQUE INDEX` as a [global index](/global-indexes.md) with the `GLOBAL` index option is a TiDB extension for [partitioned tables](/partitioned-table.md) and is not compatible with MySQL. diff --git a/system-variables.md b/system-variables.md index daf752224e3cd..eb7a567ec74ae 100644 --- a/system-variables.md +++ b/system-variables.md @@ -2192,8 +2192,8 @@ Assume that you have a cluster with 4 TiDB nodes and multiple TiKV nodes. In thi - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No - Type: Boolean - Default value: `ON` -- This variable controls whether to support creating [global indexes](/partitioned-table.md#global-indexes) for partitioned tables. When this variable is enabled, TiDB allows you to create unique indexes that **do not include all the columns used in the partition expressions** by specifying `GLOBAL` in the index definition. -- This variable is deprecated since v8.4.0. Its value will be fixed to the default value `ON`, that is, [global indexes](/partitioned-table.md#global-indexes) is enabled by default. +- This variable controls whether to support creating [global indexes](/global-indexes.md) for partitioned tables. 
When this variable is enabled, TiDB allows you to create unique indexes that **do not include all the columns used in the partition expressions** by specifying `GLOBAL` in the index definition. +- This variable is deprecated since v8.4.0. Its value will be fixed to the default value `ON`, that is, [global indexes](/global-indexes.md) is enabled by default. ### tidb_enable_lazy_cursor_fetch New in v8.3.0