Commit 15bdb6a

perf-tuning: add column-prune.md (pingcap#3246)
* add column pruning
* remove category
* Apply suggestions from code review

Co-authored-by: TomShawn <[email protected]>

* update link

Co-authored-by: TomShawn <[email protected]>
1 parent becbadd commit 15bdb6a

2 files changed

+21
-0
lines changed

TOC.md

+1
@@ -106,6 +106,7 @@
  + [SQL Optimization Process](/sql-optimization-concepts.md)
  + Logic Optimization
  + [Subquery Related Optimizations](/subquery-optimization.md)
+ + [Column Pruning](/column-pruning.md)
  + [Decorrelation of Correlated Subquery](/correlated-subquery-optimization.md)
  + [Predicates Push Down](/predicates-push-down.md)
  + [TopN and Limit Push Down](/topn-limit-push-down.md)

column-pruning.md

+20
@@ -0,0 +1,20 @@
---
title: Column Pruning
summary: Learn about the usage of column pruning in TiDB.
---

# Column Pruning

The basic idea of column pruning is that the optimizer does not need to retain columns that no operator uses. Removing these columns reduces I/O resource usage and simplifies subsequent optimization. The following example shows a query with redundant columns:

Suppose table `t` has four columns (`a`, `b`, `c`, and `d`), and you execute the following statement:

{{< copyable "sql" >}}

```sql
select a from t where b > 5;
```

In this query, only columns `a` and `b` are used; columns `c` and `d` are redundant. In the query plan of this statement, the `Selection` operator uses column `b`, so the `DataSource` operator must output columns `a` and `b`. Columns `c` and `d` can be pruned because no operator uses them, so the `DataSource` operator does not need to read them.

Therefore, TiDB performs a top-down scan of the plan during the logical optimization phase and prunes redundant columns to avoid wasting resources. This scan is called "Column Pruning" and corresponds to the `columnPruner` rule. To disable this rule, refer to [The Blocklist of Optimization Rules and Expression Pushdown](/blocklist-control-plan.md).
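
The top-down scan can be illustrated with a toy model. The following Python sketch is not TiDB's implementation: the classes borrow the operator names from the example above, but the `prune` method, the `used` and `output` fields, and the plan shape are assumptions made for illustration only. Each operator tells its child which columns are still needed, and the leaf keeps only those columns.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Leaf operator (hypothetical model): reads columns from a table."""
    table_columns: list
    output: list = field(default_factory=list)

    def prune(self, required):
        # Keep only the columns that some operator above needs, in table order.
        self.output = [c for c in self.table_columns if c in required]


@dataclass
class Selection:
    """Filter operator; `used` holds the columns its predicate references."""
    used: set
    child: object

    def prune(self, required):
        # The child must supply what the parent needs plus what the filter reads.
        self.child.prune(set(required) | self.used)


@dataclass
class Projection:
    """Root operator; `used` holds the columns in the select list."""
    used: set
    child: object

    def prune(self, required):
        # In this toy model the projection only needs its own select-list columns.
        self.child.prune(set(self.used))


# Toy plan for: select a from t where b > 5
source = DataSource(["a", "b", "c", "d"])
plan = Projection({"a"}, Selection({"b"}, source))
plan.prune({"a"})
print(source.output)  # ['a', 'b']
```

In this model, columns `c` and `d` never reach the `DataSource` operator's required set, which mirrors why they can be pruned in the example query.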
