
feat(optimizer): introduce watermark group #19894

Open · stdrc wants to merge 15 commits into main from rc/watermark-group

Conversation

stdrc (Member) commented on Dec 23, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Background

Previously we didn't record the derivation relation among different watermark columns. For instance, window_start and window_end generated by tumble/hop time window functions are always related to each other and to the original event time column. However, when we planned an EOWC query, we lost this information and hence couldn't plan successfully if more than one of these columns survived (i.e., was not pruned) at the same time.

Changes

This PR introduces a new way to represent watermark columns in plan nodes: watermark groups. We organize watermark columns into several groups, within each of which all watermark columns are related. With this, when planning EOWC queries, we can handle multiple watermark columns as long as they all belong to the same group. For example, for the following query, we no longer warn the user that there is more than one watermark column.

```sql
create materialized view mv as
select
  foo, ts, window_start, window_end
from tumble(t, ts, interval '5 mins')
emit on window close;
```
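To make the grouping concrete, here is a minimal Rust sketch of the idea. This is not the PR's actual code: `WatermarkColumns`, `WatermarkGroupId`, and `same_group` are illustrative stand-ins for whatever lives in src/frontend/src/optimizer/property/watermark_columns.rs.

```rust
use std::collections::BTreeMap;

/// Identifier of a watermark group; columns sharing a group ID are
/// known to be derived from the same source watermark. (Hypothetical.)
type WatermarkGroupId = u32;

/// Watermark columns of a plan node, keyed by output column index.
#[derive(Clone, Debug, Default)]
struct WatermarkColumns {
    col_to_group: BTreeMap<usize, WatermarkGroupId>,
}

impl WatermarkColumns {
    /// Record that output column `col_idx` carries a watermark
    /// belonging to `group`.
    fn insert(&mut self, col_idx: usize, group: WatermarkGroupId) {
        self.col_to_group.insert(col_idx, group);
    }

    /// Two watermark columns can be handled together in an EOWC plan
    /// iff they belong to the same group.
    fn same_group(&self, a: usize, b: usize) -> bool {
        match (self.col_to_group.get(&a), self.col_to_group.get(&b)) {
            (Some(ga), Some(gb)) => ga == gb,
            _ => false,
        }
    }
}

fn main() {
    let mut wc = WatermarkColumns::default();
    // `ts`, `window_start`, `window_end` from the same tumble() call
    // all land in one group, so EOWC planning can keep all of them.
    wc.insert(1, 0); // ts
    wc.insert(2, 0); // window_start
    wc.insert(3, 0); // window_end
    assert!(wc.same_group(2, 3));
}
```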

Of the changes, ~400 LoC are substantive; the rest are planner tests. The core changes all happen in StreamXxx::new, with the watermark derivation logic itself unchanged. The main contribution is to determine, for every stream node, which of its output watermark columns are related, as sketched below.
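As a hedged illustration of that per-node derivation, a pass-through node (think of a StreamProject that forwards columns unchanged) could propagate groups like this. Again hypothetical code, reusing the `WatermarkColumns` sketch above; the real logic in StreamXxx::new is more involved:

```rust
/// Hypothetical helper: derive a projection-like node's output
/// watermark groups from its input. `mapping[out_idx] = Some(in_idx)`
/// means output column `out_idx` simply forwards input column `in_idx`.
fn derive_output_watermarks(
    input: &WatermarkColumns,
    mapping: &[Option<usize>],
) -> WatermarkColumns {
    let mut out = WatermarkColumns::default();
    for (out_idx, in_idx) in mapping.iter().enumerate() {
        if let Some(in_idx) = in_idx {
            // A pass-through column keeps its group ID, so relatedness
            // established upstream (e.g. by tumble()) is preserved.
            if let Some(&group) = input.col_to_group.get(in_idx) {
                out.insert(out_idx, group);
            }
        }
    }
    out
}
```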

Future work

  1. We still use FixedBitSet to record a flattened watermark_columns field in TableCatalog and TableDesc, and hence in the Table protobuf. As a result, when we create an MV on top of another MV, all the watermark columns of the base MV are considered to belong to different watermark groups, regardless of whether they are actually related (see the sketch after this list). I will change the table catalog field later to fix this.
  2. We still can't do `create mv as select window_start, window_end, count(*) as cnt from tumble(t, ts, ...) group by window_start, window_end emit on window close`, because I don't want to pack too many changes into this single PR. I will modify StreamHashAgg later to support it. (Resolved in feat(eowc): allow multiple watermark columns in eowc hash agg #19998)
  3. Since we don't support reusing duplicated plan nodes in streaming plans, when the two sides of a join both scan the same table with join condition left.window = right.window, we cannot know that the two window columns are derived from the same source watermark column. This is because the watermark group ID is allocated and assigned during the optimization phase and is not persisted.
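To illustrate item 1: flattening into a bitset keeps which columns carry watermarks but drops their group IDs, so a downstream MV can no longer tell which columns are related. A minimal sketch of the information loss, assuming the `fixedbitset` crate and the hypothetical `WatermarkColumns` type sketched earlier:

```rust
use fixedbitset::FixedBitSet;

/// Hypothetical flattening, mirroring what TableCatalog stores today:
/// each watermark column sets one bit, and the group ID is discarded,
/// which is exactly the information loss described in item 1.
fn flatten(wc: &WatermarkColumns, num_cols: usize) -> FixedBitSet {
    let mut bits = FixedBitSet::with_capacity(num_cols);
    for &col in wc.col_to_group.keys() {
        bits.insert(col); // group membership is lost here
    }
    bits
}
```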

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note


@stdrc stdrc force-pushed the rc/watermark-group branch from 7711584 to 68ac534 Compare December 23, 2024 07:48
@stdrc stdrc changed the title reorder feat(optimizer): introduce watermark group Dec 23, 2024
@stdrc stdrc force-pushed the rc/watermark-group branch from 43dcde2 to 60ee293 Compare December 31, 2024 07:34
@stdrc stdrc marked this pull request as ready for review December 31, 2024 17:00
@graphite-app graphite-app bot requested a review from a team December 31, 2024 17:21
@chenzl25 (Contributor) left a comment:

> Since we don't support shared node in streaming plan (IIRC)

Do you want this self-join plan with StreamShare?

```yaml
- id: self_join
before:
- create_sources
sql: |
select count(*) cnt from auction A join auction B on A.id = B.id where A.initial_bid = 1 and B.initial_bid = 2;
batch_plan: |-
BatchSimpleAgg { aggs: [sum0(count)] }
└─BatchExchange { order: [], dist: Single }
└─BatchSimpleAgg { aggs: [count] }
└─BatchHashJoin { type: Inner, predicate: id = id, output: [] }
├─BatchExchange { order: [], dist: HashShard(id) }
│ └─BatchFilter { predicate: (initial_bid = 1:Int32) }
│ └─BatchSource { source: auction, columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, extra, _row_id] }
└─BatchExchange { order: [], dist: HashShard(id) }
└─BatchFilter { predicate: (initial_bid = 2:Int32) }
└─BatchSource { source: auction, columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, extra, _row_id] }
stream_plan: |-
StreamMaterialize { columns: [cnt], stream_key: [], pk_columns: [], pk_conflict: NoCheck }
└─StreamProject { exprs: [sum0(count)] }
└─StreamSimpleAgg [append_only] { aggs: [sum0(count), count] }
└─StreamExchange { dist: Single }
└─StreamStatelessSimpleAgg { aggs: [count] }
└─StreamHashJoin [append_only] { type: Inner, predicate: id = id, output: [_row_id, id, _row_id] }
├─StreamExchange { dist: HashShard(id) }
│ └─StreamFilter { predicate: (initial_bid = 1:Int32) }
│ └─StreamShare { id: 4 }
│ └─StreamProject { exprs: [id, initial_bid, _row_id] }
│ └─StreamFilter { predicate: ((initial_bid = 1:Int32) OR (initial_bid = 2:Int32)) }
│ └─StreamRowIdGen { row_id_index: 10 }
│ └─StreamSource { source: auction, columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, extra, _row_id] }
└─StreamExchange { dist: HashShard(id) }
└─StreamFilter { predicate: (initial_bid = 2:Int32) }
└─StreamShare { id: 4 }
└─StreamProject { exprs: [id, initial_bid, _row_id] }
└─StreamFilter { predicate: ((initial_bid = 1:Int32) OR (initial_bid = 2:Int32)) }
└─StreamRowIdGen { row_id_index: 10 }
└─StreamSource { source: auction, columns: [id, item_name, description, initial_bid, reserve, date_time, expires, seller, category, extra, _row_id] }
```

@stdrc stdrc force-pushed the rc/watermark-group branch from 6a5c106 to 0f81283 Compare January 2, 2025 07:34
@github-actions bot left a comment:

license-eye has checked 5536 files.

| Valid | Invalid | Ignored | Fixed |
| ----- | ------- | ------- | ----- |
| 2332  | 2       | 3202    | 0     |
Invalid files:
  • src/common/src/util/functional.rs
  • src/frontend/src/optimizer/property/watermark_columns.rs
Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

src/common/src/util/functional.rs (review thread outdated, resolved)
src/frontend/src/optimizer/property/watermark_columns.rs (review thread outdated, resolved)
Comment on lines +87 to +97
```yaml
eowc_stream_plan: |-
StreamMaterialize { columns: [foo, win1, win2, t._row_id(hidden), t._row_id#1(hidden)], stream_key: [t._row_id, t._row_id#1, win1], pk_columns: [t._row_id, t._row_id#1, win1], pk_conflict: NoCheck, watermark_columns: [win1] }
└─StreamEowcSort { sort_column: $expr1 }
└─StreamProject { exprs: [(t.foo + $expr2) as $expr4, $expr1, $expr3, t._row_id, t._row_id], output_watermarks: [[$expr1], [$expr3]] }
└─StreamHashJoin [window, append_only] { type: Inner, predicate: $expr1 = $expr3, output_watermarks: [[$expr1], [$expr3]], output: [t.foo, $expr1, $expr2, $expr3, t._row_id, t._row_id] }
├─StreamExchange { dist: HashShard($expr1) }
│ └─StreamProject { exprs: [t.foo, TumbleStart(t.ts, '00:05:00':Interval) as $expr1, t._row_id], output_watermarks: [[$expr1]] }
│ └─StreamTableScan { table: t, columns: [t.foo, t.ts, t._row_id], stream_scan_type: ArrangementBackfill, stream_key: [t._row_id], pk: [_row_id], dist: UpstreamHashShard(t._row_id) }
└─StreamExchange { dist: HashShard($expr3) }
└─StreamProject { exprs: [(t.foo + 1:Int32) as $expr2, TumbleStart(t.ts, '00:10:00':Interval) as $expr3, t._row_id], output_watermarks: [[$expr3]] }
└─StreamTableScan { table: t, columns: [t.foo, t.ts, t._row_id], stream_scan_type: ArrangementBackfill, stream_key: [t._row_id], pk: [_row_id], dist: UpstreamHashShard(t._row_id) }
```
stdrc (Member Author):

@chenzl25 In this plan, is it possible to use a shared StreamTableScan on the two sides?

chenzl25 (Contributor):

You can use a CTE to construct a share operator:

```sql
create materialized view v as
with cte as (select foo, window_start as win from tumble(t, ts, interval '5 mins'))
select l.foo + r.foo as foo, l.win as win1, r.win as win2
from cte as l join cte as r on l.win = r.win;
```

stdrc (Member Author):

Thanks!

stdrc added 2 commits January 3, 2025 13:51, signed off by Richard Chien <[email protected]>.
@graphite-app graphite-app bot requested a review from a team January 3, 2025 06:50