Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(frontend): initially introduce table def sql purification #19949

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

BugenZhao
Copy link
Member

@BugenZhao BugenZhao commented Dec 27, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

See #17472 for the background. This PR initially introduces SQL purification for table definition, and adopt it in SHOW CREATE TABLE. Further adoption (like for replacing table plan) will be done in future PRs, as they require additional work that is not directly relevant to the current scope to ensure a complete end-to-end implementation.

It turns out that we only need to restore the ColumnDef (specifically, the DataType, which is already contributed by @StrikeW). This makes the procedure much simpler than initially thought.

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

@BugenZhao BugenZhao force-pushed the bz/purify-sql-part-1 branch 2 times, most recently from c2b4de2 to ca9e6ef Compare December 30, 2024 05:37
@BugenZhao BugenZhao marked this pull request as ready for review December 30, 2024 05:47
Base automatically changed from bz/zealous-wombat to main December 30, 2024 06:19
@graphite-app graphite-app bot requested a review from a team December 30, 2024 06:19
@BugenZhao BugenZhao force-pushed the bz/purify-sql-part-1 branch from 45ab48e to 073b15c Compare December 30, 2024 09:05
@BugenZhao BugenZhao requested a review from fuyufjh December 30, 2024 09:35
@BugenZhao BugenZhao force-pushed the bz/purify-sql-part-1 branch from 7922a7b to 7fd1ad9 Compare December 31, 2024 07:40
@graphite-app graphite-app bot requested a review from a team December 31, 2024 08:23
Copy link
Contributor

@StrikeW StrikeW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/frontend/src/catalog/purify.rs Show resolved Hide resolved
Comment on lines +15 to +17
show create table ctas;
----
public.ctas CREATE TABLE ctas (v0 INT, v1 NUMERIC, v2 TIMESTAMP, v3 TIMESTAMP WITH TIME ZONE, v4 CHARACTER VARYING[], v5 STRUCT<i BIGINT, j STRUCT<a BIGINT, b CHARACTER VARYING>>, v6 rw_int256, v7 MAP(CHARACTER VARYING,INT))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the change, how/is it possible (is it meaningful?) to get the "original create sql" for CTAS and schema registry? It seems useful for debugging/development purpose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be done via inspecting definition field of the metadata (like through the dashboard). However, I suppose it's indeed not that meaningful, as altering a CTAS will cause the purified definition to be persisted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as altering a CTAS will cause the purified definition to be persisted.

Note: This is not the current behavior as purified definition is only used for SHOW but not the base for schema change replanning.


// Schema inferred. Now derive the missing columns and constraints.
// First, remove the wildcard from the definition.
*wildcard_idx = None;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need tests for wildcard, and generated column, include column ,etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it's used solely for observability, so we might introduce comprehensive tests at a later stage.

wildcard, and generated column

Added in 265f6f7.

include column

Will be included in #19965.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license-eye has checked 5535 files.

Valid Invalid Ignored Fixed
2332 2 3201 0
Click to see the invalid file list
  • src/frontend/src/catalog/purify.rs
  • src/frontend/src/utils/data_type.rs
Use this command to fix any missing license headers
```bash

docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix

</details>

src/frontend/src/catalog/purify.rs Outdated Show resolved Hide resolved
src/frontend/src/utils/data_type.rs Outdated Show resolved Hide resolved
@BugenZhao BugenZhao force-pushed the bz/purify-sql-part-1 branch from 265f6f7 to efa9a22 Compare January 2, 2025 09:45
@BugenZhao BugenZhao requested a review from xxchan January 3, 2025 02:43
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
@BugenZhao BugenZhao force-pushed the bz/purify-sql-part-1 branch from 2869377 to 5a21eb2 Compare January 3, 2025 03:29
Comment on lines +282 to +286
pub fn create_sql_purified(&self) -> String {
self.create_sql_ast_purified()
.map(|stmt| stmt.to_string())
.unwrap_or_else(|_| self.create_sql())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we still return the result here? Not sure if we can guarantee this for all exist tables created previously.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By falling back to self.create_sql() (i.e. self.definition) we ensure that the behavior gets strictly better compared to prior to this PR. 🤔 Not sure what you mean by "guarantee this"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants