Member

@zhang2014 zhang2014 commented Dec 28, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

feat(query): add group-by type shrinking rule

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@zhang2014 zhang2014 marked this pull request as draft December 28, 2025 14:05
@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Dec 28, 2025
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 127 to 131
for rewrite in rewrites.iter() {
    new_group_items[rewrite.position].index = rewrite.shrink_index;
    new_group_items[rewrite.position].scalar = ScalarExpr::BoundColumnRef(BoundColumnRef {
        span: None,
        column: rewrite.shrink_binding.clone(),

[P1] Missing stats for shrink group key causes aggregate panic

When the rule rewrites a group key to use the new shrink_index (group_items[position].index = rewrite.shrink_index), no column statistics are created for the derived column, because EvalScalar's derive_stats returns the child stats unchanged. Aggregate::derive_agg_stats later looks up column_stats[&group.index] and unwraps the result, so any aggregate that triggers this shrink will panic during planning: the stats map has no entry for the new index. Either add stats for the shrink column, or avoid replacing the group index in stats-based code paths.
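
The failure mode described above, and the two suggested remedies, can be sketched in isolation. This is a minimal illustration rather than Databend's actual code: `ColumnStat`, `ColumnStats`, and all three functions are hypothetical stand-ins, and copying the source column's stats assumes the shrink cast preserves the number of distinct values.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-column statistics (not Databend's real type).
#[derive(Clone, Debug, PartialEq)]
pub struct ColumnStat {
    /// Estimated number of distinct values.
    pub ndv: f64,
}

/// Hypothetical stats map keyed by column index.
pub type ColumnStats = HashMap<usize, ColumnStat>;

/// Mirrors the pattern the review flags: indexing the map directly.
/// Panics when `group_index` has no entry, e.g. a freshly created shrink index.
pub fn group_ndv_unchecked(stats: &ColumnStats, group_index: usize) -> f64 {
    stats[&group_index].ndv
}

/// Remedy A (assumed): tolerate a missing entry by returning a fallback estimate.
pub fn group_ndv_or(stats: &ColumnStats, group_index: usize, fallback: f64) -> f64 {
    stats.get(&group_index).map(|s| s.ndv).unwrap_or(fallback)
}

/// Remedy B (assumed): register stats for the derived column by copying the
/// source column's stats. Returns false if the source has no stats either.
pub fn register_shrink_stats(
    stats: &mut ColumnStats,
    source_index: usize,
    shrink_index: usize,
) -> bool {
    match stats.get(&source_index).cloned() {
        Some(stat) => {
            stats.insert(shrink_index, stat);
            true
        }
        None => false,
    }
}
```

With a map that only holds stats for the original column, `group_ndv_unchecked` panics on the shrink index, `group_ndv_or` returns the fallback, and after `register_shrink_stats` the unchecked lookup succeeds.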

@zhang2014 zhang2014 added the ci-cloud Build docker image for cloud test label Dec 29, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-19177-b688014-1766978994

note: this image tag is only available for internal use.

@zhang2014 zhang2014 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Jan 3, 2026
@github-actions
Contributor

github-actions bot commented Jan 3, 2026

Docker Image for PR

  • tag: pr-19177-360d488-1767421366

note: this image tag is only available for internal use.

@github-actions
Contributor

github-actions bot commented Jan 3, 2026

Docker Image for PR

  • tag: pr-19177-a83a2e5-1767421390

note: this image tag is only available for internal use.

@github-actions
Contributor

github-actions bot commented Jan 3, 2026

🤖 CI Job Analysis

Workflow: 20672887061

📊 Summary

  • Total Jobs: 83
  • Failed Jobs: 29
  • Retryable: 0
  • Code Issues: 29

NO RETRY NEEDED

All failures appear to be code/test issues requiring manual fixes.

🔍 Job Details

  • linux / sqllogic / standalone_minio_with_bendsave: Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, http, parquet): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, http, native): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, hybrid, parquet): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, hybrid, native): Not retryable (Code/Test)
  • linux / sqllogic / standalone (crdb, 2c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (crdb, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (base, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (standalone, 2c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (tpcds, 4c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (query, 4c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (base, 2c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (tpcds, 4c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (query, 4c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (standalone, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (tpch, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (query, 4c, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (base, 2c, 2, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (query, 4c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (tpcds, 4c, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (crdb, 2c, 2, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (tpch, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (base, 2c, 2, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (tpch, 2c, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (cluster, 2c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (tpcds, 4c, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (crdb, 2c, 2, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / cluster (cluster, 2c, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (tpch, 2c, http): Not retryable (Code/Test)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).

Labels

  • ci-cloud: Build docker image for cloud test
  • pr-feature: this PR introduces a new feature to the codebase
