-
Notifications
You must be signed in to change notification settings - Fork 67
all: make coalesce a logical node #587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
cba94b5
to
c8c546f
Compare
Signed-off-by: Michael Hoffmann <[email protected]>
c8c546f
to
7d35322
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This is epic!
Just some small questions
case *VectorSelector: | ||
if parent != nil { | ||
// we coalesce matrix selectors in a different branch | ||
if _, ok := (*parent).(*MatrixSelector); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about subqueries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can parent of vectorselector be a subquery? I guess you could have last_over_time(http_request_total[10m:5s])
.
Exprs []Node `json:"-"` | ||
} | ||
|
||
func (c *Coalesce) ReturnType() parser.ValueType { return parser.ValueTypeVector } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should do Exprs[0].ReturnType()?
@@ -20,6 +20,10 @@ type Options struct { | |||
DecodingConcurrency int | |||
} | |||
|
|||
func (o *Options) NumShards() int { | |||
return max(o.DecodingConcurrency, 1) | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking this shard size could be different for different operators? I wonder if the default value works for all operators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default is numCores / 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I meant if different operators should use the same concurrency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!! Just one nit/question.
"github.com/thanos-io/promql-engine/query" | ||
) | ||
|
||
type CoalesceOptimizer struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the optimizer be called something like ConcurrentDecodeOptimizer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is to allow for things like
sum(
coalesce(
sum(shard_a),
sum(shard_b)
)
)
down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should a different optimizer should be responsible for sharding aggregations?
@@ -20,6 +20,10 @@ type Options struct { | |||
DecodingConcurrency int | |||
} | |||
|
|||
func (o *Options) NumShards() int { | |||
return max(o.DecodingConcurrency, 1) | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default is numCores / 2?
This change potentially modifies the contract with the scanner. I would like to test it with our downstream project this week before we merge it. |
v := n.VectorSelector | ||
lm = append(v.LabelMatchers, v.Filters...) | ||
default: | ||
logicalplan.TraverseBottomUp(nil, &o.funcExpr.Args[0], func(parent, node *logicalplan.Node) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be in the logical plan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sector labels of the vector sector of the absent function? I think we could put it into the functioncall node and populate it in the plan yeah
@@ -219,6 +219,19 @@ func unmarshalNode(data []byte) (Node, error) { | |||
return nil, err | |||
} | |||
return u, nil | |||
case CoalesceNode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this cause remote execution to be planned in the central node?
arg2, | ||
opts, | ||
logicalNode.Range, | ||
vs.Offset, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could simplify this by passing in the entire vs
object.
Making "coalesce" a logical node, allows us to move it around in the execution tree and eventually coalesce later and evaluate more of the query concurrently.
This will allow us to subsequently treat local execution almost like remote execution, so that we can compute aggregations in shards.