Avoid unnecessary queries when querying a non-existent TraceID #48

Currently, VictoriaTraces (VT) stores a Data-Stream and an Index-Stream. When searching by TraceID, VT first tries to get the startTime of the TraceID from the Index-Stream, which reduces the number of partitions it has to scan in the Data-Stream.

If the startTime of the TraceID is not found in the Index-Stream, VT scans the Data-Stream in 90-minute decrements, starting from now, until the trace is found or the query time reaches `1970-1-1 8:0:0` (the Unix epoch, shown here in UTC+8). In cluster mode this generates a large number of RPCs to vtstorage. See https://github.com/VictoriaMetrics/VictoriaTraces/blob/master/app/vtselect/traces/query/query.go#L432

This always happens when querying a non-existent TraceID, and I think it needs to be addressed before VT goes GA.

So in cluster mode, VT should avoid unnecessary RPCs when querying a non-existent TraceID. A simple insight is that once the query time range goes beyond `retentionPeriod`, no data can exist there, so the subsequent query RPCs can be skipped.

I've thought of two possible solutions:

1. Configure `retentionPeriod` in vtselect as well. If the scanned time range goes beyond `retentionPeriod` and the TraceID still has not been found, abort the query and return to the client.

2. When a vtstorage node receives a query request that clearly exceeds `retentionPeriod`, it can return immediately. vtselect then treats this response as a hint (via an HTTP response status code or something similar) and terminates the subsequent queries.

There may well be better solutions. Looking forward to further discussion!
