feat(eap): Add a GetTraces endpoint #6671

Open
wants to merge 14 commits into
base: master

Conversation

phacops
Contributor

@phacops phacops commented Dec 13, 2024

This is a possible implementation of the new FindTraces endpoint.

getsentry/sentry-protos#71

Closes https://github.com/getsentry/eap-planning/issues/122.

@phacops phacops requested review from a team as code owners December 13, 2024 15:51
@phacops phacops changed the title feat(eap): Add a FindTraces endpoint feat(eap): Add a GetTraces endpoint Dec 13, 2024
requirements.txt (outdated, resolved)
Comment on lines 244 to 250
def _validate_order_by(in_msg: GetTracesRequest) -> None:
    order_by_cols = set([ob.column.name for ob in in_msg.order_by])
    selected_columns = set([c.name for c in in_msg.columns])
    if not order_by_cols.issubset(selected_columns):
        raise BadSnubaRPCRequestException(
            f"Ordered by columns {order_by_cols} not selected: {selected_columns}"
        )
Member

Is the intention here to allow ordering by any of the selected columns? The two main orderings we'd like are by timestamp (newest/oldest trace) and by duration (shortest/longest trace), though the second is harder to define clearly because, depending on the platform, a trace is becoming closer to a session.

I'm not opposed to supporting only ordering by timestamp first; we can add more later once we define them more clearly.

Contributor Author

Yes, that's the intention. We can default to sorting by timestamp, and you can decide later whether to use custom sorts.
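
To make the behaviour discussed in this thread concrete, here is a minimal sketch of an order-by check that falls back to a timestamp sort when no ordering is requested. It is not the code in this PR: `TracesRequest`, `OrderBy`, `BadRequestError`, `DEFAULT_ORDER_BY`, and `resolve_order_by` are hypothetical, simplified stand-ins for `GetTracesRequest` and `BadSnubaRPCRequestException`, and the newest-first default is an assumption.

```python
from dataclasses import dataclass, field


class BadRequestError(ValueError):
    """Hypothetical stand-in for BadSnubaRPCRequestException."""


@dataclass(frozen=True)
class OrderBy:
    # Hypothetical, simplified ordering clause: a column name and a direction.
    column: str
    descending: bool = True


@dataclass
class TracesRequest:
    # Hypothetical, simplified stand-in for GetTracesRequest.
    columns: list[str]
    order_by: list[OrderBy] = field(default_factory=list)


# Assumption: with no explicit ordering, sort by timestamp, newest first.
DEFAULT_ORDER_BY = OrderBy(column="timestamp", descending=True)


def resolve_order_by(request: TracesRequest) -> list[OrderBy]:
    """Return the ordering to apply, rejecting columns that were not selected."""
    if not request.order_by:
        return [DEFAULT_ORDER_BY]

    selected = set(request.columns)
    missing = {ob.column for ob in request.order_by} - selected
    if missing:
        raise BadRequestError(
            f"Ordered by columns {sorted(missing)} not selected: {sorted(selected)}"
        )
    return list(request.order_by)
```

With this shape, supporting a new ordering such as duration only requires that the column be selectable; the check itself does not change.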

snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)
snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)
@phacops phacops requested a review from Zylphrex January 2, 2025 16:28
Member

@kylemumma kylemumma left a comment

lgtm

tests/web/rpc/v1/test_endpoint_get_traces.py (outdated, resolved)
snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)
return GetTracesResponse(meta=response_meta)

# Get metadata for those traces.
traces = self._get_metadata_for_traces(request=in_msg, trace_ids=trace_ids)
Member

Why do you first query for trace_ids and then run a second query for the attributes, rather than doing both in one query?

Contributor Author

For performance, and because we can't compute all attributes in one query. It's much faster to select a list of trace IDs and timestamps matching the conditions and then compute the requested attributes for that list.

Right now, the UI queries all the attributes at once, and I can't compute FILTERED_ITEM_COUNT and TOTAL_ITEM_COUNT in one query: to count only the items matching some conditions, I can't put those conditions in the WHERE clause; I need a countIf without a WHERE.

We could optimize later by skipping the second query, depending on which attributes are requested.
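
The two-step approach described above can be sketched with SQLite standing in for ClickHouse. Everything here is an assumption for illustration: the `spans` table, its columns, and `get_traces` are hypothetical, and `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` plays the role of ClickHouse's `countIf`. The point is that the filter can sit in the WHERE clause when selecting trace IDs, but must move into a conditional aggregate when both the filtered and the total item counts are needed.

```python
import sqlite3

# Hypothetical toy data: one row per span, keyed by trace.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spans (trace_id TEXT, op TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO spans VALUES (?, ?, ?)",
    [
        ("t1", "db", 10),
        ("t1", "http", 11),
        ("t1", "db", 12),
        ("t2", "http", 20),
        ("t3", "db", 30),
        ("t3", "cache", 31),
    ],
)


def get_traces(conn: sqlite3.Connection, op: str) -> list[tuple[str, int, int]]:
    """Return (trace_id, filtered_item_count, total_item_count) per matching trace."""
    # Step 1: a cheap query that only finds the traces matching the condition.
    trace_ids = [
        row[0]
        for row in conn.execute(
            "SELECT trace_id FROM spans WHERE op = ? "
            "GROUP BY trace_id ORDER BY MAX(ts) DESC",
            (op,),
        )
    ]
    if not trace_ids:
        return []

    # Step 2: compute attributes for those traces only. The condition is inside
    # the aggregate (the countIf analogue), not in WHERE, so COUNT(*) still
    # sees every span of the trace.
    placeholders = ", ".join("?" for _ in trace_ids)
    return conn.execute(
        "SELECT trace_id, "
        "SUM(CASE WHEN op = ? THEN 1 ELSE 0 END) AS filtered_item_count, "
        "COUNT(*) AS total_item_count "
        f"FROM spans WHERE trace_id IN ({placeholders}) "
        "GROUP BY trace_id ORDER BY MAX(ts) DESC",
        (op, *trace_ids),
    ).fetchall()
```

If only attributes that tolerate the filter in WHERE are requested, step 2 could be folded into step 1, which is the later optimization mentioned above.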

Member

@Zylphrex Zylphrex left a comment

Mostly nitpicks but looks good to me overall.

snuba/web/rpc/v1/endpoint_get_traces.py (resolved)
snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)

_ATTRIBUTES: dict[
    TraceAttribute.Key.ValueType,
    tuple[str, AttributeKey.Type.ValueType],
Member

Nit: a named tuple/data class/typed dict would be a little cleaner here so you don't have to access it by index
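
As a rough illustration of this suggestion, the mapping could use a `NamedTuple` so that callers read fields by name. This is only a sketch: `TraceKey`, `AttrType`, `AttributeSpec`, and the column-name strings are hypothetical stand-ins for `TraceAttribute.Key`, `AttributeKey.Type`, and whatever `_ATTRIBUTES` actually contains.

```python
from enum import Enum
from typing import NamedTuple


class TraceKey(Enum):
    # Hypothetical stand-in for TraceAttribute.Key.
    TRACE_ID = 1
    FILTERED_ITEM_COUNT = 2
    TOTAL_ITEM_COUNT = 3


class AttrType(Enum):
    # Hypothetical stand-in for AttributeKey.Type.
    STRING = 1
    INT = 2


class AttributeSpec(NamedTuple):
    """Named replacement for the bare tuple[str, type] value."""

    name: str
    type: AttrType


# The names on the right-hand side are made up for illustration.
ATTRIBUTES: dict[TraceKey, AttributeSpec] = {
    TraceKey.TRACE_ID: AttributeSpec("trace_id", AttrType.STRING),
    TraceKey.FILTERED_ITEM_COUNT: AttributeSpec("filtered_item_count", AttrType.INT),
    TraceKey.TOTAL_ITEM_COUNT: AttributeSpec("total_item_count", AttrType.INT),
}

spec = ATTRIBUTES[TraceKey.TOTAL_ITEM_COUNT]
# Field access by name instead of spec[0] / spec[1].
column_name, column_type = spec.name, spec.type
```

A `NamedTuple` keeps tuple unpacking and index access working, so existing call sites would not break; a frozen dataclass or a `TypedDict` would be equally valid choices.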

snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)
snuba/web/rpc/v1/endpoint_get_traces.py (outdated, resolved)

codecov bot commented Jan 3, 2025

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 905 | 1 | 904 | 3 |
View the top 1 failed tests by shortest run time
tests.datasets.test_errors_replacer.TestReplacer::test_process_offset_twice
Stack Traces | 0.284s run time
Traceback (most recent call last):
  File ".../tests/datasets/test_errors_replacer.py", line 370, in test_process_offset_twice
    assert self.replacer.process_message(message) is None
AssertionError: assert (ReplacementMessageMetadata(partition_index=1, offset=42, consumer_group='consumer_group'), UnmergeGroupsReplacement(state_name=<ReplacerState.ERRORS: 'errors'>, timestamp=datetime.datetime(2025, 1, 3, 0, 18, 32, 461363), hashes=['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], all_columns=[FlattenedColumn(None, 'project_id', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'timestamp', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'event_id', schemas.UUID(modifiers=None)), FlattenedColumn(None, 'platform', schemas.String(modifiers=None)), FlattenedColumn(None, 'environment', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'release', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'dist', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'ip_address_v4', schemas.IPv4(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'ip_address_v6', schemas.IPv6(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user', schemas.String(modifiers=None)), FlattenedColumn(None, 'user_id', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user_name', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user_email', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_name', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_version', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'http_method', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'http_referer', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn('tags', 'key', 
schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('tags', 'value', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('contexts', 'key', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('contexts', 'value', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn(None, 'transaction_name', schemas.String(modifiers=None)), FlattenedColumn(None, 'span_id', schemas.UInt(64, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'trace_id', schemas.UUID(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'partition', schemas.UInt(16, modifiers=None)), FlattenedColumn(None, 'offset', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'message_timestamp', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'retention_days', schemas.UInt(16, modifiers=None)), FlattenedColumn(None, 'deleted', schemas.UInt(8, modifiers=None)), FlattenedColumn(None, 'group_id', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'primary_hash', schemas.UUID(modifiers=None)), FlattenedColumn(None, 'received', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'message', schemas.String(modifiers=None)), FlattenedColumn(None, 'title', schemas.String(modifiers=None)), FlattenedColumn(None, 'culprit', schemas.String(modifiers=None)), FlattenedColumn(None, 'level', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'location', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'version', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'type', schemas.String(modifiers=None)), FlattenedColumn('exception_stacks', 'type', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'value', 
schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'mechanism_type', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'mechanism_handled', schemas.Array(schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'abs_path', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'colno', schemas.Array(schemas.UInt(32, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'filename', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'function', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'lineno', schemas.Array(schemas.UInt(32, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'in_app', schemas.Array(schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'package', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'module', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'stack_level', schemas.Array(schemas.UInt(16, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn(None, 'exception_main_thread', schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_integrations', 
schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('modules', 'name', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('modules', 'version', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn(None, 'trace_sampled', schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'num_processing_errors', schemas.UInt(64, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'replay_id', schemas.UUID(modifiers=SchemaModifiers(nullable=True, readonly=False)))], project_id=1, previous_group_id=1, new_group_id=2)) is None
 +  where (ReplacementMessageMetadata(partition_index=1, offset=42, consumer_group='consumer_group'), UnmergeGroupsReplacement(state_name=<ReplacerState.ERRORS: 'errors'>, timestamp=datetime.datetime(2025, 1, 3, 0, 18, 32, 461363), hashes=['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], all_columns=[FlattenedColumn(None, 'project_id', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'timestamp', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'event_id', schemas.UUID(modifiers=None)), FlattenedColumn(None, 'platform', schemas.String(modifiers=None)), FlattenedColumn(None, 'environment', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'release', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'dist', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'ip_address_v4', schemas.IPv4(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'ip_address_v6', schemas.IPv6(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user', schemas.String(modifiers=None)), FlattenedColumn(None, 'user_id', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user_name', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'user_email', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_name', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_version', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'http_method', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'http_referer', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn('tags', 'key', 
schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('tags', 'value', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('contexts', 'key', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('contexts', 'value', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn(None, 'transaction_name', schemas.String(modifiers=None)), FlattenedColumn(None, 'span_id', schemas.UInt(64, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'trace_id', schemas.UUID(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'partition', schemas.UInt(16, modifiers=None)), FlattenedColumn(None, 'offset', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'message_timestamp', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'retention_days', schemas.UInt(16, modifiers=None)), FlattenedColumn(None, 'deleted', schemas.UInt(8, modifiers=None)), FlattenedColumn(None, 'group_id', schemas.UInt(64, modifiers=None)), FlattenedColumn(None, 'primary_hash', schemas.UUID(modifiers=None)), FlattenedColumn(None, 'received', schemas.DateTime(modifiers=None)), FlattenedColumn(None, 'message', schemas.String(modifiers=None)), FlattenedColumn(None, 'title', schemas.String(modifiers=None)), FlattenedColumn(None, 'culprit', schemas.String(modifiers=None)), FlattenedColumn(None, 'level', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'location', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'version', schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'type', schemas.String(modifiers=None)), FlattenedColumn('exception_stacks', 'type', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'value', 
schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'mechanism_type', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_stacks', 'mechanism_handled', schemas.Array(schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'abs_path', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'colno', schemas.Array(schemas.UInt(32, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'filename', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'function', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'lineno', schemas.Array(schemas.UInt(32, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'in_app', schemas.Array(schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'package', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'module', schemas.Array(schemas.String(modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn('exception_frames', 'stack_level', schemas.Array(schemas.UInt(16, modifiers=SchemaModifiers(nullable=True, readonly=False)), modifiers=None)), FlattenedColumn(None, 'exception_main_thread', schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'sdk_integrations', 
schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('modules', 'name', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn('modules', 'version', schemas.Array(schemas.String(modifiers=None), modifiers=None)), FlattenedColumn(None, 'trace_sampled', schemas.UInt(8, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'num_processing_errors', schemas.UInt(64, modifiers=SchemaModifiers(nullable=True, readonly=False))), FlattenedColumn(None, 'replay_id', schemas.UUID(modifiers=SchemaModifiers(nullable=True, readonly=False)))], project_id=1, previous_group_id=1, new_group_id=2)) = <bound method ReplacerWorker.process_message of <snuba.replacer.ReplacerWorker object at 0x7f00d5918310>>(Message({Partition(topic=Topic(name='replacements'), index=1): 43}))
 +    where <bound method ReplacerWorker.process_message of <snuba.replacer.ReplacerWorker object at 0x7f00d5918310>> = <snuba.replacer.ReplacerWorker object at 0x7f00d5918310>.process_message
 +      where <snuba.replacer.ReplacerWorker object at 0x7f00d5918310> = <tests.datasets.test_errors_replacer.TestReplacer object at 0x7f0102a76310>.replacer
