FEAT: Add Span Kind support for ES/OS #6399

Open: wants to merge 8 commits into base: main

Manik2708
Contributor

Which problem is this PR solving?

Fixes: #1923

Description of the changes

  • When querying GetOperations, operations can now also be filtered by span kind. When kind is left empty, operations of all kinds are returned.
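
As a rough illustration of the intended semantics only (not code from this PR), the sketch below filters an in-memory list. `Operation` is a hypothetical stand-in for `spanstore.Operation`, and `filterByKind` is an invented helper; the real change does the filtering in the ES/OS query.

```go
package main

import "fmt"

// Operation is a hypothetical stand-in for spanstore.Operation.
type Operation struct {
	Name     string
	SpanKind string
}

// filterByKind returns all operations when kind is empty; otherwise it
// returns only the operations whose SpanKind equals kind.
func filterByKind(ops []Operation, kind string) []Operation {
	if kind == "" {
		return ops
	}
	var out []Operation
	for _, op := range ops {
		if op.SpanKind == kind {
			out = append(out, op)
		}
	}
	return out
}

func main() {
	ops := []Operation{
		{Name: "GET /users", SpanKind: "server"},
		{Name: "query-db", SpanKind: "client"},
		{Name: "legacy-op"}, // written before kinds were stored
	}
	fmt.Println(len(filterByKind(ops, "")))       // every operation, old data included
	fmt.Println(len(filterByKind(ops, "server"))) // only server operations
}
```

Note that an operation written without a kind is still returned when the requested kind is empty, which is the behavior the linked issue asks for.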

How was this change tested?

  • Unit and E2E tests

Checklist

@Manik2708 Manik2708 requested a review from a team as a code owner December 24, 2024 13:40
@Manik2708 Manik2708 requested a review from jkowall December 24, 2024 13:40

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 83.78378% with 18 lines in your changes missing coverage. Please review.

Project coverage is 96.20%. Comparing base (a6616fb) to head (5b15a5f).

Files with missing lines Patch % Lines
plugin/storage/es/spanstore/service_operation.go 82.85% 12 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6399      +/-   ##
==========================================
- Coverage   96.23%   96.20%   -0.04%     
==========================================
  Files         368      368              
  Lines       21028    21112      +84     
==========================================
+ Hits        20237    20311      +74     
- Misses        606      612       +6     
- Partials      185      189       +4     
Flag Coverage Δ
badger_v1 10.48% <0.00%> (-0.09%) ⬇️
badger_v2 2.40% <0.00%> (-0.02%) ⬇️
cassandra-4.x-v1-manual 16.32% <0.00%> (-0.13%) ⬇️
cassandra-4.x-v2-auto 2.34% <0.00%> (-0.02%) ⬇️
cassandra-4.x-v2-manual 2.34% <0.00%> (-0.02%) ⬇️
cassandra-5.x-v1-manual 16.32% <0.00%> (-0.13%) ⬇️
cassandra-5.x-v2-auto 2.34% <0.00%> (-0.02%) ⬇️
cassandra-5.x-v2-manual 2.34% <0.00%> (-0.02%) ⬇️
elasticsearch-6.x-v1 20.51% <56.75%> (+0.30%) ⬆️
elasticsearch-7.x-v1 20.59% <56.75%> (+0.31%) ⬆️
elasticsearch-8.x-v1 20.75% <56.75%> (+0.31%) ⬆️
elasticsearch-8.x-v2 2.40% <0.00%> (-0.01%) ⬇️
grpc_v1 12.13% <0.00%> (-0.10%) ⬇️
grpc_v2 8.74% <0.00%> (-0.07%) ⬇️
kafka-3.x-v1 10.32% <0.00%> (-0.09%) ⬇️
kafka-3.x-v2 2.40% <0.00%> (-0.02%) ⬇️
memory_v2 2.40% <0.00%> (-0.01%) ⬇️
opensearch-1.x-v1 20.64% <56.75%> (+0.31%) ⬆️
opensearch-2.x-v1 20.63% <56.75%> (+0.30%) ⬆️
opensearch-2.x-v2 2.40% <0.00%> (-0.01%) ⬇️
tailsampling-processor 0.39% <0.00%> (-0.01%) ⬇️
unittests 95.05% <81.08%> (-0.05%) ⬇️

Member

@yurishkuro yurishkuro left a comment

overall lgtm, but I have a question on whether the whole thing could be even simpler, by treating Kind as always present (but maybe blank).

// ServiceWithKind is the JSON struct for service:kind:operation documents in ElasticSearch
type ServiceWithKind struct {
	Service
	Kind string `json:"spanKind"`
}

Member

not sure I follow. Why wouldn't we just add Kind field to the Service struct above?
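
One way to picture this suggestion is sketched below. It is not the PR's code: the `serviceName` and `operationName` tags are assumed, and the `omitempty` option is an extra assumption added so that a blank kind serializes to the same document shape as data written before kinds existed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Service mirrors the service:operation document. Only the spanKind tag
// comes from the diff above; the other field names and tags are assumptions.
type Service struct {
	ServiceName   string `json:"serviceName"`
	OperationName string `json:"operationName"`
	Kind          string `json:"spanKind,omitempty"` // left out of the JSON when blank
}

// encode marshals a Service, returning an empty string on error.
func encode(s Service) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

// decode unmarshals a stored document; a missing spanKind leaves Kind blank.
func decode(doc string) (Service, error) {
	var s Service
	err := json.Unmarshal([]byte(doc), &s)
	return s, err
}

func main() {
	fmt.Println(encode(Service{ServiceName: "svc", OperationName: "op", Kind: "server"}))
	fmt.Println(encode(Service{ServiceName: "svc", OperationName: "op"}))

	// A document in the old shape (no spanKind) decodes to a blank Kind.
	old, _ := decode(`{"serviceName":"svc","operationName":"op"}`)
	fmt.Printf("%q\n", old.Kind)
}
```

Whether this is enough depends on how the reader queries for "no kind", which is what the rest of the thread discusses.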

Contributor Author

Please see the comment. If we did that, the query for fetching spans of all kinds would have to apply another filter for an empty kind.

@@ -124,8 +124,9 @@ func getSpanAndServiceIndexFn(p SpanWriterParams) spanAndServiceIndexFn {
func (s *SpanWriter) WriteSpan(_ context.Context, span *model.Span) error {
	spanIndexName, serviceIndexName := s.spanServiceIndex(span.StartTime)
	jsonSpan := s.spanConverter.FromDomainEmbedProcess(span)
	kind, _ := span.GetSpanKind()

Member

Why not add Kind field to the dbmodel.Span and not pass it around separately?

Contributor Author

@Manik2708 Manik2708 Dec 26, 2024

It will also be present as a tag in the JSON model! I tried fetching the kind from the original span but failed because of the AllTagsAsFields bool. Since it will already be present in the JSON model, should we also save it again as a separate Kind field?

if !keyInCache(cacheKey, s.serviceCache) {
s.client().Index().Index(indexName).Type(serviceType).Id(cacheKey).BodyJson(service).Add()
writeCache(cacheKey, s.serviceCache)
if kind != model.SpanKindUnspecified {

Member

why do we need to bifurcate? If for some reason span kind is unavailable, we can just treat it as a blank string and not change the code to deal with it separately.

Contributor Author

Please see the comment #6399 (comment). If that reasoning seems valid, I don't think we should introduce empty kinds: they would get mixed up with the old data that has no kind at all, and fetching in the reader could then become more complex.

@Manik2708
Contributor Author

Manik2708 commented Dec 26, 2024

> overall lgtm, but I have a question on whether the whole thing could be even simpler, by treating Kind as always present (but maybe blank).

Then we can't get the old data! In old data, Kind was not present. In the original issue, when it was asked what would happen to old data, the answer was that it should remain accessible when kind is not present. That is why I introduced a whole new struct with kind and also save the data without kind.

@Manik2708 Manik2708 requested a review from yurishkuro December 26, 2024 02:05
}

func bucketOfOperationsToOperationsArray(searchResult *elastic.AggregationBucketFilters) ([]spanstore.Operation, error) {
	var result []spanstore.Operation

Contributor Author

@Manik2708 Manik2708 Dec 26, 2024

  • This is concerning because we don't know the total length of the result in advance, so I'm not sure whether we should initialize it this way. The best solution I could think of: since the number of kinds is limited, we can iterate through each kind, fetch its total doc count, and then create the slice with that total length. Would that be fine?
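
The two-pass idea described here could look roughly like the sketch below. `Operation` and `kindBucket` are hypothetical stand-ins (for `spanstore.Operation` and for one named bucket of the aggregation result), so this only shows the sizing pattern, not the real bucket handling.

```go
package main

import "fmt"

// Operation is a hypothetical stand-in for spanstore.Operation.
type Operation struct {
	Name     string
	SpanKind string
}

// kindBucket is a hypothetical stand-in for one named bucket: the kind the
// bucket was filtered on and the operation names found in it.
type kindBucket struct {
	Kind       string
	Operations []string
}

// flatten sums the bucket sizes first so the result is allocated once,
// then fills it bucket by bucket.
func flatten(buckets []kindBucket) []Operation {
	total := 0
	for _, b := range buckets {
		total += len(b.Operations)
	}
	result := make([]Operation, 0, total)
	for _, b := range buckets {
		for _, name := range b.Operations {
			result = append(result, Operation{Name: name, SpanKind: b.Kind})
		}
	}
	return result
}

func main() {
	ops := flatten([]kindBucket{
		{Kind: "server", Operations: []string{"GET /users", "POST /users"}},
		{Kind: "client", Operations: []string{"query-db"}},
	})
	fmt.Println(len(ops), cap(ops))
}
```

Starting from a nil slice and calling `append` is also valid Go; preallocating only avoids repeated growth when the total is cheap to compute.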

@Manik2708
Contributor Author

Manik2708 commented Dec 28, 2024

@yurishkuro Committed as per your suggestions. I tried many ways, but I couldn't merge the EmptyKinds and WithoutKinds cases into a single aggregation: the two bool queries contradict each other, so the aggregation behaves differently. The only option is to handle them in separate aggregations, which will increase complexity when fetching operations. So we have to choose between complexity in writing spans and complexity in fetching operations.

@yurishkuro
Member

please update the branch, I am not sure if CI is failing because of that or because of your changes. If it's the latter, how are you testing this change? Did you run e2e tests locally?

@Manik2708
Contributor Author

Manik2708 commented Dec 29, 2024

> please update the branch, I am not sure if CI is failing because of that or because of your changes. If it's the latter, how are you testing this change? Did you run e2e tests locally?

I tried the empty-kind approach, but it failed because the query in GetOperations became more complex, so I have reverted that commit.

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

@yurishkuro Saving empty strings in ES is not optimal. Please see: elastic/elasticsearch#7515. I have been trying various ways of using empty or null kinds but am not getting results, probably because the query is a bit complex and unconventional. I think it would be better not to store empty or null kinds and to handle them separately instead!

@yurishkuro
Member

I don't have a strong opinion, but the issue you linked talks about searching for empty strings, which I don't think is the case for our scenario, we just need to write it.

@Manik2708
Contributor Author

> I don't have a strong opinion, but the issue you linked talks about searching for empty strings, which I don't think is the case for our scenario, we just need to write it.

But while reading, we also have to search for operations that have an empty kind or no kind! I am facing problems when fetching spans of all kinds: the search query behaves weirdly if I introduce a filter on "". There are no problems in writing the service, only in fetching them.

@yurishkuro
Member

Searching for "all kinds" to me means not specifying any filter for the "kind" field. How would it even work if you say search for kind="" when the kind is actually not empty?
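
To make the point concrete, here is a minimal sketch of a query body that adds the kind clause only when a kind is requested. It builds plain JSON with the standard library rather than the PR's ES client abstraction; `buildOperationsQuery` is a hypothetical helper, and the `serviceName` field name is an assumption (`spanKind` is taken from the struct tag in the diff).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildOperationsQuery always filters on the service name and adds a term
// filter on spanKind only when kind is non-empty. Field names are assumed.
func buildOperationsQuery(service, kind string) map[string]any {
	filters := []map[string]any{
		{"term": map[string]any{"serviceName": service}},
	}
	if kind != "" {
		filters = append(filters, map[string]any{
			"term": map[string]any{"spanKind": kind},
		})
	}
	return map[string]any{
		"query": map[string]any{
			"bool": map[string]any{"filter": filters},
		},
	}
}

func main() {
	for _, kind := range []string{"", "server"} {
		body, err := json.MarshalIndent(buildOperationsQuery("frontend", kind), "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

With no kind clause, documents with any kind, a blank kind, or no kind field at all match the service filter alike.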

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

> Searching for "all kinds" to me means not specifying any filter for the "kind" field. How would it even work if you say search for kind="" when the kind is actually not empty?

I think there is some misunderstanding of my approach. Let me first explain the problem:

  1. We want to get two fields (operationName and kind) from our search query to ES. To achieve this, we have 3 options:
    a) Use FetchSource(true): this fetches all the fields, which is not optimal, and it is not available in our abstraction either.
    b) Use FetchSourceWithContext: not available in our abstraction.
    c) Composite filters: not possible, because they are applied to a field, and kind can be absent in old data.
    So I took a different approach: I run the query only on the service name, apply one filter per kind (possible because the kinds are limited), and use the filter name to recover the kind from the named buckets. But when this filter is applied to an empty kind, it returns no results. You are absolutely right that when no kind is given in the query we should query on the service name alone, but we still have to fetch two fields, and that is what makes querying on just the service a problem!
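
The named-bucket approach described above might be expressed as the following aggregation body. This is a sketch under assumptions, not the PR's implementation: `buildKindAggregation` and the `no_kind` bucket name are invented here, the `operationName` field name is assumed, and the JSON is built with the standard library instead of the ES client.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildKindAggregation returns a "filters" aggregation with one named bucket
// per kind, plus a hypothetical "no_kind" bucket for documents that lack the
// spanKind field, and a terms sub-aggregation on the operation name.
func buildKindAggregation(kinds []string) map[string]any {
	named := map[string]any{
		"no_kind": map[string]any{
			"bool": map[string]any{
				"must_not": map[string]any{
					"exists": map[string]any{"field": "spanKind"},
				},
			},
		},
	}
	for _, k := range kinds {
		named[k] = map[string]any{"term": map[string]any{"spanKind": k}}
	}
	return map[string]any{
		"filters": map[string]any{"filters": named},
		"aggs": map[string]any{
			"operations": map[string]any{
				"terms": map[string]any{"field": "operationName"},
			},
		},
	}
}

func main() {
	kinds := []string{"client", "server", "producer", "consumer", "internal"}
	body, err := json.MarshalIndent(buildKindAggregation(kinds), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Each bucket name in the response then identifies the kind of the operations inside it. Documents that store the kind as an empty string are not covered by this sketch, which is the case the thread is stuck on.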

@yurishkuro
Member

"Not available in our abstraction" is an odd argument since we own the abstraction and can change it at will. Re fetchsource, what is the source here - is it the whole span? Or do we write separate entries just for service/operation?

@Manik2708
Contributor Author

Manik2708 commented Jan 2, 2025

> "Not available in our abstraction" is an odd argument since we own the abstraction and can change it at will. Re fetchsource, what is the source here - is it the whole span? Or do we write separate entries just for service/operation?

Service, Operation, and Kind (if present), but I have to check the indices to see whether any other information is stored.
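
If the service documents really are that small, source filtering is one option worth weighing. The sketch below shows a search body that asks ES to return only two fields through the `_source` parameter. It is an illustration only: `searchBody` is a hypothetical helper, the field names are assumed, and it does not use the project's client abstraction.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchBody restricts the returned _source to the two fields of interest
// and filters on the service name. Field names are assumptions.
func searchBody(service string) map[string]any {
	return map[string]any{
		"_source": []string{"operationName", "spanKind"},
		"query": map[string]any{
			"term": map[string]any{"serviceName": service},
		},
	}
}

func main() {
	body, err := json.MarshalIndent(searchBody("frontend"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Old documents without a kind would simply come back without the `spanKind` key, so the caller would need to treat a missing key as "no kind".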

Successfully merging this pull request may close these issues.

ES storage plugin: query service to support spanKind when retrieve operations for a given service
2 participants