-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tracing): Port sample_rand
to POTel
#4106
Conversation
❌ 108 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
04dc3b4
to
49a3507
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, @sentrivana!
I think this looks mostly good, although I did notice a couple things which may need to be changed, and I would like some clarification on some of the changes.
@@ -122,6 +150,9 @@ def sampled_result(span_context, attributes, sample_rate): | |||
if sample_rate is not None: | |||
trace_state = trace_state.update(TRACESTATE_SAMPLE_RATE_KEY, str(sample_rate)) | |||
|
|||
if sample_rand is not None: | |||
trace_state = trace_state.update(TRACESTATE_SAMPLE_RAND_KEY, str(sample_rand)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is important that we ensure the stringified sample_rand
has six decimal places here.
Are we already rounding it to six decimal places in the calling code? If yes, please mention in the docstring that the calling code needs to round to six decimal places. If not, then we need to add the rounding logic here, and ideally also add a unit test to verify rounding happens as expected.
In either case, I would use f"{sample_rand:.6f}"
rather than str(sample_rand)
.
trace_state = trace_state.update(TRACESTATE_SAMPLE_RAND_KEY, str(sample_rand)) | |
sample_rand = sample_rand.quantize( | |
Decimal("0.000001"), rounding=ROUND_DOWN, context=Context(prec=6) | |
) | |
trace_state = trace_state.update(TRACESTATE_SAMPLE_RAND_KEY, f"{sample_rand:.6f}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also added a test: 34cb097
"incoming_sample_rate": 0.3, | ||
"incoming_sampled": "false", | ||
"sentry_trace_header_parent_sampled": 0, | ||
"use_local_traces_sampler": True, | ||
"local_traces_sampler_result": 0.5, | ||
"local_traces_sampler_result": 0.25, | ||
"local_traces_sample_rate": 0.7, | ||
}, | ||
0.5, # expected_sample_rate | ||
0.25, # expected_sample_rate | ||
"false", # expected_sampled (traces sampler can override parent sampled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the first bullet point – shouldn't we still test the case where sample_rate
and sampled
values are incompatible with each other? Although this is an invalid state for the trace to be in, we still should ensure that the SDK does something reasonable in this case, since the upstream SDK could be broken (or a malicious actor could have tampered with the incoming trace data)
Co-authored-by: Daniel Szoke <[email protected]>
Co-authored-by: Daniel Szoke <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions have been answered off Github and all in all looks good.
The one comment from @szokeasaurusrex about rounding the sample_rand value before storing it trace_state could be addressed, but I guess it should work without it. But maybe better than safe then sorry.
That should hopefully be it for the functional changes -- adding test cases for the scenarios discussed in the reviews now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates, identified a few more suggestions but looks pretty good
if parent_sample_rand is None: | ||
return None | ||
|
||
return _round_sample_rand(parent_sample_rand) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should double-check with @cleptric and/or @jan-auer, but in order to maintain future-compatibility, I don't think we should be rounding the parent_sample_rand
.
return _round_sample_rand(parent_sample_rand) | |
return Decimal(parent_sample_rand) |
By rounding it, we potentially modify the DSC (which should be frozen) if the head SDK was rounding differently. This could become a problem if in the future, we were to change the format of the sample_rand
value (e.g. by increasing/decreasing precision). If the head SDK had already been updated use the new format, but the user was still using an older Python SDK version, we'd be changing the value back to the old precision here, which we might not want to be doing.
if sample_rand is not None: | ||
trace_state = trace_state.update( | ||
TRACESTATE_SAMPLE_RAND_KEY, f"{sample_rand:.6f}" # noqa: E231 | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] since it seems like we are setting the sample_rand both here and above in dropped_result
using the same duplicated code, perhaps we can extract this code into a separate function? If we did this, we could also potentially more other stuff we set to that new function (this might be out of scope for this PR though)
If you think this change would be too much hassle, feel free to ignore this suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample_rand = parent_sample_rand | ||
else: | ||
# We are the head SDK and we need to generate a new sample_rand | ||
sample_rand = cast(Decimal, _generate_sample_rand(str(trace_id), (0, 1))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cast
is quite confusing.
Could we please try to modify _generate_sample_rand
's signature to return Decimal
rather than Optional[Decimal]
? That would eliminate the need for a cast
sample_rand = cast(Decimal, _generate_sample_rand(str(trace_id), (0, 1))) | |
sample_rand = _generate_sample_rand(str(trace_id), (0, 1)) |
sentry_sdk/tracing_utils.py
Outdated
except Exception as ex: | ||
logger.debug(f"Failed to round sample_rand {sample_rand}: {ex}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exception exactly can the above raise? Doesn't setting the context
avoid the possibility of an InvalidOperation
?
If we can get rid of the try
/except
, that would allow us to make the signature of this function and _generate_sample_rand
a Decimal
.
@szokeasaurusrex Here's a compact view of the changes addressing your last review: https://github.com/getsentry/sentry-python/pull/4106/files/9380196498cd4520b207957e69696410859a1844..185cc15e158ef0ff4c2e072e2038401e799fba37 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
@@ -293,14 +283,14 @@ def should_sample( | |||
parent_span_context, | |||
attributes, | |||
sample_rate=sample_rate_to_propagate, | |||
sample_rand=sample_rand, | |||
sample_rand=None if sample_rand == parent_sample_rand else sample_rand, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] What is the purpose of comparing sample_rand
to parent_sample_rand
? We might want to clarify this with a code comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sampled_result
(and dropped_result
as well) will use the sample_rand
provided to overwrite it in the trace state. Since we don't want to overwrite incoming parent_sample_rand
(since we would potentially change the precision), we only pass it to sampled_result
if it's different from parent_sample_rand
-- i.e., we generated it ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ngl I am still a bit confused. Didn't we get rid of the code that would potentially change the precision?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We got rid of the rounding code but not the precision code, we still format sample_rand
to 6 decimal places here https://github.com/getsentry/sentry-python/pull/4106/files#diff-59aa7195d955e153b5cdd730f888994996a72eaf5e9ea174335ce961841584a9R169
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, the bigger picture here is just: if we have a halfway valid incoming sample_rand, don't tamper with it, even if it'd mean just adding a few zeroes at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, fair enough
Port
sample_rand
topotel-base
. See spec.There are now two places where a
sample_rand
might be generated:continue_trace
, we'll backfillsample_rand
on the propagation context like on master, either using the incoming one or generating a new one from the incomingsampled
/sample_rate
.sample_rand
in the Sampler.The generated
sample_rand
is then saved on the trace state.This change fixes most of the failures in the Common test suite.
Closes #4027