-
Notifications
You must be signed in to change notification settings - Fork 593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add record_multiplexer
microbenchmarks
#24155
Add record_multiplexer
microbenchmarks
#24155
Conversation
record_multiplexer
microbenchmarks
3712bfa
to
da6f4e6
Compare
da6f4e6
to
2332bcb
Compare
record_multiplexer
microbenchmarksrecord_multiplexer
microbenchmarks
a77d575
to
5f7e229
Compare
co_return std::nullopt; | ||
} | ||
|
||
iobuf encode_protobuf_message_index(const std::vector<int32_t>& message_index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any existing serializer for this message index format anywhere in our code-base? There is a de-serializer; get_proto_offsets
in src/v/datalake/schema_registry.h
. Happy to move this serializer to a more general location if there is any use for it outside of the record generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closest thing is
redpanda/src/v/datalake/tests/record_schema_resolver_test.cc
Lines 63 to 72 in 514f1df
iobuf encode_pb_offsets(const std::vector<int32_t>& offsets) { | |
auto cnt_bytes = vint::to_bytes(offsets.size()); | |
iobuf buf; | |
buf.append(cnt_bytes.data(), cnt_bytes.size()); | |
for (auto o : offsets) { | |
auto bytes = vint::to_bytes(o); | |
buf.append(bytes.data(), bytes.size()); | |
} | |
return buf; | |
} |
I don't have strong feelings about code placement, I think leaving it in the record generate seems reasonable
1a07259
to
d3cc7ab
Compare
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/58542#019352d8-d1a4-477d-92c3-9735aa6242dc |
fa6789a
to
a151b5e
Compare
@andrwng @mmaslankaprv can we get this reviewed please. Would be good to get that in ASAP so that perf can be tracked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good! Nice work
// (12 payload + 3 header) bytes per field | ||
static std::string proto_schema = generate_linear_proto(40); | ||
|
||
static thread_local chunked_vector<model::record_batch> batch_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, what's the significance of this being static thread_local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is none, I had copied this code from an earlier iteration of the microbenchmark I had written that ran on multiple shards and it made more sense there. Will remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Still curious, what's the significance of the remaining static
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seastar will re-run the microbenchmark multiple times to try to account for variance in the results. static
just ensures that each run will be with the same dataset. Another, potentially better, option would be to fix the random seed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option would be to have each run be with a unique dataset. Some of the dataset characteristics like schema and string length would be constant, but the string values would be unique per-run. There's an argument for that being the better way to go. I just start with this as the default as it allows for runs to be quicker since a new dataset doesn't have to be generated each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks for explaining.
Yeah I guess with a single data set I would expect the variance between runs to be pretty low, given there isn't a ton of non-determinism here with IO and such. Curious if you've observed that? If so, yea maybe longer term it'd be interesting to see the effects of different data sets with similarly sized data.
I don't think that needs to block this PR though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and tested things with random datasets on each run and variance was very low(<10ns). So I removed the static specifiers on the batch generation.
static thread_local chunked_vector<model::record_batch> batch_data | ||
= co_await generate_protobuf_batches( | ||
records_per_batch, | ||
batches, | ||
"proto_schema", | ||
proto_schema, | ||
{0}, | ||
gen_config); | ||
|
||
auto reader = model::make_fragmented_memory_record_batch_reader( | ||
share_batches(batch_data)); | ||
|
||
auto consumer = counting_consumer{.mux = create_mux()}; | ||
|
||
perf_tests::start_measuring_time(); | ||
auto res = co_await reader.consume(std::move(consumer), model::no_timeout); | ||
perf_tests::stop_measuring_time(); | ||
|
||
co_return res.total_bytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'm wondering if we should wrap this in some measure_proto(std::string schema)
and use it everywhere (same for avro)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be possible, I originally avoided it so that folks could look at a given microbenchmark and see right away what was being measured. However, it is rather repetitive. Happy to factor it out if you think it would be better that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly read microbenchmarks frequently, but it feels like it might make it'd be nice to encapsulate those details (e.g. so it's more obvious to readers that each one is actually measuring in the same way, if that's the intent. Though maybe that's expected out of a microbench? ultimately I'll leave the call up to you)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Encapsulation it is then, don't have a strong opinion on this myself.
a151b5e
to
929b102
Compare
CI test resultstest results on build#59830
test results on build#59927
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining feedback is pretty cosmetic, + one question. Otherwise LGTM
929b102
to
048f3e9
Compare
048f3e9
to
a446eb8
Compare
This data writer is for tests. It doesn't write to any files, however, it does go through the process of converting the ostream to parqueat.
a446eb8
to
1c29911
Compare
Retry command for Build#59927please wait until all jobs are finished before running the slash command
|
The CI failure in |
/backport v24.3.x |
Failed to create a backport PR to v24.3.x branch. I tried:
|
This PR adds benchmarks for
record_multiplexer
along with the following that was needed to support that;record_generator
.Backports Required
Release Notes