Cache parsed schemas in datalake #24503

Open
wants to merge 2 commits into
base: dev
from

Conversation

ballard26
Contributor

@ballard26 ballard26 commented Dec 10, 2024

Microbenchmark results without caching:

| test | iterations | median | mad | min | max | allocs | tasks | inst |
|---|---|---|---|---|---|---|---|---|
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_1_field | 45654000 | 21.613ns | 0.000ns | 21.613ns | 21.613ns | 0.694 | 0.000 | 291.5 |
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_40_fields | 3261000 | 191.138ns | 0.000ns | 191.138ns | 191.138ns | 5.302 | 0.003 | 2972.1 |
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_80_fields | 3261000 | 360.062ns | 0.000ns | 360.062ns | 360.062ns | 9.881 | 0.005 | 5718.3 |
| record_multiplexer_bench_fixture.protobuf_384_byte_message_nested_24_levels | 3291000 | 280.036ns | 0.000ns | 280.036ns | 280.036ns | 10.477 | 0.005 | 4684.5 |
| record_multiplexer_bench_fixture.protobuf_386_byte_message_nested_31_levels | 3311000 | 365.044ns | 0.000ns | 365.044ns | 365.044ns | 13.431 | 0.007 | 6070.1 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_1_field | 66020000 | 14.661ns | 0.000ns | 14.661ns | 14.661ns | 0.440 | 0.000 | 175.0 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_31_fields | 9903000 | 104.918ns | 0.000ns | 104.918ns | 104.918ns | 2.210 | 0.001 | 1369.0 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_62_fields | 6602000 | 202.774ns | 0.000ns | 202.774ns | 202.774ns | 3.932 | 0.003 | 2618.2 |
| record_multiplexer_bench_fixture.avro_385_byte_message_nested_31_levels | 3301000 | 262.044ns | 0.000ns | 262.044ns | 262.044ns | 8.367 | 0.006 | 3823.9 |
| record_multiplexer_bench_fixture.avro_385_byte_message_nested_62_levels | 3301000 | 532.290ns | 0.000ns | 532.290ns | 532.290ns | 16.354 | 0.015 | 7760.2 |

And with caching:

| test | iterations | median | mad | min | max | allocs | tasks | inst |
|---|---|---|---|---|---|---|---|---|
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_1_field | 101091000 | 9.549ns | 0.000ns | 9.549ns | 9.549ns | 0.274 | 0.000 | 111.5 |
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_40_fields | 9783000 | 81.919ns | 0.000ns | 81.919ns | 81.919ns | 0.701 | 0.001 | 834.0 |
| record_multiplexer_bench_fixture.protobuf_381_byte_message_linear_80_fields | 3261000 | 176.424ns | 0.000ns | 176.424ns | 176.424ns | 1.085 | 0.002 | 1567.9 |
| record_multiplexer_bench_fixture.protobuf_384_byte_message_nested_24_levels | 6582000 | 77.550ns | 0.000ns | 77.550ns | 77.550ns | 2.035 | 0.002 | 1021.1 |
| record_multiplexer_bench_fixture.protobuf_386_byte_message_nested_31_levels | 3311000 | 98.757ns | 0.000ns | 98.757ns | 98.757ns | 2.551 | 0.004 | 1299.1 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_1_field | 102331000 | 9.562ns | 0.000ns | 9.562ns | 9.562ns | 0.256 | 0.000 | 108.8 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_31_fields | 13204000 | 66.485ns | 0.000ns | 66.485ns | 66.485ns | 0.841 | 0.001 | 729.9 |
| record_multiplexer_bench_fixture.avro_385_byte_message_linear_62_fields | 6602000 | 123.961ns | 0.000ns | 123.961ns | 123.961ns | 1.417 | 0.002 | 1368.3 |
| record_multiplexer_bench_fixture.avro_385_byte_message_nested_31_levels | 3301000 | 95.581ns | 0.000ns | 95.581ns | 95.581ns | 2.410 | 0.004 | 1215.1 |
| record_multiplexer_bench_fixture.avro_385_byte_message_nested_62_levels | 3301000 | 187.473ns | 0.000ns | 187.473ns | 187.473ns | 4.574 | 0.013 | 2433.4 |
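
To compare the two runs, the median times can be divided case by case. The following is a rough sketch, not part of the PR: the medians are copied from the two result sets above, and the dictionary and function names are made up for illustration.

```python
# Median time per benchmark in nanoseconds, copied from the results above.
# Keys drop the common "record_multiplexer_bench_fixture." prefix.
without_cache_ns = {
    "protobuf_381_byte_message_linear_1_field": 21.613,
    "protobuf_381_byte_message_linear_40_fields": 191.138,
    "protobuf_381_byte_message_linear_80_fields": 360.062,
    "protobuf_384_byte_message_nested_24_levels": 280.036,
    "protobuf_386_byte_message_nested_31_levels": 365.044,
    "avro_385_byte_message_linear_1_field": 14.661,
    "avro_385_byte_message_linear_31_fields": 104.918,
    "avro_385_byte_message_linear_62_fields": 202.774,
    "avro_385_byte_message_nested_31_levels": 262.044,
    "avro_385_byte_message_nested_62_levels": 532.290,
}

with_cache_ns = {
    "protobuf_381_byte_message_linear_1_field": 9.549,
    "protobuf_381_byte_message_linear_40_fields": 81.919,
    "protobuf_381_byte_message_linear_80_fields": 176.424,
    "protobuf_384_byte_message_nested_24_levels": 77.550,
    "protobuf_386_byte_message_nested_31_levels": 98.757,
    "avro_385_byte_message_linear_1_field": 9.562,
    "avro_385_byte_message_linear_31_fields": 66.485,
    "avro_385_byte_message_linear_62_fields": 123.961,
    "avro_385_byte_message_nested_31_levels": 95.581,
    "avro_385_byte_message_nested_62_levels": 187.473,
}


def speedups(before, after):
    """Return the before/after ratio per benchmark; values above 1 mean 'after' is faster."""
    return {name: before[name] / after[name] for name in before}


ratios = speedups(without_cache_ns, with_cache_ns)

# Print the cases from largest to smallest improvement.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:45s} {ratio:.2f}x")
```

By this measure every case is faster with caching, from roughly 1.5x (avro, linear, 1 field) to roughly 3.7x (protobuf, nested, 31 levels).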

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

  • none

@ballard26
Contributor Author

/dt

@vbotbuildovich
Collaborator

the below tests from https://buildkite.com/redpanda/redpanda/builds/59517#0193ae52-400d-43e7-aa22-964a6a642f8b have failed and will be retried

gtest_raft_rpunit

@vbotbuildovich
Collaborator

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59517#0193ae93-b284-4590-980c-f24d4dacbbed:

"rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_higher_level_migration_api"

@vbotbuildovich
Collaborator

Retry command for Build#59517

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/data_migrations_api_test.py::DataMigrationsApiTest.test_higher_level_migration_api

@vbotbuildovich
Collaborator

@ballard26 force-pushed the datalake-cached-schemas branch from e803d13 to 6a69bdd on December 23, 2024 22:37
@ballard26 requested review from ztlpn and andrwng on December 23, 2024 22:37
@ballard26 changed the title from "[WIP] Cache parsed schemas in datalake" to "Cache parsed schemas in datalake" on Dec 23, 2024
@ballard26 marked this pull request as ready for review on December 23, 2024 22:37