Datalake: Translated offset Range #23704
Conversation
Add an implementation of data_writer_factory for batching_parquet_writer. Use this to test the data path from multiplexer through to writing parquet files.
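To make that concrete, here is a minimal sketch of such a factory in plain C++. The interfaces, method names (add_data, finish, create), and the use of std::unique_ptr are illustrative assumptions, not the project's actual Seastar-based definitions.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the per-partition writer interface used by the multiplexer.
struct data_writer {
    virtual ~data_writer() = default;
    // Append one serialized row; returns false on failure.
    virtual bool add_data(std::string serialized_row) = 0;
    // Flush buffered rows to a Parquet file and close the writer.
    virtual bool finish() = 0;
};

// Writer that batches rows in memory and writes them out as Parquet.
class batching_parquet_writer final : public data_writer {
public:
    explicit batching_parquet_writer(std::string output_path)
      : _path(std::move(output_path)) {}

    bool add_data(std::string serialized_row) override {
        _buffered.push_back(std::move(serialized_row));
        return true;
    }

    bool finish() override {
        // The real writer would encode _buffered as Parquet and write to _path.
        return !_buffered.empty();
    }

private:
    std::string _path;
    std::vector<std::string> _buffered;
};

// Factory interface the multiplexer uses, so tests can inject fakes.
struct data_writer_factory {
    virtual ~data_writer_factory() = default;
    virtual std::unique_ptr<data_writer> create(std::string output_path) = 0;
};

// Concrete factory that hands out batching Parquet writers.
struct batching_parquet_writer_factory final : data_writer_factory {
    std::unique_ptr<data_writer> create(std::string output_path) override {
        return std::make_unique<batching_parquet_writer>(std::move(output_path));
    }
};
```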
batching_parquet_writer catches different types of exceptions and transforms them into data_writer_error error codes. This is a good place to integrate some error logging.
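A minimal sketch of that exception-to-error-code translation, again in plain C++; the data_writer_error values and the logging calls are assumptions rather than the project's actual definitions, but they show where the logging hook fits.

```cpp
#include <exception>
#include <filesystem>
#include <functional>
#include <iostream>

// Illustrative error codes; not the project's actual data_writer_error enum.
enum class data_writer_error {
    ok = 0,
    file_io_error,
    parquet_conversion_error,
};

// Runs one throwing write step and translates exceptions into error codes.
// The catch sites are a natural place to log what failed and why.
data_writer_error run_write_step(const std::function<void()>& step) {
    try {
        step();
        return data_writer_error::ok;
    } catch (const std::filesystem::filesystem_error& e) {
        std::cerr << "data writer file I/O failure: " << e.what() << '\n';
        return data_writer_error::file_io_error;
    } catch (const std::exception& e) {
        std::cerr << "data writer conversion failure: " << e.what() << '\n';
        return data_writer_error::parquet_conversion_error;
    }
}
```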
The data_writer_factory::create method may need to open files or do other things that may fail. Return a result type so we can correctly indicate failure.
Previously, a failure to create a data writer was handled through a try/catch. This changes that to a result type, since that is our preferred error handling for the higher-level parts of the code. It also requires changing the writer type from std::unique_ptr to ss::shared_ptr so the writer can be returned inside a result (previously it was returned by reference).
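A minimal sketch of that result-based create contract, with a hand-rolled result type and std::shared_ptr standing in for the project's result and ss::shared_ptr; names and the failure condition are assumptions.

```cpp
#include <memory>
#include <string>
#include <variant>

// Illustrative error codes; not the project's actual enum.
enum class data_writer_error { file_io_error, parquet_conversion_error };

// Tiny stand-in for a result<T> that is either a value or an error code.
template <typename T>
class result {
public:
    result(T value) : _v(std::move(value)) {}
    result(data_writer_error err) : _v(err) {}
    bool has_value() const { return std::holds_alternative<T>(_v); }
    T& value() { return std::get<T>(_v); }
    data_writer_error error() const { return std::get<data_writer_error>(_v); }

private:
    std::variant<T, data_writer_error> _v;
};

struct data_writer {
    virtual ~data_writer() = default;
};
struct parquet_writer final : data_writer {};

// create() can fail (e.g. while opening the output file), so it returns a
// result holding a shared pointer rather than handing out a reference.
result<std::shared_ptr<data_writer>> create(const std::string& output_path) {
    if (output_path.empty()) {
        return data_writer_error::file_io_error; // could not open a file
    }
    return std::shared_ptr<data_writer>(std::make_shared<parquet_writer>());
}
```

The shared pointer is what lets the value live inside the result and be copied out by the caller, which a reference to a factory-owned writer did not allow.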
…urn it This adds a new data structure, translated_offset_range, to store the range of Kafka offsets translated into Parquet, as well as the locations of the resulting Parquet files. It modifies the record_multiplexer to return this new type.
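As a rough illustration of the shape of that structure (the field names and types here are guesses based on the description above, not the actual definition):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One Parquet file produced while translating part of the range.
struct data_file {
    std::string remote_path;  // where the file was written
    size_t row_count{0};      // records encoded into the file
    size_t file_size_bytes{0};
};

// The contiguous range of Kafka offsets that has been translated to Parquet,
// plus the files holding that data. Returned by the record multiplexer.
struct translated_offset_range {
    int64_t start_offset{0};       // first translated Kafka offset
    int64_t last_offset{0};        // last translated Kafka offset
    std::vector<data_file> files;  // Parquet files covering the range
};
```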
81bdf10 to 7c93d14 (Compare)
@rockwotj Only the top two commits here are relevant for review; the rest are from the base branch.
Closing this and moving these changes to #23683
The tests below from https://buildkite.com/redpanda/redpanda/builds/56130#019271e6-f6fa-4280-8dbe-1ea92049f1a5 have failed and will be retried
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/56130#0192723b-ac5e-4fce-9582-678b9a1eabdd
Retry command for Build #56130: please wait until all jobs are finished before running the slash command
This adds a new structure, translated_offset_range, to store a range of Kafka offsets translated into Parquet, as well as the paths to the resulting Parquet files. It modifies record_multiplexer to use this as the return type when consuming a log.
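A minimal sketch of how a consuming multiplexer might accumulate that range while reading the log and hand it back at end of stream; the class shape and method names are assumptions, not the actual record_multiplexer.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct data_file {
    std::string path;  // Parquet file produced for part of the range
};

struct translated_offset_range {
    int64_t start_offset{0};
    int64_t last_offset{0};
    std::vector<data_file> files;
};

class record_multiplexer {
public:
    // Called once per record batch while consuming the log.
    void operator()(int64_t base_offset, int64_t last_offset) {
        if (!_range) {
            _range = translated_offset_range{base_offset, last_offset, {}};
        } else {
            _range->last_offset = last_offset;
        }
    }

    // Called after the log has been fully consumed; hands back the translated
    // range, or nothing if no batches were translated.
    std::optional<translated_offset_range> end_of_stream() {
        return std::move(_range);
    }

private:
    std::optional<translated_offset_range> _range;
};
```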
Backports Required
Release Notes