@oleiman
Member

@oleiman oleiman commented Jan 7, 2026

Find the multipart boundary string in a response's content-type header. This code is identical between ABS and GCS versions. Includes unit tests.

PR also improves surrounding unit tests slightly w/ use of gmock matchers.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

  • none

@oleiman oleiman self-assigned this Jan 7, 2026
Copilot AI review requested due to automatic review settings January 7, 2026 08:09
Contributor

Copilot AI left a comment

Pull request overview

This PR extracts common multipart boundary parsing logic into a new utility function find_multipart_boundary() to reduce code duplication between Azure Blob Storage (ABS) and Google Cloud Storage (GCS) implementations. The function parses the Content-Type header to extract the boundary string from multipart/mixed responses.

Key Changes

  • Added util::find_multipart_boundary() function to parse multipart boundary strings from HTTP response headers
  • Updated existing unit tests to use gmock matchers (testing::HasSubstr) instead of manual string contains checks
  • Added comprehensive unit tests for the new boundary parsing function covering valid cases and error conditions
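
The parsing described above can be illustrated with a minimal sketch. It is not the PR's implementation: `find_boundary_sketch` is a hypothetical name, it takes the raw Content-Type value rather than a response header object, it returns `std::optional` instead of the PR's error type, and it assumes the parameter is spelled exactly `boundary=` with no whitespace around the equals sign.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for util::find_multipart_boundary(); the real
// signature and error type are not shown in this conversation.
// Input: the raw Content-Type header value. Output: the boundary, if any.
inline std::optional<std::string_view>
find_boundary_sketch(std::string_view content_type) {
    constexpr std::string_view media_type = "multipart/mixed";
    constexpr std::string_view param = "boundary=";

    // Reject anything that is not a multipart/mixed response.
    if (content_type.substr(0, media_type.size()) != media_type) {
        return std::nullopt;
    }

    // Locate the boundary parameter after the media type.
    const auto pos = content_type.find(param, media_type.size());
    if (pos == std::string_view::npos) {
        return std::nullopt;
    }
    std::string_view value = content_type.substr(pos + param.size());

    // Simplification: treat the next ';' as the end of the value.
    value = value.substr(0, value.find(';'));

    // Strip optional surrounding double quotes.
    if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
        value = value.substr(1, value.size() - 2);
    }

    if (value.empty()) {
        return std::nullopt;
    }
    return value;
}
```

The returned view points into the caller's header string, so the header must outlive the result.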

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/v/cloud_storage_clients/util.h | Declares new find_multipart_boundary() function |
| src/v/cloud_storage_clients/util.cc | Implements boundary parsing with proper error handling for missing headers, incorrect content types, and malformed boundaries |
| src/v/cloud_storage_clients/tests/util_test.cc | Refactors existing tests to use gmock matchers and adds new test cases for boundary parsing |

@oleiman oleiman force-pushed the ct/core-15023/find-boundary branch 2 times, most recently from 1c601fa to b84eb38 Compare January 7, 2026 08:49
@oleiman oleiman requested a review from Copilot January 7, 2026 08:50
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

This is well-specified and identical between ABS and GCS, so factor it out
for easier testing.

Signed-off-by: Oren Leiman <[email protected]>
@oleiman oleiman force-pushed the ct/core-15023/find-boundary branch from 76bb416 to 84e1679 Compare January 7, 2026 09:05
@vbotbuildovich
Collaborator

vbotbuildovich commented Jan 7, 2026

Retry command for Build#78637

please wait until all jobs are finished before running the slash command

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"snappy"}
tests/rptest/tests/write_caching_fi_e2e_test.py::WriteCachingFailureInjectionE2ETest.test_crash_all@{"use_transactions":false}

@vbotbuildovich
Collaborator

vbotbuildovich commented Jan 7, 2026

CI test results

test results on build#78637
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| JavaCompressionTest | test_upgrade_java_compression | {"compression_type": "snappy"} | integration | https://buildkite.com/redpanda/redpanda/builds/78637#019b97c8-c386-43de-b6d9-c31d783c365a | FLAKY | 9/11 | Test FAILS after retries. Significant increase in flaky rate(baseline=0.0000, p0=0.0000, reject_threshold=0.0100) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=JavaCompressionTest&test_method=test_upgrade_java_compression |
| DatalakeDLQTest | test_dlq_table_for_mixed_records | {"catalog_type": "rest_jdbc", "cloud_storage_type": 1, "query_engine": "spark"} | integration | https://buildkite.com/redpanda/redpanda/builds/78637#019b97c7-d8f9-4dc5-bf78-af75ffab0765 | FLAKY | 10/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.0025, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DatalakeDLQTest&test_method=test_dlq_table_for_mixed_records |
| MountUnmountIcebergTest | test_simple_remount | {"cloud_storage_type": 1} | integration | https://buildkite.com/redpanda/redpanda/builds/78637#019b97c8-c389-4368-b7b7-e4e6c8f1205b | FLAKY | 7/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.2093, p0=0.3504, reject_threshold=0.0100. adj_baseline=0.5057, p1=0.1628, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=MountUnmountIcebergTest&test_method=test_simple_remount |
| WriteCachingFailureInjectionE2ETest | test_crash_all | {"use_transactions": false} | integration | https://buildkite.com/redpanda/redpanda/builds/78637#019b97c8-c387-4d9d-8762-14c724db9752 | FLAKY | 12/21 | Test FAILS after retries. Significant increase in flaky rate(baseline=0.1147, p0=0.0011, reject_threshold=0.0100) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all |
test results on build#78645
| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WriteCachingFailureInjectionE2ETest | test_crash_all | {"use_transactions": false} | integration | https://buildkite.com/redpanda/redpanda/builds/78645#019b992f-a18f-4916-9f82-d96101c208f4 | FLAKY | 8/11 | Test PASSES after retries. No significant increase in flaky rate(baseline=0.1176, p0=0.3326, reject_threshold=0.0100. adj_baseline=0.3130, p1=0.3488, trust_threshold=0.5000) | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=WriteCachingFailureInjectionE2ETest&test_method=test_crash_all |

@oleiman oleiman requested a review from Copilot January 7, 2026 15:58
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

@oleiman
Member Author

oleiman commented Jan 7, 2026

/ci-repeat 1
skip-redpanda-build
skip-units
skip-rebase
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"snappy"}
tests/rptest/tests/write_caching_fi_e2e_test.py::WriteCachingFailureInjectionE2ETest.test_crash_all@{"use_transactions":false}

@oleiman oleiman requested a review from dotnwat January 7, 2026 17:11
    bool is_whitespace = (c == ' ' || c == '\t');
    if (!is_eq && !is_whitespace) {
        break;
    }
Member

If we have "multipart/mixed; boundaryfffff", then boundary will be the substr fffff, and then c will be f and we'll break here, and return fffff as the boundary, but I would have expected that the lack of = would mean that it is malformed?

Member Author

> boundary will be the substring fffff... but the lack of = would mean that it is malformed

correct, but a few lines down we blow away the string view if we didn't find exactly one '=' before the first non-whitespace character. there is a unit test for this 🙂

This function could use some documentation. If the diff looks ok to you aside from that, I'll add a nice block comment in the next PR.

Member

oh yeah, makes sense. i was thinking those checks were inside the while loop
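
The exchange above turns on where the '=' check lives: the loop only skips separators, and validation happens after it. A minimal sketch of that pattern follows, assuming a hypothetical helper `take_boundary_value()` whose input is whatever follows the literal `boundary` in the header value. It mirrors the quoted fragment but is not the PR's code, and it signals failure with an empty view rather than the PR's error type.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not the PR's code: `rest` is the text that follows
// the literal "boundary". Returns the boundary text, or an empty view when
// the separator is malformed.
inline std::string_view take_boundary_value(std::string_view rest) {
    std::size_t n_eq = 0;
    std::size_t i = 0;
    // Consume '=' and blanks, stopping at the first other character.
    for (; i < rest.size(); ++i) {
        const char c = rest[i];
        const bool is_eq = (c == '=');
        const bool is_whitespace = (c == ' ' || c == '\t');
        if (!is_eq && !is_whitespace) {
            break;
        }
        n_eq += is_eq ? 1 : 0;
    }
    std::string_view boundary = rest.substr(i);
    // The check sits after the loop: without exactly one '=', the candidate
    // is discarded instead of being returned.
    if (n_eq != 1) {
        boundary = {};
    }
    return boundary;
}
```

With the reviewer's input, `rest` is `fffff`: the loop breaks immediately, no '=' was counted, and the candidate is cleared.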

@oleiman oleiman requested a review from dotnwat January 8, 2026 16:01
Member

@dotnwat dotnwat left a comment

lgtm. why are we returning a std::exception_ptr for std::unexpected case? it would seem like either throwing or returning a formatted error string and letting the caller throw would be better.

@oleiman
Member Author

oleiman commented Jan 8, 2026

> lgtm. why are we returning a std::exception_ptr for std::unexpected case? it would seem like either throwing or returning a formatted error string and letting the caller throw would be better.

yeah fair, that sounds more normal. will change in follow up
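
One possible shape for the suggested follow-up, where the parser returns a formatted error string and the caller decides whether to throw, is sketched below. Every name here (`parse_error`, `boundary_result`, `parse_boundary_param`, `boundary_or_throw`) is hypothetical, and the parsing is deliberately reduced to a single `find`. The sketch uses `std::variant` so it builds as C++17; `std::expected<std::string_view, std::string>` from C++23 would express the same idea more directly.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical error type carrying a formatted message.
struct parse_error {
    std::string message;
};

// Either the boundary or a description of what went wrong.
using boundary_result = std::variant<std::string_view, parse_error>;

// Hypothetical parser: reports failure as data instead of an exception_ptr.
inline boundary_result parse_boundary_param(std::string_view content_type) {
    constexpr std::string_view key = "boundary=";
    const auto pos = content_type.find(key);
    if (pos == std::string_view::npos) {
        return parse_error{
          "no boundary parameter in Content-Type: "
          + std::string(content_type)};
    }
    return content_type.substr(pos + key.size());
}

// The caller chooses how to surface the failure; this one throws.
inline std::string_view boundary_or_throw(std::string_view content_type) {
    auto result = parse_boundary_param(content_type);
    if (const auto* err = std::get_if<parse_error>(&result)) {
        throw std::runtime_error(err->message);
    }
    return std::get<std::string_view>(result);
}
```

Keeping the throw in the caller means a unit test can inspect the error message directly, without catching anything.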

@oleiman oleiman merged commit 34b6a34 into redpanda-data:dev Jan 8, 2026
19 checks passed