datalake: relax translation_stm::max_collectible_offset() value (and add compaction_test.py) #24610

Merged

@WillemKauf WillemKauf commented Dec 18, 2024

Previously, datalake::translation::translation_stm computed its max collectible offset as follows:

model::offset translation_stm::max_collectible_offset() {
    if (!_raft->log_config().iceberg_enabled()) {
        return model::offset::max();
    }
    // if offset is not initialized, do not attempt translation.
    if (_highest_translated_offset == kafka::offset{}) {
        return model::offset{};
    }
    return _raft->log()->to_log_offset(
      kafka::offset_cast(_highest_translated_offset));
}

This offset translation leads to an overly restrictive max collectible offset because it is unaware of translation batches.

This PR adds the utility function highest_log_offset_below_next(), which returns the "equivalent" translated log offset for a given Kafka offset, taking into account translation batches (which don't need to be translated and thus shouldn't restrict the max collectible offset).

translation_stm::max_collectible_offset() now uses this function to relax its returned offset.
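To illustrate why the relaxed bound matters for retention, here is a rough toy model, not code from this PR. It assumes a segment may be collected only when its last log offset is at or below the max collectible offset; the `Segment` type, the `collectible` helper, and the concrete offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int  # first log offset in the segment
    last: int  # last log offset in the segment


def collectible(segments, max_collectible_offset):
    """Segments that lie entirely at or below the bound (toy rule)."""
    return [s for s in segments if s.last <= max_collectible_offset]


# Toy log: offset 0 is a translated data batch, offsets 1-3 are batches that
# need no translation, and offsets 4-9 hold data that is not translated yet.
segments = [Segment(0, 3), Segment(4, 9)]

old_bound = 0  # log offset of the last translated Kafka record
new_bound = 3  # last non-data batch before the next Kafka record

pinned_before = collectible(segments, old_bound)  # nothing is eligible
freed_after = collectible(segments, new_bound)    # first segment is eligible
```

Under these assumptions the first segment stays pinned with the old bound only because it ends in batches that never needed translating.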

Additionally, a new test for compaction with an Iceberg enabled topic is added to datalake/compaction_test.py, with some enhancements to the datalake_verifier service to make it compaction aware.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Improvements

  • Fixes an overly restrictive condition for retention in Iceberg-enabled topics.

@@ -24,22 +24,24 @@

class DatalakeVerifier():
"""
Verifier that does the verification of the data in the redpanda Iceberg table.
The verifier consumes offsets from specified topic and verifies it the data
Verifier that does the verification of the data in the redpanda Iceberg table.
Contributor Author

Trailing whitespace removal

vbotbuildovich commented Dec 19, 2024

Retry command for Build#59935

please wait until all jobs are finished before running the slash command



/ci-repeat 1
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_ChunkedRead == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[2,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}}
tests/rptest/tests/write_caching_fi_test.py::WriteCachingFailureInjectionTest.test_crash_all
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[2,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, SpilloverManifestUploaded == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True)"}}
tests/rptest/tests/write_caching_fi_e2e_test.py::WriteCachingFailureInjectionE2ETest.test_crash_all@{"use_transactions":false}
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_topic_lifecycle@{"cloud_storage_type":1,"filesystem_catalog_mode":false}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_Timequery == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"path"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True)"}}
tests/rptest/tests/datalake/datalake_e2e_test.py::DatalakeE2ETests.test_topic_lifecycle@{"cloud_storage_type":1,"filesystem_catalog_mode":true}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, TS_Timequery == True, SpilloverManifestUploaded == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"path"],"test_case":{"name":"(TS_Read == True, TS_TxRangeMaterialized == True, SpilloverManifestUploaded == True)"}}
tests/rptest/tests/tiered_storage_model_test.py::TieredStorageTest.test_tiered_storage@{"cloud_storage_type_and_url_style":[1,"virtual_host"],"test_case":{"name":"(TS_Read == True, AdjacentSegmentMergerReupload == True)"}}

vbotbuildovich commented Dec 19, 2024

CI test results

test results on build#59935
test_id test_kind job_url test_status passed
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_topic_lifecycle.cloud_storage_type=CloudStorageType.S3.filesystem_catalog_mode=False ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c804-449e-983b-f6040b48eed2 FAIL 0/6
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_topic_lifecycle.cloud_storage_type=CloudStorageType.S3.filesystem_catalog_mode=False ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f3-40db-9e3c-b72686edcfd1 FAIL 0/6
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_topic_lifecycle.cloud_storage_type=CloudStorageType.S3.filesystem_catalog_mode=True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c806-4851-8e40-029c7bdf36d7 FAIL 0/6
rptest.tests.datalake.datalake_e2e_test.DatalakeE2ETests.test_topic_lifecycle.cloud_storage_type=CloudStorageType.S3.filesystem_catalog_mode=True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f4-48f5-b848-e36477ea95e1 FAIL 0/6
rptest.tests.datalake.partition_movement_test.PartitionMovementTest.test_cross_core_movements.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c803-4c2f-9cd0-86f9a8dd5064 FLAKY 5/6
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c807-4275-8982-8723111a2347 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f4-48f5-b848-e36477ea95e1 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c803-4c2f-9cd0-86f9a8dd5064 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f6-49d5-ae5a-d5eb3497ad6e FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c804-449e-983b-f6040b48eed2 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f2-467e-a057-0e4a790311ae FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c806-4851-8e40-029c7bdf36d7 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f3-40db-9e3c-b72686edcfd1 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f6-49d5-ae5a-d5eb3497ad6e FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c807-4275-8982-8723111a2347 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f4-48f5-b848-e36477ea95e1 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_ChunkedRead==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c803-4c2f-9cd0-86f9a8dd5064 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_Timequery==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c804-449e-983b-f6040b48eed2 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_Timequery==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f2-467e-a057-0e4a790311ae FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_Timequery==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c806-4851-8e40-029c7bdf36d7 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_Timequery==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f3-40db-9e3c-b72686edcfd1 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c807-4275-8982-8723111a2347 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f4-48f5-b848-e36477ea95e1 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c803-4c2f-9cd0-86f9a8dd5064 FAIL 0/1
rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.TS_TxRangeMaterialized==True.SpilloverManifestUploaded==True ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f6-49d5-ae5a-d5eb3497ad6e FAIL 0/1
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all.use_transactions=False ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c804-449e-983b-f6040b48eed2 FAIL 0/1
rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all.use_transactions=False ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f4-48f5-b848-e36477ea95e1 FAIL 0/1
rptest.tests.write_caching_fi_test.WriteCachingFailureInjectionTest.test_crash_all ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc18-c803-4c2f-9cd0-86f9a8dd5064 FAIL 0/1
rptest.tests.write_caching_fi_test.WriteCachingFailureInjectionTest.test_crash_all ducktape https://buildkite.com/redpanda/redpanda/builds/59935#0193dc1c-24f3-40db-9e3c-b72686edcfd1 FAIL 0/1
test results on build#60023
test_id test_kind job_url test_status passed
rptest.tests.datalake.partition_movement_test.PartitionMovementTest.test_cross_core_movements.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/60023#0193e62a-cfd5-4342-9d5c-bd82d737eba0 FLAKY 2/6
test results on build#60079
test_id test_kind job_url test_status passed
rptest.tests.delete_records_test.DeleteRecordsTest.test_delete_records_concurrent_truncations.cloud_storage_enabled=True.truncate_point=start_offset ducktape https://buildkite.com/redpanda/redpanda/builds/60079#0193f494-6494-4f55-ac0e-69cf1956752e FLAKY 5/6
test results on build#60368
test_id test_kind job_url test_status passed
rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade ducktape https://buildkite.com/redpanda/redpanda/builds/60368#01944294-d5f0-4c98-8132-4655bda7db82 FLAKY 5/6
rptest.tests.partition_reassignments_test.PartitionReassignmentsTest.test_reassignments_kafka_cli ducktape https://buildkite.com/redpanda/redpanda/builds/60368#019442af-789d-4ffa-a231-a3a2c2539ba9 FLAKY 1/6
rptest.transactions.stream_verifier_test.StreamVerifierTest.test_simple_produce_consume_txn_with_add_node ducktape https://buildkite.com/redpanda/redpanda/builds/60368#01944294-d5ed-4718-a54f-7ac9bb15fdcc FLAKY 5/6
storage_e2e_single_thread_rpunit.storage_e2e_single_thread_rpunit unit https://buildkite.com/redpanda/redpanda/builds/60368#01944252-4124-4c76-85c2-f3165a0ba962 FLAKY 1/2
test results on build#60544
test_id test_kind job_url test_status passed
rm_stm_tests_rpunit.rm_stm_tests_rpunit unit https://buildkite.com/redpanda/redpanda/builds/60544#01944d38-a71c-4bb4-86a4-b75b1322f1b0 FLAKY 1/2
test results on build#60563
test_id test_kind job_url test_status passed
rm_stm_tests_rpunit.rm_stm_tests_rpunit unit https://buildkite.com/redpanda/redpanda/builds/60563#01944e3e-7668-4063-8ce8-c9bff6ddcdc7 FLAKY 1/2
rptest.tests.datalake.simple_connect_test.RedpandaConnectIcebergTest.test_translating_avro_serialized_records.cloud_storage_type=CloudStorageType.S3 ducktape https://buildkite.com/redpanda/redpanda/builds/60563#01944e84-4faa-45d6-848a-06d1d9cef224 FLAKY 5/6
rptest.tests.partition_reassignments_test.PartitionReassignmentsTest.test_reassignments_kafka_cli ducktape https://buildkite.com/redpanda/redpanda/builds/60563#01944e84-4faa-45d6-848a-06d1d9cef224 FLAKY 1/6
test results on build#60597
test_id test_kind job_url test_status passed
rm_stm_tests_rpunit.rm_stm_tests_rpunit unit https://buildkite.com/redpanda/redpanda/builds/60597#019450cc-9bd3-4e5f-afb0-e1ec69fbb762 FLAKY 1/2
rm_stm_tests_rpunit.rm_stm_tests_rpunit unit https://buildkite.com/redpanda/redpanda/builds/60597#019450cc-9cc5-4fda-8817-501bafa1c3fe FLAKY 1/2
rptest.tests.compaction_recovery_test.CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade ducktape https://buildkite.com/redpanda/redpanda/builds/60597#01945126-d4c5-4f13-bf17-1e1af69bd8cc FLAKY 5/6

WillemKauf commented Dec 19, 2024

Lots of KgoVerifierProducer failures: panic: Out of order offset 0 (vs 0 20000).

Not sure if this is another KgoVerifierProducer issue or if something else has been broken.

The only related change I can see in KgoVerifier was this, in which pw.validOffsets.Insert() is now called under a lock in the new function OnAcked (but CI must have run for this change many times before seeing these failures, so I am uncertain).

EDIT: Probably just because of the oneshot() changes I made. Reverted.

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 6c113d7 to 0e1a24c Compare December 20, 2024 14:32
@WillemKauf
Contributor Author

Force push to:

  • Revert changes to kgo_verifier_service::oneshot().

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 0e1a24c to 2fe6c55 Compare December 20, 2024 15:05
@vbotbuildovich
Collaborator

Retry command for Build#60016

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/datalake/compaction_test.py::CompactionTest.test_compaction@{"cloud_storage_type":1}

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 2fe6c55 to b01095c Compare December 20, 2024 21:09
@WillemKauf
Contributor Author

Force push to:

  • Change compaction wait condition in compaction_test.py. Translation seems to slow the compaction process down quite a bit.

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from b01095c to 4bd7693 Compare December 23, 2024 16:19
@WillemKauf
Contributor Author

Force push to:

  • Add two new tests to translated_log_offset_test.cc
  • Remove early return in get_translated_log_offset() to correct behavior for the edge case of kafka::offset{}
  • Add comment to get_translated_log_offset() declaration about its use.

Contributor

@bharathv bharathv left a comment

sorry for the delay here.

src/v/datalake/translation/state_machine.h (outdated, resolved)
tests/rptest/tests/datalake/datalake_verifier.py (outdated, resolved)
@@ -24,22 +24,24 @@

Contributor

Compaction is expected to block until translation happens. What additional coverage does verification with a compacted log with a fully translated iceberg table provide?

Contributor Author

> Compaction is expected to block until translation happens
>
> What additional coverage does verification with a compacted log with a fully translated iceberg table provide?

That the aforementioned expectation is true (i.e., the Iceberg table is fully translated and the log is fully compacted).

Do you think there is other verification that should be added here?

Contributor

> That the aforementioned expectation is true (i.e., the Iceberg table is fully translated and the log is fully compacted).

Correct me if I'm wrong but the verifier in the current form can also succeed if the topic got translated from a compacted log (hypothetically if the code violated the max_collectible_offset invariant), no?

Contributor Author

@WillemKauf WillemKauf Jan 7, 2025

Yes, I suppose that is true.

In its current form, the verifier handles the cases where the Iceberg table has as much or more information than the log, on the assumption that translation happened before the log was compacted (or that no compaction took place). It doesn't verify that the case you described didn't occur.
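A minimal sketch of the gap being discussed, under stated assumptions: the two check functions below are hypothetical and are not the datalake_verifier's actual logic. A check that only asks whether the table covers what is left in the log passes both for a table translated before compaction and for one translated from an already compacted log; telling them apart needs a record of what was originally produced.

```python
def table_covers_log(log_records, table_rows):
    """True if every (offset, key) still in the log also exists in the table."""
    return set(log_records) <= set(table_rows)


def table_has_all_produced(produced, table_rows):
    """Stricter check: every produced record must have reached the table."""
    return set(produced) <= set(table_rows)


# Records as (offset, key). Key "a" is overwritten at offset 2.
produced = [(0, "a"), (1, "b"), (2, "a")]
# After compaction only the latest record per key survives in the log.
compacted_log = [(1, "b"), (2, "a")]

table_from_full_log = list(produced)            # translated before compaction
table_from_compacted_log = list(compacted_log)  # hypothetical invariant violation

ok_full = table_covers_log(compacted_log, table_from_full_log)
ok_lossy = table_covers_log(compacted_log, table_from_compacted_log)
strict_full = table_has_all_produced(produced, table_from_full_log)
strict_lossy = table_has_all_produced(produced, table_from_compacted_log)
```

In this toy setup `ok_full` and `ok_lossy` are both true, while only the stricter check distinguishes the two tables.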

Contributor

Thanks, so I was wondering if we should instead test enforcement of max_collectible_offset which is a more critical invariant and leave the verifier as it is today.

Contributor Author

I agree that perhaps we haven't covered the case where the max_collectible_offset invariant is violated, but I do think that the existing changes to the verifier are helpful to at least ensure that the case in which it ISN'T violated is correct.

I'll try to think of ways to better cover the critical invariant in a follow up PR, if that works for you.

@@ -118,3 +120,126 @@ def test_translation_no_gaps(self, cloud_storage_type):
include_query_engines=[QueryEngineType.TRINO
]) as dl:
self.do_test_no_gaps(dl)


class CompactionTest(RedpandaTest):
Contributor

One of the motivations for removing offset translation was to use Iceberg-enabled topics with read replicas/topic recovery. Wondering if it's worth adding an e2e test for it.

Contributor Author

Would you like to see a test added in this PR, or as a follow up PR?

Contributor

@bharathv bharathv Jan 7, 2025

Follow up PR is also fine. (just want to ensure nothing else is broken, other than offset translation before we declare it as working).

Contributor

FWIW read replicas or topic recovery don't appear to be addressed by this PR.

Read replica translators won't be able to perform offset translation on anything, and topic recovery will likely require changes to what revisions and topic overrides get passed to the coordinators.

I don't believe either are in scope of this work (which is really just to unblock compaction, IIUC).

Contributor

Good point about topic recovery needing more work. I think @~bashtanov is doing some work here in the migrations land.

> Read replica translators won't be able to perform offset translation on anything,

What's the use case here with read replicas, though? IIRC it's about being able to (SQL) query an Iceberg-enabled RRR topic, and that's it? The RRR topic itself cannot do any translation locally.

Contributor

That's right. But as is, RRR topics won't be able to call the to_log_offset() that's done in the translation path. I think it's reasonable to have the translator skip the offset-translation and pass a nullopt to the STM as the log offset.

src/v/datalake/translation/partition_translator.cc (outdated, resolved)
src/v/datalake/translation/tests/state_machine_test.cc (outdated, resolved)
@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 4bd7693 to d1ef28c Compare January 7, 2025 15:40
@WillemKauf
Contributor Author

Force push to:

  • Rebase to upstream/dev and fix merge conflicts
  • Rename new_log_translated_offset -> new_translated_log_offset in state_machine.cc/.h
  • Add comment in datalake_verifier.py around _expected_compacted_keys
  • Move changes for state_machine_test.cc to proper commit

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from d1ef28c to 7d564a7 Compare January 7, 2025 19:44
@WillemKauf
Contributor Author

Force push to:

  • Rebase to upstream/dev and fix merge conflicts

bharathv
bharathv previously approved these changes Jan 7, 2025
src/v/datalake/translation/partition_translator.cc (outdated, resolved)
src/v/datalake/translation/state_machine.h (outdated, resolved)
Comment on lines 34 to 36
model::offset
get_translated_log_offset(ss::shared_ptr<storage::log> log, kafka::offset o);
Contributor

Given it may be confusing to refer to "translation" here in multiple contexts (I'm actually not sure if you mean it as offset-translated or datalake-translated), it may be more self-descriptive if this were named highest_log_offset_below(kafka::offset), where the translator would pass in kafka::next_offset(max_translated_offset).

Contributor

Or consider the name highest_log_offset_below_next()

Contributor Author

Being able to call this function in translation_stm::max_collectible_offset() like

return get_translated_log_offset(_raft->log(), _highest_translated_offset);

feels better than having to manipulate the passed offset outside the function before calling it each time.

If you feel strongly about this I can change the name. It is unfortunate that "translation" can mean two different things in this context, but I hope the code comments are descriptive enough?

Contributor

I feel strongly about the name. We already have so many named offsets here and there, and this one doesn't seem so pivotal that it needs a special name. My vote is for highest_log_offset_below_next().

tests/rptest/tests/datalake/datalake_verifier.py (outdated, resolved)
Comment on lines 31 to 39
bool check_translated_log_offset(
  ss::shared_ptr<storage::log> log,
  kafka::offset translated_offset,
  model::offset expected_offset) {
    auto translated_log_offset
      = datalake::translation::get_translated_log_offset(
        log, translated_offset);
    return expected_offset == translated_log_offset;
}
Contributor

nit: this feels like quite a lot of testing for something that is ultimately just calling into the offset translator, and so this feels like we're just testing the offset translator. Wondering if we can instead write tests that check the max collectible offset? Plus, if you go the highest_log_offset_below() route, this also all changes

Contributor Author

Yeah, ultimately this is more of an offset translation test, but I wanted to have a set of very illustrative unit tests that only tested this mechanism, and not the state machine as a whole.

src/v/datalake/translation/utils.h (outdated, resolved)
@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 7d564a7 to af320da Compare January 9, 2025 21:30
@WillemKauf
Contributor Author

Force push to:

  • Remove persistence of _highest_translated_log_offset in datalake/translation structures
  • Directly call into get_translated_log_offset() within translation_stm::max_collectible_offset()
  • Add check for RRR in translation_stm::max_collectible_offset()

@WillemKauf WillemKauf changed the title datalake: remove offset translation from translation_stm (and add compaction_test.py) datalake: relax translation_stm::max_collectible_offset() value (and add compaction_test.py) Jan 9, 2025

andrwng commented Jan 9, 2025

> Add check for RRR in translation_stm::max_collectible_offset()

Mind removing this from this PR and following up with a test in a separate PR? IMO this one here should be focused on unblocking compaction

@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from af320da to 5dc456c Compare January 9, 2025 22:37
@WillemKauf
Contributor Author

Force push to:

  • Rename get_translated_log_offset() -> highest_log_offset_below_next()
  • Remove commit Add check for RRR in translation_stm::max_collectible_offset()
  • Refactor code comment text width in datalake_verifier.py

And most importantly, add the function `highest_log_offset_below_next()`.
This function will be used to compute the appropriate highest translated
log offset for a given translated kafka offset while taking into account
translator batches.

This will allow us to be less pessimistic about the `max_collectible_offset`
returned by the `translation_stm` in the future.
@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 5dc456c to 9db2f43 Compare January 10, 2025 13:59
@WillemKauf
Contributor Author

Force push to:

  • Rebase to upstream/dev to fix linter CI issues

This will return a less restrictive value for
`translation_stm::max_collectible_offset()`.
By handling gaps in offsets and recording seen keys, we can validate the
correctness of a compacted log that has been translated (fully) into an
iceberg table.
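As a rough sketch of what "handling gaps in offsets and recording seen keys" could look like (an illustration under stated assumptions, not the datalake_verifier code; `scan_compacted` and `is_fully_compacted` are hypothetical names): offsets must still strictly increase, gaps are tolerated, and the last offset seen per key is recorded.

```python
def scan_compacted(records):
    """Validate offsets and return the last offset seen for each key.

    records: iterable of (offset, key) in consumption order. Offsets must
    strictly increase; gaps are allowed because compaction may have removed
    superseded records.
    """
    last_offset = -1
    latest_by_key = {}
    for offset, key in records:
        if offset <= last_offset:
            raise ValueError(f"offset {offset} is not after {last_offset}")
        last_offset = offset
        latest_by_key[key] = offset
    return latest_by_key


def is_fully_compacted(records):
    """Toy definition: every key appears exactly once in the consumed log."""
    records = list(records)
    return len(scan_compacted(records)) == len(records)


latest = scan_compacted([(1, "b"), (4, "a"), (7, "c")])
```

The "exactly once per key" rule is a simplification for this sketch; a real log may keep uncompacted records in its most recent segment.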
Adds a new `test_compaction` test, which uses the `KgoVerifierSeqConsumer`
to validate a fully compacted log, along with the `datalake_verifier`
service to validate the Iceberg table.

Also moves the contents of `compaction_gaps_test.py` into
`compaction_test.py`.
@WillemKauf WillemKauf force-pushed the datalake_translator_offset_fix branch from 9db2f43 to 3862ad0 Compare January 10, 2025 15:12
@WillemKauf
Contributor Author

Force push to:

  • Fix bazel build deps

Contributor

@bharathv bharathv left a comment

lgtm, cover letter needs to be updated to highest_log_offset_below_next?

@WillemKauf WillemKauf merged commit 205a168 into redpanda-data:dev Jan 13, 2025
19 checks passed
Comment on lines +20 to +36
// Returns the equivalent log offset which can be considered translated by the
// datalake subsystem, while taking into account translator batch types, for a
// given kafka offset.
//
// Note that the provided kafka::offset o MUST be a valid offset, i.e one that
// has been produced to the log. This function will always return a value, and
// its correctness depends on the validity of the input offset.
//
// For example, in the following situation:
//   Kaf offsets: [O]   .   .     .      [NKO]
//   Log offsets: [K]  [C] [C] [C/TLO]   [NKO]
// where O is the input offset, K is the last kafka record, C is a translator
// (Config) batch, TLO is the translated log offset, and NKO is the next
// expected kafka record. We should expect TLO to be equal to the offset of the
// last configuration batch before the next kafka record.
model::offset highest_log_offset_below_next(
  ss::shared_ptr<storage::log> log, kafka::offset o);
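The situation in the comment above can be worked through with a small Python model. This is only a sketch under simplifying assumptions: every batch occupies exactly one log offset, the log is a plain list of batch kinds, and both functions are toy stand-ins rather than the real offset translator. The fallback used when no later data batch exists is also an assumption of this model.

```python
# One entry per log offset: "data" is a Kafka record batch, "config" stands in
# for any batch type that the datalake translator skips.
LOG = ["data", "config", "config", "config", "data"]


def to_log_offset(log, kafka_offset):
    """Log offset of the data batch holding the given Kafka offset."""
    seen = -1
    for log_offset, kind in enumerate(log):
        if kind == "data":
            seen += 1
            if seen == kafka_offset:
                return log_offset
    raise ValueError("Kafka offset not present in the log")


def highest_log_offset_below_next(log, kafka_offset):
    """Last log offset before the data batch that follows kafka_offset."""
    try:
        return to_log_offset(log, kafka_offset + 1) - 1
    except ValueError:
        # Assumption: with no later data batch, the end of the log qualifies.
        return len(log) - 1


old_value = to_log_offset(LOG, 0)                  # offset of K
new_value = highest_log_offset_below_next(LOG, 0)  # offset of C/TLO
```

With Kafka offset 0 as the input O, the plain translation lands on K at log offset 0, while the relaxed function lands on the last config batch at log offset 3, just before NKO at log offset 4.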
Member

🥳


6 participants