Refactor Redpanda Migrator components #3026

Open
mihaitodor wants to merge 17 commits into main from mihaitodor-add-redpanda-migrator-offset-metadata

mihaitodor
Collaborator

@mihaitodor mihaitodor commented Nov 21, 2024

I hijacked this PR to address several issues:

Added

  • New redpanda_migrator_offsets input.
  • Fields offset_topic, offset_group, offset_partition, offset_commit_timestamp and offset_metadata added to the redpanda_migrator_offsets output.
  • Fields kafka_key and max_in_flight for the redpanda_migrator_offsets output are now deprecated.
  • Field batching for the redpanda_migrator output is now deprecated.
  • Field topic_lag_refresh_period added to the redpanda and redpanda_common inputs.
  • Metric redpanda_lag now emitted by the redpanda and redpanda_common inputs.
  • Metadata kafka_lag now emitted by the redpanda and redpanda_common inputs.

Fixed

  • The redpanda_migrator_bundle output now skips schema ID translation when translate_schema_ids: false and schema_registry is configured.
  • The redpanda_migrator output no longer rejects messages if it can't perform schema ID translation.
  • The redpanda_migrator input no longer converts the kafka key to string.
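
The first two fixes above can be illustrated with a minimal sketch. It assumes the payload uses the common schema registry wire format (a zero magic byte followed by a 4-byte big-endian schema ID). The function name `translateSchemaID` and the `idMap` lookup are hypothetical stand-ins and not the code in this PR; a real implementation would resolve IDs against the destination schema registry.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// translateSchemaID rewrites the schema ID in a payload that uses the schema
// registry wire format: a zero magic byte followed by a 4-byte big-endian ID.
// idMap is a hypothetical source-to-destination ID lookup (an assumption made
// for this sketch). Instead of rejecting a message, the function returns the
// payload unchanged when translation is disabled, the header is missing, or
// the ID is unknown.
func translateSchemaID(payload []byte, enabled bool, idMap map[uint32]uint32) []byte {
	if !enabled || len(payload) < 5 || payload[0] != 0 {
		return payload
	}
	dstID, ok := idMap[binary.BigEndian.Uint32(payload[1:5])]
	if !ok {
		return payload
	}
	// Copy before writing so the caller's slice is not mutated.
	out := make([]byte, len(payload))
	copy(out, payload)
	binary.BigEndian.PutUint32(out[1:5], dstID)
	return out
}

func main() {
	msg := []byte{0, 0, 0, 0, 7, 'h', 'i'}
	idMap := map[uint32]uint32{7: 42}
	fmt.Println(translateSchemaID(msg, true, idMap)[4])  // ID translated
	fmt.Println(translateSchemaID(msg, false, idMap)[4]) // translation skipped
}
```

Passing the message through untouched is only one possible fallback; whether the actual output also logs or counts such cases is not stated here.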

Changed

  • The kafka_key and max_in_flight fields of the redpanda_migrator_offsets output are now deprecated.
  • Fields batch_size and multi_header for the redpanda_migrator input are now deprecated.
  • The redpanda_migrator_bundle input and output now set labels for their subcomponents.
  • The redpanda_migrator input no longer emits tombstone messages.

Redpanda Migrator offset metadata

One quick way to test this is with the following config. Note how I overwrite kafka_offset_metadata with "foobar" in a mapping processor.

input:
  redpanda_migrator_bundle:
    redpanda_migrator:
      seed_brokers: [ "localhost:9092" ]
      topics:
        - '^[^_]' # Skip internal topics which start with `_`
      regexp_topics: true
      consumer_group: migrator_bundle
      start_from_oldest: true
      replication_factor_override: true
      replication_factor: -1

    schema_registry:
      url: http://localhost:8081
      include_deleted: true
      subject_filter: ""

output:
  processors:
    - switch:
        - check: metadata("input_label") == "redpanda_migrator_offsets_input"
          processors:
            - mapping: |
                meta kafka_offset_metadata = "foobar"
  redpanda_migrator_bundle:
    redpanda_migrator:
      seed_brokers: [ "localhost:9093" ]
      max_in_flight: 1
      replication_factor_override: true
      replication_factor: -1

    schema_registry:
      url: http://localhost:8082

@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch from 34421d0 to 081592f Compare November 21, 2024 02:46
@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch 12 times, most recently from a86bdbd to 72237c4 Compare December 12, 2024 01:18
@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch 5 times, most recently from d37239f to 784ff42 Compare December 16, 2024 11:16
@mihaitodor mihaitodor changed the title Add Redpanda Migrator offset metadata Refactor Redpanda Migrator components Dec 16, 2024
@mihaitodor mihaitodor marked this pull request as ready for review December 16, 2024 11:21
log: res.Logger(),
shutSig: shutdown.NewSignaller(),
clientOpts: optsFn,
topicLagGauge: res.Metrics().NewGauge("redpanda_lag", "topic", "partition"),
Collaborator Author

@mihaitodor mihaitodor Dec 16, 2024

When I added the redpanda_migrator input, I had both this gauge and the kafka_lag metadata field. I don't know if we want either of these enabled by default. Also, should the gauge name be derived from the actual input type (redpanda, redpanda_common, redpanda_migrator, redpanda_migrator_offsets)? It does get the label of the input if one is set, so maybe that's sufficient.
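
To make the naming question concrete, here is a rough sketch of the "derive the name from the input type" option. Everything in it is a hypothetical stand-in: `lagGauge` is an in-memory substitute for the real metrics gauge, and the `<input type>_lag` naming scheme and the lag formula (end offset minus committed offset, the usual definition of consumer lag) are assumptions, not what this PR implements.

```go
package main

import "fmt"

// seriesKey identifies one gauge series by its two label values.
type seriesKey struct {
	topic     string
	partition int32
}

// lagGauge is a hypothetical in-memory stand-in for a gauge with "topic" and
// "partition" labels.
type lagGauge struct {
	name   string
	series map[seriesKey]int64
}

// newLagGauge derives the gauge name from the input type, for example
// "redpanda" becomes "redpanda_lag". This naming scheme is an assumption.
func newLagGauge(inputType string) *lagGauge {
	return &lagGauge{name: inputType + "_lag", series: map[seriesKey]int64{}}
}

// set records the lag for one topic partition, assuming lag is the end offset
// minus the committed offset, clamped at zero.
func (g *lagGauge) set(topic string, partition int32, endOffset, committedOffset int64) {
	lag := endOffset - committedOffset
	if lag < 0 {
		lag = 0
	}
	g.series[seriesKey{topic, partition}] = lag
}

func main() {
	g := newLagGauge("redpanda_migrator")
	g.set("orders", 0, 100, 40)
	fmt.Println(g.name, g.series[seriesKey{"orders", 0}])
}
```

The alternative discussed above keeps a single `redpanda_lag` name and relies on the input label to tell the series apart, which avoids multiplying metric names.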

@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch from 784ff42 to 642fd09 Compare December 16, 2024 11:43
- New `redpanda_migrator_offsets` input
- Fields `offset_topic`, `offset_group`, `offset_partition`, `offset_commit_timestamp` and `offset_metadata` added to the `redpanda_migrator_offsets` output

Signed-off-by: Mihai Todor <[email protected]>
This is required in order to pull in twmb/franz-go#838

This is needed because the `redpanda_migrator` input needs to
create all the matched topics during the first call to
`ReadBatch()`.

Signed-off-by: Mihai Todor <[email protected]>
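
The "create all the matched topics during the first call to `ReadBatch()`" requirement can be sketched with `sync.Once`. This is an illustration under assumptions, not the code in this PR: `migratorReader` and its `preflight` field are hypothetical stand-ins, and making a preflight failure sticky (rather than retrying on the next call) is a design choice made only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// migratorReader is a hypothetical stand-in for the input; only the one-time
// preflight logic is modelled here.
type migratorReader struct {
	once         sync.Once
	preflightErr error
	preflight    func() error // for example: create all matched topics
}

// ReadBatch runs the preflight hook exactly once, on the first call. If the
// hook failed, every call returns that same error.
func (m *migratorReader) ReadBatch() (string, error) {
	m.once.Do(func() { m.preflightErr = m.preflight() })
	if m.preflightErr != nil {
		return "", m.preflightErr
	}
	return "batch", nil
}

func main() {
	created := 0
	r := &migratorReader{preflight: func() error { created++; return nil }}
	for i := 0; i < 3; i++ {
		if _, err := r.ReadBatch(); err != nil {
			fmt.Println("read failed:", err)
		}
	}
	fmt.Println("preflight runs:", created)
}
```
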
@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch 4 times, most recently from 5de511f to a7b9103 Compare December 31, 2024 02:33
@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch 2 times, most recently from 34c5d16 to 5749553 Compare December 31, 2024 14:13
@mihaitodor mihaitodor requested review from Jeffail, rockwotj and ooesili and removed request for Jeffail December 31, 2024 14:50
@mihaitodor mihaitodor force-pushed the mihaitodor-add-redpanda-migrator-offset-metadata branch from 5749553 to bcfeae2 Compare December 31, 2024 15:07
@@ -76,18 +105,25 @@ type FranzReaderOrdered struct {

// NewFranzReaderOrderedFromConfig attempts to instantiate a new
// FranzReaderOrdered reader from a parsed config.
-func NewFranzReaderOrderedFromConfig(conf *service.ParsedConfig, res *service.Resources, optsFn func() ([]kgo.Opt, error)) (*FranzReaderOrdered, error) {
+func NewFranzReaderOrderedFromConfig(conf *service.ParsedConfig, res *service.Resources, clientOptsFn clientOptsFn, recordToMessageFn recordToMessageFn, preflightHookFn preflightHookFn, closeHookFn closeHookFn) (*FranzReaderOrdered, error) {
Collaborator Author

This constructor got a bit messy to use... It can be hard to tell which of these funcs is set to nil at the call site, and one can easily mix them up. I'm thinking of introducing functional options for it, or maybe a struct which contains all the parameters. WDYT?
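
For reference, the functional options variant could look roughly like the sketch below. The hook signatures are not shown in this diff, so the `preflightHookFn` and `closeHookFn` types here are simplified assumptions, and `reader`, `Option`, `WithPreflightHook`, `WithCloseHook` and `newReader` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Stand-in hook types. The real signatures are not shown in the diff, so
// these are simplified assumptions.
type (
	preflightHookFn func() error
	closeHookFn     func()
)

// reader is a hypothetical stand-in for FranzReaderOrdered.
type reader struct {
	preflightHook preflightHookFn
	closeHook     closeHookFn
}

// Option mutates a reader during construction.
type Option func(*reader)

// WithPreflightHook sets a hook intended to run before the first read.
func WithPreflightHook(fn preflightHookFn) Option {
	return func(r *reader) { r.preflightHook = fn }
}

// WithCloseHook sets a hook intended to run on shutdown.
func WithCloseHook(fn closeHookFn) Option {
	return func(r *reader) { r.closeHook = fn }
}

// newReader applies only the options the caller passes. Unset hooks stay nil,
// so no positional nil arguments appear at the call site.
func newReader(opts ...Option) *reader {
	r := &reader{}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

func main() {
	r := newReader(WithCloseHook(func() { fmt.Println("closed") }))
	fmt.Println(r.preflightHook == nil, r.closeHook != nil)
}
```

With named options each hook is identified at the call site, so two funcs cannot be swapped by accident. A parameter struct with named fields would give the same benefit with less machinery, at the cost of a slightly more verbose call.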
