@Samrat002 Samrat002 commented Sep 27, 2025

What is the purpose of the change

Native Flink S3 filesystem

Why flink-s3-fs-native?

  1. Validate that Flink can talk to S3 directly via AWS SDK v2, without Hadoop. This simplifies the stack and removes the dependency on Hadoop and Presto, so that flink-s3-fs-hadoop and flink-s3-fs-presto can be deprecated and eventually removed.
  2. Reduce classpath conflicts and shading complexity caused by Hadoop + S3A; make S3 support self-contained and easier to operate.
  3. Unify behaviour for cloud-first deployments (Kubernetes, containers, IAM roles, path-style and virtual-hosted-style addressing) with straightforward configuration in flink-conf.yaml.
  4. Establish a foundation to implement Flink-specific features (entropy injection, recoverable writer, retries/backoff) natively rather than inheriting Hadoop semantics.
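
As an illustration of the kind of Flink-specific feature mentioned in point 4, the following is a minimal sketch of entropy injection: a marker substring in the path is replaced with random characters so that objects spread across S3 key prefixes. The class name `EntropyInjectionSketch`, the method `injectEntropy`, the bucket name, and the `_entropy_` marker are hypothetical and chosen for illustration only; this is not the code in this PR.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch, not the implementation in this PR.
public class EntropyInjectionSketch {

    // Characters used for the random segment (lowercase alphanumerics).
    private static final char[] ALPHABET =
            "abcdefghijklmnopqrstuvwxyz0123456789".toCharArray();

    /**
     * Replaces the first occurrence of {@code marker} in {@code path}
     * with {@code length} random characters. Paths without the marker
     * are returned unchanged.
     */
    static String injectEntropy(String path, String marker, int length) {
        int idx = path.indexOf(marker);
        if (idx < 0) {
            return path;
        }
        StringBuilder sb = new StringBuilder(path.length() + length);
        sb.append(path, 0, idx);
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET[rnd.nextInt(ALPHABET.length)]);
        }
        sb.append(path, idx + marker.length(), path.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed example path; the marker and length would come from configuration.
        String template = "s3://my-bucket/checkpoints/_entropy_/job-1";
        System.out.println(injectEntropy(template, "_entropy_", 4));
    }
}
```

Owning this logic natively means the marker handling can follow Flink's checkpoint semantics directly instead of being layered on top of a Hadoop filesystem.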

Things that need attention

  1. There is a need for a clear strategy to deprecate and remove flink-s3-fs-hadoop and flink-s3-fs-presto.
  2. Users relying on Hadoop configuration or Kerberos auto-wiring must map their configs to native options, and operational playbooks will change.
  3. Coexistence with flink-s3-fs-hadoop and flink-s3-fs-presto can fragment testing and user guidance unless the documentation clearly states which filesystem to use.
  4. S3A-specific behaviours won't automatically carry over; they must be implemented deliberately to avoid regressions.
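
To make point 2 concrete, here is a rough sketch of a migration helper that rewrites Hadoop S3A keys (prefix `fs.s3a.`) into `s3.`-prefixed keys. It assumes, without confirmation from this PR, that the native filesystem accepts `s3.`-prefixed option names with the same suffixes; the class `S3aKeyMappingSketch` and method `toNativeKeys` are hypothetical, and real option names may well differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the target key names are an assumption, not the PR's API.
public class S3aKeyMappingSketch {

    static final String HADOOP_PREFIX = "fs.s3a.";
    // Assumption: native options share the "s3." prefix and the S3A suffixes.
    static final String NATIVE_PREFIX = "s3.";

    /** Returns a new map with every fs.s3a.* key rewritten to s3.*; other keys are dropped. */
    static Map<String, String> toNativeKeys(Map<String, String> hadoopConf) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : hadoopConf.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(HADOOP_PREFIX)) {
                String suffix = key.substring(HADOOP_PREFIX.length());
                out.put(NATIVE_PREFIX + suffix, e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> hadoopConf = new LinkedHashMap<>();
        hadoopConf.put("fs.s3a.endpoint", "http://localhost:9000");
        hadoopConf.put("fs.s3a.path.style.access", "true");
        hadoopConf.put("hadoop.security.authentication", "kerberos");
        System.out.println(toNativeKeys(hadoopConf));
    }
}
```

Keys outside the `fs.s3a.` namespace, such as Kerberos settings, have no automatic counterpart in this sketch and would need an explicit decision per option.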
