[POC] Native Flink s3 fileSystem #4

Samrat002 · 2025-09-27T07:28:27Z

What is the purpose of the change

Native Flink S3 FileSystem

Validate that Flink can talk to S3 directly via AWS SDK v2, without Hadoop. This simplifies the stack and removes the dependency on Hadoop and Presto, so that flink-s3-fs-hadoop and flink-s3-fs-presto can be deprecated and eventually removed.
Reduce classpath conflicts and shading complexity caused by Hadoop + S3A; make S3 support self-contained and easier to operate.
Unify behaviour for cloud-first deployments (Kubernetes, containers, IAM roles, path-style/virtual-host) with straightforward configuration in flink-conf.yaml.
Establish a foundation to implement Flink-specific features (entropy injection, recoverable writer, retries/backoff) natively rather than inheriting Hadoop semantics.
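
As an illustration of what a natively implemented feature could look like, here is a minimal sketch of entropy injection using only the JDK. The class and method names are hypothetical and are not taken from this PR. The marker `_entropy_` and the length of 4 are assumed to mirror the `s3.entropy.key` and `s3.entropy.length` options of the existing S3 plugins.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration only; not part of this PR. */
final class EntropyInjectionSketch {

    // Lowercase alphanumerics keep the generated segment safe for S3 keys.
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    /**
     * Replaces the first occurrence of {@code marker} in {@code path} with
     * {@code length} random characters. Paths without the marker are
     * returned unchanged.
     */
    static String injectEntropy(String path, String marker, int length) {
        int idx = path.indexOf(marker);
        if (idx < 0) {
            return path;
        }
        StringBuilder sb = new StringBuilder(path.length() + length);
        sb.append(path, 0, idx);
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        sb.append(path, idx + marker.length(), path.length());
        return sb.toString();
    }
}
```

For example, `EntropyInjectionSketch.injectEntropy("s3://bucket/_entropy_/checkpoints", "_entropy_", 4)` yields a path of the form `s3://bucket/<4 random chars>/checkpoints`, which spreads checkpoint files across S3 key prefixes.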

There is a need for a clear strategy to deprecate and remove flink-s3-fs-hadoop and flink-s3-fs-presto.
Users relying on Hadoop configuration or Kerberos auto-wiring must map their configs to native options, and operational playbooks will change.
Coexistence with flink-s3-fs-hadoop and flink-s3-fs-presto can fragment testing and user guidance unless users are given clear instructions on which implementation to use.
S3A-specific behaviours won't carry over automatically; they must be implemented deliberately to avoid regressions.
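
The config mapping mentioned above could be supported by a small migration helper. The sketch below is an assumption, not code from this PR: the `fs.s3a.*` names are standard Hadoop S3A keys, while the `s3.*` target names are assumed to carry over from the existing Flink S3 plugins and may differ in the native implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical migration helper for illustration only; not part of this PR. */
final class S3aConfigMappingSketch {

    // Hadoop S3A key -> assumed native Flink key.
    static final Map<String, String> KEY_MAP = Map.of(
            "fs.s3a.access.key", "s3.access-key",
            "fs.s3a.secret.key", "s3.secret-key",
            "fs.s3a.endpoint", "s3.endpoint",
            "fs.s3a.path.style.access", "s3.path.style.access");

    /**
     * Translates the known S3A keys and silently skips the rest. A real tool
     * would likely report the skipped keys so that users can review them.
     */
    static Map<String, String> translate(Map<String, String> hadoopConf) {
        Map<String, String> flinkConf = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : hadoopConf.entrySet()) {
            String target = KEY_MAP.get(e.getKey());
            if (target != null) {
                flinkConf.put(target, e.getValue());
            }
        }
        return flinkConf;
    }
}
```

Keys without a known counterpart are dropped here on purpose, which reflects the point that S3A-specific behaviour has to be ported deliberately rather than implicitly.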

Native Flink s3 fileSystem

eda858f

Samrat002 force-pushed the flink-s3-fs branch from e95ab23 to eda858f on September 27, 2025 07:45