@featzhang featzhang commented Dec 19, 2025

Background

Apache Flink SQL currently provides URL_DECODE(str) to decode strings in application/x-www-form-urlencoded format (introduced in FLINK-34108). In practice, log data, tracking data, or external system inputs may contain values that have been URL-encoded more than once, for example by upstream systems or by intermediate components such as redirects or proxies.

The existing function performs only a single decoding pass, so users must nest or repeat URL_DECODE calls manually to fully decode such values.

Problem

Single-pass decoding is insufficient for handling multi-level URL-encoded values.

Repeated manual application of URL_DECODE reduces SQL readability and increases the risk of errors.

Solution

Extend URL decoding support by introducing an additional built-in function that accepts a boolean flag to control recursive decoding.

Signature

URL_DECODE(str)                       -- existing behavior (single-pass decoding)
URL_DECODE_RECURSIVE(str, recursive)  -- new function; recursive is a BOOLEAN flag

Behavior

  • When recursive is FALSE or NULL, the function performs a single decoding pass.

  • When recursive is TRUE, the function repeatedly applies URL decoding until:

    • the decoded result no longer changes, or
    • a maximum number of iterations is reached.

If the input value is NULL, the function returns NULL.

If a decoding error occurs:

  • in non-recursive mode, the function returns NULL;
  • in recursive mode, the function returns the last successfully decoded result, or NULL if no decoding step succeeds.
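The rules above can be sketched in plain Java as follows. This is a minimal illustration of the described semantics, not the code in this PR: the class name, the helper names, and the MAX_ITERATIONS value of 10 are assumptions (the actual limit is not stated here), and java.net.URLDecoder with UTF-8 stands in for whatever decoder the implementation uses.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class UrlDecodeRecursiveSketch {

    // Hypothetical cap for illustration; the real limit is not given in this description.
    static final int MAX_ITERATIONS = 10;

    /** One decoding pass. Returns null for NULL input or a malformed escape sequence. */
    static String decodeOnce(String value) {
        if (value == null) {
            return null;
        }
        try {
            return URLDecoder.decode(value, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            // URLDecoder signals malformed input such as "%zz" this way.
            return null;
        }
    }

    /** Mirrors the described behavior of URL_DECODE_RECURSIVE(str, recursive). */
    static String urlDecodeRecursive(String value, Boolean recursive) {
        if (value == null) {
            return null;
        }
        if (recursive == null || !recursive) {
            // FALSE or NULL flag: single pass, NULL on error.
            return decodeOnce(value);
        }
        String lastGood = null; // stays null if no decoding step succeeds
        String current = value;
        for (int i = 0; i < MAX_ITERATIONS; i++) {
            String next = decodeOnce(current);
            if (next == null) {
                // Decoding error: return the last successfully decoded result.
                return lastGood;
            }
            lastGood = next;
            if (next.equals(current)) {
                break; // result has stabilized
            }
            current = next;
        }
        return lastGood;
    }

    public static void main(String[] args) {
        String input = "%252Fpath%252Fto%252Fresource";
        System.out.println(urlDecodeRecursive(input, false)); // %2Fpath%2Fto%2Fresource
        System.out.println(urlDecodeRecursive(input, true));  // /path/to/resource
    }
}
```

Under these assumptions, an input such as '%2525zz' in recursive mode decodes to '%25zz' and then to '%zz'; the third pass fails on the malformed escape, so the sketch returns '%zz' as the last successfully decoded result.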

Examples

-- Single-pass decoding
SELECT URL_DECODE('%252Fpath%252Fto%252Fresource');
-- Result: '%2Fpath%2Fto%2Fresource'

-- Recursive decoding
SELECT URL_DECODE_RECURSIVE('%252Fpath%252Fto%252Fresource', TRUE);
-- Result: '/path/to/resource'

Compatibility

This change is fully backward-compatible. Existing queries using URL_DECODE(str) are unaffected.

Notes

Recursive decoding terminates when the decoded value stabilizes or when the maximum iteration limit is reached. This functionality is useful for data cleansing, normalization, and processing multi-level URL-encoded inputs.

