[FEA][AUDIT][SPARK-52921][SQL] Specify outputPartitioning for UnionExec for same output partitioning as children operators #14083

@abellina

This is a Spark 4.1 audit task.

With SPARK-52921, UnionExec reports an outputPartitioning when all of its children have the same output partitioning. I am not 100% sure why we would want this, but I am filing it in case we need it. According to the Spark issue, Union otherwise has "unknown" partitioning. My guess is that if we don't follow suit, spark-rapids could add shuffles where the CPU plan wouldn't. That said, I am not certain, and I would like comments from @revans2 and others.
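
To make the concern concrete, here is a minimal toy model of the rule as I understand it from the issue text. It is not Spark's or spark-rapids' implementation: `HashPartitioning`, `UnknownPartitioning`, `union_output_partitioning`, and `needs_shuffle` are hypothetical stand-ins written for this sketch, and it assumes the union has at least one child and that "same partitioning" means an exact match on key positions and partition count.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class HashPartitioning:
    # Hypothetical stand-in: column positions the rows are hashed on,
    # plus the number of partitions.
    key_ordinals: Tuple[int, ...]
    num_partitions: int


@dataclass(frozen=True)
class UnknownPartitioning:
    # Hypothetical stand-in: only the partition count is known.
    num_partitions: int


Partitioning = Union[HashPartitioning, UnknownPartitioning]


def union_output_partitioning(children: Sequence[Partitioning]) -> Partitioning:
    """Toy rule: keep the children's partitioning only if all of them agree."""
    first = children[0]  # assumes at least one child
    if isinstance(first, HashPartitioning) and all(c == first for c in children):
        return first
    # Otherwise model the union as "unknown", with the children's partitions added up.
    return UnknownPartitioning(sum(c.num_partitions for c in children))


def needs_shuffle(child: Partitioning, required: HashPartitioning) -> bool:
    """Toy planner check: shuffle unless the child already matches the requirement."""
    return child != required


# A downstream operator (say, an aggregate) that requires hashing on column 0.
required = HashPartitioning((0,), 200)

same = union_output_partitioning(
    [HashPartitioning((0,), 200), HashPartitioning((0,), 200)]
)
mixed = union_output_partitioning(
    [HashPartitioning((0,), 200), HashPartitioning((1,), 200)]
)

print(needs_shuffle(same, required))   # children agree, partitioning is propagated
print(needs_shuffle(mixed, required))  # children disagree, partitioning is unknown
```

In this model, a union that always reports unknown partitioning would force a shuffle in the first case too, which is the extra shuffle I am worried the GPU plan could pick up relative to the CPU plan.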

There is a follow-up I was looking at for the audit (apache/spark@c0acf45023f), so I created this issue for the main Spark issue so that we can discuss it.

Here's another related follow-up: apache/spark@8edc7685b97
