[FEA][AUDIT][SPARK-52921][SQL] Specify outputPartitioning for UnionExec for same output partitioning as children operators #14083

This is a Spark 4.1 audit task.

UnionExec supports `outputPartitioning` when all of its children have the same output partitioning. I am not 100% sure why we would want this, but I am filing it in case we need it. According to the [PR](https://github.com/apache/spark/pull/51623), Union has "unknown" partitioning otherwise. My guess is that if we don't follow suit, spark-rapids could add shuffles in cases where the CPU plan wouldn't. That said, I am not 100% sure, and I wanted to get some comments from @revans2 and others.
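
To make the shuffle concern concrete, here is a minimal toy model in plain Python. It is not Spark or spark-rapids code: `HashPartitioning`, `union_output_partitioning`, and `needs_shuffle` are hypothetical stand-ins, and the "all children equal" check is only my assumption about the rule the PR describes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class HashPartitioning:
    """Toy stand-in: rows are hashed on the columns at these ordinals."""
    key_ordinals: Tuple[int, ...]
    num_partitions: int


def union_output_partitioning(
    children: Sequence[Optional[HashPartitioning]],
) -> Optional[HashPartitioning]:
    """Return the partitioning shared by every child, or None for "unknown".

    Assumption: the union can only report a partitioning when all children
    report exactly the same one.
    """
    if not children or children[0] is None:
        return None
    first = children[0]
    return first if all(child == first for child in children) else None


def needs_shuffle(
    current: Optional[HashPartitioning], required: HashPartitioning
) -> bool:
    """A downstream operator must shuffle unless its requirement is already met."""
    return current != required


# A downstream aggregate that requires hash partitioning on column 0.
required = HashPartitioning(key_ordinals=(0,), num_partitions=200)

# Both children already match: the union keeps the partitioning, no shuffle.
same = union_output_partitioning([required, required])

# Children differ: the union falls back to "unknown", so a shuffle is needed.
mixed = union_output_partitioning(
    [required, HashPartitioning(key_ordinals=(1,), num_partitions=200)]
)
```

In this model, a union that always reported "unknown" would force a shuffle even in the `same` case, which is the extra shuffle I am guessing spark-rapids could add if the GPU union does not report its partitioning.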

There is a follow-up I was looking at for the audit (https://github.com/apache/spark/commit/c0acf45023f), so I created this issue for the main Spark issue so we can discuss it.

Here's another related follow-up: https://github.com/apache/spark/commit/8edc7685b97
