Skip to content

Adding spark plugin to support spark jobs over EMR on EKS#3

Merged
babourine merged 1 commit intomainfrom
stan_spark_emr_eks
May 9, 2025
Merged

Adding spark plugin to support spark jobs over EMR on EKS#3
babourine merged 1 commit intomainfrom
stan_spark_emr_eks

Conversation

@babourine
Copy link
Contributor

No description provided.

Copilot AI review requested due to automatic review settings May 9, 2025 17:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

A new spark plugin has been added to support the submission and execution of Spark jobs over EMR on EKS. Key changes include:

  • Introducing a spark plugin handler in plugins/spark/spark.go that wraps the internal spark command logic.
  • Implementing the spark job submission flow with support for AWS EMR containers and assume role credentials in internal/pkg/object/command/spark/spark.go.
  • Updating go.mod to include necessary AWS SDK dependencies for EMR containers and S3 manager.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
plugins/spark/spark.go Added a new plugin handler wrapping the spark command implementation.
internal/pkg/object/command/spark/spark.go Implements the spark job execution flow including context unmarshalling, AWS client setup, and job polling.
go.mod Updated module dependencies to include the AWS EMR containers service and S3 manager.
Comments suppressed due to low confidence (1)

plugins/spark/spark.go:1

  • Consider renaming the package from 'main' to 'spark' to maintain consistency with the plugin's directory structure and its intended usage.
package main

return nil, fmt.Errorf("failed to list virtual clusters: %w", err)
}

for _, vc := range outputListClusters.VirtualClusters {
Copy link

Copilot AI May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a nil check for vc.Name before dereferencing it to avoid potential runtime panics in cases where the cluster name may be missing.

Suggested change
for _, vc := range outputListClusters.VirtualClusters {
for _, vc := range outputListClusters.VirtualClusters {
if vc.Name == nil {
continue
}

Copilot uses AI. Check for mistakes.
@babourine babourine merged commit 8c78d90 into main May 9, 2025
5 checks passed
@babourine babourine deleted the stan_spark_emr_eks branch May 9, 2025 18:06
wlggraham pushed a commit that referenced this pull request May 9, 2025
Add PR Template copied from Airflow Repo.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants