amrcoder
Contributor

@amrcoder amrcoder commented Sep 22, 2025

Summary

Debugging the flaky spokecluster_aws_joining_test in the GitHub workflow.

Related issue(s)

Fixes #1154

Summary by CodeRabbit

  • Tests

    • Integration tests now run with verbose output and a narrowed focus on the registration joining flow; added runtime log messages to improve flaky-test visibility and diagnostics.
  • Chores

    • Unit test job temporarily disabled in pre-merge CI to streamline runs and speed up pipeline feedback.
  • Notes

    • No user-facing behavior or public APIs changed.

Contributor

openshift-ci bot commented Sep 22, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: amrcoder
Once this PR has been reviewed and has the lgtm label, please assign qiujian16 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai bot commented Sep 22, 2025

Walkthrough

Commented out the unit job in .github/workflows/pre.yml and changed the integration job to run ARGS="-test.v" make test-registration-integration. Narrowed integration test selection to the "Joining Process for aws flow" via the Makefile, and added two fmt.Printf("testing flaky test attempt 2\n") log statements in the AWS joining integration test file.

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| CI workflow updates (.github/workflows/pre.yml) | Commented out/disabled the unit job block; updated the integration job command from make test-integration to ARGS="-test.v" make test-registration-integration. |
| Integration test selection (test/integration-test.mk) | Added a Ginkgo focus filter to run only the "Joining Process for aws flow" tests when invoking integration tests (make target for registration integration). |
| AWS joining integration test edits (test/integration/registration/spokecluster_aws_joining_test.go) | Inserted two runtime log statements fmt.Printf("testing flaky test attempt 2\n") (placed after cancel() in two AWS-join tests). No public API or test assertions were changed. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title concisely and accurately describes the primary intent (debugging the flaky spokecluster_aws_joining_test in the GitHub workflow) and aligns with the PR objectives and changed files; it is a single clear sentence and includes the repository icon and a "[DO NOT MERGE]" marker indicating work-in-progress. |
| Description Check | ✅ Passed | The PR description follows the repository template by including "## Summary" and "## Related issue(s)" and references the related issue, so it meets the template's required structure; however, the summary is minimal and does not enumerate the specific changes (workflow edits, focused integration test invocation, or added debug prints) or explain why the unit job was commented out and what testing was performed. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dfd939e and cab5ddd.

📒 Files selected for processing (1)
  • test/integration/registration/spokecluster_aws_joining_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/integration/registration/spokecluster_aws_joining_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/pre.yml (2)

53-74: Don’t comment out jobs; gate them with a condition tied to the debug PR.

Commenting the unit job drops coverage uploads and invites config drift. Keep the job defined and skip it only for this “[DO NOT MERGE]” debug PR via an if: guard.

Apply this diff to re-enable the job while conditionally skipping it on this PR:

-  # unit:
-  #   name: unit
-  #   runs-on: ubuntu-latest
-  #   steps:
-  #     - name: checkout code
-  #       uses: actions/checkout@v5
-  #     - name: install Go
-  #       uses: actions/setup-go@v5
-  #       with:
-  #         go-version: ${{ env.GO_VERSION }}
-  #     - name: unit
-  #       run: make test
-  #     - name: report coverage
-  #       uses: codecov/codecov-action@v5
-  #       with:
-  #         token: ${{ secrets.CODECOV_UPLOAD_TOKEN }}
-  #         files: ./coverage.out
-  #         flags: unit
-  #         name: unit
-  #         verbose: true
-  #         fail_ci_if_error: false
+  unit:
+    # Skip unit job only for PRs explicitly marked DO NOT MERGE (keeps it on for all other PRs and on main)
+    if: ${{ !(github.event_name == 'pull_request' && contains(github.event.pull_request.title, '[DO NOT MERGE]')) }}
+    name: unit
+    runs-on: ubuntu-latest
+    steps:
+      - name: checkout code
+        uses: actions/checkout@v5
+      - name: install Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: ${{ env.GO_VERSION }}
+      - name: unit
+        run: make test
+      - name: report coverage
+        uses: codecov/codecov-action@v5
+        with:
+          token: ${{ secrets.CODECOV_UPLOAD_TOKEN }}
+          files: ./coverage.out
+          flags: unit
+          name: unit
+          verbose: true
+          fail_ci_if_error: false
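
To make the guard's truth table easier to check, here is a minimal Go sketch of the same boolean logic. The function name shouldRunUnit is hypothetical, and the case-insensitive match reflects how the GitHub Actions contains() function is documented to behave for strings; this only models the expression and is not how the workflow runner evaluates it.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRunUnit models the suggested guard:
//   !(event_name == 'pull_request' && contains(title, '[DO NOT MERGE]'))
// The name and signature are illustrative only.
func shouldRunUnit(eventName, prTitle string) bool {
	isPullRequest := eventName == "pull_request"
	// contains() in workflow expressions ignores case, so compare lowercased strings.
	marked := strings.Contains(strings.ToLower(prTitle), strings.ToLower("[DO NOT MERGE]"))
	return !(isPullRequest && marked)
}

func main() {
	fmt.Println(shouldRunUnit("pull_request", "[DO NOT MERGE] debug flaky test")) // false
	fmt.Println(shouldRunUnit("pull_request", "fix registration bug"))            // true
	fmt.Println(shouldRunUnit("push", "[DO NOT MERGE] debug flaky test"))         // true
}
```

Only the combination of a pull_request event and a marked title skips the job, so pushes to main keep running unit tests and uploading coverage.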

86-86: Amplify flake reproduction signal and narrow scope.

Run the target test repeatedly with shuffle and upload logs to artifacts.

Apply this diff to exercise the flake harder (adjust the regex to the exact test name you’re chasing; note that -test.run matches Go test function names, not Ginkgo spec titles):

-        run: ARGS="-test.v" make test-registration-integration
+        run: ARGS='-test.v -test.count=20 -test.shuffle=on -test.run="SpokeCluster.*AWS.*Joining"' make test-registration-integration

Add this step right after to persist logs (so you can diff runs):

      - name: upload integration logs
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: integration-logs
          path: |
            **/*.log
            **/test-results/*.json
            **/test-output/*.txt

Questions to confirm:

  • Does test-registration-integration pass ARGS directly to go test? If not, I’ll adapt to the Makefile.
  • What’s the exact test func regex (e.g., TestSpokeClusterAWSJoining)? I can tune -test.run accordingly.
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 8cb4f13 and ab99eb6.

📒 Files selected for processing (1)
  • .github/workflows/pre.yml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: e2e-singleton
  • GitHub Check: build
  • GitHub Check: verify
  • GitHub Check: integration
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/integration-test.mk (1)

34-34: Make the focus pattern opt-in to avoid accidental commits.

Gate the focus behind an env var so default runs remain unchanged; set it in the workflow when debugging.

-	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast -ginkgo.focus="Joining Process for aws flow" ${ARGS}
+	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast $(if $(GINKGO_FOCUS),-ginkgo.focus="$(GINKGO_FOCUS)",) ${ARGS}

Tip: To reproduce flakes faster without changing Makefile behavior, pass via ARGS in CI:

  • ARGS='-ginkgo.until-it-fails' or ARGS='-ginkgo.flake-attempts=3'
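
The opt-in behaviour of the $(if $(GINKGO_FOCUS),...) expression can be sketched in Go as follows. The helper buildArgs is hypothetical and only mirrors the argument assembly; the flag names are the ones already used in the Makefile line above, and GINKGO_FOCUS is the variable name proposed in this comment.

```go
package main

import (
	"fmt"
	"os"
)

// buildArgs assembles the test-binary arguments the way the suggested
// Makefile line would: the focus flag is added only when focus is non-empty.
// This is an illustration of the conditional, not the project's code.
func buildArgs(focus string, extra []string) []string {
	args := []string{
		"-ginkgo.slow-spec-threshold=15s",
		"-ginkgo.v",
		"-ginkgo.fail-fast",
	}
	if focus != "" {
		args = append(args, "-ginkgo.focus="+focus)
	}
	return append(args, extra...)
}

func main() {
	// With GINKGO_FOCUS unset, the default invocation is unchanged.
	fmt.Printf("%q\n", buildArgs(os.Getenv("GINKGO_FOCUS"), []string{"-test.v"}))
}
```

With this shape, a debugging workflow sets GINKGO_FOCUS explicitly, and a run without it falls back to the full suite.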
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ab99eb6 and ee6cdd7.

📒 Files selected for processing (1)
  • test/integration-test.mk (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted
  • GitHub Check: e2e-singleton
  • GitHub Check: verify
  • GitHub Check: integration
  • GitHub Check: build

 test-registration-integration: ensure-kubebuilder-tools
 	go test -c ./test/integration/registration -o ./registration-integration.test
-	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast ${ARGS}
+	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast --ginkgo.focus "Joining Process for aws flow" ${ARGS}
⚠️ Potential issue

🧩 Analysis chain

Use -ginkgo.focus with the compiled test binary (single dash).

--ginkgo.focus is for the ginkgo CLI; the compiled test binary expects -ginkgo.focus. Using the wrong form risks ignoring the filter or flag-parsing errors.

Apply this diff:

-	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast --ginkgo.focus "Joining Process for aws flow" ${ARGS}
+	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast -ginkgo.focus="Joining Process for aws flow" ${ARGS}
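
One way to check how the two spellings are parsed is a small standalone program. The sketch below registers a stand-in flag named ginkgo.focus on a standard-library flag.FlagSet and parses both forms. It assumes the compiled test binary hands its arguments to Go's flag package, which this thread does not confirm; the flag package itself documents one and two dashes as equivalent, so the result is worth comparing against the real binary before relying on either form.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseFocus parses args with a throwaway FlagSet that defines a single
// string flag. The flag is a stand-in for the real Ginkgo flag, registered
// here only to observe how the standard parser treats each spelling.
func parseFocus(args []string) (string, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of the program output
	focus := fs.String("ginkgo.focus", "", "stand-in for the Ginkgo focus flag")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *focus, nil
}

func main() {
	forms := [][]string{
		{"-ginkgo.focus=Joining Process for aws flow"},
		{"--ginkgo.focus", "Joining Process for aws flow"},
	}
	for _, args := range forms {
		got, err := parseFocus(args)
		fmt.Printf("%q -> %q (err: %v)\n", args, got, err)
	}
}
```

Either way, the -name=value form keeps the pattern and its flag in one shell word, which matches the style of the other flags on the Makefile line.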

Run to confirm the focus pattern matches an actual spec title:


🏁 Script executed:

#!/bin/bash
# Verify the focused spec exists (case/spacing sensitive).
rg -nP -C2 'Joining Process.*aws flow' test/integration/registration --type go || {
  echo "No exact match found. Nearby candidates:" >&2
  rg -nP -C2 'Joining Process|aws flow' test/integration/registration --type go
}

Length of output: 565


Use -ginkgo.focus with the compiled test binary (single dash).
Spec found at test/integration/registration/spokecluster_aws_joining_test.go:28 — change the flag so the compiled test binary receives the focus.

-	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast --ginkgo.focus "Joining Process for aws flow" ${ARGS}
+	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast -ginkgo.focus="Joining Process for aws flow" ${ARGS}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast --ginkgo.focus "Joining Process for aws flow" ${ARGS}
+	./registration-integration.test -ginkgo.slow-spec-threshold=15s -ginkgo.v -ginkgo.fail-fast -ginkgo.focus="Joining Process for aws flow" ${ARGS}
🤖 Prompt for AI Agents
In test/integration-test.mk around line 34, the Makefile passes the Ginkgo focus
flag to the compiled test binary as "--ginkgo.focus" followed by a separate
argument; replace it with the single-dash form
-ginkgo.focus="Joining Process for aws flow" and keep the other flags and
${ARGS} intact, so the compiled test binary receives the focus pattern as part
of the flag itself.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/integration/registration/spokecluster_aws_joining_test.go (1)

260-260: Prefer GinkgoWriter for CI‑captured debug logs.

fmt.Printf can be dropped by reporters; GinkgoWriter is consistently captured in GitHub Actions output.

-			fmt.Printf("testing flaky test attempt 1\n")
+			ginkgo.GinkgoWriter.Printf("testing flaky test attempt 1\n")
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ee6cdd7 and dfd939e.

📒 Files selected for processing (1)
  • test/integration/registration/spokecluster_aws_joining_test.go (2 hunks)
🧰 Additional context used
🪛 GitHub Check: integration
test/integration/registration/spokecluster_aws_joining_test.go

[failure] 28-28:
expected declaration, found ginkgo

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: e2e-singleton
  • GitHub Check: e2e
  • GitHub Check: e2e-hosted
  • GitHub Check: build
  • GitHub Check: verify
