@Abhijeet212004

The Reaper was only checking the message field for ImagePullBackOff errors, but Kubernetes actually sets the reason field. This caused pods to not get cleaned up when images failed to pull.

Now checks the reason field first, then falls back to message field for backwards compatibility.

Fixes #2772

Testing done

  • Added a new test case, testTerminateAgentOnImagePullBackoffReasonFieldOnly(), that reproduces the exact scenario from issue #2772 (Reaper not terminating pods in ImagePullBackOff state), where only the reason field is set to "ImagePullBackOff" and the message is null
  • Verified the test passes with the fix
  • Ran mvn clean verify -DskipTests to ensure code compiles and passes all quality checks (Spotless, SpotBugs)
  • Manually verified the logic handles both scenarios:
    • New behavior: Detects ImagePullBackOff from reason field (primary Kubernetes indicator)
    • Backward compatibility: Still detects from message field if reason is not set
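
For illustration, the reason-first check with the message fallback could look roughly like the sketch below. This is not the PR's actual diff: WaitingState is a hypothetical stand-in for the container waiting state that the plugin reads from the Kubernetes client model, isImagePullBackOff is an illustrative name, and the exact-match comparison on the reason is an assumption. Only the "Back-off pulling image" message check is taken from the original code.

```java
public class ImagePullBackOffCheck {

    /** Hypothetical stand-in for a container's "waiting" state (reason + message). */
    public static final class WaitingState {
        final String reason;
        final String message;

        public WaitingState(String reason, String message) {
            this.reason = reason;
            this.message = message;
        }
    }

    /** Returns true if the waiting state indicates an image pull back-off. */
    public static boolean isImagePullBackOff(WaitingState waiting) {
        if (waiting == null) {
            return false;
        }
        // Primary indicator: the reason field (assumed here to be an exact match).
        if ("ImagePullBackOff".equals(waiting.reason)) {
            return true;
        }
        // Fallback: the original message-based check, kept for backward compatibility.
        return waiting.message != null && waiting.message.contains("Back-off pulling image");
    }

    public static void main(String[] args) {
        // Scenario from #2772: only the reason is set, the message is null.
        System.out.println(isImagePullBackOff(new WaitingState("ImagePullBackOff", null)));
        // Backward-compatible scenario: only the message is set.
        System.out.println(isImagePullBackOff(new WaitingState(null, "Back-off pulling image \"example\"")));
        // Unrelated waiting state: neither field matches.
        System.out.println(isImagePullBackOff(new WaitingState("ContainerCreating", null)));
    }
}
```

The first two calls correspond to the two scenarios listed above; the third shows that an unrelated waiting state is not treated as a pull failure.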

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

@Abhijeet212004 Abhijeet212004 requested a review from a team as a code owner January 4, 2026 18:27
@Abhijeet212004
Author

Hi @Vlatombe @jglick - I'd appreciate it if you could review this PR when you have time. It fixes the Reaper not detecting ImagePullBackOff from the reason field (issue #2772).

I've added test coverage and verified the fix works while maintaining backward compatibility. Thanks!

Comment on lines -603 to -605
return waiting != null
        && waiting.getMessage() != null
        && waiting.getMessage().contains("Back-off pulling image");
Member

Introduced originally in #772

Author

Thanks for the review @Vlatombe! Yes, I saw that #772 originally introduced this logic. The issue is that the code only checked the message field, but in some Kubernetes versions, ImagePullBackOff details appear in the reason field first. My fix checks both fields to ensure backward compatibility while catching all ImagePullBackOff scenarios.

Labels

bug Bug Fixes

Development

Successfully merging this pull request may close these issues.

Reaper not terminating pods in ImagePullBackOff state

2 participants