The plugin re-triggers the last failed build instead of the aborted one #443

Open
pznamensky opened this issue Jun 11, 2024 · 1 comment

@pznamensky

Issue Details

Describe the bug
When spot instances are used, a server may be terminated during a build.
In that case, the build is aborted and, by default, rescheduled to another node.

Unfortunately, the plugin resubmits the latest failed build instead of the aborted one. This matters in some cases (e.g. a deployment job can roll out outdated code or roll back DB migrations).
I believe this is a bug, and it is probably related to this code:

final WorkflowRun failedBuild = ((WorkflowJob) task).getLastFailedBuild();
actions.addAll(failedBuild.getActions(ParametersAction.class));

In the screenshot below:
#1853 - an unrelated failed build
#1858 - was aborted by the plugin because a spot instance was terminated
#1859 - was started automatically (and then manually aborted) with parameters from build #1853 (the failed one) instead of #1858 (the aborted one)

image
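
To make the difference concrete, here is a minimal, self-contained sketch of the two selection strategies, using the build numbers from the list above. It does not use the Jenkins API: `Build`, `Result`, `lastFailed`, `interrupted`, and the `GIT_REF` parameter are hypothetical stand-ins invented for illustration. It assumes that "last failed build" means the newest build whose result is FAILURE, so an ABORTED build is skipped, which matches what is observed here. How the plugin would actually obtain the interrupted run from the disconnected executor is not shown and would need to be checked against the plugin and Jenkins code.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified model of a job's build history; NOT the Jenkins API.
class ResubmitSketch {
    enum Result { SUCCESS, FAILURE, ABORTED }

    // Hypothetical stand-in for a build and its parameters.
    static final class Build {
        final int number;
        final Result result;
        final Map<String, String> parameters;

        Build(int number, Result result, Map<String, String> parameters) {
            this.number = number;
            this.result = result;
            this.parameters = parameters;
        }
    }

    // Behaviour described in this issue: pick the newest build with result FAILURE.
    static Optional<Build> lastFailed(List<Build> history) {
        Build found = null;
        for (Build b : history) {
            if (b.result == Result.FAILURE && (found == null || b.number > found.number)) {
                found = b;
            }
        }
        return Optional.ofNullable(found);
    }

    // Expected behaviour: pick the build that was actually running on the lost node.
    static Optional<Build> interrupted(List<Build> history, int interruptedNumber) {
        return history.stream().filter(b -> b.number == interruptedNumber).findFirst();
    }

    // Build history from this issue; the GIT_REF values are made up.
    static List<Build> sampleHistory() {
        return List.of(
            new Build(1853, Result.FAILURE, Map.of("GIT_REF", "old-ref")),
            new Build(1858, Result.ABORTED, Map.of("GIT_REF", "new-ref")));
    }

    public static void main(String[] args) {
        List<Build> history = sampleHistory();
        // Parameters would be copied from #1853 (the outdated ones).
        System.out.println(lastFailed(history).get().number);        // prints 1853
        // Parameters would be copied from #1858 (the build that was interrupted).
        System.out.println(interrupted(history, 1858).get().number); // prints 1858
    }
}
```

In this model, resubmitting with the parameters of `lastFailed(...)` reproduces the problem (build #1859 gets the parameters of #1853), while selecting the interrupted build by identity would carry over the parameters of #1858.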

Logs

2024-06-11 10:46:05.400+0000 [id=132]        INFO        c.a.j.e.EC2FleetOnlineChecker#run: No connection to node 'Infrastructure Fleet i-0c4991da750ce86d8'. Attempting to connect and waiting before retry
2024-06-11 10:46:20.401+0000 [id=132]        INFO        c.a.j.e.EC2FleetOnlineChecker#run: Node 'Infrastructure Fleet i-0c4991da750ce86d8' connected. Resolving planned node
2024-06-11 10:46:49.014+0000 [id=63]        INFO        c.a.j.e.EC2RetentionStrategy#postJobAction: Build PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{label=i-0ca78ec7c56981d6a,context=CpsStepContext[3:node]:Owner[backend-tests-build/add-sidekiq-mailer/1:backend-tests-build/add-sidekiq-mailer #1]} completed successfully on agent i-0ca78ec7c56981d6a. TimeSpentInQueue: 0s, duration: 763s.
2024-06-11 10:47:19.554+0000 [id=1692342]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: DISCONNECTED: Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.554+0000 [id=1692342]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Start retriggering executors for Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.554+0000 [id=1692342]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Finished retriggering executors for Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.767+0000 [id=1689100]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: DISCONNECTED: Default Build Fleet i-062bef57e2061e64c
2024-06-11 10:47:19.767+0000 [id=1689100]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Start retriggering executors for Default Build Fleet i-062bef57e2061e64c
2024-06-11 10:47:19.770+0000 [id=63]        INFO        c.a.j.e.EC2RetentionStrategy#postJobAction: Build PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{label=i-062bef57e2061e64c,context=CpsStepContext[3:node]:Owner[backend-build/1858:backend-build #1858]} completed successfully on agent i-062bef57e2061e64c. TimeSpentInQueue: 0s, duration: 372s.
2024-06-11 10:47:19.773+0000 [id=1689100]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: RETRIGGERING: org.jenkinsci.plugins.workflow.job.WorkflowJob@22e934d3[backend-build] - WITH ACTIONS: [hudson.model.ParametersAction@c571be8]
2024-06-11 10:47:19.774+0000 [id=1689100]        INFO        c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Finished retriggering executors for Default Build Fleet i-062bef57e2061e64c

Environment Details

Plugin Version?
3.2.0

Jenkins Version?
2.452.1

Spot Fleet or ASG?
Spot Fleet

Label based fleet?
Yes

Linux or Windows?
Linux

Anything else unique about your setup?
No

@pznamensky pznamensky added the bug label Jun 11, 2024
@ItielOlenick

Hitting this as well
