You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
When spot instances are used, there might be a chance that a server will be terminated during a build.
In this case, the build will be aborted and then rescheduled to another node by default.
Unfortunately, it turns out that the plugin will resubmit the latest failed build instead of aborted one which is crucial in some cases (i.e deployment job can roll out outdated code or rollback migrations in DB).
I do believe that this is a bug and it probably relates to this code:
On the screenshot below: #1853 - just some failed build #1858 - has been aborted by the plugin, as a spot instance has been terminated #1859 - has been started automatically (and then manually aborted) with parameters from the build #1853 (the failed one) instead of #1858 (aborted one)
Logs
2024-06-11 10:46:05.400+0000 [id=132] INFO c.a.j.e.EC2FleetOnlineChecker#run: No connection to node 'Infrastructure Fleet i-0c4991da750ce86d8'. Attempting to connect and waiting before retry
2024-06-11 10:46:20.401+0000 [id=132] INFO c.a.j.e.EC2FleetOnlineChecker#run: Node 'Infrastructure Fleet i-0c4991da750ce86d8' connected. Resolving planned node
2024-06-11 10:46:49.014+0000 [id=63] INFO c.a.j.e.EC2RetentionStrategy#postJobAction: Build PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{label=i-0ca78ec7c56981d6a,context=CpsStepContext[3:node]:Owner[backend-tests-build/add-sidekiq-mailer/1:backend-tests-build/add-sidekiq-mailer #1]} completed successfully on agent i-0ca78ec7c56981d6a. TimeSpentInQueue: 0s, duration: 763s.
2024-06-11 10:47:19.554+0000 [id=1692342] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: DISCONNECTED: Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.554+0000 [id=1692342] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Start retriggering executors for Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.554+0000 [id=1692342] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Finished retriggering executors for Infrastructure Fleet i-0a0aa9dfce2d0f828
2024-06-11 10:47:19.767+0000 [id=1689100] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: DISCONNECTED: Default Build Fleet i-062bef57e2061e64c
2024-06-11 10:47:19.767+0000 [id=1689100] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Start retriggering executors for Default Build Fleet i-062bef57e2061e64c
2024-06-11 10:47:19.770+0000 [id=63] INFO c.a.j.e.EC2RetentionStrategy#postJobAction: Build PlaceholderExecutable:ExecutorStepExecution.PlaceholderTask{label=i-062bef57e2061e64c,context=CpsStepContext[3:node]:Owner[backend-build/1858:backend-build #1858]} completed successfully on agent i-062bef57e2061e64c. TimeSpentInQueue: 0s, duration: 372s.
2024-06-11 10:47:19.773+0000 [id=1689100] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: RETRIGGERING: org.jenkinsci.plugins.workflow.job.WorkflowJob@22e934d3[backend-build] - WITH ACTIONS: [hudson.model.ParametersAction@c571be8]
2024-06-11 10:47:19.774+0000 [id=1689100] INFO c.a.j.e.EC2FleetAutoResubmitComputerLauncher#afterDisconnect: Finished retriggering executors for Default Build Fleet i-062bef57e2061e64c
Environment Details
Plugin Version?
3.2.0
Jenkins Version?
2.452.1
Spot Fleet or ASG?
Spot Fleet
Label based fleet?
Yes
Linux or Windows?
Linux
Anything else unique about your setup?
No
The text was updated successfully, but these errors were encountered:
Issue Details
Describe the bug
When spot instances are used, there might be a chance that a server will be terminated during a build.
In this case, the build will be aborted and then rescheduled to another node by default.
Unfortunately, it turns out that the plugin will resubmit the latest failed build instead of aborted one which is crucial in some cases (i.e deployment job can roll out outdated code or rollback migrations in DB).
I do believe that this is a bug and it probably relates to this code:
ec2-fleet-plugin/src/main/java/com/amazon/jenkins/ec2fleet/EC2FleetAutoResubmitComputerLauncher.java
Lines 105 to 106 in de2ba96
On the screenshot below:
#1853
- just some failed build#1858
- has been aborted by the plugin, as a spot instance has been terminated#1859
- has been started automatically (and then manually aborted) with parameters from the build#1853
(the failed one) instead of#1858
(aborted one)Logs
Environment Details
Plugin Version?
3.2.0
Jenkins Version?
2.452.1
Spot Fleet or ASG?
Spot Fleet
Label based fleet?
Yes
Linux or Windows?
Linux
Anything else unique about your setup?
No
The text was updated successfully, but these errors were encountered: