Commit 879e592

Fail tasks exceeding no-workers-timeout (#8806)
1 parent fea5515 commit 879e592

5 files changed, +383 -70 lines changed

distributed/distributed-schema.yaml

+5-6
@@ -81,16 +81,15 @@ properties:
  - string
  - "null"
  description: |
- Shut down the scheduler after this duration if there are pending tasks,
- but no workers that can process them. This can either mean that there are
- no workers running at all, or that there are idle workers but they've been
- excluded through worker or resource restrictions.
+ Timeout for tasks in an unrunnable state.
+
+ If task remains unrunnable for longer than this, it fails. A task is considered unrunnable IFF
+ it has no pending dependencies, and the task has restrictions that are not satisfied by
+ any available worker or no workers are running at all.

  In adaptive clusters, this timeout must be set to be safely higher than
  the time it takes for workers to spin up.

- Works in conjunction with idle-timeout.
-
  work-stealing:
  type: boolean
  description: |

distributed/distributed.yaml

+1-1
@@ -19,7 +19,7 @@ distributed:
  # after they have been removed from the scheduler
  events-cleanup-delay: 1h
  idle-timeout: null # Shut down after this duration, like "1h" or "30 minutes"
- no-workers-timeout: null # Shut down if there are tasks but no workers to process them
+ no-workers-timeout: null # If a task remains unrunnable for longer than this, it fails.
  work-stealing: True # workers should steal tasks from each other
  work-stealing-interval: 100ms # Callback time for work stealing
  worker-saturation: 1.1 # Send this fraction of nthreads root tasks to workers
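
The new schema description defines "unrunnable" in one dense sentence, so a small model may help in reading it. The sketch below is not the scheduler code from this commit. `Task`, `Worker`, `is_unrunnable`, and `should_fail` are hypothetical names, restrictions are reduced to a set of required resource names, and resetting the timer once a task becomes runnable again is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Worker:
    # Hypothetical stand-in for a worker; not the distributed Worker class.
    name: str
    resources: Set[str] = field(default_factory=set)


@dataclass
class Task:
    # Hypothetical stand-in for a scheduler task record.
    key: str
    pending_dependencies: int = 0
    required_resources: Set[str] = field(default_factory=set)
    unrunnable_since: Optional[float] = None  # seconds on some monotonic clock


def is_unrunnable(task: Task, workers: List[Worker]) -> bool:
    """True iff the task has no pending dependencies and no worker can run it."""
    if task.pending_dependencies > 0:
        return False  # still waiting on inputs, so the timeout does not apply
    if not workers:
        return True  # no workers are running at all
    # Workers exist: unrunnable only if none of them satisfies the restrictions.
    return not any(task.required_resources <= w.resources for w in workers)


def should_fail(task: Task, workers: List[Worker], now: float,
                timeout: Optional[float]) -> bool:
    """Update the task's unrunnable timer and report whether it has expired."""
    if timeout is None or not is_unrunnable(task, workers):
        task.unrunnable_since = None  # assumption: timer resets when runnable
        return False
    if task.unrunnable_since is None:
        task.unrunnable_since = now  # first time the task is seen as unrunnable
    # "Longer than this" is read as a strict comparison.
    return now - task.unrunnable_since > timeout


gpu_task = Task("train", required_resources={"GPU"})
cpu_only = [Worker("w1", {"CPU"})]
print(should_fail(gpu_task, cpu_only, now=0.0, timeout=60.0))   # timer starts
print(should_fail(gpu_task, cpu_only, now=61.0, timeout=60.0))  # timer expired
```

In this model, a task with unmet dependencies never starts the timer, which matches the "no pending dependencies" clause in the description. The note about adaptive clusters corresponds to choosing `timeout` larger than the time it takes a suitable worker to appear.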
