
Update worker autoscaling analyses priority. #1148

Open
sambles opened this issue Nov 27, 2024 · 1 comment
sambles commented Nov 27, 2024

Issue Description

In the current auto-scaler logic, ModelStates are aggregations of an OasisModel and its queued/running analyses:

```python
class ModelState(TypedDict):
    """
    Used in the model states dict to store information about each model's
    current state: for now, the number of tasks and analyses for each model.
    """
    tasks: int
    analyses: int
    priority: int
```

In the current code, this grouping inherits the highest priority from all of the analyses:

```python
if priority > model_state['priority']:
    model_state['priority'] = priority
```
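To make the inherited-priority behaviour concrete, here is a minimal sketch of the aggregation described above. The `aggregate` function and the analysis dicts are hypothetical illustrations, not the autoscaler's actual API:

```python
from typing import TypedDict


class ModelState(TypedDict):
    tasks: int
    analyses: int
    priority: int


def aggregate(analyses: list[dict]) -> ModelState:
    """Hypothetical aggregation mirroring the behaviour described above:
    the whole model inherits the single highest analysis priority."""
    state: ModelState = {'tasks': 0, 'analyses': 0, 'priority': 0}
    for a in analyses:
        state['tasks'] += a['tasks']
        state['analyses'] += 1
        if a['priority'] > state['priority']:
            state['priority'] = a['priority']
    return state


# One priority-10 analysis lifts the entire model to priority 10,
# even though the other analysis only asked for priority 1.
print(aggregate([
    {'priority': 1, 'tasks': 4},
    {'priority': 10, 'tasks': 2},
]))  # → {'tasks': 6, 'analyses': 2, 'priority': 10}
```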

Testing required: This might lead to starvation of other model workers, given that all queues share the same pool of nodes to draw VMs from.

A suggested improvement is to scale based on the number of "slots", i.e. the number of concurrent task threads a worker can process.

e.g. if an analysis is (priority=10, tasks=15), then 15 slots are assigned that priority, rather than the whole model inheriting it.
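The slot-based alternative could be sketched as follows; `slot_priorities` is a hypothetical name, and the analysis dicts are illustrative, not the autoscaler's real data model:

```python
from collections import Counter


def slot_priorities(analyses: list[dict]) -> Counter:
    """Hypothetical slot-based scaling: each analysis contributes
    'tasks' slots at its own priority, instead of the whole model
    inheriting the maximum priority across analyses."""
    slots: Counter = Counter()
    for a in analyses:
        slots[a['priority']] += a['tasks']
    return slots


# (priority=10, tasks=15) claims 15 slots at priority 10;
# the priority-1 analysis keeps its own 4 slots at priority 1.
print(slot_priorities([
    {'priority': 10, 'tasks': 15},
    {'priority': 1, 'tasks': 4},
]))  # → Counter({10: 15, 1: 4})
```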

sambles commented Nov 27, 2024

Sam: Thinking about it again, shouldn't the model_state aggregate priority value drop once the high-priority task has completed?

Amir: Yes, it does go down, but before the highest-priority analysis finishes, all other low-priority analyses of the same model also effectively get the highest priority (because the whole model gets it), and those low-priority analyses will block higher-priority analyses from other models. This lasts until the highest-priority analysis ends, at which point the situation is rectified, but those analyses could run for many hours, blocking an analysis with higher priority from another model.

That is not desirable: we should have workers lined up according to the priority of analyses, regardless of model.
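Lining workers up by per-analysis priority across all models, as suggested above, could look like this sketch (`dispatch_order` and the queue entries are hypothetical, not the scheduler's actual interface):

```python
def dispatch_order(analyses: list[dict]) -> list[dict]:
    """Order all queued analyses across every model by their own
    priority (highest first), so a low-priority analysis can no longer
    ride on its model's inherited priority. sorted() is stable, so
    equal priorities keep their submission order."""
    return sorted(analyses, key=lambda a: a['priority'], reverse=True)


queue = [
    {'model': 'A', 'priority': 1},
    {'model': 'B', 'priority': 10},
    {'model': 'A', 'priority': 4},
]
print([a['model'] for a in dispatch_order(queue)])  # → ['B', 'A', 'A']
```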
