This repository has been archived by the owner on Jul 21, 2024. It is now read-only.
Last night I left BlendNet rendering out an animation and went to bed, and in the morning found all my equipment idle but lots of render jobs shown as PENDING in the job list.
It turns out I forgot to attach the charger to one of my laptops, so at some point in the night it turned off. Because my laptops are not cloud instances, the agent never received a SIGTERM that would have let it notify the Manager: it just stopped responding to the Manager's status requests.
It appears that the Manager does not give out any more work packages to agents until the "missing agent" responds. I would expect that the Manager would give work out to the remaining Agents and carry on getting work done.
Environment:
Application version: v0.3.5-4-g6f54c2f
Blender client version: 2.91.0
Blender worker version: 2.91.0
OS: Mac OS 11.1 (Addon & Agent) / Mac OS 10.15.7 (Manager & Agent) / Ubuntu 18.04 (Agent)
Steps to reproduce:
1. Run Blender
2. Add two or more agents to BlendNet, and ensure they are showing checks ✓
3. Kill one of the Agents (I pressed ^C twice in the terminal)
4. Open project, or save default scene to a file
5. Click "Run image task"
6. See that it enters RUNNING state, but never completes
In my case I started with four agents online, and took one offline before clicking "Run image task". The task seems to get stuck at 75% (1/4), so it appears the Manager has decided the missing agent should take this task, and until it does all other agents will be idle.
Expected behavior
The task should complete, because other agents are available.
I would expect the Manager to give work to any agent that is online and idle, i.e. if an agent goes offline (gracefully or otherwise) the other agents will do the work, eventually.
(This makes me wonder what happens when some agents are significantly more powerful than others. For example, one of my agents is a rack server with 40 CPU cores, while another is an Arm-powered MacBook: does the server race ahead of the others, then sit idle until the others have finished their assigned tasks?)
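To make the expected behavior concrete, here is a minimal sketch of the kind of dispatch I have in mind. It is not BlendNet's actual Manager code: the `Dispatcher` class, its method names, and the 30-second timeout are all hypothetical. The idea is that work is handed out one task at a time to agents that are both idle and recently seen, and that a task held by an agent that has gone silent is put back in the queue.

```python
from collections import deque


class Dispatcher:
    """Hypothetical scheduler sketch; not taken from BlendNet."""

    def __init__(self, tasks, agents, timeout=30.0):
        self.pending = deque(tasks)                  # tasks not yet assigned
        self.running = {}                            # task -> agent holding it
        self.done = []                               # completed tasks
        self.last_seen = {a: None for a in agents}   # agent -> time of last good status reply
        self.timeout = timeout                       # assumed value, in seconds

    def heartbeat(self, agent, now):
        """Record a successful status request to `agent`."""
        self.last_seen[agent] = now

    def online(self, agent, now):
        """An agent counts as online if it replied within `timeout` seconds."""
        seen = self.last_seen[agent]
        return seen is not None and now - seen <= self.timeout

    def requeue_lost(self, now):
        """Return tasks held by silent agents to the front of the queue."""
        for task, agent in list(self.running.items()):
            if not self.online(agent, now):
                del self.running[task]
                self.pending.appendleft(task)

    def assign(self, now):
        """Give one pending task to each agent that is online and idle."""
        self.requeue_lost(now)
        busy = set(self.running.values())
        assigned = {}
        for agent in self.last_seen:
            if not self.pending:
                break
            if agent in busy or not self.online(agent, now):
                continue
            task = self.pending.popleft()
            self.running[task] = agent
            assigned[task] = agent
        return assigned

    def complete(self, task):
        """Mark a task as finished and free its agent."""
        self.running.pop(task, None)
        self.done.append(task)
```

Because each agent only takes a new task when it is idle, a fast agent simply comes back for more work sooner, and an agent that never answers is never given anything. With four tasks, four agents and one agent offline, the three live agents take one task each and whichever finishes first picks up the fourth.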
Screenshots
Additional context
Manager logs are full of:
WARN: Communication issue with request to "https://Roberts-Air.lan:9443/api/v1/status": [Errno 61] Connection refused
Note that if I bring the missing agent back, it will be assigned work and once that is complete the render job as a whole completes successfully.