Documentation of Distributed.remotecall() misleading #81

Open
torrance opened this issue May 6, 2022 · 1 comment · May be fixed by JuliaParallel/DistributedNext.jl#20

Comments

torrance commented May 6, 2022

Specifically considering the AbstractWorkerPool implementation of this method, remotecall(f, pool::AbstractWorkerPool, args...; kwargs...), the documentation states:

WorkerPool variant of remotecall(f, pid, ....). Wait for and take a free worker from pool and perform a remotecall on it.

The impression from the documentation is that this function will only submit jobs to idle workers (which in my use case was desirable behaviour).

However, 'wait for and take a free worker' does not accurately convey what happens. The method waits until a worker is available in the pool, but that worker is not necessarily 'free' in the sense of being idle. This is because the inner remotecall() returns immediately, so the method takes a worker and puts it straight back into the pool while the submitted job may still be running.
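
A minimal sketch of the behaviour described above, assuming two local worker processes can be started with addprocs(). The names new_pids, wp, futures, pids and submit_time are illustrative only and are not taken from the issue.

```julia
using Distributed

# Assumption: two local worker processes can be started on this machine.
new_pids = addprocs(2)
wp = WorkerPool(new_pids)

# Submit four one-second jobs. Each remotecall returns a Future without
# waiting for the job to finish, so the worker goes straight back into the pool.
t0 = time()
futures = [remotecall(() -> (sleep(1); myid()), wp) for _ in 1:4]
submit_time = time() - t0

# Collect the id of the worker that ran each job.
pids = fetch.(futures)

println("submission took $(round(submit_time; digits=2)) s")
println("jobs ran on workers: ", pids)
```

If the description above holds, submission should finish well before the jobs themselves do, and each worker id should appear more than once in pids even though neither worker was idle when its second job was submitted.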

The documentation should:

  1. state that this function assigns work to the workers in the pool cyclically,
  2. state that it returns immediately,
  3. omit any mention of 'waiting for free workers'.

@GregPlowman (Contributor)

Agree, the documentation could and should be improved to avoid misunderstanding.

However, technically it is correct to say "Wait for and take a free worker from pool ...".

If the workers are assigned jobs only from calls to remotecall, which returns immediately, then it will appear that each worker is never busy. However, a worker might be busy from some other call.

Below, the remotecall_wait keeps the first worker busy, so the remotecalls within the for loop are executed on the only free worker.

using Distributed
addprocs(2)
wp = default_worker_pool()

@everywhere function f(i, n)
    println("Starting... $(i)")
    sleep(n)
    println("Done $(i).")
end

@async remotecall_wait(f, wp, 1, 10)    # keeps first worker busy

for i in 2:5
    remotecall(f, wp, i, 5) # runs asynchronously on free worker only, returns immediately
end
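
Following the same observation, one possible way to submit jobs only to idle workers is to use a blocking variant such as remotecall_fetch from one task per job, so that each worker stays out of the pool until its job has finished. This is a sketch under that assumption, not a recommendation from the maintainers; new_pids, wp and results are hypothetical names.

```julia
using Distributed

# Assumption: two local worker processes can be started on this machine.
new_pids = addprocs(2)
wp = WorkerPool(new_pids)

# asyncmap runs each job in its own task. remotecall_fetch blocks that task
# until the job has finished, and only then is the worker returned to the pool,
# so a new job has to wait in take!(pool) until some worker is actually idle.
results = asyncmap(1:4) do i
    remotecall_fetch(wp, i) do j
        sleep(1)
        (j, myid())   # job index and the worker that ran it
    end
end

println(results)
```

With two workers and four one-second jobs, this should take roughly two seconds in total, since at most one job runs on each worker at a time.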

@vtjnash vtjnash transferred this issue from JuliaLang/julia Feb 11, 2024