You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm confused on how to properly use dependencies. Let's say I have a workflow with 4 groups of steps (A, B, C, D) and each has multiple subtasks that can happen in parallel (A1, A2, ..., B1, B2, ...). Currently, I'm adding all the A steps using couler.map, then adding all the B steps with couler.map, etc. This correctly parallelizes across A1, A2, ..., but none of the B steps start until all the A steps have completed, despite the fact that I never explicitly set dependencies.
In this case, I want A and B to run in parallel, then C then D. Having this run sequentially as A, B, C, D is technically correct, but not ideally performant. However, given that I'm not setting dependencies, and they're still running sequentially, I feel like using the set_dependencies function wouldn't help. Also, when I tried to use the set_dependencies function, the couler code errored on parsing its own generated yaml due to duplicate anchor definitions. Would definitely like to see a more in-depth example than those currently present in the README which shows how to properly use set_dependencies in combination with functions like map.
Use Cases
Mostly explained above.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
The text was updated successfully, but these errors were encountered:
Summary
I'm confused on how to properly use dependencies. Let's say I have a workflow with 4 groups of steps (A, B, C, D) and each has multiple subtasks that can happen in parallel (A1, A2, ..., B1, B2, ...). Currently, I'm adding all the A steps using couler.map, then adding all the B steps with couler.map, etc. This correctly parallelizes across A1, A2, ..., but none of the B steps start until all the A steps have completed, despite the fact that I never explicitly set dependencies.
In this case, I want A and B to run in parallel, then C then D. Having this run sequentially as A, B, C, D is technically correct, but not ideally performant. However, given that I'm not setting dependencies, and they're still running sequentially, I feel like using the set_dependencies function wouldn't help. Also, when I tried to use the set_dependencies function, the couler code errored on parsing its own generated yaml due to duplicate anchor definitions. Would definitely like to see a more in-depth example than those currently present in the README which shows how to properly use set_dependencies in combination with functions like map.
Use Cases
Mostly explained above.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
The text was updated successfully, but these errors were encountered: