Execution context and modules activation for execution #338

Open
denisri opened this issue Nov 24, 2023 · 4 comments

Comments

@denisri
Collaborator

denisri commented Nov 24, 2023

Config modules can provide a static method init_execution_context in order to initialize things for execution. This method is called from the constructor of ExecutionContext. The context is created in the function capsul.engine.execution_context(), which, at the moment, does things in two steps:

  • Create the context using only the list of config modules, but no config contents.
    The constructor is called at this point, and calls init_execution_context for modules which do not have config values yet.
  • Then fill in the config in the context. But this is too late: the modules' init_execution_context will not be called again.

We should modify this, either by calling the constructor with the final, complete (filtered) config, or by calling the modules' init_execution_context later, once the config is set up.
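To make the ordering problem concrete, here is a minimal sketch using toy classes. ToolModule, ToyContext and the three helper functions are made up for illustration and are not the actual Capsul classes; only the name init_execution_context comes from the real code. It compares the current two-step creation with the two fixes proposed above.

```python
class ToolModule:
    """Stand-in for a config module (hypothetical, not a real Capsul class)."""

    @staticmethod
    def init_execution_context(context):
        # Record the config values the module can see when it is initialized.
        context.seen_at_init = dict(context.config.get("tool", {}))


class ToyContext:
    """Toy context whose constructor may trigger module initialization."""

    def __init__(self, modules, config=None, init_modules=True):
        self.modules = list(modules)
        self.config = dict(config or {})
        self.seen_at_init = None
        if init_modules:
            self.init_modules()

    def init_modules(self):
        for module in self.modules:
            module.init_execution_context(self)


def two_step(modules, config):
    # Current order: construct (and initialize) first, fill the config afterwards.
    context = ToyContext(modules)
    context.config.update(config)
    return context


def config_first(modules, config):
    # First proposed fix: give the complete config to the constructor.
    return ToyContext(modules, config)


def deferred_init(modules, config):
    # Second proposed fix: construct without initializing, then initialize
    # the modules once the config is in place.
    context = ToyContext(modules, init_modules=False)
    context.config.update(config)
    context.init_modules()
    return context


config = {"tool": {"directory": "/opt/tool"}}
late = two_step([ToolModule], config)
early = config_first([ToolModule], config)
deferred = deferred_init([ToolModule], config)
```

In this toy model, late.seen_at_init is empty because the module was initialized before the config was filled in, while both alternatives let the module see its config values.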

Additionally, there is another issue with execution contexts. They are apparently meant for "execution", that is, for the computing server side: they initialize config modules in order to run code and software in this config. But we also need to instantiate them on the client side, because they are needed for completion: ProcessMetadata needs an execution context in its initialization. So this execution runtime initialization is also performed on the client side, using config values which are suited to the server. This 1. is unnecessary work there, and 2. will potentially lead to errors, because the config does not match the client machine running the init code.

@sapetnioc
Collaborator

We should reuse the builtin execution context because it is supposed to represent the local machine, and therefore the client context. Activating this context would also be the first step towards a solution for #335. Now, we have to decide when and how to create and activate this context. I am not in favor of doing it in the capsul = Capsul() call, for two reasons:

  • It can be slow to activate external software if we do not really need it.
  • We would lose the possibility to change the config on the capsul object before activating the context.

We could add a method such as Capsul.activate_context() that would simply create and return a builtin execution context. We may have to adapt the way we handle the selection of a specific software version when several are present in the config. To date, it relies on process requirements; for activate_context(), these requirements would have to be taken from the config.
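A rough sketch of what such a method could look like, on a toy object. ToyCapsul, select_module_config and the config layout (a "modules" section and a "requirements" section) are assumptions made up for this example; the real Capsul config is organized differently and activate_context() does not exist yet.

```python
def select_module_config(candidates, requirement):
    """Return the first candidate whose values match every required key."""
    for values in candidates.values():
        if all(values.get(key) == value for key, value in requirement.items()):
            return values
    raise ValueError(f"no configuration matches {requirement!r}")


class ToyCapsul:
    """Toy stand-in for the Capsul application object (hypothetical)."""

    def __init__(self):
        # Nothing is activated here, so construction stays cheap and the
        # config can still be edited before the context is built.
        self.config = {"modules": {}, "requirements": {}}
        self._context = None

    def activate_context(self):
        """Build the local context on first call, then return the same one."""
        if self._context is None:
            context = {}
            for module, candidates in self.config["modules"].items():
                # Requirements come from the config, not from a process.
                requirement = self.config["requirements"].get(module, {})
                context[module] = select_module_config(candidates, requirement)
            self._context = context
        return self._context


capsul = ToyCapsul()
capsul.config["modules"]["tool"] = {
    "tool_1": {"version": "1", "directory": "/opt/tool-1"},
    "tool_2": {"version": "2", "directory": "/opt/tool-2"},
}
capsul.config["requirements"]["tool"] = {"version": "2"}
context = capsul.activate_context()
```

Here the config is modified after the object is created and before the explicit activation, and the version is selected from requirements stored in the config rather than from a process.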

@denisri
Collaborator Author

denisri commented Nov 24, 2023

Is there a builtin execution context? I can't find it. The builtin worker makes no mention of an execution context, and in run.py the context is requested from the database, which re-creates it from the stored config, but the ExecutionContext object is the same class.
Otherwise I agree it should not be activated in Capsul(), because we don't need it at this time, and the config is likely not complete yet.
Plus, importantly, the context is built in accordance with a process, which provides requirements that allow selecting among several configs.
While writing this, I notice that, during execution, the context re-created by the database does not depend on the process/job, but only on the execution_id. This means that all jobs in the same workflow share the same context, and so also the same requirements. Maybe something is missing here...?

Anyway I have started to work on this problem (along with #335). I will propose something soon.

denisri added a commit that referenced this issue Nov 24, 2023
and ease use of external software by re-introducing in_context (#338, #335)
@denisri
Collaborator Author

denisri commented Nov 27, 2023

OK, I understand now: the builtin context is the builtin computing resource defined in the configuration object. If we want to use a remote resource, then the selected resource is not "builtin" on the client side. We could maybe use this name to detect whether we are local or not, but it's not really a good test: we might want to define other resources with different configs that are still local. Moreover, once the engine is selected, I'm not sure we have its name somewhere to inform the execution context creation.
Maybe the moment when the context is re-created by the database object is the right time to decide that this context is really for execution and that it should activate modules, as opposed to when we use Capsul.engine().execution_context().
The remaining problem is that the context created in the database doesn't know the process/job and its requirements.
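As a sketch of this idea, with made-up names (ToyContext, ToyDatabase and the activate_modules flag are assumptions for illustration, not the actual Capsul API): the context built on the client side skips module activation, while the one re-created from the database activates modules.

```python
class ToyContext:
    """Toy context: modules are activated only when explicitly requested."""

    def __init__(self, config, activate_modules=False):
        self.config = dict(config)
        self.activated = []
        if activate_modules:
            # Stand-in for calling each module's init_execution_context.
            for module_name in self.config:
                self.activated.append(module_name)


class ToyDatabase:
    """Toy database storing one config per execution (hypothetical)."""

    def __init__(self):
        self._configs = {}

    def store_config(self, execution_id, config):
        self._configs[execution_id] = dict(config)

    def execution_context(self, execution_id):
        # Server side: this context is really for execution, so activate.
        # It is looked up by execution_id only, not by job.
        return ToyContext(self._configs[execution_id], activate_modules=True)


config = {"tool": {"directory": "/opt/tool"}}

# Client side (completion): no activation.
client_context = ToyContext(config)

# Server side: every job of the execution gets a context from the same config.
database = ToyDatabase()
database.store_config("execution-1", config)
job_a_context = database.execution_context("execution-1")
job_b_context = database.execution_context("execution-1")
```

Because the lookup key is the execution_id alone, both jobs end up with the same config in this model, which is the limitation noted in the last sentence above.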

denisri added a commit that referenced this issue Nov 27, 2023
when the context is built from the database, really for execution
(#338)
@denisri
Collaborator Author

denisri commented Nov 27, 2023

The last commit above implements what was described in the previous comment, except for adapting the context config to the job to run. This information is not stored in the database, so for now the config has to be the same for all jobs of the execution.
