Extended JobManifest including execution logic

The current DIRAC Workflow implementation is overcomplicated and, at the same time, misses a fundamental piece: logic control (what to do if the execution of Module X produces an error, or how to express loops or if statements).

Starting from the current JobManifest implementation as a DIRAC CFG, we need to allow a complete description of the workload execution: Executable and Arguments are no longer simple Options, as they are now, but full sections. Once we agree on such a description, a tool that controls its execution should be able to replace the current "dirac-jobexec + JobDescription.xml" solution. The new mechanism should handle all user jobs in the same way.

The extended JobManifest will move from:

{
  Executable = /bin/ls
  Arguments = -ltr *.txt
}

To:

{
  Executable
  {
    ...
  }
  Arguments
  {
    ...
  }
}

Let's start with Arguments. In the simplest case it is just an option with a string value (as now), but other alternatives must be added: numerical values, comma-separated lists of strings or numerical values, and finally a CFG section that translates into a kwargs dict. So we should be able to handle:

Arguments = Hello

Arguments = Hello, World

Arguments = 1

Arguments = 0.5, 0.5, 3

Arguments
{
  Arguments = Any of the above
}

Arguments
{
  Site = MySite
  LogLevel = DEBUG
  Iterations = 10
}
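
The forms listed above could be reduced to a single (args, kwargs) pair before being handed to a module. The following is only a sketch under stated assumptions: the names `normalize_arguments` and `_coerce` are hypothetical, a CFG section is modelled as a plain Python dict, and the rule that a section holding only a nested Arguments option is a wrapper is one possible reading of the proposal.

```python
from typing import Any, Dict, List, Tuple, Union

Scalar = Union[str, int, float]


def _coerce(token: str) -> Scalar:
    """Turn a token into int or float when possible, else keep the stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            continue
    return token


def normalize_arguments(
    value: Union[str, Dict[str, Any]]
) -> Tuple[List[Scalar], Dict[str, Any]]:
    """Return (args, kwargs) for the Arguments forms listed above (hypothetical helper)."""
    if isinstance(value, dict):
        # Assumption: a section holding only a nested "Arguments" option is a wrapper.
        if set(value) == {"Arguments"}:
            return normalize_arguments(value["Arguments"])
        # Any other section translates into a kwargs dict.
        return [], {
            key: _coerce(val) if isinstance(val, str) else val
            for key, val in value.items()
        }
    # Plain option: a single value or a comma-separated list.
    return [_coerce(tok) for tok in value.split(",")], {}
```

In this sketch a single value such as Hello still comes back as a one-element list, so the caller never has to distinguish a scalar from a list; whether the real implementation should do the same is open.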

Have a look at https://github.com/acasajus/DIRAC/tree/dev-split, in https://github.com/DIRACGrid/DIRAC/pull/1625, which allows variables to be used in CFGs and thus in the JobManifest:

https://github.com/acasajus/DIRAC/blob/dev-split/Core/Utilities/CFG.py#L1019

@gCFGSynchro
def expand( self ):
  """
  Expand all options into themselves
  a = something-$b
  b = hello

  will end up with
  a = something-hello
  b = hello
  """

The logic below will also be required:

{
  A
  {
    ...
  }
  B = $A
}

producing:

{
  A
  {
    ...
  }
  B
  {
    ...
  }
}

B will be a deep copy of A (as a section); this needs to be implemented.
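
To make the intended behaviour concrete, here is a minimal sketch of such an expansion on plain nested dicts. It is not the CFG.py code from the branch above: the function below is a hypothetical stand-in that only looks up names at the same level, does a single pass (no chained references) and does not handle path-style references such as $Result/OK.

```python
import copy
import re
from typing import Any, Dict

_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def expand(section: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of section with $name references resolved at the same level."""
    result = copy.deepcopy(section)
    for key, value in result.items():
        if not isinstance(value, str):
            continue
        whole = _VAR.fullmatch(value.strip())
        if whole and isinstance(section.get(whole.group(1)), dict):
            # "B = $A" where A is a section: B becomes an independent deep copy.
            result[key] = copy.deepcopy(section[whole.group(1)])
        else:
            # "a = something-$b": substitute string options, leave unknown names as-is.
            result[key] = _VAR.sub(
                lambda m: section[m.group(1)]
                if isinstance(section.get(m.group(1)), str)
                else m.group(0),
                value,
            )
    return result
```

Because B is a deep copy, later changes to B (for example after a module has run) would not leak back into A.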

For the Executable, in the simplest case we have to support the current situation, where it is an option with a string value. In the general case, however, it will be a full CFG section describing the job execution logic:

Executable = /bin/ls

should become:

Executable
{
  ...
}

What needs to be described in that new section is:

A list of named modules that are going to be used. A few standard ones should be available, like "Exit", "ShellExec", "GetData", etc.
An execution logic
An error handling logic

So, as a first approach, it will look like:

Executable
{
  Modules
  {
    ...
  }
  Execution
  {
    ...
  }
}

Some Options could be defined to manipulate arguments (merging, concatenating, etc.). In a trivial case the Modules section can be skipped.

Now to each of these sections (these are still quite preliminary ideas, but I think they are not too far from a working solution).

Modules: a container of sections, each one describing a named module. On execution each module will be instantiated. It will have to inherit from a proper "Module" class, still to be defined, and include an execute method that returns S_OK/S_ERROR with some limitations (only strings, numerical types, and lists, sequences or dicts of them are allowed), plus a "getCFG" method that returns the default entry to be added when creating a new job description.

Modules
{
  # These are named descriptions of each module to be used in Execution
  # For simple cases the Module can be named after the python Module
  # and searched for in a default location:
  #    [XXX]DIRAC.Job.Modules.[ModuleName]
  #   in this case Path can be skipped
  # The default Arguments for the Module is the Input CFG defined in the Execution section
  # Arguments can be a string, a list of strings or a full section (dictionary)
  # The default Output is the Result dictionary returned by the execute method
  # All Options with updated values after the execution of the module should be made available
  #
  ShellExec
  {
     # This path could be the default one to be searched for in all available extensions
     Path = Job.Modules.ShellExec
     # When called each module will get an Input CFG with named arguments
     # This is the default if nothing else is set
     Arguments = $Input
     # This is the default if nothing else is set
     # Result = CFG version of the result returned after the execution
     # After execution the return dictionary is passed as Result and the JobManifest reevaluated
     Status = $Result/OK
     # The result assumes that apart from OK and Value/Message this module also includes
     # ExitCode, StdOut, StdErr (name of StdOut/StdErr Files)
     # in the returned result dict
     ExitCode = $Result/ExitCode
     StdOut = $Result/StdOut
     StdErr = $Result/StdErr
  }
}
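
On the Python side, the module contract described above might look roughly like the sketch below. Everything here is an assumption made for illustration: the "Module" base class does not exist yet, S_OK and S_ERROR are redefined locally as stand-ins for the DIRAC return structures so that the example runs on its own, and StdOut/StdErr carry the captured text rather than file names.

```python
import subprocess
from typing import Any, Dict, List


# Local stand-ins for DIRAC's S_OK / S_ERROR, so the sketch is self-contained.
def S_OK(value: Any = None) -> Dict[str, Any]:
    return {"OK": True, "Value": value}


def S_ERROR(message: str = "") -> Dict[str, Any]:
    return {"OK": False, "Message": message}


class Module:
    """Hypothetical base class that every job module would inherit from."""

    def execute(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        raise NotImplementedError

    def getCFG(self) -> Dict[str, Any]:
        """Default entry added to the Modules section of a new job description."""
        return {"Arguments": "$Input", "Status": "$Result/OK"}


class ShellExec(Module):
    """Runs a command and reports ExitCode, StdOut and StdErr in the result dict."""

    def execute(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        command: List[str] = [str(arg) for arg in args]
        if not command:
            return S_ERROR("No command given")
        try:
            proc = subprocess.run(command, capture_output=True, text=True)
        except OSError as exc:
            return S_ERROR(str(exc))
        result = S_OK() if proc.returncode == 0 else S_ERROR("Non-zero exit code")
        # Simplification: captured text instead of the names of StdOut/StdErr files.
        result.update(ExitCode=proc.returncode, StdOut=proc.stdout, StdErr=proc.stderr)
        return result

    def getCFG(self) -> Dict[str, Any]:
        cfg = super().getCFG()
        cfg.update(
            Path="Job.Modules.ShellExec",
            ExitCode="$Result/ExitCode",
            StdOut="$Result/StdOut",
            StdErr="$Result/StdErr",
        )
        return cfg
```

The result dict only contains booleans, numbers and strings, which keeps it within the limitations on S_OK/S_ERROR values mentioned above and makes it straightforward to convert into a CFG for $Result lookups.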

Execution: a container for subsections describing the execution logic of the job (remember that CFG keeps track of the order). Something like:

Execution
{
  # If we are not going to add anything we can directly use Input
  # For old style JDLs, they will translate into something like
  # Arguments
  # {
  #   Executable = some executable_path
  #   Arguments = string, number, list of strings, list of numbers or CFG (dict)
  # }
  # And will make use of a "default" ShellExec module, which does not need to be declared
  # LogicFlow
  # {
  #   Arguments = $Input
  #   Module = ShellExec, Arguments, Status
  # }
  # OnError
  # {
  #   # Exit is just another available module like ShellExec
  #   Module = Exit, 1
  # }
  #
  # Otherwise, use something like
  # For a linear execution with loop and conditional statements
  LogicFlow
  {
    # This is a sequential list describing how to execute Modules. The number is irrelevant, but
    # names must start with keyword Module,
    # plus some options to allow manipulation of Input Arguments
    # each Module can be a list or a list of lists (a list of Modules to be executed, that can be empty)
    # The list should include at least one item (the name of the Module in the Modules section)
    # The second item is passed as argument to the execute method of the Module
    # they can be a string, a list (args) or a dictionary (kwargs)
    # The third and any further items are outputs from the execution of the module, they may come from
    # the Result Dict returned by the actual execution, or be determined in some other way.
    Module1 = ShellExec, $Input, Status
    # After execution of Module1, $Module1/Result will always be defined,
    # and in this case also $Module1/Status
    Arguments = $Module1/Arguments/Executable, '-l', $Module1/Result/StdOut
    Module2 = ShellExec, $Arguments
    # or even things like this at a later stage (this looks very much like
    # what we need for workflows of Jobs)
    # Iterations of the same Module1 definition
    Module3 = [ $Module1 for i in [1, 2, 3, 4, 5] ]
    # Conditional execution
    Module4 = [ $Module1 for i in [1,] if $Module3/Status ]
  }
  OnError
  {
    # If a given module from the LogicFlow is defined here, and Module['Result']['OK'] == False
    # the Error Module defined here is executed
    Module1 = Exit, 1
    Module3 = Exit, 3
    Module4 = Exit, 0
  }
}
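
The control flow that a tool replacing dirac-jobexec would need for the linear case can be sketched as follows. This is only an illustration under several assumptions: the steps are already resolved into zero-argument callables, loops and conditionals are left out, a failing step without an OnError entry simply lets the flow continue (the proposal does not say what should happen), and the Exit module is replaced by a stand-in that records the exit code instead of terminating the process. All names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Result = Dict[str, Any]
Step = Tuple[str, Callable[[], Result]]  # (step name, zero-argument callable)


def run_logic_flow(
    flow: List[Step], on_error: Dict[str, Callable[[Result], Result]]
) -> Dict[str, Result]:
    """Run steps in order; if a step fails and has an OnError entry, run it and stop."""
    results: Dict[str, Result] = {}
    for name, step in flow:
        results[name] = step()
        if not results[name]["OK"] and name in on_error:
            results["OnError"] = on_error[name](results[name])
            break
    return results


def ok() -> Result:
    return {"OK": True, "Value": "done"}


def fail() -> Result:
    return {"OK": False, "Message": "boom"}


def exit_with(code: int) -> Callable[[Result], Result]:
    # Stand-in for the Exit module: records the code instead of calling sys.exit.
    return lambda failed: {"OK": True, "ExitCode": code, "Cause": failed["Message"]}


flow = [("Module1", ok), ("Module3", fail), ("Module4", ok)]
outcome = run_logic_flow(flow, {"Module1": exit_with(1), "Module3": exit_with(3)})
```

Here Module3 fails, its OnError entry runs, and Module4 is never executed. Keeping every intermediate result under the step name mirrors the $Module1/Result style of reference used in the LogicFlow section.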
