Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Optional" Artifact should not throw an error when a loader is specified #1253

Open
3 tasks done
sam2x opened this issue Oct 31, 2024 · 0 comments
Open
3 tasks done
Labels
type:bug A general bug

Comments

@sam2x
Copy link

sam2x commented Oct 31, 2024

Pre-bug-report checklist

1. This bug can be reproduced using pure Argo YAML

  • No

2. I have searched for existing issues

  • Yes

3. This bug occurs in Hera when...

  • running on Argo with the Hera runner

Bug report

Hello,

Summary

In the current Runner code, when the parameter "Optional" is set to "True" for an Artifact, it will always fail if an Artifact loader is specified.
Current implemention is as follow:

def get_annotated_artifact_value(artifact_annotation: Artifact) -> Union[Path, Any]:
    """Get the value of the given Artifact annotation.

    If the artifact is an output, return the path it will write to.
    If the artifact is an input, return the loaded value (json, path or string) using its ArtifactLoader.

    As Artifacts are always Annotated in function parameters, we don't need to consider
    the `kwargs` or the function parameter name.
    """
    if artifact_annotation.output:
        if artifact_annotation.path:
            path = Path(artifact_annotation.path)
        else:
            path = _get_outputs_path(artifact_annotation)
        # Automatically create the parent directory (if required)
        path.parent.mkdir(parents=True, exist_ok=True)
        return path

    if not artifact_annotation.path:
        # Path is added to the spec automatically when built. As it isn't present in the annotation itself,
        # we need to add it back in for the runner.
        artifact_annotation.path = artifact_annotation._get_default_inputs_path()

    if artifact_annotation.loader == ArtifactLoader.json.value:
        path = Path(artifact_annotation.path)
        with path.open() as f:
            return json.load(f)

    if artifact_annotation.loader == ArtifactLoader.file.value:
        path = Path(artifact_annotation.path)
        return path.read_text()

    if artifact_annotation.loader is None:
        return artifact_annotation.path

    raise RuntimeError(f"Artifact {artifact_annotation.name} was not given a value")

It will raise "FileNotFoundError: [Errno 2] No such file or directory: '/tmp/hera-inputs/artifacts/data'".

To Reproduce

Any Input inherited class with the following declaration:

    data: Annotated[Any, Artifact(name="data", optional=True, loader=ArtifactLoader.json)] = ""

Expected behavior

We can't catch the exception, which in my opinion doesn't make sense when "Optional" is to True ?
I believe this was not the intended logic when you implemented "Optional". Since the loader is "called" before the function and out of our control, there should be a condition whenever Optional is True to just return 'None' ?

You would maybe ask me "why don't you just remove 'loader'" for this specific Input class ?": Actually I can, but the reason is:

  1. It's really important for me to keep my code standardized/consistent as much as possible and 'avoid' such conditional case. This Input class is currently used over all my tasks as "base" Input class.
  2. I keep an 'interoperable' approach of my tasks (they could be used as entry-point task or 'passing' task, which mean that either direct 'Parameter' value are provided to the task, or Artifact-passing are provided).

Proposed solution

When parameter optional is specified, the get_annotated_artifact_value should return None is the file opening fail (non-existing file, error IO, etc.).
Code:

    if artifact_annotation.loader == ArtifactLoader.json.value:
        path = Path(artifact_annotation.path)
        try:
            with path.open() as f:
                return json.load(f)
        except FileNotFoundError:
            return None
    if artifact_annotation.loader == ArtifactLoader.file.value:
        path = Path(artifact_annotation.path)
        try:
            with path.open() as f:
                return path.read_text()
        except FileNotFoundError:
            return None
    if artifact_annotation.loader is None:
        return artifact_annotation.path

As a temporary solution, i will implement a custom 'loader' that do it, but I believe this temporary solution should be the mainstream approach for Hera.

I reported this as a "bug", but could not find other category suitable.

Let me know your vision on this.
Br,
Sam

@sam2x sam2x changed the title "Optional" in Artifact should not throw an error when a loader is specified "Optional" Artifact should not throw an error when a loader is specified Oct 31, 2024
@sambhav sambhav added the type:bug A general bug label Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants