SNOW-2790496: MLJob : _do_submit_job_v2 function incorrectly constructs container command arguments #188

@XraySierra0211

Hello,
I ran into a problem that prevents me from running ML Jobs.

Description

When submitting an ML Job, the V2 submission path (_do_submit_job_v2, which calls SYSTEM$EXECUTE_ML_JOB) incorrectly constructs the command-line arguments for the job container. It prepends the job's stage path to what should be absolute paths within the container, causing the job to fail at startup.

Observed Behavior

The arguments passed to the container entrypoint script are malformed. For example, a path that should be an absolute path inside the container, such as: /mnt/job_stage/system/mljob_launcher.py

is incorrectly transformed into a stage path like: @payload_stage/MLJOB_.../\\/mnt/job_stage/system/mljob_launcher.py

This invalid path causes the container to fail to find and execute the launcher script.

Root Cause

The issue is located in the list comprehension that builds the args list within the snowflake.ml.jobs.manager._do_submit_job_v2 function:

# manager.py

def _do_submit_job_v2(...):
    # ...
    args = [
        (payload.stage_path.joinpath(v).as_posix() if isinstance(v, PurePath) else v) for v in payload.entrypoint
    ] + (args or [])
    # ...

The payload.entrypoint list can contain pathlib.PurePath objects that represent absolute paths within the container's filesystem (e.g., PurePath('/mnt/job_stage/system/mljob_launcher.py')).

The current logic assumes that every PurePath object in payload.entrypoint is a relative path that must be joined with payload.stage_path. When the entry is already a container-local absolute path, the join produces the malformed stage-prefixed path shown under Observed Behavior.

Proposed Solution

The fix is to modify the list comprehension to simply convert PurePath objects to their string representation without prepending the stage path. The path in payload.entrypoint is already the correct path to be used inside the container.

The line should be changed from:

(payload.stage_path.joinpath(v).as_posix() if isinstance(v, PurePath) else v) for v in payload.entrypoint

to:

(v.as_posix() if isinstance(v, PurePath) else v) for v in payload.entrypoint

This ensures that absolute paths within the container are preserved correctly in the final command arguments.
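To illustrate the proposed behavior in isolation, here is a minimal sketch of the changed comprehension as a standalone function. The function name build_container_args and the sample arguments are hypothetical and not part of snowflake-ml-python; only the comprehension mirrors the proposed line. It also shows that as_posix() yields forward slashes for a Windows-flavored path, which matters for my environment.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath
from typing import List, Optional, Sequence, Union


def build_container_args(
    entrypoint: Sequence[Union[str, PurePath]],
    extra_args: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper mirroring the proposed comprehension.

    PurePath entries are converted to POSIX-style strings as-is,
    without prepending any stage path; other entries pass through.
    """
    return [
        (v.as_posix() if isinstance(v, PurePath) else v) for v in entrypoint
    ] + list(extra_args or [])


# Container-local absolute path taken from the report above.
launcher = "/mnt/job_stage/system/mljob_launcher.py"

# Illustrative entrypoints; "main.py" and "--epochs" are made-up values.
posix_args = build_container_args(
    [PurePosixPath(launcher), "main.py"], ["--epochs", "1"]
)
windows_args = build_container_args([PureWindowsPath(launcher), "main.py"])

print(posix_args)
print(windows_args)
```

In both cases the launcher path should come out unchanged, with no stage prefix. Whether other entries in payload.entrypoint still rely on the stage-path join is something I have not verified.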

Environment (additional information)

  • Windows 11
  • Python 3.10.19
  • snowflake-ml-python 1.19.0
