[SDK] Fix invalid flow path in run YAML (#1656)
# Description
This pull request tightens validation of the `flow` property in the
`RunSchema` class under `src/promptflow`. It also adds test
configurations for bulk runs with an invalid `flow` value and test cases
that check the validation errors raised when an invalid flow path is
provided.

Main changes:

* <a
href="diffhunk://#diff-db38cc07d25efdbc0622e8e9352e07e34c502d36bdad1954bcd52db156192c9fL63-R80">`src/promptflow/promptflow/_sdk/schemas/_run.py`</a>:
Enhanced the validation of the `flow` property in the `RunSchema` class
by adding a new field class, `RemoteFlowStr`, whose `_validate` method
rejects any value that does not start with `azureml:`. <a
href="diffhunk://#diff-db38cc07d25efdbc0622e8e9352e07e34c502d36bdad1954bcd52db156192c9fL63-R80">[1]</a>
<a
href="diffhunk://#diff-db38cc07d25efdbc0622e8e9352e07e34c502d36bdad1954bcd52db156192c9fR51-R67">[2]</a>
* <a
href="diffhunk://#diff-b33d2ea22b9e9679f7a70a7beb5bd27b64c0bdb575e425aeece5322ff550ddbbR1-R11">`src/promptflow/tests/test_configs/runs/bulk_run_invalid_flow.yaml`</a>:
Added a new test configuration file for bulk runs,
`bulk_run_invalid_flow.yaml`.
* <a
href="diffhunk://#diff-c3d1c4e4539af1a59525218043dad93dc866b761a70a16a21783e57a7d0adac5R97-R103">`src/promptflow/tests/sdk_cli_test/unittests/test_run.py`</a>:
Added a new test case to validate the behavior of the code when an
invalid flow path is provided.
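
The remote flow check described above can be sketched without marshmallow. This is a minimal illustration, not the code in `_run.py`: `InvalidRemoteFlowError` and `validate_remote_flow` are hypothetical names that stand in for marshmallow's `ValidationError` and for `RemoteFlowStr._validate`; only the prefix rule and the error message are taken from the diff.

```python
# Hypothetical stand-in for marshmallow's ValidationError.
class InvalidRemoteFlowError(ValueError):
    pass


# Message copied from RemoteFlowStr.default_error_messages in the diff.
ERROR_MESSAGE = "Invalid remote flow path. Currently only azureml:<flow-name> is supported"


def validate_remote_flow(value):
    """Apply the same prefix rule as RemoteFlowStr._validate (sketch only)."""
    if value is None:
        # None is left to the inherited required/allow_none checks.
        return None
    if not isinstance(value, str) or not value.startswith("azureml:"):
        raise InvalidRemoteFlowError(ERROR_MESSAGE)
    return value
```

Under this sketch, `validate_remote_flow("azureml:my_flow")` returns the value unchanged, while `validate_remote_flow("invalid_remote_flow")` raises with the message above.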

Other changes:

* <a
href="diffhunk://#diff-41ec3f7c4b5d4c0e670407d3c00a03a6966d7ebf617b1536473e33a12e2bc765R5-R12">`src/promptflow/CHANGELOG.md`</a>:
Moved three `[Executor]` entries (recursive `system_metrics`
calculation, flow root level `api_calls`, and the `@trace` decorator)
from the 1.3.0 section to a new "Features Added" section under 1.4.0,
and added a "Bugs Fixed" entry for the loose flow path validation. <a
href="diffhunk://#diff-41ec3f7c4b5d4c0e670407d3c00a03a6966d7ebf617b1536473e33a12e2bc765R5-R12">[1]</a>
<a
href="diffhunk://#diff-41ec3f7c4b5d4c0e670407d3c00a03a6966d7ebf617b1536473e33a12e2bc765L18-R23">[2]</a>

Please add an informative description that covers the changes made by
the pull request and link all relevant issues.

# All Promptflow Contribution checklist:
- [ ] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which has an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.
D-W- authored Jan 4, 2024
1 parent b6cdb70 commit ba204a2
Showing 5 changed files with 61 additions and 4 deletions.
10 changes: 7 additions & 3 deletions src/promptflow/CHANGELOG.md
@@ -2,9 +2,16 @@

 ## 1.4.0 (Upcoming)

+### Features Added
+
+- [Executor] Calculate system_metrics recursively in api_calls.
+- [Executor] Add flow root level api_calls, so that user can overview the aggregated metrics of a flow.
+- [Executor] Add @trace decorator to make it possible to log traces for functions that are called by tools.
+
 ### Bugs Fixed

 - Fix unaligned inputs & outputs or pandas exception during get details against run in Azure.
+- Fix loose flow path validation for run schema.

 ## 1.3.0 (2023.12.27)

@@ -13,9 +20,6 @@
 - Add support to configure prompt flow home directory via environment variable `PF_HOME_DIRECTORY`.
   - Please set before importing `promptflow`, otherwise it won't take effect.
 - [Executor] Handle KeyboardInterrupt in flow test so that the final state is Canceled.
-- [Executor] Calculate system_metrics recursively in api_calls.
-- [Executor] Add flow root level api_calls, so that user can overview the aggregated metrics of a flow.
-- [Executor] Add @trace decorator to make it possible to log traces for functions that are called by tools.

 ### Bugs Fixed
 - [SDK/CLI] Fix single node run doesn't work when consuming sub item of upstream node
19 changes: 18 additions & 1 deletion src/promptflow/promptflow/_sdk/schemas/_run.py
@@ -48,6 +48,23 @@ def _validate(self, value):
             )


+class RemoteFlowStr(fields.Str):
+    default_error_messages = {
+        "invalid_path": "Invalid remote flow path. Currently only azureml:<flow-name> is supported",
+    }
+
+    def _validate(self, value):
+        # inherited validations like required, allow_none, etc.
+        super(RemoteFlowStr, self)._validate(value)
+
+        if value is None:
+            return
+        if not isinstance(value, str) or not value.startswith("azureml:"):
+            raise self.make_error(
+                "invalid_path",
+            )
+
+
 class RunSchema(YamlFileSchema):
     """Base schema for all run schemas."""

@@ -60,7 +77,7 @@ class RunSchema(YamlFileSchema):
     properties = fields.Dict(keys=fields.Str(), values=fields.Str(allow_none=True))
     # endregion: common fields

-    flow = UnionField([LocalPathField(required=True), fields.Str(required=True)])
+    flow = UnionField([LocalPathField(required=True), RemoteFlowStr(required=True)])
     # inputs field
     data = UnionField([LocalPathField(), RemotePathStr()])
     column_mapping = fields.Dict(keys=fields.Str)
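
Why the old `flow` definition was loose can be illustrated with a small standalone sketch. It assumes that `UnionField` accepts a value as soon as any one alternative validates it; that behavior is an assumption about promptflow's helper, not something this diff confirms. The names `validate_union`, `local_path`, `any_string`, and `remote_flow` are hypothetical and use plain `ValueError` in place of marshmallow's `ValidationError`.

```python
import os


def local_path(value):
    # Stand-in for LocalPathField: the path must exist on disk.
    if not isinstance(value, str) or not os.path.exists(value):
        raise ValueError(f"Can't find directory or file: {value!r}")
    return value


def any_string(value):
    # Stand-in for fields.Str: every string is accepted.
    if not isinstance(value, str):
        raise ValueError("Not a valid string.")
    return value


def remote_flow(value):
    # Stand-in for RemoteFlowStr: only azureml:<flow-name> is accepted.
    if not isinstance(value, str) or not value.startswith("azureml:"):
        raise ValueError("Invalid remote flow path. Currently only azureml:<flow-name> is supported")
    return value


def validate_union(value, validators):
    """Return the value if any validator accepts it, else raise with all messages."""
    errors = []
    for validator in validators:
        try:
            return validator(value)
        except ValueError as exc:
            errors.append(str(exc))
    raise ValueError("; ".join(errors))


old_rule = [local_path, any_string]   # any string slips through the second alternative
new_rule = [local_path, remote_flow]  # a missing path must at least look like azureml:<flow-name>
```

With `old_rule`, a mistyped local path is silently accepted because it is still a string. With `new_rule`, the same value fails both alternatives and the combined error is raised.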
14 changes: 14 additions & 0 deletions src/promptflow/tests/sdk_cli_test/unittests/test_run.py
@@ -94,6 +94,20 @@ def test_dot_env_resolve(self):
         run = load_run(source=source, params_override=[{"name": run_id}])
         assert run.environment_variables == {"FOO": "BAR"}

+    def test_run_invalid_flow_path(self):
+        run_id = str(uuid.uuid4())
+        source = f"{RUNS_DIR}/bulk_run_invalid_flow_path.yaml"
+        with pytest.raises(ValidationError) as e:
+            load_run(source=source, params_override=[{"name": run_id}])
+        assert "Can't find directory or file in resolved absolute path:" in str(e.value)
+
+    def test_run_invalid_remote_flow(self):
+        run_id = str(uuid.uuid4())
+        source = f"{RUNS_DIR}/bulk_run_invalid_remote_flow_str.yaml"
+        with pytest.raises(ValidationError) as e:
+            load_run(source=source, params_override=[{"name": run_id}])
+        assert "Invalid remote flow path. Currently only azureml:<flow-name> is supported" in str(e.value)
+
     def test_data_not_exist_validation_error(self):
         source = f"{RUNS_DIR}/sample_bulk_run.yaml"
         with pytest.raises(ValidationError) as e:
@@ -0,0 +1,11 @@
+name: flow_run_20230629_101205
+description: sample bulk run
+# flow relative to current working directory should not be supported.
+flow: tests/test_configs/flows/web_classification
+data: ../datas/webClassification1.jsonl
+column_mapping:
+  url: "${data.url}"
+variant: ${summarize_text_content.variant_0}
+
+# run config: env related
+environment_variables: env_file
@@ -0,0 +1,11 @@
+name: flow_run_20230629_101205
+description: sample bulk run
+# invalid remote flow format should not be supported.
+flow: invalid_remote_flow
+data: ../datas/webClassification1.jsonl
+column_mapping:
+  url: "${data.url}"
+variant: ${summarize_text_content.variant_0}
+
+# run config: env related
+environment_variables: env_file
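
The first test configuration relies on local paths being resolved against the directory of the run YAML rather than the current working directory. The sketch below illustrates that idea only; it is an assumption about the intended behavior, inferred from the YAML comment and the error message asserted in the test. `resolve_flow_path` is a hypothetical helper, not a promptflow function.

```python
from pathlib import Path


def resolve_flow_path(yaml_file, flow_value):
    """Resolve flow_value against the directory that contains yaml_file (sketch only)."""
    resolved = (Path(yaml_file).parent / flow_value).resolve()
    if not resolved.exists():
        # Message prefix copied from the assertion in test_run_invalid_flow_path.
        raise FileNotFoundError(
            f"Can't find directory or file in resolved absolute path: {resolved}"
        )
    return resolved
```

Under this assumption, a value such as `../flows/web_classification` resolves correctly from a YAML file in a sibling `runs` directory, while a path that is only valid from the repository root fails with the error above.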
