[Internal]Add validate functionality in schema_checker (#1681)
# Description

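Add flow validation to schema_checker: besides checking for the `$schema` field, the script now runs `pf_client.flows.validate` on each `flow.dag.yaml`.

Warnings are also treated as errors.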

![image](https://github.com/microsoft/promptflow/assets/2208599/5b691c8d-ee6d-479a-823e-d2eb0e9facc9)
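
The "warnings count as errors" rule can be sketched roughly as follows. This is a minimal illustration, not the script itself: `FakeValidationResult`, its public `warnings` field, and `is_failure` are hypothetical stand-ins (the script reads `passed` and the private `_warnings` attribute from the object returned by `pf_client.flows.validate`), and the file paths are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeValidationResult:
    """Hypothetical stand-in for the result of pf_client.flows.validate."""
    passed: bool
    warnings: List[str] = field(default_factory=list)


def is_failure(result: FakeValidationResult) -> bool:
    # A flow fails the check if validation did not pass
    # OR if validation produced at least one warning.
    return (not result.passed) or len(result.warnings) > 0


# Made-up paths, used only to show the three possible outcomes.
results = {
    "clean/flow.dag.yaml": FakeValidationResult(passed=True),
    "warned/flow.dag.yaml": FakeValidationResult(passed=True, warnings=["unknown field"]),
    "broken/flow.dag.yaml": FakeValidationResult(passed=False),
}

failed = sorted(path for path, result in results.items() if is_failure(result))
print(failed)
```

Under this rule a flow that validates but emits a warning is reported alongside flows that fail outright.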

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which has an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.
crazygao authored Jan 8, 2024
1 parent 4021dc0 commit 5a31079
Showing 4 changed files with 54 additions and 12 deletions.
14 changes: 10 additions & 4 deletions .github/workflows/flowdag_schema_check.yml
@@ -5,25 +5,31 @@ on:
paths:
- examples/**
- .github/workflows/flowdag_schema_check.yml
- scripts/readme/schema_checker.py
env:
IS_IN_CI_PIPELINE: "true"
jobs:
examples_flowdag_schema_check:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4
- run: env | sort >> $GITHUB_OUTPUT
- name: Python Setup - ubuntu-latest - Python Version 3.9
uses: "./.github/actions/step_create_python_environment"
with:
pythonVersion: 3.9
- run: pip install -r ${{ github.workspace }}/examples/dev_requirements.txt
- run: |
pip install -r ${{ github.workspace }}/examples/dev_requirements.txt
pip install -r ${{ github.workspace }}/examples/requirements.txt
- name: Summarize check status
id: summarize_check_status
working-directory: ${{ github.workspace }}
shell: pwsh
env:
PYTHONPATH: ${{ github.workspace }}/src/promptflow
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
cd ${{ github.workspace }}/src/promptflow
pip install -e .
cd ${{ github.workspace }}/src
pip install -e promptflow[azure]
pip install -e promptflow-tools
python ${{ github.workspace }}/scripts/readme/schema_checker.py
1 change: 0 additions & 1 deletion examples/flows/evaluation/eval-qna-non-rag/flow.dag.yaml
@@ -1,5 +1,4 @@
 $schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
-name: QnA Evaluation
 inputs:
   question:
     type: string
@@ -1,5 +1,4 @@
 $schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
-name: QnA RAG Evaluation
 inputs:
   metrics:
     type: string
50 changes: 44 additions & 6 deletions scripts/readme/schema_checker.py
@@ -1,28 +1,66 @@
-from functools import reduce
 from promptflow._sdk._load_functions import load_yaml
+from promptflow._sdk._pf_client import PFClient
 from ghactions_driver.readme_step import ReadmeStepsManage
 from pathlib import Path
+import os
+import subprocess
+import sys
+
+
+def install(filename):
+    subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", filename])


 def main(input_glob_flow_dag):
     # check if flow.dag.yaml contains schema field.
-    def set_add(p, q):
-        return p | q

     error = False
-    globs = reduce(set_add, [set(Path(ReadmeStepsManage.git_base_dir()).glob(p)) for p in input_glob_flow_dag], set())
+    globs = set()
+    pf_client = PFClient()
+
+    for p in input_glob_flow_dag:
+        globs = globs | set(Path(ReadmeStepsManage.git_base_dir()).glob(p))
     flow_dag_items = sorted([i for i in globs])

     for file in flow_dag_items:
         data = load_yaml(file)
         if "$schema" not in data.keys():
             print(f"{file} does not contain $schema field.")
             error = True
+        if error is False:
+            new_links = []
+            if (Path(file).parent / "requirements.txt").exists():
+                install(Path(file).parent / "requirements.txt")
+            if "flow-with-symlinks" in str(file):
+                saved_path = os.getcwd()
+                os.chdir(str(file.parent))
+                source_folder = Path("../web-classification")
+                for file_name in os.listdir(source_folder):
+                    if not Path(file_name).exists():
+                        os.symlink(
+                            source_folder / file_name,
+                            file_name
+                        )
+                        new_links.append(file_name)
+            validation_result = pf_client.flows.validate(
+                flow=file,
+            )
+            if "flow-with-symlinks" in str(file):
+                for link in new_links:
+                    os.remove(link)
+                os.chdir(saved_path)
+            print(f"VALIDATE {file}: \n" + repr(validation_result))
+            if not validation_result.passed:
+                print(f"{file} is not valid.")
+                error = True
+            if len(validation_result._warnings) > 0:
+                print(f"{file} has warnings.")
+                error = True

     if error:
-        raise Exception("Some flow.dag.yaml doesn't contain $schema field.")
+        raise Exception("Some flow.dag.yaml validation failed.")
     else:
-        print("All flow.dag.yaml contain $schema field.")
+        print("All flow.dag.yaml validation completed.")


 if __name__ == "__main__":
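
The `flow-with-symlinks` handling in `schema_checker.py` creates links before validation and removes them afterwards. One way to make that cleanup robust when validation raises is a context manager with `try`/`finally`. The sketch below is an assumption-laden illustration, not the committed code: `temporary_symlinks` is a hypothetical helper, it uses absolute link targets instead of `os.chdir`, and the directories are created in a temporary folder purely for demonstration.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_symlinks(target_dir: Path, source_dir: Path):
    """Hypothetical helper: link each entry of source_dir into target_dir,
    then remove the created links when the block exits."""
    created = []
    try:
        for entry in sorted(os.listdir(source_dir)):
            link = target_dir / entry
            if not link.exists():
                os.symlink((source_dir / entry).resolve(), link)
                created.append(link)
        yield created
    finally:
        # Runs even if the body raises, so no stray links are left behind.
        for link in created:
            os.remove(link)


# Demonstration on throwaway directories (names mirror the script, contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "web-classification"
    target = Path(tmp) / "flow-with-symlinks"
    source.mkdir()
    target.mkdir()
    (source / "flow.dag.yaml").write_text("inputs: {}\n")

    with temporary_symlinks(target, source) as links:
        linked_names = [link.name for link in links]
        visible_during = (target / "flow.dag.yaml").exists()
    visible_after = (target / "flow.dag.yaml").exists()
```

In the script, the call to `pf_client.flows.validate` would sit inside the `with` block. Creating symlinks may require extra privileges on Windows, so this sketch assumes a POSIX runner such as the `ubuntu-latest` job used by the workflow.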