[Samples]Add column mapping in CLI sample #813

Merged 8 commits on Oct 24, 2023
2 changes: 1 addition & 1 deletion docs/how-to-guides/manage-runs.md
Expand Up @@ -45,7 +45,7 @@ column_mapping:
run: <existing-flow-run-name>
```

See [here](./run-and-evaluate-a-flow/use-column-mapping.md) for detailed information on column mapping.
See [here](https://aka.ms/pf/column-mapping) for detailed information on column mapping.
You can find additional information about flow yaml schema in [Run YAML Schema](../reference/run-yaml-schema-reference.md).

After preparing the yaml file, use the CLI command below to create them:
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ Create the run with flow and data, can add `--stream` to stream the run.
pf run create --flow standard/web-classification --data standard/web-classification/data.jsonl --column-mapping url='${data.url}' --stream
```

Note that `column-mapping` maps flow input names to specified values; see [Use column mapping](./use-column-mapping.md) for more details.
Note that `column-mapping` maps flow input names to specified values; see [Use column mapping](https://aka.ms/pf/column-mapping) for more details.
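
To make the idea of a column mapping concrete, here is a minimal Python sketch of how a `${data.<column>}` reference could be resolved against one row of the data file. It is a toy illustration, not promptflow's implementation: `resolve_inputs` and the sample row are hypothetical, and treating any other value as a literal is an assumption of this sketch.

```python
import re

# Illustrative only: a toy resolver, not promptflow's actual code.
_DATA_REF = re.compile(r"^\$\{data\.(\w+)\}$")


def resolve_inputs(column_mapping, row):
    """Build the flow inputs for one data row from a column mapping."""
    inputs = {}
    for input_name, spec in column_mapping.items():
        match = _DATA_REF.match(spec)
        if match:
            # "${data.<column>}" pulls the value from the named data column.
            inputs[input_name] = row[match.group(1)]
        else:
            # Assumption of this sketch: anything else is passed as a literal.
            inputs[input_name] = spec
    return inputs


# Hypothetical row, shaped like one line of data.jsonl.
row = {"url": "https://example.com/page", "answer": "App"}
print(resolve_inputs({"url": "${data.url}"}, row))
```

Here the flow input `url` receives the value of the `url` column, which is what `--column-mapping url='${data.url}'` expresses on the command line.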

You can also name the run by adding `--name my_first_run` to the command above; otherwise the run name is generated from a pattern that includes a timestamp.

Expand Down Expand Up @@ -118,10 +118,10 @@ In this guide, we use [eval-classification-accuracy](https://github.com/microsof

After the run finishes, you can evaluate it with the command below. Compared with the normal `run create` command, note the two extra arguments:

- `column-mapping`: A mapping from flow input names to specified data values. See [here](./use-column-mapping.md) for detailed information.
- `column-mapping`: A mapping from flow input names to specified data values. See [here](https://aka.ms/pf/column-mapping) for detailed information.
- `run`: The run name of the flow run to be evaluated.

More details can be found in [Use column mapping](./use-column-mapping.md).
More details can be found in [Use column mapping](https://aka.ms/pf/column-mapping).
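
As a rough illustration of how the two arguments interact in an evaluation run, the Python sketch below resolves `${data.<column>}` against a data row and `${run.outputs.<name>}` against the matching output of the referenced run. Everything here is hypothetical (the `resolve` helper, the sample rows, and the assumption that data rows and run outputs are paired in line order); it is not promptflow's implementation.

```python
# Illustrative only: hypothetical in-memory rows, not promptflow internals.
def resolve(spec, data_row, run_output):
    """Resolve "${data.x}" or "${run.outputs.x}" against one paired line."""
    if not (spec.startswith("${") and spec.endswith("}")):
        return spec  # assumption: non-references are literals
    path = spec[2:-1]
    if path.startswith("data."):
        return data_row[path[len("data."):]]
    if path.startswith("run.outputs."):
        return run_output[path[len("run.outputs."):]]
    raise ValueError(f"unsupported reference: {spec}")


# Hypothetical data rows and outputs of the run being evaluated.
data = [{"url": "a", "answer": "App"}, {"url": "b", "answer": "Channel"}]
run_outputs = [{"category": "App"}, {"category": "Academic"}]
mapping = {
    "groundtruth": "${data.answer}",
    "prediction": "${run.outputs.category}",
}

# Assumption: line i of the data is paired with output i of the run.
eval_inputs = [
    {name: resolve(spec, row, out) for name, spec in mapping.items()}
    for row, out in zip(data, run_outputs)
]
print(eval_inputs)
```

Each evaluation line then sees a `groundtruth` taken from the data file and a `prediction` taken from the earlier run's output.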

```sh
pf run create --flow evaluation/eval-classification-accuracy --data standard/web-classification/data.jsonl --column-mapping groundtruth='${data.answer}' prediction='${run.outputs.category}' --run my_first_run --stream
Expand Down Expand Up @@ -161,7 +161,7 @@ After the run is finished, you can evaluate the run with below command, compared
- If the data column is from your flow output, then it is specified as `${run.outputs.<output_name>}`.
- `run`: The run name or run instance of the flow run to be evaluated.

More details can be found in [Use column mapping](./use-column-mapping.md).
More details can be found in [Use column mapping](https://aka.ms/pf/column-mapping).

```python
# set eval flow path
3 changes: 2 additions & 1 deletion docs/how-to-guides/tune-prompts-with-variants.md
Expand Up @@ -71,7 +71,7 @@ Assuming you are in working directory `<path-to-the-sample-repo>/examples/flows/
Note we pass `--variant` to specify which variant of the node should be running.

```sh
pf run create --flow web-classification --data web-classification/data.jsonl --variant '${summarize_text_content.variant_1}' --stream --name my_first_variant_run
pf run create --flow web-classification --data web-classification/data.jsonl --variant '${summarize_text_content.variant_1}' --column-mapping url='${data.url}' --stream --name my_first_variant_run
```

:::
Expand All @@ -91,6 +91,7 @@ variant_run = pf.run(
flow=flow,
data=data,
variant="${summarize_text_content.variant_1}", # use variant 1.
column_mapping={"url": "${data.url}"},
)

pf.stream(variant_run)
2 changes: 1 addition & 1 deletion docs/reference/pfazure-command-reference.md
Expand Up @@ -64,7 +64,7 @@ Inputs column mapping, use `${data.xx}` to refer to data file columns, use `${ru

`--run`

Referenced flow run name. For example, you can run an evaluation flow against an existing run: "pfazure run create --flow evaluation_flow_dir --run existing_bulk_run".
Referenced flow run name. For example, you can run an evaluation flow against an existing run: "pfazure run create --flow evaluation_flow_dir --run existing_bulk_run --column-mapping url='${data.url}'".

`--variant`

7 changes: 5 additions & 2 deletions examples/flows/evaluation/eval-basic/README.md
Expand Up @@ -36,5 +36,8 @@ pf flow test --flow . --node line_process --inputs groundtruth=ABC prediction=AB
There are two ways to evaluate a classification flow.

```bash
pf run create --flow . --data ./data.jsonl --stream
```
pf run create --flow . --data ./data.jsonl --column-mapping groundtruth='${data.groundtruth}' prediction='${data.prediction}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
Original file line number Diff line number Diff line change
Expand Up @@ -38,9 +38,12 @@ pf flow test --flow . --node grade --inputs groundtruth=groundtruth prediction=p
There are two ways to evaluate a classification flow.

```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping groundtruth='${data.groundtruth}' prediction='${data.prediction}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
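
The same-name case can be pictured with a short Python sketch. It only illustrates the idea stated above (a flow input is fed from a data column with the same name); the function name is hypothetical, and the authoritative default behavior is the one described at the link, which may cover more cases than this.

```python
# Illustrative only: same-name matching, not promptflow's actual default logic.
def default_column_mapping(flow_inputs, data_columns):
    """Map each flow input to the data column of the same name, if present."""
    return {
        name: f"${{data.{name}}}"  # renders as "${data.<name>}"
        for name in flow_inputs
        if name in data_columns
    }


mapping = default_column_mapping(
    flow_inputs=["groundtruth", "prediction"],
    data_columns=["groundtruth", "prediction", "id"],
)
print(mapping)
```

For the data shown, this yields the same mapping that the explicit `--column-mapping groundtruth='${data.groundtruth}' prediction='${data.prediction}'` arguments spell out.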

### 3. create run against other flow run

Learn more in [web-classification](../../standard/web-classification/README.md)
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@ pf flow test --flow .
### 2. create flow run with multi line data

```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping ground_truth='${data.ground_truth}' entities='${data.entities}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
4 changes: 3 additions & 1 deletion examples/flows/evaluation/eval-groundedness/README.md
Expand Up @@ -25,6 +25,8 @@ pf flow test --flow .
### 2. create flow run with multi line data

```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping question='${data.question}' answer='${data.answer}' context='${data.context}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ pf flow test --flow .
### 2. create flow run with multi line data

```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping question='${data.question}' answer='${data.answer}' context='${data.context}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
5 changes: 4 additions & 1 deletion examples/flows/standard/autonomous-agent/README.md
Expand Up @@ -56,8 +56,11 @@ pf flow test --flow .

```bash
# create run using command line args
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping name='${data.name}' role='${data.role}' goals='${data.goals}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

## Disclaimer
LLM systems are susceptible to prompt injection, and you can gain a deeper understanding of this issue in the [technical blog](https://developer.nvidia.com/blog/securing-llm-systems-against-prompt-injection/). As an illustration, the PythonREPL function might execute harmful code if provided with a malicious prompt within the provided sample. Furthermore, we cannot guarantee that implementing AST validations solely within the PythonREPL function will reliably elevate the sample's security to an enterprise level. We kindly remind you to refrain from utilizing this in a production environment.
5 changes: 4 additions & 1 deletion examples/flows/standard/basic-with-builtin-llm/README.md
Expand Up @@ -48,9 +48,12 @@ pf flow test --flow . --inputs text="Python Hello World!"

- create run
```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping text='${data.text}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

- list and show run meta
```bash
# list created run
9 changes: 6 additions & 3 deletions examples/flows/standard/basic-with-connection/README.md
Expand Up @@ -49,9 +49,12 @@ pf flow test --flow . --node llm --inputs prompt="Write a simple Hello World! pr

- create run
```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping text='${data.text}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

- list and show run meta
```bash
# list created run
Expand Down Expand Up @@ -87,7 +90,7 @@ pf connection create --file ../../../connections/azure_openai.yml --set api_key=
Run flow with newly created connection.

```bash
pf run create --flow . --data ./data.jsonl --connections llm.connection=open_ai_connection --stream
pf run create --flow . --data ./data.jsonl --connections llm.connection=open_ai_connection --column-mapping text='${data.text}' --stream
```

### Run in cloud with connection override
Expand All @@ -101,5 +104,5 @@ Run flow with connection `open_ai_connection`.
az account set -s <your_subscription_id>
az configure --defaults group=<your_resource_group_name> workspace=<your_workspace_name>

pfazure run create --flow . --data ./data.jsonl --connections llm.connection=open_ai_connection --stream --runtime example-runtime-ci
pfazure run create --flow . --data ./data.jsonl --connections llm.connection=open_ai_connection --column-mapping text='${data.text}' --stream --runtime example-runtime-ci
```
9 changes: 6 additions & 3 deletions examples/flows/standard/basic/README.md
Expand Up @@ -42,9 +42,12 @@ pf flow test --flow . --node llm --inputs prompt="Write a simple Hello World pro
- Create run with multiple lines data
```bash
# using environment from .env file (loaded in user code: hello.py)
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping text='${data.text}' --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

- List and show run meta
```bash
# list created run
Expand Down Expand Up @@ -87,7 +90,7 @@ pf flow test --flow . --environment-variables AZURE_OPENAI_API_KEY='${open_ai_co
- Create run using connection secret binding specified in environment variables, see [run.yml](run.yml)
```bash
# create run
pf run create --flow . --data ./data.jsonl --stream --environment-variables AZURE_OPENAI_API_KEY='${open_ai_connection.api_key}' AZURE_OPENAI_API_BASE='${open_ai_connection.api_base}'
pf run create --flow . --data ./data.jsonl --stream --environment-variables AZURE_OPENAI_API_KEY='${open_ai_connection.api_key}' AZURE_OPENAI_API_BASE='${open_ai_connection.api_base}' --column-mapping text='${data.text}'
# create run using yaml file
pf run create --file run.yml --stream

Expand All @@ -107,7 +110,7 @@ az configure --defaults group=<your_resource_group_name> workspace=<your_workspa
- Create run
```bash
# run with environment variable reference connection in azureml workspace
pfazure run create --flow . --data ./data.jsonl --environment-variables AZURE_OPENAI_API_KEY='${open_ai_connection.api_key}' AZURE_OPENAI_API_BASE='${open_ai_connection.api_base}' --stream --runtime example-runtime-ci
pfazure run create --flow . --data ./data.jsonl --environment-variables AZURE_OPENAI_API_KEY='${open_ai_connection.api_key}' AZURE_OPENAI_API_BASE='${open_ai_connection.api_base}' --column-mapping text='${data.text}' --stream --runtime example-runtime-ci
# run using yaml file
pfazure run create --file run.yml --stream --runtime example-runtime-ci
```
4 changes: 3 additions & 1 deletion examples/flows/standard/basic/run.yml
Expand Up @@ -5,4 +5,6 @@ environment_variables:
# environment variables from connection
AZURE_OPENAI_API_KEY: ${open_ai_connection.api_key}
AZURE_OPENAI_API_BASE: ${open_ai_connection.api_base}
AZURE_OPENAI_API_TYPE: azure
AZURE_OPENAI_API_TYPE: azure
column_mapping:
text: ${data.text}
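
The `column_mapping` block added to `run.yml` carries the same information as the `--column-mapping text='${data.text}'` argument used in the README commands. As a hedged illustration of that correspondence, the Python sketch below renders a mapping as shell-safe CLI arguments. The helper name is hypothetical; the point is that the `${...}` reference must be single-quoted so the shell does not try to expand it.

```python
import shlex


# Illustrative helper (hypothetical name): dict -> `--column-mapping` arguments.
def column_mapping_args(column_mapping):
    """Render a column mapping as shell-safe `--column-mapping` arguments."""
    # shlex.quote wraps each pair in single quotes because "$", "{" and "}"
    # are not shell-safe, so "${data.text}" reaches the CLI unexpanded.
    pairs = [shlex.quote(f"{name}={spec}") for name, spec in column_mapping.items()]
    return " ".join(["--column-mapping", *pairs])


print(column_mapping_args({"text": "${data.text}"}))
# prints: --column-mapping 'text=${data.text}'
```

To the shell, `'text=${data.text}'` and `text='${data.text}'` are the same argument, so the output matches the form used in the README commands.
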
Original file line number Diff line number Diff line change
Expand Up @@ -42,9 +42,12 @@ pf flow test --flow . --input ./data/denormalized-flat.jsonl

4. run with multiple lines input
```bash
pf run create --flow . --data ./data
pf run create --flow . --data ./data --column-mapping history='${data.history}' customer_info='${data.customer_info}'
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

5. list/show

```bash
Original file line number Diff line number Diff line change
Expand Up @@ -56,11 +56,14 @@ pf flow test --flow . --inputs url='https://www.microsoft.com/en-us/d/xbox-wirel

```bash
# create run using command line args
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream
# create run using yaml file
pf run create --file run.yml --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

#### Submit run to cloud

Assume we already have a connection named `open_ai_connection` in workspace.
Expand All @@ -73,9 +76,9 @@ az configure --defaults group=<your_resource_group_name> workspace=<your_workspa

``` bash
# create run
pfazure run create --flow . --data ./data.jsonl --stream --runtime example-runtime-ci
# pfazure run create --flow . --data ./data.jsonl --stream # automatic runtime
pfazure run create --file run.yml --runtime example-runtime-ci
pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream --runtime example-runtime-ci
# pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream # automatic runtime
pfazure run create --file run.yml --runtime example-runtime-ci
# pfazure run create --file run.yml --stream # automatic runtime
```

Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Run.schema.json
flow: .
data: data.jsonl
variant: ${summarize_text_content.variant_1}
variant: ${summarize_text_content.variant_1}
column_mapping:
url: ${data.url}
9 changes: 6 additions & 3 deletions examples/flows/standard/flow-with-symlinks/README.md
Expand Up @@ -65,18 +65,21 @@ pf flow test --flow . --node convert_to_dict --inputs classify_with_llm.output='

```bash
# create run using command line args
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream
# create run using yaml file
pf run create --file run.yml --stream
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.


#### Submit run to cloud

``` bash
# create run
pfazure run create --flow . --data ./data.jsonl --stream --runtime example-runtime-ci --subscription <your_subscription_id> -g <your_resource_group_name> -w <your_workspace_name>
# pfazure run create --flow . --data ./data.jsonl --stream # automatic runtime
pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream --runtime example-runtime-ci --subscription <your_subscription_id> -g <your_resource_group_name> -w <your_workspace_name>
# pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream # automatic runtime

# set default workspace
az account set -s <your_subscription_id>
6 changes: 6 additions & 0 deletions examples/flows/standard/flow-with-symlinks/run.yml
@@ -0,0 +1,6 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Run.schema.json
flow: .
data: data.jsonl
variant: ${summarize_text_content.variant_1}
column_mapping:
url: ${data.url}
5 changes: 3 additions & 2 deletions examples/flows/standard/gen-docstring/README.md
Expand Up @@ -54,8 +54,9 @@ pf flow test --flow . --inputs source="./azure_open_ai.py"

```bash
# run flow with batch data
pf run create --flow . --data ./data.jsonl --name auto_generate_docstring
pf run create --flow . --data ./data.jsonl --name auto_generate_docstring --column-mapping source='${data.source}'
```
Output the code after add the docstring.


You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
5 changes: 3 additions & 2 deletions examples/flows/standard/named-entity-recognition/README.md
Expand Up @@ -52,7 +52,8 @@ pf flow test --flow . --inputs text='The phone number (321) 654-0987 is no longe

- create run
```bash
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping entity_type='${data.entity_type}' text='${data.text}' --stream
```


You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
9 changes: 6 additions & 3 deletions examples/flows/standard/web-classification/README.md
Expand Up @@ -52,14 +52,17 @@ pf flow test --flow . --inputs url='https://www.youtube.com/watch?v=kYqRtjDBci8'

```bash
# create run using command line args
pf run create --flow . --data ./data.jsonl --stream
pf run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream

# (Optional) create a random run name
run_name="web_classification_"$(openssl rand -hex 12)
# create run using yaml file, run_name will be used in following contents, --name is optional
pf run create --file run.yml --stream --name $run_name
```

You can also omit `column-mapping` if the provided data has the same column names as the flow inputs.
See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.

```bash
# list run
pf run list
Expand Down Expand Up @@ -96,8 +99,8 @@ az account set -s <your_subscription_id>
az configure --defaults group=<your_resource_group_name> workspace=<your_workspace_name>

# create run
pfazure run create --flow . --data ./data.jsonl --stream --runtime example-runtime-ci
# pfazure run create --flow . --data ./data.jsonl --stream # automatic runtime
pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream --runtime example-runtime-ci
# pfazure run create --flow . --data ./data.jsonl --column-mapping url='${data.url}' --stream # automatic runtime

# (Optional) create a new random run name for further use
run_name="web_classification_"$(openssl rand -hex 12)
4 changes: 3 additions & 1 deletion examples/flows/standard/web-classification/run.yml
@@ -1,4 +1,6 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Run.schema.json
flow: .
data: data.jsonl
variant: ${summarize_text_content.variant_1}
variant: ${summarize_text_content.variant_1}
column_mapping:
url: ${data.url}
1 change: 1 addition & 0 deletions examples/tutorials/e2e-development/chat-with-pdf.md
Expand Up @@ -245,6 +245,7 @@ The output will include something like below:
}
```

See [here](https://aka.ms/pf/column-mapping) for the default behavior when `column-mapping` is not provided in the CLI.
We also developed two evaluation flows, one for "[groundedness](../../flows/evaluation/eval-groundedness/)" and one for "[perceived intelligence](../../flows/evaluation/eval-perceived-intelligence/)". These two flows use GPT models (ChatGPT or GPT4) to "grade" the answers. Reading the prompts will give you a better idea of what these two metrics are:
- [groundedness prompt](../../flows/evaluation/eval-groundedness/gpt_groundedness.md)
- [perceived intelligence prompt](../../flows/evaluation/eval-perceived-intelligence/gpt_perceived_intelligence.md)