[Draft] Automating Model tracing and uploading #193
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
#!/usr/bin/env bash | ||
|
||
# Called by entry point `run-test` use this script to add your repository specific test commands | ||
# Called by entry point `run-test` use this script to add your repository specific task commands | ||
# Once called opensearch is up and running and the following parameters are available to this script | ||
|
||
# OPENSEARCH_VERSION -- version e.g Major.Minor.Patch(-Prelease) | ||
|
@@ -16,7 +16,7 @@ set -e | |
echo -e "\033[34;1mINFO:\033[0m URL ${opensearch_url}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m EXTERNAL OS URL ${external_opensearch_url}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m VERSION ${OPENSEARCH_VERSION}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m IS_DOC: ${IS_DOC}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m TASK_TYPE: ${TASK_TYPE}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m TEST_SUITE ${TEST_SUITE}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m PYTHON_VERSION ${PYTHON_VERSION}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m PYTHON_CONNECTION_CLASS ${PYTHON_CONNECTION_CLASS}\033[0m" | ||
|
@@ -33,7 +33,7 @@ docker build \ | |
echo -e "\033[1m>>>>> Run [opensearch-project/opensearch-py-ml container] >>>>>>>>>>>>>>>>>>>>>>>>>>>>>\033[0m" | ||
|
||
|
||
if [[ "$IS_DOC" == "false" ]]; then | ||
if [[ "$TASK_TYPE" == "test" ]]; then | ||
docker run \ | ||
--network=${network_name} \ | ||
--env "STACK_VERSION=${STACK_VERSION}" \ | ||
|
@@ -45,10 +45,10 @@ if [[ "$IS_DOC" == "false" ]]; then | |
--name opensearch-py-ml-test-runner \ | ||
opensearch-project/opensearch-py-ml \ | ||
nox -s "test-${PYTHON_VERSION}(pandas_version='${PANDAS_VERSION}')" | ||
|
||
docker cp opensearch-py-ml-test-runner:/code/opensearch-py-ml/junit/ ./junit/ | ||
|
||
docker rm opensearch-py-ml-test-runner | ||
else | ||
elif [[ "$TASK_TYPE" == "doc" ]]; then | ||
docker run \ | ||
--network=${network_name} \ | ||
--env "STACK_VERSION=${STACK_VERSION}" \ | ||
|
@@ -60,7 +60,30 @@ else | |
--name opensearch-py-ml-doc-runner \ | ||
opensearch-project/opensearch-py-ml \ | ||
nox -s docs | ||
|
||
docker cp opensearch-py-ml-doc-runner:/code/opensearch-py-ml/docs/build/ ./docs/ | ||
|
||
docker rm opensearch-py-ml-doc-runner | ||
fi | ||
else | ||
echo -e "\033[34;1mINFO:\033[0m MODEL_ID: ${MODEL_ID}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m MODEL_VERSION: ${MODEL_VERSION}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m TRACING_FORMAT: ${TRACING_FORMAT}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m EMBEDDING_DIMENSION: ${EMBEDDING_DIMENSION:-N/A}\033[0m" | ||
echo -e "\033[34;1mINFO:\033[0m POOLING_MODE: ${POOLING_MODE:-N/A}\033[0m" | ||
|
||
docker run \ | ||
--network=${network_name} \ | ||
--env "STACK_VERSION=${STACK_VERSION}" \ | ||
--env "OPENSEARCH_URL=${opensearch_url}" \ | ||
--env "OPENSEARCH_VERSION=${OPENSEARCH_VERSION}" \ | ||
--env "TEST_SUITE=${TEST_SUITE}" \ | ||
--env "PYTHON_CONNECTION_CLASS=${PYTHON_CONNECTION_CLASS}" \ | ||
--env "TEST_TYPE=server" \ | ||
--name opensearch-py-ml-trace-runner \ | ||
opensearch-project/opensearch-py-ml \ | ||
/bin/bash -c "python -m pip install -r requirements-dev.txt --timeout 1500; | ||
python -m pip install pandas~=${PANDAS_VERSION}; | ||
Comment on lines +86 to +88
Review comment: What do you think about adding this as a nox session?
||
python utils/model_uploader/model_autotracing.py ${MODEL_ID} ${MODEL_VERSION} ${TRACING_FORMAT} -ed ${EMBEDDING_DIMENSION} -pm ${POOLING_MODE}" | ||
|
||
docker cp opensearch-py-ml-trace-runner:/code/opensearch-py-ml/upload/ ./upload/ | ||
docker rm opensearch-py-ml-trace-runner | ||
fi |
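In the script above, the final `else` branch handles every `TASK_TYPE` value other than `test` and `doc`, so a mistyped value such as `docs` would start the tracing container. A minimal sketch of a stricter dispatch, assuming the three values used by the workflows in this PR (`test`, `doc`, `trace`); the function name `runner_for_task` is hypothetical and not part of the PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a TASK_TYPE value to the runner container name
# used in the diff above. Unknown values are rejected instead of falling
# through to the tracing branch.
runner_for_task() {
  case "$1" in
    test)  echo "opensearch-py-ml-test-runner" ;;
    doc)   echo "opensearch-py-ml-doc-runner" ;;
    trace) echo "opensearch-py-ml-trace-runner" ;;
    *)     echo "unknown TASK_TYPE: $1" >&2; return 1 ;;
  esac
}

# Defaulting to "test" here is an assumption made for the sketch only.
runner_for_task "${TASK_TYPE:-test}"
```

The same `case` structure could wrap the three `docker run` blocks directly; whether the extra strictness is wanted is a design choice for the PR author.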
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
* @dhrubo-os @greaa-aws @ylwu-amzn @b4sjoo @jngz-es @rbhavna | ||
* @thanawan-atc | ||
Review comment: Is this necessary? I might need to talk with the release team to give you write access to the repo.
Reply: No. I only made this change for testing on my own repo. I will revert it before the PR is merged.
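The CODEOWNERS content matters to this PR because the new workflow's "Get Approvers" step builds its approver list from that file. The sketch below runs the same pipeline against the original CODEOWNERS line from the diff above, reading from a variable rather than from `.github/CODEOWNERS`; the variable names are illustrative only:

```shell
#!/usr/bin/env bash
# Sample input: the original CODEOWNERS line shown in the diff above.
codeowners_line='* @dhrubo-os @greaa-aws @ylwu-amzn @b4sjoo @jngz-es @rbhavna'

# Same pipeline as the workflow's "Get Approvers" step:
#   grep @        keep only lines that mention an owner
#   tr -d '* '    drop the leading pattern and all spaces
#   sed s/@/,/g   turn every @ into a comma
#   sed s/,//1    drop the first (leading) comma
approvers=$(printf '%s\n' "$codeowners_line" \
  | grep @ \
  | tr -d '* ' \
  | sed 's/@/,/g' \
  | sed 's/,//1')

echo "$approvers"
# prints: dhrubo-os,greaa-aws,ylwu-amzn,b4sjoo,jngz-es,rbhavna
```

With the temporary single-owner line used for testing in this PR, the same pipeline yields `thanawan-atc`. The pipeline works line by line, so a CODEOWNERS file with several owner lines would produce several output lines; that case is not covered by this sketch.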
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,7 +20,7 @@ jobs: | |
- name: Checkout Repository | ||
uses: actions/checkout@v2 | ||
- name: Integ ${{ matrix.cluster }} secured=${{ matrix.secured }} version=${{matrix.entry.opensearch_version}} | ||
run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} true" | ||
run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} doc" | ||
Review comment: Just wanted to make sure: did you try this command on your end, and did it successfully generate the doc files?
Reply: Yes, it works as expected.
||
- name: Deploy | ||
uses: peaceiris/actions-gh-pages@v3 | ||
with: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,7 +18,7 @@ jobs: | |
- name: Checkout | ||
uses: actions/checkout@v2 | ||
- name: Integ ${{ matrix.cluster }} secured=${{ matrix.secured }} version=${{matrix.entry.opensearch_version}} | ||
run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }}" | ||
run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} test" | ||
Review comment: Same here.
Reply: Same as above.
||
- name: Upload coverage to Codecov | ||
uses: codecov/codecov-action@v2 | ||
with: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,240 @@ | ||
name: Model Auto-tracing & Uploading | ||
on: | ||
# Step 1: Initiate the workflow | ||
workflow_dispatch: | ||
inputs: | ||
model_id: | ||
description: "Model ID for auto-tracing and uploading (e.g. sentence-transformers/msmarco-distilbert-base-tas-b)" | ||
required: true | ||
type: string | ||
model_version: | ||
description: "Model version number (e.g. 1.0.1)" | ||
required: true | ||
type: string | ||
tracing_format: | ||
description: "Model format for auto-tracing (torch_script/onnx)" | ||
required: true | ||
type: choice | ||
options: | ||
- "BOTH" | ||
Review comment: How will this work? Will this create two separate workflows? How will we notify the release team? One notification or two?
||
- "TORCH_SCRIPT" | ||
- "ONNX" | ||
embedding_dimension: | ||
description: "(Optional) Embedding Dimension (Specify here if it does not exist in original config.json file, or you want to overwrite it.)" | ||
required: false | ||
type: number | ||
pooling_mode: | ||
description: "(Optional) Pooling Mode (Specify here if it does not exist in original config.json file or you want to overwrite it.)" | ||
required: false | ||
type: choice | ||
options: | ||
- "" | ||
- "CLS" | ||
- "MEAN" | ||
- "MAX" | ||
- "MEAN_SQRT_LEN" | ||
|
||
jobs: | ||
# Step 2: Check if the model already exists in the model hub | ||
checking-out-model-hub: | ||
runs-on: 'ubuntu-latest' | ||
permissions: | ||
id-token: write | ||
contents: read | ||
steps: | ||
- name: Checkout Repository | ||
uses: actions/checkout@v3 | ||
- name: Set Up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.x' | ||
- name: Configure AWS Credentials | ||
uses: aws-actions/configure-aws-credentials@v2 | ||
with: | ||
aws-region: ${{ secrets.MODEL_UPLOADER_AWS_REGION }} | ||
role-to-assume: ${{ secrets.MODEL_UPLOADER_ROLE }} | ||
role-session-name: checking-out-model-hub | ||
- name: Check if TORCH_SCRIPT Model Exists | ||
if: github.event.inputs.tracing_format == 'TORCH_SCRIPT' || github.event.inputs.tracing_format == 'BOTH' | ||
run: | | ||
TORCH_FILE_PATH=$(python utils/model_uploader/save_model_file_path_to_env.py \ | ||
${{ github.event.inputs.model_id }} ${{ github.event.inputs.model_version }} TORCH_SCRIPT) | ||
aws s3api head-object --bucket opensearch-exp --key $TORCH_FILE_PATH > /dev/null 2>&1 || TORCH_MODEL_NOT_EXIST=true | ||
if [[ -z $TORCH_MODEL_NOT_EXIST ]]; | ||
then | ||
echo "TORCH_SCRIPT Model already exists on model hub." | ||
exit 1 | ||
fi | ||
- name: Check if ONNX Model Exists | ||
if: github.event.inputs.tracing_format == 'ONNX' || github.event.inputs.tracing_format == 'BOTH' | ||
run: | | ||
ONNX_FILE_PATH=$(python utils/model_uploader/save_model_file_path_to_env.py \ | ||
${{ github.event.inputs.model_id }} ${{ github.event.inputs.model_version }} ONNX) | ||
aws s3api head-object --bucket opensearch-exp --key $ONNX_FILE_PATH > /dev/null 2>&1 || ONNX_MODEL_NOT_EXIST=true | ||
if [[ -z $ONNX_MODEL_NOT_EXIST ]]; | ||
then | ||
echo "ONNX Model already exists on model hub." | ||
exit 1; | ||
fi | ||
|
||
# Step 3: Trace the model, Verify the embeddings & Upload the model files as artifacts | ||
model-auto-tracing: | ||
needs: checking-out-model-hub | ||
name: model-auto-tracing | ||
runs-on: ubuntu-latest | ||
permissions: | ||
id-token: write | ||
contents: read | ||
strategy: | ||
fail-fast: false | ||
Review comment: What is this for? Maybe add a comment explaining why we need it?
Reply: Yeah, after thinking about it, we can remove this. I initially followed the workflow for the integration tests.
||
matrix: | ||
cluster: ["opensearch"] | ||
secured: ["true"] | ||
entry: | ||
- { opensearch_version: 2.7.0 } | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v2 | ||
- name: Export Arguments | ||
run: | | ||
echo "MODEL_ID=${{ github.event.inputs.model_id }}" >> $GITHUB_ENV | ||
echo "MODEL_VERSION=${{ github.event.inputs.model_version }}" >> $GITHUB_ENV | ||
echo "TRACING_FORMAT=${{ github.event.inputs.tracing_format }}" >> $GITHUB_ENV | ||
echo "EMBEDDING_DIMENSION=${{ github.event.inputs.embedding_dimension }}" >> $GITHUB_ENV | ||
echo "POOLING_MODE=${{ github.event.inputs.pooling_mode }}" >> $GITHUB_ENV | ||
- name: Autotracing ${{ matrix.cluster }} secured=${{ matrix.secured }} version=${{matrix.entry.opensearch_version}} | ||
run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} trace" | ||
- name: Upload Artifact | ||
uses: actions/upload-artifact@v3 | ||
with: | ||
name: upload | ||
path: ./upload/ | ||
retention-days: 5 | ||
Review comment: What is this for?
Reply: This uploads the files so that they can be used in the next job (i.e., model uploading).
Reply: The retention-days value is how long we want to keep the artifacts.
||
if-no-files-found: error | ||
- name: Configure AWS Credentials | ||
uses: aws-actions/configure-aws-credentials@v2 | ||
with: | ||
aws-region: ${{ secrets.MODEL_UPLOADER_AWS_REGION }} | ||
role-to-assume: ${{ secrets.MODEL_UPLOADER_ROLE }} | ||
role-session-name: model-auto-tracing | ||
- name: Dryrun model uploading | ||
id: dryrun_model_uploading | ||
run: | | ||
aws s3 sync ./upload/ s3://opensearch-exp/ml-models/huggingface/sentence-transformers/ --dryrun | ||
dryrun_output=$(aws s3 sync ./upload/ s3://opensearch-exp/ml-models/huggingface/sentence-transformers/ --dryrun) | ||
Review comment: We need to talk about the file path in our meeting.
||
echo "dryrun_output<<EOF" >> $GITHUB_OUTPUT | ||
echo "${dryrun_output@E}" >> $GITHUB_OUTPUT | ||
echo "EOF" >> $GITHUB_OUTPUT | ||
echo "${dryrun_output@E}" | ||
outputs: | ||
dryrun_output: ${{ steps.dryrun_model_uploading.outputs.dryrun_output }} | ||
|
||
# Step 4: Ask for manual approval from the CODEOWNERS | ||
manual-approval: | ||
needs: model-auto-tracing | ||
runs-on: 'ubuntu-latest' | ||
permissions: | ||
issues: write | ||
steps: | ||
- name: Checkout Repository | ||
uses: actions/checkout@v3 | ||
- name: Get Approvers | ||
id: get_approvers | ||
run: | | ||
echo "approvers=$(cat .github/CODEOWNERS | grep @ | tr -d '* ' | sed 's/@/,/g' | sed 's/,//1')" >> $GITHUB_OUTPUT | ||
- name: Create Issue Body | ||
id: create_issue_body | ||
run: | | ||
embedding_dimension=${{ github.event.inputs.embedding_dimension }} | ||
pooling_mode=${{ github.event.inputs.pooling_mode }} | ||
issue_body="Please approve or deny opensearch-py-ml model uploading: | ||
|
||
========= Workflow Details ========== | ||
- Workflow Name: ${{ github.workflow }} | ||
- Workflow Initiator: @${{ github.actor }} | ||
|
||
========= Model Information ========= | ||
- Model ID: ${{ github.event.inputs.model_id }} | ||
- Model Version: ${{ github.event.inputs.model_version }} | ||
- Tracing Format: ${{ github.event.inputs.tracing_format }} | ||
- Embedding Dimension: ${embedding_dimension:-Default} | ||
- Pooling Mode: ${pooling_mode:-Default} | ||
|
||
===== Dry Run of Model Uploading ===== | ||
${{ needs.model-auto-tracing.outputs.dryrun_output }}" | ||
|
||
echo "issue_body<<EOF" >> $GITHUB_OUTPUT | ||
echo "${issue_body@E}" >> $GITHUB_OUTPUT | ||
echo "EOF" >> $GITHUB_OUTPUT | ||
echo "${issue_body@E}" | ||
- uses: trstringer/manual-approval@v1 | ||
with: | ||
secret: ${{ github.TOKEN }} | ||
approvers: ${{ steps.get_approvers.outputs.approvers }} | ||
minimum-approvals: 1 | ||
issue-title: "Upload Model to OpenSearch Model Hub (${{ github.event.inputs.model_id }})" | ||
issue-body: ${{ steps.create_issue_body.outputs.issue_body }} | ||
exclude-workflow-initiator-as-approver: false | ||
|
||
# Step 5: Download the artifacts & Upload it to the S3 bucket | ||
model-uploading: | ||
needs: manual-approval | ||
runs-on: 'ubuntu-latest' | ||
permissions: | ||
id-token: write | ||
contents: read | ||
steps: | ||
- name: Download Artifact | ||
uses: actions/download-artifact@v2 | ||
with: | ||
name: upload | ||
path: ./upload/ | ||
- name: Configure AWS Credentials | ||
uses: aws-actions/configure-aws-credentials@v2 | ||
with: | ||
aws-region: ${{ secrets.MODEL_UPLOADER_AWS_REGION }} | ||
role-to-assume: ${{ secrets.MODEL_UPLOADER_ROLE }} | ||
role-session-name: model-uploading | ||
- name: Copy Files to the Bucket | ||
id: copying_to_bucket | ||
run: | | ||
aws s3 sync ./upload/ s3://opensearch-exp/ml-models/huggingface/sentence-transformers/ | ||
echo "upload_time=$(TZ='America/Los_Angeles' date "+%Y-%m-%d %T")" >> $GITHUB_OUTPUT | ||
outputs: | ||
upload_time: ${{ steps.copying_to_bucket.outputs.upload_time }} | ||
|
||
# Step 6: Update MODEL_UPLOAD_HISTORY.md & supported_models.json | ||
history-update: | ||
needs: model-uploading | ||
runs-on: 'ubuntu-latest' | ||
permissions: | ||
id-token: write | ||
contents: write | ||
concurrency: ${{ github.workflow }}-concurrency | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v2 | ||
- name: Set Up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.x' | ||
- name: Install Packages | ||
run: | ||
python -m pip install mdutils | ||
- name: Update MODEL_UPLOAD_HISTORY.md | ||
run: | | ||
python utils/model_uploader/update_models_upload_history_md.py \ | ||
${{ github.event.inputs.model_id }} \ | ||
${{ github.event.inputs.model_version }} \ | ||
${{ github.event.inputs.tracing_format }} \ | ||
-ed ${{ github.event.inputs.embedding_dimension }} \ | ||
-pm ${{ github.event.inputs.pooling_mode }} \ | ||
-u ${{ github.actor }} -t "${{ needs.model-uploading.outputs.upload_time }}" | ||
- name: Commit Updates | ||
uses: stefanzweifel/git-auto-commit-action@v4 | ||
id: commit | ||
with: | ||
commit_message: 'GitHub Actions Workflow - Update MODEL_UPLOAD_HISTORY.md (${{ github.event.inputs.model_id }})' | ||
commit_options: '--signoff' | ||
repository: ./utils/model_uploader/upload_history | ||
file_pattern: MODEL_UPLOAD_HISTORY.md supported_models.json | ||
Comment on lines +232 to +239
Review comment: Will it raise a PR for us to approve?
Reply: I believe it won't. It should be able to push directly.
Review comment: What will happen if somebody else also modifies this file?
Reply: Because they don't have write access, they can't push changes directly.
Review comment: Let's add a comment.
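The existence checks in the "Check if TORCH_SCRIPT Model Exists" and "Check if ONNX Model Exists" steps use an inverted flag: the variable is set only when `aws s3api head-object` fails, and the step exits with an error when the variable is still empty. The sketch below reproduces that logic offline. `object_exists` is a hypothetical stub standing in for the AWS CLI call, and the key names are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `aws s3api head-object`: succeeds only for keys
# listed in EXISTING_KEYS, so the sketch runs without AWS access.
EXISTING_KEYS="models/a.zip"
object_exists() {
  case " $EXISTING_KEYS " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Mirrors the workflow logic: return 1 when the model is already present,
# return 0 when it is safe to continue with tracing and uploading.
check_not_uploaded() {
  local not_exist=""
  object_exists "$1" > /dev/null 2>&1 || not_exist=true
  if [[ -z $not_exist ]]; then
    echo "Model already exists on model hub: $1"
    return 1
  fi
  echo "Model not found, continuing: $1"
}

check_not_uploaded "models/b.zip"
```

One consequence of this pattern is that any non-zero exit from the lookup command sets the flag, so a failure for a reason other than a missing object (for example, a permissions error) would also be treated as "model does not exist".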