[Draft] Automating Model tracing and uploading #193
Signed-off-by: Thanawan Atchariyachanvanit <[email protected]>
Codecov Report
@@           Coverage Diff           @@
##             main     #193   +/-   ##
=======================================
  Coverage   91.06%   91.06%
=======================================
  Files          37       37
  Lines        4052     4052
=======================================
  Hits         3690     3690
  Misses        362      362
…ert-base-tas-b) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* @thanawan-atc
Is this necessary? I might need to talk with the release team about giving you write access to the repo.
No. I only made this change for testing on my own repo. I will revert it before the PR is merged.
.ci/run-repository.sh
Outdated
else
Can we add comments explaining each of these if/else conditions?
@@ -33,7 +33,7 @@ docker build \
 echo -e "\033[1m>>>>> Run [opensearch-project/opensearch-py-ml container] >>>>>>>>>>>>>>>>>>>>>>>>>>>>>\033[0m"

-if [[ "$IS_DOC" == "false" ]]; then
+if [[ "$TASK_TYPE" == "test" ]]; then
Let's add a comment.
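For illustration, the kind of per-branch comments being requested might look like the sketch below. Only the `test` and `doc` values appear in this PR's workflow changes; the `trace` value, the `describe_task` function name, and the echoed messages are assumptions made for the example, not code from `.ci/run-repository.sh`.

```shell
# Sketch only: "trace", describe_task, and the echoed messages are hypothetical.
describe_task() {
  if [ "$1" = "test" ]; then
    # "test": run the integration tests (the default CI behaviour).
    echo "run integration tests"
  elif [ "$1" = "doc" ]; then
    # "doc": generate the documentation files instead of running tests.
    echo "build documentation"
  elif [ "$1" = "trace" ]; then
    # "trace" (assumed name): trace a model so it can be uploaded afterwards.
    echo "trace model"
  else
    # Any other value is a caller error: fail loudly instead of doing nothing.
    echo "unknown TASK_TYPE: $1" >&2
    return 1
  fi
}

# Fall back to "test" when TASK_TYPE is not set.
describe_task "${TASK_TYPE:-test}"
```

Each branch states which task it handles and why, and the final `else` rejects unexpected values rather than falling through silently.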
required: true
type: choice
options:
  - "BOTH"
How will this work? Will it create two separate workflows? How will we notify the release team: with one notification or two?
.github/workflows/model_uploader.yml
Outdated
id-token: write
contents: read
strategy:
  fail-fast: false
What is this for? Maybe add a comment explaining why we need it.
Yeah, after thinking about it, we can remove this. I initially copied it from the integration test workflow.
python -m pip install pandas~=${PANDAS_VERSION};
python utils/model_uploader/model_autotracing.py ${MODEL_ID} ${MODEL_VERSION} ${TRACING_FORMAT} -ed ${EMBEDDING_DIMENSION} -pm ${POOLING_MODE}"
What do you think about adding this as a nox session?
@@ -20,7 +20,7 @@ jobs:
       - name: Checkout Repository
         uses: actions/checkout@v2
       - name: Integ ${{ matrix.cluster }} secured=${{ matrix.secured }} version=${{matrix.entry.opensearch_version}}
-        run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} true"
+        run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} doc"
Just to make sure: did you try this command on your end, and did it successfully generate the doc files?
Yes, it works as expected.
@@ -18,7 +18,7 @@ jobs:
       - name: Checkout
         uses: actions/checkout@v2
       - name: Integ ${{ matrix.cluster }} secured=${{ matrix.secured }} version=${{matrix.entry.opensearch_version}}
-        run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }}"
+        run: "./.ci/run-tests ${{ matrix.cluster }} ${{ matrix.secured }} ${{ matrix.entry.opensearch_version }} test"
Same here.
same as above
id: dryrun_model_uploading
run: |
  aws s3 sync ./upload/ s3://opensearch-exp/ml-models/huggingface/sentence-transformers/ --dryrun
  dryrun_output=$(aws s3 sync ./upload/ s3://opensearch-exp/ml-models/huggingface/sentence-transformers/ --dryrun)
We need to talk about the file path in our meeting.
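As a possible aid for that discussion, the captured dry-run text can be summarized before any real upload happens. The sketch below is not part of this PR: `count_planned_uploads` and the sample paths are hypothetical, and it assumes the AWS CLI prefixes each planned upload with `(dryrun) upload:`.

```shell
# Sketch only: count_planned_uploads and the sample text are hypothetical.
count_planned_uploads() {
  # Reads dry-run text on stdin and prints how many uploads it would perform.
  # grep -c exits non-zero when the count is 0, so "|| true" keeps a zero
  # count from being treated as a failure.
  grep -c '^(dryrun) upload:' || true
}

# Stand-in for the text captured in $dryrun_output by the workflow step.
sample_output='(dryrun) upload: upload/a.zip to s3://example-bucket/a.zip
(dryrun) upload: upload/b.zip to s3://example-bucket/b.zip'

printf '%s\n' "$sample_output" | count_planned_uploads
```

Printing the destination lines alongside the count would also make it easy to check the final file path before the non-dry-run sync is enabled.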
assert False, f"Raised Exception in {model_format} model deployment: {e}"

# 3.) Check model status
try:
We don't need to check the model task status as well. If the model is deployed, we can start generating embeddings.
…ansformers/msmarco-distilbert-base-tas-b) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Description
This PR implements a GitHub Actions workflow that automates the model tracing and uploading process, with an initial focus on pre-trained sentence transformer models.
Issues Resolved
The OpenSearch team receives frequent requests from customers to add more models to our model hub, and the current manual process of tracing and uploading these models is laborious and time-consuming. Automating this process would make new models available faster.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.