Merge from CTuning #1111

Merged: 49 commits, Feb 16, 2024

Commits (49)
32db940
Added README files for stable-diffusion and llama2-70b
arjunsuresh Feb 15, 2024
3bb39d4
Merge branch 'mlcommons:master' into master
arjunsuresh Feb 15, 2024
a020b43
productivity tools
gfursin Feb 15, 2024
a59a986
Update README_nvidia.md
arjunsuresh Feb 15, 2024
a82d4fc
Update README_nvidia.md
arjunsuresh Feb 15, 2024
2ae017f
Added bert mlperf inference readme for qaic
arjunsuresh Feb 15, 2024
f19c0ec
fixed MLPerf accuracy loading on Windows
gfursin Feb 15, 2024
b01d773
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Feb 15, 2024
d4f30d2
Also save raw pip_freeze for mlperf inference results
arjunsuresh Feb 15, 2024
af65bfb
Use CM cache for mlperf inference submission generation
arjunsuresh Feb 15, 2024
4100724
Update README_aws_dl2q.24xlarge.md
arjunsuresh Feb 15, 2024
4170a92
Update README_aws_dl2q.24xlarge.md
arjunsuresh Feb 15, 2024
042d9f8
Update Submission.md
arjunsuresh Feb 15, 2024
aeaa4f1
Fixes to submission generation
arjunsuresh Feb 15, 2024
a65fd4e
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
arjunsuresh Feb 15, 2024
f3b8ba1
Fixes for submission generation
arjunsuresh Feb 15, 2024
a020572
Remove --results_dir from run-template
arjunsuresh Feb 15, 2024
7c786f0
Added new SUTs
arjunsuresh Feb 15, 2024
36134fe
Update the power server IP for the suts
arjunsuresh Feb 15, 2024
14430a6
* support loadgen C++ building on Windows
gfursin Feb 15, 2024
27e991b
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Feb 15, 2024
5ee841f
fixing benchmark program on Windows
gfursin Feb 15, 2024
3f43fad
clean up
gfursin Feb 15, 2024
08d4a59
Fix wrong state export in nvidia implementation
arjunsuresh Feb 15, 2024
2539269
Mark invalid results in results table
arjunsuresh Feb 15, 2024
d0953cb
Fix mlperf log path
arjunsuresh Feb 15, 2024
cb685bc
Fix SPR nvidia configs
arjunsuresh Feb 15, 2024
ebf3223
Fix mlperf log path
arjunsuresh Feb 15, 2024
1f7bfbe
Fix for invalid results in results table
arjunsuresh Feb 15, 2024
36f4eb2
fix typo
arjunsuresh Feb 15, 2024
4f6bfb4
Fix results_dir for submission generation
arjunsuresh Feb 15, 2024
a660db2
Fix results table for short run
arjunsuresh Feb 15, 2024
2d07731
Fix results table for short run
arjunsuresh Feb 15, 2024
58251a6
demos
gfursin Feb 15, 2024
c86e87e
Update README_aws_dl2q.24xlarge.md
arjunsuresh Feb 15, 2024
a3f87c2
Fixes for dl2q qaic run
arjunsuresh Feb 15, 2024
1256385
Update README_aws_dl2q.24xlarge.md
arjunsuresh Feb 15, 2024
ff31843
docker clean up
gfursin Feb 15, 2024
0a5a9cf
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Feb 15, 2024
8095a62
minor clean up
gfursin Feb 15, 2024
38482e9
clean up
gfursin Feb 15, 2024
d7ad203
Fix model starting weights
arjunsuresh Feb 15, 2024
a101ec5
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
arjunsuresh Feb 15, 2024
5a77557
fix retinanet model soft link for nvidia-harness
arjunsuresh Feb 15, 2024
7421d85
Fix SS command for nvidia-harness
arjunsuresh Feb 15, 2024
c1ec39a
improving docker automation
gfursin Feb 15, 2024
62fe792
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Feb 15, 2024
966df62
fixing hugging face download, adding latest cmake prebuilt
gfursin Feb 16, 2024
09c941e
typo
gfursin Feb 16, 2024
Files changed
12 changes: 7 additions & 5 deletions cm-mlops/automation/script/module.py
@@ -4133,12 +4133,14 @@ def prepare_and_run_script_with_postprocessing(i, postprocess="postprocess"):
             print (r['string'])
             print ("")

-            note = '''^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+            note = '''
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Note that it may be a portability issue of a third-party tool or a native script
-wrapped and unified by this portable CM script. In such case, please report this issue
-with a full log at "https://github.com/mlcommons/ck". The CM concept is to collaboratively
-fix such issues inside portable CM scripts to make existing tools and native scripts
-more portable, interoperable and deterministic. Thank you'''
+wrapped and unified by this automation recipe (CM script). In such case,
+please report this issue with a full log at "https://github.com/mlcommons/ck".
+The CM concept is to collaboratively fix such issues inside portable CM scripts
+to make existing tools and native scripts more portable, interoperable
+and deterministic. Thank you!'''

return {'return':2, 'error':'Portable CM script failed (name = {}, return code = {})\n\n{}'.format(meta['alias'], rc, note)}

17 changes: 12 additions & 5 deletions cm-mlops/automation/script/module_misc.py
@@ -1187,9 +1187,10 @@ def regenerate_script_cmd(i):


     skip_input_for_fake_run = docker_settings.get('skip_input_for_fake_run', [])
+    add_quotes_to_keys = docker_settings.get('add_quotes_to_keys', [])


-    def rebuild_flags(i_run_cmd, fake_run, skip_input_for_fake_run, key_prefix):
+    def rebuild_flags(i_run_cmd, fake_run, skip_input_for_fake_run, add_quotes_to_keys, key_prefix):

         run_cmd = ''

@@ -1212,16 +1213,22 @@ def rebuild_flags(i_run_cmd, fake_run, skip_input_for_fake_run, key_prefix):

             v = i_run_cmd[k]

+            q = '\\"' if long_key in add_quotes_to_keys else ''
+
             if type(v)==dict:
-                run_cmd += rebuild_flags(v, fake_run, skip_input_for_fake_run, long_key)
+                run_cmd += rebuild_flags(v, fake_run, skip_input_for_fake_run, add_quotes_to_keys, long_key)
             elif type(v)==list:
-                run_cmd+=' --'+long_key+',='+','.join(v)
+                x = ''
+                for vv in v:
+                    if x != '': x+=','
+                    x+=q+str(vv)+q
+                run_cmd+=' --'+long_key+',=' + x
             else:
-                run_cmd+=' --'+long_key+'='+str(v)
+                run_cmd+=' --'+long_key+'='+q+str(v)+q

         return run_cmd

-    run_cmd += rebuild_flags(i_run_cmd, fake_run, skip_input_for_fake_run, '')
+    run_cmd += rebuild_flags(i_run_cmd, fake_run, skip_input_for_fake_run, add_quotes_to_keys, '')

     run_cmd = docker_run_cmd_prefix + ' && ' + run_cmd if docker_run_cmd_prefix!='' else run_cmd

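To make the quoting change concrete, here is a minimal, self-contained sketch of the new list/scalar handling (simplified: the fake_run and skip_input_for_fake_run logic is omitted, and the helper below is an illustration, not the full CM function):

```python
# Minimal sketch of the quoting behavior added to rebuild_flags.
def rebuild_flags_sketch(i_run_cmd, add_quotes_to_keys, key_prefix=''):
    run_cmd = ''
    for k, v in i_run_cmd.items():
        long_key = key_prefix + '.' + k if key_prefix else k
        # Escaped quotes so multi-word values survive the shell inside the container
        q = '\\"' if long_key in add_quotes_to_keys else ''
        if isinstance(v, dict):
            run_cmd += rebuild_flags_sketch(v, add_quotes_to_keys, long_key)
        elif isinstance(v, list):
            run_cmd += ' --' + long_key + ',=' + ','.join(q + str(vv) + q for vv in v)
        else:
            run_cmd += ' --' + long_key + '=' + q + str(v) + q
    return run_cmd

print(rebuild_flags_sketch({'text': 'crazy programmer'}, ['text']))
# ->  --text=\"crazy programmer\"
```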
4 changes: 4 additions & 0 deletions cm-mlops/script/app-image-corner-detection/_cm.json
@@ -3,6 +3,10 @@
"automation_alias": "script",
"automation_uid": "5b4e0237da074764",
"category": "Modular application pipeline",
"deps": [
{"tags":"detect,os"},
{"tags":"detect,cpu"}
],
"posthook_deps": [
{
"skip_if_env": {
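The two new dependencies simply run CM's platform-detection scripts before the app; conceptually they correspond to invoking something like the following standalone (following the cmr pattern used elsewhere in this PR):

```bash
# Roughly equivalent standalone invocations of the new deps:
cmr "detect os"
cmr "detect cpu"
```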
9 changes: 3 additions & 6 deletions cm-mlops/script/app-loadgen-generic-python/README-extra.md
@@ -221,12 +221,9 @@ cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-s
cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.Intel/gpt-j-6B-int8-static" --adr.hf-downloader.model_filename=model.onnx --adr.hf-downloader.full_subfolder=. --samples=2
```


cmr "python app loadgen-generic _onnxruntime _custom _huggingface _model-stub.runwayml/stable-diffusion-v1-5" --adr.hf-downloader.model_filename=onnx/unet/model.onnx,onnx/unet/weights.pb --samples=2


TBD: some cases that are not yet fully supported (data types, etc):
TBD: some cases that are not yet fully supported (data types, input mismatch, etc):
```bash
cmr "python app loadgen-generic _onnxruntime _custom _huggingface _model-stub.runwayml/stable-diffusion-v1-5" --adr.hf-downloader.revision=onnx --adr.hf-downloader.model_filename=unet/model.onnx,unet/weights.pb --samples=2
cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.microsoft/Mistral-7B-v0.1-onnx" --adr.hf-downloader.model_filename=Mistral-7B-v0.1.onnx,Mistral-7B-v0.1.onnx.data --samples=2
cmr "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.alpindale/Llama-2-7b-ONNX" --adr.hf-downloader.model_filename=FP16/LlamaV2_7B_float16.onnx --adr.hf-downloader.full_subfolder=FP16 --samples=2
```
@@ -269,7 +266,7 @@ Available flags mapped to environment variables:
 ## Running this app via Docker

 ```bash
-cm docker script "python app loadgen-generic _onnxruntime _cuda _custom _huggingface _model-stub.steerapi/Llama-2-7b-chat-hf-onnx-awq-w8" --adr.hf-downloader.model_filename=onnx/decoder_model_merged_quantized.onnx,onnx/decoder_model_merged_quantized.onnx_data --samples=2 --output_dir=. --docker_cm_repo=ctuning@mlcommons-ck
+cm docker script "python app loadgen-generic _onnxruntime _custom _huggingface _model-stub.ctuning/mlperf-inference-bert-onnx-fp32-squad-v1.1" --adr.hf-downloader.model_filename=model.onnx --samples=2 --output_dir=new_results --docker_cm_repo=ctuning@mlcommons-ck
 ```
## Tuning CPU performance via CM experiment
1 change: 1 addition & 0 deletions cm-mlops/script/app-loadgen-generic-python/run.sh
@@ -1,3 +1,4 @@
+#!/bin/bash

 ${CM_PYTHON_BIN_WITH_PATH} ${CM_TMP_CURRENT_SCRIPT_PATH}/src/main.py ${CM_RUN_OPTS} ${CM_ML_MODEL_FILE_WITH_PATH}
 test $? -eq 0 || exit 1
11 changes: 9 additions & 2 deletions cm-mlops/script/app-mlperf-inference-cpp/customize.py
@@ -6,9 +6,16 @@ def preprocess(i):

     os_info = i['os_info']

-    # if os_info['platform'] == 'windows':
-    #    return {'return':1, 'error': 'Windows is not supported in this script yet'}
+    automation = i['automation']
+
+    meta = i['meta']
+
+    # if os_info['platform'] == 'windows':
+    #    # Currently support only LLVM on Windows
+    #    print ('# Forcing LLVM on Windows')
+    #    r = automation.update_deps({'deps':meta['post_deps'], 'update_deps':{'compile-program': {'adr':{'compiler':{'tags':'llvm'}}}}})
+    #    if r['return']>0: return r

     env = i['env']

if env.get('CM_MLPERF_SKIP_RUN', '') == "yes":
8 changes: 8 additions & 0 deletions cm-mlops/script/app-mlperf-inference-cpp/tests/win.bat
@@ -0,0 +1,8 @@
rem TBD: currently not compiling - need to check ...

cmr "install llvm prebuilt" --version=16.0.4
cmr "install llvm prebuilt" --version=17.0.6

cmr "get lib onnxruntime lang-cpp _cpu" --version=1.11.1
cmr "get lib onnxruntime lang-cpp _cpu" --version=1.13.1
cmr "get lib onnxruntime lang-cpp _cpu" --version=1.15.1
7 changes: 5 additions & 2 deletions cm-mlops/script/app-mlperf-inference/customize.py
@@ -203,7 +203,7 @@ def postprocess(i):
     if os.path.exists(env['CM_MLPERF_USER_CONF']):
         shutil.copy(env['CM_MLPERF_USER_CONF'], 'user.conf')

-    result = mlperf_utils.get_result_from_log(env['CM_MLPERF_LAST_RELEASE'], model, scenario, output_dir, mode)
+    result, valid = mlperf_utils.get_result_from_log(env['CM_MLPERF_LAST_RELEASE'], model, scenario, output_dir, mode)
     power = None
     power_efficiency = None
     if mode == "performance":
@@ -221,8 +221,11 @@

     if not state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model].get(scenario):
         state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario] = {}
     state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario][mode] = result
+    state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario][mode+'_valid'] = valid[mode]

     if power:
         state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario]['power'] = power
+        state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario]['power_valid'] = valid['power']
     if power_efficiency:
         state['cm-mlperf-inference-results'][state['CM_SUT_CONFIG_NAME']][model][scenario]['power_efficiency'] = power_efficiency
@@ -281,7 +284,6 @@ def postprocess(i):

     for xd in xdirs:
         xpath = os.path.join(cache.path, xd)
-        print (xpath)
         if os.path.isdir(xpath):
             r = cm.access({'action':'system', 'automation':'utils', 'path':xpath, 'cmd':'git rev-parse HEAD'})
             if r['return'] == 0 and r['ret'] == 0:
@@ -459,6 +461,7 @@ def postprocess(i):
     if env.get('CM_DUMP_SYSTEM_INFO', True):
         dump_script_output("detect,os", env, state, 'new_env', os.path.join(output_dir, "os_info.json"))
         dump_script_output("detect,cpu", env, state, 'new_env', os.path.join(output_dir, "cpu_info.json"))
+        env['CM_DUMP_RAW_PIP_FREEZE_FILE_PATH'] = os.path.join(env['CM_MLPERF_OUTPUT_DIR'], "pip_freeze.raw")
         dump_script_output("dump,pip,freeze", env, state, 'new_state', os.path.join(output_dir, "pip_freeze.json"))

return {'return':0}
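After this change, every scenario entry in the aggregated results state carries an explicit validity flag next to the measured value. An illustrative shape of the resulting dict (SUT name and values are hypothetical):

```python
# Hypothetical snapshot of the state written by postprocess() above.
state = {
    'cm-mlperf-inference-results': {
        'my-sut-config': {                    # state['CM_SUT_CONFIG_NAME']
            'bert-99': {
                'Offline': {
                    'performance': '1234.5',      # result for this mode
                    'performance_valid': True,    # valid[mode] from get_result_from_log
                    'power': '250.0',
                    'power_valid': True,          # valid['power']
                }
            }
        }
    }
}
```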
30 changes: 30 additions & 0 deletions cm-mlops/script/app-stable-diffusion-onnx-py/README-extra.md
@@ -0,0 +1,30 @@
# Examples

CM interface for https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx

```bash
cm run script "install python-venv" --name=sd-test
cm run script "get generic-python-lib _package.optimum[onnxruntime]" --adr.python.name=sd-test
cm run script "activate python-venv" --name=sd-test

cm run script "python app stable-diffusion onnx" --adr.python.name=sd-test --text="crazy programmer"

cm rm cache -f
cm run script "python app stable-diffusion onnx _cuda" --adr.python.name=sd-test --text="crazy programmer"

cm docker script "python app stable-diffusion onnx" --text="crazy programmer" --output=. --docker_cm_repo=ctuning@mlcommons-ck --env.CM_DOCKER_ADD_FLAG_TO_CM_MLOPS_REPO=xyz4

```



# Resources

* https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0
* https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main
* https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/main
* https://huggingface.co/runwayml/stable-diffusion-v1-5
* https://huggingface.co/bes-dev/stable-diffusion-v1-4-onnx
* https://onnxruntime.ai/docs/tutorials/csharp/stable-diffusion-csharp.html
* https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main
* https://huggingface.co/docs/optimum/onnxruntime/usage_guides/models
106 changes: 106 additions & 0 deletions cm-mlops/script/app-stable-diffusion-onnx-py/_cm.yaml
@@ -0,0 +1,106 @@
alias: app-stable-diffusion-onnx-py
uid: 4d33981ac3534b3b

automation_alias: script
automation_uid: 5b4e0237da074764

category: "Modular AI/ML application pipeline"

tags:
- app
- stable
- diffusion
- stable-diffusion
- onnx
- python


deps:
- tags: detect,os
- tags: get,sys-utils-cm
- names:
  - python
  - python3
  tags: get,python3

- tags: get,cuda
  names:
  - cuda
  enable_if_env:
    USE_CUDA:
    - yes
- tags: get,cudnn
  names:
  - cudnn
  enable_if_env:
    USE_CUDA:
    - yes

- tags: get,generic-python-lib,_package.optimum[onnxruntime]
  names:
  - optimum
  skip_if_env:
    USE_CUDA:
    - yes

- tags: get,generic-python-lib,_package.optimum[onnxruntime-gpu]
  names:
  - optimum
  enable_if_env:
    USE_CUDA:
    - yes

- tags: get,generic-python-lib,_package.diffusers
  names:
  - diffusers

- tags: get,ml-model,huggingface,zoo,_model-stub.runwayml/stable-diffusion-v1-5
  revision: onnx
  model_filename: model_index.json
  full_subfolder: .


variations:
  cuda:
    group: target
    env:
      USE_CUDA: yes
      CM_DEVICE: cuda:0

  cpu:
    group: target
    default: yes
    env:
      USE_CPU: yes
      CM_DEVICE: cpu

input_mapping:
  text: CM_APP_STABLE_DIFFUSION_ONNX_PY_TEXT
  output: CM_APP_STABLE_DIFFUSION_ONNX_PY_OUTPUT

input_description:
  text:
    desc: "Text to generate image"
  output:
    desc: "Output directory"

docker:
  skip_run_cmd: 'no'
  all_gpus: 'yes'
  input_paths:
  - output
  add_quotes_to_keys:
  - text
  skip_input_for_fake_run:
  - text
  - output
  - env.CM_DOCKER_ADD_FLAG_TO_CM_MLOPS_REPO
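Note how `add_quotes_to_keys` here ties into the `rebuild_flags` change in `cm-mlops/automation/script/module_misc.py` above: when CM regenerates the script command inside the container, the value of `--text` is wrapped in escaped quotes so a multi-word prompt survives the extra shell layer. Illustratively (a hypothetical rendering of the regenerated flag):

```bash
# Without add_quotes_to_keys (prompt breaks at the space):
#   ... --text=crazy programmer
# With add_quotes_to_keys: [text]:
#   ... --text=\"crazy programmer\"
```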
34 changes: 34 additions & 0 deletions cm-mlops/script/app-stable-diffusion-onnx-py/process.py
@@ -0,0 +1,34 @@
# https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/onnx

import os

from optimum.onnxruntime import ORTStableDiffusionPipeline

output = os.environ.get('CM_APP_STABLE_DIFFUSION_ONNX_PY_OUTPUT','')

f = os.path.join(output, 'output.png')

if os.path.isfile(f):
    os.remove(f)

cm_model_path = os.environ.get('CM_ML_MODEL_PATH','')
if cm_model_path == '':
    print ('Error: CM_ML_MODEL_PATH env is not defined')
    exit(1)

device = os.environ.get('CM_DEVICE','')

pipeline = ORTStableDiffusionPipeline.from_pretrained(cm_model_path, local_files_only=True).to(device)

text = os.environ.get('CM_APP_STABLE_DIFFUSION_ONNX_PY_TEXT','')
if text == '': text = "a photo of an astronaut riding a horse on mars"


print ('')
print ('Generating image based on "{}"'.format(text))

image = pipeline(text).images[0]

image.save(f)

print ('Image saved to "{}"'.format(f))
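For reference, the script is driven entirely by environment variables, so it can also be exercised outside CM roughly like this (hypothetical paths; CM normally sets these variables itself via the wrappers below):

```bash
# Hypothetical standalone run, assuming a local ONNX export of the model
# and the optimum[onnxruntime] package installed:
CM_ML_MODEL_PATH=./stable-diffusion-v1-5-onnx \
CM_DEVICE=cpu \
CM_APP_STABLE_DIFFUSION_ONNX_PY_OUTPUT=. \
CM_APP_STABLE_DIFFUSION_ONNX_PY_TEXT="crazy programmer" \
python process.py
```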
2 changes: 2 additions & 0 deletions cm-mlops/script/app-stable-diffusion-onnx-py/run.bat
@@ -0,0 +1,2 @@
%CM_PYTHON_BIN_WITH_PATH% %CM_TMP_CURRENT_SCRIPT_PATH%\process.py
IF %ERRORLEVEL% NEQ 0 EXIT %ERRORLEVEL%
4 changes: 4 additions & 0 deletions cm-mlops/script/app-stable-diffusion-onnx-py/run.sh
@@ -0,0 +1,4 @@
#!/bin/bash

${CM_PYTHON_BIN} ${CM_TMP_CURRENT_SCRIPT_PATH}/process.py
test $? -eq 0 || exit 1