
Commit a3dc371

Auto-merge updates from auto-update branch
2 parents 5a97b2f + efbc0c7 commit a3dc371

49 files changed: +1134, -1134 lines

open/MLCommons/measurements/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/stable-diffusion-xl/offline/README.md

+3, -3 lines changed
@@ -19,7 +19,7 @@ pip install -U cmind
 
 cm rm cache -f
 
-cm pull repo mlcommons@mlperf-automations --checkout=467517e4a572872046058e394a0d83512cfff38b
+cm pull repo mlcommons@mlperf-automations --checkout=ca9263aff2a56ee495a03382fb678506581d9da9
 
 cm run script \
 --tags=app,mlperf,inference,generic,_nvidia,_sdxl,_tensorrt,_cuda,_valid,_r4.1-dev_default,_offline \
@@ -71,7 +71,7 @@ cm run script \
 --env.CM_DOCKER_REUSE_EXISTING_CONTAINER=yes \
 --env.CM_DOCKER_DETACHED_MODE=yes \
 --env.CM_MLPERF_INFERENCE_RESULTS_DIR_=/home/arjun/gh_action_results/valid_results \
---env.CM_DOCKER_CONTAINER_ID=82cba5956497 \
+--env.CM_DOCKER_CONTAINER_ID=c30d1a720abb \
 --env.CM_MLPERF_LOADGEN_COMPLIANCE_TEST=TEST04 \
 --add_deps_recursive.compiler.tags=gcc \
 --add_deps_recursive.coco2014-original.tags=_full \
@@ -129,4 +129,4 @@ Model Precision: int8
 ### Accuracy Results
 
 ### Performance Results
-`Samples per second`: `0.697572`
+`Samples per second`: `0.698`
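
The only substantive changes to this README are the newer mlperf-automations checkout pin and the throughput figure, which is now reported to three decimal places (the previous 0.697572 also rounds to 0.698). For orientation, the reproduction steps visible in the hunks above fit together as the following shell sequence. This is a sketch assembled only from lines shown in this diff; the remaining --env.* and --add_deps_recursive.* options of the README's full `cm run script` command (several of them host-specific, such as the results directory and Docker container ID) are omitted:

    # Sketch assembled from the README hunks above; illustrative only.
    pip install -U cmind
    cm rm cache -f
    cm pull repo mlcommons@mlperf-automations --checkout=ca9263aff2a56ee495a03382fb678506581d9da9
    cm run script \
        --tags=app,mlperf,inference,generic,_nvidia,_sdxl,_tensorrt,_cuda,_valid,_r4.1-dev_default,_offline
    # ...plus the --env.* / --add_deps_recursive.* options from the full README command.

The remaining hunks of this commit update what appears to be the accompanying accuracy console log for the new run: timestamps, the generated user_conf_path, the log directory, and the Docker container ID change, while the harness configuration values shown in context stay the same.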
@@ -1,30 +1,30 @@
-[2024-12-28 07:19:36,133 main.py:229 INFO] Detected system ID: KnownSystem.RTX4090x1
+[2024-12-29 07:34:13,589 main.py:229 INFO] Detected system ID: KnownSystem.RTX4090x1
 /home/cmuser/.local/lib/python3.8/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
   warnings.warn(_BETA_TRANSFORMS_WARNING)
 /home/cmuser/.local/lib/python3.8/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
   warnings.warn(_BETA_TRANSFORMS_WARNING)
-[2024-12-28 07:19:37,149 generate_conf_files.py:107 INFO] Generated measurements/ entries for RTX4090x1_TRT/stable-diffusion-xl/Offline
-[2024-12-28 07:19:37,149 __init__.py:46 INFO] Running command: python3 -m code.stable-diffusion-xl.tensorrt.harness --logfile_outdir="/cm-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/stable-diffusion-xl/offline/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=5000 --test_mode="AccuracyOnly" --gpu_batch_size=2 --mlperf_conf_path="/home/cmuser/CM/repos/local/cache/c1d8c371d52d46a3/inference/mlperf.conf" --tensor_path="build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/" --use_graphs=true --user_conf_path="/home/cmuser/CM/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/9589e8492fe242ea972de9be508f4e7e.conf" --gpu_inference_streams=1 --gpu_copy_streams=1 --gpu_engines="./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan" --scenario Offline --model stable-diffusion-xl
-[2024-12-28 07:19:37,149 __init__.py:53 INFO] Overriding Environment
+[2024-12-29 07:34:14,715 generate_conf_files.py:107 INFO] Generated measurements/ entries for RTX4090x1_TRT/stable-diffusion-xl/Offline
+[2024-12-29 07:34:14,715 __init__.py:46 INFO] Running command: python3 -m code.stable-diffusion-xl.tensorrt.harness --logfile_outdir="/cm-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/stable-diffusion-xl/offline/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=5000 --test_mode="AccuracyOnly" --gpu_batch_size=2 --mlperf_conf_path="/home/cmuser/CM/repos/local/cache/c1d8c371d52d46a3/inference/mlperf.conf" --tensor_path="build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/" --use_graphs=true --user_conf_path="/home/cmuser/CM/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/0f2b4a4ab1aa48d092f808fe52515e2a.conf" --gpu_inference_streams=1 --gpu_copy_streams=1 --gpu_engines="./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan,./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan" --scenario Offline --model stable-diffusion-xl
+[2024-12-29 07:34:14,715 __init__.py:53 INFO] Overriding Environment
 /home/cmuser/.local/lib/python3.8/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
   warnings.warn(_BETA_TRANSFORMS_WARNING)
 /home/cmuser/.local/lib/python3.8/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
   warnings.warn(_BETA_TRANSFORMS_WARNING)
-[2024-12-28 07:19:38,510 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
-[2024-12-28 07:19:38,608 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
-[2024-12-28 07:19:39,117 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan.
-[2024-12-28 07:19:40,158 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan.
-[2024-12-28 07:19:41,123 backend.py:96 INFO] Enabling cuda graphs for unet
-[2024-12-28 07:19:41,300 backend.py:154 INFO] captured graph for BS=1
-[2024-12-28 07:19:41,553 backend.py:154 INFO] captured graph for BS=2
-[2024-12-28 07:19:41,554 harness.py:207 INFO] Start Warm Up!
-[2024-12-28 07:19:47,429 harness.py:209 INFO] Warm Up Done!
-[2024-12-28 07:19:47,429 harness.py:211 INFO] Start Test!
-[2024-12-28 09:19:15,294 backend.py:801 INFO] [Server] Received 5000 total samples
-[2024-12-28 09:19:15,295 backend.py:809 INFO] [Device 0] Reported 5000 samples
-[2024-12-28 09:19:15,295 harness.py:214 INFO] Test Done!
-[2024-12-28 09:19:15,295 harness.py:216 INFO] Destroying SUT...
-[2024-12-28 09:19:15,295 harness.py:219 INFO] Destroying QSL...
+[2024-12-29 07:34:16,327 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
+[2024-12-29 07:34:16,428 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
+[2024-12-29 07:34:16,936 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan.
+[2024-12-29 07:34:17,974 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan.
+[2024-12-29 07:34:18,939 backend.py:96 INFO] Enabling cuda graphs for unet
+[2024-12-29 07:34:19,149 backend.py:154 INFO] captured graph for BS=1
+[2024-12-29 07:34:19,402 backend.py:154 INFO] captured graph for BS=2
+[2024-12-29 07:34:19,402 harness.py:207 INFO] Start Warm Up!
+[2024-12-29 07:34:25,225 harness.py:209 INFO] Warm Up Done!
+[2024-12-29 07:34:25,225 harness.py:211 INFO] Start Test!
+[2024-12-29 09:33:49,131 backend.py:801 INFO] [Server] Received 5000 total samples
+[2024-12-29 09:33:49,132 backend.py:809 INFO] [Device 0] Reported 5000 samples
+[2024-12-29 09:33:49,132 harness.py:214 INFO] Test Done!
+[2024-12-29 09:33:49,132 harness.py:216 INFO] Destroying SUT...
+[2024-12-29 09:33:49,132 harness.py:219 INFO] Destroying QSL...
 benchmark : Benchmark.SDXL
 buffer_manager_thread_count : 0
 data_dir : /home/cmuser/CM/repos/local/cache/5b2b0cc913a4453a/data
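
The timestamps in this log also allow a rough sanity check against the README's throughput figure: 5000 samples are processed between Start Test! (07:34:25) and Test Done! (09:33:49), roughly 7164 seconds, i.e. about 0.70 samples per second. This is only an informal cross-check of the AccuracyOnly pass, not how the official Offline performance number is measured, but it is consistent with the reported 0.698. A minimal shell sketch of the arithmetic, assuming GNU date and bc are available:

    # Informal cross-check using the Start Test!/Test Done! timestamps above.
    # The official figure comes from the separate performance run.
    start=$(date -u -d '2024-12-29 07:34:25' +%s)
    end=$(date -u -d '2024-12-29 09:33:49' +%s)
    echo "scale=4; 5000 / ($end - $start)" | bc    # prints .6979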
@@ -33,7 +33,7 @@ gpu_copy_streams : 1
 gpu_inference_streams : 1
 input_dtype : int32
 input_format : linear
-log_dir : /home/cmuser/CM/repos/local/cache/dfbf240f980947f5/repo/closed/NVIDIA/build/logs/2024.12.28-07.19.35
+log_dir : /home/cmuser/CM/repos/local/cache/dfbf240f980947f5/repo/closed/NVIDIA/build/logs/2024.12.29-07.34.12
 mlperf_conf_path : /home/cmuser/CM/repos/local/cache/c1d8c371d52d46a3/inference/mlperf.conf
 model_path : /home/cmuser/CM/repos/local/cache/5b2b0cc913a4453a/models/SDXL/
 offline_expected_qps : 0.0
@@ -44,7 +44,7 @@ system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='13
 tensor_path : build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/
 test_mode : AccuracyOnly
 use_graphs : True
-user_conf_path : /home/cmuser/CM/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/9589e8492fe242ea972de9be508f4e7e.conf
+user_conf_path : /home/cmuser/CM/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/0f2b4a4ab1aa48d092f808fe52515e2a.conf
 system_id : RTX4090x1
 config_name : RTX4090x1_stable-diffusion-xl_Offline
 workload_setting : WorkloadSetting(HarnessType.Custom, AccuracyTarget.k_99, PowerSetting.MaxP)
@@ -60,7 +60,7 @@ cpu_freq : None
 [I] Loading bytes from ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan
 [I] Loading bytes from ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan
 [I] Loading bytes from ./build/engines/RTX4090x1/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan
-[2024-12-28 09:19:15,588 run_harness.py:166 INFO] Result: Accuracy run detected.
+[2024-12-29 09:33:49,425 run_harness.py:166 INFO] Result: Accuracy run detected.
 
 ======================== Result summaries: ========================
 