Results from self-hosted GitHub Actions - NVIDIA RTX 4090
arjunsuresh committed Feb 7, 2025
1 parent a76edae commit d6439d0
Showing 24 changed files with 6,522 additions and 67 deletions.
open/MLCommons/code/gptj-99/README.md (1 change: 1 addition, 0 deletions)
TBD
| Model               | Scenario | Accuracy                          | Throughput | Latency (in ms) |
|---------------------|----------|-----------------------------------|------------|-----------------|
| stable-diffusion-xl | offline  | (16.3689, 237.82579)              | 0.353      | -               |
| gptj-99             | offline  | (32.2581, 6.6667, 22.5806, 264.0) | 49.01      | -               |
*Check [CM MLPerf docs](https://docs.mlcommons.org/inference) for more details.*

## Host platform

* OS version: Linux-6.8.0-52-generic-x86_64-with-glibc2.35
* CPU version: x86_64
* Python version: 3.10.12 (main, Jan 17 2025, 14:35:34) [GCC 11.4.0]
* MLC version: unknown
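
These host facts can be reproduced with standard commands; a quick sketch (the automation's own detection code is not shown here):

```bash
# Reproduce the OS version string above (python platform.platform() format)
python3 -c 'import platform; print(platform.platform())'

# Python version
python3 --version
```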

## CM Run Command

See [CM installation guide](https://docs.mlcommons.org/inference/install/).

```bash
pip install -U mlcflow

mlc rm cache -f

mlc pull repo gateoverflow@mlperf-automations --checkout=0f5a86f9514172a0976090cdc91963b2b7eb4282
```
*Note: to use the [latest automation recipes](https://docs.mlcommons.org/inference) for MLPerf, reload gateoverflow@mlperf-automations without the checkout and clean the MLC cache as follows:*

```bash
mlc rm repo gateoverflow@mlperf-automations
mlc pull repo gateoverflow@mlperf-automations
mlc rm cache -f
```
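
The benchmark invocation itself is not captured in this README. A representative command for this result (reference GPT-J implementation, PyTorch, CUDA, Offline scenario, test mode), following the `mlcr run-mlperf` interface from the docs linked above; the exact flags used for this run are an assumption:

```bash
# Sketch of the gptj-99 reference run in test mode on CUDA (flags assumed, not from this commit)
mlcr run-mlperf,inference,_r5.0-dev \
   --model=gptj-99 \
   --implementation=reference \
   --framework=pytorch \
   --device=cuda \
   --scenario=Offline \
   --execution_mode=test \
   --quiet
```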

## Results

Platform: gh_action-reference-gpu-pytorch_v2.6.0-cu124

Model Precision: fp32

### Accuracy Results
* `ROUGE1`: `32.2581`, required accuracy for closed division: `>= 42.55663`
* `ROUGE2`: `6.6667`, required accuracy for closed division: `>= 19.92226`
* `ROUGEL`: `22.5806`, required accuracy for closed division: `>= 29.68822`
* `GEN_LEN`: `264.0`, required accuracy for closed division: `>= 3615190.2`

These figures come from a single-sample test run (see the console log below), so they are not expected to meet the closed-division thresholds.

### Performance Results
`Samples per second`: `49.0101`
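
The throughput above is the loadgen-reported offline samples per second. Assuming the standard loadgen output files (the path depends on the run directory), it can be read back from the summary log:

```bash
# Pull the reported throughput out of the loadgen summary log
grep "Samples per second" mlperf_log_summary.txt
```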
Constructing QSL
Encoding Samples
Finished constructing QSL.
Loading PyTorch model...
Loading checkpoint shards: 100%|██████████| 3/3 [00:01<00:00, 2.19it/s]
Some weights of the model checkpoint at /home/mlcuser/MLC/repos/local/cache/download-file_9907d37a/checkpoint/checkpoint-final were not used when initializing GPTJForCausalLM: ['transformer.h.0.attn.bias', 'transformer.h.0.attn.masked_bias', 'transformer.h.1.attn.bias', 'transformer.h.1.attn.masked_bias', 'transformer.h.10.attn.bias', 'transformer.h.10.attn.masked_bias', 'transformer.h.11.attn.bias', 'transformer.h.11.attn.masked_bias', 'transformer.h.12.attn.bias', 'transformer.h.12.attn.masked_bias', 'transformer.h.13.attn.bias', 'transformer.h.13.attn.masked_bias', 'transformer.h.14.attn.bias', 'transformer.h.14.attn.masked_bias', 'transformer.h.15.attn.bias', 'transformer.h.15.attn.masked_bias', 'transformer.h.16.attn.bias', 'transformer.h.16.attn.masked_bias', 'transformer.h.17.attn.bias', 'transformer.h.17.attn.masked_bias', 'transformer.h.18.attn.bias', 'transformer.h.18.attn.masked_bias', 'transformer.h.19.attn.bias', 'transformer.h.19.attn.masked_bias', 'transformer.h.2.attn.bias', 'transformer.h.2.attn.masked_bias', 'transformer.h.20.attn.bias', 'transformer.h.20.attn.masked_bias', 'transformer.h.21.attn.bias', 'transformer.h.21.attn.masked_bias', 'transformer.h.22.attn.bias', 'transformer.h.22.attn.masked_bias', 'transformer.h.23.attn.bias', 'transformer.h.23.attn.masked_bias', 'transformer.h.24.attn.bias', 'transformer.h.24.attn.masked_bias', 'transformer.h.25.attn.bias', 'transformer.h.25.attn.masked_bias', 'transformer.h.26.attn.bias', 'transformer.h.26.attn.masked_bias', 'transformer.h.27.attn.bias', 'transformer.h.27.attn.masked_bias', 'transformer.h.3.attn.bias', 'transformer.h.3.attn.masked_bias', 'transformer.h.4.attn.bias', 'transformer.h.4.attn.masked_bias', 'transformer.h.5.attn.bias', 'transformer.h.5.attn.masked_bias', 'transformer.h.6.attn.bias', 'transformer.h.6.attn.masked_bias', 'transformer.h.7.attn.bias', 'transformer.h.7.attn.masked_bias', 'transformer.h.8.attn.bias', 'transformer.h.8.attn.masked_bias', 'transformer.h.9.attn.bias', 'transformer.h.9.attn.masked_bias']
- This IS expected if you are initializing GPTJForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTJForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Casting models to GPU...
100%|██████████| 285/285 [00:00<00:00, 2029501.94it/s]
Running LoadGen test...
Number of Samples in query_samples : 1
/home/mlcuser/venv/mlc/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:676: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
100%|██████████| 1/1 [00:01<00:00, 1.29s/it]

No warnings encountered during test.

No errors encountered during test.
Test Done!
Destroying SUT...
Destroying QSL...
Finished destroying SUT.
Finished destroying QSL.
{
"MLC_HOST_CPU_WRITE_PROTECT_SUPPORT": "yes",
"MLC_HOST_CPU_MICROCODE": "0x2b000603",
"MLC_HOST_CPU_FPU_SUPPORT": "yes",
"MLC_HOST_CPU_FPU_EXCEPTION_SUPPORT": "yes",
"MLC_HOST_CPU_BUGS": "spectre_v1 spectre_v2 spec_store_bypass swapgs eibrs_pbrsb bhi",
"MLC_HOST_CPU_TLB_SIZE": "Not Found",
"MLC_HOST_CPU_CFLUSH_SIZE": "64",
"MLC_HOST_CPU_ARCHITECTURE": "x86_64",
"MLC_HOST_CPU_TOTAL_CORES": "48",
"MLC_HOST_CPU_ON_LINE_CPUS_LIST": "0-47",
"MLC_HOST_CPU_VENDOR_ID": "GenuineIntel",
"MLC_HOST_CPU_MODEL_NAME": "Intel(R) Xeon(R) w7-2495X",
"MLC_HOST_CPU_FAMILY": "6",
"MLC_HOST_CPU_THREADS_PER_CORE": "2",
"MLC_HOST_CPU_PHYSICAL_CORES_PER_SOCKET": "24",
"MLC_HOST_CPU_SOCKETS": "1",
"MLC_HOST_CPU_MAX_MHZ": "4800.0000",
"MLC_HOST_CPU_L1D_CACHE_SIZE": "1.1 MiB (24 instances)",
"MLC_HOST_CPU_L1I_CACHE_SIZE": "768 KiB (24 instances)",
"MLC_HOST_CPU_L2_CACHE_SIZE": "48 MiB (24 instances)",
"MLC_HOST_CPU_L3_CACHE_SIZE": "45 MiB (1 instance)",
"MLC_HOST_CPU_NUMA_NODES": "1",
"MLC_HOST_CPU_TOTAL_LOGICAL_CORES": "48",
"MLC_HOST_MEMORY_CAPACITY": "192G",
"MLC_HOST_DISK_CAPACITY": "6.9T"
}
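
These fields mirror what standard Linux utilities report. A minimal sketch to cross-check a few of them by hand (an assumption about how the values are sourced, not the actual MLC detection code):

```bash
# CPU model name, socket/core/thread counts, and cache sizes (MLC_HOST_CPU_* fields)
lscpu | grep -E 'Model name|Socket\(s\)|Thread\(s\) per core|Core\(s\) per socket|L1d|L1i|L2|L3|NUMA node\(s\)'

# Total memory (compare with MLC_HOST_MEMORY_CAPACITY)
free -h | awk '/^Mem:/ {print $2}'

# Total disk capacity (compare with MLC_HOST_DISK_CAPACITY)
df -h --total | awk '/^total/ {print $2}'
```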
{
"starting_weights_filename": "checkpoint-final",
"retraining": "no",
"input_data_types": "fp32",
"weight_data_types": "fp32",
"weight_transformations": "no"
}
```mermaid
graph TD
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> detect,os
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> get,sys-utils-cm
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> get,python
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> get,mlcommons,inference,src
pull-git-repo,c23132ed65c4421d --> detect,os
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> pull,git,repo
get-mlperf-inference-utils,e341e5f86d8342e5 --> get,mlperf,inference,src
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> get,mlperf,inference,utils
get-cuda,46d133d9ef92422d_(_toolkit_) --> detect,os
get-cuda-devices,7a3ede4d3558427a_(_with-pycuda_) --> get,cuda,_toolkit
get-cuda-devices,7a3ede4d3558427a_(_with-pycuda_) --> get,python3
get-generic-python-lib,94b62a682bc44791_(_package.pycuda_) --> detect,os
detect-cpu,586c8a43320142f7 --> detect,os
get-generic-python-lib,94b62a682bc44791_(_package.pycuda_) --> detect,cpu
get-generic-python-lib,94b62a682bc44791_(_package.pycuda_) --> get,python3
get-generic-python-lib,94b62a682bc44791_(_pip_) --> get,python3
get-generic-python-lib,94b62a682bc44791_(_package.pycuda_) --> get,generic-python-lib,_pip
get-cuda-devices,7a3ede4d3558427a_(_with-pycuda_) --> get,generic-python-lib,_package.pycuda
get-generic-python-lib,94b62a682bc44791_(_package.numpy_) --> detect,os
detect-cpu,586c8a43320142f7 --> detect,os
get-generic-python-lib,94b62a682bc44791_(_package.numpy_) --> detect,cpu
get-generic-python-lib,94b62a682bc44791_(_package.numpy_) --> get,python3
get-generic-python-lib,94b62a682bc44791_(_pip_) --> get,python3
get-generic-python-lib,94b62a682bc44791_(_package.numpy_) --> get,generic-python-lib,_pip
get-cuda-devices,7a3ede4d3558427a_(_with-pycuda_) --> get,generic-python-lib,_package.numpy
app-mlperf-inference,d775cac873ee4231_(_reference,_gptj-99,_pytorch,_cuda,_test,_r5.0-dev_default,_float16,_offline_) --> get,cuda-devices,_with-pycuda
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> detect,os
detect-cpu,586c8a43320142f7 --> detect,os
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> detect,cpu
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,sys-utils-cm
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,python
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,cuda,_cudnn
get-generic-python-lib,94b62a682bc44791_(_torch_cuda_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_torch_cuda
get-generic-python-lib,94b62a682bc44791_(_torchvision_cuda_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_torchvision_cuda
get-generic-python-lib,94b62a682bc44791_(_transformers_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_transformers
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,ml-model,large-language-model,gptj,raw,_pytorch
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,dataset,cnndm,_validation
generate-mlperf-inference-user-conf,3af4475745964b93 --> detect,os
detect-cpu,586c8a43320142f7 --> detect,os
generate-mlperf-inference-user-conf,3af4475745964b93 --> detect,cpu
generate-mlperf-inference-user-conf,3af4475745964b93 --> get,python
generate-mlperf-inference-user-conf,3af4475745964b93 --> get,mlcommons,inference,src
get-mlperf-inference-sut-configs,c2fbf72009e2445b --> get,cache,dir,_name.mlperf-inference-sut-configs
generate-mlperf-inference-user-conf,3af4475745964b93 --> get,sut,configs
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> generate,user-conf,mlperf,inference
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,loadgen
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,mlcommons,inference,src
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,mlcommons,inference,src
get-generic-python-lib,94b62a682bc44791_(_package.psutil_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_package.psutil
get-generic-python-lib,94b62a682bc44791_(_package.datasets_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_package.datasets
get-generic-python-lib,94b62a682bc44791_(_package.attrs_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_package.attrs
get-generic-python-lib,94b62a682bc44791_(_package.accelerate_) --> get,python3
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> get,generic-python-lib,_package.accelerate
detect-cpu,586c8a43320142f7 --> detect,os
benchmark-program,19f369ef47084895 --> detect,cpu
benchmark-program-mlperf,cfff0132a8aa4018 --> benchmark-program,program
app-mlperf-inference-mlcommons-python,ff149e9781fc4b65_(_pytorch,_gptj-99,_cuda,_offline,_float16_) --> benchmark-mlperf