Run reference code for mixtral-8x7b
#48
1. Download the repository with the reference code
Directory with the reference code:
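A sketch of this step, assuming the reference code is the MLCommons inference repository (the issue does not name the repo explicitly; the `language/mixtral-8x7b` path is where the mixtral-8x7b reference app usually lives):

```shell
# Clone the MLPerf inference reference repo (assumed location)
git clone https://github.com/mlcommons/inference.git
# Directory with the mixtral-8x7b reference code
cd inference/language/mixtral-8x7b
```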
2. Copy mlperf.conf to mixtral-8x7b
3. Set the python3 version to 3.9
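One way to pin python3 to 3.9, assuming conda is available (any virtualenv manager that can pin an interpreter version works as well):

```shell
# Create and activate a dedicated Python 3.9 environment (names are arbitrary)
conda create -n mixtral python=3.9 -y
conda activate mixtral
python3 --version   # should report Python 3.9.x
```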
4. Install python packages
If you want to use CPU: the pinned torch version is no longer available; only these versions are listed at https://download.pytorch.org/whl/nightly/torch/
If running on CPU:
If running on GPU:
For running the experiments we also need pandas:
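A hedged sketch of the installs, assuming the pinned torch nightly from the requirements file is gone and has to be replaced by a build still listed under https://download.pytorch.org/whl/nightly/ (the CUDA index suffix is an assumption; pick the one matching your driver):

```shell
# CPU run: latest torch nightly from the CPU index
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu

# GPU run: torch nightly built against CUDA (cu121 is an assumption)
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121

# Needed for running the experiments
pip install pandas
```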
5. Install loadgen
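The usual way to install loadgen is from the `loadgen/` directory of the MLPerf inference repo checkout (path assumes the repo layout from step 1):

```shell
# Build and install the loadgen Python bindings from source
cd inference/loadgen
pip install .
```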
6. Get model (checkpoint)
Results:
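One way to fetch the checkpoint is from the Hugging Face repo referenced later in this issue. The model is gated, so you have to accept the license on the model page and log in first (the local directory name is arbitrary):

```shell
# Authenticate, then pull the full checkpoint including tokenizer files
huggingface-cli login
huggingface-cli download mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --local-dir ./mixtral-8x7b-instruct-v0.1
```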
7. Download dataset
We don't need the calibration dataset for accuracy or performance runs.
8. Run performance
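A hypothetical performance invocation; the flag names follow the usual MLPerf reference-app pattern (a `main.py` taking scenario, model, and config paths) and are assumptions — check the mixtral-8x7b README for the exact flags:

```shell
# Performance run (flag names are assumptions, see README)
python3 main.py \
  --scenario Offline \
  --model-path ./mixtral-8x7b-instruct-v0.1 \
  --mlperf-conf mlperf.conf \
  --user-conf user.conf
```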
The reason for this issue is that tokenizer files are missing from the downloaded model checkpoint:
These files are located at https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main
So log in to Hugging Face (https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main) and download these files to Windows.
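A small helper to catch this before starting a run: it reports which of the usual tokenizer files are absent from a checkpoint directory. The file list is an assumption based on what the Hugging Face repo above contains:

```python
import os

# Tokenizer files the checkpoint was missing (list is an assumption,
# based on the mistralai/Mixtral-8x7B-Instruct-v0.1 repo contents)
TOKENIZER_FILES = ["tokenizer.json", "tokenizer.model", "tokenizer_config.json"]

def missing_tokenizer_files(checkpoint_dir: str) -> list:
    """Return the tokenizer files not present in checkpoint_dir."""
    return [f for f in TOKENIZER_FILES
            if not os.path.isfile(os.path.join(checkpoint_dir, f))]
```

Run it against the checkpoint directory before the benchmark and download any reported files from the Hugging Face repo.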
Run Performance:
It was stopped because the full experiment took a long time.
Accuracy
Setting of
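The accuracy pass is typically the same entry point with an accuracy flag; a hedged sketch (flag names are assumptions — check the mixtral-8x7b README):

```shell
# Accuracy run (flag names are assumptions, see README)
python3 main.py \
  --scenario Offline \
  --accuracy \
  --model-path ./mixtral-8x7b-instruct-v0.1
```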
But we have an issue when we run
The reason for this issue is that the full dataset is used, but the results are for a short run(
If I comment out the code for the Open Orca and MBXP samples, we get:
Solution: Create a dataset with 15 samples:
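A sketch of building the 15-sample dataset, assuming the reference dataset is a pickled pandas DataFrame (the file names in the usage are hypothetical):

```python
import pandas as pd

def make_small_dataset(src: str, dst: str, n: int = 15) -> int:
    """Write the first n rows of a pickled DataFrame to dst; return row count."""
    df = pd.read_pickle(src)
    small = df.head(n)
    small.to_pickle(dst)
    return len(small)

# Hypothetical usage:
# make_small_dataset("mixtral_dataset.pkl", "mixtral_dataset_15.pkl")
```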
Accuracy
6.93 mins per 1 sample.
Accuracy on GPU (all GPUs on
To check how it runs on GPU, we should use
According to README.md, run the reference code for mixtral-8x7b.