OpenShift AI Caikit+TGIS MLPerf Inference Implementation for Llama2-70b #1

Open
wants to merge 75 commits into
base: master

Commits on Jan 23, 2024

  1. 4d0e246

Commits on Jan 25, 2024

  1. Fixes for report generation and submission checker for models without compliance tests (mlcommons#1576)
    
    * Fix offline_min_samples in submission checker and mlcommons#1569
    
    * Removed mlperf.conf from llama2 directory to avoid confusion
    
    * Update submission_checker.py
    
    * Fixes for 4.0
    
    * Cleanup compliance dir check for models without compliance tests
    arjunsuresh authored Jan 25, 2024
    901ce67
  2. 190413d

Commits on Jan 26, 2024

  1. 🔄 synced local 'tools/submission/power/sources_checksums.json' with remote 'compliance/sources_checksums.json' (mlcommons#1582)
    
    Co-authored-by: mlcommons-bot <null>
    mlcommons-bot authored Jan 26, 2024
    27ef43a
  2. Fix image list mismatch (mlcommons#1579)

    Co-authored-by: Miro <[email protected]>
    pgmpablo157321 and mrmhodak authored Jan 26, 2024
    9b8006f
  3. 180014a
  4. 🔄 synced local 'tools/submission/power/power_checker.py' with remote 'compliance/check.py' (mlcommons#1587)
    
    Co-authored-by: mlcommons-bot <null>
    mlcommons-bot authored Jan 26, 2024
    523316e
  5. 3ad8534

Commits on Jan 30, 2024

  1. Ignore trailing whitespace lines in spl.txt files (mlcommons#1584)

    * Ignore trailing whitespace lines in spl.txt files.
    
    * Remove fix from sync'ed power_checker.py.
    
    * Reformat according to black.
    psyhtest authored Jan 30, 2024
    a04b1f5

Commits on Feb 1, 2024

  1. 4bdf56f
  2. Add support to dump 10 compliance images during accuracy run for SDXL (mlcommons#1591)
    
    * Add support to dump 10 compliance images during accuracy run for SDXL
    
    * Fix typo
    
    * Dump caption.txt in the same path
    nvyihengz authored Feb 1, 2024
    3a902e5
  3. mlcommons#1598: fix token and sample logging for Llama2 when accuracy_log_sampling_target is enabled (mlcommons#1599)

    nvzhihanj authored Feb 1, 2024
    cc3daae

Commits on Feb 2, 2024

  1. Fix loadgen token metrics latency constrains (mlcommons#1596)

    * Fix loadgen token metrics latency constrains
    
    * Update perf constraints check for token metrics
    
    * Add equal issue mode for LLMs models
    pgmpablo157321 authored Feb 2, 2024
    473053f

Commits on Feb 6, 2024

  1. Add sample length check to test06 (mlcommons#1603)

    * Add sample length check to test06
    
    * Remove spaces in token metrics recomendation
    
    * Add important item to Llama readme
    
    * Fix Bug: number of tokens logged before computing them
    
    * Fix typo: lenght -> length
    pgmpablo157321 authored Feb 6, 2024
    104d855

Commits on Feb 7, 2024

  1. Enable equal issue mode for LLM benchmarks (mlcommons#1610)

    * Enable equal issue mode for LLM benchmarks
    
    * Reduce min_query_count to 1 for server/MS/SS
    
    * Remove scenario
    
    * Remove min_query_count so default is used; revoke padding change for equal issue offline
    
    * Pad min_queries, not samples_per_query for non-offline
    
    * Add documentation to the sample equal issue
    nvzhihanj authored Feb 7, 2024
    357ccef
  2. 44285d9
  3. d45a66c
  4. Remove loadgen warnings (mlcommons#1608)

    Co-authored-by: Miro <[email protected]>
    pgmpablo157321 and mrmhodak authored Feb 7, 2024
    d7dba08
  5. Update README.md - remove unwanted lines in CM commands (mlcommons#1601)

    * Update README.md
    
    No longer need custom fork as the relevant changes are in the inference repository
    
    * Update dataset.py
    
    ---------
    
    Co-authored-by: Miro <[email protected]>
    arjunsuresh and mrmhodak authored Feb 7, 2024
    b0777f0
  6. Typo fix in README.md (mlcommons#1588)

    Co-authored-by: Miro <[email protected]>
    arjunsuresh and mrmhodak authored Feb 7, 2024
    3190d09
  7. Update README.md with CM commands to download stable-diffusion, gptj and dlrmv2 models (mlcommons#1604)
    
    * Update README.md
    
    Add CM commands to download Stable diffusion models
    
    * Update README.md
    
    * Update README.md
    arjunsuresh authored Feb 7, 2024
    840435a

Commits on Feb 8, 2024

  1. Turn equal issue mode off for TEST06 (mlcommons#1615)

    * Turn equal issue mode off for Llama2 TEST06
    
    * Add TEST06 to the output dir
    nvzhihanj authored Feb 8, 2024
    817dd96
  2. Fix submission checker and TEST06 for Llama2 (mlcommons#1616)

    * Fix submission checker and TEST06 for Llama2
    
    * Remove redundant line
    
    * Move test_dir check
    nvzhihanj authored Feb 8, 2024
    0ed5190

Commits on Feb 12, 2024

  1. Bugfix: equal-issue mode on offline causing accuracy run to fail (3D-UNet) (mlcommons#1624)
    
    Currently, 3D-UNet is the only workload using equal-issue mode in the Offline scenario.
    A recent change to LLM equal-issue mode caused the 3D-UNet accuracy run to issue more than one query, which bloated the accuracy log and failed the accuracy checking script.
    This change fixes that problem.
    nv-jinhosuh authored Feb 12, 2024
    f06b920
  2. f9a643c

Commits on Feb 15, 2024

  1. Hotfix: DLRMv2 Audit Test01 fallback failure (mlcommons#1626)

    * Hotfix: DLRMv2 Audit Test01 fallback failure 
    
    DLRMv2 Audit TEST01 may take the fallback route, which the accuracy check script (accuracy-dlrm.py) did not expect. The script always expects the entire sample set to be in the accuracy log, whereas Audit TEST01 generates only a subset.
    
    This fixes the Audit TEST01 failure described above.
    
    * typo fix
    nv-jinhosuh authored Feb 15, 2024
    486a629

Commits on Feb 20, 2024

  1. de31ee2
  2. 268bc9d

Commits on Feb 21, 2024

  1. 5d0c221
  2. dc94ae3
  3. ab747c4
  4. d037f22
  5. 147a91a
  6. 15d14c9
  7. 46a35c2
  8. c0bd844
  9. 8219069

Commits on Feb 23, 2024

  1. Fix typo in README.md

    nathanw-mlc authored Feb 23, 2024
    e39003a

Commits on Feb 27, 2024

  1. TGI support first pass

    Maxusmusti committed Feb 27, 2024
    396d3f8
  2. 84c9673
  3. dbab9f0
  4. Added v1 offline artifacts

    Maxusmusti committed Feb 27, 2024
    ffcbc0e
  5. server scenario first pass

    Maxusmusti committed Feb 27, 2024
    86c594e
  6. Funcional server scenario

    Maxusmusti committed Feb 27, 2024
    8751a35
  7. 6ff4090
  8. 2225a45
  9. 22eb574
  10. 7bb4c1b
  11. e96c8a6
  12. 84475d7
  13. Streaming first pass

    Maxusmusti committed Feb 27, 2024
    58e16ea
  14. Updated server impl

    Maxusmusti committed Feb 27, 2024
    f35c17e
  15. 132f725
  16. 9f31f19
  17. v8 Update

    Maxusmusti committed Feb 27, 2024
    654dda5
  18. GPT-J first pass

    Maxusmusti committed Feb 27, 2024
    c7f699d
  19. 081024f
  20. v1 full implementation

    Maxusmusti committed Feb 27, 2024
    43afdea
  21. Update README for gpt-j

    Maxusmusti committed Feb 27, 2024
    f8cc5ba
  22. First pass multi-endpoint

    Maxusmusti committed Feb 27, 2024
    8ae2bf4
  23. 0ad354e
  24. 6b117e0
  25. Change file names

    Maxusmusti committed Feb 27, 2024
    ebf0710
  26. a357cf4
  27. Minor adjustments

    Maxusmusti committed Feb 27, 2024
    a505e83
  28. Updated for exact values

    Maxusmusti committed Feb 27, 2024
    ef6b3db
  29. Update llama-2 with vllm

    Maxusmusti committed Feb 27, 2024
    230d495
  30. Fixed output cap bug

    Maxusmusti committed Feb 27, 2024
    48a4396
  31. Fix llama server bug

    Maxusmusti committed Feb 27, 2024
    57e241d
  32. Added v10 image for llama

    Maxusmusti committed Feb 27, 2024
    6ef5023

Commits on Feb 28, 2024

  1. 3bc09fa
  2. Updated READMEs

    Maxusmusti committed Feb 28, 2024
    38e3aea

Commits on Feb 29, 2024

  1. 84f1aac
  2. Updated image in yaml

    Maxusmusti committed Feb 29, 2024
    c7eeef9
  3. Fix first token dtype

    Maxusmusti committed Feb 29, 2024
    8ab5998