mixtral: accuracy check output contains `np.float64(...)`, which doesn't match the metric regex #1763

viraatc · 2024-07-02T23:50:52Z

The output of mixtral-8x7b/evaluate-accuracy.py:

xx.xx% pass@1
{'typescript': x, 'ruby': x, 'python': x, 'javascript': x, 'php': x, 'cpp': x}
{'typescript': x, 'ruby': x, 'python': x, 'javascript': x, 'php': x, 'cpp': x}
Results
{
    'rouge1': np.float64(x),
    'rouge2': np.float64(x),
    'rougeL': np.float64(x),
    'rougeLsum': np.float64(x),
    'gsm8k': x,
    'mbxp': x,
    'gen_len': np.int64(x),
    'gen_num': x,
    'gen_tok_len': x,
    'tokens_per_sample': x
}

The output contains `np.float64(...)` text wrapped around the floating-point value for these fields:

rouge1, rouge2, rougeL, rougeLsum

It also contains `np.int64(...)` text wrapped around the integer value for this field:

gen_len

This fails the regexes we have defined:

inference/tools/submission/submission_checker.py

Lines 533 to 537 in 9e2c9f6

    
    "ROUGE1": r".*'rouge1':\s([\d.]+).*",
    "ROUGE2": r".*'rouge2':\s([\d.]+).*",
    "ROUGEL": r".*'rougeL':\s([\d.]+).*",
    "ROUGELSUM": r".*'rougeLsum':\s([\d.]+).*",
    "GEN_LEN": r".*'gen_len':\s([\d.]+).*",
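To see why the wrapped values break these patterns, here is a minimal sketch using the ROUGE1 regex quoted above. It assumes the checker applies the pattern to one output line at a time with `re.match`. The helper name `extract_rouge1` and the value `45.1234` are made up for illustration.

```python
import re

# Pattern copied from the ROUGE1 entry quoted above.
ROUGE1_PATTERN = r".*'rouge1':\s([\d.]+).*"


def extract_rouge1(line):
    """Return the captured rouge1 value as a string, or None if the line does not match."""
    match = re.match(ROUGE1_PATTERN, line)
    return match.group(1) if match else None


# Illustrative lines only; the number is not a real result.
wrapped_line = "    'rouge1': np.float64(45.1234),"
plain_line = "    'rouge1': 45.1234,"

print(extract_rouge1(wrapped_line))  # None
print(extract_rouge1(plain_line))    # 45.1234
```

After `'rouge1':` and one whitespace character, the pattern requires at least one digit or dot, so the `n` of `np.float64(` prevents any match.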

I'm not entirely sure why this behaves differently from our llama script:

inference/language/llama2-70b/evaluate-accuracy.py

Line 95 in 29edbb0

result = {k: round(np.mean(v) * 100, 4) for k, v in result.items()}

Verified Fix:

Wrap the value here with the built-in float, i.e. `float(round(...))`:

inference/language/mixtral-8x7b/evaluate-accuracy.py

Line 180 in 29edbb0

result = {k: round(np.mean(v) * 100, 4) for k, v in result.items()}

Wrap the value here with the built-in int, i.e. `int(np.sum(...))`:

inference/language/mixtral-8x7b/evaluate-accuracy.py

Line 215 in 29edbb0

'gen_len': np.sum(prediction_lens),
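
Both wraps can be sketched together as follows. This is a rough illustration, not the merged patch: the helper `summarize` and the sample scores are hypothetical, and only the `float(round(np.mean(...)))` and `int(np.sum(...))` expressions come from the fix described above. For context, NumPy 2.x prints scalars with their type name (for example `np.float64(3.0)`), which would explain the wrapped text if the script ran under that version. That cause is an assumption and is not confirmed in this thread.

```python
import numpy as np


def summarize(scores, prediction_lens):
    """Hypothetical helper: aggregate metrics into built-in Python numbers."""
    # float(...) turns the np.float64 returned by round(np.mean(...)) into a plain float.
    result = {k: float(round(np.mean(v) * 100, 4)) for k, v in scores.items()}
    # int(...) turns the np.int64 returned by np.sum into a plain int.
    result["gen_len"] = int(np.sum(prediction_lens))
    return result


# Made-up sample data for illustration only.
scores = {"rouge1": [0.41, 0.45], "rouge2": [0.20, 0.24]}
result = summarize(scores, [120, 135])

# Built-in floats and ints print without any np.* wrapper.
print(result)
print(type(result["rouge1"]).__name__, type(result["gen_len"]).__name__)
```

Because the values are built-in `float` and `int` objects, their printed form does not depend on how the installed NumPy version formats its scalars.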


viraatc added a commit to viraatc/inference that referenced this issue Jul 2, 2024

fix: remove np type around output values

d1bfc0b

fixes mlcommons#1763

viraatc mentioned this issue Jul 2, 2024

fix: mixtral: remove np type around output values in evaluate-accuracy script #1764

Merged

pgmpablo157321 pushed a commit that referenced this issue Jul 3, 2024

fix: remove np type around output values (#1764)

4a77003

fixes #1763

pgmpablo157321 closed this as completed in #1764 Jul 3, 2024
