Commit f97e06f

update README
1 parent 4258021 commit f97e06f

1 file changed: +6 -10 lines changed
README.md

Lines changed: 6 additions & 10 deletions
@@ -148,16 +148,12 @@ python show_result.py \
 
 ## Pairwise win-rate compared with GPT-3.5-davinci-003
 | Model | Win | Loss | Tie | Win Rate | Loss Rate | Win Rate Adjusted |
-|:---------------------------------------------------------|----:|-----:|----:|---------:|----------:|------------------:|
-| llm-jp--llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0 | 65 | 146 | 26 | 0.274262 | 0.616034 | 0.329114 |
-| rinna--japanese-gpt-neox-3.6b-instruction-ppo | 8 | 62 | 10 | 0.100000 | 0.775000 | 0.162500 |
-| rinna--japanese-gpt-neox-3.6b-instruction-sft-v2 | 7 | 65 | 8 | 0.087500 | 0.812500 | 0.137500 |
-| cyberagent--calm2-7b-chat | 6 | 68 | 7 | 0.074074 | 0.839506 | 0.117284 |
-| llm-jp--llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0 | 5 | 66 | 8 | 0.063291 | 0.835443 | 0.113924 |
-
-
-
-
+|----------------------------------------------------------|-----|------|-----|----------|-----------|-------------------|
+| llm-jp--llm-jp-13b-instruct-lora-jaster-dolly-oasst-v1.0 | 22 | 48 | 10 | 0.2750 | 0.6000 | 0.33750 |
+| rinna--japanese-gpt-neox-3.6b-instruction-ppo | 10 | 61 | 9 | 0.1250 | 0.7625 | 0.18125 |
+| llm-jp--llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0 | 7 | 65 | 8 | 0.0875 | 0.8125 | 0.13750 |
+| rinna--japanese-gpt-neox-3.6b-instruction-sft-v2 | 8 | 69 | 3 | 0.1000 | 0.8625 | 0.11875 |
+| cyberagent--calm2-7b-chat | 5 | 67 | 8 | 0.0625 | 0.8375 | 0.11250 |
 
 The GPT4 judgments is placed in `data/jp_bench/model_judgment/gpt-4_pair.jsonl`.
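Review note: the README does not say how the rate columns are derived, but every row in this diff is consistent with Win Rate = Win / (Win + Loss + Tie), Loss Rate = Loss / (Win + Loss + Tie), and Win Rate Adjusted = (Win + 0.5 * Tie) / (Win + Loss + Tie). The sketch below rechecks a row under that assumption. The function name `pairwise_rates` is hypothetical and is not taken from `show_result.py`.

```python
def pairwise_rates(win: int, loss: int, tie: int) -> dict:
    """Recompute the rate columns from raw counts.

    Assumption inferred from the table values, not from the repository code:
    a tie counts as half a win in the adjusted win rate.
    """
    total = win + loss + tie
    if total == 0:
        raise ValueError("at least one comparison is required")
    return {
        "win_rate": win / total,
        "loss_rate": loss / total,
        "win_rate_adjusted": (win + 0.5 * tie) / total,
    }


# Counts copied from the first added row of the table (22 wins, 48 losses, 10 ties).
rates = pairwise_rates(22, 48, 10)
print({name: round(value, 5) for name, value in rates.items()})
```

The same function applied to the removed first row (65, 146, 26) reproduces its six-decimal values, which suggests the formula did not change between the two versions of the table and only the counts did.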
