XuGW-Kevin changed the title from "Reomve seed in GPQA benchmark" to "Remove seed in GPQA benchmark" on Sep 29, 2024
In gpqa.py, line 42, we have:
I find that this setting makes every correct choice `C`. Similarly, if the seed is set to 40, every correct choice becomes `D`. This is a problem, since we don't want a dataset in which every question has the same ground-truth label.
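
To illustrate why a fixed seed can collapse every label onto one letter, here is a minimal sketch. It assumes the choices are shuffled with the global `random` module and that the seed is reset before each question's shuffle. The snippet from gpqa.py is not reproduced here, so the function names, the `seed=0` value, and the synthetic questions below are all hypothetical and may not match the real code.

```python
import random

LETTERS = "ABCD"


def label_with_reseed(correct, incorrect, seed):
    """Hypothetical pattern: reseed the global RNG before every shuffle."""
    random.seed(seed)
    choices = list(incorrect) + [correct]
    # Same seed and same list length give the same permutation every time,
    # so the correct answer always lands at the same index.
    random.shuffle(choices)
    return LETTERS[choices.index(correct)]


def label_with_shared_rng(correct, incorrect, rng):
    """Possible alternative: one RNG seeded once and shared across questions."""
    choices = list(incorrect) + [correct]
    rng.shuffle(choices)
    return LETTERS[choices.index(correct)]


# Synthetic stand-ins for benchmark questions (not real GPQA data).
questions = [
    (f"right-{i}", [f"wrong-{i}-{j}" for j in range(3)]) for i in range(200)
]

reseeded_labels = {label_with_reseed(c, w, seed=0) for c, w in questions}

rng = random.Random(0)  # seeded once, still reproducible across runs
shared_labels = {label_with_shared_rng(c, w, rng) for c, w in questions}

print("reseeded per question:", sorted(reseeded_labels))
print("shared RNG:", sorted(shared_labels))
```

If the real code follows this pattern, either removing the per-question seed or seeding a single `random.Random` instance once would keep the run reproducible while spreading the correct answers across the letters.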