Hi, @lyzhongcrd ,
Yes. However, we recommend using the same LLM as the judge for all LMMs so that the results are comparable.
For MCQ or Y/N benchmarks, where the LLM is only used as a choice extractor to make the evaluation more accurate, using a different LLM will not lead to significantly different results.
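To illustrate why the extractor matters little for MCQ benchmarks, here is a minimal sketch of a choice extractor. It is not the toolkit's actual implementation: `extract_choice`, `options`, and `llm_fallback` are hypothetical names, and the LLM is modeled as a plain callable that is only consulted when simple rules cannot decide.

```python
import re
from typing import Callable, Dict, Optional


def extract_choice(
    response: str,
    options: Dict[str, str],
    llm_fallback: Optional[Callable[[str], str]] = None,  # hypothetical judge hook
) -> Optional[str]:
    """Return the option letter chosen in `response`, or None if undecidable."""
    letters = "".join(options)  # e.g. "ABCD"

    # 1) A single standalone option letter, e.g. "(B)", "C.", "Answer: D".
    hits = set(re.findall(rf"(?<![A-Za-z])([{letters}])(?![A-Za-z])", response))
    if len(hits) == 1:
        return hits.pop()

    # 2) Exactly one option's text appears verbatim in the response.
    lowered = response.lower()
    text_hits = [key for key, text in options.items() if text.lower() in lowered]
    if len(text_hits) == 1:
        return text_hits[0]

    # 3) Only now ask the (assumed) LLM, and accept its answer only if it
    #    is a valid option letter.
    if llm_fallback is not None:
        answer = llm_fallback(response).strip().upper()
        if answer in options:
            return answer
    return None
```

Because the LLM only resolves the responses that the rules leave ambiguous, and its output is restricted to a valid option letter, swapping one capable LLM for another should rarely change the extracted choice. The rules here are deliberately crude (for example, a response starting with the article "A" would be read as option A), which is the kind of case an LLM extractor handles better.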
Is it possible to use a locally deployed LLM such as LLaVA-Critic as the judge LLM instead of calling the GPT-4 API?