
Use Azure OpenAI to Run WebArena Evaluation #6372

Open
wxl-lxw opened this issue Jan 21, 2025 · 2 comments
Labels
documentation Related to documentation evaluation Related to running evaluations with OpenHands

Comments

@wxl-lxw

wxl-lxw commented Jan 21, 2025

I'm trying to run the WebArena evaluation following this guide. However, it only shows how to run it with the OpenAI API, and I want to evaluate WebArena with the Azure OpenAI API. Are there any instructions I can follow?

Thanks.

@enyst
Collaborator

enyst commented Jan 21, 2025

I think the problem there is that WebArena/browsergym requires an OpenAI key for some internal functions.

In my understanding, other than that you can set up an Azure LLM for the agent like in the linked guide here

Cc: @adityasoni9998 I believe you hit the same issue?

@mamoodi mamoodi added documentation Related to documentation evaluation Related to running evaluations with OpenHands labels Jan 21, 2025
@adityasoni9998
Contributor

> I think the problem there is that WebArena/browsergym requires an OpenAI key for some internal functions.
>
> In my understanding, other than that you can set up an Azure LLM for the agent like in the linked guide here
>
> Cc: @adityasoni9998 I believe you hit the same issue?

Yes, this is a bit annoying, and I encountered a similar issue when trying to use a LiteLLM proxy for VisualWebArena evaluation. There is a two-step dependency here: OpenHands relies on BrowserGym for evaluation on the WebArena benchmark, and BrowserGym internally relies on WebArena functions to compute resolve rates. If the model name mentioned in the code linked above matches your Azure OpenAI model, you can try setting the OPENAI_BASE_URL environment variable to your Azure API base URL in your sandbox here.
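
For illustration only, here is a minimal sketch of what overriding those variables could look like in Python. The helper name `build_sandbox_env`, the example URL, and the placeholder key are all hypothetical; how the resulting environment is actually passed into the OpenHands sandbox depends on the code linked above and is not shown here. Whether an Azure endpoint accepts requests sent this way is also an assumption you would need to verify against your deployment.

```python
import os


def build_sandbox_env(azure_base_url: str, api_key: str) -> dict[str, str]:
    """Return a copy of the current environment with the OpenAI variables overridden.

    Hypothetical helper: it only prepares the variables and does not start a sandbox.
    """
    if not azure_base_url.startswith("https://"):
        raise ValueError("expected an https:// base URL")
    env = dict(os.environ)
    # OPENAI_BASE_URL is the variable named in the comment above.
    env["OPENAI_BASE_URL"] = azure_base_url.rstrip("/")
    # The internal WebArena functions are said to require an OpenAI key.
    env["OPENAI_API_KEY"] = api_key
    return env


# Placeholder values; replace with your own resource URL and key.
env = build_sandbox_env("https://example-resource.openai.azure.com/", "placeholder-key")
print(env["OPENAI_BASE_URL"])
```

The helper returns a new dictionary rather than mutating `os.environ`, so the override applies only to whatever process or sandbox you hand it to.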


4 participants