Commit 3f24ebf

chughtapan and Tapan Chugh authored
Improvements for Appworld MCP Integration (#23)
Co-authored-by: Tapan Chugh <[email protected]>
1 parent 8d5ff70 commit 3f24ebf

3 files changed: +20 -7 lines changed

tests/benchmarks/appworld/mcp_server.py

Lines changed: 5 additions & 0 deletions
@@ -251,6 +251,11 @@ async def call_tool(name: str, arguments: dict[str, Any]) -> Any:
             ),
         )
 
+    # Clean up database connections before exiting
+    # This ensures SQLite connections are closed and the process can exit cleanly
+    collections.model_collection.close()
+    collections.apis.close()
+
 
 async def main() -> None:
     parser = argparse.ArgumentParser(description="AppWorld MCP Server with task-specific state")
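
The hunk above closes the SQLite-backed collections once the server is done, so the process can exit cleanly. A minimal sketch of the same idea using only the standard library follows. The `run_with_cleanup` helper and the in-memory databases are illustrative assumptions, not code from this commit. It also wraps the work in `try`/`finally`, which the diff does not do, so the connections close even if the work raises.

```python
import sqlite3


def run_with_cleanup(connections: list[sqlite3.Connection]) -> int:
    """Hypothetical helper: do some work, then close every connection."""
    try:
        conn = connections[0]
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    finally:
        # Runs on both normal return and exceptions, so no connection
        # is left open when the process shuts down.
        for c in connections:
            c.close()


# Two in-memory databases stand in for the two collections in the diff.
conns = [sqlite3.connect(":memory:"), sqlite3.connect(":memory:")]
row_count = run_with_cleanup(conns)
```

After `run_with_cleanup` returns, any further use of a connection in `conns` raises `sqlite3.ProgrammingError`, which is a quick way to confirm the cleanup ran.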

tests/benchmarks/appworld/prompts.py

Lines changed: 12 additions & 4 deletions
@@ -8,6 +8,7 @@
 from appworld.common.io import dump_yaml, read_file, read_json
 from appworld.common.text import render_template
 from appworld.task import Task
+from fast_agent.mcp.common import create_namespaced_name
 
 # Path to installed appworld_experiments package
 EXPERIMENTS_PATH = Path(appworld_experiments.__file__).parent
@@ -54,8 +55,14 @@ def load_system_instruction(task: Task) -> str:
     return base_instruction + demo_text
 
 
-def _format_demo_messages(demo_messages: list[dict[str, Any]]) -> str:
-    """Format demo messages as readable conversation."""
+def _format_demo_messages(demo_messages: list[dict[str, Any]], server_name: str = "appworld") -> str:
+    """
+    Format demo messages as readable conversation.
+
+    Args:
+        demo_messages: List of demo message dictionaries
+        server_name: MCP server name (default: "appworld")
+    """
     demo_text_parts = ["\n"]
 
     for msg in demo_messages:
@@ -72,13 +79,14 @@ def _format_demo_messages(demo_messages: list[dict[str, Any]]) -> str:
             calls = []
             for tc in tool_calls:
                 func_name = tc["function"]["name"]
+                prefixed_name = create_namespaced_name(server_name, func_name)
                 func_args = tc["function"]["arguments"]
                 args_dict = json.loads(func_args) if isinstance(func_args, str) else func_args
                 if args_dict:
                     args_str = ", ".join(f"{k}={repr(v)}" for k, v in args_dict.items())
-                    calls.append(f"{func_name}({args_str})")
+                    calls.append(f"{prefixed_name}({args_str})")
                 else:
-                    calls.append(f"{func_name}()")
+                    calls.append(f"{prefixed_name}()")
             demo_text_parts.append("\n" + "\n".join(calls))
         elif content:
             demo_text_parts.append(f"\n{content}")
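
To see what the prefixed demo calls look like, here is a self-contained sketch of the formatting step. It does not import `fast_agent`. The `namespaced` function is a hypothetical stand-in for `create_namespaced_name`, and the `"__"` separator is an assumption taken from the wording in system_instruction.txt. `format_tool_calls` and the sample `demo` data are likewise illustrative.

```python
import json
from typing import Any

SEP = "__"  # assumption: the separator described in system_instruction.txt


def namespaced(server_name: str, tool_name: str) -> str:
    """Hypothetical stand-in for create_namespaced_name."""
    return f"{server_name}{SEP}{tool_name}"


def format_tool_calls(tool_calls: list[dict[str, Any]], server_name: str = "appworld") -> str:
    """Render tool calls as one 'server__app__api(k=v, ...)' line each."""
    calls = []
    for tc in tool_calls:
        name = namespaced(server_name, tc["function"]["name"])
        raw_args = tc["function"]["arguments"]
        # Arguments may arrive as a JSON string or as an already-parsed dict.
        args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
        args_str = ", ".join(f"{k}={v!r}" for k, v in args.items()) if args else ""
        calls.append(f"{name}({args_str})")
    return "\n".join(calls)


demo = [
    {"function": {"name": "spotify__login", "arguments": '{"username": "a@b.com"}'}},
    {"function": {"name": "supervisor__complete_task", "arguments": {}}},
]
print(format_tool_calls(demo))
```

With this sample input the demo text shows the same fully qualified names the model will see in its tool list, which is the point of the change.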

tests/benchmarks/appworld/system_instruction.txt

Lines changed: 3 additions & 3 deletions
@@ -3,9 +3,9 @@ I am your supervisor, and you are an AI Assistant whose job is to complete my da
 
 My name is: {{ main_user.first_name }} {{ main_user.last_name }}. My personal email is {{ main_user.email }} and phone number is {{ main_user.phone_number }}.
 
-You will be given a task instruction and a list of functions in the standard format. The functions correspond to APIs from various apps you have access to. The function name has two parts, the app name and API name separated by "__", e.g., spotify__login is the login API for the Spotify app.
+You will be given a task instruction and a list of functions in the standard format. The functions correspond to APIs from various apps you have access to. The function name has three parts: the server name "appworld", the app name, and the API name, all separated by "__" (double underscore). For example, appworld__spotify__login is the login API for the Spotify app.
 
-You will complete the task completely autonomously through multi-turn interaction with the execution environment. In each turn, you will make one or more function calls, and the environment will return its outputs. This will continue until you call `complete_task` API from the Supervisor app.
+You will complete the task completely autonomously through multi-turn interaction with the execution environment. In each turn, you will make one or more function calls, and the environment will return its outputs. This will continue until you call the appworld__supervisor__complete_task API.
 
 Here are brief app-wise descriptions.
 
@@ -35,7 +35,7 @@ B. App-specific instructions:
 
 C. Task-completion instructions:
 
-You must call the `supervisor__complete_task` API after completing the task.
+You must call the `appworld__supervisor__complete_task` API after completing the task.
 - If an answer is needed, e.g., for "How many songs are in the Spotify queue?", call it with the appropriate answer argument value.
 - If no answer is required, e.g., for "Start my Spotify music player.", omit the answer argument (or set it to None/null).
 - The task is doable, but if you cannot find a way, you can call it with status="fail" to exit with failure.
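
The updated instruction describes a three-part name: server, app, and API, joined by `"__"`. A small sketch of splitting such a name back into its parts follows. `ToolName` and `parse_tool_name` are hypothetical names introduced for illustration and are not part of this commit or of AppWorld.

```python
from typing import NamedTuple


class ToolName(NamedTuple):
    server: str
    app: str
    api: str


def parse_tool_name(full_name: str, sep: str = "__") -> ToolName:
    """Split 'server__app__api' into its three parts.

    maxsplit=2 keeps any further separators inside the API part.
    """
    parts = full_name.split(sep, 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected server{sep}app{sep}api, got {full_name!r}")
    return ToolName(*parts)


parsed = parse_tool_name("appworld__spotify__login")
```

Under this sketch a two-part name such as `spotify__login` (the old convention) raises `ValueError`, so a leftover un-prefixed name would be caught early.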
