17 changes: 17 additions & 0 deletions .env.template
@@ -0,0 +1,17 @@
# API Keys for ToolUniverse
# Copy this file to .env and fill in your actual API keys

# At least one of: OPENAI_API_KEY, AZURE_OPENAI_API_KEY
OPENAI_API_KEY=your_api_key_here

BOLTZ_MCP_SERVER_HOST=your_api_key_here

EXPERT_FEEDBACK_MCP_SERVER_URL=your_api_key_here

HF_TOKEN=your_api_key_here

TXAGENT_MCP_SERVER_HOST=your_api_key_here

USPTO_API_KEY=your_api_key_here

USPTO_MCP_SERVER_HOST=your_api_key_here

38 changes: 38 additions & 0 deletions README.md
@@ -232,6 +232,44 @@ Our comprehensive documentation covers everything from quick start to advanced w
- **[Adding Tools Tutorial](https://zitniklab.hms.harvard.edu/ToolUniverse/tutorials/addtools/Adding_Tools_Tutorial.html)**: Step-by-step tool addition guide
- **[MCP Tool Registration](https://zitniklab.hms.harvard.edu/ToolUniverse/tutorials/addtools/mcp_tool_registration_en.html)**: Register tools via MCP

### Verified Source Discovery (VSD) + Harvest Workflow

ToolUniverse ships a “harvest → verify → register” pipeline that turns external REST endpoints into first-class Dynamic REST tools.

1. **Harvest candidates** – `GenericHarvestTool` searches the static Harvest catalog or wraps ad‑hoc URLs you supply with `{"urls": [...]}`.
2. **Probe readiness** – `HarvestCandidateTesterTool` (optional) performs a dry run against the candidate, suggesting default query params or headers.
3. **Register verified tools** – `VerifiedSourceRegisterTool` stamps metadata, persists the tool in `~/.tooluniverse/vsd/generated_tools.json` (override with `TOOLUNIVERSE_VSD_DIR`), and hot-loads it through the Dynamic REST runner.
4. **Inspect / prune** – `VerifiedSourceDiscoveryTool` lists everything in the verified catalog, while `VerifiedSourceRemoveTool` deletes entries and unregisters their dynamic bindings.

```python
from tooluniverse.vsd_tool import (
    GenericHarvestTool,
    HarvestCandidateTesterTool,
    VerifiedSourceRegisterTool,
    VerifiedSourceDiscoveryTool,
    VerifiedSourceRemoveTool,
)

harvest = GenericHarvestTool({})
candidate = harvest.run({"query": "clinical"})["candidates"][0]

tester = HarvestCandidateTesterTool({})
probe = tester.run({"candidate": candidate})

register = VerifiedSourceRegisterTool({})
register.run(
    "ClinicalTrialsREST",
    candidate,
    default_params={"search": "cancer"},
    force=True,  # bypass strict validation once endpoint is known-good
)

print(VerifiedSourceDiscoveryTool({}).run({})["tools"])
VerifiedSourceRemoveTool({}).run({"tool_name": "ClinicalTrialsREST"})
```

Registered tools are immediately available to agents via normal loading (e.g., `ToolUniverse().load_tools(tool_type=["dynamic_rest"])`). This workflow keeps internal sources (Harvest/VSD) separate from public REST integrations so they can ship on their own release cadence.
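
To see what has been registered without going through ToolUniverse, the catalog file can also be read directly. The sketch below is only an illustration under stated assumptions: it assumes `TOOLUNIVERSE_VSD_DIR` replaces the `~/.tooluniverse/vsd` directory, and that `generated_tools.json` holds either a JSON object keyed by tool name or a list of configs with a `name` field. Neither layout is confirmed here, and `resolve_vsd_catalog` and `list_registered_names` are hypothetical helper names, not part of ToolUniverse.

```python
import json
import os
from pathlib import Path


def resolve_vsd_catalog(env=None):
    """Return the assumed location of the verified-tool catalog."""
    env = os.environ if env is None else env
    override = env.get("TOOLUNIVERSE_VSD_DIR")
    if override:
        base = Path(override).expanduser()
    else:
        base = Path.home() / ".tooluniverse" / "vsd"
    return base / "generated_tools.json"


def list_registered_names(catalog_path):
    """List tool names in the catalog; a missing file means nothing is registered."""
    path = Path(catalog_path)
    if not path.exists():
        return []
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Assumed layout 1: {"ToolName": {...config...}, ...}
        return sorted(data)
    # Assumed layout 2: [{"name": "ToolName", ...}, ...]
    return sorted(
        item["name"] for item in data if isinstance(item, dict) and "name" in item
    )


if __name__ == "__main__":
    catalog = resolve_vsd_catalog()
    print(catalog, list_registered_names(catalog))
```

Passing the environment as an argument keeps the lookup easy to test without touching the real home directory.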

### 📚 API Reference
- **[API Directory](https://zitniklab.hms.harvard.edu/ToolUniverse/api/modules.html)**: Complete module listing
- **[Core Modules](https://zitniklab.hms.harvard.edu/ToolUniverse/api/tooluniverse.html)**: Main ToolUniverse class and utilities
32 changes: 32 additions & 0 deletions docs/expand_tooluniverse/architecture.rst
@@ -85,6 +85,8 @@ Repository Structure Tree
│ ├── tool_finder_keyword.py # Keyword-based tool search
│ ├── tool_finder_embedding.py # Embedding-based tool search
│ ├── tool_finder_llm.py # LLM-powered tool discovery
│ ├── remote/docker_llm/ # Docker-based LLM provisioning helpers
│ ├── DockerLLMProvisioner.py # Compose tool for Docker LLM MCP auto-registration
│ ├── embedding_database.py # Tool embedding database
│ └── embedding_sync.py # Embedding synchronization
│ │
@@ -314,6 +316,36 @@ Extension Points
- Use `compose_tool.py` or add scripts in `compose_scripts/` for complex call chains
- Leverage `tool_finder_*` for retrieval and routing assistance

Tool Loading Cheat Sheet
------------------------

- Package data is loaded from the JSON files mapped in ``default_config.py`` plus everything under ``src/tooluniverse/data/``.
- Remote/MCP entries are merged from both the packaged ``data/remote_tools`` directory **and** the user override folder ``~/.tooluniverse/remote_tools``. Dropping a JSON config there makes the tool visible without code changes.
- The runtime builds three main registries:

1. ``tool_files`` → category JSON manifests (local tools)
2. ``data/remote_tools`` → bundled remote definitions
3. ``~/.tooluniverse/remote_tools`` → user/automation supplied remote definitions

- Use ``ToolUniverse.load_tools()`` to refresh the registry after adding new files without restarting the host process.
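
The merge of bundled and user-supplied remote definitions described above can be pictured with the following sketch. It is not the ToolUniverse loader: ``load_remote_configs`` and ``merge_remote_configs`` are hypothetical names, the one-config-or-list file layout is assumed, and letting a user entry replace a packaged entry with the same ``name`` is an assumed precedence rule that the text above does not state.

```python
import json
from pathlib import Path


def load_remote_configs(directory):
    """Read every *.json file in a directory into a flat list of tool configs."""
    configs = []
    root = Path(directory).expanduser()
    if not root.is_dir():
        return configs
    for path in sorted(root.glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # A file may hold a single config or a list of them (assumed layouts).
        configs.extend(data if isinstance(data, list) else [data])
    return configs


def merge_remote_configs(packaged_dir, user_dir):
    """Merge configs by tool name; later directories win (assumed precedence)."""
    merged = {}
    for directory in (packaged_dir, user_dir):
        for config in load_remote_configs(directory):
            if isinstance(config, dict) and "name" in config:
                merged[config["name"]] = config
    return merged


if __name__ == "__main__":
    merged = merge_remote_configs("data/remote_tools", "~/.tooluniverse/remote_tools")
    print(sorted(merged))
```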

Remote MCP Provisioning
-----------------------

- ``DockerLLMProvisioner`` (compose tool) and ``scripts/provision_docker_llm.py`` automate standing up an MCP-enabled LLM in Docker, poll its ``/health`` endpoint, and emit the JSON configs under ``~/.tooluniverse/remote_tools`` so the new tool registers instantly.
- Remote stubs created from bundled configs (e.g., expert feedback, DepMap) are read-only until you connect ToolUniverse to the actual MCP server. You can:

1. Call ``ToolUniverse.load_mcp_tools(["http://server:port/mcp"])`` to ingest tools live, or
2. Provision a local container via ``DockerLLMProvisioner`` or the CLI helper to host the endpoints yourself.

- The ``RemoteTool`` error message now includes these activation instructions when an agent accidentally calls an offline remote tool.
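
The readiness polling mentioned above can be sketched as a plain retry loop. This is a minimal illustration, not the code in ``tooluniverse.remote.docker_llm.provision``: ``http_ok`` and ``wait_for_health`` are hypothetical names, the one-second interval is an assumption, and only the ``/health`` path, the 120-second default, and the ``127.0.0.1:9000`` address are taken from the CLI helper's defaults.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True when the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_for_health(url, timeout_seconds=120, interval=1.0, probe=http_ok,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll the probe until it succeeds or the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        if probe(url):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    ready = wait_for_health("http://127.0.0.1:9000/health", timeout_seconds=5)
    print("ready" if ready else "not ready")
```

The probe, clock, and sleep functions are parameters so the loop can be exercised without a running container.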

Catalog Navigation Tips
-----------------------

- ``ToolNavigatorTool`` combines the full catalog (including remote/VSD entries) with lightweight scoring. Use it to shortlist relevant tools before running long compositions.
- ``ToolFinderKeyword`` / ``ToolFinderEmbedding`` provide complementary search modalities; both now benefit from the expanded metadata listed in ``~/.tooluniverse/remote_tools``.
- For large collections, consider building category-specific shortlists in ``toolsets/`` and surfacing them via ``ToolNavigatorTool`` filters or custom compose tools.

Directory Quick Reference
--------------------------

125 changes: 125 additions & 0 deletions scripts/provision_docker_llm.py
@@ -0,0 +1,125 @@
#!/usr/bin/env python3
"""
Provision a Docker-hosted LLM and register it with ToolUniverse.

This script wraps the helper in tooluniverse.remote.docker_llm.provision so that
non-technical users can start the container and create the necessary MCP client
configurations with a single command.
"""

from __future__ import annotations

import argparse
import sys

from tooluniverse.remote.docker_llm.provision import (
    DEFAULT_IMAGE,
    ProvisionError,
    provision_docker_llm,
)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Start a Docker-hosted LLM and register it with ToolUniverse."
    )
    parser.add_argument(
        "--image",
        default=DEFAULT_IMAGE,
        help=f"Docker image to run (default: {DEFAULT_IMAGE})",
    )
    parser.add_argument(
        "--container-name",
        help="Name for the Docker container. Generated automatically if omitted.",
    )
    parser.add_argument(
        "--host",
        default="127.0.0.1",
        help="Host interface to bind (default: 127.0.0.1).",
    )
    parser.add_argument(
        "--host-port",
        type=int,
        default=9000,
        help="Host port to expose the MCP endpoint on (default: 9000).",
    )
    parser.add_argument(
        "--container-port",
        type=int,
        default=8000,
        help="Internal container port (default: 8000).",
    )
    parser.add_argument(
        "--tool-name",
        default="DockerLLMChat",
        help="Tool name to register inside ToolUniverse.",
    )
    parser.add_argument(
        "--tool-prefix",
        help="Prefix used when auto-registering tools from the MCP server.",
    )
    parser.add_argument(
        "--mcp-tool-name",
        default="docker_llm_chat",
        help="Underlying MCP tool name exposed by the container.",
    )
    parser.add_argument(
        "--health-path",
        default="/health",
        help="HTTP path used for readiness checks (default: /health).",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=120,
        help="Seconds to wait for container health (default: 120).",
    )
    parser.add_argument(
        "--no-reuse",
        action="store_true",
        help="Always recreate the container instead of reusing an existing one.",
    )
    parser.add_argument(
        "--docker-cli",
        default="docker",
        help="Docker CLI executable to invoke (default: docker).",
    )
    return parser


def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)

    try:
        result = provision_docker_llm(
            image=args.image,
            container_name=args.container_name,
            docker_cli=args.docker_cli,
            host=args.host,
            host_port=args.host_port,
            container_port=args.container_port,
            tool_name=args.tool_name,
            tool_prefix=args.tool_prefix,
            mcp_tool_name=args.mcp_tool_name,
            health_path=args.health_path,
            timeout_seconds=args.timeout,
            reuse_container=not args.no_reuse,
        )
    except ProvisionError as exc:
        print(f"Provisioning failed: {exc}", file=sys.stderr)
        return 1

    print("Docker LLM provisioning complete.")
    print(f"  Container name : {result.container_name}")
    print(f"  MCP server URL : {result.server_url}")
    print(f"  Tool config    : {result.config_path}")
    print(
        "Add the tool by reloading ToolUniverse or invoking "
        "'DockerLLMProvisioner' from within the agent."
    )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())