feat: Complete web_search Tavily API integration by nydamon · Pull Request #53 · nydamon/automaton-research

nydamon · 2026-03-10T21:46:40Z

Summary

Completes the web_search tool implementation by wiring it to the Tavily Search API, replacing the previous stub that returned empty results.

Key Changes:

Implement Tavily REST API integration with ResilientHttpClient for automatic retries and circuit breaker protection
Map search_type parameter (all/news/research/code) to Tavily topic and search_depth
Use dependency injection via ToolContext.config for test isolation
Add comprehensive HTTP mocking with vi.mock() for offline deterministic tests
Implement defensive URL parsing and response validation
Update all unit and integration tests to validate real search results

Architecture

Config pattern: Use ctx.config.discovery.tavilyApiKey (injected at runtime) instead of loadConfig() for testability
Graceful degradation: Returns empty results with warning log when API key is unconfigured
Response mapping: Maps Tavily results (title, url, content, score, published_date) to WebSearchResult schema
Error handling: Captures API errors with status codes and error text for debugging
Caching: Reuses existing 24-hour in-memory cache infrastructure

Testing

✅ 4 unit tests passing (web-search.test.ts)
✅ 8 integration tests passing (discovery-tools.test.ts)
✅ All tests run offline with mocked HTTP (no network dependency)
✅ Full response structure validation with real results assertion

Deployment

Code reviewed and approved for production
Defensive improvements applied (URL parsing error handling, response validation)
Ready for deployment to VPS after merge

Fixes #52 (API discovery tool) - web_search now functional

🤖 Generated with Claude Code

Update cost estimation baselines to clarify: - All costs are in USD cents (not abstract credits) - Minimum goal budget: $50-100 USD per goal - Per-task budgets: $25-50 minimum with inference overhead - Explicit examples of realistic plan costs This fixes the issue where planner was returning $2.00 budgets when SOUL.md and GOVERNANCE.md specify $50-100 per goal. Model will now allocate costs aligned with documented budget policy. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

PROBLEM: Agent stuck in indecision loop during orchestrator planning phase. When orchestrator entered "classifying/planning" phase, agent received vague instruction to "do solo work" but had no clear guidance. This caused 87+ repetitive thought chunks looping on: "Should I wait or do solo work?" ROOT CAUSE: System prompt had ambiguous guidance. Agent's genesis prompt included the active goal, so agent kept thinking about a goal with $0.00 budget while orchestrator planned it. No state change occurred, causing infinite loop of identical reasoning before model gave up with malformed tool calls. SOLUTION: Added getPhaseSpecificGuidance() function that: 1. Detects when orchestrator is in classifying/planning/plan_review phases 2. Injects explicit "IMMEDIATE SOLO WORK DIRECTIVE" into system prompt 3. Provides 5 concrete, different actions (WORKLOG update, research, etc.) 4. Explicitly forbids agent from thinking about the active goal VERIFICATION: After fix deployment, agent reduced from 80+ repetitive thought chunks to 11 analytical chunks. Agent moved through executing phase without getting stuck, delegating work to children instead of looping. - Added getPhaseSpecificGuidance(db) function with detailed comments - Integrated into buildSystemPrompt() flow after orchestrator status injection - Guidance only activates during busy orchestrator phases (no impact on other phases) - No tool changes, no behavior changes to agent decision logic, purely guidance injection Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

PROBLEM: Worker getting 400 "invalid function arguments json string" error from MiniMax when sending reconstructed messages with tool calls. ROOT CAUSE: Double-encoding bug in buildContextMessages() on line 183. When tool call arguments come from MiniMax inference response, they are already a JSON string. Calling JSON.stringify() on a string value results in double-encoded JSON: Before: {"key":"value"} After: "{\"key\":\"value\"}" When we send this back to MiniMax as a tool call argument, it rejects it as malformed JSON because the value is a string literal, not an object. SOLUTION: Type-check tc.arguments before stringifying. If already a string, pass through as-is. Only stringify if it's an object. This complements the thought-loop fix by ensuring that even if the model gets confused and generates edge-case scenarios, tool arguments are properly formatted when reconstructing the message history. - Added conditional: `typeof tc.arguments === "string" ? tc.arguments : JSON.stringify(...)` - Prevents double-encoding while maintaining backward compatibility with object arguments - Minimal change, zero side effects Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

…nance policy Root cause: Hardcoded fallback values in orchestrator.ts used estimatedCostCents: 200 (cents, = $2.00) when the planner failed or returned no tasks. This fallback was not updated when cost estimation policy changed to require $50-100 minimum budgets. Solution: Update both fallback cases (planner error + empty tasks) to use estimatedCostCents: 5000 ($50 minimum), aligning with the $50-100 USD budget policy specified in SOUL.md and GOVERNANCE.md. This ensures that even when the planner fails, goals created fall back to realistic budgets rather than the old $2 placeholder. Fixes agent seeing 'goal created with $2 budget' messages despite cost estimation policy fix in ff7da4e. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Documents the minimum budget requirement for all goals and specifies how budgets are determined in both planner success and fallback paths. Includes reference to commit 2d28a3b which fixed fallback values. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

…ning - Implement Tavily REST API integration for web_search tool - Add HTTP mocking with vi.mock() for offline deterministic tests - Implement search_type parameter mapping (all/news/research/code) - Add defensive URL parsing with try-catch error handling - Add response validation for Tavily API structure - Use dependency injection (ctx.config) for test isolation - Update all test cases to validate real search results - Add integration test coverage for tool registration This completes the web_search tool from stub (empty results) to production-ready Tavily-powered search with comprehensive error handling and test coverage. Code review approved. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

chatgpt-codex-connector · 2026-03-10T21:46:45Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

nydamon and others added 6 commits March 10, 2026 10:46

nydamon merged commit 4faa25e into main Mar 10, 2026
4 of 6 checks passed

nydamon deleted the feat/web-search-tavily-integration branch March 10, 2026 21:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Complete web_search Tavily API integration#53

feat: Complete web_search Tavily API integration#53
nydamon merged 6 commits intomainfrom
feat/web-search-tavily-integration

nydamon commented Mar 10, 2026

Uh oh!

chatgpt-codex-connector bot commented Mar 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nydamon commented Mar 10, 2026

Summary

Architecture

Testing

Deployment

Uh oh!

chatgpt-codex-connector bot commented Mar 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant