Skip to content

Conversation

@chenmancm169
Copy link

No description provided.

Copilot AI review requested due to automatic review settings December 22, 2025 12:53
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds two new agent implementations for code generation: CodexAgent and ClaudeCodeAgent. Both agents extend the BaseAgent class to provide AI-powered code generation capabilities through external CLI tools.

Key Changes:

  • Added CodexAgent that integrates with a Codex CLI tool for code generation using GPT models
  • Added ClaudeCodeAgent that integrates with Claude Agent SDK for code generation using Claude models
  • Both agents support configurable permissions, working directories, tool restrictions, and prompt customization

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 23 comments.

File Description
autoenv/codex_agent.py Implements CodexAgent with Codex CLI integration, including command building, async execution, JSONL parsing, and process management
autoenv/claude_code_agent.py Implements ClaudeCodeAgent using Claude Agent SDK with graceful fallbacks, async streaming, and error handling for various SDK exceptions
Comments suppressed due to low confidence (7)

autoenv/codex_agent.py:332

  • Overly complex 'del' method.
    def __del__(self):

autoenv/claude_code_agent.py:4

  • Import of 'os' is not used.
import os

autoenv/claude_code_agent.py:11

  • Import of 'AsyncLLM' is not used.
from base.engine.async_llm import AsyncLLM

autoenv/claude_code_agent.py:15

  • Import of 'ClaudeSDKClient' is not used.
    from claude_agent_sdk import query, ClaudeAgentOptions, ClaudeSDKClient

autoenv/claude_code_agent.py:16

  • Import of 'AssistantMessage' is not used.
    Import of 'UserMessage' is not used.
    Import of 'ResultMessage' is not used.
    from claude_agent_sdk import AssistantMessage, UserMessage, ResultMessage

autoenv/claude_code_agent.py:17

  • Import of 'TextBlock' is not used.
    Import of 'ToolUseBlock' is not used.
    Import of 'ToolResultBlock' is not used.
    from claude_agent_sdk import TextBlock, ToolUseBlock, ToolResultBlock

autoenv/codex_agent.py:13

  • Import of 'AsyncLLM' is not used.
from base.engine.async_llm import AsyncLLM

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@chenmancm169 chenmancm169 changed the title Feat/add codex claude sdk feat: add codex claude sdk Dec 23, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 15 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 36 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 72 to 79
system_prompt_override: Optional[str] = Field(
default=None,
description="Override system prompt (only for non-interactive mode)"
)
append_system_prompt: Optional[str] = Field(
default=None,
description="Append to system prompt (only for non-interactive mode)"
)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The system_prompt_override and append_system_prompt fields are defined but never used in CodexAgent's implementation. These fields are listed in modifiable_attrs and mentioned in comments but there's no code that actually passes them to the Codex CLI. Either implement this functionality or remove these unused fields.

Suggested change
system_prompt_override: Optional[str] = Field(
default=None,
description="Override system prompt (only for non-interactive mode)"
)
append_system_prompt: Optional[str] = Field(
default=None,
description="Append to system prompt (only for non-interactive mode)"
)

Copilot uses AI. Check for mistakes.
Comment on lines 195 to 196
if not output:
return {"result": "", "session_id": None}
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for stdout being empty (line 195-196) returns a dictionary with session_id=None, but in other cases session_id is extracted from the JSONL output. This inconsistency could cause issues if calling code expects session_id to always be a string or None. Consider using a consistent approach for handling missing values.

Copilot uses AI. Check for mistakes.
Comment on lines +284 to +289
except (TimeoutError, RuntimeError, ValueError, FileNotFoundError, NotADirectoryError) as e:
# Known CLI execution errors - return as error strings
return f"Error during execution: {str(e)}"
except Exception:
# Re-raise unexpected errors (programming errors) for proper debugging
raise
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling pattern catches specific exceptions (TimeoutError, RuntimeError, ValueError, FileNotFoundError, NotADirectoryError) and returns them as error strings, but then re-raises unexpected exceptions. However, FileNotFoundError and NotADirectoryError from _validate_cwd are expected validation errors that should be caught during initialization, not during execution. Consider whether these should be caught here or if they represent programming errors that should propagate.

Copilot uses AI. Check for mistakes.
Comment on lines 232 to 275
# Safely modify attributes with validation
original_values = {}
for key, value in kwargs.items():
if key not in modifiable_attrs:
return f"Error: Attribute '{key}' cannot be modified via kwargs. Allowed: {sorted(modifiable_attrs)}"

if not hasattr(self, key):
return f"Error: Unknown attribute '{key}'"

try:
original_values[key] = getattr(self, key)
setattr(self, key, value)

# Validate critical attributes after modification
if key == 'cwd':
if isinstance(self.cwd, str):
self.cwd = Path(self.cwd)
self._validate_cwd()
elif key == 'permission_mode':
valid_modes = ["default", "acceptEdits", "bypassPermissions", "plan"]
if value not in valid_modes:
raise ValueError(f"Invalid permission_mode: {value}. Must be one of: {valid_modes}")

except Exception as e:
# If validation fails, restore any attributes set so far
for restore_key, restore_value in original_values.items():
try:
setattr(self, restore_key, restore_value)
except Exception:
pass # Best effort restoration
return f"Error: Failed to set attribute '{key}': {str(e)}"

try:
options = self._create_options()
return await self._process_query_stream(request, options)
except Exception as e:
return self._handle_sdk_error(e)
finally:
# Restore original values (best effort)
for key, value in original_values.items():
try:
setattr(self, key, value)
except Exception:
pass # Ignore errors during restoration
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to CodexAgent, the run method modifies instance attributes without synchronization. If multiple coroutines call run() concurrently on the same ClaudeCodeAgent instance, they could interfere with each other's attribute modifications and restoration. Document that the agent is not safe for concurrent use or implement proper locking.

Copilot uses AI. Check for mistakes.
return f"Error: Attribute '{key}' cannot be modified via kwargs. Allowed: {sorted(modifiable_attrs)}"

if not hasattr(self, key):
return f"Error: Unknown attribute '{key}'"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message for unknown attributes (line 327) is less helpful than it could be. It says "Unknown attribute" but the attribute exists on the class - it's just not in the modifiable_attrs whitelist. This will confuse users. The message should clarify that the attribute exists but cannot be modified via kwargs, similar to the message on line 324.

Suggested change
return f"Error: Unknown attribute '{key}'"
return (
f"Error: Attribute '{key}' does not exist on CodexAgent and cannot be "
f"modified via kwargs. Allowed: {sorted(modifiable_attrs)}"
)

Copilot uses AI. Check for mistakes.
Comment on lines 320 to 370
# Safely modify attributes with validation
original_values = {}
for key, value in kwargs.items():
if key not in modifiable_attrs:
return f"Error: Attribute '{key}' cannot be modified via kwargs. Allowed: {sorted(modifiable_attrs)}"

if not hasattr(self, key):
return f"Error: Unknown attribute '{key}'"

try:
original_values[key] = getattr(self, key)
setattr(self, key, value)

# Validate critical attributes after modification
if key == 'cwd':
# Convert to Path if string provided
if isinstance(self.cwd, str):
self.cwd = Path(self.cwd)
self._validate_cwd()
elif key == 'permission_mode':
valid_modes = ["default", "acceptEdits", "bypassPermissions", "plan"]
if value not in valid_modes:
raise ValueError(f"Invalid permission_mode: {value}. Must be one of: {valid_modes}")

except Exception as e:
# If validation fails, restore any attributes set so far
for restore_key, restore_value in original_values.items():
try:
setattr(self, restore_key, restore_value)
except Exception:
pass # Best effort restoration
return f"Error: Failed to set attribute '{key}': {str(e)}"

try:
result = await self._run_cli_command(request)
return self._process_result(result)

except (TimeoutError, RuntimeError, ValueError, FileNotFoundError, NotADirectoryError) as e:
# Known CLI execution errors - return as error strings
return f"Error during execution: {str(e)}"
except Exception:
# Re-raise unexpected errors (programming errors) for proper debugging
raise

finally:
# Restore original values (best effort)
for key, value in original_values.items():
try:
setattr(self, key, value)
except Exception:
pass # Ignore errors during restoration
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The run method modifies instance attributes (cwd, permission_mode, etc.) via setattr without any locking mechanism. If multiple coroutines call run() concurrently on the same agent instance, they could interfere with each other's attribute modifications and restoration, leading to unpredictable behavior. Either document that the agent is not thread-safe and should not be used concurrently, or implement proper synchronization.

Copilot uses AI. Check for mistakes.
)
permission_mode: str = Field(
default="acceptEdits",
description="Permission mode: default|acceptEdits|bypassPermissions|plan"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The permission_mode field describes four valid modes (default|acceptEdits|bypassPermissions|plan) but doesn't explain what each mode does. Users need to understand the security and functional implications of each mode, especially the dangerous "bypassPermissions" mode. Add detailed documentation for each mode.

Suggested change
description="Permission mode: default|acceptEdits|bypassPermissions|plan"
description=(
"Controls how the agent requests and applies potentially dangerous actions or edits. "
"Valid values:\n"
"- 'default': Conservative behavior. The agent requests confirmation before executing "
" actions that may modify files, run commands, or change external state.\n"
"- 'acceptEdits': The agent may automatically apply code or file edits it suggests, "
" but still requires confirmation for high‑risk operations such as running arbitrary "
" shell commands or using powerful tools.\n"
"- 'bypassPermissions': Disables interactive permission checks and safety gating. The "
" agent is allowed to execute actions and apply edits directly, including running "
" commands and using tools without asking for approval. This mode is dangerous and "
" should only be used in tightly controlled, non‑production environments where you "
" fully trust the agent and the code it operates on.\n"
"- 'plan': The agent only analyzes and generates a plan of actions/edits without "
" actually executing commands or applying changes. Use this mode to review what the "
" agent would do before enabling a more permissive mode."
)

Copilot uses AI. Check for mistakes.
)
permission_mode: str = Field(
default="acceptEdits",
description="Permission mode: default|acceptEdits|bypassPermissions|plan"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as CodexAgent - the permission_mode field lists four modes but doesn't explain what each mode does. This is especially important given the security implications of "bypassPermissions". Add detailed documentation for each permission mode.

Suggested change
description="Permission mode: default|acceptEdits|bypassPermissions|plan"
description=(
"Controls how the agent handles permission prompts for edits, tool use, and commands. "
"Supported values:\n"
"- 'default': Use the Claude Code CLI / SDK's default interactive behavior for requesting\n"
" permission before making changes or running tools.\n"
"- 'acceptEdits': Automatically accept and apply code edits suggested by the agent while\n"
" still requiring confirmation for other sensitive actions (e.g., shell commands).\n"
"- 'bypassPermissions': Run tools, apply edits, and execute commands without asking for\n"
" interactive confirmation. This effectively disables permission prompts and should only\n"
" be used in fully trusted, sandboxed environments due to its security implications.\n"
"- 'plan': Focus on generating and updating a plan of actions without automatically\n"
" executing potentially destructive steps; execution typically requires separate approval."
)

Copilot uses AI. Check for mistakes.
Comment on lines +278 to +279
"""Execute the agent with given parameters."""
request = kwargs.pop('request', None) or kwargs.pop('task', None) or kwargs.pop('prompt', None)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as CodexAgent - the call method accepts three different parameter names (request, task, prompt) which can lead to ambiguity. If multiple are provided, the precedence is undocumented. Consider standardizing on a single parameter name or clearly documenting the precedence order.

Suggested change
"""Execute the agent with given parameters."""
request = kwargs.pop('request', None) or kwargs.pop('task', None) or kwargs.pop('prompt', None)
"""Execute the agent with given parameters.
Accepts a single text argument under one of the aliases: 'request', 'task', or 'prompt'.
If more than one of these is provided, a ValueError is raised to avoid ambiguity.
"""
request = kwargs.pop("request", None)
task = kwargs.pop("task", None)
prompt = kwargs.pop("prompt", None)
# Ensure we don't silently ignore conflicting inputs
provided_count = sum(v is not None for v in (request, task, prompt))
if provided_count > 1:
raise ValueError(
"ClaudeCodeAgent.__call__ accepts only one of 'request', 'task', or 'prompt'. "
f"Received: request={request!r}, task={task!r}, prompt={prompt!r}"
)
# Preserve existing precedence: request > task > prompt
if request is None:
request = task if task is not None else prompt

Copilot uses AI. Check for mistakes.
Comment on lines +344 to +351
except Exception as e:
# If validation fails, restore any attributes set so far
for restore_key, restore_value in original_values.items():
try:
setattr(self, restore_key, restore_value)
except Exception:
pass # Best effort restoration
return f"Error: Failed to set attribute '{key}': {str(e)}"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling in the run method's kwargs modification loop (lines 344-351) restores attributes in a try-except that silently swallows exceptions with pass. If restoration fails for some attributes, the agent could be left in an inconsistent state. The comment says "Best effort restoration" but it would be better to log these failures or at least track which restorations failed.

Copilot uses AI. Check for mistakes.
@didiforgithub
Copy link
Collaborator

  1. Add use case and config (how to add codex config, and how to generate an environment via codex instead of mini-swe)
  2. Delete Claude Code if you don't verify it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants