@@ -351,18 +351,31 @@ def execute_code(self, invocation_context, code_execution_input):
351351 container. This is the most reliable fallback but destroys
352352 in-container state.
353353
354- ** Permissions:** ` os.kill() ` on the host PID requires the ADK process
355- to run as the same user that started the container (or as root). This
356- is the normal case for local Docker usage. For rootless Docker or
357- restricted environments, the container-restart fallback applies.
354+ ** Permissions and user mismatch risk:** ` os.kill() ` on the host PID
355+ requires the ADK process to run as the same user that owns the
356+ container's exec'd process on the host. The current
357+ ` ContainerCodeExecutor ` does not set ` user= ` on ` containers.run() `
358+ (` container_code_executor.py:182 ` ), so the container process runs as
359+ root. If the ADK process runs as a non-root user (common in
360+ development), ` os.kill(host_pid, SIGKILL) ` will raise
361+ ` PermissionError ` and the container-restart fallback activates.
362+
363+ This user mismatch is the expected case for default Docker usage,
364+ so the ` PermissionError → container.restart() ` path is the ** primary
365+ timeout mechanism in practice** , not an edge case. The ` os.kill ` path
366+ becomes primary only when ADK runs as root or the container is
367+ configured with ` user= ` matching the host user.
358368
359369** Recovery after timeout:**
360370- ** Stateless mode:** No recovery needed. The killed process is gone;
361371 the next ` exec_run ` starts a fresh process in the same container.
362- - ** Stateful mode:** The timed-out block is NOT appended to history
363- (append-after-success invariant). If the container was restarted
364- as fallback, the executor must detect this and replay the
365- accumulated history on the next call.
372+ - ** Stateful mode (Option A / persistent process):** The timed-out
373+ block is NOT appended to history (append-after-success invariant).
374+ If the persistent REPL was killed or the container was restarted,
375+ the executor returns an error with ` stderr ` indicating state was
376+ lost. The caller (LLM flow or skill tool) must handle this —
377+ typically by starting a new session. Automatic replay is not
378+ attempted because it may re-execute side effects.
366379
367380#### 4.2.4 ` GkeCodeExecutor ` — Already Implemented
368381
@@ -844,9 +857,18 @@ class LocalSandboxCodeExecutor(BaseCodeExecutor):
844857 allowed_env_vars: list[str ] = []
845858
846859 def execute_code (self , invocation_context , code_execution_input ):
860+ import platform
847861 import subprocess
862+ import sys
848863 import tempfile
849864
865+ # Windows is out of scope (§2 Non-Goals).
866+ if platform.system() == ' Windows' :
867+ raise NotImplementedError (
868+ ' LocalSandboxCodeExecutor is not supported on '
869+ ' Windows. Use ContainerCodeExecutor instead.'
870+ )
871+
850872 with tempfile.NamedTemporaryFile(
851873 mode = ' w' , suffix = ' .py' , delete = True
852874 ) as f:
@@ -866,28 +888,32 @@ class LocalSandboxCodeExecutor(BaseCodeExecutor):
866888 if timeout is None :
867889 timeout = self .max_cpu_seconds
868890
869- import sys
870891 # Prefer process_group (3.11+) over preexec_fn
871892 # (not fork-safe with threads).
872893 spawn_kwargs = {}
873894 if sys.version_info >= (3 , 11 ):
874895 spawn_kwargs[' process_group' ] = 0
875896 else :
876- # Fallback for 3.10; caveat: not fork-safe
897+ # Fallback for 3.10; caveat: not fork-safe.
898+ # Guard resource import for platforms where
899+ # the module is unavailable.
877900 def _set_limits ():
878- import resource
879- resource.setrlimit(
880- resource.RLIMIT_CPU ,
881- (self .max_cpu_seconds,) * 2 ,
882- )
883- mem = self .max_memory_mb * 1024 * 1024
884- resource.setrlimit(
885- resource.RLIMIT_AS , (mem, mem),
886- )
901+ try :
902+ import resource
903+ resource.setrlimit(
904+ resource.RLIMIT_CPU ,
905+ (self .max_cpu_seconds,) * 2 ,
906+ )
907+ mem = self .max_memory_mb * 1024 * 1024
908+ resource.setrlimit(
909+ resource.RLIMIT_AS , (mem, mem),
910+ )
911+ except (ImportError , OSError ):
912+ pass # timeout-only enforcement
887913 spawn_kwargs[' preexec_fn' ] = _set_limits
888914
889- # Guard resource import for platforms where
890- # it is unavailable (falls back to timeout-only) .
915+ # Inline wrapper sets resource limits in the child
916+ # process. Guarded for missing resource module .
891917 limit_code = (
892918 f ' try: \n '
893919 f ' import resource \n '
@@ -1122,8 +1148,11 @@ is new and has no tests yet.
11221148| Unit tests | Mock-based tests per executor | ** Add ` test_container_code_executor.py ` ** , add ` test_local_sandbox_code_executor.py ` |
11231149| Integration tests | Real executor tests (like ` ExecuteSkillScriptTool ` integration tests) | Add Docker-based container tests (CI-gated) |
11241150| Timeout tests | Scripts with ` time.sleep() ` to verify enforcement | Per-executor timeout tests |
1151+ | Timeout kill fallback | Verify ` PermissionError ` from ` os.kill ` triggers container restart | Mock ` os.kill ` to raise ` PermissionError ` , assert ` container.restart() ` called and ` CodeExecutionResult.stderr ` contains timeout message |
1152+ | Timeout kill success | Verify ` os.kill(host_pid) ` path when permitted | Mock ` exec_inspect ` to return PID, assert ` os.kill ` called with correct signal |
11251153| Security tests | Scripts attempting blocked operations | ` restrict_builtins ` bypass attempts, env var leakage |
11261154| Stateful tests | Multi-call sequences verifying variable persistence | Append-after-success, failure-does-not-poison, ` execution_id ` isolation |
1155+ | Stateful crash recovery | Verify error returned on REPL/container crash | Kill REPL mid-execution, assert error indicates state loss |
11271156
11281157---
11291158
@@ -1141,17 +1170,23 @@ is new and has no tests yet.
114111707 . Update ` ExecuteSkillScriptTool ` to set per-invocation timeout via
11421171 ` CodeExecutionInput.timeout_seconds `
11431172
1144- ### Phase 2: Stateful Container (3-5 days)
1173+ ### Phase 2: Stateful Container (5-8 days)
1174+
1175+ Implement Option A (persistent process) directly, as recommended in
1176+ §5.2.2. This avoids the side-effect replay problems of Option C.
11451177
114611781 . Unfreeze ` stateful ` on ` ContainerCodeExecutor `
1147- 2 . Implement cumulative code history with stdout suppression
1148- (append-after-success invariant)
1149- 3 . Add ` execution_id ` -based history isolation
1150- 4 . Wire ` execution_id ` in ` ExecuteSkillScriptTool `
1151- 5 . Add ` reset_state() ` method
1152- 6 . Add stateful execution tests (including failure-does-not-poison test)
1153- 7 . Update samples and documentation
1154- 8 . Evaluate persistent-process approach (Option A) for Phase 2b
1179+ 2 . Design persistent-process protocol: sentinel-delimited I/O for
1180+ output boundaries, error detection, and process health checks
1181+ 3 . Implement persistent Python REPL management (start, send code,
1182+ read output, detect crash/restart)
1183+ 4 . Add ` execution_id ` -based session isolation (one REPL per
1184+ ` execution_id ` )
1185+ 5 . Wire ` execution_id ` in ` ExecuteSkillScriptTool `
1186+ 6 . Add ` reset_state() ` method (kills and restarts the REPL)
1187+ 7 . Add stateful execution tests (variable persistence, crash recovery,
1188+ ` execution_id ` isolation)
1189+ 8 . Update samples and documentation
11551190
11561191### Phase 3: Security Hardening (5-7 days)
11571192
@@ -1171,13 +1206,12 @@ is new and has no tests yet.
11711206
11721207## 9. Open Questions
11731208
1174- 1 . ** Should we skip Phase 1 (cumulative replay) and go straight to
1175- Phase 2 (persistent process) for stateful execution?**
1176- The side-effect replay problem is fundamental to Option C. If the
1177- persistent-process I/O boundary protocol can be solved with
1178- reasonable complexity (e.g., sentinel-delimited output), the MVP
1179- phase may not be worth the tech debt. Decision: Evaluate during
1180- Phase 2 planning.
1209+ 1 . ** What I/O boundary protocol should the persistent REPL use?**
1210+ The roadmap targets Option A (persistent process) directly. The
1211+ key design question is how to delimit output for each code block:
1212+ sentinel strings in stdout, JSON-envelope protocol, or a side
1213+ channel (e.g., file-based result). Sentinel strings are simplest
1214+ but can collide with user output. Decision: spike during Phase 2.
11811215
118212162 . ** Should ` LocalSandboxCodeExecutor ` support stateful execution?**
11831217 Subprocess-based execution is inherently stateless. Stateful support
@@ -1189,12 +1223,14 @@ is new and has no tests yet.
11891223 ` SecurityWarning ` and documentation should steer users toward safer
11901224 alternatives for anything beyond local prototyping.
11911225
1192- 4 . ** How should ` ContainerCodeExecutor ` handle container crashes in
1193- stateful mode?**
1194- If the container crashes (OOM, segfault), the code history is lost.
1195- Options: (a) re-create container and replay history, (b) return error
1196- and let user restart, (c) persist history to host volume. Recommend
1197- (b) for simplicity.
1226+ 4 . ** How should ` ContainerCodeExecutor ` handle container/REPL crashes
1227+ in stateful mode?**
1228+ If the container crashes (OOM, segfault) or the persistent REPL
1229+ dies, in-process state is lost. The executor returns an error
1230+ indicating state loss and lets the caller handle recovery (e.g.,
1231+ start a new session). Automatic replay is not attempted because
1232+ prior code blocks may have had side effects that should not be
1233+ re-executed (consistent with §4.2.3 recovery policy).
11981234
11991235---
12001236
0 commit comments