Context
PR #1092 added E2E tests for credential sanitization and Telegram injection, but the tests reimplement production logic inline rather than calling the real code. Carlos flagged this in review — the tests can pass even if production regresses.
Problems
test/e2e/test-credential-sanitization.sh
- stripCredentials() + isCredentialField() reimplemented 3x (C1-C5, C12, C13): the test copy-pastes CREDENTIAL_FIELDS, CREDENTIAL_FIELD_PATTERN, and both functions into node -e heredocs instead of importing from migration-state.ts.
- walkAndRemoveFile() reimplemented 2x (C1-C5, C8): production uses copyDirectory() with a CREDENTIAL_SENSITIVE_BASENAMES filter, while the test implements its own recursive walk.
- sanitizeConfigFile() behavior has drifted: production runs delete config.gateway and then stripCredentials(). The test does NOT delete gateway; it strips fields inside it. C4b (gateway.mode preserved) therefore tests the wrong behavior.
- verifyBlueprintDigest / verifyDigest reimplemented (C9-C11): these are self-fulfilling tests that define their own verification logic and never call production's computeFileDigest().
- Python dependency for JSON parsing (C3-C4): uses python3 -c "import json..." instead of node -e.
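The sanitizeConfigFile() drift can be made concrete with a small sketch. Everything below is a hypothetical stand-in, not the code in migration-state.ts: the field list, the pattern, and the function bodies are assumptions chosen only to mirror the behavior described above (delete config.gateway, then strip credentials). It shows the assertion C4b would need under that behavior: gateway is absent from the output, rather than gateway.mode being preserved.

```javascript
// Hypothetical stand-ins: the real CREDENTIAL_FIELDS and CREDENTIAL_FIELD_PATTERN
// live in migration-state.ts and may differ from these illustrative values.
const CREDENTIAL_FIELDS = new Set(["apiKey", "token", "password"]);
const CREDENTIAL_FIELD_PATTERN = /(secret|token|password|apikey)$/i;

function isCredentialField(name) {
  return CREDENTIAL_FIELDS.has(name) || CREDENTIAL_FIELD_PATTERN.test(name);
}

// Recursively copy a JSON-like value, dropping any key that looks like a credential.
function stripCredentials(value) {
  if (Array.isArray(value)) return value.map(stripCredentials);
  if (value === null || typeof value !== "object") return value;
  const out = {};
  for (const [key, child] of Object.entries(value)) {
    if (isCredentialField(key)) continue;
    out[key] = stripCredentials(child);
  }
  return out;
}

// Mirrors the order described in the issue: drop `gateway` entirely, then strip.
function sanitizeConfig(config) {
  const copy = { ...config };
  delete copy.gateway;
  return stripCredentials(copy);
}

const sanitized = sanitizeConfig({
  gateway: { mode: "local", authToken: "abc" },
  provider: { name: "example", apiKey: "sk-123" },
});

console.log(JSON.stringify(sanitized)); // {"provider":{"name":"example"}}
```

A test that instead asserts gateway.mode survives would fail against this behavior, which is the mismatch the third bullet describes.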
test/e2e/test-telegram-injection.sh
- send_message_to_sandbox() is dead code: defined but never called. All tests use inline SSH.
- Tests bypass runAgentInSandbox() / shellQuote(): T1-T4 and T8 use MSG=$(cat) && echo "$MSG" over SSH, a different code path from production's shellQuote() in bin/lib/runner.js. A regression in shellQuote() would not be caught.
- sandbox_exec() still fails open: this copy was not updated with the fail-closed fix applied to the credential test.
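To illustrate the coverage gap, here is a minimal sketch of a test that pushes hostile messages through a quoting function and a real shell, rather than through MSG=$(cat). The shellQuoteSketch function is an assumption: it uses the common POSIX single-quote escaping idiom and is not a copy of shellQuote() in bin/lib/runner.js. In a real test the production function would be imported in its place. The sketch also runs against a local sh rather than over SSH.

```javascript
const { execFileSync } = require("node:child_process");

// Assumed stand-in for production's shellQuote(): wrap in single quotes and
// rewrite each embedded single quote as '\'' (close, escaped quote, reopen).
function shellQuoteSketch(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Echo the payload back through a real shell. If quoting is broken, the
// payload is interpreted instead of printed and the round trip fails.
function roundTrip(payload, quote) {
  const command = "printf '%s' " + quote(payload);
  return execFileSync("sh", ["-c", command], { encoding: "utf8" });
}

const payloads = [
  "hello; echo INJECTED",
  "$(echo INJECTED)",
  "`echo INJECTED`",
  "it's a 'quoted' message",
  "line one\nline two",
];

// Collect every payload that did not come back byte-for-byte.
const failures = payloads.filter((p) => roundTrip(p, shellQuoteSketch) !== p);
console.log(failures.length === 0 ? "all payloads round-tripped" : failures);
```

Because the quoting function is passed in as a parameter, swapping in the production export exercises the same code path that the bridge uses, so a regression there would surface as a failed round trip.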
Required changes
Production code
nemoclaw/src/commands/migration-state.ts: export isCredentialField, stripCredentials, sanitizeConfigFile, computeFileDigest, CREDENTIAL_FIELDS, CREDENTIAL_FIELD_PATTERN (or extract to a shared module).
scripts/telegram-bridge.js: export runAgentInSandbox for testability.
Test code
- Replace all inline stripCredentials/walkAndRemoveFile/verifyDigest with require() of real code.
- Fix sanitizeConfigFile drift (gateway deletion vs stripping).
- Replace python3 JSON parsing with node.
- Wire telegram injection tests through shellQuote() / runAgentInSandbox().
- Remove or use send_message_to_sandbox() dead code.
- Fix sandbox_exec() fail-closed in telegram test.
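One way to replace the inline copies is for each node -e heredoc to load the production module and refuse to run if an expected export is missing. The sketch below assumes a require()-able build of migration-state.ts exists at some path; that path and the helper name loadProductionExports are hypothetical. The demo at the bottom uses a throwaway module written to a temp directory so the sketch runs on its own.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: load a module and fail closed if any expected export
// is missing, so a test cannot silently fall back to an inline copy.
function loadProductionExports(modulePath, names) {
  const mod = require(path.resolve(modulePath));
  const missing = names.filter((name) => !(name in mod));
  if (missing.length > 0) {
    throw new Error(`missing exports from ${modulePath}: ${missing.join(", ")}`);
  }
  return Object.fromEntries(names.map((name) => [name, mod[name]]));
}

// Demo only: a stand-in module, NOT the real migration-state build output.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "exports-demo-"));
const fakeModule = path.join(dir, "fake-migration-state.js");
fs.writeFileSync(
  fakeModule,
  "exports.isCredentialField = (name) => /token$/i.test(name);\n"
);

// Expected export present: the test gets the loaded function.
const { isCredentialField } = loadProductionExports(fakeModule, ["isCredentialField"]);
console.log(isCredentialField("authToken")); // true

// Expected export absent: the helper throws instead of letting the test proceed.
let failedClosed = false;
try {
  loadProductionExports(fakeModule, ["stripCredentials"]);
} catch (err) {
  failedClosed = true;
}
console.log(failedClosed); // true
```

Throwing on a missing export means that if production stops exporting a function, the test errors out instead of passing against stale logic.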
References
shellQuote and validateName are already exported from bin/lib/runner.js ✅