fix: resolve env-var placeholders in Codex/Gemini self-defined stdio MCP (closes #1266)#1277
fix: resolve env-var placeholders in Codex/Gemini self-defined stdio MCP (closes #1266)#1277edenfunf wants to merge 4 commits into
Conversation
…MCP (closes microsoft#1266) The Codex and Gemini adapters wrote the env block of self-defined stdio MCP servers to disk verbatim, bypassing the env-var resolution pipeline. All three placeholder forms (${VAR}, ${env:VAR}, legacy <VAR>) landed as literal strings in ~/.codex/config.toml and .gemini/settings.json, so the MCP child processes received e.g. ATLASSIAN_API_TOKEN=${ATLASSIAN_API_TOKEN} and authentication failed. Root cause was twofold. The Codex and Gemini _raw_stdio branches never called _resolve_environment_variables. Even adapters that did call it (Cursor) hit a latent gap: the legacy-mode branch only handled the registry list-of-dict shape, so the dict-shape input from self-defined stdio deps was iterated as keys, every key failed isinstance(..., dict), and the env block came out as {}. Codex additionally extends MCPClientAdapter directly, not CopilotClientAdapter, so the resolver method wasn't even available on the class. Fix moves _resolve_environment_variables, _resolve_env_variable, _resolve_variable_placeholders, the env-placeholder regex helpers, and the per-server placeholder state to MCPClientAdapter so every adapter shares one resolver contract. Adds a legacy-mode dict-shape branch (matches the fix in microsoft#1224 for Claude) and rewrites Codex/Gemini's _raw_stdio branches to route env and args through the resolver. Cursor's silent-drop latent bug dies as a byproduct of consolidating the resolver onto the base; pinned by a new regression test.
There was a problem hiding this comment.
Pull request overview
This PR fixes placeholder resolution for self-defined stdio MCP servers declared in apm.yml, ensuring Codex and Gemini no longer write ${VAR} / ${env:VAR} / <VAR> placeholders verbatim into their generated config files. It also centralizes env/arg placeholder resolution logic so all adapters share the same behavior, and adds regression coverage for Codex/Gemini/Cursor plus the shared resolver.
Changes:
- Move env/arg placeholder resolution helpers into
MCPClientAdapterso all client adapters share a single resolver contract. - Update Codex and Gemini
_raw_stdioformatting to routeenvthrough_resolve_environment_variablesandargsthrough_resolve_variable_placeholders. - Add regression tests covering the fixed placeholder resolution behavior and the previously latent dict-shape env drop.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/adapters/client/base.py |
Adds shared env/arg placeholder resolution and prompt-suppression policy to the base adapter. |
src/apm_cli/adapters/client/copilot.py |
Removes duplicated resolver helpers and imports shared helpers from the base adapter. |
src/apm_cli/adapters/client/codex.py |
Ensures self-defined stdio env/args flow through the shared resolver pipeline. |
src/apm_cli/adapters/client/gemini.py |
Ensures self-defined stdio env/args flow through the shared resolver pipeline. |
tests/unit/test_copilot_adapter.py |
Adds base-resolver regression tests for legacy-mode dict-shaped env blocks. |
tests/unit/test_cursor_mcp.py |
Adds a regression test proving Cursor no longer drops dict-shaped stdio env blocks. |
tests/unit/test_gemini_mcp.py |
Adds Gemini regression tests for resolving all placeholder syntaxes (env + args). |
tests/test_codex_empty_string_and_defaults.py |
Adds Codex regression tests for resolving all placeholder syntaxes (env + args). |
- Loosen translate-mode dict ``translated`` annotation. The branch passes
non-string YAML scalars through verbatim, so ``dict[str, str]`` was a
type liar; ``dict`` keeps the existing behaviour without misleading
mypy.
- Move ``TestCursorSelfDefinedStdioEnvResolution`` above the
``__main__`` guard in tests/unit/test_cursor_mcp.py so direct
execution (``python tests/unit/test_cursor_mcp.py``) includes the
new regression class. pytest discovery was unaffected.
- Remove the stale manifest-schema line claiming Codex passes
``${VAR}`` / ``${env:VAR}`` through verbatim; after this PR Codex
resolves all three placeholder syntaxes consistent with the table's
Copilot CLI / Codex grouping.
# Conflicts: # docs/src/content/docs/reference/manifest-schema.md # src/apm_cli/adapters/client/copilot.py
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | Solid refactor: moving env-var resolution to MCPClientAdapter base is the right architectural move. One missed deduplication (_stringify_env_literal still in copilot.py) and one pre-existing inheritance smell (Gemini extends Copilot). No correctness or security regressions found. |
| CLI Logging Expert | 0 | 1 | 1 | Removal of bare print() for APM_E2E_TESTS is correct. _warn_input_variables now the canonical cross-adapter path still uses bare print() instead of _rich_warning(). |
| DevX UX Expert | 0 | 1 | 2 | The fix is a genuine correctness win. Doc table omits Gemini and Cursor, leaving users of those targets without guidance. |
| Supply Chain Security | 0 | 1 | 1 | Translate/legacy mode gate correctly implemented. Regex safe. Config files containing resolved secrets written with default umask (world-readable risk). |
| OSS Growth Hacker | 0 | 3 | 1 | Fix removes silent failure for highest-value user cohort. Doc update incomplete; CHANGELOG has no entry for this fix. |
| Auth Expert | 0 | 1 | 2 | Architecturally sound. Group indices correct, translate-mode isolated. One tracking-only gap in translate-mode branch. No blocking auth correctness issues. |
| Doc Writer | 1 | 2 | 1 | PR removes one stale claim but introduces/retains a second inaccurate claim; resolution table structurally incomplete; Unicode em dash needs fixing. |
| Test Coverage Expert | 0 | 1 | 1 | Core env-resolution and bool-stringify surfaces covered. _should_skip_env_prompts centralisation has no unit test defending its three-branch TTY/CI policy. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Doc Writer] (blocking-severity) Fix false claim on manifest-schema.md line 454:
<VAR>is described as resolved 'only by Copilot CLI and Codex' -- this is factually wrong after this PR -- Gemini and Cursor also resolve it. One sentence edit; high trust cost if left at merge. - [Supply Chain Security Expert] Config files containing resolved secrets written with default umask (world-readable 0o644) -- add os.chmod(0o600) after write in Codex, Gemini, Cursor adapters -- This PR routes more env vars through the write path, making the permissions gap more acute. One-liner fix per adapter.
- [Test Coverage Expert] Add unit tests for _should_skip_env_prompts three-branch TTY/CI policy (env_overrides truthy, APM_E2E_TESTS=1, non-TTY) -- Missing-test gap on a secure-by-default surface: if the env_overrides branch regresses, CI runs with --env-overrides {} would get unexpected interactive prompts.
- [DevX UX Expert] Emit _rich_warning when skip_prompting=True and a placeholder cannot be resolved -- ${TOKEN} is currently written verbatim with no user signal -- Two panelists independently flagged this silent-failure path; it partially undoes the fix for CI/non-TTY flows.
- [OSS Growth Hacker] Add CHANGELOG [Unreleased] Fixed entry and update manifest-schema.md table header and bullets to include Gemini and Cursor -- Three panelists converged on the same doc gaps. CHANGELOG entry is a release-note beat for the highest-value user cohort.
Architecture
classDiagram
direction LR
class MCPClientAdapter {
<<Abstract>>
+target_name str
+mcp_servers_key str
+_supports_runtime_env_substitution bool
+_DEFAULT_GITHUB_ENV ClassVar~dict~
+_last_env_placeholder_keys set
+_last_legacy_angle_vars set
+_resolve_environment_variables(env_vars, env_overrides) dict
+_resolve_env_variable(name, value, env_overrides) str
+_resolve_variable_placeholders(value, resolved_env, runtime_vars) str
+configure_mcp_server()*
+get_config_path()*
+update_config()*
+get_current_config()*
}
note for MCPClientAdapter "Template Method: base owns resolution protocol"
class CopilotClientAdapter {
<<ConcreteAdapter>>
+_supports_runtime_env_substitution = True
+_collect_previously_baked_keys()
+_emit_install_summary()
+configure_mcp_server()
}
class CodexClientAdapter {
<<ConcreteAdapter>>
+_supports_runtime_env_substitution = False
+_format_server_config()
+configure_mcp_server()
}
class GeminiClientAdapter {
<<ConcreteAdapter>>
+_supports_runtime_env_substitution = False
+_format_server_config()
+configure_mcp_server()
}
MCPClientAdapter <|-- CopilotClientAdapter
CopilotClientAdapter <|-- GeminiClientAdapter
MCPClientAdapter <|-- CodexClientAdapter
classDef touched fill:#fff3b0,stroke:#d47600
class MCPClientAdapter:::touched
class CopilotClientAdapter:::touched
class CodexClientAdapter:::touched
class GeminiClientAdapter:::touched
flowchart TD
A([apm install - stdio server with env block]) --> B[adapter._format_server_config]
B --> C[_resolve_environment_variables\nbase.py MCPClientAdapter]
C --> D{_supports_runtime_env_substitution?}
D -- True - Copilot only --> E[translate mode:\nemit dollar-brace placeholder verbatim]
D -- False - Codex / Gemini / Cursor --> F[legacy mode:\nenv_overrides -> os.environ -> prompt]
E --> G[config file: runtime placeholders]
F --> H[config file: literal resolved values]
B --> I[_resolve_variable_placeholders\nfor args list]
I --> J{_supports_runtime_env_substitution?}
J -- True --> K[emit placeholder verbatim in args]
J -- False --> L[resolve arg placeholders to literals]
G --> M([config written])
H --> M
K --> M
L --> M
Recommendation
The core fix is correct, well-scoped, and closes a real silent failure for Codex, Gemini, and Cursor users. The panel found no blocking correctness or security regressions introduced by this PR. One item must be addressed before or at merge: the doc-writer's blocking finding on manifest-schema.md line 454 -- the word 'only' is factually wrong about the behavior this PR ships and will actively mislead Gemini/Cursor users. This is a single-sentence edit. All remaining findings are recommended or nit tier and should be filed as follow-up issues: config file permissions (supply-chain), _should_skip_env_prompts unit tests (test-coverage), unresolved-placeholder warning (devx-ux + supply-chain convergence), and CHANGELOG + doc table cleanup (oss-growth-hacker + doc-writer + devx-ux convergence). Ship once line 454 is corrected.
Full per-persona findings
Python Architect
-
[recommended]
_stringify_env_literalstill duplicated in copilot.py after moving canonical copy to base.py atsrc/apm_cli/adapters/client/copilot.py:24
base.py defines_stringify_env_literalas a module-level pure helper. copilot.py still defines its own identical copy. Future divergence is a latent correctness risk.
Suggested: Remove the copilot.py copy and add_stringify_env_literalto the existing base import line. -
[nit] GeminiClientAdapter inherits CopilotClientAdapter not MCPClientAdapter then overrides
_supports_runtime_env_substitutionback to False atsrc/apm_cli/adapters/client/gemini.py:38
Pre-existing smell, but refactor makes it more visible. Gemini no longer needs Copilot-specific methods that are dead on its path.
Suggested: Consider making GeminiClientAdapter extend MCPClientAdapter directly in a follow-up.
CLI Logging Expert
-
[recommended]
_warn_input_variablesemits via bareprint()instead of_rich_warning()-- now the canonical cross-adapter path atsrc/apm_cli/adapters/client/base.py:281
The PR promotes this method to shared base, making it the canonical warning path for every adapter. Bare print() bypasses quiet/verbose mode.
Suggested: Replace print() with _rich_warning() and import _rich_warning at top of base.py. -
[nit]
self._last_legacy_angle_vars = set()reset in copilot.py configure_mcp_server is redundant with base init atsrc/apm_cli/adapters/client/copilot.py:196
Dual-initialisation adds confusion risk.
DevX UX Expert
-
[recommended] Doc table column header omits Gemini and Cursor -- users of those targets get no guidance at
docs/src/content/docs/reference/manifest-schema.md
Table reads 'VS Code | Copilot CLI / Codex'. Gemini and Cursor are absent. Post-fix, they resolve all three syntaxes identically to Codex.
Suggested: Rename column to 'Copilot CLI / Codex / Gemini / Cursor' and add a bullet for those targets. -
[nit] Silent round-trip for unresolvable placeholders is undocumented -- users will be confused when auth fails again at
src/apm_cli/adapters/client/base.py
When skip_prompting=True and no value found, ${TOKEN} is written verbatim with no warning. A CI run will succeed, config will contain ${TOKEN}, auth fails at runtime.
Suggested: Emit a _rich_warning when the fallback branch is taken. -
[nit]
<VAR>legacy recommendation bullet uses 'Copilot CLI and Codex' -- inconsistent with PR scope atdocs/src/content/docs/reference/manifest-schema.md
Post-fix, Gemini and Cursor also resolve<VAR>.
Supply Chain Security Expert
-
[recommended] Legacy adapter config files containing resolved secrets written with default umask (world-readable risk) at
src/apm_cli/adapters/client/codex.py, gemini.py, cursor.py
Codex, Gemini, Cursor write config files using open() with no explicit mode. Default umask 0o022 yields 0o644 -- world-readable. Pre-existing gap made more acute by this PR routing more env vars through.
Suggested: Use os.chmod(config_path, 0o600) after write or write atomically via tempfile with explicit permissions. -
[nit] Unresolved placeholder silently preserved in legacy mode config when prompts are skipped at
src/apm_cli/adapters/client/base.py:501
When skip_prompting=True and no value found, ${TOKEN} is written verbatim. No warning emitted. Auth will fail at runtime with no install-time signal.
OSS Growth Hacker
-
[recommended] Table header 'Copilot CLI / Codex' is stale -- Gemini and Cursor now also resolve placeholders at
docs/src/content/docs/reference/manifest-schema.md
Suggested: Rename to 'Copilot CLI / Codex / Gemini / Cursor'. -
[recommended] Bullet still says
<VAR>is 'only resolved by Copilot CLI and Codex' atdocs/src/content/docs/reference/manifest-schema.md
Suggested: Extend to name all four resolvers. -
[recommended] No CHANGELOG entry for this fix at
CHANGELOG.md
Current [Unreleased] Fixed section has entries for Claude stdio env ([BUG] Claude .mcp.json drops env for self-defined stdio MCP servers #1222) but nothing for Codex/Gemini/Cursor placeholder resolution ([BUG] Codex and Gemini adapters pass self-defined stdio env verbatim, skipping placeholder resolution #1266).
Suggested: Add: 'Self-defined stdio MCP server env and args placeholders are now resolved at install time for Codex, Gemini, and Cursor targets; previously placeholder strings were written verbatim, causing silent auth failures. ([BUG] Codex and Gemini adapters pass self-defined stdio env verbatim, skipping placeholder resolution #1266)' -
[nit] Line 453 Copilot CLI bullet omits Gemini and Cursor which now behave identically at
docs/src/content/docs/reference/manifest-schema.md
Auth Expert
-
[recommended]
_resolve_env_variabletranslate-mode scans pre-translated value for placeholder tracking -- misses<LEGACY_VAR>header keys atsrc/apm_cli/adapters/client/base.py:480
_last_env_placeholder_keyspopulated by_ENV_VAR_RE.finditer(value)on the original value before translation._ENV_VAR_REonly matches${...}syntax; it does not match<LEGACY_VAR>. Post-install deprecation warning will silently omit legacy-syntax header vars.
Suggested: Scan the translated result instead:for m in _ENV_VAR_RE.finditer(translated): self._last_env_placeholder_keys.add(m.group(1)) -
[nit] Empty-string env_overrides entry silently falls through to os.environ at
src/apm_cli/adapters/client/base.py:495
env_overrides.get(env_name) or os.getenv(env_name): empty string causes silent fallback. Pre-existing behavior, not introduced by this PR. -
[nit] Codex retains parallel
_process_environment_variableswith hardcodedDEFAULT_GITHUB_ENVdict atsrc/apm_cli/adapters/client/codex.py:402
Definesdefault_github_envinline rather than referencingself._DEFAULT_GITHUB_ENVfrom base. Future drift risk.
Doc Writer
-
[blocking] False claim:
<VAR>described as resolved 'only by Copilot CLI and Codex' after PR extends resolution to Gemini and Cursor atdocs/src/content/docs/reference/manifest-schema.md:454
The word 'only' is factually wrong. This PR routes the env dict through_resolve_environment_variablesin gemini.py and cursor.py. A user targeting Gemini or Cursor will incorrectly believe<VAR>passes through verbatim.
Suggested: Replace with: '<VAR>is legacy; in VS Code it is not recognized and renders as literal text. All other targets (Copilot CLI, Codex, Gemini, Cursor) resolve it at install time for back-compat.' -
[recommended] Resolution table omits Gemini and Cursor entirely at
docs/src/content/docs/reference/manifest-schema.md
Table has only VS Code and 'Copilot CLI / Codex' columns. Post-fix, Gemini and Cursor resolve all three syntaxes at install time identically.
Suggested: Rename column header to 'Copilot CLI / Codex / Gemini / Cursor'. -
[recommended] Copilot CLI bullet omits Gemini and Cursor which now behave identically at
docs/src/content/docs/reference/manifest-schema.md:453
Suggested: Add a new bullet covering Codex / Gemini / Cursor install-time resolution behavior. -
[nit] Unicode em dash (U+2014) used instead of ASCII hyphen in lines 454-455 at
docs/src/content/docs/reference/manifest-schema.md
Project requires ASCII only in doc prose.
Suggested: Replace each em dash with ' - ' (ASCII hyphen with surrounding spaces).
Test Coverage Expert
-
[recommended]
_should_skip_env_promptshas no unit test defending its three-branch policy attests/unit/test_base_adapter.py
New centralized helper replaces per-adapter TTY checks. Three branches: (1) env_overrides truthy -> skip, (2) APM_E2E_TESTS=1 -> skip, (3) stdin/stdout not TTY -> skip. Zero test hits in adapter unit tests.
Proof (missing):tests/unit/test_base_adapter.py::TestShouldSkipEnvPrompts-- proves: When a caller passes env_overrides, interactive prompts are suppressed; empty dict does not suppress; APM_E2E_TESTS=1 suppresses in CI. [devx, secure-by-default]
Suggested: Add TestShouldSkipEnvPrompts with parametrized cases: env_overrides={'K':'V'} -> True, env_overrides={} -> False, APM_E2E_TESTS=1 -> True, non-TTY -> True. -
[nit] Gemini and Codex suites do not exercise bool/None scalar coercion through
_stringify_env_literal
Bool->false and None-omission IS covered by test_copilot_adapter.py::test_dict_shaped_env_block_stringifies_literals_and_omits_none (unit tier, passed). Flagging as nit: one parametrize entry in each adapter's test suite would make the guarantee adapter-local.
Proof (passed):tests/unit/test_copilot_adapter.py::test_dict_shaped_env_block_stringifies_literals_and_omits_none-- proves: YAML bool False serialises as lowercase 'false'; None keys are omitted.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Note
🔒 Integrity filter blocked 2 items
The following items were blocked because they don't meet the GitHub integrity level.
- fix: resolve env-var placeholders in Codex/Gemini self-defined stdio MCP (closes #1266) #1277
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved". - #1277
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter:
tools:
github:
min-integrity: approved # merged | approved | unapproved | noneGenerated by PR Review Panel for issue #1277 · ● 4.2M · ◷
Summary
Self-defined stdio MCP servers declared in
apm.ymlhad theirenvblock written verbatim to~/.codex/config.tomland.gemini/settings.json. All three placeholder syntaxes (${VAR},${env:VAR}, legacy<VAR>) ended up as literal strings, causing the MCP child process to fail authentication.This PR resolves them at install time, matching the Copilot/Cursor pattern and consistent with the documented manifest-schema behaviour.
Root cause
_raw_stdiobranch assignedraw["env"]directly toconfig["env"], skipping the resolver entirely._resolve_environment_variables(Cursor) hit a latent gap: the legacy-mode branch only handled the registry list-of-dict shape, so the dict-shape input from self-defined stdio deps was iterated as keys, every key failedisinstance(env_var, dict), and the env block came out as{}.MCPClientAdapterdirectly (notCopilotClientAdapter), so the resolver wasn't even available on the class.Fix
_resolve_environment_variables,_resolve_env_variable,_resolve_variable_placeholders, the env-placeholder regex helpers, and the per-server placeholder state toMCPClientAdapterso every adapter shares one resolver contract._resolve_environment_variables(matches the fix in fix: preserve Claude stdio MCP env #1224 for Claude). Cursor's silent-drop latent bug dies as a byproduct._raw_stdiobranch to routeenvthrough the resolver andargsthrough_resolve_variable_placeholders, with the same shape as the existing Cursor/Copilot branches._should_skip_env_promptshelper.Production code is net ~0 lines (mostly relocation); tests add ~180 lines of regression coverage.
Related
Test plan
apm install -t codex,gemini,cursorwith all three placeholder syntaxes inapm.yml; resolved values written to disk for legacy-mode adapters; Copilot translate-mode unchanged (placeholders kept as runtime substitution)ruff check+ruff format --checkclean