LCORE-1329: Adding new MCP E2E Tests#1294
LCORE-1329: Adding new MCP E2E Tests#1294jrobertboos wants to merge 8 commits intolightspeed-core:mainfrom
Conversation
- Introduced new YAML configuration files for MCP authentication methods: file, Kubernetes, and OAuth. - Updated existing MCP configuration to include new authentication methods. - Enhanced end-to-end tests to cover scenarios for file-based, Kubernetes, and OAuth authentication. - Added utility functions for unregistering MCP toolgroups in the testing environment. - Implemented new step definitions for checking response body content in tests.
WalkthroughThis PR adds comprehensive MCP authentication testing infrastructure with multiple configuration scenarios across library and server modes. It introduces test configurations for file-based, Kubernetes, OAuth, and client-based authentication schemes, along with supporting utilities and test steps. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Added a new invalid MCP token file for testing purposes. - Updated the Docker Compose configuration to mount the invalid MCP token. - Introduced a new YAML configuration for testing invalid MCP file authentication. - Enhanced the test scenarios to include checks for invalid MCP file authentication. - Updated feature files to reflect the new authentication configurations.
…handling - Added new configurations for invalid, Kubernetes, client, and OAuth MCP authentication methods. - Updated scenario handling to reference the new configuration paths for Kubernetes, client, and OAuth authentication. - Enhanced the testing environment to support the new authentication configurations.
|
So currently the following e2e tests fail: These failing tests can be grouped into the following bugs:
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/features/mcp.feature (1)
440-447:⚠️ Potential issue | 🟡 MinorUse the same OAuth fixture token as the other happy-path cases.
The server-mode query happy path right above this uses
Bearer oauth-test-token, but this streaming case switches toBearer test-token. If those fixtures diverge, this scenario stops covering the same auth path and will fail independently.🔧 Suggested fix
And I set the "MCP-HEADERS" header to """ - {"mcp-oauth": {"Authorization": "Bearer test-token"}} + {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}} """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/features/mcp.feature` around lines 440 - 447, The streaming-query scenario "Check if streaming_query endpoint succeeds when MCP OAuth auth token is passed" uses a different fixture token ("Bearer test-token") than the other happy-path cases; update the MCP-HEADERS block in that Scenario to use the same fixture token "Bearer oauth-test-token" so it matches the server-mode happy-path tests and continues to exercise the same OAuth fixture.
🧹 Nitpick comments (3)
tests/e2e/utils/llama_stack_tools.py (2)
35-48: Explicitreturn Noneis unnecessary.Line 45 returns
Noneexplicitly, but the function return type isNone. A barereturnor simply omitting it would be clearer.✨ Minor cleanup
if e.status_code == 400 and "not found" in str(e).lower(): - return None + return raise🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/utils/llama_stack_tools.py` around lines 35 - 48, The function _unregister_toolgroup_async contains an explicit "return None" inside the APIStatusError handler which is redundant given the annotated return type is None; remove the explicit "return None" (either replace with a bare "return" or just omit the return entirely) in the block that checks if e.status_code == 400 and "not found" in str(e).lower(), leaving the rest of the error handling (raising other errors) and the finally block that calls await client.close() unchanged.
51-65: Multiple client instances created during iteration.The
_unregister_mcp_toolgroups_asyncfunction creates a client at line 53 to list toolgroups, then_unregister_toolgroup_asynccreates a new client for each toolgroup to unregister. This results in N+1 client instances for N toolgroups, each requiring cleanup.Consider reusing the client instance:
♻️ Proposed refactor to reuse client
async def _unregister_mcp_toolgroups_async() -> None: """Unregister all MCP toolgroups.""" client = _get_llama_stack_client() try: toolgroups = await client.toolgroups.list() for toolgroup in toolgroups: if ( toolgroup.identifier and toolgroup.provider_id == "model-context-protocol" ): - await _unregister_toolgroup_async(toolgroup.identifier) + try: + await client.toolgroups.unregister(toolgroup.identifier) + except APIStatusError as e: + # 400 "not found": toolgroup already absent, continue + if not (e.status_code == 400 and "not found" in str(e).lower()): + raise except APIConnectionError: raise finally: await client.close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/utils/llama_stack_tools.py` around lines 51 - 65, The function _unregister_mcp_toolgroups_async currently creates a client to list toolgroups but calls _unregister_toolgroup_async for each item which creates its own client per toolgroup; refactor to reuse a single Llama Stack client by modifying _unregister_toolgroup_async (or adding a helper like _unregister_toolgroup_with_client) to accept an existing client instance and use that to perform the unregister operation, then call that variant from _unregister_mcp_toolgroups_async to avoid N+1 clients; ensure the shared client is closed once in the outer function’s finally block and preserve the existing APIConnectionError rethrow behavior.tests/e2e/features/mcp.feature (1)
421-428: Drop the unusedmcp-clientheader from this server-mode-only scenario.This scenario is already gated with
@skip-in-library-mode, so it runs against the server-mode OAuth config.tests/e2e/configuration/server-mode/lightspeed-stack-mcp-oauth-auth.yamlonly registersmcp-oauthon Lines 21-25, so the extramcp-cliententry does not participate in the test and makes failures harder to attribute.✂️ Suggested cleanup
And I set the "MCP-HEADERS" header to """ - {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}} + {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}} """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/features/mcp.feature` around lines 421 - 428, Remove the unused "mcp-client" header from the MCP-HEADERS payload in the Scenario "Check if query endpoint succeeds when MCP OAuth auth token is passed" so the test only sets {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}; locate the Scenario block that sets the "MCP-HEADERS" header and delete the "mcp-client" JSON entry to match the server-mode MCP OAuth config (registered as mcp-oauth).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 10-12: Replace blanket `@skip` tags that disable tests
unconditionally with the targeted `@skip-in-library-mode` tag so the /tools and
invalid-auth scenarios still run in non-library modes; specifically, for the
Scenario "Check if tools endpoint succeeds when MCP file-based auth token is
passed" and the other scenarios at the same locations (lines referenced in the
review), change the tag from `@skip` to `@skip-in-library-mode` (keep
`@MCPFileAuthConfig` intact) so only library-mode runs are skipped while
preserving these regression checks.
---
Outside diff comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 440-447: The streaming-query scenario "Check if streaming_query
endpoint succeeds when MCP OAuth auth token is passed" uses a different fixture
token ("Bearer test-token") than the other happy-path cases; update the
MCP-HEADERS block in that Scenario to use the same fixture token "Bearer
oauth-test-token" so it matches the server-mode happy-path tests and continues
to exercise the same OAuth fixture.
---
Nitpick comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 421-428: Remove the unused "mcp-client" header from the
MCP-HEADERS payload in the Scenario "Check if query endpoint succeeds when MCP
OAuth auth token is passed" so the test only sets {"mcp-oauth":
{"Authorization": "Bearer oauth-test-token"}}; locate the Scenario block that
sets the "MCP-HEADERS" header and delete the "mcp-client" JSON entry to match
the server-mode MCP OAuth config (registered as mcp-oauth).
In `@tests/e2e/utils/llama_stack_tools.py`:
- Around line 35-48: The function _unregister_toolgroup_async contains an
explicit "return None" inside the APIStatusError handler which is redundant
given the annotated return type is None; remove the explicit "return None"
(either replace with a bare "return" or just omit the return entirely) in the
block that checks if e.status_code == 400 and "not found" in str(e).lower(),
leaving the rest of the error handling (raising other errors) and the finally
block that calls await client.close() unchanged.
- Around line 51-65: The function _unregister_mcp_toolgroups_async currently
creates a client to list toolgroups but calls _unregister_toolgroup_async for
each item which creates its own client per toolgroup; refactor to reuse a single
Llama Stack client by modifying _unregister_toolgroup_async (or adding a helper
like _unregister_toolgroup_with_client) to accept an existing client instance
and use that to perform the unregister operation, then call that variant from
_unregister_mcp_toolgroups_async to avoid N+1 clients; ensure the shared client
is closed once in the outer function’s finally block and preserve the existing
APIConnectionError rethrow behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b5f69e52-f673-4161-9333-abb5b7d2fdbd
📒 Files selected for processing (18)
docker-compose.yamltests/e2e/configuration/library-mode/lightspeed-stack-invalid-mcp-file-auth.yamltests/e2e/configuration/library-mode/lightspeed-stack-mcp-client-auth.yamltests/e2e/configuration/library-mode/lightspeed-stack-mcp-file-auth.yamltests/e2e/configuration/library-mode/lightspeed-stack-mcp-kubernetes-auth.yamltests/e2e/configuration/library-mode/lightspeed-stack-mcp-oauth-auth.yamltests/e2e/configuration/library-mode/lightspeed-stack-mcp.yamltests/e2e/configuration/server-mode/lightspeed-stack-invalid-mcp-file-auth.yamltests/e2e/configuration/server-mode/lightspeed-stack-mcp-client-auth.yamltests/e2e/configuration/server-mode/lightspeed-stack-mcp-file-auth.yamltests/e2e/configuration/server-mode/lightspeed-stack-mcp-kubernetes-auth.yamltests/e2e/configuration/server-mode/lightspeed-stack-mcp-oauth-auth.yamltests/e2e/configuration/server-mode/lightspeed-stack-mcp.yamltests/e2e/features/environment.pytests/e2e/features/mcp.featuretests/e2e/features/steps/common_http.pytests/e2e/secrets/invalid-mcp-tokentests/e2e/utils/llama_stack_tools.py
| @skip | ||
| @MCPFileAuthConfig | ||
| Scenario: Check if tools endpoint succeeds when MCP file-based auth token is passed |
There was a problem hiding this comment.
Don't merge these scenarios behind blanket @skip tags.
The targeted @skip-in-library-mode cases are understandable, but these plain @skips disable the exact /tools and invalid-auth paths called out as broken in the PR notes. That leaves the new MCP suite green without actually protecting those regressions in any mode.
Also applies to: 49-51, 65-67, 84-86, 104-106, 146-148, 163-165, 183-185, 204-206
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/e2e/features/mcp.feature` around lines 10 - 12, Replace blanket `@skip`
tags that disable tests unconditionally with the targeted `@skip-in-library-mode`
tag so the /tools and invalid-auth scenarios still run in non-library modes;
specifically, for the Scenario "Check if tools endpoint succeeds when MCP
file-based auth token is passed" and the other scenarios at the same locations
(lines referenced in the review), change the tag from `@skip` to
`@skip-in-library-mode` (keep `@MCPFileAuthConfig` intact) so only library-mode runs
are skipped while preserving these regression checks.
| ) | ||
|
|
||
|
|
||
| def _get_llama_stack_client() -> AsyncLlamaStackClient: |
There was a problem hiding this comment.
there is already file that contains this logic, merge it together with the llama_stack_shields.py
| Then The status code of the response is 200 | ||
| And The response should contain following fragments | ||
| | Fragments in LLM response | | ||
| | Hello | |
There was a problem hiding this comment.
Add And The body of the response does not contain mcp-client
Description
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit