
LCORE-1216: Bump up to llama-stack 0.4.3#52

Merged
are-ces merged 2 commits into lightspeed-core:main from are-ces:llama-stack-0.4.x-bumpup
Mar 2, 2026

Conversation

@are-ces
Contributor

@are-ces are-ces commented Feb 8, 2026

Description

This is a significant refactoring of all the modules, mostly because the Agents API has been deprecated in favor of the Responses API in llama-stack (since 0.3.x).

This upgrade is needed to keep lightspeed-providers on par with LCORE.

NOTE: run_moderation was not designed for redaction but only to block the request; thus lightspeed-redactions will block the message if an unauthorized string is detected, as opposed to run_shield, where it is possible to redact the original message.
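The behavioral difference can be illustrated with a small self-contained sketch. All names and the pattern below are hypothetical, not the llama-stack API: a shield-style check rewrites the message, while a moderation-style check can only flag it so the caller blocks the whole request.

```python
import re

# Illustrative "sensitive data" pattern (SSN-like); purely a stand-in.
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def shield_style(message: str) -> str:
    """run_shield-style handling: redact the sensitive span, keep the message."""
    return SENSITIVE.sub("[REDACTED]", message)

def moderation_style(message: str) -> bool:
    """run_moderation-style handling: flag the message; the caller must block it."""
    return bool(SENSITIVE.search(message))

msg = "My SSN is 123-45-6789, please help."
print(shield_style(msg))      # request continues with redacted content
print(moderation_style(msg))  # True -> the whole request is rejected
```

This is why, after the bump, lightspeed-redactions rejects a flagged message outright instead of passing along a redacted copy.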

Changes:

  • Bump up llama-stack library to 0.4.3
  • Refactor agent code to migrate from Agents API to Responses API
  • Refactor safety module run_shield, added run_moderation
  • Kept temperature override, prioritization of most recently used tools, and tool filtering
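The retained temperature override can be sketched as follows. This is a hypothetical illustration (function and config names are not the actual provider code): a per-model configured value replaces whatever temperature the client sent, which matters for models that only accept a fixed value.

```python
def apply_temperature_override(request_params: dict, overrides: dict, model: str) -> dict:
    """Return a copy of the sampling params with any configured override applied."""
    params = dict(request_params)
    if model in overrides:
        # Configured value wins over the client-supplied temperature.
        params["temperature"] = overrides[model]
    return params

# Illustrative config: some models only accept temperature == 1.0.
overrides = {"gpt-5": 1.0}

print(apply_temperature_override({"temperature": 0.2}, overrides, "gpt-5"))
print(apply_temperature_override({"temperature": 0.2}, overrides, "other-model"))
```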

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Partially generated by: Claude

Related Tickets & Documents

  • Related Issue: LCORE-1216
  • Closes: LCORE-1216

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

I tested the following manually via curl requests:

  • Question validity run_shield (valid/invalid questions)
  • Question validity run_moderation
  • Redaction run_shield (sensitive data redacted)
  • Redaction run_moderation (message with sensitive data BLOCKED)
  • Tool filtering (11→1 tools)
  • min_tools threshold
  • Previously called tools persistence
  • always_include_tools config
  • Temperature override (1.0 for GPT-5)

are-ces marked this pull request as a draft on February 8, 2026 16:53
are-ces force-pushed the llama-stack-0.4.x-bumpup branch 3 times, most recently from b2b25c6 to c84a80e on February 8, 2026 17:25
Contributor

@tisnik tisnik left a comment


I'd say LGTM on my side, but we definitely need at least one more reviewer, especially from the teams that actually use the provider(s).

Contributor


You are removing the inline::lightspeed_inline_agent that we are using in the Ansible Lightspeed chatbot; if this PR is merged, it will break the chatbot functionality.

Contributor Author


inline::lightspeed_inline_agent still works; the logic has been moved from agent_instance.py to agents.py.

are-ces force-pushed the llama-stack-0.4.x-bumpup branch 3 times, most recently from 218e6d4 to 3ad6905 on February 10, 2026 11:29
@TamiTakamiya

@are-ces @ldjebran I could run the updated lightspeed_inline_agent with ansible-chatbot-stack. The test setup uses:

The setup is somewhat complicated because it uses a number of changes that are not yet merged to main. I will write a memo on my test setup.

Note: my setup does not enable the MCP server yet. After writing the memo, I plan to test this with the MCP server enabled.

are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 3ad6905 to f99d3c1 on February 11, 2026 08:31
are-ces marked this pull request as ready for review on February 11, 2026 08:32
Contributor

@Jdubrick Jdubrick left a comment


@are-ces since we only consume the safety shield portion for my use case, that part LGTM, FYI.

@ldjebran
Contributor

@are-ces seems the file https://github.com/lightspeed-core/lightspeed-providers/blob/main/resources/external_providers/inline/agents/lightspeed_inline_agent.yaml

needs to be updated to:

config_class: lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.config.LightspeedAgentsImplConfig
module: lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent
api_dependencies: [ inference, safety, tool_runtime, tool_groups, conversations, prompts ]
optional_api_dependencies: [vector_io, files]

The lightspeed_inline_agent agent is passing through the queries and overriding the temperature when configured. Unfortunately, I was not able to test MCP filtering, as lightspeed-stack seems to have a regression: it is not passing the MCP headers received from the client via the MCP-HEADERS header.

There is a lot of work done here, @are-ces, many thanks for your efforts.
Can we wait a little before merging to see the team's comments about MCP headers?

Contributor

@ldjebran ldjebran left a comment


@are-ces many thanks for the work. The changes I proposed in my last comment are still valid. I tested MCP, but the lightspeed_inline_agent is unfortunately not working as expected and breaks when the MCP configuration is enabled: I can see the MCP server returning the list of tools, but the agent does not seem to detect those tools and sees only 2 instead of more than 300.
This will need more investigation.

are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 84d4bf7 to 622151e on February 12, 2026 11:08
@are-ces
Contributor Author

are-ces commented Feb 12, 2026

Hey @ldjebran, good catch! I encountered the same problem: I was handling the tools in the wrong way. Basically, the MCP servers were not being expanded into their individual tools, so we were counting the MCP servers themselves and comparing that count with min_tools.
I have tested it on my side and it works as expected; hopefully the same on your side 😄
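The bug and its fix can be sketched with a small self-contained example. All names and data shapes below are hypothetical, not the actual lightspeed-providers code: the point is that each MCP server entry must be expanded into its individual tools before comparing against the min_tools threshold.

```python
def count_tools_buggy(tool_configs: list[dict]) -> int:
    # Bug: counts each MCP *server* as a single tool.
    return len(tool_configs)

def count_tools_fixed(tool_configs: list[dict]) -> int:
    # Fix: expand each MCP server entry into its individual tools.
    total = 0
    for cfg in tool_configs:
        if cfg.get("type") == "mcp":
            total += len(cfg.get("tools", []))
        else:
            total += 1
    return total

configs = [
    {"type": "mcp", "tools": [f"tool_{i}" for i in range(300)]},
    {"type": "builtin", "name": "knowledge_search"},
]
MIN_TOOLS = 10

# The buggy count (2) never crosses the threshold, so filtering was
# skipped; the expanded count (301) correctly triggers it.
print(count_tools_buggy(configs) >= MIN_TOOLS)  # False
print(count_tools_fixed(configs) >= MIN_TOOLS)  # True
```

This matches the symptom ldjebran reported above: the agent "saw" only 2 tools even though the MCP server exposed more than 300.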

are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 1daeb16 to 88bb4db on February 25, 2026 13:18
@TamiTakamiya

@are-ces Thanks for the updates. I am trying to verify this PR with lightspeed-core/lightspeed-stack#1179 on my CRC instance. I think I can set up the environment today to run the tests.

are-ces force-pushed the llama-stack-0.4.x-bumpup branch 2 times, most recently from 5070e07 to 7e235c2 on February 25, 2026 13:57
are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 46c2ab8 to 0f82e43 on February 25, 2026 15:00

@TamiTakamiya TamiTakamiya left a comment


@are-ces Sorry for keeping you waiting so long. Though I am still unable to set up my test environment on my CRC, I could successfully test this PR using a test script plus a newly built ansible-chatbot-stack container image:

INFO     2026-02-26 04:06:00,253 lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.agents:302 agents: Previously called tools: set()
INFO     2026-02-26 04:06:00,254 lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.agents:158 agents: Always included tools (config + previously called): {'knowledge_search'}
INFO     2026-02-26 04:06:00,911 lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.agents:354 agents: Extracted 127 unique tool definitions from 2 tool configs
INFO     2026-02-26 04:06:00,912 lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.agents:179 agents: Tool filtering enabled - filtering 127 tools (threshold: 10)
INFO     2026-02-26 04:06:02,009 lightspeed_stack_providers.providers.inline.agents.lightspeed_inline_agent.agents:237 agents: Filtered tool names from LLM: ['job_templates_list']

I approve this PR. Thank you!

@TamiTakamiya

@ldjebran @are-ces I could also see the issue at the end of streaming. It occurs on the server side as:

ERROR    2026-03-02 00:05:09,161 uvicorn.error:424 uncategorized: ASGI callable returned without completing response. 

and on the client side as:

aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data to satisfy transfer length header.'>
[03/01/26 19:05:09] DEBUG    Got event: http.disconnect. Stop streaming.    sse.py:200

I should have spotted this in my previous tests, but apparently did not pay much attention. Thank you @ldjebran for bringing this up.

I have tested under several different conditions and found that the issue occurs:

  1. Only when an MCP tool call is made
  2. With any MCP server, not just our AAP MCP servers; I could reproduce it with the weather sample server
  3. With the meta-reference agent as well, so it is not an issue specific to the Lightspeed Agent

Based on those observations, this seemed to be a more general Llama Stack issue with cleaning up MCP sessions at the end of a stream.

I then searched the Llama Stack repo and found that the code change added in llamastack/llama-stack#4758 seemed to address the issue. It was included in Llama Stack 0.5.0; however, we need the fix with Llama Stack 0.4.3.

So I have ported the fix to lightspeed-providers and created a commit. As far as I tested, it eliminates the error. Could you try it in your test environment? Thank you.

@ldjebran
Contributor

ldjebran commented Mar 2, 2026

@TamiTakamiya thank you for your investigations.
I tested the same from your branch with the llama-stack fixes, and it's working as expected.

@are-ces Many thanks for all your efforts to make this work.
We will have to investigate how to backport these llama-stack fixes.

This PR LGTM

are-ces force-pushed the llama-stack-0.4.x-bumpup branch from 19137c1 to 25f15af on March 2, 2026 15:48
are-ces merged commit 934f3e8 into lightspeed-core:main on Mar 2, 2026
2 of 3 checks passed