@@ -0,0 +1,50 @@
---
id: TASK-12118
title: Fix llama provider alias rejection in character chat completions
status: Done
labels:
- bug
- chat
- llm-providers
- webui
priority: high
---

## Description

<!-- SECTION:DESCRIPTION:BEGIN -->
Character chat completion can receive the WebUI catalog provider id `llama`, but the backend shared provider resolver treated it as a separate credentialed provider instead of canonicalizing it to `llama.cpp`. This caused `/api/v1/chats/{id}/complete-v2` to return `missing_provider_credentials` even though the llama.cpp server endpoint was configured and normal chat could stream successfully.
<!-- SECTION:DESCRIPTION:END -->
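
The failure mode described above can be illustrated with a minimal sketch. This is not the project's resolver: `PROVIDER_ALIASES`, `CONFIGURED_ENDPOINTS`, `canonical_provider`, and `resolve_endpoint` are hypothetical names, and the endpoint URL is a placeholder. The real normalization lives in `normalize_catalog_provider_for_chat`, whose implementation is not part of this change.

```python
from typing import Optional

# Hypothetical alias table: WebUI catalog ids -> canonical backend provider ids.
PROVIDER_ALIASES = {"llama": "llama.cpp"}

# Hypothetical configuration keyed by canonical provider id (placeholder URL).
CONFIGURED_ENDPOINTS = {"llama.cpp": "http://127.0.0.1:8080"}


def canonical_provider(raw: Optional[str]) -> str:
    """Trim, lowercase, then map known aliases to their canonical id."""
    cleaned = (raw or "").strip().lower()
    return PROVIDER_ALIASES.get(cleaned, cleaned)


def resolve_endpoint(provider: Optional[str], *, normalize: bool) -> str:
    """Return the configured endpoint, or an error code if none is configured."""
    key = canonical_provider(provider) if normalize else (provider or "")
    return CONFIGURED_ENDPOINTS.get(key, "missing_provider_credentials")


# Without normalization, the alias is looked up as if it were its own provider.
print(resolve_endpoint("llama", normalize=False))  # missing_provider_credentials
# With normalization, the alias reaches the configured llama.cpp entry.
print(resolve_endpoint(" Llama ", normalize=True))  # http://127.0.0.1:8080
```

Normalizing before the lookup, rather than registering `llama` as a second configured provider, keeps a single source of truth for the endpoint settings.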

## Acceptance Criteria
<!-- AC:BEGIN -->
- [x] #1 `api_provider: llama` normalizes to `llama.cpp` for `/api/v1/chat/completions`.
- [x] #2 Character chat `/complete-v2` normalizes raw `provider: llama` before credentials/provider dispatch.
- [x] #3 Regression tests cover schema validation and shared provider/model resolution aliases.
- [x] #4 Live localhost verification against configured llama.cpp returns 200 streaming response.
- [x] #5 Bandit reports zero findings for touched backend source files.
<!-- AC:END -->

## Implementation Notes

<!-- SECTION:IMPLEMENTATION_NOTES:BEGIN -->
Root cause: standard chat normalized WebUI catalog provider id `llama` via ChatCompletionRequest, but character chat `/complete-v2` passed raw `provider: llama` into the shared resolver. The resolver treated `llama` as a separate credentialed provider instead of canonical `llama.cpp`, producing `missing_provider_credentials` despite the configured llama.cpp endpoint. Implemented normalization in both ChatCompletionRequest validation and shared chat_service provider/model resolution. Live verification: created a fresh Miku chat and POSTed `/api/v1/chats/3ed7b614-922c-4161-807c-d8a7c048d15b/complete-v2?scope_type=global` with `provider: "llama"`; response was HTTP 200 text/event-stream from `gemma-4-26B-A4B-it-ultra-uncensored-heretic-Q4_K_M.gguf`. Focused pytest: `tldw_Server_API/tests/Chat_NEW/unit/test_chat_schemas.py tldw_Server_API/tests/Chat_NEW/unit/test_provider_model_resolution.py -q` passed 40 tests. Bandit: `/tmp/bandit_task_12120_final.json` reported 0 findings for `chat_request_schemas.py` and `chat_service.py`. Screenshots regenerated in `/tmp/tldw-github-showcase/`. Known non-blocking warning: WebUI still logs per-chat settings 404 warnings, which are unrelated to this provider alias regression.
PR #2586 review follow-up in progress: address Qodo type-hint/style/test-stability comments, Qodo unreachable warning logging comment, and Gemini defensive strip/lower normalization comments; then rebase onto latest dev and repush.
PR #2586 review follow-up addressed: added explicit type hints and `-> None` to new tests; shortened/wrapped the schema regression test signature; replaced the Pydantic post-construction mutation with a `SimpleNamespace` request stub matching the character-chat resolver input shape; preserved defensive `.strip().lower()` before provider alias normalization; removed the unreachable `pass` before the alias-resolution warning log. Verification after review fixes: focused pytest passed 40 tests with 5 warnings; Bandit `/tmp/bandit_task_12118_review.json` reported 0 findings; `git diff --check` passed.
<!-- SECTION:IMPLEMENTATION_NOTES:END -->
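
The `SimpleNamespace` stub mentioned in the review follow-up can be sketched as follows, assuming the resolver only reads attributes from its input. `resolve_provider` is a hypothetical stand-in, not the real `resolve_provider_and_model`, and `ALIASES` is an illustrative table.

```python
from types import SimpleNamespace
from typing import Any, Tuple

ALIASES = {"llama": "llama.cpp"}  # hypothetical alias table


def resolve_provider(request_data: Any, default_provider: str) -> Tuple[str, str]:
    """Hypothetical resolver: reads attributes only, so any object with them works."""
    raw = getattr(request_data, "api_provider", None) or default_provider
    cleaned = raw.strip().lower()
    return ALIASES.get(cleaned, cleaned), getattr(request_data, "model", "")


# The stub carries the raw, un-validated provider id, as character chat would pass it.
stub = SimpleNamespace(api_provider="llama", model="local-model")
provider, model = resolve_provider(stub, default_provider="openai")
```

Because the schema validator now canonicalizes `llama` at construction time, a stub is likely the simplest way to hand the resolver a raw alias and exercise its own normalization path.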

## Final Summary

<!-- SECTION:FINAL_SUMMARY:BEGIN -->
Fixed the llama.cpp provider alias regression and addressed PR review follow-ups. The WebUI catalog id `llama` now canonicalizes to `llama.cpp` in standard chat validation and shared character-chat provider/model resolution, while preserving prior trim/lower defensive behavior. Review follow-up tightened the regression tests, removed brittle Pydantic mutation, and restored the intended warning log in alias-resolution exception handling.
<!-- SECTION:FINAL_SUMMARY:END -->

## Definition of Done
<!-- DOD:BEGIN -->
- [x] #1 Acceptance criteria completed
- [x] #2 Tests or verification recorded
- [x] #3 Documentation updated when relevant
- [x] #4 Bandit run for touched code when applicable or document non-code/environment skip
- [x] #5 Final summary added
- [x] #6 Known skips or blockers documented
<!-- DOD:END -->
6 changes: 4 additions & 2 deletions tldw_Server_API/app/api/v1/schemas/chat_request_schemas.py
@@ -56,6 +56,7 @@
     custom_openai_section_name,
     iter_custom_openai_provider_names,
 )
+from tldw_Server_API.app.core.LLM_Calls.provider_readiness import normalize_catalog_provider_for_chat

 _config = load_and_log_configs() or {}

@@ -835,8 +836,9 @@ def validate_api_provider(cls, value: Optional[str]) -> Optional[str]:
         """Validate provider ids without Python 3.11-only dynamic Literal syntax."""
         if value is None:
             return value
-        if value in ALL_SUPPORTED_PROVIDER_NAMES_LIST:
-            return value
+        normalized_value = normalize_catalog_provider_for_chat(value)
+        if normalized_value in ALL_SUPPORTED_PROVIDER_NAMES_LIST:
+            return normalized_value
         raise ValueError(f"Unsupported api_provider: {value}")

     # --- Standard OpenAI-like Parameters ---
23 changes: 16 additions & 7 deletions tldw_Server_API/app/core/Chat/chat_service.py
@@ -78,6 +78,7 @@
 from tldw_Server_API.app.core.Chat.streaming_utils import (
     STREAMING_IDLE_TIMEOUT as CHAT_IDLE_TIMEOUT,
 )
+from tldw_Server_API.app.core.LLM_Calls.provider_readiness import normalize_catalog_provider_for_chat
 from tldw_Server_API.app.core.Chat.chat_loop_engine import is_chat_loop_mode_enabled
 from tldw_Server_API.app.core.Chat.run_first_presentation import (
     present_chat_tools,
@@ -1336,13 +1337,16 @@ def parse_provider_model_for_metrics(
         parts = model_str.split("/", 1)
         if len(parts) == 2:
             model_provider, model_name = parts
-            provider = (api_provider or model_provider).lower()
+            raw_provider = api_provider or model_provider
+            provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
             model = model_name
         else:
-            provider = (api_provider or default_provider).lower()
+            raw_provider = api_provider or default_provider
+            provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
             model = model_str
     else:
-        provider = (api_provider or default_provider).lower()
+        raw_provider = api_provider or default_provider
+        provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
         model = model_str
     return provider, model

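The precedence in the `parse_provider_model_for_metrics` hunk (explicit `api_provider` first, then the inline `provider/model` prefix, then the default) can be sketched in isolation. This is a simplified illustration, not the project function: `split_provider_model`, `_normalize`, and `_ALIASES` are hypothetical names, the alias table stands in for `normalize_catalog_provider_for_chat`, and the enclosing `"/" in model_str` check is assumed because it falls outside the hunk.

```python
from typing import Optional, Tuple

_ALIASES = {"llama": "llama.cpp"}  # hypothetical stand-in for the real normalizer


def _normalize(raw: Optional[str]) -> str:
    """Guard against None, trim, lowercase, then apply the alias table."""
    cleaned = (raw or "").strip().lower()
    return _ALIASES.get(cleaned, cleaned)


def split_provider_model(
    model_str: str, api_provider: Optional[str], default_provider: str
) -> Tuple[str, str]:
    """Explicit api_provider wins; otherwise use the inline prefix, then the default."""
    if "/" in model_str:
        inline_provider, model_name = model_str.split("/", 1)
        return _normalize(api_provider or inline_provider), model_name
    return _normalize(api_provider or default_provider), model_str
```

For example, `split_provider_model("llama/local-model", None, "openai")` resolves the inline prefix through the alias table, while an explicit `api_provider` overrides whatever prefix the model string carries.
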
@@ -1375,7 +1379,9 @@ def normalize_request_provider_and_model(
             parts_for_alias = model_str.split("/", 1)
             if len(parts_for_alias) == 2:
                 inline_provider, inline_model_part = parts_for_alias[0].strip(), parts_for_alias[1].strip()
-                provider_for_mapping = ((inline_provider or api_provider or default_provider) or "").strip().lower()
+                provider_for_mapping = normalize_catalog_provider_for_chat(
+                    ((inline_provider or api_provider or default_provider) or "").strip().lower()
+                )


                 def _resolve_alias(provider: str, raw_model: str) -> str | None:
@@ -1434,14 +1440,17 @@ def _resolve_alias(provider: str, raw_model: str) -> str | None:
                     pass
     except _CHAT_NONCRITICAL_EXCEPTIONS as _unexpected:
         # Unexpected exceptions should be visible in production logs
-        pass
         logger.warning(f"Unexpected error during model alias resolution: {_unexpected}")
-    provider = (api_provider or default_provider).lower()
+    provider = normalize_catalog_provider_for_chat(
+        ((api_provider or default_provider) or "").strip().lower()
+    )
     if "/" in model_str:
         parts = model_str.split("/", 1)
         if len(parts) == 2:
             model_provider, actual_model = parts
-            inline_provider_lower = model_provider.lower()
+            inline_provider_lower = normalize_catalog_provider_for_chat(
+                model_provider.strip().lower()
+            )
             # If the api_provider was not explicitly set, allow inline provider to select it
             if not api_provider:
                 provider = inline_provider_lower
15 changes: 15 additions & 0 deletions tldw_Server_API/tests/Chat_NEW/unit/test_chat_schemas.py
@@ -116,6 +116,21 @@ def test_minimal_valid_request(self):
         assert len(request.messages) == 1
         assert request.messages[0].role == "user"

+    @pytest.mark.unit
+    @pytest.mark.parametrize("api_provider", ["llama", "llama.cpp"])
+    def test_llamacpp_aliases_normalize_to_canonical_provider(
+        self: "TestChatCompletionRequest",
+        api_provider: str,
+    ) -> None:
+        """Test WebUI llama provider ids route through the canonical llama.cpp adapter."""
+        request = ChatCompletionRequest(
+            api_provider=api_provider,
+            model="local-model",
+            messages=[{"role": "user", "content": "Hello"}],
+        )
+
+        assert request.api_provider == "llama.cpp"
+
     @pytest.mark.unit
     def test_request_accepts_typed_research_context(self):
         """Test request accepts a bounded typed research_context payload."""
tldw_Server_API/tests/Chat_NEW/unit/test_provider_model_resolution.py
@@ -1,3 +1,5 @@
+from types import SimpleNamespace
+
 import pytest

 from tldw_Server_API.app.api.v1.schemas.chat_request_schemas import ChatCompletionRequest
@@ -46,6 +48,27 @@ def test_resolve_provider_and_model_inline_alias(monkeypatch):
     assert debug_info["normalized"]["model"] == selected_model


+@pytest.mark.unit
+def test_resolve_provider_and_model_normalizes_llamacpp_catalog_alias() -> None:
+    """Catalog provider id `llama` should execute through the llama.cpp adapter."""
+    request = SimpleNamespace(api_provider="llama", model="local-model")
+
+    metrics_provider, metrics_model, selected_provider, selected_model, debug_info = (
+        resolve_provider_and_model(
+            request_data=request,
+            metrics_default_provider="openai",
+            normalize_default_provider="openai",
+        )
+    )
+
+    assert metrics_provider == "llama.cpp"
+    assert metrics_model == "local-model"
+    assert selected_provider == "llama.cpp"
+    assert selected_model == "local-model"
+    assert debug_info["raw"]["api_provider"] == "llama"
+    assert debug_info["normalized"]["provider"] == "llama.cpp"
+
+
 @pytest.mark.unit
 def test_resolve_provider_and_model_catalog_unique_match_when_provider_not_explicit(monkeypatch):
     request = ChatCompletionRequest(