rmusser01 · Jul 3, 2026
diff --git a/...ask-12118 - Fix-llama-provider-alias-rejection-in-character-chat-completions.md b/...ask-12118 - Fix-llama-provider-alias-rejection-in-character-chat-completions.md
@@ -0,0 +1,50 @@
+---
+id: TASK-12118
+title: Fix llama provider alias rejection in character chat completions
+status: Done
+labels:
+- bug
+- chat
+- llm-providers
+- webui
+priority: high
+---
+
+## Description
+
+<!-- SECTION:DESCRIPTION:BEGIN -->
+Character chat completion can receive the WebUI catalog provider id `llama`, but the backend shared provider resolver treated it as a separate credentialed provider instead of canonicalizing it to `llama.cpp`. This caused `/api/v1/chats/{id}/complete-v2` to return `missing_provider_credentials` even though the llama.cpp server endpoint was configured and normal chat could stream successfully.
+<!-- SECTION:DESCRIPTION:END -->
+
+## Acceptance Criteria
+<!-- AC:BEGIN -->
+- [x] #1 `api_provider: llama` normalizes to `llama.cpp` for `/api/v1/chat/completions`.
+- [x] #2 Character chat `/complete-v2` normalizes raw `provider: llama` before credentials/provider dispatch.
+- [x] #3 Regression tests cover schema validation and shared provider/model resolution aliases.
+- [x] #4 Live localhost verification against configured llama.cpp returns 200 streaming response.
+- [x] #5 Bandit reports zero findings for touched backend source files.
+<!-- AC:END -->
+
+## Implementation Notes
+
+<!-- SECTION:IMPLEMENTATION_NOTES:BEGIN -->
+Root cause: standard chat normalized WebUI catalog provider id `llama` via ChatCompletionRequest, but character chat `/complete-v2` passed raw `provider: llama` into the shared resolver. The resolver treated `llama` as a separate credentialed provider instead of canonical `llama.cpp`, producing `missing_provider_credentials` despite the configured llama.cpp endpoint. Implemented normalization in both ChatCompletionRequest validation and shared chat_service provider/model resolution. Live verification: created a fresh Miku chat and POSTed `/api/v1/chats/3ed7b614-922c-4161-807c-d8a7c048d15b/complete-v2?scope_type=global` with `provider: "llama"`; response was HTTP 200 text/event-stream from `gemma-4-26B-A4B-it-ultra-uncensored-heretic-Q4_K_M.gguf`. Focused pytest: `tldw_Server_API/tests/Chat_NEW/unit/test_chat_schemas.py tldw_Server_API/tests/Chat_NEW/unit/test_provider_model_resolution.py -q` passed 40 tests. Bandit: `/tmp/bandit_task_12120_final.json` reported 0 findings for `chat_request_schemas.py` and `chat_service.py`. Screenshots regenerated in `/tmp/tldw-github-showcase/`. Known non-blocking warning: WebUI still logs per-chat settings 404 warnings, which are unrelated to this provider alias regression.
+PR #2586 review follow-up in progress: address Qodo type-hint/style/test-stability comments, Qodo unreachable warning logging comment, and Gemini defensive strip/lower normalization comments; then rebase onto latest dev and repush.
+PR #2586 review follow-up addressed: added explicit type hints and `-> None` to new tests; shortened/wrapped the schema regression test signature; replaced the Pydantic post-construction mutation with a `SimpleNamespace` request stub matching the character-chat resolver input shape; preserved defensive `.strip().lower()` before provider alias normalization; removed the unreachable `pass` before the alias-resolution warning log. Verification after review fixes: focused pytest passed 40 tests with 5 warnings; Bandit `/tmp/bandit_task_12118_review.json` reported 0 findings; `git diff --check` passed.
+<!-- SECTION:IMPLEMENTATION_NOTES:END -->
+
+## Final Summary
+
+<!-- SECTION:FINAL_SUMMARY:BEGIN -->
+Fixed the llama.cpp provider alias regression and addressed PR review follow-ups. The WebUI catalog id `llama` now canonicalizes to `llama.cpp` in standard chat validation and shared character-chat provider/model resolution, while preserving prior trim/lower defensive behavior. Review follow-up tightened the regression tests, removed brittle Pydantic mutation, and restored the intended warning log in alias-resolution exception handling.
+<!-- SECTION:FINAL_SUMMARY:END -->
+
+## Definition of Done
+<!-- DOD:BEGIN -->
+- [x] #1 Acceptance criteria completed
+- [x] #2 Tests or verification recorded
+- [x] #3 Documentation updated when relevant
+- [x] #4 Bandit run for touched code when applicable or document non-code/environment skip
+- [x] #5 Final summary added
+- [x] #6 Known skips or blockers documented
+<!-- DOD:END -->
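The task notes above describe canonicalizing the WebUI catalog id `llama` to `llama.cpp` after a defensive trim and lowercase. The body of `normalize_catalog_provider_for_chat` is not part of this diff, so the following is only a minimal sketch of how such a helper could behave. The alias table and the name `normalize_catalog_provider_sketch` are hypothetical, chosen for illustration.

```python
from typing import Dict, Optional

# Hypothetical alias table. The real mapping lives in provider_readiness
# and is not shown in this diff, so it may contain other entries.
_CATALOG_PROVIDER_ALIASES: Dict[str, str] = {"llama": "llama.cpp"}


def normalize_catalog_provider_sketch(value: Optional[str]) -> str:
    """Trim and lowercase a provider id, then map known catalog aliases.

    Unknown ids pass through unchanged (after trim/lower), so the caller
    can still reject them against its own list of supported providers.
    """
    cleaned = (value or "").strip().lower()
    return _CATALOG_PROVIDER_ALIASES.get(cleaned, cleaned)
```

Under these assumptions the helper is idempotent: passing an already canonical `llama.cpp` returns it unchanged, which is why the same call can sit in both the schema validator and the shared resolver without double-mapping.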
diff --git a/tldw_Server_API/app/api/v1/schemas/chat_request_schemas.py b/tldw_Server_API/app/api/v1/schemas/chat_request_schemas.py
@@ -56,6 +56,7 @@
     custom_openai_section_name,
     iter_custom_openai_provider_names,
 )
+from tldw_Server_API.app.core.LLM_Calls.provider_readiness import normalize_catalog_provider_for_chat
 
 _config = load_and_log_configs() or {}
 
@@ -835,8 +836,9 @@ def validate_api_provider(cls, value: Optional[str]) -> Optional[str]:
         """Validate provider ids without Python 3.11-only dynamic Literal syntax."""
         if value is None:
             return value
-        if value in ALL_SUPPORTED_PROVIDER_NAMES_LIST:
-            return value
+        normalized_value = normalize_catalog_provider_for_chat(value)
+        if normalized_value in ALL_SUPPORTED_PROVIDER_NAMES_LIST:
+            return normalized_value
         raise ValueError(f"Unsupported api_provider: {value}")
 
     # --- Standard OpenAI-like Parameters ---

diff --git a/tldw_Server_API/app/core/Chat/chat_service.py b/tldw_Server_API/app/core/Chat/chat_service.py
@@ -78,6 +78,7 @@
 from tldw_Server_API.app.core.Chat.streaming_utils import (
     STREAMING_IDLE_TIMEOUT as CHAT_IDLE_TIMEOUT,
 )
+from tldw_Server_API.app.core.LLM_Calls.provider_readiness import normalize_catalog_provider_for_chat
 from tldw_Server_API.app.core.Chat.chat_loop_engine import is_chat_loop_mode_enabled
 from tldw_Server_API.app.core.Chat.run_first_presentation import (
     present_chat_tools,
@@ -1336,13 +1337,16 @@ def parse_provider_model_for_metrics(
         parts = model_str.split("/", 1)
         if len(parts) == 2:
             model_provider, model_name = parts
-            provider = (api_provider or model_provider).lower()
+            raw_provider = api_provider or model_provider
+            provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
             model = model_name
         else:
-            provider = (api_provider or default_provider).lower()
+            raw_provider = api_provider or default_provider
+            provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
             model = model_str
     else:
-        provider = (api_provider or default_provider).lower()
+        raw_provider = api_provider or default_provider
+        provider = normalize_catalog_provider_for_chat((raw_provider or "").strip().lower())
         model = model_str
     return provider, model
 
@@ -1375,7 +1379,9 @@ def normalize_request_provider_and_model(
             parts_for_alias = model_str.split("/", 1)
             if len(parts_for_alias) == 2:
                 inline_provider, inline_model_part = parts_for_alias[0].strip(), parts_for_alias[1].strip()
-        provider_for_mapping = ((inline_provider or api_provider or default_provider) or "").strip().lower()
+        provider_for_mapping = normalize_catalog_provider_for_chat(
+            ((inline_provider or api_provider or default_provider) or "").strip().lower()
+        )
 
 
         def _resolve_alias(provider: str, raw_model: str) -> str | None:
@@ -1434,14 +1440,17 @@ def _resolve_alias(provider: str, raw_model: str) -> str | None:
         pass
     except _CHAT_NONCRITICAL_EXCEPTIONS as _unexpected:
         # Unexpected exceptions should be visible in production logs
-        pass
         logger.warning(f"Unexpected error during model alias resolution: {_unexpected}")
-    provider = (api_provider or default_provider).lower()
+    provider = normalize_catalog_provider_for_chat(
+        ((api_provider or default_provider) or "").strip().lower()
+    )
     if "/" in model_str:
         parts = model_str.split("/", 1)
         if len(parts) == 2:
             model_provider, actual_model = parts
-            inline_provider_lower = model_provider.lower()
+            inline_provider_lower = normalize_catalog_provider_for_chat(
+                model_provider.strip().lower()
+            )
             # If the api_provider was not explicitly set, allow inline provider to select it
             if not api_provider:
                 provider = inline_provider_lower

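The `parse_provider_model_for_metrics` hunk above applies the alias normalization on every branch of the `provider/model` split. As a rough illustration of that precedence (explicit `api_provider` first, then the inline prefix, then the default), here is a self-contained sketch. The function name `split_provider_and_model` and the `_ALIASES` table are hypothetical stand-ins, not the project's actual code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the real alias table in provider_readiness.
_ALIASES: Dict[str, str] = {"llama": "llama.cpp"}


def _canonical(raw: Optional[str]) -> str:
    """Trim, lowercase, and map catalog aliases to canonical provider ids."""
    cleaned = (raw or "").strip().lower()
    return _ALIASES.get(cleaned, cleaned)


def split_provider_and_model(
    model_str: str,
    api_provider: Optional[str],
    default_provider: str,
) -> Tuple[str, str]:
    """Return (provider, model) for a model string that may be 'provider/model'.

    An explicit api_provider takes precedence over an inline prefix, and the
    default provider is used only when neither is present.
    """
    inline_provider, separator, inline_model = model_str.partition("/")
    if separator:
        # Inline form: the prefix selects the provider unless one was given.
        return _canonical(api_provider or inline_provider), inline_model
    return _canonical(api_provider or default_provider), model_str
```

With this shape, `llama/local-model` and an explicit `api_provider="llama"` both resolve to the `llama.cpp` provider, which is the behavior the regression tests in this PR assert for the real resolver.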
diff --git a/tldw_Server_API/tests/Chat_NEW/unit/test_chat_schemas.py b/tldw_Server_API/tests/Chat_NEW/unit/test_chat_schemas.py
@@ -116,6 +116,21 @@ def test_minimal_valid_request(self):
         assert len(request.messages) == 1
         assert request.messages[0].role == "user"
 
+    @pytest.mark.unit
+    @pytest.mark.parametrize("api_provider", ["llama", "llama.cpp"])
+    def test_llamacpp_aliases_normalize_to_canonical_provider(
+        self: "TestChatCompletionRequest",
+        api_provider: str,
+    ) -> None:
+        """Test WebUI llama provider ids route through the canonical llama.cpp adapter."""
+        request = ChatCompletionRequest(
+            api_provider=api_provider,
+            model="local-model",
+            messages=[{"role": "user", "content": "Hello"}],
+        )
+
+        assert request.api_provider == "llama.cpp"
+
     @pytest.mark.unit
     def test_request_accepts_typed_research_context(self):
         """Test request accepts a bounded typed research_context payload."""

diff --git a/tldw_Server_API/tests/Chat_NEW/unit/test_provider_model_resolution.py b/tldw_Server_API/tests/Chat_NEW/unit/test_provider_model_resolution.py
@@ -1,3 +1,5 @@
+from types import SimpleNamespace
+
 import pytest
 
 from tldw_Server_API.app.api.v1.schemas.chat_request_schemas import ChatCompletionRequest
@@ -46,6 +48,27 @@ def test_resolve_provider_and_model_inline_alias(monkeypatch):
     assert debug_info["normalized"]["model"] == selected_model
 
 
+@pytest.mark.unit
+def test_resolve_provider_and_model_normalizes_llamacpp_catalog_alias() -> None:
+    """Catalog provider id `llama` should execute through the llama.cpp adapter."""
+    request = SimpleNamespace(api_provider="llama", model="local-model")
+
+    metrics_provider, metrics_model, selected_provider, selected_model, debug_info = (
+        resolve_provider_and_model(
+            request_data=request,
+            metrics_default_provider="openai",
+            normalize_default_provider="openai",
+        )
+    )
+
+    assert metrics_provider == "llama.cpp"
+    assert metrics_model == "local-model"
+    assert selected_provider == "llama.cpp"
+    assert selected_model == "local-model"
+    assert debug_info["raw"]["api_provider"] == "llama"
+    assert debug_info["normalized"]["provider"] == "llama.cpp"
+
+
 @pytest.mark.unit
 def test_resolve_provider_and_model_catalog_unique_match_when_provider_not_explicit(monkeypatch):
     request = ChatCompletionRequest(