Condensing Context Ignores Provider API Endpoint Prefix #4661

@strikeoncmputrz

Plugin Type

VSCode Extension

App Version

v4.140.2

Description

Kilo works fine for tool calling and completions. My provider base URL is set to: https://myhost/v1/

However, when Kilo attempts to condense the context, it sends requests to /chat/completions rather than /v1/chat/completions, even though my provider base URL explicitly includes /v1. The server returns 404 for each of these requests.
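The root cause is not confirmed here, but one common way a path prefix gets lost is resolving a path that starts with a slash against the base URL. The sketch below illustrates that behavior with the standard `URL` constructor and contrasts it with a join that keeps the prefix. Both function names are hypothetical and are not taken from Kilo's source; this only shows how the observed symptom could arise.

```typescript
// Hypothetical helpers for illustration only; not Kilo's actual code.

// Resolving a path with a leading slash against a base URL replaces the
// base's entire path, so a prefix such as /v1 is dropped.
function resolveDroppingPrefix(baseUrl: string, path: string): string {
  return new URL(path, baseUrl).toString();
}

// Joining the two strings after trimming slashes keeps whatever prefix
// the configured base URL carries.
function joinKeepingPrefix(baseUrl: string, path: string): string {
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = path.replace(/^\/+/, "");
  return `${base}/${suffix}`;
}

console.log(resolveDroppingPrefix("https://myhost/v1/", "/chat/completions"));
// prints "https://myhost/chat/completions"
console.log(joinKeepingPrefix("https://myhost/v1/", "/chat/completions"));
// prints "https://myhost/v1/chat/completions"
```

If the condensing code path builds its URL in the first style while regular completions use the second, that would match the logs in this report, though that is an assumption rather than a finding.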

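Here are the server logs, showing successful requests to /v1/chat/completions followed by the failing requests to /chat/completions: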

0.0 s, Process: 57045 cached tokens and 581 new tokens at 145.25 T/s, Generate: 19.73 T/s, Context: 57626 tokens) 
2025-12-24 12:50:41.600 INFO:     192.168.5.2:6534 - "POST /v1/chat/completions HTTP/1.1" 200
2025-12-24 12:50:41.602 INFO:     Received chat completion streaming request a51a7ff9355f447c969b69cae7ce5811
2025-12-24 12:50:48.518 INFO:     Finished chat completion streaming request a51a7ff9355f447c969b69cae7ce5811
2025-12-24 12:50:48.519 INFO:     Metrics (ID: a51a7ff9355f447c969b69cae7ce5811): 62 tokens generated in 6.69 seconds (Queue: 0.0 s, Process: 57685 cached tokens and 422 new tokens at 119.55 T/s, Generate: 19.63 T/s, Context: 58107 tokens)
2025-12-24 12:50:49.537 INFO:     192.168.5.2:6534 - "POST /v1/chat/completions HTTP/1.1" 200
2025-12-24 12:50:49.538 INFO:     Received chat completion streaming request c6fcba4bec2f4a488fa6458b75da383d
2025-12-24 12:50:57.386 INFO:     Finished chat completion streaming request c6fcba4bec2f4a488fa6458b75da383d
2025-12-24 12:50:57.387 INFO:     Metrics (ID: c6fcba4bec2f4a488fa6458b75da383d): 98 tokens generated in 7.64 seconds (Queue: 0.0 s, Process: 58168 cached tokens and 268 new tokens at 99.26 T/s, Generate: 19.85 T/s, Context: 58436 tokens)
2025-12-24 12:50:58.478 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:50:58.902 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:50:59.369 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:50:59.754 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:00.234 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:00.702 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:01.144 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:01.589 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:02.079 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:02.509 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:02.957 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:03.442 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:03.886 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:04.286 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:04.671 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:05.098 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:05.636 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:06.766 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
2025-12-24 12:51:07.216 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404
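
In the excerpt above, the two requests to /v1/chat/completions return 200 and the nineteen requests to /chat/completions return 404. For longer logs, a small script can produce the same tally. This is a minimal sketch that assumes the access-log line format shown above; `tallyRequests` is a hypothetical helper, not part of TabbyAPI or Kilo.

```typescript
// Tally access-log request lines by method, path, and status code.
// Assumes lines shaped like: ... "POST /v1/chat/completions HTTP/1.1" 200
function tallyRequests(log: string): Record<string, number> {
  const counts: Record<string, number> = {};
  // Capture groups: method, path, three-digit status code.
  const pattern = /"(\w+) (\S+) HTTP\/[\d.]+" (\d{3})/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(log)) !== null) {
    const key = `${match[1]} ${match[2]} -> ${match[3]}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Three lines copied from the log in this report.
const sample = [
  '2025-12-24 12:50:49.537 INFO:     192.168.5.2:6534 - "POST /v1/chat/completions HTTP/1.1" 200',
  '2025-12-24 12:50:58.478 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404',
  '2025-12-24 12:50:58.902 INFO:     192.168.5.2:6534 - "POST /chat/completions HTTP/1.1" 404',
].join("\n");

console.log(tallyRequests(sample));
```

Lines that carry no quoted request, such as the Metrics entries, are skipped because they do not match the pattern.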

Reproduction steps

  1. Set the provider base URL to an OpenAI-compatible API endpoint that includes the /v1 prefix (for example, https://myhost/v1/)
  2. Wait for Kilo to condense its context

Provider

TabbyAPI

Model

GLM 4.6

System Information

No response
