diff --git a/.changeset/cmdv-image-paste-macos.md b/.changeset/cmdv-image-paste-macos.md
deleted file mode 100644
index 778e74b47b4..00000000000
--- a/.changeset/cmdv-image-paste-macos.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-"kilo-code": patch
----
-
-Support Cmd+V for pasting images on macOS in VSCode terminal
-
-- Detect empty bracketed paste (when clipboard contains image instead of text)
-- Trigger clipboard image check on empty paste or paste timeout
-- Add Cmd+V (meta key) support alongside Ctrl+V for image paste
diff --git a/.changeset/crisp-rabbits-lick.md b/.changeset/crisp-rabbits-lick.md
deleted file mode 100644
index 7d89183c0d4..00000000000
--- a/.changeset/crisp-rabbits-lick.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"kilo-code": patch
----
-
-Faster autocomplete when using the Mistral provider
diff --git a/.changeset/enable-jetbrains-autocomplete.md b/.changeset/enable-jetbrains-autocomplete.md
deleted file mode 100644
index 109f89977ea..00000000000
--- a/.changeset/enable-jetbrains-autocomplete.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"kilo-code": patch
----
-
-Enable autocomplete by default in the JetBrains extension
diff --git a/.changeset/fix-vscode-paste-truncation.md b/.changeset/fix-vscode-paste-truncation.md
deleted file mode 100644
index 36a1c97f667..00000000000
--- a/.changeset/fix-vscode-paste-truncation.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-"kilo-code": patch
----
-
-Fix paste truncation in VSCode terminal
-
-- Prevent React StrictMode cleanup from interrupting paste operations
-- Remove `completePaste()` and `clearBuffers()` from useEffect cleanup
-- Paste buffer refs now persist across React re-mounts and flush properly when paste end marker is received
diff --git a/.changeset/new-taxes-accept.md b/.changeset/new-taxes-accept.md
deleted file mode 100644
index d9248a54c67..00000000000
--- a/.changeset/new-taxes-accept.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"kilo-code": patch
----
-
-Disable structured outputs for Anthropic models, because the tool schema doesn't yet support it
diff --git a/.changeset/smooth-wombats-stand.md b/.changeset/smooth-wombats-stand.md
deleted file mode 100644
index b779d265825..00000000000
--- a/.changeset/smooth-wombats-stand.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-"kilo-code": patch
----
-
-Filter unhelpful suggestions in chat autocomplete
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9fcf33f0562..e2f61f2dbba 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,41 @@
 # kilo-code
 
+## 4.143.2
+
+### Patch Changes
+
+- [#4833](https://github.com/Kilo-Org/kilocode/pull/4833) [`2c7cd08`](https://github.com/Kilo-Org/kilocode/commit/2c7cd084bf4707eedda61fed554cf15fcc8b065b) Thanks [@sebastiand-cerebras](https://github.com/sebastiand-cerebras)! - Add `zai-glm-4.7` to Cerebras models
+
+- [#4853](https://github.com/Kilo-Org/kilocode/pull/4853) [`435c879`](https://github.com/Kilo-Org/kilocode/commit/435c879a29d55b75f5f6ffe7bf14854630e085cb) Thanks [@chrarnoldus](https://github.com/chrarnoldus)! - Improved prompt caching when using Anthropic models on OpenRouter with native tool calling
+
+- [#4859](https://github.com/Kilo-Org/kilocode/pull/4859) [`35fb2ad`](https://github.com/Kilo-Org/kilocode/commit/35fb2adc65dfb1e71e28f7368f96765062c43579) Thanks [@marius-kilocode](https://github.com/marius-kilocode)! - Fix Architect mode unnecessarily switching to Code mode to edit markdown files
+
+- [#4829](https://github.com/Kilo-Org/kilocode/pull/4829) [`4e09e36`](https://github.com/Kilo-Org/kilocode/commit/4e09e36bba165a2ab6f5e07f71a420faa49ea3ec) Thanks [@marius-kilocode](https://github.com/marius-kilocode)! - Fix browser action results displaying raw base64 screenshot data as hexadecimal garbage
+
+## 4.143.1
+
+### Patch Changes
+
+- [#4832](https://github.com/Kilo-Org/kilocode/pull/4832) [`22a4ebf`](https://github.com/Kilo-Org/kilocode/commit/22a4ebfcd9f885b6ef9979dc6830226db9a4f397) Thanks [@Drilmo](https://github.com/Drilmo)! - Support Cmd+V for pasting images on macOS in VSCode terminal
+
+    - Detect empty bracketed paste (when clipboard contains image instead of text)
+    - Trigger clipboard image check on empty paste or paste timeout
+    - Add Cmd+V (meta key) support alongside Ctrl+V for image paste
+
+- [#3856](https://github.com/Kilo-Org/kilocode/pull/3856) [`91e0a17`](https://github.com/Kilo-Org/kilocode/commit/91e0a1788963b8be50c58881f11ded96516ab163) Thanks [@markijbema](https://github.com/markijbema)! - Faster autocomplete when using the Mistral provider
+
+- [#4839](https://github.com/Kilo-Org/kilocode/pull/4839) [`abaada6`](https://github.com/Kilo-Org/kilocode/commit/abaada6b7ced6d3f4e37e69441e722e453289b81) Thanks [@markijbema](https://github.com/markijbema)! - Enable autocomplete by default in the JetBrains extension
+
+- [#4831](https://github.com/Kilo-Org/kilocode/pull/4831) [`a9cbb2c`](https://github.com/Kilo-Org/kilocode/commit/a9cbb2cebd75e0c675dc3b55e7a1653ccb93921b) Thanks [@Drilmo](https://github.com/Drilmo)! - Fix paste truncation in VSCode terminal
+
+    - Prevent React StrictMode cleanup from interrupting paste operations
+    - Remove `completePaste()` and `clearBuffers()` from useEffect cleanup
+    - Paste buffer refs now persist across React re-mounts and flush properly when paste end marker is received
+
+- [#4847](https://github.com/Kilo-Org/kilocode/pull/4847) [`8ee812a`](https://github.com/Kilo-Org/kilocode/commit/8ee812a18da5da691bf76ee5c5d9d94cfb678f25) Thanks [@chrarnoldus](https://github.com/chrarnoldus)! - Disable structured outputs for Anthropic models, because the tool schema doesn't yet support it
+
+- [#4843](https://github.com/Kilo-Org/kilocode/pull/4843) [`0e3520a`](https://github.com/Kilo-Org/kilocode/commit/0e3520a0aa9a74f7a28af1f820558d2343fd4fba) Thanks [@markijbema](https://github.com/markijbema)! - Filter unhelpful suggestions in chat autocomplete
+
 ## 4.143.0
 
 ### Minor Changes
diff --git a/apps/kilocode-docs/docs/providers/cerebras.md b/apps/kilocode-docs/docs/providers/cerebras.md
index 5a92ce88246..14ab0289b2f 100644
--- a/apps/kilocode-docs/docs/providers/cerebras.md
+++ b/apps/kilocode-docs/docs/providers/cerebras.md
@@ -20,7 +20,8 @@ Cerebras is known for their ultra-fast AI inference powered by the Cerebras CS-3
 
 Kilo Code supports the following Cerebras models:
 
 - `gpt-oss-120b` (Default) – High-performance open-source model optimized for fast inference
-- `zai-glm-4.6` – Advanced GLM model with enhanced reasoning capabilities
+- `zai-glm-4.6` – Fast general-purpose model on Cerebras (up to 1,000 tokens/s). To be deprecated soon.
+- `zai-glm-4.7` – Highly capable general-purpose model on Cerebras (up to 1,000 tokens/s), competitive with leading proprietary models on coding tasks.
 
 Refer to the [Cerebras documentation](https://docs.cerebras.ai/) for detailed information on model capabilities and performance characteristics.
diff --git a/cli/CHANGELOG.md b/cli/CHANGELOG.md
index 9e8cbf4ff42..60e44509f7b 100644
--- a/cli/CHANGELOG.md
+++ b/cli/CHANGELOG.md
@@ -1,5 +1,11 @@
 # @kilocode/cli
 
+## 0.19.2
+
+### Patch Changes
+
+- [#4829](https://github.com/Kilo-Org/kilocode/pull/4829) [`4e09e36`](https://github.com/Kilo-Org/kilocode/commit/4e09e36bba165a2ab6f5e07f71a420faa49ea3ec) Thanks [@marius-kilocode](https://github.com/marius-kilocode)! - Fix browser action results displaying raw base64 screenshot data as hexadecimal garbage
+
 ## 0.19.1
 
 ### Patch Changes
diff --git a/cli/README.md b/cli/README.md
index 2a9a9a7f101..74370cce808 100644
--- a/cli/README.md
+++ b/cli/README.md
@@ -245,7 +245,7 @@ To build and run the CLI locally off your branch:
 ```
 cd src
 pnpm bundle
 pnpm vsix
-pnpm vsix:unpackged
+pnpm vsix:unpacked
 cd ..
 ```
diff --git a/cli/package.dist.json b/cli/package.dist.json
index 6ef1d55840a..78d970a51ac 100644
--- a/cli/package.dist.json
+++ b/cli/package.dist.json
@@ -1,6 +1,6 @@
 {
     "name": "@kilocode/cli",
-    "version": "0.19.1",
+    "version": "0.19.2",
     "description": "Terminal User Interface for Kilo Code",
     "type": "module",
     "main": "index.js",
diff --git a/cli/package.json b/cli/package.json
index 5c90ddba5c7..4cb3c88a356 100644
--- a/cli/package.json
+++ b/cli/package.json
@@ -1,6 +1,6 @@
 {
     "name": "@kilocode/cli",
-    "version": "0.19.1",
+    "version": "0.19.2",
     "description": "Terminal User Interface for Kilo Code",
     "type": "module",
     "main": "dist/index.js",
diff --git a/cli/src/ui/messages/extension/say/SayBrowserActionResultMessage.tsx b/cli/src/ui/messages/extension/say/SayBrowserActionResultMessage.tsx
index c27d3130231..6a8eef3d9dc 100644
--- a/cli/src/ui/messages/extension/say/SayBrowserActionResultMessage.tsx
+++ b/cli/src/ui/messages/extension/say/SayBrowserActionResultMessage.tsx
@@ -1,14 +1,61 @@
 import React from "react"
 import { Box, Text } from "ink"
 import type { MessageComponentProps } from "../types.js"
-import { MarkdownText } from "../../../components/MarkdownText.js"
 import { useTheme } from "../../../../state/hooks/useTheme.js"
 
 /**
- * Display browser action results
+ * Parsed browser action result data
+ */
+interface BrowserActionResultData {
+    screenshot?: string
+    logs?: string
+    currentUrl?: string
+    currentMousePosition?: string
+    viewportWidth?: number
+    viewportHeight?: number
+}
+
+/**
+ * Parse browser action result from message text
+ */
+function parseBrowserActionResult(text: string | undefined): BrowserActionResultData | null {
+    if (!text) return null
+    try {
+        return JSON.parse(text) as BrowserActionResultData
+    } catch {
+        return null
+    }
+}
+
+/**
+ * Display browser action results in a readable format
+ * Parses the JSON data and shows meaningful info instead of raw base64 screenshot data
 */
 export const SayBrowserActionResultMessage: React.FC<MessageComponentProps> = ({ message }) => {
     const theme = useTheme()
+    const result = parseBrowserActionResult(message.text)
+
+    // If we can't parse, show a simple message
+    if (!result) {
+        return (
+            <Box flexDirection="column">
+                <Box>
+                    <Text color={theme.primary} bold>
+                        🌐 Browser Action Result
+                    </Text>
+                </Box>
+                <Box marginLeft={2}>
+                    <Text dimColor>Browser action completed</Text>
+                </Box>
+            </Box>
+        )
+    }
+
+    const hasScreenshot = !!result.screenshot
+    const hasLogs = result.logs && result.logs.trim().length > 0
+    const hasUrl = !!result.currentUrl
+    const hasViewport = result.viewportWidth && result.viewportHeight
+
     return (
         <Box flexDirection="column">
@@ -17,11 +64,56 @@ export const SayBrowserActionResultMessage
                     🌐 Browser Action Result
                 </Text>
             </Box>
-            {message.text && (
-                <Box marginTop={1}>
-                    <MarkdownText>{message.text}</MarkdownText>
-                </Box>
-            )}
+            <Box flexDirection="column" marginLeft={2}>
+                {/* Screenshot indicator */}
+                {hasScreenshot && (
+                    <Text color={theme.secondary}>
+                        📷 Screenshot captured
+                    </Text>
+                )}
+
+                {/* Current URL */}
+                {hasUrl && (
+                    <Box>
+                        <Text color={theme.secondary}>
+                            URL: {result.currentUrl}
+                        </Text>
+                    </Box>
+                )}
+
+                {/* Viewport dimensions */}
+                {hasViewport && (
+                    <Box>
+                        <Text color={theme.secondary}>
+                            Viewport: {result.viewportWidth}x{result.viewportHeight}
+                        </Text>
+                    </Box>
+                )}
+
+                {/* Cursor position */}
+                {result.currentMousePosition && (
+                    <Text color={theme.secondary}>
+                        Cursor: {result.currentMousePosition}
+                    </Text>
+                )}
+
+                {/* Console logs */}
+                {hasLogs && (
+                    <Box flexDirection="column">
+                        <Text color={theme.secondary}>Console logs:</Text>
+                        <Text dimColor>
+                            {result.logs}
+                        </Text>
+                    </Box>
+                )}
+
+                {/* Fallback if no meaningful data */}
+                {!hasScreenshot && !hasLogs && !hasUrl && !hasViewport && (
+                    <Text dimColor>
+                        Browser action completed
+                    </Text>
+                )}
+            </Box>
         </Box>
     )
 }
diff --git a/cli/src/ui/messages/extension/say/__tests__/SayBrowserActionResultMessage.test.tsx b/cli/src/ui/messages/extension/say/__tests__/SayBrowserActionResultMessage.test.tsx
new file mode 100644
index 00000000000..9a45519a584
--- /dev/null
+++ b/cli/src/ui/messages/extension/say/__tests__/SayBrowserActionResultMessage.test.tsx
@@ -0,0 +1,198 @@
+import React from "react"
+import { render } from "ink-testing-library"
+import { describe, it, expect } from "vitest"
+import { SayBrowserActionResultMessage } from "../SayBrowserActionResultMessage.js"
+import type { ExtensionChatMessage } from "../../../../../types/messages.js"
+
+describe("SayBrowserActionResultMessage", () => {
+    const baseMessage: ExtensionChatMessage = {
+        ts: Date.now(),
+        type: "say",
+        say: "browser_action_result",
+    }
+
+    it("should display header for browser action result", () => {
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={baseMessage} />,
+        )
+        expect(lastFrame()).toContain("Browser Action Result")
+    })
+
+    it("should show screenshot indicator instead of base64 data", () => {
+        const browserResult = {
+            screenshot: "data:image/webp;base64,UklGRn44AABXRUJQVlA4...", // Simulated base64 data
+            logs: "",
+            currentUrl: "https://example.com",
+            viewportWidth: 1280,
+            viewportHeight: 800,
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        const output = lastFrame()
+
+        // Should show screenshot indicator
+        expect(output).toContain("Screenshot captured")
+
+        // Should NOT contain the base64 data
+        expect(output).not.toContain("UklGRn44AABXRUJQVlA4")
+        expect(output).not.toContain("data:image")
+
+        // Should show URL
+        expect(output).toContain("https://example.com")
+
+        // Should show viewport
+        expect(output).toContain("1280x800")
+    })
+
+    it("should display console logs when present", () => {
+        const browserResult = {
+            logs: "Console: Hello from the page\nError: Something went wrong",
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        const output = lastFrame()
+        expect(output).toContain("Console logs:")
+        expect(output).toContain("Hello from the page")
+        expect(output).toContain("Something went wrong")
+    })
+
+    it("should display cursor position when present", () => {
+        const browserResult = {
+            currentMousePosition: "500,300",
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        expect(lastFrame()).toContain("Cursor: 500,300")
+    })
+
+    it("should handle empty result gracefully", () => {
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify({}) }} />,
+        )
+
+        expect(lastFrame()).toContain("Browser action completed")
+    })
+
+    it("should handle invalid JSON gracefully", () => {
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: "not valid json" }} />,
+        )
+
+        expect(lastFrame()).toContain("Browser action completed")
+    })
+
+    it("should handle missing text gracefully", () => {
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={baseMessage} />,
+        )
+
+        expect(lastFrame()).toContain("Browser action completed")
+    })
+
+    it("should not show logs section when logs are empty", () => {
+        const browserResult = {
+            screenshot: "data:image/png;base64,abc",
+            logs: "",
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        expect(lastFrame()).not.toContain("Console logs:")
+    })
+
+    it("should not show logs section when logs are only whitespace", () => {
+        const browserResult = {
+            screenshot: "data:image/png;base64,abc",
+            logs: " \n\t ",
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        expect(lastFrame()).not.toContain("Console logs:")
+    })
+
+    it("should display all available info together", () => {
+        const browserResult = {
+            screenshot: "data:image/png;base64,abc123",
+            logs: "Page loaded",
+            currentUrl: "https://test.com/page",
+            currentMousePosition: "100,200",
+            viewportWidth: 1920,
+            viewportHeight: 1080,
+        }
+
+        const { lastFrame } = render(
+            <SayBrowserActionResultMessage message={{ ...baseMessage, text: JSON.stringify(browserResult) }} />,
+        )
+
+        const output = lastFrame()
+        expect(output).toContain("Screenshot captured")
+        expect(output).toContain("https://test.com/page")
+        expect(output).toContain("1920x1080")
+        expect(output).toContain("100,200")
+        expect(output).toContain("Page loaded")
+    })
+})
diff --git a/packages/types/src/mode.ts b/packages/types/src/mode.ts
index 8de583e6aae..52ea2bb8619 100644
--- a/packages/types/src/mode.ts
+++ b/packages/types/src/mode.ts
@@ -148,7 +148,7 @@ export const DEFAULT_MODES: readonly ModeConfig[] = [
         description: "Plan and design before implementation",
         groups: ["read", ["edit", { fileRegex: "\\.md$", description: "Markdown files only" }], "browser", "mcp"],
         customInstructions:
-            "1. Do some information gathering (using provided tools) to get more context about the task.\n\n2. You should also ask the user clarifying questions to get a better understanding of the task.\n\n3. Once you've gained more context about the user's request, break down the task into clear, actionable steps and create a todo list using the `update_todo_list` tool. Each todo item should be:\n   - Specific and actionable\n   - Listed in logical execution order\n   - Focused on a single, well-defined outcome\n   - Clear enough that another mode could execute it independently\n\n   **Note:** If the `update_todo_list` tool is not available, write the plan to a markdown file (e.g., `plan.md` or `todo.md`) instead.\n\n4. As you gather more information or discover new requirements, update the todo list to reflect the current understanding of what needs to be accomplished.\n\n5. Ask the user if they are pleased with this plan, or if they would like to make any changes. Think of this as a brainstorming session where you can discuss the task and refine the todo list.\n\n6. Include Mermaid diagrams if they help clarify complex workflows or system architecture. Please avoid using double quotes (\"\") and parentheses () inside square brackets ([]) in Mermaid diagrams, as this can cause parsing errors.\n\n7. Use the switch_mode tool to request that the user switch to another mode to implement the solution.\n\n**IMPORTANT: Focus on creating clear, actionable todo lists rather than lengthy markdown documents. Use the todo list as your primary planning tool to track and organize the work that needs to be done.**\n\n**CRITICAL: Never provide level of effort time estimates (e.g., hours, days, weeks) for tasks. Focus solely on breaking down the work into clear, actionable steps without estimating how long they will take.**\n\nUnless told otherwise, if you want to save a plan file, put it in the /plans directory",
+            "1. Do some information gathering (using provided tools) to get more context about the task.\n\n2. You should also ask the user clarifying questions to get a better understanding of the task.\n\n3. Once you've gained more context about the user's request, break down the task into clear, actionable steps and create a todo list using the `update_todo_list` tool. Each todo item should be:\n   - Specific and actionable\n   - Listed in logical execution order\n   - Focused on a single, well-defined outcome\n   - Clear enough that another mode could execute it independently\n\n   **Note:** If the `update_todo_list` tool is not available, write the plan to a markdown file (e.g., `plan.md` or `todo.md`) instead.\n\n4. As you gather more information or discover new requirements, update the todo list to reflect the current understanding of what needs to be accomplished.\n\n5. Ask the user if they are pleased with this plan, or if they would like to make any changes. Think of this as a brainstorming session where you can discuss the task and refine the todo list.\n\n6. Include Mermaid diagrams if they help clarify complex workflows or system architecture. Please avoid using double quotes (\"\") and parentheses () inside square brackets ([]) in Mermaid diagrams, as this can cause parsing errors.\n\n7. Use the switch_mode tool to request switching to another mode when you need to edit non-markdown files (like source code files: .ts, .js, .py, .java, etc.) or execute commands. You CAN directly create and edit markdown files (.md) without switching modes.\n\n**IMPORTANT: Focus on creating clear, actionable todo lists rather than lengthy markdown documents. Use the todo list as your primary planning tool to track and organize the work that needs to be done.**\n\n**CRITICAL: Never provide level of effort time estimates (e.g., hours, days, weeks) for tasks. Focus solely on breaking down the work into clear, actionable steps without estimating how long they will take.**\n\nUnless told otherwise, if you want to save a plan file, put it in the /plans directory",
     },
     {
         slug: "code",
diff --git a/packages/types/src/providers/cerebras.ts b/packages/types/src/providers/cerebras.ts
index 1f28c00bdfd..c5f770d4b2a 100644
--- a/packages/types/src/providers/cerebras.ts
+++ b/packages/types/src/providers/cerebras.ts
@@ -14,7 +14,18 @@
         supportsNativeTools: true,
         inputPrice: 0,
         outputPrice: 0,
-        description: "Highly intelligent general purpose model with up to 1,000 tokens/s",
+        description: "Fast general-purpose model on Cerebras (up to 1,000 tokens/s). To be deprecated soon.",
+    },
+    "zai-glm-4.7": {
+        maxTokens: 16384, // Conservative default to avoid premature rate limiting (Cerebras reserves quota upfront)
+        contextWindow: 131072,
+        supportsImages: false,
+        supportsPromptCache: false,
+        supportsNativeTools: true,
+        inputPrice: 0,
+        outputPrice: 0,
+        description:
+            "Highly capable general-purpose model on Cerebras (up to 1,000 tokens/s), competitive with leading proprietary models on coding tasks.",
     },
     "qwen-3-235b-a22b-instruct-2507": {
         maxTokens: 16384, // Conservative default to avoid premature rate limiting
diff --git a/src/api/providers/openrouter.ts b/src/api/providers/openrouter.ts
index f51eeb05134..5fb60bde6df 100644
--- a/src/api/providers/openrouter.ts
+++ b/src/api/providers/openrouter.ts
@@ -20,7 +20,7 @@ import { resolveToolProtocol } from "../../utils/resolveToolProtocol"
 import { TOOL_PROTOCOL } from "@roo-code/types"
 import { ApiStreamChunk } from "../transform/stream"
 import { convertToR1Format } from "../transform/r1-format"
-import { addCacheBreakpoints as addAnthropicCacheBreakpoints } from "../transform/caching/anthropic"
+import { addAnthropicCacheBreakpoints } from "../transform/caching/kilocode" // kilocode_change: own implementation that supports tool results
 import { addCacheBreakpoints as addGeminiCacheBreakpoints } from "../transform/caching/gemini"
 import type { OpenRouterReasoningParams } from "../transform/reasoning"
 import { getModelParams } from "../transform/model-params"
diff --git a/src/api/transform/caching/__tests__/kilocode.spec.ts b/src/api/transform/caching/__tests__/kilocode.spec.ts
new file mode 100644
index 00000000000..e4d1ddbde62
--- /dev/null
+++ b/src/api/transform/caching/__tests__/kilocode.spec.ts
@@ -0,0 +1,245 @@
+// npx vitest run src/api/transform/caching/__tests__/kilocode.spec.ts
+
+import OpenAI from "openai"
+
+import { addAnthropicCacheBreakpoints } from "../kilocode"
+
+describe("addAnthropicCacheBreakpoints (Kilocode)", () => {
+    const systemPrompt = "You are a helpful assistant."
+ + it("should add a cache breakpoint to the system prompt", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "Hello" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should add a breakpoint to the only user message if only one exists", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "User message 1" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // System prompt gets cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // Last user message gets cache control + expect(messages[1].content).toEqual([ + { type: "text", text: "User message 1", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should add breakpoints to system, last user, and user before last assistant", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "User message 1" }, + { role: "assistant", content: "Assistant response 1" }, + { role: "user", content: "User message 2" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // System prompt gets cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // User message before last assistant gets cache control + expect(messages[1].content).toEqual([ + { type: "text", text: "User message 1", cache_control: { type: "ephemeral" } }, + ]) + + // Assistant message should not be modified + expect(messages[2].content).toBe("Assistant response 1") + + // Last user message gets cache control + expect(messages[3].content).toEqual([ + { type: "text", text: "User message 2", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should handle multiple assistant messages and find the user before the last one", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "User message 1" }, + { role: "assistant", content: "Assistant response 1" }, + { role: "user", content: "User message 2" }, + { role: "assistant", content: "Assistant response 2" }, + { role: "user", content: "User message 3" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // System prompt gets cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // First user message should NOT get cache control (not before last assistant) + expect(messages[1].content).toBe("User message 1") + + // User message before last assistant (index 4) gets cache control + expect(messages[3].content).toEqual([ + { type: "text", text: "User message 2", cache_control: { type: "ephemeral" } }, + ]) + + // Last user message gets cache control + expect(messages[5].content).toEqual([ + { type: "text", text: "User message 3", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should handle tool messages the same as user messages", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "User message 1" }, + { role: "assistant", content: 
"Let me use a tool" }, + { role: "tool", content: "Tool result", tool_call_id: "call_123" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // System prompt gets cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // User message before last assistant gets cache control + expect(messages[1].content).toEqual([ + { type: "text", text: "User message 1", cache_control: { type: "ephemeral" } }, + ]) + + // Tool message (last user/tool) gets cache control + expect(messages[3].content).toEqual([ + { type: "text", text: "Tool result", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should handle array content and add cache control to last item", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { + role: "user", + content: [ + { type: "text", text: "First part" }, + { type: "image_url", image_url: { url: "data:image/png;base64,..." } }, + { type: "text", text: "Last part" }, + ], + }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + expect(messages[1].content).toEqual([ + { type: "text", text: "First part" }, + { type: "image_url", image_url: { url: "data:image/png;base64,..." } }, + { type: "text", text: "Last part", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should add cache control to last item of array when it's an image", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { + role: "user", + content: [ + { type: "text", text: "Some text" }, + { type: "image_url", image_url: { url: "data:image/png;base64,..." } }, + ], + }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // Cache control should be on the last item (the image) + expect(messages[1].content).toEqual([ + { type: "text", text: "Some text" }, + { + type: "image_url", + image_url: { url: "data:image/png;base64,..." 
}, + cache_control: { type: "ephemeral" }, + }, + ]) + }) + + it("should not add breakpoints when there are no user or tool messages", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "assistant", content: "Hello" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // Only system prompt should get cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // Assistant message should not be modified + expect(messages[1].content).toBe("Hello") + }) + + it("should handle case when system prompt is found in messages array", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: "Different system prompt in array" }, + { role: "user", content: "Hello" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // Should use the system prompt found in messages, not the passed parameter + expect(messages[0].content).toEqual([ + { type: "text", text: "Different system prompt in array", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should handle when last user message is also user before last assistant (same message)", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: "User message 1" }, + { role: "assistant", content: "Assistant response" }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // System prompt gets cache control + expect(messages[0].content).toEqual([ + { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }, + ]) + + // User message 1 is both before last assistant and is the last user message + // It should have cache control set (the function calls setCacheControl twice on same message) + expect(messages[1].content).toEqual([ + { type: "text", text: "User message 1", cache_control: { type: "ephemeral" } }, + ]) + }) + + it("should handle empty messages array gracefully", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [] + + // Should not throw + expect(() => addAnthropicCacheBreakpoints(systemPrompt, messages)).not.toThrow() + }) + + it("should handle empty array content", () => { + const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [ + { role: "system", content: systemPrompt }, + { role: "user", content: [] }, + ] + + addAnthropicCacheBreakpoints(systemPrompt, messages) + + // Empty array should remain empty (no last item to add cache control to) + expect(messages[1].content).toEqual([]) + }) +}) diff --git a/src/api/transform/caching/kilocode.ts b/src/api/transform/caching/kilocode.ts new file mode 100644 index 00000000000..b955ef620a8 --- /dev/null +++ b/src/api/transform/caching/kilocode.ts @@ -0,0 +1,47 @@ +import OpenAI from "openai" +import { findLast, findLastIndex } from "../../../shared/array" + +function setCacheControl(message: OpenAI.ChatCompletionMessageParam) { + if (typeof message.content === "string") { + message.content = [ + { + type: "text", + text: message.content, + // @ts-ignore-next-line + cache_control: { type: "ephemeral" }, + }, + ] + } else if (Array.isArray(message.content)) { + const lastItem = message.content.at(-1) + if (lastItem) { + // @ts-ignore-next-line + lastItem.cache_control = { type: "ephemeral" } + } + } +} + +export function addAnthropicCacheBreakpoints( + _systemPrompt: string, + messages: OpenAI.Chat.ChatCompletionMessageParam[], +) { + const 
systemPrompt = messages.find((msg) => msg.role === "system") + if (systemPrompt) { + setCacheControl(systemPrompt) + } + + const lastUserMessage = findLast(messages, (msg) => msg.role === "user" || msg.role === "tool") + if (lastUserMessage) { + setCacheControl(lastUserMessage) + } + + const lastAssistantIndex = findLastIndex(messages, (msg) => msg.role === "assistant") + if (lastAssistantIndex >= 0) { + const previousUserMessage = findLast( + messages.slice(0, lastAssistantIndex), + (msg) => msg.role === "user" || msg.role === "tool", + ) + if (previousUserMessage) { + setCacheControl(previousUserMessage) + } + } +} diff --git a/src/core/prompts/__tests__/__snapshots__/add-custom-instructions/architect-mode-prompt.snap b/src/core/prompts/__tests__/__snapshots__/add-custom-instructions/architect-mode-prompt.snap index df49170e975..093e0d50222 100644 --- a/src/core/prompts/__tests__/__snapshots__/add-custom-instructions/architect-mode-prompt.snap +++ b/src/core/prompts/__tests__/__snapshots__/add-custom-instructions/architect-mode-prompt.snap @@ -497,7 +497,7 @@ Mode-specific Instructions: 6. Include Mermaid diagrams if they help clarify complex workflows or system architecture. Please avoid using double quotes ("") and parentheses () inside square brackets ([]) in Mermaid diagrams, as this can cause parsing errors. -7. Use the switch_mode tool to request that the user switch to another mode to implement the solution. +7. Use the switch_mode tool to request switching to another mode when you need to edit non-markdown files (like source code files: .ts, .js, .py, .java, etc.) or execute commands. You CAN directly create and edit markdown files (.md) without switching modes. **IMPORTANT: Focus on creating clear, actionable todo lists rather than lengthy markdown documents. Use the todo list as your primary planning tool to track and organize the work that needs to be done.** diff --git a/src/package.json b/src/package.json index 28d26e7f29b..a769233626e 100644 --- a/src/package.json +++ b/src/package.json @@ -3,7 +3,7 @@ "displayName": "%extension.displayName%", "description": "%extension.description%", "publisher": "kilocode", - "version": "4.143.0", + "version": "4.143.2", "icon": "assets/icons/logo-outline-black.png", "galleryBanner": { "color": "#FFFFFF",
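
A minimal usage sketch of the new caching helper, pieced together from `kilocode.ts` and its spec above. The message array below is illustrative, and the relative import assumes a caller next to `src/api/transform/caching/kilocode.ts`; the helper mutates `messages` in place rather than returning a new array.

```ts
import OpenAI from "openai"

// Illustrative import path; in the repo the helper lives at src/api/transform/caching/kilocode.ts
import { addAnthropicCacheBreakpoints } from "./kilocode"

const systemPrompt = "You are a helpful assistant."

const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    { role: "system", content: systemPrompt },
    { role: "user", content: "Read foo.ts" },
    { role: "assistant", content: "Here is what the file does..." },
    { role: "user", content: "Now summarize it" },
]

// Mutates `messages` in place: `cache_control: { type: "ephemeral" }` is attached
// to the last content part of the system prompt, the last user/tool message, and
// the user/tool message preceding the last assistant turn (the three breakpoints
// exercised by the spec above), so OpenRouter can reuse Anthropic's prompt cache
// across turns, including native tool-calling results.
addAnthropicCacheBreakpoints(systemPrompt, messages)
```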