diff --git a/README.md b/README.md
index 303a426a..dae742c8 100644
--- a/README.md
+++ b/README.md
@@ -1,39 +1,32 @@
-# Kubernetes MCP Server
+# OpenShift MCP Server
-[](https://github.com/containers/kubernetes-mcp-server/blob/main/LICENSE)
-[](https://www.npmjs.com/package/kubernetes-mcp-server)
-[](https://pypi.org/project/kubernetes-mcp-server/)
-[](https://github.com/containers/kubernetes-mcp-server/releases/latest)
-[](https://github.com/containers/kubernetes-mcp-server/actions/workflows/build.yaml)
-
-[β¨ Features](#features) | [π Getting Started](#getting-started) | [π₯ Demos](#demos) | [βοΈ Configuration](#configuration) | [π οΈ Tools](#tools-and-functionalities) | [π§βπ» Development](#development)
-
-https://github.com/user-attachments/assets/be2b67b3-fc1c-4d11-ae46-93deba8ed98e
+OpenShift MCP Server is currently under development.
## β¨ Features
A powerful and flexible Kubernetes [Model Context Protocol (MCP)](https://blog.marcnuri.com/model-context-protocol-mcp-introduction) server implementation with support for **Kubernetes** and **OpenShift**.
- **β
Configuration**:
- - Automatically detect changes in the Kubernetes configuration and update the MCP server.
- - **View** and manage the current [Kubernetes `.kube/config`](https://blog.marcnuri.com/where-is-my-default-kubeconfig-file) or in-cluster configuration.
+ - Automatically detect changes in the Kubernetes configuration and update the MCP server.
+ - **View** and manage the current [Kubernetes `.kube/config`](https://blog.marcnuri.com/where-is-my-default-kubeconfig-file) or in-cluster configuration.
- **β
Generic Kubernetes Resources**: Perform operations on **any** Kubernetes or OpenShift resource.
- - Any CRUD operation (Create or Update, Get, List, Delete).
+ - Any CRUD operation (Create or Update, Get, List, Delete).
- **β
Pods**: Perform Pod-specific operations.
- - **List** pods in all namespaces or in a specific namespace.
- - **Get** a pod by name from the specified namespace.
- - **Delete** a pod by name from the specified namespace.
- - **Show logs** for a pod by name from the specified namespace.
- - **Top** gets resource usage metrics for all pods or a specific pod in the specified namespace.
- - **Exec** into a pod and run a command.
- - **Run** a container image in a pod and optionally expose it.
+ - **List** pods in all namespaces or in a specific namespace.
+ - **Get** a pod by name from the specified namespace.
+ - **Delete** a pod by name from the specified namespace.
+ - **Show logs** for a pod by name from the specified namespace.
+ - **Top** gets resource usage metrics for all pods or a specific pod in the specified namespace.
+ - **Exec** into a pod and run a command.
+ - **Run** a container image in a pod and optionally expose it.
+ - **Node debug**: run privileged commands directly on cluster nodes via a managed debug pod.
- **β
Namespaces**: List Kubernetes Namespaces.
- **β
Events**: View Kubernetes events in all namespaces or in a specific namespace.
- **β
Projects**: List OpenShift Projects.
- **βΈοΈ Helm**:
- - **Install** a Helm chart in the current or provided namespace.
- - **List** Helm releases in all namespaces or in a specific namespace.
- - **Uninstall** a Helm release in the current or provided namespace.
+ - **Install** a Helm chart in the current or provided namespace.
+ - **List** Helm releases in all namespaces or in a specific namespace.
+ - **Uninstall** a Helm release in the current or provided namespace.
Unlike other Kubernetes MCP server implementations, this **IS NOT** just a wrapper around `kubectl` or `helm` command-line tools.
It is a **Go-based native implementation** that interacts directly with the Kubernetes API server.
@@ -94,7 +87,7 @@ code-insiders --add-mcp '{"name":"kubernetes","command":"npx","args":["kubernete
Install the Kubernetes MCP server extension in Cursor by pressing the following link:
-[](https://cursor.com/en/install-mcp?name=kubernetes-mcp-server&config=eyJjb21tYW5kIjoibnB4IC15IGt1YmVybmV0ZXMtbWNwLXNlcnZlckBsYXRlc3QifQ%3D%3D)
+[](https://cursor.com/install-mcp?name=kubernetes-mcp-server&config=JTdCJTIyY29tbWFuZCUyMiUzQSUyMm5weCUyMC15JTIwa3ViZXJuZXRlcy1tY3Atc2VydmVyJTQwbGF0ZXN0JTIyJTdE)
Alternatively, you can install the extension manually by editing the `mcp.json` file:
@@ -128,30 +121,6 @@ extensions:
```
-## π₯ Demos
-
-### Diagnosing and automatically fixing an OpenShift Deployment
-
-Demo showcasing how Kubernetes MCP server is leveraged by Claude Desktop to automatically diagnose and fix a deployment in OpenShift without any user assistance.
-
-https://github.com/user-attachments/assets/a576176d-a142-4c19-b9aa-a83dc4b8d941
-
-### _Vibe Coding_ a simple game and deploying it to OpenShift
-
-In this demo, I walk you through the process of _Vibe Coding_ a simple game using VS Code and how to leverage [Podman MCP server](https://github.com/manusa/podman-mcp-server) and Kubernetes MCP server to deploy it to OpenShift.
-
-
-
-
-
-### Supercharge GitHub Copilot with Kubernetes MCP Server in VS Code - One-Click Setup!
-
-In this demo, I'll show you how to set up Kubernetes MCP server in VS code just by clicking a link.
-
-
-
-
-
## βοΈ Configuration
The Kubernetes MCP server can be configured using command line (CLI) arguments.
@@ -183,141 +152,258 @@ uvx kubernetes-mcp-server@latest --help
| `--list-output` | Output format for resource list operations (one of: yaml, table) (default "table") |
| `--read-only` | If set, the MCP server will run in read-only mode, meaning it will not allow any write operations (create, update, delete) on the Kubernetes cluster. This is useful for debugging or inspecting the cluster without making changes. |
| `--disable-destructive` | If set, the MCP server will disable all destructive operations (delete, update, etc.) on the Kubernetes cluster. This is useful for debugging or inspecting the cluster without accidentally making changes. This option has no effect when `--read-only` is used. |
-| `--toolsets` | Comma-separated list of toolsets to enable. Check the [π οΈ Tools and Functionalities](#tools-and-functionalities) section for more information. |
-
-## π οΈ Tools and Functionalities
-
-The Kubernetes MCP server supports enabling or disabling specific groups of tools and functionalities (tools, resources, prompts, and so on) via the `--toolsets` command-line flag or `toolsets` configuration option.
-This allows you to control which Kubernetes functionalities are available to your AI tools.
-Enabling only the toolsets you need can help reduce the context size and improve the LLM's tool selection accuracy.
-
-### Available Toolsets
-
-The following sets of tools are available (all on by default):
-
-
-
-| Toolset | Description |
-|---------|-------------------------------------------------------------------------------------|
-| config | View and manage the current local Kubernetes configuration (kubeconfig) |
-| core | Most common tools for Kubernetes management (Pods, Generic Resources, Events, etc.) |
-| helm | Tools for managing Helm charts and releases |
-
-
-
-### Tools
-
+## π οΈ Tools
-
+### `configuration_view`
-config
+Get the current Kubernetes configuration content as a kubeconfig YAML
-- **configuration_view** - Get the current Kubernetes configuration content as a kubeconfig YAML
- - `minified` (`boolean`) - Return a minified version of the configuration. If set to true, keeps only the current-context and the relevant pieces of the configuration for that context. If set to false, all contexts, clusters, auth-infos, and users are returned in the configuration. (Optional, default true)
+**Parameters:**
+- `minified` (`boolean`, optional, default: `true`)
+ - Return a minified version of the configuration
+ - If `true`, keeps only the current-context and relevant configuration pieces
+ - If `false`, returns all contexts, clusters, auth-infos, and users
-
+### `events_list`
-
+List all the Kubernetes events in the current cluster from all namespaces
-core
+**Parameters:**
+- `namespace` (`string`, optional)
+ - Namespace to retrieve the events from. If not provided, will list events from all namespaces
-- **events_list** - List all the Kubernetes events in the current cluster from all namespaces
- - `namespace` (`string`) - Optional Namespace to retrieve the events from. If not provided, will list events from all namespaces
+### `helm_install`
-- **namespaces_list** - List all the Kubernetes namespaces in the current cluster
+Install a Helm chart in the current or provided namespace with the provided name and chart
-- **projects_list** - List all the OpenShift projects in the current cluster
+**Parameters:**
+- `chart` (`string`, required)
+ - Name of the Helm chart to install
+ - Can be a local path or a remote URL
+ - Example: `./my-chart.tgz` or `https://example.com/my-chart.tgz`
+- `values` (`object`, optional)
+ - Values to pass to the Helm chart
+ - Example: `{"key": "value"}`
+- `name` (`string`, optional)
+ - Name of the Helm release
+ - Random name if not provided
+- `namespace` (`string`, optional)
+ - Namespace to install the Helm chart in
+ - If not provided, will use the configured namespace
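+
+For illustration, an MCP `tools/call` request for this tool could be shaped as follows (the chart, release name, namespace, and values below are placeholders):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 1,
+  "method": "tools/call",
+  "params": {
+    "name": "helm_install",
+    "arguments": {
+      "chart": "stable/grafana",
+      "name": "my-grafana",
+      "namespace": "monitoring",
+      "values": {"replicaCount": 2}
+    }
+  }
+}
+```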
-- **pods_list** - List all the Kubernetes pods in the current cluster from all namespaces
- - `labelSelector` (`string`) - Optional Kubernetes label selector (e.g. 'app=myapp,env=prod' or 'app in (myapp,yourapp)'), use this option when you want to filter the pods by label
+### `helm_list`
-- **pods_list_in_namespace** - List all the Kubernetes pods in the specified namespace in the current cluster
- - `labelSelector` (`string`) - Optional Kubernetes label selector (e.g. 'app=myapp,env=prod' or 'app in (myapp,yourapp)'), use this option when you want to filter the pods by label
- - `namespace` (`string`) **(required)** - Namespace to list pods from
+List all the Helm releases in the current or provided namespace (or in all namespaces if specified)
-- **pods_get** - Get a Kubernetes Pod in the current or provided namespace with the provided name
- - `name` (`string`) **(required)** - Name of the Pod
- - `namespace` (`string`) - Namespace to get the Pod from
+**Parameters:**
+- `namespace` (`string`, optional)
+ - Namespace to list the Helm releases from
+ - If not provided, will use the configured namespace
+- `all_namespaces` (`boolean`, optional)
+ - If `true`, will list Helm releases from all namespaces
+ - If `false`, will list Helm releases from the specified namespace
-- **pods_delete** - Delete a Kubernetes Pod in the current or provided namespace with the provided name
- - `name` (`string`) **(required)** - Name of the Pod to delete
- - `namespace` (`string`) - Namespace to delete the Pod from
+### `helm_uninstall`
-- **pods_top** - List the resource consumption (CPU and memory) as recorded by the Kubernetes Metrics Server for the specified Kubernetes Pods in the all namespaces, the provided namespace, or the current namespace
- - `all_namespaces` (`boolean`) - If true, list the resource consumption for all Pods in all namespaces. If false, list the resource consumption for Pods in the provided namespace or the current namespace
- - `label_selector` (`string`) - Kubernetes label selector (e.g. 'app=myapp,env=prod' or 'app in (myapp,yourapp)'), use this option when you want to filter the pods by label (Optional, only applicable when name is not provided)
- - `name` (`string`) - Name of the Pod to get the resource consumption from (Optional, all Pods in the namespace if not provided)
- - `namespace` (`string`) - Namespace to get the Pods resource consumption from (Optional, current namespace if not provided and all_namespaces is false)
+Uninstall a Helm release in the current or provided namespace with the provided name
-- **pods_exec** - Execute a command in a Kubernetes Pod in the current or provided namespace with the provided name and command
- - `command` (`array`) **(required)** - Command to execute in the Pod container. The first item is the command to be run, and the rest are the arguments to that command. Example: ["ls", "-l", "/tmp"]
- - `container` (`string`) - Name of the Pod container where the command will be executed (Optional)
- - `name` (`string`) **(required)** - Name of the Pod where the command will be executed
- - `namespace` (`string`) - Namespace of the Pod where the command will be executed
+**Parameters:**
+- `name` (`string`, required)
+ - Name of the Helm release to uninstall
+- `namespace` (`string`, optional)
+ - Namespace to uninstall the Helm release from
+ - If not provided, will use the configured namespace
-- **pods_log** - Get the logs of a Kubernetes Pod in the current or provided namespace with the provided name
- - `container` (`string`) - Name of the Pod container to get the logs from (Optional)
- - `name` (`string`) **(required)** - Name of the Pod to get the logs from
- - `namespace` (`string`) - Namespace to get the Pod logs from
- - `previous` (`boolean`) - Return previous terminated container logs (Optional)
- - `tail` (`integer`) - Number of lines to retrieve from the end of the logs (Optional, default: 100)
+### `namespaces_list`
-- **pods_run** - Run a Kubernetes Pod in the current or provided namespace with the provided container image and optional name
- - `image` (`string`) **(required)** - Container Image to run in the Pod
- - `name` (`string`) - Name of the Pod (Optional, random name if not provided)
- - `namespace` (`string`) - Namespace to run the Pod in
- - `port` (`number`) - TCP/IP port to expose from the Pod container (Optional, no port exposed if not provided)
+List all the Kubernetes namespaces in the current cluster
-- **resources_list** - List Kubernetes resources and objects in the current cluster by providing their apiVersion and kind and optionally the namespace and label selector
-(common apiVersion and kind include: v1 Pod, v1 Service, v1 Node, apps/v1 Deployment, networking.k8s.io/v1 Ingress, route.openshift.io/v1 Route)
- - `apiVersion` (`string`) **(required)** - apiVersion of the resources (examples of valid apiVersion are: v1, apps/v1, networking.k8s.io/v1)
- - `kind` (`string`) **(required)** - kind of the resources (examples of valid kind are: Pod, Service, Deployment, Ingress)
- - `labelSelector` (`string`) - Optional Kubernetes label selector (e.g. 'app=myapp,env=prod' or 'app in (myapp,yourapp)'), use this option when you want to filter the pods by label
- - `namespace` (`string`) - Optional Namespace to retrieve the namespaced resources from (ignored in case of cluster scoped resources). If not provided, will list resources from all namespaces
+**Parameters:** None
-- **resources_get** - Get a Kubernetes resource in the current cluster by providing its apiVersion, kind, optionally the namespace, and its name
-(common apiVersion and kind include: v1 Pod, v1 Service, v1 Node, apps/v1 Deployment, networking.k8s.io/v1 Ingress, route.openshift.io/v1 Route)
- - `apiVersion` (`string`) **(required)** - apiVersion of the resource (examples of valid apiVersion are: v1, apps/v1, networking.k8s.io/v1)
- - `kind` (`string`) **(required)** - kind of the resource (examples of valid kind are: Pod, Service, Deployment, Ingress)
- - `name` (`string`) **(required)** - Name of the resource
- - `namespace` (`string`) - Optional Namespace to retrieve the namespaced resource from (ignored in case of cluster scoped resources). If not provided, will get resource from configured namespace
+### `nodes_debug_exec`
-- **resources_create_or_update** - Create or update a Kubernetes resource in the current cluster by providing a YAML or JSON representation of the resource
-(common apiVersion and kind include: v1 Pod, v1 Service, v1 Node, apps/v1 Deployment, networking.k8s.io/v1 Ingress, route.openshift.io/v1 Route)
- - `resource` (`string`) **(required)** - A JSON or YAML containing a representation of the Kubernetes resource. Should include top-level fields such as apiVersion,kind,metadata, and spec
+Run commands on an OpenShift node by creating a short-lived privileged debug pod that automatically chroots into the host filesystem. Output is limited to the latest 100 log lines, so use filtering (e.g., `grep` or `journalctl --since`) for high-volume commands.
-- **resources_delete** - Delete a Kubernetes resource in the current cluster by providing its apiVersion, kind, optionally the namespace, and its name
-(common apiVersion and kind include: v1 Pod, v1 Service, v1 Node, apps/v1 Deployment, networking.k8s.io/v1 Ingress, route.openshift.io/v1 Route)
- - `apiVersion` (`string`) **(required)** - apiVersion of the resource (examples of valid apiVersion are: v1, apps/v1, networking.k8s.io/v1)
- - `kind` (`string`) **(required)** - kind of the resource (examples of valid kind are: Pod, Service, Deployment, Ingress)
- - `name` (`string`) **(required)** - Name of the resource
- - `namespace` (`string`) - Optional Namespace to delete the namespaced resource from (ignored in case of cluster scoped resources). If not provided, will delete resource from configured namespace
+**Parameters:**
+- `node` (`string`, required)
+ - Name of the node to target (for example `worker-0`)
+- `command` (`string[]`, required)
+ - Command and arguments to run inside the node's host filesystem
+ - Example: `["systemctl", "status", "kubelet"]`
+- `namespace` (`string`, optional)
+ - Namespace used for the temporary debug pod
+ - Defaults to the configured namespace or `default`
+- `image` (`string`, optional)
+ - Override the container image used for the debug pod
+- `timeout_seconds` (`integer`, optional)
+ - Maximum time to wait for the command to finish (defaults to 300 seconds)
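+
+As a sketch, a `tools/call` request that tails recent kubelet logs on a node might look like this (node name and command are placeholders; the 100-line output limit is why the command filters with `--since`):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 2,
+  "method": "tools/call",
+  "params": {
+    "name": "nodes_debug_exec",
+    "arguments": {
+      "node": "worker-0",
+      "command": ["journalctl", "-u", "kubelet", "--since", "-5m"],
+      "timeout_seconds": 120
+    }
+  }
+}
+```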
-
+### `pods_delete`
-
+Delete a Kubernetes Pod in the current or provided namespace with the provided name
-helm
+**Parameters:**
+- `name` (`string`, required)
+ - Name of the Pod to delete
+- `namespace` (`string`, optional)
+ - Namespace to delete the Pod from
+ - If not provided, will use the configured namespace
-- **helm_install** - Install a Helm chart in the current or provided namespace
- - `chart` (`string`) **(required)** - Chart reference to install (for example: stable/grafana, oci://ghcr.io/nginxinc/charts/nginx-ingress)
- - `name` (`string`) - Name of the Helm release (Optional, random name if not provided)
- - `namespace` (`string`) - Namespace to install the Helm chart in (Optional, current namespace if not provided)
- - `values` (`object`) - Values to pass to the Helm chart (Optional)
+### `pods_exec`
-- **helm_list** - List all the Helm releases in the current or provided namespace (or in all namespaces if specified)
- - `all_namespaces` (`boolean`) - If true, lists all Helm releases in all namespaces ignoring the namespace argument (Optional)
- - `namespace` (`string`) - Namespace to list Helm releases from (Optional, all namespaces if not provided)
+Execute a command in a Kubernetes Pod in the current or provided namespace with the provided name and command
-- **helm_uninstall** - Uninstall a Helm release in the current or provided namespace
- - `name` (`string`) **(required)** - Name of the Helm release to uninstall
- - `namespace` (`string`) - Namespace to uninstall the Helm release from (Optional, current namespace if not provided)
+**Parameters:**
+- `command` (`string[]`, required)
+ - Command to execute in the Pod container
+ - First item is the command, rest are arguments
+ - Example: `["ls", "-l", "/tmp"]`
+- `name` (`string`, required)
+ - Name of the Pod where the command will be executed
+- `namespace` (`string`, optional)
+ - Namespace of the Pod where the command will be executed
+ - If not provided, will use the configured namespace
+- `container` (`string`, optional)
+ - Name of the Pod container where the command will be executed
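+
+For example, a `tools/call` request listing `/tmp` inside a pod could look like this (pod name and namespace are placeholders):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 3,
+  "method": "tools/call",
+  "params": {
+    "name": "pods_exec",
+    "arguments": {
+      "name": "my-pod",
+      "namespace": "default",
+      "command": ["ls", "-l", "/tmp"]
+    }
+  }
+}
+```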
-
+### `pods_get`
+Get a Kubernetes Pod in the current or provided namespace with the provided name
+
+**Parameters:**
+- `name` (`string`, required)
+ - Name of the Pod
+- `namespace` (`string`, optional)
+ - Namespace to get the Pod from
+ - If not provided, will use the configured namespace
+
+### `pods_list`
+
+List all the Kubernetes pods in the current cluster from all namespaces
+
+**Parameters:**
+- `labelSelector` (`string`, optional)
+ - Kubernetes label selector (e.g., 'app=myapp,env=prod' or 'app in (myapp,yourapp)'). Use this option to filter the pods by label
+
+### `pods_list_in_namespace`
+
+List all the Kubernetes pods in the specified namespace in the current cluster
+
+**Parameters:**
+- `namespace` (`string`, required)
+ - Namespace to list pods from
+- `labelSelector` (`string`, optional)
+ - Kubernetes label selector (e.g., 'app=myapp,env=prod' or 'app in (myapp,yourapp)'). Use this option to filter the pods by label
-
+### `pods_log`
+
+Get the logs of a Kubernetes Pod in the current or provided namespace with the provided name
+
+**Parameters:**
+- `name` (`string`, required)
+ - Name of the Pod to get logs from
+- `namespace` (`string`, optional)
+ - Namespace to get the Pod logs from
+ - If not provided, will use the configured namespace
+- `container` (`string`, optional)
+ - Name of the Pod container to get logs from
+
+### `pods_run`
+
+Run a Kubernetes Pod in the current or provided namespace with the provided container image and optional name
+
+**Parameters:**
+- `image` (`string`, required)
+ - Container Image to run in the Pod
+- `namespace` (`string`, optional)
+ - Namespace to run the Pod in
+ - If not provided, will use the configured namespace
+- `name` (`string`, optional)
+ - Name of the Pod (random name if not provided)
+- `port` (`number`, optional)
+ - TCP/IP port to expose from the Pod container
+ - No port exposed if not provided
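+
+A hypothetical `tools/call` request that runs an image and exposes a port (image, namespace, and port are placeholders):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 4,
+  "method": "tools/call",
+  "params": {
+    "name": "pods_run",
+    "arguments": {
+      "image": "nginx:alpine",
+      "namespace": "default",
+      "port": 80
+    }
+  }
+}
+```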
+
+### `pods_top`
+
+List the resource consumption (CPU and memory) as recorded by the Kubernetes Metrics Server for the specified Kubernetes Pods in all namespaces, the provided namespace, or the current namespace
+
+**Parameters:**
+- `all_namespaces` (`boolean`, optional, default: `true`)
+ - If `true`, lists resource consumption for Pods in all namespaces
+ - If `false`, lists resource consumption for Pods in the configured or provided namespace
+- `namespace` (`string`, optional)
+ - Namespace to list the Pod resources from
+ - If not provided, will list Pods from the configured namespace (when `all_namespaces` is false)
+- `name` (`string`, optional)
+ - Name of the Pod to get resource consumption from
+ - If not provided, will list resource consumption for all Pods in the applicable namespace(s)
+- `label_selector` (`string`, optional)
+ - Kubernetes label selector (e.g., 'app=myapp,env=prod' or 'app in (myapp,yourapp)'). Use this option to filter the pods by label
+ - Only applicable when `name` is not provided
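+
+Note that this tool uses snake_case parameter names. A sketch of a call that aggregates usage for labelled Pods across all namespaces (the selector is a placeholder):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 5,
+  "method": "tools/call",
+  "params": {
+    "name": "pods_top",
+    "arguments": {
+      "all_namespaces": true,
+      "label_selector": "app=myapp"
+    }
+  }
+}
+```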
+
+### `projects_list`
+
+List all the OpenShift projects in the current cluster
+
+**Parameters:** None
+
+### `resources_create_or_update`
+
+Create or update a Kubernetes resource in the current cluster by providing a YAML or JSON representation of the resource
+
+**Parameters:**
+- `resource` (`string`, required)
+ - JSON or YAML representation of the Kubernetes resource
+ - Should include top-level fields such as apiVersion, kind, metadata, and spec
+
+**Common apiVersion and kind include:**
+- v1 Pod
+- v1 Service
+- v1 Node
+- apps/v1 Deployment
+- networking.k8s.io/v1 Ingress
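+
+Because `resource` is a single string, the manifest is passed embedded in the JSON arguments. A sketch (the ConfigMap below is a placeholder):
+
+```json
+{
+  "jsonrpc": "2.0",
+  "id": 6,
+  "method": "tools/call",
+  "params": {
+    "name": "resources_create_or_update",
+    "arguments": {
+      "resource": "apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: demo-config\n  namespace: default\ndata:\n  key: value"
+    }
+  }
+}
+```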
+
+### `resources_delete`
+
+Delete a Kubernetes resource in the current cluster
+
+**Parameters:**
+- `apiVersion` (`string`, required)
+ - apiVersion of the resource (e.g., `v1`, `apps/v1`, `networking.k8s.io/v1`)
+- `kind` (`string`, required)
+ - kind of the resource (e.g., `Pod`, `Service`, `Deployment`, `Ingress`)
+- `name` (`string`, required)
+ - Name of the resource
+- `namespace` (`string`, optional)
+ - Namespace to delete the namespaced resource from
+ - Ignored for cluster-scoped resources
+ - Uses configured namespace if not provided
+
+### `resources_get`
+
+Get a Kubernetes resource in the current cluster
+
+**Parameters:**
+- `apiVersion` (`string`, required)
+ - apiVersion of the resource (e.g., `v1`, `apps/v1`, `networking.k8s.io/v1`)
+- `kind` (`string`, required)
+ - kind of the resource (e.g., `Pod`, `Service`, `Deployment`, `Ingress`)
+- `name` (`string`, required)
+ - Name of the resource
+- `namespace` (`string`, optional)
+ - Namespace to retrieve the namespaced resource from
+ - Ignored for cluster-scoped resources
+ - Uses configured namespace if not provided
+
+### `resources_list`
+
+List Kubernetes resources and objects in the current cluster
+
+**Parameters:**
+- `apiVersion` (`string`, required)
+ - apiVersion of the resources (e.g., `v1`, `apps/v1`, `networking.k8s.io/v1`)
+- `kind` (`string`, required)
+ - kind of the resources (e.g., `Pod`, `Service`, `Deployment`, `Ingress`)
+- `namespace` (`string`, optional)
+ - Namespace to retrieve the namespaced resources from
+ - Ignored for cluster-scoped resources
+ - Lists resources from all namespaces if not provided
+- `labelSelector` (`string`, optional)
+ - Kubernetes label selector (e.g., 'app=myapp,env=prod' or 'app in (myapp,yourapp)'). Use this option to filter the resources by label.
## π§βπ» Development
diff --git a/pkg/mcp/nodes_test.go b/pkg/mcp/nodes_test.go
new file mode 100644
index 00000000..4efc83ff
--- /dev/null
+++ b/pkg/mcp/nodes_test.go
@@ -0,0 +1,265 @@
+package mcp
+
+import (
+ "encoding/json"
+ "io"
+ "net/http"
+ "strings"
+ "testing"
+
+ "github.com/mark3labs/mcp-go/mcp"
+ v1 "k8s.io/api/core/v1"
+ metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ "k8s.io/apimachinery/pkg/runtime"
+ "k8s.io/apimachinery/pkg/runtime/serializer"
+
+ "github.com/containers/kubernetes-mcp-server/internal/test"
+)
+
+func TestNodesDebugExecTool(t *testing.T) {
+ testCase(t, func(c *mcpContext) {
+ mockServer := test.NewMockServer()
+ defer mockServer.Close()
+ c.withKubeConfig(mockServer.Config())
+
+ var (
+ createdPod v1.Pod
+ deleteCalled bool
+ )
+ const namespace = "debug"
+ const logOutput = "filesystem repaired"
+
+ scheme := runtime.NewScheme()
+ _ = v1.AddToScheme(scheme)
+ codec := serializer.NewCodecFactory(scheme).UniversalDeserializer()
+
+ mockServer.Handle(http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
+ switch {
+ case req.URL.Path == "/api":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIVersions","versions":["v1"],"serverAddressByClientCIDRs":[{"clientCIDR":"0.0.0.0/0"}]}`))
+ case req.URL.Path == "/apis":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIGroupList","apiVersion":"v1","groups":[]}`))
+ case req.URL.Path == "/api/v1":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIResourceList","apiVersion":"v1","resources":[{"name":"pods","singularName":"","namespaced":true,"kind":"Pod","verbs":["get","list","watch","create","update","patch","delete"]}]}`))
+ case req.Method == http.MethodPatch && strings.HasPrefix(req.URL.Path, "/api/v1/namespaces/"+namespace+"/pods/"):
+ // Handle server-side apply (PATCH with fieldManager query param)
+ // Avoid t.Fatalf here: the handler runs off the test goroutine, so report and return instead.
+ body, err := io.ReadAll(req.Body)
+ if err != nil {
+ t.Errorf("failed to read apply body: %v", err)
+ return
+ }
+ created := &v1.Pod{}
+ if _, _, err = codec.Decode(body, nil, created); err != nil {
+ t.Errorf("failed to decode apply body: %v", err)
+ return
+ }
+ createdPod = *created
+ // Keep the name from the request URL if it was provided
+ pathParts := strings.Split(req.URL.Path, "/")
+ if len(pathParts) > 0 {
+ createdPod.Name = pathParts[len(pathParts)-1]
+ }
+ createdPod.Namespace = namespace
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(&createdPod)
+ case req.Method == http.MethodPost && req.URL.Path == "/api/v1/namespaces/"+namespace+"/pods":
+ body, err := io.ReadAll(req.Body)
+ if err != nil {
+ t.Errorf("failed to read create body: %v", err)
+ return
+ }
+ created := &v1.Pod{}
+ if _, _, err = codec.Decode(body, nil, created); err != nil {
+ t.Errorf("failed to decode create body: %v", err)
+ return
+ }
+ createdPod = *created
+ createdPod.ObjectMeta = metav1.ObjectMeta{
+ Namespace: namespace,
+ Name: createdPod.GenerateName + "abc",
+ }
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(&createdPod)
+ case req.Method == http.MethodGet && createdPod.Name != "" && req.URL.Path == "/api/v1/namespaces/"+namespace+"/pods/"+createdPod.Name:
+ podStatus := createdPod.DeepCopy()
+ podStatus.Status = v1.PodStatus{
+ Phase: v1.PodSucceeded,
+ ContainerStatuses: []v1.ContainerStatus{{
+ Name: "debug",
+ State: v1.ContainerState{Terminated: &v1.ContainerStateTerminated{
+ ExitCode: 0,
+ }},
+ }},
+ }
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(podStatus)
+ case req.Method == http.MethodDelete && createdPod.Name != "" && req.URL.Path == "/api/v1/namespaces/"+namespace+"/pods/"+createdPod.Name:
+ deleteCalled = true
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(&metav1.Status{Status: "Success"})
+ case req.Method == http.MethodGet && createdPod.Name != "" && req.URL.Path == "/api/v1/namespaces/"+namespace+"/pods/"+createdPod.Name+"/log":
+ w.Header().Set("Content-Type", "text/plain")
+ _, _ = w.Write([]byte(logOutput))
+ }
+ }))
+
+ toolResult, err := c.callTool("nodes_debug_exec", map[string]any{
+ "node": "worker-0",
+ "namespace": namespace,
+ "command": []any{"uname", "-a"},
+ })
+
+ t.Run("call succeeds", func(t *testing.T) {
+ if err != nil {
+ t.Fatalf("call tool failed: %v", err)
+ }
+ if toolResult.IsError {
+ t.Fatalf("tool returned error: %v", toolResult.Content)
+ }
+ if len(toolResult.Content) == 0 {
+ t.Fatalf("expected output content")
+ }
+ text := toolResult.Content[0].(mcp.TextContent).Text
+ if text != logOutput {
+ t.Fatalf("unexpected tool output %q", text)
+ }
+ })
+
+ t.Run("debug pod shaped correctly", func(t *testing.T) {
+ if len(createdPod.Spec.Containers) != 1 {
+ t.Fatalf("expected single container in debug pod")
+ }
+ container := createdPod.Spec.Containers[0]
+ expectedPrefix := []string{"chroot", "/host", "uname", "-a"}
+ if !equalStringSlices(container.Command, expectedPrefix) {
+ t.Fatalf("unexpected debug command: %v", container.Command)
+ }
+ if container.SecurityContext == nil || container.SecurityContext.Privileged == nil || !*container.SecurityContext.Privileged {
+ t.Fatalf("expected privileged container")
+ }
+ if len(createdPod.Spec.Volumes) == 0 || createdPod.Spec.Volumes[0].HostPath == nil {
+ t.Fatalf("expected hostPath volume on debug pod")
+ }
+ if !deleteCalled {
+ t.Fatalf("expected debug pod to be deleted")
+ }
+ })
+ })
+}
+
+func equalStringSlices(a, b []string) bool {
+ if len(a) != len(b) {
+ return false
+ }
+ for i := range a {
+ if a[i] != b[i] {
+ return false
+ }
+ }
+ return true
+}
+
+func TestNodesDebugExecToolNonZeroExit(t *testing.T) {
+ testCase(t, func(c *mcpContext) {
+ mockServer := test.NewMockServer()
+ defer mockServer.Close()
+ c.withKubeConfig(mockServer.Config())
+
+ const namespace = "default"
+ const errorMessage = "failed"
+
+ scheme := runtime.NewScheme()
+ _ = v1.AddToScheme(scheme)
+ codec := serializer.NewCodecFactory(scheme).UniversalDeserializer()
+
+ mockServer.Handle(http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
+ switch {
+ case req.URL.Path == "/api":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIVersions","versions":["v1"],"serverAddressByClientCIDRs":[{"clientCIDR":"0.0.0.0/0"}]}`))
+ case req.URL.Path == "/apis":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIGroupList","apiVersion":"v1","groups":[]}`))
+ case req.URL.Path == "/api/v1":
+ w.Header().Set("Content-Type", "application/json")
+ _, _ = w.Write([]byte(`{"kind":"APIResourceList","apiVersion":"v1","resources":[{"name":"pods","singularName":"","namespaced":true,"kind":"Pod","verbs":["get","list","watch","create","update","patch","delete"]}]}`))
+ case req.Method == http.MethodPatch && strings.HasPrefix(req.URL.Path, "/api/v1/namespaces/"+namespace+"/pods/"):
+ // Handle server-side apply (PATCH with fieldManager query param)
+ body, err := io.ReadAll(req.Body)
+ if err != nil {
+ t.Errorf("failed to read apply body: %v", err)
+ return
+ }
+ pod := &v1.Pod{}
+ if _, _, err = codec.Decode(body, nil, pod); err != nil {
+ t.Errorf("failed to decode apply body: %v", err)
+ return
+ }
+ // Take the pod name from the request URL (strings.Split always returns at least one element).
+ pathParts := strings.Split(req.URL.Path, "/")
+ pod.Name = pathParts[len(pathParts)-1]
+ pod.Namespace = namespace
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(pod)
+ case req.Method == http.MethodPost && req.URL.Path == "/api/v1/namespaces/"+namespace+"/pods":
+ body, err := io.ReadAll(req.Body)
+ if err != nil {
+ // t.Fatalf must not be called from the handler goroutine; report via t.Errorf and return.
+ t.Errorf("failed to read create body: %v", err)
+ return
+ }
+ pod := &v1.Pod{}
+ if _, _, err = codec.Decode(body, nil, pod); err != nil {
+ t.Errorf("failed to decode create body: %v", err)
+ return
+ }
+ pod.ObjectMeta = metav1.ObjectMeta{Name: pod.GenerateName + "xyz", Namespace: namespace}
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(pod)
+ case strings.HasPrefix(req.URL.Path, "/api/v1/namespaces/"+namespace+"/pods/") && strings.HasSuffix(req.URL.Path, "/log"):
+ w.Header().Set("Content-Type", "text/plain")
+ _, _ = w.Write([]byte(errorMessage))
+ case req.Method == http.MethodGet && strings.HasPrefix(req.URL.Path, "/api/v1/namespaces/"+namespace+"/pods/"):
+ pathParts := strings.Split(req.URL.Path, "/")
+ podName := pathParts[len(pathParts)-1]
+ pod := &v1.Pod{
+ TypeMeta: metav1.TypeMeta{
+ APIVersion: "v1",
+ Kind: "Pod",
+ },
+ ObjectMeta: metav1.ObjectMeta{
+ Name: podName,
+ Namespace: namespace,
+ },
+ }
+ pod.Status = v1.PodStatus{
+ Phase: v1.PodSucceeded,
+ ContainerStatuses: []v1.ContainerStatus{{
+ Name: "debug",
+ State: v1.ContainerState{Terminated: &v1.ContainerStateTerminated{
+ ExitCode: 2,
+ Reason: "Error",
+ }},
+ }},
+ }
+ w.Header().Set("Content-Type", "application/json")
+ _ = json.NewEncoder(w).Encode(pod)
+ }
+ }))
+
+ toolResult, err := c.callTool("nodes_debug_exec", map[string]any{
+ "node": "infra-1",
+ "command": []any{"journalctl"},
+ })
+
+ if err != nil {
+ t.Fatalf("call tool failed: %v", err)
+ }
+ if !toolResult.IsError {
+ t.Fatalf("expected tool to return error")
+ }
+ text := toolResult.Content[0].(mcp.TextContent).Text
+ if !strings.Contains(text, "command exited with code 2") {
+ t.Fatalf("expected exit code message, got %q", text)
+ }
+ if !strings.Contains(text, "Error") {
+ t.Fatalf("expected error reason included, got %q", text)
+ }
+ })
+}
diff --git a/pkg/mcp/testdata/toolsets-core-tools.json b/pkg/mcp/testdata/toolsets-core-tools.json
index 43680dae..b9f9fce1 100644
--- a/pkg/mcp/testdata/toolsets-core-tools.json
+++ b/pkg/mcp/testdata/toolsets-core-tools.json
@@ -33,6 +33,50 @@
},
"name": "namespaces_list"
},
+ {
+ "annotations": {
+ "title": "Nodes: Debug Exec",
+ "readOnlyHint": false,
+ "destructiveHint": true,
+ "idempotentHint": false,
+ "openWorldHint": true
+ },
+ "description": "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ "inputSchema": {
+ "type": "object",
+ "properties": {
+ "node": {
+ "description": "Name of the node to debug (e.g. worker-0).",
+ "type": "string"
+ },
+ "command": {
+ "description": "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ "items": {
+ "type": "string"
+ },
+ "type": "array"
+ },
+ "namespace": {
+ "description": "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ "type": "string"
+ },
+ "image": {
+ "description": "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ "type": "string"
+ },
+ "timeout_seconds": {
+ "description": "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ "minimum": 1,
+ "type": "integer"
+ }
+ },
+ "required": [
+ "node",
+ "command"
+ ]
+ },
+ "name": "nodes_debug_exec"
+ },
{
"annotations": {
"title": "Pods: Delete",
diff --git a/pkg/mcp/testdata/toolsets-full-tools-multicluster-enum.json b/pkg/mcp/testdata/toolsets-full-tools-multicluster-enum.json
index 97af6fb5..d874a5c0 100644
--- a/pkg/mcp/testdata/toolsets-full-tools-multicluster-enum.json
+++ b/pkg/mcp/testdata/toolsets-full-tools-multicluster-enum.json
@@ -195,6 +195,58 @@
},
"name": "namespaces_list"
},
+ {
+ "annotations": {
+ "title": "Nodes: Debug Exec",
+ "readOnlyHint": false,
+ "destructiveHint": true,
+ "idempotentHint": false,
+ "openWorldHint": true
+ },
+ "description": "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ "inputSchema": {
+ "type": "object",
+ "properties": {
+ "node": {
+ "description": "Name of the node to debug (e.g. worker-0).",
+ "type": "string"
+ },
+ "command": {
+ "description": "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ "items": {
+ "type": "string"
+ },
+ "type": "array"
+ },
+ "context": {
+ "description": "Optional parameter selecting which context to run the tool in. Defaults to fake-context if not set",
+ "enum": [
+ "extra-cluster",
+ "fake-context"
+ ],
+ "type": "string"
+ },
+ "namespace": {
+ "description": "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ "type": "string"
+ },
+ "image": {
+ "description": "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ "type": "string"
+ },
+ "timeout_seconds": {
+ "description": "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ "minimum": 1,
+ "type": "integer"
+ }
+ },
+ "required": [
+ "node",
+ "command"
+ ]
+ },
+ "name": "nodes_debug_exec"
+ },
{
"annotations": {
"title": "Pods: Delete",
@@ -677,4 +729,4 @@
},
"name": "resources_list"
}
-]
+]
\ No newline at end of file
diff --git a/pkg/mcp/testdata/toolsets-full-tools-multicluster.json b/pkg/mcp/testdata/toolsets-full-tools-multicluster.json
index 861a1b5a..e05452c8 100644
--- a/pkg/mcp/testdata/toolsets-full-tools-multicluster.json
+++ b/pkg/mcp/testdata/toolsets-full-tools-multicluster.json
@@ -175,6 +175,54 @@
},
"name": "namespaces_list"
},
+ {
+ "annotations": {
+ "title": "Nodes: Debug Exec",
+ "readOnlyHint": false,
+ "destructiveHint": true,
+ "idempotentHint": false,
+ "openWorldHint": true
+ },
+ "description": "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ "inputSchema": {
+ "type": "object",
+ "properties": {
+ "node": {
+ "description": "Name of the node to debug (e.g. worker-0).",
+ "type": "string"
+ },
+ "command": {
+ "description": "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ "items": {
+ "type": "string"
+ },
+ "type": "array"
+ },
+ "context": {
+ "description": "Optional parameter selecting which context to run the tool in. Defaults to fake-context if not set",
+ "type": "string"
+ },
+ "namespace": {
+ "description": "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ "type": "string"
+ },
+ "image": {
+ "description": "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ "type": "string"
+ },
+ "timeout_seconds": {
+ "description": "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ "minimum": 1,
+ "type": "integer"
+ }
+ },
+ "required": [
+ "node",
+ "command"
+ ]
+ },
+ "name": "nodes_debug_exec"
+ },
{
"annotations": {
"title": "Pods: Delete",
@@ -609,4 +657,4 @@
},
"name": "resources_list"
}
-]
+]
\ No newline at end of file
diff --git a/pkg/mcp/testdata/toolsets-full-tools-openshift.json b/pkg/mcp/testdata/toolsets-full-tools-openshift.json
index b5018945..f299e837 100644
--- a/pkg/mcp/testdata/toolsets-full-tools-openshift.json
+++ b/pkg/mcp/testdata/toolsets-full-tools-openshift.json
@@ -139,6 +139,50 @@
},
"name": "namespaces_list"
},
+ {
+ "annotations": {
+ "title": "Nodes: Debug Exec",
+ "readOnlyHint": false,
+ "destructiveHint": true,
+ "idempotentHint": false,
+ "openWorldHint": true
+ },
+ "description": "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ "inputSchema": {
+ "type": "object",
+ "properties": {
+ "node": {
+ "description": "Name of the node to debug (e.g. worker-0).",
+ "type": "string"
+ },
+ "command": {
+ "description": "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ "items": {
+ "type": "string"
+ },
+ "type": "array"
+ },
+ "namespace": {
+ "description": "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ "type": "string"
+ },
+ "image": {
+ "description": "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ "type": "string"
+ },
+ "timeout_seconds": {
+ "description": "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ "minimum": 1,
+ "type": "integer"
+ }
+ },
+ "required": [
+ "node",
+ "command"
+ ]
+ },
+ "name": "nodes_debug_exec"
+ },
{
"annotations": {
"title": "Pods: Delete",
diff --git a/pkg/mcp/testdata/toolsets-full-tools.json b/pkg/mcp/testdata/toolsets-full-tools.json
index 7b9f471d..6533e46f 100644
--- a/pkg/mcp/testdata/toolsets-full-tools.json
+++ b/pkg/mcp/testdata/toolsets-full-tools.json
@@ -139,6 +139,50 @@
},
"name": "namespaces_list"
},
+ {
+ "annotations": {
+ "title": "Nodes: Debug Exec",
+ "readOnlyHint": false,
+ "destructiveHint": true,
+ "idempotentHint": false,
+ "openWorldHint": true
+ },
+ "description": "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ "inputSchema": {
+ "type": "object",
+ "properties": {
+ "node": {
+ "description": "Name of the node to debug (e.g. worker-0).",
+ "type": "string"
+ },
+ "command": {
+ "description": "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ "items": {
+ "type": "string"
+ },
+ "type": "array"
+ },
+ "namespace": {
+ "description": "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ "type": "string"
+ },
+ "image": {
+ "description": "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ "type": "string"
+ },
+ "timeout_seconds": {
+ "description": "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ "minimum": 1,
+ "type": "integer"
+ }
+ },
+ "required": [
+ "node",
+ "command"
+ ]
+ },
+ "name": "nodes_debug_exec"
+ },
{
"annotations": {
"title": "Pods: Delete",
diff --git a/pkg/ocp/nodes_debug.go b/pkg/ocp/nodes_debug.go
new file mode 100644
index 00000000..946e00b1
--- /dev/null
+++ b/pkg/ocp/nodes_debug.go
@@ -0,0 +1,346 @@
+package ocp
+
+import (
+ "context"
+ "errors"
+ "fmt"
+ "strings"
+ "time"
+
+ "github.com/containers/kubernetes-mcp-server/pkg/kubernetes"
+ "github.com/containers/kubernetes-mcp-server/pkg/version"
+
+ corev1 "k8s.io/api/core/v1"
+ metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
+ "k8s.io/apimachinery/pkg/runtime"
+ "k8s.io/apimachinery/pkg/runtime/schema"
+ "k8s.io/utils/ptr"
+ "sigs.k8s.io/yaml"
+)
+
+const (
+ // DefaultNodeDebugImage is a lightweight image that provides the tooling required to run chroot.
+ DefaultNodeDebugImage = "quay.io/fedora/fedora:latest"
+ // NodeDebugContainerName is the name used for the debug container, matching oc debug defaults.
+ NodeDebugContainerName = "debug"
+ // DefaultNodeDebugTimeout is the maximum time to wait for the debug pod to finish executing.
+ DefaultNodeDebugTimeout = 1 * time.Minute
+)
+
+// KubernetesClient defines the interface needed for node debug operations.
+type KubernetesClient interface {
+ NamespaceOrDefault(namespace string) string
+ ResourcesCreateOrUpdate(ctx context.Context, resource string) ([]*unstructured.Unstructured, error)
+ ResourcesGet(ctx context.Context, gvk *schema.GroupVersionKind, namespace, name string) (*unstructured.Unstructured, error)
+ ResourcesDelete(ctx context.Context, gvk *schema.GroupVersionKind, namespace, name string) error
+ PodsLog(ctx context.Context, namespace, name, container string, previous bool, tail int64) (string, error)
+}
+
+// NodesDebugExec mimics `oc debug node/<node-name> -- <command>` by creating a privileged pod on the target
+// node, running the provided command within a chroot of the host filesystem, collecting its output, and
+// removing the pod afterwards.
+//
+// When namespace is empty, the configured namespace (or "default" if none) is used. When image is empty the
+// default debug image is used. Timeout controls how long we wait for the pod to complete.
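+//
+// Example (sketch; assumes a client k that implements KubernetesClient):
+//
+//	out, err := NodesDebugExec(ctx, k, "default", "worker-0", "", []string{"uname", "-a"}, 2*time.Minute)
+//	// out holds the trimmed pod logs; on a non-zero exit code, err is set and out may still contain output.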
+func NodesDebugExec(
+ ctx context.Context,
+ k KubernetesClient,
+ namespace string,
+ nodeName string,
+ image string,
+ command []string,
+ timeout time.Duration,
+) (string, error) {
+ if nodeName == "" {
+ return "", errors.New("node name is required")
+ }
+ if len(command) == 0 {
+ return "", errors.New("command is required")
+ }
+
+ ns := k.NamespaceOrDefault(namespace)
+ if ns == "" {
+ ns = "default"
+ }
+ debugImage := image
+ if debugImage == "" {
+ debugImage = DefaultNodeDebugImage
+ }
+ if timeout <= 0 {
+ timeout = DefaultNodeDebugTimeout
+ }
+
+ // Create the debug pod
+ created, err := createDebugPod(ctx, k, nodeName, ns, debugImage, command)
+ if err != nil {
+ return "", err
+ }
+
+ // Ensure the pod is deleted regardless of completion state.
+ defer func() {
+ deleteCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+ defer cancel()
+ _ = k.ResourcesDelete(deleteCtx, &podGVK, ns, created.Name)
+ }()
+
+ // Poll for debug pod completion
+ terminated, lastPod, waitMsg, err := pollForCompletion(ctx, k, ns, created.Name, timeout)
+ if err != nil {
+ return "", err
+ }
+
+ // Retrieve the logs
+ logs, err := retrieveLogs(ctx, k, ns, created.Name)
+ if err != nil {
+ return "", err
+ }
+
+ // Process the results
+ return processResults(terminated, lastPod, waitMsg, logs)
+}
+
+// createDebugPod creates a privileged pod on the target node to run debug commands.
+func createDebugPod(
+ ctx context.Context,
+ k KubernetesClient,
+ nodeName string,
+ namespace string,
+ image string,
+ command []string,
+) (*corev1.Pod, error) {
+ sanitizedNode := sanitizeForName(nodeName)
+ hostPathType := corev1.HostPathDirectory
+
+ // Generate a unique name since ResourcesCreateOrUpdate doesn't support GenerateName
+ podName := fmt.Sprintf("node-debug-%s-%d", sanitizedNode, time.Now().UnixNano())
+
+ debugPod := &corev1.Pod{
+ ObjectMeta: metav1.ObjectMeta{
+ Name: podName,
+ Namespace: namespace,
+ Labels: map[string]string{
+ kubernetes.AppKubernetesManagedBy: version.BinaryName,
+ kubernetes.AppKubernetesComponent: "node-debug",
+ kubernetes.AppKubernetesName: fmt.Sprintf("node-debug-%s", sanitizedNode),
+ },
+ },
+ Spec: corev1.PodSpec{
+ AutomountServiceAccountToken: ptr.To(false),
+ NodeName: nodeName,
+ RestartPolicy: corev1.RestartPolicyNever,
+ SecurityContext: &corev1.PodSecurityContext{
+ RunAsUser: ptr.To[int64](0),
+ },
+ Tolerations: []corev1.Toleration{
+ {Operator: corev1.TolerationOpExists},
+ {Operator: corev1.TolerationOpExists, Effect: corev1.TaintEffectNoSchedule},
+ {Operator: corev1.TolerationOpExists, Effect: corev1.TaintEffectNoExecute},
+ },
+ Volumes: []corev1.Volume{
+ {
+ Name: "host-root",
+ VolumeSource: corev1.VolumeSource{
+ HostPath: &corev1.HostPathVolumeSource{
+ Path: "/",
+ Type: &hostPathType,
+ },
+ },
+ },
+ },
+ Containers: []corev1.Container{
+ {
+ Name: NodeDebugContainerName,
+ Image: image,
+ ImagePullPolicy: corev1.PullIfNotPresent,
+ Command: append([]string{"chroot", "/host"}, command...),
+ SecurityContext: &corev1.SecurityContext{
+ Privileged: ptr.To(true),
+ RunAsUser: ptr.To[int64](0),
+ },
+ VolumeMounts: []corev1.VolumeMount{
+ {Name: "host-root", MountPath: "/host"},
+ },
+ },
+ },
+ },
+ }
+
+ // Convert Pod to YAML for ResourcesCreateOrUpdate
+ debugPod.TypeMeta = metav1.TypeMeta{
+ APIVersion: "v1",
+ Kind: "Pod",
+ }
+ podYAML, err := yaml.Marshal(debugPod)
+ if err != nil {
+ return nil, fmt.Errorf("failed to marshal pod to YAML: %w", err)
+ }
+
+ // Create the pod using the high-level API
+ createdList, err := k.ResourcesCreateOrUpdate(ctx, string(podYAML))
+ if err != nil {
+ return nil, fmt.Errorf("failed to create debug pod: %w", err)
+ }
+
+ if len(createdList) != 1 {
+ return nil, fmt.Errorf("expected 1 pod to be created, got %d", len(createdList))
+ }
+ if createdList[0] == nil {
+ return nil, errors.New("ResourcesCreateOrUpdate returned nil pod")
+ }
+
+ // Convert back to typed Pod
+ pod, err := unstructuredToPod(createdList[0])
+ if err != nil {
+ return nil, fmt.Errorf("failed to convert created pod: %w", err)
+ }
+ return pod, nil
+}
+
+// pollForCompletion polls the debug pod until it completes or times out.
+func pollForCompletion(
+ ctx context.Context,
+ k KubernetesClient,
+ namespace string,
+ podName string,
+ timeout time.Duration,
+) (*corev1.ContainerStateTerminated, *corev1.Pod, string, error) {
+ pollCtx, cancel := context.WithTimeout(ctx, timeout)
+ defer cancel()
+
+ ticker := time.NewTicker(2 * time.Second)
+ defer ticker.Stop()
+
+ var (
+ lastPod *corev1.Pod
+ terminated *corev1.ContainerStateTerminated
+ waitMsg string
+ )
+
+ for {
+ select {
+ case <-pollCtx.Done():
+ return nil, nil, "", fmt.Errorf("timed out waiting for debug pod %s to complete: %w", podName, pollCtx.Err())
+ default:
+ }
+
+ // Get pod status using the high-level API
+ unstructuredPod, getErr := k.ResourcesGet(pollCtx, &podGVK, namespace, podName)
+ if getErr != nil {
+ return nil, nil, "", fmt.Errorf("failed to get debug pod status: %w", getErr)
+ }
+
+ current, err := unstructuredToPod(unstructuredPod)
+ if err != nil {
+ return nil, nil, "", err
+ }
+ lastPod = current
+
+ if status := containerStatusByName(current.Status.ContainerStatuses, NodeDebugContainerName); status != nil {
+ if status.State.Waiting != nil {
+ waitMsg = fmt.Sprintf("container waiting: %s", status.State.Waiting.Reason)
+ // Image pull issues should fail fast.
+ if status.State.Waiting.Reason == "ErrImagePull" || status.State.Waiting.Reason == "ImagePullBackOff" {
+ return nil, nil, "", fmt.Errorf("debug container failed to start (%s): %s", status.State.Waiting.Reason, status.State.Waiting.Message)
+ }
+ }
+ if status.State.Terminated != nil {
+ terminated = status.State.Terminated
+ break
+ }
+ }
+
+ if current.Status.Phase == corev1.PodFailed {
+ break
+ }
+
+ select {
+ case <-pollCtx.Done():
+ return nil, nil, "", fmt.Errorf("timed out waiting for debug pod %s to complete: %w", podName, pollCtx.Err())
+ case <-ticker.C:
+ }
+ }
+
+ return terminated, lastPod, waitMsg, nil
+}
+
+// retrieveLogs retrieves the logs from the debug pod.
+func retrieveLogs(ctx context.Context, k KubernetesClient, namespace, podName string) (string, error) {
+ logCtx, logCancel := context.WithTimeout(ctx, 30*time.Second)
+ defer logCancel()
+ logs, logErr := k.PodsLog(logCtx, namespace, podName, NodeDebugContainerName, false, 0)
+ if logErr != nil {
+ return "", fmt.Errorf("failed to retrieve debug pod logs: %w", logErr)
+ }
+ return strings.TrimSpace(logs), nil
+}
+
+// processResults processes the debug pod completion status and returns the appropriate result.
+func processResults(terminated *corev1.ContainerStateTerminated, lastPod *corev1.Pod, waitMsg, logs string) (string, error) {
+ if terminated != nil {
+ if terminated.ExitCode != 0 {
+ errMsg := fmt.Sprintf("command exited with code %d", terminated.ExitCode)
+ if terminated.Reason != "" {
+ errMsg = fmt.Sprintf("%s (%s)", errMsg, terminated.Reason)
+ }
+ if terminated.Message != "" {
+ errMsg = fmt.Sprintf("%s: %s", errMsg, terminated.Message)
+ }
+ return logs, errors.New(errMsg)
+ }
+ return logs, nil
+ }
+
+ if lastPod != nil && lastPod.Status.Reason != "" {
+ return logs, fmt.Errorf("debug pod failed: %s", lastPod.Status.Reason)
+ }
+ if waitMsg != "" {
+ return logs, fmt.Errorf("debug container did not complete: %s", waitMsg)
+ }
+ return logs, errors.New("debug container did not reach a terminal state")
+}
+
+func sanitizeForName(name string) string {
+ lower := strings.ToLower(name)
+ var b strings.Builder
+ b.Grow(len(lower))
+ for _, r := range lower {
+ if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' {
+ b.WriteRune(r)
+ continue
+ }
+ b.WriteRune('-')
+ }
+ sanitized := strings.Trim(b.String(), "-")
+ if sanitized == "" {
+ sanitized = "node"
+ }
+ if len(sanitized) > 40 {
+ sanitized = sanitized[:40]
+ }
+ return sanitized
+}
+
+func containerStatusByName(statuses []corev1.ContainerStatus, name string) *corev1.ContainerStatus {
+ for idx := range statuses {
+ if statuses[idx].Name == name {
+ return &statuses[idx]
+ }
+ }
+ return nil
+}
+
+// unstructuredToPod converts an unstructured object to a typed Pod.
+func unstructuredToPod(obj *unstructured.Unstructured) (*corev1.Pod, error) {
+ pod := &corev1.Pod{}
+ err := runtime.DefaultUnstructuredConverter.FromUnstructured(obj.Object, pod)
+ if err != nil {
+ return nil, fmt.Errorf("failed to convert unstructured to Pod: %w", err)
+ }
+ return pod, nil
+}
+
+var podGVK = schema.GroupVersionKind{Group: "", Version: "v1", Kind: "Pod"}
diff --git a/pkg/ocp/nodes_debug_test.go b/pkg/ocp/nodes_debug_test.go
new file mode 100644
index 00000000..93ebfe13
--- /dev/null
+++ b/pkg/ocp/nodes_debug_test.go
@@ -0,0 +1,360 @@
+package ocp
+
+import (
+ "context"
+ "fmt"
+ "strings"
+ "testing"
+ "time"
+
+ corev1 "k8s.io/api/core/v1"
+)
+
+func TestNodesDebugExecCreatesPrivilegedChrootPod(t *testing.T) {
+ env := NewNodeDebugTestEnv(t)
+ env.Pods.Logs = "kernel 6.8"
+
+ out, err := NodesDebugExec(context.Background(), env.KubernetesClient, "", "worker-0", "", []string{"uname", "-a"}, 2*time.Minute)
+ if err != nil {
+ t.Fatalf("NodesDebugExec returned error: %v", err)
+ }
+ if out != "kernel 6.8" {
+ t.Fatalf("unexpected command output: %q", out)
+ }
+
+ created := env.Pods.Created
+ if created == nil {
+ t.Fatalf("expected debug pod to be created")
+ }
+ if created.Namespace != "default" {
+ t.Fatalf("expected default namespace fallback, got %q", created.Namespace)
+ }
+ if created.Spec.NodeName != "worker-0" {
+ t.Fatalf("expected pod to target node worker-0, got %q", created.Spec.NodeName)
+ }
+ if !env.Pods.Deleted {
+ t.Fatalf("expected debug pod to be deleted after execution")
+ }
+
+ if len(created.Spec.Containers) != 1 {
+ t.Fatalf("expected single container in debug pod")
+ }
+ container := created.Spec.Containers[0]
+ if container.Image != DefaultNodeDebugImage {
+ t.Fatalf("expected default image %q, got %q", DefaultNodeDebugImage, container.Image)
+ }
+ expectedCommand := []string{"chroot", "/host", "uname", "-a"}
+ if len(container.Command) != len(expectedCommand) {
+ t.Fatalf("unexpected command length, got %v", container.Command)
+ }
+ for i, part := range expectedCommand {
+ if container.Command[i] != part {
+ t.Fatalf("command[%d] = %q, expected %q", i, container.Command[i], part)
+ }
+ }
+ if container.SecurityContext == nil || container.SecurityContext.Privileged == nil || !*container.SecurityContext.Privileged {
+ t.Fatalf("expected container to run privileged")
+ }
+ if len(container.VolumeMounts) != 1 || container.VolumeMounts[0].MountPath != "/host" {
+ t.Fatalf("expected container to mount host root at /host")
+ }
+
+ if created.Spec.SecurityContext == nil || created.Spec.SecurityContext.RunAsUser == nil || *created.Spec.SecurityContext.RunAsUser != 0 {
+ t.Fatalf("expected pod security context to run as root")
+ }
+
+ if len(created.Spec.Volumes) != 1 || created.Spec.Volumes[0].HostPath == nil {
+ t.Fatalf("expected host root volume to be configured")
+ }
+}
+
+func TestNodesDebugExecReturnsErrorForNonZeroExit(t *testing.T) {
+ env := NewNodeDebugTestEnv(t)
+ env.Pods.ExitCode = 5
+ env.Pods.TerminatedReason = "Error"
+ env.Pods.TerminatedMessage = "some failure"
+ env.Pods.Logs = "bad things happened"
+
+ out, err := NodesDebugExec(context.Background(), env.KubernetesClient, "debug-ns", "infra-node", "registry.example/custom:latest", []string{"journalctl", "-xe"}, time.Minute)
+ if err == nil {
+ t.Fatalf("expected error for non-zero exit code")
+ }
+ if out != "bad things happened" {
+ t.Fatalf("expected logs to be returned alongside error, got %q", out)
+ }
+
+ created := env.Pods.Created
+ if created == nil {
+ t.Fatalf("expected pod to be created")
+ }
+ if created.Namespace != "debug-ns" {
+ t.Fatalf("expected provided namespace to be used, got %q", created.Namespace)
+ }
+ if containerImage := created.Spec.Containers[0].Image; containerImage != "registry.example/custom:latest" {
+ t.Fatalf("expected custom image to be used, got %q", containerImage)
+ }
+}
+
+func TestCreateDebugPod(t *testing.T) {
+ env := NewNodeDebugTestEnv(t)
+
+ created, err := createDebugPod(context.Background(), env.KubernetesClient, "worker-1", "test-ns", "custom:v1", []string{"ls", "-la"})
+ if err != nil {
+ t.Fatalf("createDebugPod failed: %v", err)
+ }
+ if created == nil {
+ t.Fatalf("expected pod to be created")
+ }
+ if created.Namespace != "test-ns" {
+ t.Fatalf("expected namespace test-ns, got %q", created.Namespace)
+ }
+ if created.Spec.NodeName != "worker-1" {
+ t.Fatalf("expected node worker-1, got %q", created.Spec.NodeName)
+ }
+ if !strings.HasPrefix(created.Name, "node-debug-worker-1-") {
+ t.Fatalf("unexpected pod name: %q", created.Name)
+ }
+ if len(created.Spec.Containers) != 1 {
+ t.Fatalf("expected 1 container, got %d", len(created.Spec.Containers))
+ }
+ container := created.Spec.Containers[0]
+ if container.Image != "custom:v1" {
+ t.Fatalf("expected image custom:v1, got %q", container.Image)
+ }
+ expectedCmd := []string{"chroot", "/host", "ls", "-la"}
+ if len(container.Command) != len(expectedCmd) {
+ t.Fatalf("expected %d command parts, got %d", len(expectedCmd), len(container.Command))
+ }
+ for i, part := range expectedCmd {
+ if container.Command[i] != part {
+ t.Fatalf("command[%d] = %q, expected %q", i, container.Command[i], part)
+ }
+ }
+ if container.SecurityContext == nil || container.SecurityContext.Privileged == nil || !*container.SecurityContext.Privileged {
+ t.Fatalf("expected privileged container")
+ }
+}
+
+func TestPollForCompletion(t *testing.T) {
+ tests := []struct {
+ name string
+ exitCode int32
+ terminatedReason string
+ waitingReason string
+ waitingMessage string
+ expectError bool
+ expectTerminated bool
+ errorContains []string
+ expectedExitCode int32
+ expectedReason string
+ }{
+ {
+ name: "successful completion",
+ exitCode: 0,
+ expectTerminated: true,
+ expectedExitCode: 0,
+ },
+ {
+ name: "non-zero exit code",
+ exitCode: 42,
+ terminatedReason: "Error",
+ expectTerminated: true,
+ expectedExitCode: 42,
+ expectedReason: "Error",
+ },
+ {
+ name: "image pull error",
+ waitingReason: "ErrImagePull",
+ waitingMessage: "image not found",
+ expectError: true,
+ errorContains: []string{"ErrImagePull", "image not found"},
+ },
+ {
+ name: "image pull backoff",
+ waitingReason: "ImagePullBackOff",
+ waitingMessage: "back-off pulling image",
+ expectError: true,
+ errorContains: []string{"ImagePullBackOff", "back-off pulling image"},
+ },
+ }
+
+ for _, tt := range tests {
+ t.Run(tt.name, func(t *testing.T) {
+ env := NewNodeDebugTestEnv(t)
+ env.Pods.ExitCode = tt.exitCode
+ env.Pods.TerminatedReason = tt.terminatedReason
+ env.Pods.WaitingReason = tt.waitingReason
+ env.Pods.WaitingMessage = tt.waitingMessage
+
+ created, _ := createDebugPod(context.Background(), env.KubernetesClient, "node-1", "default", DefaultNodeDebugImage, []string{"echo", "test"})
+
+ terminated, lastPod, waitMsg, err := pollForCompletion(context.Background(), env.KubernetesClient, "default", created.Name, time.Minute)
+
+ if tt.expectError {
+ if err == nil {
+ t.Fatalf("expected error but got none")
+ }
+ for _, substr := range tt.errorContains {
+ if !strings.Contains(err.Error(), substr) {
+ t.Fatalf("expected error to contain %q, got: %v", substr, err)
+ }
+ }
+ return
+ }
+
+ if err != nil {
+ t.Fatalf("unexpected error: %v", err)
+ }
+
+ if tt.expectTerminated {
+ if terminated == nil {
+ t.Fatalf("expected terminated state")
+ }
+ if terminated.ExitCode != tt.expectedExitCode {
+ t.Fatalf("expected exit code %d, got %d", tt.expectedExitCode, terminated.ExitCode)
+ }
+ if tt.expectedReason != "" && terminated.Reason != tt.expectedReason {
+ t.Fatalf("expected reason %q, got %q", tt.expectedReason, terminated.Reason)
+ }
+ if lastPod == nil {
+ t.Fatalf("expected lastPod to be set")
+ }
+ }
+
+ if tt.waitingReason == "" && waitMsg != "" {
+ t.Fatalf("expected no wait message, got %q", waitMsg)
+ }
+ })
+ }
+}
+
+func TestRetrieveLogs(t *testing.T) {
+ env := NewNodeDebugTestEnv(t)
+ env.Pods.Logs = " some output with whitespace \n"
+
+ created, _ := createDebugPod(context.Background(), env.KubernetesClient, "node-1", "default", DefaultNodeDebugImage, []string{"echo", "test"})
+
+ logs, err := retrieveLogs(context.Background(), env.KubernetesClient, "default", created.Name)
+ if err != nil {
+ t.Fatalf("retrieveLogs failed: %v", err)
+ }
+ if logs != "some output with whitespace" {
+ t.Fatalf("expected trimmed logs, got %q", logs)
+ }
+}
+
+func TestProcessResults(t *testing.T) {
+ tests := []struct {
+ name string
+ terminated *corev1.ContainerStateTerminated
+ pod *corev1.Pod
+ waitMsg string
+ logs string
+ expectError bool
+ errorContains []string
+ }{
+ {
+ name: "successful completion",
+ terminated: &corev1.ContainerStateTerminated{
+ ExitCode: 0,
+ },
+ logs: "success output",
+ expectError: false,
+ },
+ {
+ name: "non-zero exit code",
+ terminated: &corev1.ContainerStateTerminated{
+ ExitCode: 127,
+ Reason: "CommandNotFound",
+ Message: "command not found",
+ },
+ logs: "error logs",
+ expectError: true,
+ errorContains: []string{"127", "CommandNotFound", "command not found"},
+ },
+ {
+ name: "non-zero exit code without reason or message",
+ terminated: &corev1.ContainerStateTerminated{
+ ExitCode: 1,
+ },
+ logs: "failed",
+ expectError: true,
+ errorContains: []string{"command exited with code 1"},
+ },
+ {
+ name: "pod failed",
+ pod: &corev1.Pod{
+ Status: corev1.PodStatus{
+ Reason: "Evicted",
+ },
+ },
+ logs: "pod evicted",
+ expectError: true,
+ errorContains: []string{"Evicted"},
+ },
+ {
+ name: "container waiting",
+ waitMsg: "container waiting: ImagePullBackOff",
+ logs: "waiting logs",
+ expectError: true,
+ errorContains: []string{"did not complete"},
+ },
+ {
+ name: "no terminal state",
+ logs: "incomplete",
+ expectError: true,
+ errorContains: []string{"did not reach a terminal state"},
+ },
+ }
+
+ for _, tt := range tests {
+ t.Run(tt.name, func(t *testing.T) {
+ result, err := processResults(tt.terminated, tt.pod, tt.waitMsg, tt.logs)
+
+ if tt.expectError {
+ if err == nil {
+ t.Fatalf("expected error but got none")
+ }
+ for _, substr := range tt.errorContains {
+ if !strings.Contains(err.Error(), substr) {
+ t.Fatalf("expected error to contain %q, got: %v", substr, err)
+ }
+ }
+ } else {
+ if err != nil {
+ t.Fatalf("expected no error, got: %v", err)
+ }
+ }
+
+ if result != tt.logs {
+ t.Fatalf("expected result %q, got %q", tt.logs, result)
+ }
+ })
+ }
+}
+
+func TestSanitizeForName(t *testing.T) {
+ tests := []struct {
+ input string
+ expected string
+ }{
+ {"worker-0", "worker-0"},
+ {"WORKER-0", "worker-0"},
+ {"worker.0", "worker-0"},
+ {"worker_0", "worker-0"},
+ {"ip-10-0-1-42.ec2.internal", "ip-10-0-1-42-ec2-internal"},
+ {"", "node"},
+ {"---", "node"},
+ {strings.Repeat("a", 50), strings.Repeat("a", 40)},
+ {"Worker-Node_123.domain", "worker-node-123-domain"},
+ }
+
+ for _, tt := range tests {
+ t.Run(fmt.Sprintf("sanitize(%q)", tt.input), func(t *testing.T) {
+ result := sanitizeForName(tt.input)
+ if result != tt.expected {
+ t.Fatalf("expected %q, got %q", tt.expected, result)
+ }
+ })
+ }
+}
diff --git a/pkg/ocp/testhelpers.go b/pkg/ocp/testhelpers.go
new file mode 100644
index 00000000..f4a9788e
--- /dev/null
+++ b/pkg/ocp/testhelpers.go
@@ -0,0 +1,190 @@
+package ocp
+
+import (
+ "context"
+ "fmt"
+ "io"
+ "net/http"
+ "net/url"
+ "strings"
+ "testing"
+
+ corev1 "k8s.io/api/core/v1"
+ metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+ "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
+ "k8s.io/apimachinery/pkg/runtime"
+ "k8s.io/apimachinery/pkg/runtime/schema"
+ schemek8s "k8s.io/client-go/kubernetes/scheme"
+ corev1client "k8s.io/client-go/kubernetes/typed/core/v1"
+ restclient "k8s.io/client-go/rest"
+ "sigs.k8s.io/yaml"
+)
+
+// NodeDebugTestEnv bundles a Kubernetes instance with a controllable pods client for tests.
+type NodeDebugTestEnv struct {
+ KubernetesClient KubernetesClient // Interface for ocp package functions
+ Pods *FakePodInterface
+}
+
+// NewNodeDebugTestEnv constructs a testing harness for exercising NodesDebugExec.
+func NewNodeDebugTestEnv(t *testing.T) *NodeDebugTestEnv {
+ t.Helper()
+
+ podsClient := &FakePodInterface{}
+ fakeK8s := &FakeKubernetes{
+ pods: podsClient,
+ }
+
+ return &NodeDebugTestEnv{
+ KubernetesClient: fakeK8s,
+ Pods: podsClient,
+ }
+}
+
+// FakeKubernetes implements the subset of kubernetes.Kubernetes methods needed for testing.
+type FakeKubernetes struct {
+ pods *FakePodInterface
+}
+
+func (f *FakeKubernetes) NamespaceOrDefault(namespace string) string {
+ if namespace == "" {
+ return "default"
+ }
+ return namespace
+}
+
+func (f *FakeKubernetes) ResourcesCreateOrUpdate(ctx context.Context, resource string) ([]*unstructured.Unstructured, error) {
+ // Parse YAML to Pod
+ pod := &corev1.Pod{}
+ err := yaml.Unmarshal([]byte(resource), pod)
+ if err != nil {
+ return nil, fmt.Errorf("failed to unmarshal YAML: %w", err)
+ }
+
+ // Use the fake pod interface to create
+ created, err := f.pods.Create(ctx, pod, metav1.CreateOptions{})
+ if err != nil {
+ return nil, err
+ }
+
+ // Convert back to unstructured
+ unstructuredObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(created)
+ if err != nil {
+ return nil, fmt.Errorf("failed to convert to unstructured: %w", err)
+ }
+ return []*unstructured.Unstructured{{Object: unstructuredObj}}, nil
+}
+
+func (f *FakeKubernetes) ResourcesGet(ctx context.Context, gvk *schema.GroupVersionKind, namespace, name string) (*unstructured.Unstructured, error) {
+ // Use the fake pod interface to get
+ pod, err := f.pods.Get(ctx, name, metav1.GetOptions{})
+ if err != nil {
+ return nil, err
+ }
+
+ // Convert to unstructured
+ unstructuredObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(pod)
+ if err != nil {
+ return nil, fmt.Errorf("failed to convert to unstructured: %w", err)
+ }
+ return &unstructured.Unstructured{Object: unstructuredObj}, nil
+}
+
+func (f *FakeKubernetes) ResourcesDelete(ctx context.Context, gvk *schema.GroupVersionKind, namespace, name string) error {
+ return f.pods.Delete(ctx, name, metav1.DeleteOptions{})
+}
+
+func (f *FakeKubernetes) PodsLog(ctx context.Context, namespace, name, container string, previous bool, tail int64) (string, error) {
+	req := f.pods.GetLogs(name, &corev1.PodLogOptions{Container: container, Previous: previous}) // tail is intentionally ignored: the fake serves f.Logs verbatim
+ res := req.Do(ctx)
+ if res.Error() != nil {
+ return "", res.Error()
+ }
+ rawData, err := res.Raw()
+ if err != nil {
+ return "", err
+ }
+ return string(rawData), nil
+}
+
+// FakePodInterface implements corev1client.PodInterface with deterministic behaviour for tests.
+type FakePodInterface struct {
+ corev1client.PodInterface
+ Created *corev1.Pod
+ Deleted bool
+ ExitCode int32
+ TerminatedReason string
+ TerminatedMessage string
+ WaitingReason string
+ WaitingMessage string
+ Logs string
+}
+
+func (f *FakePodInterface) Create(ctx context.Context, pod *corev1.Pod, opts metav1.CreateOptions) (*corev1.Pod, error) {
+	created := pod.DeepCopy()
+	if created.Name == "" && created.GenerateName != "" {
+		created.Name = created.GenerateName + "test"
+	}
+	f.Created = created
+	return created.DeepCopy(), nil
+}
+
+func (f *FakePodInterface) Get(ctx context.Context, name string, opts metav1.GetOptions) (*corev1.Pod, error) {
+ if f.Created == nil {
+ return nil, fmt.Errorf("pod not created yet")
+ }
+ pod := f.Created.DeepCopy()
+
+ // If waiting state is set, return that instead of terminated
+ if f.WaitingReason != "" {
+ waiting := &corev1.ContainerStateWaiting{Reason: f.WaitingReason}
+ if f.WaitingMessage != "" {
+ waiting.Message = f.WaitingMessage
+ }
+ pod.Status.ContainerStatuses = []corev1.ContainerStatus{{
+ Name: NodeDebugContainerName,
+ State: corev1.ContainerState{Waiting: waiting},
+ }}
+ pod.Status.Phase = corev1.PodPending
+ return pod, nil
+ }
+
+ // Otherwise return terminated state
+ terminated := &corev1.ContainerStateTerminated{ExitCode: f.ExitCode}
+ if f.TerminatedReason != "" {
+ terminated.Reason = f.TerminatedReason
+ }
+ if f.TerminatedMessage != "" {
+ terminated.Message = f.TerminatedMessage
+ }
+ pod.Status.ContainerStatuses = []corev1.ContainerStatus{{
+ Name: NodeDebugContainerName,
+ State: corev1.ContainerState{Terminated: terminated},
+ }}
+ pod.Status.Phase = corev1.PodSucceeded
+ return pod, nil
+}
+
+func (f *FakePodInterface) Delete(ctx context.Context, name string, opts metav1.DeleteOptions) error {
+ f.Deleted = true
+ return nil
+}
+
+func (f *FakePodInterface) GetLogs(name string, opts *corev1.PodLogOptions) *restclient.Request {
+	client := &http.Client{Transport: roundTripperFunc(func(*http.Request) (*http.Response, error) {
+		body := io.NopCloser(strings.NewReader(f.Logs)) // fresh reader per request so logs can be read more than once
+		return &http.Response{StatusCode: http.StatusOK, Body: body}, nil
+	})}
+ content := restclient.ClientContentConfig{
+ ContentType: runtime.ContentTypeJSON,
+ GroupVersion: schema.GroupVersion{Version: "v1"},
+ Negotiator: runtime.NewClientNegotiator(schemek8s.Codecs.WithoutConversion(), schema.GroupVersion{Version: "v1"}),
+ }
+ return restclient.NewRequestWithClient(&url.URL{Scheme: "https", Host: "localhost"}, "", content, client).Verb("GET")
+}
+
+type roundTripperFunc func(*http.Request) (*http.Response, error)
+
+func (f roundTripperFunc) RoundTrip(req *http.Request) (*http.Response, error) {
+ return f(req)
+}
diff --git a/pkg/toolsets/core/nodes.go b/pkg/toolsets/core/nodes.go
new file mode 100644
index 00000000..e69102a3
--- /dev/null
+++ b/pkg/toolsets/core/nodes.go
@@ -0,0 +1,126 @@
+package core
+
+import (
+ "errors"
+ "fmt"
+ "time"
+
+ "github.com/google/jsonschema-go/jsonschema"
+ "k8s.io/utils/ptr"
+
+ "github.com/containers/kubernetes-mcp-server/pkg/api"
+ "github.com/containers/kubernetes-mcp-server/pkg/ocp"
+)
+
+func initNodes() []api.ServerTool {
+ return []api.ServerTool{
+ {
+ Tool: api.Tool{
+ Name: "nodes_debug_exec",
+ Description: "Run commands on an OpenShift node using a privileged debug pod (output is truncated to the most recent 100 lines, so prefer filters like grep when expecting large logs).",
+ InputSchema: &jsonschema.Schema{
+ Type: "object",
+ Properties: map[string]*jsonschema.Schema{
+ "node": {
+ Type: "string",
+ Description: "Name of the node to debug (e.g. worker-0).",
+ },
+ "command": {
+ Type: "array",
+ Description: "Command to execute on the node via chroot. Provide each argument as a separate array item (e.g. ['systemctl', 'status', 'kubelet']).",
+ Items: &jsonschema.Schema{Type: "string"},
+ },
+ "namespace": {
+ Type: "string",
+ Description: "Namespace to create the temporary debug pod in (optional, defaults to the current namespace or 'default').",
+ },
+ "image": {
+ Type: "string",
+ Description: "Container image to use for the debug pod (optional). Defaults to a Fedora-based utility image that includes chroot.",
+ },
+ "timeout_seconds": {
+ Type: "integer",
+ Description: "Maximum time to wait for the command to complete before timing out (optional, defaults to 300 seconds).",
+ Minimum: ptr.To(float64(1)),
+ },
+ },
+ Required: []string{"node", "command"},
+ },
+ Annotations: api.ToolAnnotations{
+ Title: "Nodes: Debug Exec",
+ ReadOnlyHint: ptr.To(false),
+ DestructiveHint: ptr.To(true),
+ IdempotentHint: ptr.To(false),
+ OpenWorldHint: ptr.To(true),
+ },
+ },
+ Handler: nodesDebugExec,
+ },
+ }
+}
+
+func nodesDebugExec(params api.ToolHandlerParams) (*api.ToolCallResult, error) {
+	nodeArg := params.GetArguments()["node"]
+	nodeName, ok := nodeArg.(string)
+	if !ok || nodeName == "" { // a nil or non-string value already fails the type assertion
+		return api.NewToolCallResult("", errors.New("missing required argument: node")), nil
+	}
+
+ commandArg := params.GetArguments()["command"]
+ command, err := toStringSlice(commandArg)
+ if err != nil {
+ return api.NewToolCallResult("", fmt.Errorf("invalid command argument: %w", err)), nil
+ }
+
+ namespace := ""
+ if nsArg, ok := params.GetArguments()["namespace"].(string); ok {
+ namespace = nsArg
+ }
+
+ image := ""
+ if imageArg, ok := params.GetArguments()["image"].(string); ok {
+ image = imageArg
+ }
+
+ var timeout time.Duration
+ if timeoutRaw, exists := params.GetArguments()["timeout_seconds"]; exists && timeoutRaw != nil {
+ switch v := timeoutRaw.(type) {
+ case float64:
+ timeout = time.Duration(int64(v)) * time.Second
+ case int:
+ timeout = time.Duration(v) * time.Second
+ case int64:
+ timeout = time.Duration(v) * time.Second
+ default:
+ return api.NewToolCallResult("", errors.New("timeout_seconds must be a numeric value")), nil
+ }
+ }
+
+ output, execErr := ocp.NodesDebugExec(params.Context, params.Kubernetes, namespace, nodeName, image, command, timeout)
+ if output == "" && execErr == nil {
+ output = fmt.Sprintf("Command executed successfully on node %s but produced no output.", nodeName)
+ }
+ return api.NewToolCallResult(output, execErr), nil
+}
+
+func toStringSlice(arg any) ([]string, error) {
+ if arg == nil {
+ return nil, errors.New("command is required")
+ }
+	raw, ok := arg.([]any)
+ if !ok {
+ return nil, errors.New("command must be an array of strings")
+ }
+ if len(raw) == 0 {
+ return nil, errors.New("command array cannot be empty")
+ }
+ command := make([]string, 0, len(raw))
+ for _, item := range raw {
+ str, ok := item.(string)
+ if !ok {
+ return nil, errors.New("command items must be strings")
+ }
+ command = append(command, str)
+ }
+ return command, nil
+}
diff --git a/pkg/toolsets/core/nodes_test.go b/pkg/toolsets/core/nodes_test.go
new file mode 100644
index 00000000..c89462fe
--- /dev/null
+++ b/pkg/toolsets/core/nodes_test.go
@@ -0,0 +1,95 @@
+package core
+
+import (
+ "context"
+ "testing"
+ "time"
+
+ "github.com/containers/kubernetes-mcp-server/pkg/api"
+ "github.com/containers/kubernetes-mcp-server/pkg/ocp"
+)
+
+type staticRequest struct {
+ args map[string]any
+}
+
+func (s staticRequest) GetArguments() map[string]any {
+ return s.args
+}
+
+func TestNodesDebugExecHandlerValidatesInput(t *testing.T) {
+ t.Run("missing node", func(t *testing.T) {
+ params := api.ToolHandlerParams{
+ Context: context.Background(),
+ ToolCallRequest: staticRequest{args: map[string]any{}},
+ }
+ result, err := nodesDebugExec(params)
+ if err != nil {
+ t.Fatalf("handler returned error: %v", err)
+ }
+ if result.Error == nil || result.Error.Error() != "missing required argument: node" {
+ t.Fatalf("unexpected error: %v", result.Error)
+ }
+ })
+
+ t.Run("invalid command type", func(t *testing.T) {
+ params := api.ToolHandlerParams{
+ Context: context.Background(),
+ ToolCallRequest: staticRequest{args: map[string]any{
+ "node": "worker-0",
+ "command": "ls -la",
+ }},
+ }
+ result, err := nodesDebugExec(params)
+ if err != nil {
+ t.Fatalf("handler returned error: %v", err)
+ }
+ if result.Error == nil || result.Error.Error() != "invalid command argument: command must be an array of strings" {
+ t.Fatalf("unexpected error: %v", result.Error)
+ }
+ })
+}
+
+func TestNodesDebugExecHandlerExecutesCommand(t *testing.T) {
+ env := ocp.NewNodeDebugTestEnv(t)
+ env.Pods.Logs = "done"
+
+ // Call NodesDebugExec directly instead of going through the handler
+ // This avoids the need to mock the full kubernetes.Kubernetes type
+ output, err := ocp.NodesDebugExec(
+ context.Background(),
+ env.KubernetesClient,
+ "debug",
+ "infra-node",
+ "registry.local/debug:latest",
+ []string{"systemctl", "status", "kubelet"},
+ 15*time.Second,
+ )
+
+ if err != nil {
+ t.Fatalf("NodesDebugExec returned error: %v", err)
+ }
+ if output != "done" {
+ t.Fatalf("unexpected output: %q", output)
+ }
+
+ created := env.Pods.Created
+ if created == nil {
+ t.Fatalf("expected pod creation")
+ }
+ if created.Namespace != "debug" {
+ t.Fatalf("expected namespace override, got %q", created.Namespace)
+ }
+ if created.Spec.Containers[0].Image != "registry.local/debug:latest" {
+ t.Fatalf("expected custom image, got %q", created.Spec.Containers[0].Image)
+ }
+ expectedCommand := []string{"chroot", "/host", "systemctl", "status", "kubelet"}
+ if len(created.Spec.Containers[0].Command) != len(expectedCommand) {
+ t.Fatalf("unexpected command length: %v", created.Spec.Containers[0].Command)
+ }
+ for i, part := range expectedCommand {
+ if created.Spec.Containers[0].Command[i] != part {
+ t.Fatalf("command[%d]=%q expected %q", i, created.Spec.Containers[0].Command[i], part)
+ }
+ }
+}
diff --git a/pkg/toolsets/core/toolset.go b/pkg/toolsets/core/toolset.go
index 9f88c7aa..dfd61f42 100644
--- a/pkg/toolsets/core/toolset.go
+++ b/pkg/toolsets/core/toolset.go
@@ -24,6 +24,7 @@ func (t *Toolset) GetTools(o internalk8s.Openshift) []api.ServerTool {
return slices.Concat(
initEvents(),
initNamespaces(o),
+ initNodes(),
initPods(),
initResources(o),
)