Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -237,3 +237,20 @@ The operator is a kbuernetes operator that monitors cluster events and coordinat
## [Skyhook Agent](agent/README.md)
The agent is what does the operators work and is a separate container from the package. The agent knowns how to read a package (/skyhook_package/config.json) is what implements the [lifecycle](#stages) packages go though.

## [Skyhook CLI](docs/cli.md)
A kubectl plugin for managing Skyhook deployments, packages, and nodes. Provides SRE tooling for inspecting node/package state, forcing re-runs, managing node lifecycle, and retrieving logs.

### Quick Install
```bash
# Build from source
make build-cli

# Install as kubectl plugin
cp bin/kubectl-skyhook /usr/local/bin/

# Verify installation
kubectl skyhook version
```

See the [full CLI documentation](docs/cli.md) for detailed usage and examples.

289 changes: 289 additions & 0 deletions docs/cli.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,289 @@
# Skyhook CLI

kubectl plugin for managing Skyhook deployments, packages, and nodes.

## Overview

The Skyhook CLI (`kubectl skyhook`) provides SRE tooling for managing Skyhook operators and their packages across Kubernetes cluster nodes. It supports inspecting node/package state, forcing re-runs, managing node lifecycle, and retrieving logs.

## Installation

```bash
# Build from source
make build-cli

# Install as kubectl plugin
cp bin/kubectl-skyhook /usr/local/bin/

# Verify installation
kubectl skyhook version
```

## Usage Structure

### Basic Command Structure
```
kubectl skyhook [global-flags] <command> [subcommand] [flags] [arguments]
```

### Global Flags
- `-h, --help` - Show help for any command
- `--version` - Show version information
- `-n, --namespace` - Kubernetes namespace (default: "skyhook")
- `-o, --output` - Output format: table|json|yaml|wide
- `-v, --verbose` - Enable verbose output
- `--dry-run` - Preview changes without applying them
- `--kubeconfig` - Path to kubeconfig file

## Commands

### Version Command

Show plugin and operator versions.

```bash
# Show both plugin and operator versions
kubectl skyhook version

# Show only plugin version (no cluster query)
kubectl skyhook version --client-only

# With custom timeout
kubectl skyhook version --timeout 10s
```

### Pause/Resume Commands

Control Skyhook processing state.

```bash
# Pause a Skyhook (stops processing new nodes)
kubectl skyhook pause my-skyhook
kubectl skyhook pause my-skyhook --confirm # Skip confirmation

# Resume a paused Skyhook
kubectl skyhook resume my-skyhook
```

### Disable/Enable Commands

Completely disable or re-enable a Skyhook.

```bash
# Disable a Skyhook completely
kubectl skyhook disable my-skyhook
kubectl skyhook disable my-skyhook --confirm

# Re-enable a disabled Skyhook
kubectl skyhook enable my-skyhook
```

### Node Commands

Manage Skyhook nodes across the cluster.

```bash
# List all nodes targeted by a Skyhook
kubectl skyhook node list --skyhook my-skyhook
kubectl skyhook node list --skyhook my-skyhook -o json

# Show all Skyhook activity on specific node(s)
kubectl skyhook node status worker-1
kubectl skyhook node status worker-1 worker-2
kubectl skyhook node status "worker-.*" # Regex pattern
kubectl skyhook node status worker-1 --skyhook my-skyhook # Filter by Skyhook

# Reset all package state on node(s)
kubectl skyhook node reset worker-1 --skyhook my-skyhook --confirm
kubectl skyhook node reset "node-.*" --skyhook my-skyhook --dry-run

# Ignore/unignore nodes from processing
kubectl skyhook node ignore worker-1
kubectl skyhook node ignore "test-node-.*"
kubectl skyhook node unignore worker-1
```

#### Node Flags

| Command | Flag | Description |
|---------|------|-------------|
| `list` | `--skyhook` | Skyhook name (required) |
| `status` | `--skyhook` | Filter by Skyhook name |
| `reset` | `--skyhook` | Skyhook name (required) |
| `reset` | `--confirm, -y` | Skip confirmation prompt |

### Package Commands

Manage Skyhook packages.

```bash
# Query package status across nodes
kubectl skyhook package status my-package --skyhook my-skyhook
kubectl skyhook package status my-package --skyhook my-skyhook --node worker-1
kubectl skyhook package status my-package --skyhook my-skyhook -o wide

# Force package re-run on specific nodes
kubectl skyhook package rerun my-package --skyhook my-skyhook --node worker-1
kubectl skyhook package rerun my-package --skyhook my-skyhook --node "worker-.*" --confirm
kubectl skyhook package rerun my-package --skyhook my-skyhook --node worker-1 --stage config

# Get package logs
kubectl skyhook package logs my-package --skyhook my-skyhook
kubectl skyhook package logs my-package --skyhook my-skyhook --node worker-1
kubectl skyhook package logs my-package --skyhook my-skyhook --stage apply
kubectl skyhook package logs my-package --skyhook my-skyhook -f # Follow
kubectl skyhook package logs my-package --skyhook my-skyhook --tail 100
```

#### Package Flags

| Command | Flag | Description |
|---------|------|-------------|
| `status` | `--skyhook` | Skyhook name (required) |
| `status` | `--node` | Filter by node pattern (repeatable) |
| `rerun` | `--skyhook` | Skyhook name (required) |
| `rerun` | `--node` | Node pattern (required, repeatable) |
| `rerun` | `--stage` | Re-run from stage: apply, config, interrupt, post-interrupt |
| `rerun` | `--confirm, -y` | Skip confirmation prompt |
| `logs` | `--skyhook` | Skyhook name (required) |
| `logs` | `--node` | Filter by node name |
| `logs` | `--stage` | Filter by stage |
| `logs` | `-f, --follow` | Follow log output |
| `logs` | `--tail` | Lines from end (-1 for all) |

## Help System

```bash
# General help
kubectl skyhook --help

# Command group help
kubectl skyhook node --help
kubectl skyhook package --help

# Specific command help
kubectl skyhook node reset --help
kubectl skyhook package rerun --help
```

## Common Usage Patterns

### Debugging a Failed Package
```bash
# 1. Check package status
kubectl skyhook package status my-package --skyhook my-skyhook -o wide

# 2. View logs for the failed package
kubectl skyhook package logs my-package --skyhook my-skyhook --node worker-1

# 3. Fix the issue, then force re-run
kubectl skyhook package rerun my-package --skyhook my-skyhook --node worker-1 --confirm
```

### Node Maintenance
```bash
# 1. Ignore node before maintenance
kubectl skyhook node ignore worker-1

# 2. Perform maintenance...

# 3. Unignore and reset to re-run all packages
kubectl skyhook node unignore worker-1
kubectl skyhook node reset worker-1 --skyhook my-skyhook --confirm
```

### Cluster-Wide Status Check
```bash
# List all nodes for a Skyhook
kubectl skyhook node list --skyhook my-skyhook

# Check status of all nodes
kubectl skyhook node status

# Check specific Skyhook across all nodes
kubectl skyhook node status --skyhook my-skyhook -o json
```

### Emergency Stop
```bash
# Pause all processing
kubectl skyhook pause my-skyhook --confirm

# Or disable completely
kubectl skyhook disable my-skyhook --confirm
```

## Output Formats

All status commands support multiple output formats:

```bash
# Table (default) - human-readable
kubectl skyhook node list --skyhook my-skyhook

# Wide - table with additional columns
kubectl skyhook node list --skyhook my-skyhook -o wide

# JSON - machine-readable
kubectl skyhook node list --skyhook my-skyhook -o json

# YAML - machine-readable
kubectl skyhook node list --skyhook my-skyhook -o yaml
```

## Architecture

### Package Structure
```
operator/cmd/cli/app/ # CLI commands
├── cli.go # Root command (NewSkyhookCommand)
├── version.go # Version command
├── lifecycle.go # Pause, resume, disable, enable commands
├── node/ # Node subcommands
│ ├── node.go # Parent command
│ ├── node_list.go
│ ├── node_status.go
│ ├── node_reset.go
│ └── node_ignore.go # Ignore and unignore commands
└── package/ # Package subcommands
├── package.go # Parent command
├── package_status.go
├── package_rerun.go
└── package_logs.go

operator/internal/cli/ # Shared CLI utilities
├── client/ # Kubernetes client wrapper
├── context/ # CLI context and global flags
└── utils/ # Shared utilities
```

### Command Creation Flow
```
main()
└── cli.Execute()
└── NewSkyhookCommand(ctx)
├── NewVersionCmd(ctx)
├── NewPauseCmd(ctx)
├── NewResumeCmd(ctx)
├── NewDisableCmd(ctx)
├── NewEnableCmd(ctx)
├── node.NewNodeCmd(ctx)
│ ├── NewListCmd(ctx)
│ ├── NewStatusCmd(ctx)
│ ├── NewResetCmd(ctx)
│ ├── NewIgnoreCmd(ctx)
│ └── NewUnignoreCmd(ctx)
└── pkg.NewPackageCmd(ctx)
├── NewStatusCmd(ctx)
├── NewRerunCmd(ctx)
└── NewLogsCmd(ctx)
```

### Testing
```bash
# Run CLI tests
make test-cli

# Run all tests
make test
```