Testing

Three tiers of tests, increasing in scope and external dependencies.

Unit Tests

No external dependencies. Run anywhere.

go test ./...
go vet ./...

All packages are covered. These must pass before any commit.

Integration Tests (Docker required)

Gated behind the integration build tag. These tests require Docker running locally but no real credentials: they use locally built fixture images.

go test -tags integration -v ./...

Spike Tests (live Discord + Docker required)

Spike tests are the live end-to-end validation layer. They require real credentials, Docker, and a real Discord server. They are not CI tests.

There are currently two spike paths:

TestSpikeRollCall: the broad driver-parity validation path. Boots all 6 driver types plus cllama passthrough and clawdash, sends a Discord roll call, and verifies runtime-specific responses.
TestSpikeComposeUp: the deeper trading-desk validation path. Focuses on artifact generation, startup wiring, and Discord activity for the richer multi-service example.

Run a spike test when implementing or validating driver/runtime behavior end to end.

Rollcall Driver Parity Spike

The rollcall spike (TestSpikeRollCall) is the best single validation path for cross-driver support. It uses examples/rollcall/ and exercises:

openclaw
nullclaw
microclaw
nanoclaw
nanobot
picoclaw
cllama passthrough
clawdash

What it validates

Base images build for all 6 driver families
Agent images build from their Clawfiles
claw up succeeds on the rollcall pod
All agent containers converge to healthy/running state
A Discord trigger message causes each runtime to post an AI-generated self-identification response
cllama exposes cost data after traffic flows through the proxy

Prerequisites

Docker running
Go toolchain
A Discord server with:
- One bot application token with permission to read and post in the target channel
- A text channel for the roll call
- An incoming webhook URL for posting the non-bot trigger message
At least one LLM provider key:
- OPENROUTER_API_KEY or
- ANTHROPIC_API_KEY

Setup

cd examples/rollcall
cp .env.example .env
# Edit .env with real values

Required .env values:

Variable	What it is
`DISCORD_BOT_TOKEN`	Bot token used by all rollcall services
`DISCORD_BOT_ID`	Discord application/user ID for that bot
`DISCORD_GUILD_ID`	Discord server (guild) ID
`ROLLCALL_CHANNEL_ID`	Channel ID used for the roll call
`DISCORD_WEBHOOK_URL`	Incoming webhook URL used to post the trigger message
`OPENROUTER_API_KEY`	Used by OpenRouter-backed passthrough services; optional if `ANTHROPIC_API_KEY` is set
`ANTHROPIC_API_KEY`	Used by Anthropic-backed passthrough services; optional if `OPENROUTER_API_KEY` is set
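
A quick preflight check of these values can save a multi-minute run that fails late. The sketch below encodes the rules from the table: five values are always required, plus at least one provider key. `missingRollcallEnv` is a hypothetical helper, not part of the repository.

```go
package main

import "fmt"

// Values the rollcall spike always needs, per the table above.
var required = []string{
	"DISCORD_BOT_TOKEN",
	"DISCORD_BOT_ID",
	"DISCORD_GUILD_ID",
	"ROLLCALL_CHANNEL_ID",
	"DISCORD_WEBHOOK_URL",
}

// At least one of these provider keys must be set.
var providerKeys = []string{"OPENROUTER_API_KEY", "ANTHROPIC_API_KEY"}

// missingRollcallEnv returns a description of every unmet requirement.
// An empty result means the environment looks complete.
func missingRollcallEnv(env map[string]string) []string {
	var problems []string
	for _, k := range required {
		if env[k] == "" {
			problems = append(problems, k)
		}
	}
	hasProvider := false
	for _, k := range providerKeys {
		if env[k] != "" {
			hasProvider = true
		}
	}
	if !hasProvider {
		problems = append(problems, "OPENROUTER_API_KEY or ANTHROPIC_API_KEY")
	}
	return problems
}

func main() {
	// Example environment with the guild ID and both provider keys left out.
	env := map[string]string{
		"DISCORD_BOT_TOKEN":   "token",
		"DISCORD_BOT_ID":      "111",
		"ROLLCALL_CHANNEL_ID": "222",
		"DISCORD_WEBHOOK_URL": "https://example.invalid/webhook",
	}
	for _, p := range missingRollcallEnv(env) {
		fmt.Println("missing:", p)
	}
}
```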

Running

go test -tags spike -v -run TestSpikeRollCall ./cmd/claw/...

Expected duration: 3–10 minutes depending on Docker cache warmth, image build time, Discord gateway connection, and LLM latency.

Output

The test logs:

Image builds and compose startup progress
Health convergence for each rollcall container
Matching Discord responses for each runtime name
Recent container logs on teardown or failure
cllama health/cost endpoint checks
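
To illustrate the "matching Discord responses for each runtime name" step, here is a small sketch that reports which runtimes have not yet answered. The matching rule (case-insensitive substring on the runtime name) is an assumption for illustration; the real test may match differently, and `unanswered` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// unanswered returns the runtimes whose name appears in none of the messages.
// Assumed rule: a case-insensitive substring match counts as a response.
func unanswered(runtimes, messages []string) []string {
	var missing []string
	for _, rt := range runtimes {
		found := false
		for _, msg := range messages {
			if strings.Contains(strings.ToLower(msg), strings.ToLower(rt)) {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, rt)
		}
	}
	return missing
}

func main() {
	runtimes := []string{"openclaw", "nullclaw", "microclaw", "nanoclaw", "nanobot", "picoclaw"}
	// Made-up channel messages for the demo.
	messages := []string{
		"OpenClaw here, reporting in.",
		"This is nanobot.",
	}
	fmt.Println("still waiting on:", unanswered(runtimes, messages))
}
```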

Cleanup

Containers are torn down automatically on success, failure, or Ctrl-C.

If a run is killed hard, clean up manually:

docker compose -p rollcall down --volumes --remove-orphans

Trading-Desk Spike

The trading-desk spike (TestSpikeComposeUp) is the deeper validation path for generated artifacts and workflow, run against the richer multi-service example in examples/trading-desk/.

What it validates

claw up succeeds without error on examples/trading-desk/claw-pod.yml
openclaw.json generated correctly: channels.discord.token, guilds keyed by guild ID, groupPolicy, dmPolicy, allowBots, mentionPatterns, peer users[]
jobs.json generated correctly: agentTurn payloads with delivery.mode=announce and delivery.to resolved to the real channel ID
compose.generated.yml contains correct bind mounts for /app/config and /app/state/cron
Both agent containers start and serve mounted files at the expected paths
openclaw health --json reports healthy inside the tiverton container
Both agents post startup greetings to Discord (tiverton online., westin online.)
trading-api posts a webhook startup message mentioning both agent Discord IDs (<@TIVERTON_ID> and <@WESTIN_ID>) — proves CLAW_HANDLE_* env vars are broadcast to non-claw pod services

Prerequisites

Docker running
Go toolchain
A Discord server with:
- Distinct Discord application/user IDs for tiverton, westin, allen, logan, micro, and hermes
- At least one bot token with permission to read and post in the target channel
- A text channel the bots can read and post to
- An incoming webhook URL for the trading-api startup announcement (optional — if absent, webhook posting is skipped and logged)

Setup

cd examples/trading-desk
cp .env.example .env
# Edit .env with real values:

Variable	What it is
`TIVERTON_BOT_TOKEN`	Bot token for the tiverton Discord application
`TIVERTON_DISCORD_ID`	Application/user ID for tiverton bot
`WESTIN_BOT_TOKEN`	Bot token for the westin Discord application
`WESTIN_DISCORD_ID`	Application/user ID for westin bot
`ALLEN_DISCORD_ID`	Application/user ID declared for allen
`LOGAN_DISCORD_ID`	Application/user ID declared for logan
`MICRO_DISCORD_ID`	Application/user ID declared for micro
`HERMES_DISCORD_ID`	Application/user ID declared for hermes
`DISCORD_GUILD_ID`	Discord server (guild) ID
`DISCORD_TRADING_FLOOR_CHANNEL`	Channel ID the bots post to
`DISCORD_TRADING_API_WEBHOOK`	Incoming webhook URL (optional)

Only TIVERTON_BOT_TOKEN is strictly required for the local spike harness; the other bot token env vars can reuse it if you just want one real Discord bot for the run. The Discord IDs are env-owned pod metadata and still need to be present for every managed service because the parser enforces unique concurrent handles.
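
The presence and uniqueness requirement for the Discord IDs can be sketched as follows. This is an illustration of the rule as described here, not the parser's actual implementation; `checkHandleIDs` is a hypothetical name, and the service list is taken from the prerequisites above.

```go
package main

import "fmt"

// Managed services that need a Discord ID, per the prerequisites.
var handles = []string{"TIVERTON", "WESTIN", "ALLEN", "LOGAN", "MICRO", "HERMES"}

// checkHandleIDs verifies that every <NAME>_DISCORD_ID is set and that
// no two services share the same ID.
func checkHandleIDs(env map[string]string) error {
	seen := map[string]string{} // ID -> variable that first used it
	for _, h := range handles {
		key := h + "_DISCORD_ID"
		id := env[key]
		if id == "" {
			return fmt.Errorf("%s is not set", key)
		}
		if prev, dup := seen[id]; dup {
			return fmt.Errorf("%s and %s share ID %s", prev, key, id)
		}
		seen[id] = key
	}
	return nil
}

func main() {
	// Made-up IDs; LOGAN deliberately reuses ALLEN's ID to trigger the check.
	env := map[string]string{
		"TIVERTON_DISCORD_ID": "1001",
		"WESTIN_DISCORD_ID":   "1002",
		"ALLEN_DISCORD_ID":    "1003",
		"LOGAN_DISCORD_ID":    "1003",
		"MICRO_DISCORD_ID":    "1005",
		"HERMES_DISCORD_ID":   "1006",
	}
	fmt.Println(checkHandleIDs(env))
}
```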

Running

go test -tags spike -v -run TestSpikeComposeUp ./cmd/claw/...

The test builds trading-desk:latest and trading-api:latest fresh on every run (so image changes are always picked up). openclaw:latest is only built if not already present locally.

Expected duration: 2–5 minutes depending on Docker layer cache warmth and Discord gateway connection time.

Output

The test logs:

Generated artifact paths and content excerpts
Container health state
openclaw health --json output from inside the tiverton container
trading-api env var presence (no values) and early logs
Each Discord message found that matches a verification check

On failure, teardown dumps the last 100 log lines from each container.

Cleanup

Containers are torn down automatically on test completion or Ctrl-C. Interrupt is handled gracefully — compose down runs before exit.

If a run is killed hard (SIGKILL), clean up manually:

docker compose -p trading-desk down --volumes --remove-orphans

Quickstart Docs Smoke Test

Validates that the documented quickstart shell blocks are runnable end-to-end in a fresh Docker CLI container:

Extracts shell code blocks from:
- README.md quickstart section
- examples/quickstart/README.md
Runs extracted commands in a new docker:27-cli container (with the local repo and the Docker socket mounted)
Rewrites .env from provided credentials and checks runtime health convergence (assistant=healthy, cllama-passthrough=healthy)
Verifies cllama runtime signals in logs (api listening on :8080, ui listening on :8081)
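
The extraction step can be sketched as a line scanner that collects the contents of shell-tagged fenced blocks. This is a simplified illustration: `extractShellBlocks` is a hypothetical name, the accepted language tags (bash, sh, shell) are assumptions, and the real test may select blocks differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fence is the Markdown code fence marker, built at runtime so this
// example can itself sit inside a fenced block.
var fence = strings.Repeat("`", 3)

// extractShellBlocks returns the body of every fenced block tagged
// bash, sh, or shell. Blocks in other languages are skipped.
func extractShellBlocks(md string) []string {
	var blocks, cur []string
	inShell, inOther := false, false
	sc := bufio.NewScanner(strings.NewReader(md))
	for sc.Scan() {
		line := sc.Text()
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, fence) {
			switch {
			case inShell: // closing fence of a shell block
				blocks = append(blocks, strings.Join(cur, "\n"))
				cur, inShell = nil, false
			case inOther: // closing fence of a non-shell block
				inOther = false
			default: // opening fence: inspect the language tag
				lang := strings.TrimPrefix(trimmed, fence)
				if lang == "bash" || lang == "sh" || lang == "shell" {
					inShell = true
				} else {
					inOther = true
				}
			}
			continue
		}
		if inShell {
			cur = append(cur, line)
		}
	}
	return blocks
}

func main() {
	md := "# Quickstart\n" +
		fence + "bash\ncd examples/quickstart\ncp .env.example .env\n" + fence + "\n" +
		fence + "yaml\nkey: value\n" + fence + "\n"
	for i, b := range extractShellBlocks(md) {
		fmt.Printf("block %d:\n%s\n", i+1, b)
	}
}
```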

Run:

go test -tags spike -v -run TestQuickstartDocsRunInFreshDockerContainer ./cmd/claw/...

Required env vars (or values in examples/quickstart/.env):

OPENROUTER_API_KEY
DISCORD_BOT_TOKEN
DISCORD_BOT_ID
DISCORD_GUILD_ID
