Add custom providers and models (Ollama, vLLM, LM Studio, proxies) via ~/.gsd/agent/models.json.
- Minimal Example
- Full Example
- Supported APIs
- Provider Configuration
- Model Configuration
- Overriding Built-in Providers
- Per-model Overrides
- OpenAI Compatibility
For local models (Ollama, LM Studio, vLLM), only id is required per model:
{
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434/v1",
"api": "openai-completions",
"apiKey": "ollama",
"models": [
{ "id": "llama3.1:8b" },
{ "id": "qwen2.5-coder:7b" }
]
}
}
}
The apiKey is required but Ollama ignores it, so any value works.
Some OpenAI-compatible servers do not understand the developer role used for reasoning-capable models. For those providers, set compat.supportsDeveloperRole to false so GSD sends the system prompt as a system message instead. If the server also does not support reasoning_effort, set compat.supportsReasoningEffort to false too.
Some servers (including certain vLLM/TensorRT-LLM deployments) can return 400 errors when prior assistant reasoning_content is replayed. Set compat.stripReasoningContent to true to remove those replayed fields from outbound history.
You can set compat at the provider level to apply to all models, or at the model level to override a specific model. This commonly applies to Ollama, vLLM, SGLang, and similar OpenAI-compatible servers.
{
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434/v1",
"api": "openai-completions",
"apiKey": "ollama",
"compat": {
"supportsDeveloperRole": false,
"supportsReasoningEffort": false,
"stripReasoningContent": true
},
"models": [
{
"id": "gpt-oss:20b",
"reasoning": true
}
]
}
}
}
Override defaults when you need specific values:
{
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434/v1",
"api": "openai-completions",
"apiKey": "ollama",
"models": [
{
"id": "llama3.1:8b",
"name": "Llama 3.1 8B (Local)",
"reasoning": false,
"input": ["text"],
"contextWindow": 128000,
"maxTokens": 32000,
"cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
}
]
}
}
}
The file reloads each time you open /model, so you can edit it during a session without restarting.
| API | Description |
|---|---|
| openai-completions | OpenAI Chat Completions (most compatible) |
| openai-responses | OpenAI Responses API |
| anthropic-messages | Anthropic Messages API |
| google-generative-ai | Google Generative AI |
Set api at provider level (default for all models) or model level (override per model).
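The precedence between the two levels can be sketched in Python. This is only an illustration under stated assumptions, not GSD's own code: the helper name effective_api and the model ids model-a and model-b are hypothetical.

```python
from typing import Any, Dict, Optional


def effective_api(provider: Dict[str, Any], model: Dict[str, Any]) -> Optional[str]:
    """Return the model-level api if present, otherwise the provider-level api."""
    return model.get("api", provider.get("api"))


# Hypothetical provider entry: one model inherits the api, one overrides it.
provider = {
    "api": "openai-completions",
    "models": [
        {"id": "model-a"},
        {"id": "model-b", "api": "openai-responses"},
    ],
}

apis = {m["id"]: effective_api(provider, m) for m in provider["models"]}
print(apis)
```

Here model-a inherits the provider default and model-b uses its own value.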
| Field | Description |
|---|---|
| baseUrl | API endpoint URL |
| api | API type (see above) |
| apiKey | API key (see value resolution below) |
| headers | Custom headers (see value resolution below) |
| authHeader | Set true to add Authorization: Bearer <apiKey> automatically |
| models | Array of model configurations |
| modelOverrides | Per-model overrides for built-in models on this provider |
The apiKey and headers fields support three formats:
- Shell command: "!command" executes the command and uses its stdout
  "apiKey": "!security find-generic-password -ws 'anthropic'"
  "apiKey": "!op read 'op://vault/item/credential'"
- Environment variable: Uses the value of the named variable
  "apiKey": "MY_API_KEY"
- Literal value: Used directly
  "apiKey": "sk-..."
Shell commands (!command) are restricted to a set of known credential tools. Only commands starting with one of these are allowed to execute:
pass, op, aws, gcloud, vault, security, gpg, bw, gopass, lpass
Commands not on this list are blocked and the value resolves to undefined. A warning is written to stderr.
Shell operators (;, |, &, `, $, >, <) are also blocked in command arguments to prevent injection.
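The value-resolution and allowlist rules above can be sketched roughly as follows. This is an illustration, not GSD's implementation. The names resolve_value and is_allowed_command are hypothetical, and several details are assumptions: matching the allowlist against the first token of the command, falling back to the literal string when no environment variable has that name, trimming whitespace from stdout, and using None in place of undefined.

```python
import os
import shlex
import subprocess
import sys
from typing import Optional, Sequence

# Default allowlist as listed in the text above.
DEFAULT_PREFIXES = ("pass", "op", "aws", "gcloud", "vault", "security",
                    "gpg", "bw", "gopass", "lpass")
# Shell operators that are rejected anywhere in the command.
BLOCKED_CHARS = set(";|&`$><")


def is_allowed_command(command: str,
                       prefixes: Sequence[str] = DEFAULT_PREFIXES) -> bool:
    """True if the command starts with an allowed tool and has no shell operators."""
    try:
        parts = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not parts or parts[0] not in prefixes:  # assumption: first-token match
        return False
    return not any(ch in BLOCKED_CHARS for ch in command)


def resolve_value(value: str,
                  prefixes: Sequence[str] = DEFAULT_PREFIXES) -> Optional[str]:
    """Resolve an apiKey/headers value: shell command, env variable, or literal."""
    if value.startswith("!"):
        command = value[1:].strip()
        if not is_allowed_command(command, prefixes):
            print(f"warning: blocked command {command!r}", file=sys.stderr)
            return None  # stands in for "undefined"
        try:
            # Run without a shell so arguments are never re-interpreted.
            result = subprocess.run(shlex.split(command),
                                    capture_output=True, text=True)
        except OSError:  # tool not installed
            return None
        return result.stdout.strip() if result.returncode == 0 else None
    # Assumption: a value naming a set environment variable resolves to it;
    # anything else is treated as a literal.
    return os.environ.get(value, value)
```

With this sketch, "!curl https://example.com" resolves to None because curl is not on the allowlist, and "!pass show key; rm -rf /" is rejected because of the semicolon.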
Customizing the allowlist:
If you use a credential tool not on the default list, override it in global settings (~/.gsd/agent/settings.json):
{
"allowedCommandPrefixes": ["pass", "op", "sops", "doppler", "mycli"]
}
This replaces the default list entirely, so include any defaults you still want.
Alternatively, set the GSD_ALLOWED_COMMAND_PREFIXES environment variable (comma-separated). The env var takes precedence over settings.json:
export GSD_ALLOWED_COMMAND_PREFIXES="pass,op,sops,doppler"
Note: This setting is global-only. Project-level settings.json (<project>/.gsd/settings.json) cannot override the command allowlist, which prevents a cloned repo from escalating command execution privileges.
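The precedence between the environment variable, global settings, and the built-in default can be sketched as below. The function name effective_prefixes is hypothetical, and treating an empty environment variable as unset is an assumption. Project-level settings are deliberately not an input, which mirrors the global-only rule.

```python
import os
from typing import Any, List, Mapping

# Default allowlist as listed in the text above.
DEFAULT_PREFIXES = ["pass", "op", "aws", "gcloud", "vault", "security",
                    "gpg", "bw", "gopass", "lpass"]


def effective_prefixes(global_settings: Mapping[str, Any],
                       env: Mapping[str, str] = os.environ) -> List[str]:
    """Pick the allowlist: env var first, then global settings, then defaults."""
    raw = env.get("GSD_ALLOWED_COMMAND_PREFIXES")
    if raw:  # assumption: an empty value counts as unset
        return [p.strip() for p in raw.split(",") if p.strip()]
    configured = global_settings.get("allowedCommandPrefixes")
    if configured is not None:
        # Replaces the defaults entirely; nothing is merged in.
        return list(configured)
    return list(DEFAULT_PREFIXES)
```

For example, a global setting of ["pass", "sops"] yields exactly those two prefixes, so op and the other defaults are no longer allowed unless listed again.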
{
"providers": {
"custom-proxy": {
"baseUrl": "https://proxy.example.com/v1",
"apiKey": "MY_API_KEY",
"api": "anthropic-messages",
"headers": {
"x-portkey-api-key": "PORTKEY_API_KEY",
"x-secret": "!op read 'op://vault/item/secret'"
},
"models": [...]
}
}
}
| Field | Required | Default | Description |
|---|---|---|---|
| id | Yes | - | Model identifier (passed to the API) |
| name | No | id | Human-readable model label. Used for matching (--model patterns) and shown in model details/status text. |
| api | No | provider's api | Override provider's API for this model |
| reasoning | No | false | Supports extended thinking |
| input | No | ["text"] | Input types: ["text"] or ["text", "image"] |
| contextWindow | No | 128000 | Context window size in tokens |
| maxTokens | No | 16384 | Maximum output tokens |
| cost | No | all zeros | {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0} (per million tokens) |
| compat | No | provider compat | OpenAI compatibility overrides. Merged with provider-level compat when both are set. |
Current behavior:
- /model and --list-models list entries by model id.
- The configured name is used for model matching and detail/status text.
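To make the defaults in the table concrete, here is a small sketch of how a minimal model entry could be expanded. The helper with_defaults is hypothetical, and the sketch replaces a configured cost object wholesale, which is an assumption. compat is left out because it merges with the provider-level value rather than taking a fixed default.

```python
from typing import Any, Dict


def with_defaults(provider: Dict[str, Any], model: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in the documented defaults for any field the model entry omits."""
    defaults = {
        "name": model["id"],            # name falls back to id
        "api": provider.get("api"),     # api falls back to the provider's api
        "reasoning": False,
        "input": ["text"],
        "contextWindow": 128000,
        "maxTokens": 16384,
        "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0},
    }
    # Explicit model fields win over defaults.
    return {**defaults, **model}


expanded = with_defaults({"api": "openai-completions"},
                         {"id": "llama3.1:8b", "contextWindow": 32768})
print(expanded["name"], expanded["contextWindow"], expanded["maxTokens"])
```

The entry keeps its explicit contextWindow and picks up every other value from the table.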
Route a built-in provider through a proxy without redefining models:
{
"providers": {
"anthropic": {
"baseUrl": "https://my-proxy.example.com/v1"
}
}
}
All built-in Anthropic models remain available. Existing OAuth or API key auth continues to work.
To merge custom models into a built-in provider, include the models array:
{
"providers": {
"anthropic": {
"baseUrl": "https://my-proxy.example.com/v1",
"apiKey": "ANTHROPIC_API_KEY",
"api": "anthropic-messages",
"models": [...]
}
}
}
Merge semantics:
- Built-in models are kept.
- Custom models are upserted by id within the provider.
- If a custom model id matches a built-in model id, the custom model replaces that built-in model.
- If a custom model id is new, it is added alongside built-in models.
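The upsert-by-id behavior described above amounts to the following sketch. The function name merge_models and the sample ids are hypothetical, and keeping a replaced model in its original list position is an assumption about ordering.

```python
from typing import Any, Dict, List


def merge_models(built_in: List[Dict[str, Any]],
                 custom: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Upsert custom models into the built-in list, keyed by id."""
    merged = {m["id"]: m for m in built_in}  # dicts preserve insertion order
    for model in custom:
        merged[model["id"]] = model          # replace on match, append if new
    return list(merged.values())


built_in = [{"id": "a", "name": "A"}, {"id": "b", "name": "B"}]
custom = [{"id": "b", "name": "B (custom)"}, {"id": "c", "name": "C"}]
result = merge_models(built_in, custom)
print([m["name"] for m in result])
```

Model a is kept, b is replaced by the custom entry, and c is added.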
Use modelOverrides to customize specific built-in models without replacing the provider's full model list.
{
"providers": {
"openrouter": {
"modelOverrides": {
"anthropic/claude-sonnet-4": {
"name": "Claude Sonnet 4 (Bedrock Route)",
"compat": {
"openRouterRouting": {
"only": ["amazon-bedrock"]
}
}
}
}
}
}
}
modelOverrides supports these fields per model: name, reasoning, input, cost (partial), contextWindow, maxTokens, headers, compat.
Behavior notes:
- modelOverrides are applied to built-in provider models.
- Unknown model IDs are ignored.
- You can combine provider-level baseUrl/headers with modelOverrides.
- If models is also defined for a provider, custom models are merged after built-in overrides. A custom model with the same id replaces the overridden built-in model entry.
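A rough sketch of how overrides could be applied to built-in models is shown below. It is not GSD's code. The name apply_overrides, the model id m1, and the cost figures are hypothetical. Merging cost key by key follows the "cost (partial)" note above, while replacing headers and compat wholesale is an assumption made to keep the sketch short.

```python
from typing import Any, Dict, List

# Fields that modelOverrides accepts, per the list above.
OVERRIDABLE = {"name", "reasoning", "input", "cost", "contextWindow",
               "maxTokens", "headers", "compat"}


def apply_overrides(built_in: List[Dict[str, Any]],
                    overrides: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Patch built-in models; override keys for unknown ids are never consulted."""
    patched = []
    for model in built_in:
        updated = dict(model)
        for key, value in overrides.get(model["id"], {}).items():
            if key not in OVERRIDABLE:
                continue
            if key == "cost":
                # Partial cost: only the given keys change.
                updated["cost"] = {**model.get("cost", {}), **value}
            else:
                updated[key] = value  # assumption: replaced wholesale
        patched.append(updated)
    return patched


built_in = [{"id": "m1", "name": "M1",
             "cost": {"input": 1, "output": 2, "cacheRead": 0, "cacheWrite": 0}}]
overrides = {"m1": {"name": "M1 (custom)", "cost": {"output": 5}},
             "not-a-built-in": {"name": "ignored"}}
patched = apply_overrides(built_in, overrides)
print(patched)
```

The entry for not-a-built-in has no effect, and only the output cost of m1 changes.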
For providers with partial OpenAI compatibility, use the compat field.
- Provider-level compat applies defaults to all models under that provider.
- Model-level compat overrides provider-level values for that model.
{
"providers": {
"local-llm": {
"baseUrl": "http://localhost:8080/v1",
"api": "openai-completions",
"compat": {
"supportsUsageInStreaming": false,
"maxTokensField": "max_tokens"
},
"models": [...]
}
}
}
| Field | Description |
|---|---|
| supportsStore | Provider supports store field |
| supportsDeveloperRole | Use developer vs system role |
| supportsReasoningEffort | Support for reasoning_effort parameter |
| reasoningEffortMap | Map GSD thinking levels to provider-specific reasoning_effort values |
| supportsUsageInStreaming | Supports stream_options: { include_usage: true } (default: true) |
| maxTokensField | Use max_completion_tokens or max_tokens |
| requiresToolResultName | Include name on tool result messages |
| requiresAssistantAfterToolResult | Insert an assistant message before a user message after tool results |
| requiresThinkingAsText | Convert thinking blocks to plain text |
| stripReasoningContent | Strip replayed assistant reasoning_content fields from outbound message history (default: false; enable for some vLLM/TensorRT-LLM endpoints that otherwise return 400 errors) |
| thinkingFormat | Use reasoning_effort, zai, qwen, or qwen-chat-template thinking parameters |
| supportsStrictMode | Include the strict field in tool definitions |
| openRouterRouting | OpenRouter routing config passed to OpenRouter for model/provider selection |
| vercelGatewayRouting | Vercel AI Gateway routing config for provider selection (only, order) |
qwen uses top-level enable_thinking. Use qwen-chat-template for local Qwen-compatible servers that require chat_template_kwargs.enable_thinking.
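The difference between the two Qwen formats comes down to where the flag sits in the request body. The sketch below covers only those two formats, since the text does not describe the reasoning_effort and zai shapes. The helper name thinking_params is hypothetical.

```python
from typing import Any, Dict


def thinking_params(thinking_format: str, enabled: bool) -> Dict[str, Any]:
    """Return the extra request fields for the two Qwen thinking formats."""
    if thinking_format == "qwen":
        # Flag sits at the top level of the request body.
        return {"enable_thinking": enabled}
    if thinking_format == "qwen-chat-template":
        # Flag is nested for servers that read chat template kwargs.
        return {"chat_template_kwargs": {"enable_thinking": enabled}}
    raise ValueError(f"format not covered by this sketch: {thinking_format}")


print(thinking_params("qwen", True))
print(thinking_params("qwen-chat-template", True))
```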
Example:
{
"providers": {
"openrouter": {
"baseUrl": "https://openrouter.ai/api/v1",
"apiKey": "OPENROUTER_API_KEY",
"api": "openai-completions",
"models": [
{
"id": "openrouter/anthropic/claude-3.5-sonnet",
"name": "OpenRouter Claude 3.5 Sonnet",
"compat": {
"openRouterRouting": {
"order": ["anthropic"],
"fallbacks": ["openai"]
}
}
}
]
}
}
}
Vercel AI Gateway example:
{
"providers": {
"vercel-ai-gateway": {
"baseUrl": "https://ai-gateway.vercel.sh/v1",
"apiKey": "AI_GATEWAY_API_KEY",
"api": "openai-completions",
"models": [
{
"id": "moonshotai/kimi-k2.5",
"name": "Kimi K2.5 (Fireworks via Vercel)",
"reasoning": true,
"input": ["text", "image"],
"cost": { "input": 0.6, "output": 3, "cacheRead": 0, "cacheWrite": 0 },
"contextWindow": 262144,
"maxTokens": 262144,
"compat": {
"vercelGatewayRouting": {
"only": ["fireworks", "novita"],
"order": ["fireworks", "novita"]
}
}
}
]
}
}
}