Skip to content

security: harden remote URL validation at config parse time#3210

Merged
maphew merged 1 commit intogastownhall:mainfrom
harry-miller-trimble:upstream/harden-remote-url
Apr 13, 2026
Merged

security: harden remote URL validation at config parse time#3210
maphew merged 1 commit intogastownhall:mainfrom
harry-miller-trimble:upstream/harden-remote-url

Conversation

@harry-miller-trimble
Copy link
Copy Markdown
Contributor

Summary

Hardens remote URL validation to prevent injection attacks when multi-remote support lands.

Changes

New validation functions in internal/remotecache/url.go

  • ValidateRemoteURL() — strict security boundary that rejects control characters (null bytes, newlines, tabs, ANSI escapes), CLI flag injection (leading dash), disallowed schemes, and structurally invalid URLs per scheme (missing host/org/bucket)
  • ValidateRemoteName() — allowlist validation aligned with existing peer-name policy ([a-zA-Z][a-zA-Z0-9_-]*, max 64 chars)
  • ValidateRemoteURLWithPatterns() — enterprise lockdown via glob-style federation.allowed-remote-patterns config

Defense-in-depth validation points

  • Config parse time (validateSyncConfig)
  • CLI remote operations (AddCLIRemote, RemoveCLIRemote)
  • Clone entry points (cache.Ensure, BootstrapFromGitRemoteWithDB)

Config

  • Added federation.allowed-remote-patterns config key (empty = no restriction)

Tests

  • 40+ test cases covering null bytes, control characters, newline injection, CLI flag injection, scheme validation, per-scheme structural validation, remote name edge cases, pattern matching, and allowed-remote-patterns enforcement

Security Context

When multi-remote support lands, remote names and URLs will be passed to dolt via exec.Command. While Go's exec.Command uses argument arrays (no shell interpolation), URLs with control characters or leading dashes could still cause issues with git's URL parsing or credential helpers. This change validates all remote inputs at multiple boundaries before they reach subprocess calls.

Files changed (8 files, +474/-8)

Rebased from fork PR harry-miller-trimble#32

Add strict security validation for remote URLs and remote names to prevent
injection attacks when multi-remote support lands.

Changes:
- Add ValidateRemoteURL() with control character rejection, CLI flag
  injection prevention, scheme allowlist, and per-scheme structural
  validation (host/path requirements)
- Add ValidateRemoteName() aligned with existing peer-name policy
  (letter-start, alphanumeric + hyphen/underscore, max 64 chars)
- Add MatchesRemotePattern() and ValidateRemoteURLWithPatterns() for
  enterprise lockdown via federation.allowed-remote-patterns config
- Move config validation from simple IsRemoteURL() classifier to strict
  ValidateRemoteURL() security boundary
- Add defense-in-depth validation in AddCLIRemote/RemoveCLIRemote
- Add validation at clone entry points (cache.Ensure, bootstrap)
- Tighten SCP-style URL regex to exclude control characters in path
- Add comprehensive test coverage for null bytes, control chars,
  newline injection, CLI flag injection, scheme validation, structural
  validation, remote name edge cases, and pattern matching

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@harry-miller-trimble harry-miller-trimble force-pushed the upstream/harden-remote-url branch from a5e2dba to 25af1e0 Compare April 12, 2026 06:07
Copy link
Copy Markdown
Collaborator

@maphew maphew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: LGTM — approve

Thanks @harry-miller-trimble for the thorough security hardening. Defense-in-depth at config parse, CLI ops, and clone entry points is the right shape, and the 40+ test cases cover the realistic attack surface (control chars, CLI flag injection, scheme allowlist, per-scheme structural checks, name injection).

Spot checks that pass

  • git+ssh:// normalization via placeholder:// prefix correctly round-trips through net/url while preserving scheme allowlist enforcement.
  • SCP-style URLs are gated by the regex before ValidateRemoteURL proceeds, so control chars in the path portion can't slip past (pattern excludes \x00-\x1f\x7f).
  • ValidateRemoteURLWithPatterns calls ValidateRemoteURL first, so malformed URLs can't bypass pattern checks by matching a lenient glob.
  • exec.Command argv usage in doltutil/remotes.go was already injection-safe; the added name/URL validation is belt-and-suspenders against git URL parsing / credential helper surprises, which is the stated goal.

Minor nits (non-blocking)

  1. PR description vs code: body mentions hooking into BootstrapFromGitRemoteWithDB, but the diff actually touches BootstrapFromRemoteWithDB in internal/storage/dolt/bootstrap.go. Cosmetic.
  2. Dead branch in validateSCPURL: the if atIdx < 0 || colonIdx < 0 check is unreachable because gitSSHPattern.MatchString has already succeeded. Can be dropped or kept as a paranoia guard.
  3. Duplicate guard in validateSyncConfig: the two successive if federationRemote != "" blocks in cmd/bd/config.go could be consolidated into one.
  4. C1 control chars (0x80–0x9F): rune-based check rejects C0 + DEL but not C1. Not a practical injection vector through exec.Command argv and dolt would reject them on its own, so leaving as-is is fine — just noting for completeness.
  5. Behavior change: isValidRemoteURL switches from IsRemoteURL (lenient) to ValidateRemoteURL (strict). Users with loosely-formed but previously-accepted federation.remote values (e.g., dolthub://orgonly) will now get a bd doctor issue. This is the intended tightening — worth calling out in release notes.

Tested locally

go test ./internal/remotecache/... — PASS. (cmd/bd build blocked on unrelated ICU header issue in my env; CI is green across all platforms including the parallelized dolt-cmd matrix.)

Happy to merge. Nits can be follow-ups if you prefer.

@maphew maphew merged commit 7ea14b2 into gastownhall:main Apr 13, 2026
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants