security: harden remote URL validation at config parse time #3210
Merged
maphew merged 1 commit into gastownhall:main on Apr 13, 2026
Add strict security validation for remote URLs and remote names to prevent injection attacks when multi-remote support lands.

Changes:
- Add ValidateRemoteURL() with control character rejection, CLI flag injection prevention, scheme allowlist, and per-scheme structural validation (host/path requirements)
- Add ValidateRemoteName() aligned with existing peer-name policy (letter-start, alphanumeric + hyphen/underscore, max 64 chars)
- Add MatchesRemotePattern() and ValidateRemoteURLWithPatterns() for enterprise lockdown via federation.allowed-remote-patterns config
- Move config validation from simple IsRemoteURL() classifier to strict ValidateRemoteURL() security boundary
- Add defense-in-depth validation in AddCLIRemote/RemoveCLIRemote
- Add validation at clone entry points (cache.Ensure, bootstrap)
- Tighten SCP-style URL regex to exclude control characters in path
- Add comprehensive test coverage for null bytes, control chars, newline injection, CLI flag injection, scheme validation, structural validation, remote name edge cases, and pattern matching

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
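The layered checks listed in the commit message can be illustrated with a minimal Go sketch. This is not the code from the PR: `validateRemoteURLSketch`, the contents of `allowedSchemes`, and the `dolthub` path rule are assumptions made for illustration, and the real `ValidateRemoteURL()` also handles forms (such as SCP-style URLs) that this sketch leaves out.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// allowedSchemes is an assumed allowlist for illustration only;
// the PR's actual allowlist is not shown on this page.
var allowedSchemes = map[string]bool{
	"https":   true,
	"ssh":     true,
	"dolthub": true,
}

// validateRemoteURLSketch is a hypothetical stand-in for ValidateRemoteURL.
// It applies the checks in order: control characters, leading dash,
// scheme allowlist, then a per-scheme structural check.
func validateRemoteURLSketch(raw string) error {
	if raw == "" {
		return errors.New("empty URL")
	}
	// Reject C0 control characters (including NUL, tab, newline, ESC) and DEL.
	for _, r := range raw {
		if r < 0x20 || r == 0x7f {
			return fmt.Errorf("control character %U in URL", r)
		}
	}
	// A leading dash could be read as a command-line flag by a subprocess.
	if strings.HasPrefix(raw, "-") {
		return errors.New("URL must not start with '-'")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("unparseable URL: %w", err)
	}
	if !allowedSchemes[u.Scheme] {
		return fmt.Errorf("scheme %q is not allowed", u.Scheme)
	}
	if u.Host == "" {
		return errors.New("URL is missing a host")
	}
	// Assumed structural rule: a dolthub URL needs a path after the org.
	if u.Scheme == "dolthub" && strings.Trim(u.Path, "/") == "" {
		return errors.New("dolthub URL is missing a repository path")
	}
	return nil
}

func main() {
	inputs := []string{
		"https://example.com/org/repo",
		"dolthub://org",
		"-oProxyCommand=evil",
		"https://example.com/repo\n",
		"file:///etc/passwd",
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %v\n", in, validateRemoteURLSketch(in))
	}
}
```

Running the cheap string-level checks before `url.Parse` keeps the error messages specific and means the parser never sees obviously hostile input.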
a5e2dba to 25af1e0
maphew (Collaborator) approved these changes on Apr 13, 2026 and left a comment:
Review: LGTM — approve
Thanks @harry-miller-trimble for the thorough security hardening. Defense-in-depth at config parse, CLI ops, and clone entry points is the right shape, and the 40+ test cases cover the realistic attack surface (control chars, CLI flag injection, scheme allowlist, per-scheme structural checks, name injection).
Spot checks that pass
- `git+ssh://` normalization via `placeholder://` prefix correctly round-trips through `net/url` while preserving scheme allowlist enforcement.
- SCP-style URLs are gated by the regex before `ValidateRemoteURL` proceeds, so control chars in the path portion can't slip past (pattern excludes `\x00-\x1f\x7f`).
- `ValidateRemoteURLWithPatterns` calls `ValidateRemoteURL` first, so malformed URLs can't bypass pattern checks by matching a lenient glob.
- `exec.Command` argv usage in `doltutil/remotes.go` was already injection-safe; the added name/URL validation is belt-and-suspenders against git URL parsing / credential helper surprises, which is the stated goal.
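For readers unfamiliar with the SCP-style gate mentioned in the spot checks, here is a rough Go sketch of a `user@host:path` matcher whose path class excludes `\x00-\x1f\x7f`. The pattern `scpPattern` and the helper `isSCPStyle` are hypothetical; the project's actual `gitSSHPattern` is not shown on this page and may differ in the user and host parts.

```go
package main

import (
	"fmt"
	"regexp"
)

// scpPattern is a hypothetical SCP-style matcher (user@host:path).
// The path portion is a negated class that excludes C0 controls and DEL,
// mirroring the exclusion range quoted in the review.
var scpPattern = regexp.MustCompile(`^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+:[^\x00-\x1f\x7f]+$`)

// isSCPStyle reports whether raw looks like a clean SCP-style remote.
func isSCPStyle(raw string) bool {
	return scpPattern.MatchString(raw)
}

func main() {
	fmt.Println(isSCPStyle("git@github.com:org/repo.git")) // true
	fmt.Println(isSCPStyle("git@github.com:org/re\npo"))   // false: newline in path
	fmt.Println(isSCPStyle("git@github.com:org/repo\x00")) // false: NUL in path
}
```

Because the pattern is anchored at both ends, a control character anywhere in the path makes the whole match fail instead of matching a clean prefix.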
Minor nits (non-blocking)
- PR description vs code: body mentions hooking into `BootstrapFromGitRemoteWithDB`, but the diff actually touches `BootstrapFromRemoteWithDB` in `internal/storage/dolt/bootstrap.go`. Cosmetic.
- Dead branch in `validateSCPURL`: the `if atIdx < 0 || colonIdx < 0` check is unreachable because `gitSSHPattern.MatchString` has already succeeded. Can be dropped or kept as a paranoia guard.
- Duplicate guard in `validateSyncConfig`: the two successive `if federationRemote != ""` blocks in `cmd/bd/config.go` could be consolidated into one.
- C1 control chars (0x80–0x9F): rune-based check rejects C0 + DEL but not C1. Not a practical injection vector through `exec.Command` argv and dolt would reject them on its own, so leaving as-is is fine; just noting for completeness.
- Behavior change: `isValidRemoteURL` switches from `IsRemoteURL` (lenient) to `ValidateRemoteURL` (strict). Users with loosely-formed but previously-accepted `federation.remote` values (e.g., `dolthub://org` only) will now get a `bd doctor` issue. This is the intended tightening, worth calling out in release notes.
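To make the C1 nit concrete, the sketch below contrasts a C0 + DEL check with one based on `unicode.IsControl`, which also reports U+0080 through U+009F. Both helper names are hypothetical, and the first function is only an approximation of the rune-based check the review describes, not a copy of the PR's code.

```go
package main

import (
	"fmt"
	"unicode"
)

// hasC0OrDEL approximates the check described in the review:
// it flags C0 controls (U+0000..U+001F) and DEL (U+007F) only.
func hasC0OrDEL(s string) bool {
	for _, r := range s {
		if r < 0x20 || r == 0x7f {
			return true
		}
	}
	return false
}

// hasAnyControl uses unicode.IsControl, which covers the Cc category:
// C0, DEL, and the C1 range (U+0080..U+009F).
func hasAnyControl(s string) bool {
	for _, r := range s {
		if unicode.IsControl(r) {
			return true
		}
	}
	return false
}

func main() {
	s := "https://example.com/repo\u0085" // U+0085 NEL is a C1 control
	fmt.Println(hasC0OrDEL(s), hasAnyControl(s)) // false true
}
```

Switching to `unicode.IsControl` would be a small change if the project ever decides to reject C1 as well; the review treats that as optional.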
Tested locally
go test ./internal/remotecache/... — PASS. (cmd/bd build blocked on unrelated ICU header issue in my env; CI is green across all platforms including the parallelized dolt-cmd matrix.)
Happy to merge. Nits can be follow-ups if you prefer.
Summary
Hardens remote URL validation to prevent injection attacks when multi-remote support lands.
Changes
New validation functions in `internal/remotecache/url.go`
- `ValidateRemoteURL()`: strict security boundary that rejects control characters (null bytes, newlines, tabs, ANSI escapes), CLI flag injection (leading dash), disallowed schemes, and structurally invalid URLs per scheme (missing host/org/bucket)
- `ValidateRemoteName()`: allowlist validation aligned with existing peer-name policy (`[a-zA-Z][a-zA-Z0-9_-]*`, max 64 chars)
- `ValidateRemoteURLWithPatterns()`: enterprise lockdown via glob-style `federation.allowed-remote-patterns` config

Defense-in-depth validation points
- Config validation (`validateSyncConfig`)
- CLI remote operations (`AddCLIRemote`, `RemoveCLIRemote`)
- Clone entry points (`cache.Ensure`, `BootstrapFromGitRemoteWithDB`)

Config
- `federation.allowed-remote-patterns` config key (empty = no restriction)

Tests
Security Context
When multi-remote support lands, remote names and URLs will be passed to `dolt` via `exec.Command`. While Go's `exec.Command` uses argument arrays (no shell interpolation), URLs with control characters or leading dashes could still cause issues with git's URL parsing or credential helpers. This change validates all remote inputs at multiple boundaries before they reach subprocess calls.

Files changed (8 files, +474/-8)
Rebased from fork PR harry-miller-trimble#32
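The remote-name policy and the pattern lockdown described above can be sketched in Go as follows. Function names are hypothetical, the example organization `myorg` is invented, and glob matching uses the standard library's `path.Match`, whose `*` does not cross `/`. The PR's `MatchesRemotePattern()` may use different glob semantics.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"regexp"
)

// remoteNameRE encodes the policy quoted in the description:
// letter first, then letters, digits, hyphen, or underscore.
var remoteNameRE = regexp.MustCompile(`^[a-zA-Z][a-zA-Z0-9_-]*$`)

// validateRemoteNameSketch is a hypothetical stand-in for ValidateRemoteName.
func validateRemoteNameSketch(name string) error {
	if len(name) == 0 || len(name) > 64 {
		return errors.New("remote name must be 1-64 characters")
	}
	if !remoteNameRE.MatchString(name) {
		return errors.New("remote name must start with a letter and use only letters, digits, '-' or '_'")
	}
	return nil
}

// matchesAnyPattern reports whether rawURL matches one of the glob patterns.
// An empty pattern list means no restriction, as stated in the description.
// Assumption: path.Match semantics ('*' stops at '/').
func matchesAnyPattern(rawURL string, patterns []string) bool {
	if len(patterns) == 0 {
		return true
	}
	for _, p := range patterns {
		if ok, err := path.Match(p, rawURL); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(validateRemoteNameSketch("origin-2") == nil) // true
	fmt.Println(validateRemoteNameSketch("--force") == nil)  // false

	patterns := []string{"https://github.com/myorg/*"}
	fmt.Println(matchesAnyPattern("https://github.com/myorg/repo", patterns)) // true
	fmt.Println(matchesAnyPattern("https://github.com/other/repo", patterns)) // false
}
```

As the review notes, the URL should pass strict validation before any pattern is consulted, so a lenient glob cannot admit a malformed URL.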