fix(security): harden API proxy, CORS, SSRF, ReDoS, and sanitization#2226
fix(security): harden API proxy, CORS, SSRF, ReDoS, and sanitization#2226plato23 wants to merge 2 commits intokoala73:mainfrom
Conversation
Security improvements across multiple vectors: - **MCP proxy**: Block sensitive header overrides (Host, Cookie, Transfer-Encoding, proxy headers) via BLOCKED_HEADER_NAMES allowlist; add rate limiting; sanitize error responses to prevent info leakage - **RSS proxy**: Remove internal error details (url, message) from client-facing error responses — log server-side only - **Relay helper**: Remove internal error details from relay error responses; enforce HTTPS-only relay connections (ws:// now upgrades to https:// instead of http://) - **News classifier**: Pre-build all keyword regexes at module load time to eliminate runtime RegExp construction from string arguments (ReDoS surface elimination per CWE-1333) - **Feed digest**: Restrict extractTag() to pre-cached KNOWN_TAGS only, returning empty string for unknown tags — eliminates dynamic RegExp construction entirely - **vercel.json**: Remove wildcard CORS headers (Access-Control-Allow-Origin: *) from static /api/* header block that was undermining the dynamic origin-checking CORS logic in edge functions - **Dockerfile.relay**: Run relay container as non-root user (appuser) matching the main Dockerfile's security posture All 2296 existing tests updated and passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…locklist Second batch of security fixes from comprehensive audit: - **gateway.ts**: CORS fallback on exception now returns hardcoded production origin instead of wildcard `*` — prevents silent bypass of the entire origin allowlist on any CORS computation error - **LiveWebcamsPanel.ts**: Add origin validation to postMessage handler matching LiveNewsPanel pattern — only accept messages from YouTube, Windy, and the local sidecar (fixes CWE-345) - **widget-sanitizer.ts**: escapeSrcdoc now escapes `<` and `>` in addition to `&` and `"` — prevents srcdoc attribute breakout in pro widget iframe rendering - **DeductionPanel.ts**: Restrict DOMPurify config on AI-generated markdown output — blocks `<a>`, `<img>`, and other tags that could enable phishing or data exfiltration from compromised AI backend - **mcp-proxy.js**: Expand SSRF blocklist to cover `0.0.0.0`, unspecified IPv6 `::`, IPv4-mapped IPv6 `::ffff:`, and bracket-wrapped IPv6 hostnames Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@plato23 is attempting to deploy a commit to the Elie Team on Vercel. A member of the Team first needs to authorize it. |
Greptile SummaryThis PR applies a broad sweep of defensive security hardening across the API proxy layer, CORS configuration, SSRF protection, ReDoS prevention, and client-side sanitization. The changes are well-motivated and largely correct, but four issues warrant attention before merge. Key changes:
Issues found:
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Browser
participant Vercel CDN
participant MCPProxy as api/mcp-proxy.js
participant RateLimit as _rate-limit.js
participant SSRF as validateServerUrl()
participant MCP as External MCP Server
Browser->>Vercel CDN: GET/POST /api/mcp-proxy
Note over Vercel CDN: Adds X-Content-Type-Options: nosniff (vercel.json)
Vercel CDN->>MCPProxy: Forward request
MCPProxy->>MCPProxy: isDisallowedOrigin() → 403 if blocked
MCPProxy->>MCPProxy: OPTIONS preflight → 204
MCPProxy->>RateLimit: checkRateLimit(req)
alt Rate limit exceeded
RateLimit-->>Browser: 429 Too Many Requests
end
MCPProxy->>SSRF: validateServerUrl(rawServer)
Note over SSRF: Checks protocol, BLOCKED_HOST_PATTERNS<br/>(loopback, RFC1918, link-local, ULA-fd, ::ffff:)
alt URL blocked
SSRF-->>MCPProxy: null
MCPProxy-->>Browser: 400 Invalid serverUrl
end
MCPProxy->>MCP: initialize + tools/list or tools/call
Note over MCPProxy,MCP: buildHeaders() strips BLOCKED_HEADER_NAMES<br/>(host, cookie, x-forwarded-*, cf-connecting-ip…)
alt SSE transport (/sse path)
MCPProxy->>MCP: GET SSE stream
MCP-->>MCPProxy: endpoint event URL
MCPProxy->>SSRF: Re-validate endpoint URL (SSRF check)
MCPProxy->>MCP: POST JSON-RPC to endpoint
end
MCP-->>MCPProxy: JSON-RPC response
MCPProxy-->>Browser: 200 { tools } or { result }
Note over MCPProxy: Errors → generic "MCP request failed"<br/>(internal details logged server-side only)
Reviews (1): Last reviewed commit: "fix(security): harden CORS fallback, pos..." | Re-trigger Greptile |
| /^::1$/, | ||
| /^::$/, // unspecified IPv6 | ||
| /^::ffff:/i, // IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) | ||
| /^\[/, // bracket-wrapped IPv6 in hostname |
There was a problem hiding this comment.
/^\[/ pattern is dead code and never matches
The WHATWG URL parser (used in Cloudflare Workers / Vercel Edge) always strips brackets from IPv6 hostnames. new URL('http://[::1]/').hostname returns '::1', not '[::1]'. Because BLOCKED_HOST_PATTERNS is tested against url.hostname, the /^\[/ pattern can never match and provides false confidence of protection.
The existing patterns (/^::1$/, /^::$/, /^::ffff:/i, /^fd.../i, /^fe80:/i) do cover the critical IPv6 cases, but any full IPv6 literal passed to validateServerUrl that slips through (see the separate fc comment) would never hit this guard regardless.
| /^\[/, // bracket-wrapped IPv6 in hostname | |
| /^\[/, // bracket-wrapped IPv6 in hostname (unreachable via URL.hostname — URL parser strips brackets; kept as defence-in-depth for raw-string callers) |
Consider removing the check or adding a comment that it guards only hypothetical non-URL-parsed callers, so future reviewers don't rely on it for URL-derived hostnames.
| /^::$/, // unspecified IPv6 | ||
| /^::ffff:/i, // IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) | ||
| /^\[/, // bracket-wrapped IPv6 in hostname | ||
| /^fd[0-9a-f]{2}:/i, |
There was a problem hiding this comment.
fc00::/8 ULA prefix not blocked — SSRF gap
The RFC 4193 Unique Local Address (ULA) range is fc00::/7, which covers both fc00::/8 (the fc prefix) and fd00::/8 (the fd prefix). The current pattern only blocks addresses beginning with fd:
/^fd[0-9a-f]{2}:/i
An attacker supplying a serverUrl like http://[fc00::1]/internal would have url.hostname = 'fc00::1', which does not match any pattern in BLOCKED_HOST_PATTERNS and would therefore pass validateServerUrl.
fc addresses are rare in practice, but the full ULA range should be blocked for correctness:
| /^fd[0-9a-f]{2}:/i, | |
| /^f[cd][0-9a-f]{2}:/i, |
This covers both fc and fd prefixes across the entire fc00::/7 range.
| const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://'); | ||
| // If the env var was already https:// or got converted above, we're done. | ||
| // Otherwise force https:// for any remaining non-secure scheme. | ||
| const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, ''); | ||
| return secured.replace(/\/$/, ''); |
There was a problem hiding this comment.
Case-sensitive regex silently produces a broken URL
Both the wss:// replacement and the fallback replace(/^[a-z]+:\/\//, '') use case-sensitive patterns. If an operator sets WS_RELAY_URL=WSS://relay.example.com (uppercase — valid in RFC 3986), neither regex matches:
/^wss:\/\//→ no match →httpUrl = 'WSS://relay.example.com'httpUrl.startsWith('https://')→false/^[a-z]+:\/\//→ no match → strips nothing →'https://WSS://relay.example.com'
This silently constructs a malformed URL that will cause all relay requests to fail with no clear error. The same issue exists in the mirrored copy at server/_shared/relay.ts (line 5).
| const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://'); | |
| // If the env var was already https:// or got converted above, we're done. | |
| // Otherwise force https:// for any remaining non-secure scheme. | |
| const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, ''); | |
| return secured.replace(/\/$/, ''); | |
| const httpUrl = relayUrl.replace(/^wss:\/\//i, 'https://'); | |
| // If the env var was already https:// or got converted above, we're done. | |
| // Otherwise force https:// for any remaining non-secure scheme. | |
| const secured = /^https:\/\//i.test(httpUrl) ? httpUrl : 'https://' + httpUrl.replace(/^[a-zA-Z]+:\/\//, ''); |
The same fix should be applied to the identical logic in server/_shared/relay.ts.
| } catch { | ||
| corsHeaders = { 'Access-Control-Allow-Origin': '*' }; | ||
| // Never fall back to wildcard CORS — that would bypass the origin allowlist. | ||
| // Use the hardcoded production origin as a safe default. | ||
| corsHeaders = { 'Access-Control-Allow-Origin': 'https://worldmonitor.app', 'Vary': 'Origin' }; |
There was a problem hiding this comment.
Hardcoded fallback origin breaks CORS for non-production variants
When getCorsHeaders throws (however unlikely), the fallback is now https://worldmonitor.app. Requests from tech.worldmonitor.app, finance.worldmonitor.app, commodity.worldmonitor.app, etc. would receive Access-Control-Allow-Origin: https://worldmonitor.app, causing browsers to block those cross-origin responses.
The intent to avoid the previous wildcard fallback is correct. A safer approach is to echo back the request's Origin only when it already passed the isDisallowedOrigin check performed a few lines above — by the time we reach this try/catch, any disallowed origin has already been rejected:
| } catch { | |
| corsHeaders = { 'Access-Control-Allow-Origin': '*' }; | |
| // Never fall back to wildcard CORS — that would bypass the origin allowlist. | |
| // Use the hardcoded production origin as a safe default. | |
| corsHeaders = { 'Access-Control-Allow-Origin': 'https://worldmonitor.app', 'Vary': 'Origin' }; | |
| // Never fall back to wildcard CORS — that would bypass the origin allowlist. | |
| // The disallowed-origin check above already rejected cross-origin requests, | |
| // so it is safe to echo the request origin here. | |
| const fallbackOrigin = req.headers.get('origin') || 'https://worldmonitor.app'; | |
| corsHeaders = { 'Access-Control-Allow-Origin': fallbackOrigin, 'Vary': 'Origin' }; |
|
@plato23 — thanks for the security hardening sweep! The Greptile review identified issues that need fixing before merge: Must fix:
Minor: The /^[/ bracket pattern in BLOCKED_HOST_PATTERNS is dead code — URL.hostname never contains brackets. Please also rebase on main and fill in the PR checklist. @koala73 — not safe to merge as-is due to the SSRF and CORS gaps. |
|
Note: merge conflicts with main. Fork PR — can't rebase from maintainer side. @plato23 — please rebase on current main when addressing the SSRF/CORS issues noted above. |
Summary
Type of change
Affected areas
/api/*)Checklist
api/rss-proxy.jsallowlist (if adding feeds)npm run typecheck)Screenshots