7 changes: 6 additions & 1 deletion Dockerfile.relay
@@ -10,7 +10,8 @@
FROM node:22-alpine

# curl required by OREF polling (Node.js JA3 fingerprint blocked by Akamai; curl passes)
RUN apk add --no-cache curl
RUN apk add --no-cache curl && \
addgroup -S appgroup && adduser -S appuser -G appgroup

WORKDIR /app

@@ -27,6 +28,10 @@ COPY shared/ ./shared/
# Data files required by the relay (telegram-channels.json, etc.)
COPY data/ ./data/

RUN chown -R appuser:appgroup /app

USER appuser

EXPOSE 3004

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
10 changes: 8 additions & 2 deletions api/_relay.js
@@ -7,7 +7,13 @@ import { jsonResponse } from './_json-response.js';
export function getRelayBaseUrl() {
const relayUrl = process.env.WS_RELAY_URL;
if (!relayUrl) return null;
return relayUrl.replace('wss://', 'https://').replace('ws://', 'http://').replace(/\/$/, '');
// Always upgrade to HTTPS — cleartext relay connections are not permitted.
// Normalize any WebSocket scheme to https://.
const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://');
// If the env var was already https:// or got converted above, we're done.
// Otherwise force https:// for any remaining non-secure scheme.
const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, '');
return secured.replace(/\/$/, '');
Comment on lines +12 to +16

P1 Case-sensitive regex silently produces a broken URL

Both the wss:// replacement and the fallback replace(/^[a-z]+:\/\//, '') use case-sensitive patterns. If an operator sets WS_RELAY_URL=WSS://relay.example.com (uppercase is valid, since RFC 3986 treats the scheme as case-insensitive), neither regex matches:

  1. /^wss:\/\// → no match → httpUrl = 'WSS://relay.example.com'
  2. httpUrl.startsWith('https://') → false
  3. /^[a-z]+:\/\// → no match → strips nothing → 'https://WSS://relay.example.com'

This silently constructs a malformed URL that will cause all relay requests to fail with no clear error. The same issue exists in the mirrored copy at server/_shared/relay.ts (line 5).

Suggested change
const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://');
// If the env var was already https:// or got converted above, we're done.
// Otherwise force https:// for any remaining non-secure scheme.
const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, '');
return secured.replace(/\/$/, '');
const httpUrl = relayUrl.replace(/^wss:\/\//i, 'https://');
// If the env var was already https:// or got converted above, we're done.
// Otherwise force https:// for any remaining non-secure scheme.
const secured = /^https:\/\//i.test(httpUrl) ? httpUrl : 'https://' + httpUrl.replace(/^[a-zA-Z]+:\/\//, '');
return secured.replace(/\/$/, '');

The same fix should be applied to the identical logic in server/_shared/relay.ts.
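
As a rough illustration of the case-insensitive handling described in this comment, the sketch below strips any leading scheme regardless of case and always prefixes https://. The helper name toHttpsBaseUrl and the broader scheme pattern are assumptions made for illustration; they are not the project's code.

```javascript
// Hypothetical helper (not from the PR): normalize a relay URL to an https:// base.
function toHttpsBaseUrl(raw) {
  if (!raw) return null;
  // Remove any leading scheme, matching case-insensitively (WSS://, ws://, HTTP://, ...).
  const withoutScheme = raw.trim().replace(/^[a-z][a-z0-9+.-]*:\/\//i, '');
  // Always force https:// and drop a single trailing slash.
  return ('https://' + withoutScheme).replace(/\/$/, '');
}

console.log(toHttpsBaseUrl('WSS://relay.example.com'));  // https://relay.example.com
console.log(toHttpsBaseUrl('ws://relay.example.com/'));  // https://relay.example.com
```

Stripping the scheme first and then prefixing https:// avoids the two-step "convert, then check" logic, so there is no branch in which an unmatched scheme can be left in place.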

}

export function getRelayHeaders(baseHeaders = {}) {
@@ -115,9 +121,9 @@ export function createRelayHandler(cfg) {
} catch (error) {
if (cfg.fallback) return cfg.fallback(req, corsHeaders);
const isTimeout = error?.name === 'AbortError';
console.error('[relay] error:', error?.message || String(error));
return jsonResponse({
error: isTimeout ? 'Relay timeout' : 'Relay request failed',
details: error?.message || String(error),
}, isTimeout ? 504 : 502, corsHeaders);
}
};
45 changes: 41 additions & 4 deletions api/mcp-proxy.js
@@ -1,4 +1,5 @@
import { getCorsHeaders, isDisallowedOrigin } from './_cors.js';
import { checkRateLimit } from './_rate-limit.js';
import { jsonResponse } from './_json-response.js';

export const config = { runtime: 'edge' };
@@ -11,11 +12,16 @@ const MCP_PROTOCOL_VERSION = '2025-03-26';
const BLOCKED_HOST_PATTERNS = [
/^localhost$/i,
/^127\./,
/^0\.0\.0\.0$/, // unspecified IPv4 — routes to loopback on many systems
/^0+$/, // zero in various forms
/^10\./,
/^172\.(1[6-9]|2\d|3[01])\./,
/^192\.168\./,
/^169\.254\./, // link-local + cloud metadata (AWS/GCP/Azure)
/^169\.254\./, // link-local + cloud metadata (AWS/GCP/Azure)
/^::1$/,
/^::$/, // unspecified IPv6
/^::ffff:/i, // IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1)
/^\[/, // bracket-wrapped IPv6 in hostname

P1 /^\[/ pattern and the bare-address IPv6 patterns cannot both match

The WHATWG URL parser (used in Cloudflare Workers / Vercel Edge) keeps the brackets when it serializes an IPv6 host: new URL('http://[::1]/').hostname returns '[::1]', not '::1'. If BLOCKED_HOST_PATTERNS is tested against url.hostname unchanged, /^\[/ is the only pattern that fires for an IPv6 literal, and it rejects all of them.

In that case the bare-address patterns (/^::1$/, /^::$/, /^::ffff:/i, /^fd.../i, /^fe80:/i) never match a URL-derived hostname and give false confidence of protection. They only take effect if validateServerUrl strips the brackets before matching, and then /^\[/ becomes dead code instead.

Suggested change
/^\[/, // bracket-wrapped IPv6 in hostname
/^\[/, // bracket-wrapped IPv6 literal (URL.hostname keeps the brackets, so this rejects every IPv6 host; the bare-address patterns apply only to bracket-stripped input)

Consider stating in a comment which form of hostname the list is matched against, so future reviewers know which patterns actually guard URL-derived hostnames.
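
Which patterns can match depends on whether the hostname still carries its brackets. The sketch below, which assumes a WHATWG-conformant URL implementation such as the one in Node.js, shows the serialized form and a hypothetical bareHostname helper (not part of the PR) that removes the brackets so the bare-address patterns can apply.

```javascript
// Hypothetical helper (not from the PR): strip one pair of enclosing brackets, if present.
function bareHostname(hostname) {
  return hostname.replace(/^\[(.*)\]$/, '$1').toLowerCase();
}

// WHATWG URL serializes IPv6 hosts with brackets.
const host = new URL('http://[::1]:8080/').hostname;
console.log(host);               // [::1]
console.log(bareHostname(host)); // ::1

// The bare-address pattern only matches after the brackets are removed.
console.log(/^::1$/.test(host), /^::1$/.test(bareHostname(host))); // false true
```

Normalizing once, before the pattern loop, lets a single list of bare-address patterns cover both URL-derived hostnames and raw strings.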

/^fd[0-9a-f]{2}:/i,

P1 fc00::/8 ULA prefix not blocked — SSRF gap

The RFC 4193 Unique Local Address (ULA) range is fc00::/7, which covers both fc00::/8 (the fc prefix) and fd00::/8 (the fd prefix). The current pattern only blocks addresses beginning with fd:

/^fd[0-9a-f]{2}:/i

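A serverUrl like http://[fc00::1]/internal is currently rejected only by the blanket /^\[/ pattern, because url.hostname for it is '[fc00::1]'. Once the brackets are stripped, 'fc00::1' matches no pattern in BLOCKED_HOST_PATTERNS, so any code path that matches bare addresses (or a later removal of /^\[/) would let it pass validateServerUrl.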

fc addresses are rare in practice, but the full ULA range should be blocked for correctness:

Suggested change
/^fd[0-9a-f]{2}:/i,
/^f[cd][0-9a-f]{2}:/i,

This covers both fc and fd prefixes across the entire fc00::/7 range.
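
A quick way to sanity-check the widened pattern is to run it against a few addresses. The sample addresses below are illustrative, and the sketch assumes any enclosing brackets have already been removed from the hostname.

```javascript
// The widened pattern from the suggestion above.
const ULA_PATTERN = /^f[cd][0-9a-f]{2}:/i;

// Illustrative sample addresses (brackets assumed to be stripped already).
const samples = ['fc00::1', 'fd12:3456:789a::1', 'FD00::1', 'fe80::1', '2001:db8::1'];
for (const addr of samples) {
  console.log(addr, ULA_PATTERN.test(addr));
}
```

The first three addresses fall inside fc00::/7 and match; the link-local and documentation-prefix addresses do not.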

/^fe80:/i,
];
@@ -42,6 +48,31 @@ function validateServerUrl(raw) {
return url;
}

// Headers that must not be overridden by user-supplied custom headers.
// Allowing these to be set by the client could lead to SSRF (Host), auth
// hijacking, or request smuggling via hop-by-hop headers.
const BLOCKED_HEADER_NAMES = new Set([
'host',
'cookie',
'set-cookie',
'transfer-encoding',
'content-length',
'connection',
'keep-alive',
'te',
'trailer',
'upgrade',
'proxy-authorization',
'proxy-authenticate',
'via',
'forwarded',
'x-forwarded-for',
'x-forwarded-host',
'x-forwarded-proto',
'x-real-ip',
'cf-connecting-ip',
]);

function buildHeaders(customHeaders) {
const h = {
'Content-Type': 'application/json',
@@ -54,7 +85,8 @@ function buildHeaders(customHeaders) {
// Strip CRLF to prevent header injection
const safeKey = k.replace(/[\r\n]/g, '');
const safeVal = v.replace(/[\r\n]/g, '');
if (safeKey) h[safeKey] = safeVal;
if (!safeKey || BLOCKED_HEADER_NAMES.has(safeKey.toLowerCase())) continue;
h[safeKey] = safeVal;
}
}
}
@@ -334,6 +366,9 @@ export default async function handler(req) {
if (req.method === 'OPTIONS')
return new Response(null, { status: 204, headers: cors });

const rateLimitResponse = await checkRateLimit(req, cors);
if (rateLimitResponse) return rateLimitResponse;

try {
if (req.method === 'GET') {
const url = new URL(req.url);
@@ -369,7 +404,9 @@ export default async function handler(req) {
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
const isTimeout = msg.includes('TimeoutError') || msg.includes('timed out');
// Return 422 (not 502) so Cloudflare proxy does not replace our JSON body with its own HTML error page
return jsonResponse({ error: isTimeout ? 'MCP server timed out' : msg }, isTimeout ? 504 : 422, cors);
console.error('[mcp-proxy] error:', msg);
// Return 422 (not 502) so Cloudflare proxy does not replace our JSON body with its own HTML error page.
// Avoid leaking internal error details to the client.
return jsonResponse({ error: isTimeout ? 'MCP server timed out' : 'MCP request failed' }, isTimeout ? 504 : 422, cors);
}
}
2 changes: 0 additions & 2 deletions api/rss-proxy.js
@@ -184,8 +184,6 @@ export default async function handler(req) {
console.error('RSS proxy error:', feedUrl, error.message);
return jsonResponse({
error: isTimeout ? 'Feed timeout' : 'Failed to fetch feed',
details: error.message,
url: feedUrl
}, isTimeout ? 504 : 502, corsHeaders);
}
}
5 changes: 4 additions & 1 deletion server/_shared/relay.ts
@@ -3,7 +3,10 @@ import { CHROME_UA } from './constants';
export function getRelayBaseUrl(): string | null {
const relayUrl = process.env.WS_RELAY_URL;
if (!relayUrl) return null;
return relayUrl.replace(/^ws(s?):\/\//, 'http$1://').replace(/\/$/, '');
// Always upgrade to HTTPS — cleartext relay connections are not permitted.
const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://');
const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, '');
return secured.replace(/\/$/, '');
}

export function getRelayHeaders(extra: Record<string, string> = {}): Record<string, string> {
4 changes: 3 additions & 1 deletion server/gateway.ts
@@ -220,7 +220,9 @@ export function createDomainGateway(
try {
corsHeaders = getCorsHeaders(request);
} catch {
corsHeaders = { 'Access-Control-Allow-Origin': '*' };
// Never fall back to wildcard CORS — that would bypass the origin allowlist.
// Use the hardcoded production origin as a safe default.
corsHeaders = { 'Access-Control-Allow-Origin': 'https://worldmonitor.app', 'Vary': 'Origin' };
Comment on lines 222 to +225

P1 Hardcoded fallback origin breaks CORS for non-production variants

When getCorsHeaders throws (however unlikely), the fallback is now https://worldmonitor.app. Requests from tech.worldmonitor.app, finance.worldmonitor.app, commodity.worldmonitor.app, etc. would receive Access-Control-Allow-Origin: https://worldmonitor.app, causing browsers to block those cross-origin responses.

The intent to avoid the previous wildcard fallback is correct. A safer approach is to echo back the request's Origin only when it already passed the isDisallowedOrigin check performed a few lines above — by the time we reach this try/catch, any disallowed origin has already been rejected:

Suggested change
} catch {
corsHeaders = { 'Access-Control-Allow-Origin': '*' };
// Never fall back to wildcard CORS — that would bypass the origin allowlist.
// Use the hardcoded production origin as a safe default.
corsHeaders = { 'Access-Control-Allow-Origin': 'https://worldmonitor.app', 'Vary': 'Origin' };
} catch {
// Never fall back to wildcard CORS — that would bypass the origin allowlist.
// The disallowed-origin check above already rejected requests from disallowed
// origins, so it is safe to echo the request origin here.
const fallbackOrigin = request.headers.get('origin') || 'https://worldmonitor.app';
corsHeaders = { 'Access-Control-Allow-Origin': fallbackOrigin, 'Vary': 'Origin' };
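
If the earlier origin check cannot be relied on at this point, a more defensive variant re-validates the origin inside the fallback before echoing it. The sketch below is one possible shape under that assumption; isAllowedOrigin, its subdomain pattern, and fallbackCorsHeaders are hypothetical names, not the gateway's actual helpers.

```javascript
const DEFAULT_ORIGIN = 'https://worldmonitor.app';

// Hypothetical allowlist predicate: the apex domain and its direct subdomains over HTTPS.
function isAllowedOrigin(origin) {
  return /^https:\/\/([a-z0-9-]+\.)?worldmonitor\.app$/.test(origin);
}

// Echo the request origin only when it passes the allowlist; otherwise use the default.
// `request` is expected to expose headers.get(name), as a Fetch API Request does.
function fallbackCorsHeaders(request) {
  const origin = request.headers.get('origin');
  const allowOrigin = origin && isAllowedOrigin(origin) ? origin : DEFAULT_ORIGIN;
  return { 'Access-Control-Allow-Origin': allowOrigin, 'Vary': 'Origin' };
}
```

With this shape, a request from tech.worldmonitor.app gets its own origin echoed back, while a missing or unrecognized origin falls back to the production origin rather than a wildcard.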

}

// OPTIONS preflight
36 changes: 27 additions & 9 deletions server/worldmonitor/news/v1/_classifier.ts
@@ -181,23 +181,41 @@ const SHORT_KEYWORDS = new Set([

const keywordRegexCache = new Map<string, RegExp>();

function getKeywordRegex(kw: string): RegExp {
let re = keywordRegexCache.get(kw);
if (!re) {
re = SHORT_KEYWORDS.has(kw)
? new RegExp(`\\b${kw.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\b`)
: new RegExp(kw.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
keywordRegexCache.set(kw, re);
function escapeRegExp(s: string): string {
return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Pre-build all keyword regexes at module load time so that no RegExp is
// constructed from runtime strings during request handling (eliminates ReDoS surface).
const ALL_KEYWORD_MAPS: KeywordMap[] = [
CRITICAL_KEYWORDS, HIGH_KEYWORDS, MEDIUM_KEYWORDS, LOW_KEYWORDS,
TECH_HIGH_KEYWORDS, TECH_MEDIUM_KEYWORDS, TECH_LOW_KEYWORDS,
];
for (const map of ALL_KEYWORD_MAPS) {
for (const kw of Object.keys(map)) {
if (!keywordRegexCache.has(kw)) {
const escaped = escapeRegExp(kw);
keywordRegexCache.set(kw, SHORT_KEYWORDS.has(kw)
? new RegExp(`\\b${escaped}\\b`)
: new RegExp(escaped));
}
}
return re;
}

function keywordMatches(kw: string, text: string): boolean {
const cached = keywordRegexCache.get(kw);
if (cached) return cached.test(text);
// Fallback for unknown keywords (should not happen with hardcoded maps).
// Use plain string search — no dynamic RegExp construction at runtime.
return text.includes(kw);
}

function matchKeywords(
titleLower: string,
keywords: KeywordMap
): { keyword: string; category: EventCategory } | null {
for (const [kw, cat] of Object.entries(keywords)) {
if (getKeywordRegex(kw).test(titleLower)) {
if (keywordMatches(kw, titleLower)) {
return { keyword: kw, category: cat };
}
}
12 changes: 8 additions & 4 deletions server/worldmonitor/news/v1/list-feed-digest.ts
@@ -182,15 +182,19 @@ for (const tag of KNOWN_TAGS) {
});
}

/**
* Extract the text content of an XML tag. Only pre-cached tag names (see
* KNOWN_TAGS) are accepted — unknown tags return '' immediately. This avoids
* constructing RegExp from runtime strings, eliminating any ReDoS risk.
*/
function extractTag(xml: string, tag: string): string {
const cached = TAG_REGEX_CACHE.get(tag);
const cdataRe = cached?.cdata ?? new RegExp(`<${tag}[^>]*>\\s*<!\\[CDATA\\[([\\s\\S]*?)\\]\\]>\\s*<\\/${tag}>`, 'i');
const plainRe = cached?.plain ?? new RegExp(`<${tag}[^>]*>([^<]*)<\\/${tag}>`, 'i');
if (!cached) return '';

const cdataMatch = xml.match(cdataRe);
const cdataMatch = xml.match(cached.cdata);
if (cdataMatch) return cdataMatch[1]!.trim();

const match = xml.match(plainRe);
const match = xml.match(cached.plain);
return match ? decodeXmlEntities(match[1]!.trim()) : '';
}

10 changes: 8 additions & 2 deletions src/components/DeductionPanel.ts
@@ -1,5 +1,5 @@
import { Panel } from './Panel';
import { getRpcBaseUrl } from '@/services/rpc-client';
import { getRpcBaseUrl } from '@/services/rpc-client';
import { IntelligenceServiceClient } from '@/generated/client/worldmonitor/intelligence/v1/service_client';
import { h, replaceChildren } from '@/utils/dom-utils';
import { marked } from 'marked';
@@ -129,7 +129,13 @@ export class DeductionPanel extends Panel {
if (resp.analysis) {
const parsed = await marked.parse(resp.analysis);
if (!this.element?.isConnected) return;
this.resultContainer.innerHTML = DOMPurify.sanitize(parsed);
this.resultContainer.innerHTML = DOMPurify.sanitize(parsed, {
ALLOWED_TAGS: ['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'ul', 'ol', 'li',
'strong', 'em', 'b', 'i', 'br', 'hr', 'code', 'pre', 'blockquote',
'table', 'thead', 'tbody', 'tr', 'th', 'td', 'span', 'div', 'small'],
ALLOWED_ATTR: ['class'],
ALLOW_DATA_ATTR: false,
});

const meta = h('div', { style: 'margin-top: 12px; font-size: 0.75em; color: #888;' },
`Generated by ${resp.provider || 'AI'}${resp.model ? ` (${resp.model})` : ''}`
10 changes: 10 additions & 0 deletions src/components/LiveWebcamsPanel.ts
@@ -415,10 +415,20 @@ export class LiveWebcamsPanel extends Panel {
container.appendChild(overlay);
}

private static readonly TRUSTED_ORIGINS = new Set([
'https://www.youtube.com',
'https://www.youtube-nocookie.com',
'https://webcams.windy.com',
]);

private handleEmbedMessage(e: MessageEvent): void {
const iframe = this.findIframeBySource(e.source);
if (!iframe) return;

// Validate origin: only accept messages from YouTube, Windy, or the local sidecar.
const localOrigin = isDesktopRuntime() ? `http://localhost:${getLocalApiPort()}` : null;
if (!LiveWebcamsPanel.TRUSTED_ORIGINS.has(e.origin) && e.origin !== localOrigin) return;

// Desktop sidecar posts { type: 'yt-ready' | 'yt-state' | 'yt-error' }
const msg = e.data as { type?: string; state?: number; code?: number; event?: string; info?: unknown } | string | null;

4 changes: 3 additions & 1 deletion src/utils/widget-sanitizer.ts
@@ -44,7 +44,9 @@ export function wrapWidgetHtml(html: string, extraClass = ''): string {
function escapeSrcdoc(str: string): string {
return str
.replace(/&/g, '&amp;')
.replace(/"/g, '&quot;');
.replace(/"/g, '&quot;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
}

export function wrapProWidgetHtml(bodyContent: string): string {
7 changes: 4 additions & 3 deletions tests/mcp-proxy.test.mjs
@@ -183,7 +183,7 @@ describe('api/mcp-proxy', () => {
const res = await handler(makeGetRequest({ serverUrl: 'https://mcp.example.com/mcp' }));
assert.equal(res.status, 422);
const data = await res.json();
assert.match(data.error, /Method not found/i);
assert.match(data.error, /MCP request failed/i);
});

it('returns 504 on fetch timeout', async () => {
@@ -303,7 +303,7 @@ describe('api/mcp-proxy', () => {
}));
assert.equal(res.status, 422);
const data = await res.json();
assert.match(data.error, /Unknown tool/i);
assert.match(data.error, /MCP request failed/i);
});

it('returns 504 on timeout during tool call', async () => {
@@ -390,7 +390,8 @@ describe('api/mcp-proxy', () => {
const res = await handler(makeGetRequest({ serverUrl: 'https://mcp.example.com/sse' }));
assert.equal(res.status, 422);
const data = await res.json();
assert.match(data.error, /blocked|SSRF|endpoint/i);
// Error message is intentionally generic to avoid leaking internals
assert.match(data.error, /MCP request failed|blocked|SSRF|endpoint/i);
});
});

8 changes: 5 additions & 3 deletions tests/relay-helper.test.mjs
@@ -56,9 +56,9 @@ describe('getRelayBaseUrl', () => {
assert.equal(getRelayBaseUrl(), 'https://relay.example.com');
});

it('converts ws:// to http://', () => {
it('converts insecure websocket scheme to https://', () => {
process.env.WS_RELAY_URL = 'ws://relay.example.com';
assert.equal(getRelayBaseUrl(), 'http://relay.example.com');
assert.equal(getRelayBaseUrl(), 'https://relay.example.com');
});

it('strips trailing slash', () => {
@@ -310,7 +310,9 @@ describe('createRelayHandler', () => {
assert.equal(res.status, 502);
const body = await res.json();
assert.equal(body.error, 'Relay request failed');
assert.equal(body.details, 'Connection refused');
// Internal error details are intentionally omitted from the response
// to prevent information leakage — they are logged server-side only.
assert.equal(body.details, undefined);
});

it('calls fallback when relay unavailable', async () => {
10 changes: 6 additions & 4 deletions tests/shared-relay.test.mjs
@@ -45,7 +45,9 @@ function loadRelayFunctions() {
const getRelayBaseUrl = function () {
const relayUrl = process.env.WS_RELAY_URL;
if (!relayUrl) return null;
return relayUrl.replace(/^ws(s?):\/\//, 'http$1://').replace(/\/$/, '');
const httpUrl = relayUrl.replace(/^wss:\/\//, 'https://');
const secured = httpUrl.startsWith('https://') ? httpUrl : 'https://' + httpUrl.replace(/^[a-z]+:\/\//, '');
return secured.replace(/\/$/, '');
};

const getRelayHeaders = function (extra = {}) {
@@ -65,7 +67,7 @@ function loadRelayFunctions() {
};

// Verify source file still matches expected logic shape
assert.ok(src.includes('replace(/^ws(s?):\\/\\//'), 'relay.ts must use single-regex wss:// transform');
assert.ok(src.includes('wss:'), 'relay.ts must handle wss:// transform');
assert.ok(src.includes('...extra'), 'relay.ts must spread extra before auth headers');
assert.ok(src.includes("relayHeader !== 'authorization'"), 'relay.ts must guard against Authorization header collision');

@@ -89,9 +91,9 @@ describe('getRelayBaseUrl', () => {
});
});

it('transforms ws:// to http://', () => {
it('transforms insecure websocket scheme to https://', () => {
withEnv({ WS_RELAY_URL: 'ws://relay.example.com' }, () => {
assert.equal(getRelayBaseUrl(), 'http://relay.example.com');
assert.equal(getRelayBaseUrl(), 'https://relay.example.com');
});
});

4 changes: 1 addition & 3 deletions vercel.json
@@ -13,9 +13,7 @@
{
"source": "/api/(.*)",
"headers": [
{ "key": "Access-Control-Allow-Origin", "value": "*" },
{ "key": "Access-Control-Allow-Methods", "value": "GET, POST, OPTIONS" },
{ "key": "Access-Control-Allow-Headers", "value": "Content-Type, Authorization, X-WorldMonitor-Key" }
{ "key": "X-Content-Type-Options", "value": "nosniff" }
]
},
{