-
Notifications
You must be signed in to change notification settings - Fork 1k
fix: Vercel Blob hostname validation in documentUploadSchema and update example env #1840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: Vercel Blob hostname validation in documentUploadSchema and update example env #1840
Conversation
WalkthroughCentralizes and extends URL/hostname validation (Notion official, .notion.site, custom domains, Vercel blob host via VERCEL_BLOB_HOST or vercel-storage.com) into lib/zod/url-validation.ts, adds Notion custom-domain page-id extraction/verification, introduces getDubDiscountForExternalUserId and applies it in billing checkout, and updates various endpoints to use async Zod validation. Changes
Sequence Diagram(s)%% Notion URL validation and page-id resolution (high-level)
sequenceDiagram
autonumber
actor Client
participant API as Server Endpoint
participant Zod as url-validation
participant NotionUtils as lib/notion/utils
participant Env as Env (VERCEL_BLOB_HOST)
Client->>API: submit notion URL
API->>Zod: notionUrlUpdateSchema.safeParseAsync(url)
activate Zod
Zod->>Env: read VERCEL_BLOB_HOST
Env-->>Zod: host or undefined
note right of Zod #eef2ff: validateUrlHostname checks\n- official Notion hostnames\n- endsWith(".notion.site")\n- Vercel blob host via env/fallback
alt hostname is standard Notion
Zod-->>API: valid (may parse pageId via parsePageId)
else custom or non-standard hostname
Zod->>NotionUtils: getNotionPageIdFromSlug(url)
NotionUtils-->>Zod: pageId or null
alt pageId found
Zod-->>API: valid
else
Zod-->>API: invalid (400)
end
end
deactivate Zod
%% Billing upgrade: applying Dub discount (high-level)
sequenceDiagram
autonumber
actor User
participant Server as billing/upgrade API
participant Dub as lib/dub.getDubDiscountForExternalUserId
participant Stripe as Stripe Checkout
User->>Server: request upgrade
Server->>Dub: getDubDiscountForExternalUserId(user.externalId)
Dub-->>Server: dubDiscount or null
Server->>Stripe: createCheckoutSession(payload + dubDiscount?)
Stripe-->>Server: session created
Server-->>User: redirect/checkout URL
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
All contributors have signed the CLA ✍️ ✅ |
@kprocks86 is attempting to deploy a commit to the mftsio Team on Vercel. A member of the Team first needs to authorize it. |
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (6)
.env.example (1)
14-14
: Clarify what value VERCEL_BLOB_HOST expects (base domain suffix, no protocol).Avoid confusion between full host and base domain. This also aligns with the validation logic that matches by domain suffix.
-VERCEL_BLOB_HOST=vercel-storage.com +# Vercel Storage Blob base host used for validation (domain suffix only) +# Example: vercel-storage.com (do not include protocol, path, or a leading dot) +VERCEL_BLOB_HOST=vercel-storage.comlib/zod/url-validation.ts (5)
107-116
: Update comment and error message to reflect that Vercel Blob HTTPS URLs are now accepted by filePathSchema.The implementation now allows Vercel Blob hosts via
validateUrlHostname
, but the comment and error still imply Notion-only.- // Case 1: Notion URLs - must start with notion domains + // Case 1: Hosted URLs - must be Notion or Vercel Blob hosts @@ - message: - "File path must be either a Notion URL or an file storage path", + message: + "File path must be a Notion/Vercel Blob URL or a storage path",Also applies to: 124-126
206-225
: Deduplicate Notion-host checks; consider a small helper for consistency.The Notion hostname logic is duplicated here instead of reusing a single helper. Extracting
isNotionHostname(h: string)
reduces drift and keeps behavior aligned if domains change.Example helper to place above:
const isNotionHostname = (h: string): boolean => { const host = h.toLowerCase(); return ( host === "www.notion.so" || host === "notion.so" || host.endsWith(".notion.site") ); };Then use:
- const hostname = urlObj.hostname; - return ( - hostname === "www.notion.so" || - hostname === "notion.so" || - hostname.endsWith(".notion.site") - ); + return isNotionHostname(urlObj.hostname);
231-241
: Comment no longer matches behavior.This branch now accepts Vercel Blob (and Notion) hosts via
validateUrlHostname
, but the comment says “Must be a Notion URL”.- // Must be a Notion URL for VERCEL_BLOB + // Must be a Notion or Vercel Blob URL for VERCEL_BLOB
118-122
: Avoid repeating the S3 path regex; extract a shared constant.The same strict pattern appears in two places; centralize to reduce maintenance risk.
- const s3PathPattern = - /^[a-zA-Z0-9_-]+\/doc_[a-zA-Z0-9_-]+\/[a-zA-Z0-9_.-]+\.[a-zA-Z0-9]+$/; + const s3PathPattern = S3_PATH_PATTERN;Add near the top of the file:
const S3_PATH_PATTERN = /^[a-zA-Z0-9_-]+\/doc_[a-zA-Z0-9_-]+\/[a-zA-Z0-9_.-]+\.[a-zA-Z0-9]+$/;And here:
- return /^[a-zA-Z0-9_-]+\/doc_[a-zA-Z0-9_-]+\/[a-zA-Z0-9_.-]+\.[a-zA-Z0-9]+$/.test( - data.url, - ); + return S3_PATH_PATTERN.test(data.url);Also applies to: 243-245
79-99
: Add tests for hostname validation edge cases.Target cases that would pass with
includes
but must be rejected with suffix checks.Suggested cases:
- Accept: example.public.blob.vercel-storage.com
- Accept: public.blob.vercel-storage.com
- Accept when env set to blob.vercel-storage.com: foo.blob.vercel-storage.com
- Reject: vercel-storage.com.attacker.tld
- Reject: notvercel-storage.com
- Case-insensitivity: VERCEL-STORAGE.COM and MixedCase hosts
- Notion: www.notion.so, notion.so, foo.notion.site
I can open a follow-up PR with these unit tests if helpful.
Also applies to: 112-116, 231-241
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
.env.example
(1 hunks)lib/zod/url-validation.ts
(3 hunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
lib/zod/url-validation.ts (1)
89-104
: Good fix: replaced substring matching with exact/suffix allowlist for Vercel BlobSwitching from includes(...) to exact match or dot-suffix checks closes the allowlist bypass (e.g., vercel-storage.com.attacker.tld). Nice improvement.
🧹 Nitpick comments (1)
lib/zod/url-validation.ts (1)
121-121
: createFilePathValidator now accepts Vercel Blob URLs; update the stale comment and confirm intentThis now returns validateUrlHostname(hostname), which also allows Vercel Blob hosts, not just Notion. That’s likely intended, but the nearby comment “Case 1: Notion URLs - must start with notion domains” is now misleading, and the error message below still says “either a Notion URL or a file storage path.”
- Please update the comment to reflect “trusted HTTPS URLs (Notion or Vercel Blob)”.
- Consider aligning the error message accordingly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
lib/zod/url-validation.ts
(3 hunks)
🔇 Additional comments (1)
lib/zod/url-validation.ts (1)
246-246
: Consistent reuse of hostname helper in storageType checkGood reuse of validateUrlHostname here; keeps the allowlist logic centralized and consistent.
// Helper function to validate URL hostnames for different storage types | ||
const validateUrlHostname = (hostname: string): boolean => { | ||
// Valid notion domains | ||
const validNotionDomains = ["www.notion.so", "notion.so"]; | ||
|
||
// Check for notion.site subdomains (e.g., example-something.notion.site) | ||
const isNotionSite = hostname.endsWith(".notion.site"); | ||
const isValidNotionDomain = validNotionDomains.includes(hostname); | ||
|
||
// Check for vercel blob storage | ||
let isVercelBlob = false; | ||
|
||
const normalizedHostname = hostname.toLowerCase().trim(); | ||
|
||
if (process.env.VERCEL_BLOB_HOST) { | ||
|
||
const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim(); | ||
// Use exact match or suffix-with-dot to prevent bypasses | ||
isVercelBlob = normalizedHostname === normalizedBlobHost || | ||
normalizedHostname.endsWith("." + normalizedBlobHost); | ||
} else { | ||
// Fallback: check for common Vercel Blob patterns if env var is not set | ||
// Use exact match or suffix-with-dot to prevent bypasses | ||
isVercelBlob = normalizedHostname === "vercel-storage.com" || | ||
normalizedHostname.endsWith(".vercel-storage.com"); | ||
} | ||
|
||
return isNotionSite || isValidNotionDomain || isVercelBlob; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Lowercase and normalize hostnames for all checks; fix case-sensitivity and trailing-dot edge cases
Notion checks use the raw hostname while Vercel uses a normalized one. Hostnames are case-insensitive, and trailing dots are valid. Today, a URL like https://WWW.NOTION.SO/ or https://page.notion.site./ will be rejected, while https://foo.vercel-storage.com./ won’t pass either due to the trailing dot. Normalize once and reuse the same variable across all checks.
Apply this diff to make the checks robust and consistent:
-// Helper function to validate URL hostnames for different storage types
-const validateUrlHostname = (hostname: string): boolean => {
- // Valid notion domains
- const validNotionDomains = ["www.notion.so", "notion.so"];
-
- // Check for notion.site subdomains (e.g., example-something.notion.site)
- const isNotionSite = hostname.endsWith(".notion.site");
- const isValidNotionDomain = validNotionDomains.includes(hostname);
-
- // Check for vercel blob storage
- let isVercelBlob = false;
-
- const normalizedHostname = hostname.toLowerCase().trim();
-
- if (process.env.VERCEL_BLOB_HOST) {
-
- const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim();
- // Use exact match or suffix-with-dot to prevent bypasses
- isVercelBlob = normalizedHostname === normalizedBlobHost ||
- normalizedHostname.endsWith("." + normalizedBlobHost);
- } else {
- // Fallback: check for common Vercel Blob patterns if env var is not set
- // Use exact match or suffix-with-dot to prevent bypasses
- isVercelBlob = normalizedHostname === "vercel-storage.com" ||
- normalizedHostname.endsWith(".vercel-storage.com");
- }
-
- return isNotionSite || isValidNotionDomain || isVercelBlob;
-};
+// Helper function to validate URL hostnames for different storage types
+const validateUrlHostname = (hostname: string): boolean => {
+ // Normalize once: lowercase, trim, and strip leading/trailing dots
+ const h = hostname.toLowerCase().trim().replace(/^\.+|\.+$/g, "");
+
+ // Valid Notion domains
+ const validNotionDomains = ["www.notion.so", "notion.so"];
+
+ // Check for notion.site subdomains (e.g., example-something.notion.site)
+ const isNotionSite = h.endsWith(".notion.site");
+ const isValidNotionDomain = validNotionDomains.includes(h);
+
+ // Check for Vercel Blob storage by exact-or-suffix match
+ const base = (process.env.VERCEL_BLOB_HOST || "vercel-storage.com")
+ .toLowerCase()
+ .trim()
+ .replace(/^\.+|\.+$/g, "");
+ const isVercelBlob = h === base || h.endsWith(`.${base}`);
+
+ return isNotionSite || isValidNotionDomain || isVercelBlob;
+};
If you want, I can add unit tests covering:
- Mixed-case hosts (WWW.NOTION.SO)
- Trailing-dot hosts (page.notion.site., foo.vercel-storage.com.)
- Env override (custom base with/without trailing dot)
- Negative case (vercel-storage.com.attacker.tld)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Helper function to validate URL hostnames for different storage types | |
const validateUrlHostname = (hostname: string): boolean => { | |
// Valid notion domains | |
const validNotionDomains = ["www.notion.so", "notion.so"]; | |
// Check for notion.site subdomains (e.g., example-something.notion.site) | |
const isNotionSite = hostname.endsWith(".notion.site"); | |
const isValidNotionDomain = validNotionDomains.includes(hostname); | |
// Check for vercel blob storage | |
let isVercelBlob = false; | |
const normalizedHostname = hostname.toLowerCase().trim(); | |
if (process.env.VERCEL_BLOB_HOST) { | |
const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim(); | |
// Use exact match or suffix-with-dot to prevent bypasses | |
isVercelBlob = normalizedHostname === normalizedBlobHost || | |
normalizedHostname.endsWith("." + normalizedBlobHost); | |
} else { | |
// Fallback: check for common Vercel Blob patterns if env var is not set | |
// Use exact match or suffix-with-dot to prevent bypasses | |
isVercelBlob = normalizedHostname === "vercel-storage.com" || | |
normalizedHostname.endsWith(".vercel-storage.com"); | |
} | |
return isNotionSite || isValidNotionDomain || isVercelBlob; | |
}; | |
// Helper function to validate URL hostnames for different storage types | |
const validateUrlHostname = (hostname: string): boolean => { | |
// Normalize once: lowercase, trim whitespace, and strip leading/trailing dots | |
const h = hostname | |
.toLowerCase() | |
.trim() | |
.replace(/^\.+|\.+$/g, ""); | |
// Valid Notion domains | |
const validNotionDomains = ["www.notion.so", "notion.so"]; | |
// Check for notion.site subdomains (e.g., example-something.notion.site) | |
const isNotionSite = h.endsWith(".notion.site"); | |
const isValidNotionDomain = validNotionDomains.includes(h); | |
// Check for Vercel Blob storage by exact-or-suffix match | |
const base = (process.env.VERCEL_BLOB_HOST || "vercel-storage.com") | |
.toLowerCase() | |
.trim() | |
.replace(/^\.+|\.+$/g, ""); | |
const isVercelBlob = h === base || h.endsWith(`.${base}`); | |
return isNotionSite || isValidNotionDomain || isVercelBlob; | |
}; |
🤖 Prompt for AI Agents
In lib/zod/url-validation.ts around lines 79 to 107, the function currently
mixes raw and normalized hostnames causing case-sensitivity and trailing-dot
issues; normalize the input hostname once (lowercase, trim, and remove any
trailing dot) and use that normalizedHostname for all checks (notion domains,
notion.site suffix, and vercel blob checks); also normalize/process
process.env.VERCEL_BLOB_HOST the same way (lowercase, trim, remove trailing dot)
before comparing and continue to use exact-match or suffix-with-dot comparisons
(e.g., normalizedHostname === normalizedBlobHost ||
normalizedHostname.endsWith("." + normalizedBlobHost)) to avoid bypasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pages/api/teams/[teamId]/billing/upgrade.ts (1)
85-102
: Dub discount is not applied for existing Stripe customers.
dubDiscount
is only spread into the “new customer” session. Existing customers won’t get the discount, which likely contradicts intent.Apply this diff to include the discount in the existing-customer flow:
allow_promotion_codes: true, - client_reference_id: teamId, + client_reference_id: teamId, + ...(dubDiscount ?? {}),
♻️ Duplicate comments (3)
lib/zod/url-validation.ts (3)
135-139
: Fix allowlist bypass:endsWith(process.env.VERCEL_BLOB_HOST)
accepts sibling/attacker domains.
evilvercel-storage.com
passesendsWith("vercel-storage.com")
. Use exact-or-dot-bounded suffix with normalization and add a safe fallback when the env is unset. Also, reuse the helper to avoid divergence.- // Check for vercel blob storage - let isVercelBlob = false; - if (process.env.VERCEL_BLOB_HOST) { - isVercelBlob = hostname.endsWith(process.env.VERCEL_BLOB_HOST); - } + // Known hosts (Notion or Vercel Blob) short-circuit + const isKnownHost = validateUrlHostname(hostname); + const isVercelBlob = isKnownHost; // encompassed by helper
354-371
: Repeat of the Vercel Blob allowlist bypass; also missing fallback when env is absent.This duplicates the earlier issue:
endsWith(process.env.VERCEL_BLOB_HOST)
is unsafe and fails when the env var isn’t set (even for legitimate*.vercel-storage.com
hosts). Reuse the hardened helper.- const isStandardNotion = - hostname === "www.notion.so" || - hostname === "notion.so" || - hostname.endsWith(".notion.site"); - - if (isStandardNotion) { - return true; - } - - // Check for vercel blob storage - let isVercelBlob = false; - if (process.env.VERCEL_BLOB_HOST) { - isVercelBlob = hostname.endsWith(process.env.VERCEL_BLOB_HOST); - if (isVercelBlob) { - return true; - } - } + // Known hostnames (Notion or Vercel Blob) are accepted + if (validateUrlHostname(hostname)) { + return true; + }
84-112
: Centralize and harden host checks; normalize everything and require dot-bounded suffix.Great to see a helper; but:
- Notion checks use raw
hostname
, causing case/trailing-dot edge cases.- Vercel check is correct here (exact or
.
-bounded suffix), but this helper isn’t reused below, leading to drift.Normalize once, use it for all families, and then reuse this helper elsewhere in the file.
-const validateUrlHostname = (hostname: string): boolean => { - // Valid notion domains - const validNotionDomains = ["www.notion.so", "notion.so"]; - - // Check for notion.site subdomains (e.g., example-something.notion.site) - const isNotionSite = hostname.endsWith(".notion.site"); - const isValidNotionDomain = validNotionDomains.includes(hostname); - - // Check for vercel blob storage - let isVercelBlob = false; - - const normalizedHostname = hostname.toLowerCase().trim(); - - if (process.env.VERCEL_BLOB_HOST) { - - const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim(); - // Use exact match or suffix-with-dot to prevent bypasses - isVercelBlob = normalizedHostname === normalizedBlobHost || - normalizedHostname.endsWith("." + normalizedBlobHost); - } else { - // Fallback: check for common Vercel Blob patterns if env var is not set - // Use exact match or suffix-with-dot to prevent bypasses - isVercelBlob = normalizedHostname === "vercel-storage.com" || - normalizedHostname.endsWith(".vercel-storage.com"); - } - - return isNotionSite || isValidNotionDomain || isVercelBlob; -}; +const validateUrlHostname = (hostname: string): boolean => { + // Normalize once: lowercase, trim, and strip leading/trailing dots + const h = hostname.toLowerCase().trim().replace(/^\.+|\.+$/g, ""); + + // Notion + const validNotionDomains = ["www.notion.so", "notion.so"]; + const isNotionSite = h.endsWith(".notion.site"); + const isValidNotionDomain = validNotionDomains.includes(h); + + // Vercel Blob: exact or dot-bounded suffix; allow fallback when env is absent + const base = (process.env.VERCEL_BLOB_HOST || "vercel-storage.com") + .toLowerCase() + .trim() + .replace(/^\.+|\.+$/g, ""); + const isVercelBlob = h === base || h.endsWith(`.${base}`); + + return isNotionSite || isValidNotionDomain || isVercelBlob; +};
🧹 Nitpick comments (18)
lib/dub.ts (2)
6-14
: Early-exit when DUB_API_KEY is unset and tolerate SDK list shapes; add a narrow return type.Avoids unnecessary network calls and noisy logs when the key is missing, and handles both array and paginated shapes (
{ data: [...] }
) defensively. Also clarifies the return payload shape so it matches the Stripe param we spread into session creation.@@ -export async function getDubDiscountForExternalUserId(externalId: string) { +type StripeDiscountsParam = { discounts: { coupon: string }[] }; + +export async function getDubDiscountForExternalUserId( + externalId: string, +): Promise<StripeDiscountsParam | null> { + // Fast-fail if not configured to avoid noisy network errors in non-prod + if (!process.env.DUB_API_KEY) { + if (process.env.NODE_ENV !== "production") { + console.warn("DUB_API_KEY is not set; skipping Dub discount lookup"); + } + return null; + } try { - const customers = await dub.customers.list({ + const customers = await dub.customers.list({ externalId, includeExpandedFields: true, }); - const first = customers[0]; + const list = Array.isArray(customers) ? customers : customers?.data ?? []; + const first = list[0];
19-23
: Narrow the error log to avoid leaking sensitive data; add context.Some SDK errors can include request details. Log only the message and relevant context to reduce risk.
- } catch (err) { - console.warn("Skipping Dub discount due to API error", err); + } catch (err) { + console.warn("Skipping Dub discount due to API error", { + externalId, + message: err instanceof Error ? err.message : String(err), + });pages/api/teams/[teamId]/billing/upgrade.ts (1)
85-102
: Mirror metadata for observability parity with the new-customer path.Optional, but adding
metadata.dubCustomerId
in the existing-customer branch keeps analytics/debugging consistent.allow_promotion_codes: true, client_reference_id: teamId, + metadata: { + dubCustomerId: userId, + },pages/api/teams/[teamId]/documents/index.ts (1)
245-247
: Use safeParseAsync: correct. Consider deriving type when absent for robustnessThe async validation change is correct. To reduce client coupling, consider deriving
type
when it’s not provided (similar to agreement route). This avoids 400s if older clients omittype
.Apply:
- const validationResult = await documentUploadSchema.safeParseAsync( - req.body, - ); + const validationResult = await documentUploadSchema.safeParseAsync({ + ...req.body, + // Derive type from URL/filename if client omitted it + type: req.body.type ?? (req.body.url ? getExtension(req.body.url) : getExtension(req.body.name)), + });Outside the selected lines, update the import:
-import { log, serializeFileSize } from "@/lib/utils"; +import { log, serializeFileSize, getExtension } from "@/lib/utils";pages/api/teams/[teamId]/documents/[id]/versions/index.ts (1)
35-38
: Async validation: good. Optional: derive type if clients omit itSwitching to
safeParseAsync
is correct. If some clients don’t sendtype
, consider deriving it fromurl
to keep the endpoint backward compatible.Apply:
- const validationResult = await documentUploadSchema.safeParseAsync({ - ...req.body, - name: `Version ${new Date().toISOString()}`, // Dummy name for validation - }); + const validationResult = await documentUploadSchema.safeParseAsync({ + ...req.body, + name: `Version ${new Date().toISOString()}`, // Dummy name for validation + type: req.body.type ?? getExtension(req.body.url), + });And add the import outside the selected lines:
-import { log } from "@/lib/utils"; +import { log, getExtension } from "@/lib/utils";lib/api/documents/process-document.ts (1)
71-72
: Fix typo in user-facing error message (“publically” → “publicly”)This is shown to users and appears in logs. Quick copy edit improves polish.
- throw new Error("This Notion page isn't publically available."); + throw new Error("This Notion page isn't publicly available.");pages/api/teams/[teamId]/documents/[id]/update-notion-url.ts (1)
28-36
: Schema-driven validation with safeParseAsync is the right move; log invalid attemptsUsing
notionUrlUpdateSchema.safeParseAsync
centralizes validation and SSRF checks. Consider logging invalid attempts for observability.- const validationResult = await notionUrlUpdateSchema.safeParseAsync(url); - if (!validationResult.success) { - return res.status(400).json({ message: validationResult.error.message }); - } + const validationResult = await notionUrlUpdateSchema.safeParseAsync(url); + if (!validationResult.success) { + console.warn( + "[update-notion-url] Validation failed", + JSON.stringify(validationResult.error.errors ?? validationResult.error.issues ?? validationResult.error), + ); + return res.status(400).json({ message: validationResult.error.message }); + }components/documents/add-document-modal.tsx (2)
408-421
: Simplify ID derivation; avoid null-vs-undefined churn and double-naming.Use a single
pageId
variable, coalesceparsePageId
toundefined
, and only call the async slug resolver if still falsy. This reduces branching and avoids reassigningundefined
explicitly.- let validateNotionPageId = parsePageId(notionLink); + let pageId = parsePageId(notionLink) ?? undefined; - if (validateNotionPageId === null) { + if (!pageId) { try { - const pageId = await getNotionPageIdFromSlug(notionLink); - validateNotionPageId = pageId || undefined; + pageId = (await getNotionPageIdFromSlug(notionLink)) ?? undefined; } catch (slugError) { toast.error("Please enter a valid Notion link to proceed."); return; } } - if (!validateNotionPageId) { + if (!pageId) { toast.error("Please enter a valid Notion link to proceed."); return; }
408-421
: Consider reusing the shared Zod notion URL schema client-side.To keep client and server semantics identical, you can import
notionUrlUpdateSchema
and callsafeParseAsync(notionLink)
here, replacing bespoke parsing. This prevents drift as validation evolves.Would you like me to wire
notionUrlUpdateSchema.safeParseAsync
into this handler and adjust the UI message handling?lib/notion/utils.ts (3)
176-183
: Remove forbiddenUser-Agent
header for browser compatibility.Browsers disallow setting the
User-Agent
header; this will be ignored or error in some environments. It’s unnecessary here.const response = await fetch(apiUrl, { method: "POST", headers: { "Content-Type": "application/json", - "User-Agent": "Mozilla/5.0 (compatible; MyApp/1.0)", }, body: JSON.stringify(payload), });
82-101
: Nice fallback path; consider decoding the pathname before regex.
parsePageId
already handles most formats. For the regex fallback, decoding%2D
-style encodings can help in edge cases.- const pathname = urlObj.pathname; + const pathname = decodeURIComponent(urlObj.pathname);
107-120
: Network errors can yield false negatives for custom domains.
isCustomNotionDomain
returnsfalse
on any error (including transient network issues), potentially rejecting valid URLs. Consider distinguishing unreachable vs. invalid: treat network timeouts/5xx as “unknown” and let server-side validation decide, or retry once.I can add a conservative retry with exponential backoff and map 4xx to false, 5xx/network to true-with-warning or “unknown” depending on your preference.
lib/zod/url-validation.ts (6)
140-156
: Deduplicate logic: call the helper first, then fall back to custom-domain Notion resolution.This keeps all standard host logic in one place and reduces the chance of drift.
- // If it's not a standard Notion domain or Vercel blob, check if it's a custom Notion domain - if (!isNotionSite && !isValidNotionDomain && !isVercelBlob) { + // If not a known host (Notion/Vercel Blob), try custom Notion domain resolution + if (!isKnownHost) { try { let pageId = parsePageId(path); if (!pageId) { const pageIdFromSlug = await getNotionPageIdFromSlug(path); if (pageIdFromSlug) { pageId = pageIdFromSlug; } } return !!pageId; } catch { return false; } } - return isNotionSite || isValidNotionDomain || isVercelBlob; + return true;
168-171
: Grammar nit: article usage.“an file storage path” → “a file storage path”.
- "File path must be either a Notion URL or an file storage path", + "File path must be either a Notion URL or a file storage path",
190-192
: Scheme check should be case-insensitive and robust.
startsWith("https://")
missesHTTPS://…
and is bypassable with leading whitespace. Prefer parsing the URL and checkingprotocol === "https:"
.- .refine((url) => url.startsWith("https://"), { + .refine((url) => { + try { + return new URL(url).protocol === "https:"; + } catch { + return false; + } + }, { message: "Notion URL must use HTTPS", })
186-247
: Optional: use the helper innotionUrlUpdateSchema
too.Not strictly required, but calling
validateUrlHostname(urlObj.hostname)
before custom resolution would short-circuit common cases and keep semantics consistent across validators.I can submit a follow-up patch applying the helper here and add unit tests to lock behavior.
248-394
: Message clarity and consistency across branches.Error messages between validators differ slightly; consider aligning them (e.g., “URL contains invalid characters or targets internal resources”) for a consistent UX.
Happy to unify copy across schemas and add i18n keys if you have a translations layer.
120-171
: Add comprehensive domain normalization tests for URL validationPlease add unit tests covering both Notion URL and file–storage path validators to guard against mixed-case, trailing-dot, and malicious domains. Specifically, verify:
- Mixed-case domains (e.g.
https://WWW.NOTION.SO/
)- Trailing-dot hosts (e.g.
https://page.notion.site./abc
)- Valid Vercel blob hosts (e.g.
https://example123.public.blob.vercel-storage.com
)- Invalid attacker variants (e.g.
https://evilvercel-storage.com
,https://vercel-storage.com.attacker.tld
)- Behavior when
process.env.VERCEL_BLOB_HOST
is absent versus present, and with/without a trailing dotApply these tests to both validation blocks (currently at lines 120–171 and lines 312–394 in
lib/zod/url-validation.ts
).Let me know if you’d like help scaffolding these tests.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (10)
components/documents/add-document-modal.tsx
(1 hunks)lib/api/documents/process-document.ts
(1 hunks)lib/dub.ts
(1 hunks)lib/notion/utils.ts
(4 hunks)lib/zod/url-validation.ts
(6 hunks)pages/api/teams/[teamId]/billing/upgrade.ts
(3 hunks)pages/api/teams/[teamId]/documents/[id]/update-notion-url.ts
(4 hunks)pages/api/teams/[teamId]/documents/[id]/versions/index.ts
(1 hunks)pages/api/teams/[teamId]/documents/agreement.ts
(1 hunks)pages/api/teams/[teamId]/documents/index.ts
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
pages/api/teams/[teamId]/documents/index.ts (1)
lib/zod/url-validation.ts (1)
documentUploadSchema
(249-394)
pages/api/teams/[teamId]/documents/agreement.ts (1)
lib/zod/url-validation.ts (1)
documentUploadSchema
(249-394)
pages/api/teams/[teamId]/documents/[id]/versions/index.ts (1)
lib/zod/url-validation.ts (1)
documentUploadSchema
(249-394)
components/documents/add-document-modal.tsx (1)
lib/notion/utils.ts (1)
getNotionPageIdFromSlug
(122-198)
lib/api/documents/process-document.ts (1)
lib/notion/utils.ts (1)
getNotionPageIdFromSlug
(122-198)
pages/api/teams/[teamId]/documents/[id]/update-notion-url.ts (1)
lib/zod/url-validation.ts (1)
notionUrlUpdateSchema
(187-246)
pages/api/teams/[teamId]/billing/upgrade.ts (1)
lib/dub.ts (1)
getDubDiscountForExternalUserId
(7-24)
lib/zod/url-validation.ts (1)
lib/notion/utils.ts (2)
getNotionPageIdFromSlug
(122-198)isCustomNotionDomain
(107-120)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Socket Security: Pull Request Alerts
🔇 Additional comments (9)
lib/dub.ts (1)
15-17
: Env-aware selection of test vs. live coupon looks good.Choosing
couponTestId
in non-production andcouponId
in production aligns with safe environment separation.pages/api/teams/[teamId]/billing/upgrade.ts (3)
119-123
: LGTM: Discount spread + Dub metadata in new-customer flow.The shape
{ discounts: [{ coupon }] }
is correctly spread, andmetadata.dubCustomerId
improves traceability.
127-135
: LGTM: Consolidated waitUntil with Promise.all.Runs identification and analytics concurrently while keeping the response streaming-friendly.
80-81
: Confirm DubexternalId
Mapping Matches Your App’suserId
I didn’t find any calls in our codebase creating or updating Dub customers with an
externalId
field (your search fordub.customers.create|update
andexternalId:
returned no matches). That means the only place we passexternalId
into Dub is ingetDubDiscountForExternalUserId(userId)
—but we have no guarantee that Dub customers were ever provisioned withexternalId
set to ouruserId
rather than, say, email or another identifier.• Locate where Dub customers are created/updated (in your integration code or via your Dub dashboard) and confirm the
externalId
value is set to the sameuserId
that you’re passing intogetDubDiscountForExternalUserId
.
• If your provisioning uses a different key (email, UUID, etc.), you’ll need to update either the provisioning logic or this lookup to prevent silent “no discount” cases.pages/api/teams/[teamId]/documents/agreement.ts (1)
35-39
: Good switch to safeParseAsync; fallback for type is a nice touchMoving to
safeParseAsync
correctly supports async refinements indocumentUploadSchema
. Also, populatingtype
fromgetExtension(req.body.name)
when missing helps keep clients loosely coupled to the API.pages/api/teams/[teamId]/documents/index.ts (1)
245-247
: All Vercel Blob validations use.endsWith
, Zod schemas usesafeParseAsync
, andVERCEL_BLOB_HOST
is present
Verified that:
- No instances of
schema.safeParse(
remain fordocumentUploadSchema
ornotionUrlUpdateSchema
.- All host‐matching logic in
lib/zod/url-validation.ts
useshostname.endsWith(process.env.VERCEL_BLOB_HOST)
(no.includes
usage)..env.example
definesVERCEL_BLOB_HOST=vercel-storage.com
.lib/api/documents/process-document.ts (1)
58-60
: Minor type-safety cleanup looks goodAssigning via
pageIdFromSlug
and coercing toundefined
avoids propagatingnull
. No behavior changes; improves clarity.pages/api/teams/[teamId]/documents/[id]/update-notion-url.ts (1)
97-101
: Response shape is clearReturning both a success message and
newUrl
is helpful for clients. LGTM.lib/zod/url-validation.ts (1)
1-11
: Imports look good; ensure tree-shaking doesn't pull server-only code into client bundles.
getNotionPageIdFromSlug
andisCustomNotionDomain
do network calls; they’re fine client-side, but watch for anynotion
server-only exports getting bundled inadvertently. Keep these functions free of server-only deps.I can scan for cross-imports that might drag server-only code into the client. Would you like a quick audit script?
export async function getNotionPageIdFromSlug( | ||
url: string, | ||
): Promise<string | null> { | ||
// Parse the URL to extract domain and slug | ||
const urlObj = new URL(url); | ||
const hostname = urlObj.hostname; | ||
|
||
const isNotionSo = hostname === "www.notion.so" || hostname === "notion.so"; | ||
const isNotionSite = hostname.endsWith(".notion.site"); | ||
|
||
// notion.so: extract ID from path directly | ||
if (isNotionSo) { | ||
const pageId = parsePageId(url) ?? parsePageId(urlObj.pathname); | ||
if (pageId) return pageId; | ||
throw new Error(`Unable to extract page ID from Notion URL: ${url}`); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Normalize hostnames (lowercase, strip trailing dot) before domain-family checks.
Hostnames are case-insensitive and can legally end with a trailing dot. Without normalization, HTTPS://WWW.NOTION.SO.
would fail, and mixed-case hosts can bypass checks.
-export async function getNotionPageIdFromSlug(
+export async function getNotionPageIdFromSlug(
url: string,
): Promise<string | null> {
// Parse the URL to extract domain and slug
const urlObj = new URL(url);
- const hostname = urlObj.hostname;
+ const hostname = urlObj.hostname.toLowerCase().replace(/^\.+|\.+$/g, "");
- const isNotionSo = hostname === "www.notion.so" || hostname === "notion.so";
- const isNotionSite = hostname.endsWith(".notion.site");
+ const isNotionSo = hostname === "www.notion.so" || hostname === "notion.so";
+ const isNotionSite = hostname.endsWith(".notion.site");
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
export async function getNotionPageIdFromSlug( | |
url: string, | |
): Promise<string | null> { | |
// Parse the URL to extract domain and slug | |
const urlObj = new URL(url); | |
const hostname = urlObj.hostname; | |
const isNotionSo = hostname === "www.notion.so" || hostname === "notion.so"; | |
const isNotionSite = hostname.endsWith(".notion.site"); | |
// notion.so: extract ID from path directly | |
if (isNotionSo) { | |
const pageId = parsePageId(url) ?? parsePageId(urlObj.pathname); | |
if (pageId) return pageId; | |
throw new Error(`Unable to extract page ID from Notion URL: ${url}`); | |
} | |
export async function getNotionPageIdFromSlug( | |
url: string, | |
): Promise<string | null> { | |
// Parse the URL to extract domain and slug | |
const urlObj = new URL(url); | |
const hostname = urlObj.hostname.toLowerCase().replace(/^\.+|\.+$/g, ""); | |
const isNotionSo = hostname === "www.notion.so" || hostname === "notion.so"; | |
const isNotionSite = hostname.endsWith(".notion.site"); | |
// notion.so: extract ID from path directly | |
if (isNotionSo) { | |
const pageId = parsePageId(url) ?? parsePageId(urlObj.pathname); | |
if (pageId) return pageId; | |
throw new Error(`Unable to extract page ID from Notion URL: ${url}`); | |
} |
🤖 Prompt for AI Agents
In lib/notion/utils.ts around lines 122 to 137, the hostname is used directly to
decide Notion domain families which fails for mixed-case or trailing-dot
hostnames; normalize the hostname by lowercasing it and stripping any trailing
dot before doing comparisons, then use that normalized value for equality checks
("www.notion.so" / "notion.so") and for .endsWith(".notion.site") so URLs like
"HTTPS://WWW.NOTION.SO." or mixed-case hosts match correctly.
This PR is a Fix of issue #1839
Where i found out that the logic which is used to validate the
documentUploadSchema
is bit faulty.issues
VERCEL_BLOB_HOST
this variable is not present in the.env.example
example123.public.blob.vercel-storage.com
and previouly was matching withstartWith
as you can see from the hostincludes
is the best way to match that condition.storageType
the same logic is needed as for theurl
so made a new function and used at both the places.Summary by CodeRabbit
New Features
Bug Fixes
Chores