Changes from 2 commits
1 change: 1 addition & 0 deletions .env.example
@@ -11,6 +11,7 @@ POSTGRES_PRISMA_URL=
 POSTGRES_PRISMA_URL_NON_POOLING=
 # This variable is from Vercel Storage Blob
 BLOB_READ_WRITE_TOKEN=
+VERCEL_BLOB_HOST=vercel-storage.com

 # Google client id and secret for authentication
 GOOGLE_CLIENT_ID=
53 changes: 32 additions & 21 deletions lib/zod/url-validation.ts
@@ -76,6 +76,36 @@ export const validateUrlSecurity = (url: string): boolean => {
   return validatePathSecurity(url) && validateUrlSSRFProtection(url);
 };

+// Helper function to validate URL hostnames for different storage types
+const validateUrlHostname = (hostname: string): boolean => {
+  // Valid notion domains
+  const validNotionDomains = ["www.notion.so", "notion.so"];
+
+  // Check for notion.site subdomains (e.g., example-something.notion.site)
+  const isNotionSite = hostname.endsWith(".notion.site");
+  const isValidNotionDomain = validNotionDomains.includes(hostname);
+
+  // Check for vercel blob storage
+  let isVercelBlob = false;
+
+  const normalizedHostname = hostname.toLowerCase().trim();
+
+  if (process.env.VERCEL_BLOB_HOST) {
+
+    const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim();
+    // Use exact match or suffix-with-dot to prevent bypasses
+    isVercelBlob = normalizedHostname === normalizedBlobHost ||
+      normalizedHostname.endsWith("." + normalizedBlobHost);
+  } else {
+    // Fallback: check for common Vercel Blob patterns if env var is not set
+    // Use exact match or suffix-with-dot to prevent bypasses
+    isVercelBlob = normalizedHostname === "vercel-storage.com" ||
+      normalizedHostname.endsWith(".vercel-storage.com");
+  }
+
+  return isNotionSite || isValidNotionDomain || isVercelBlob;
+};
Comment on lines +84 to +112
Contributor

🛠️ Refactor suggestion

Lowercase and normalize hostnames for all checks; fix case-sensitivity and trailing-dot edge cases

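The Notion checks use the raw hostname while the Vercel check uses a normalized one. Hostnames are case-insensitive, and a trailing dot is valid. Today, a mixed-case hostname such as WWW.NOTION.SO passed directly to this helper is rejected (hostnames read from `new URL(...).hostname` are already lowercased, but the helper should not rely on its callers for that), and both https://page.notion.site./ and https://foo.vercel-storage.com./ are rejected because of the trailing dot, which the URL parser preserves. Normalize once and reuse the same variable across all checks.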
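
To make the inconsistency concrete, here is a minimal sketch that mirrors the logic in this PR. The name `currentCheck` and the explicit `blobHost` parameter (standing in for `process.env.VERCEL_BLOB_HOST`) are assumptions for illustration, not code from the repository.

```typescript
// Mirrors the PR's logic: the Notion checks use the raw hostname,
// while the Vercel Blob check lowercases and trims first.
// `currentCheck` and `blobHost` are hypothetical names for illustration.
const currentCheck = (
  hostname: string,
  blobHost = "vercel-storage.com",
): boolean => {
  const isNotionSite = hostname.endsWith(".notion.site");
  const isValidNotionDomain = ["www.notion.so", "notion.so"].includes(hostname);

  const normalizedHostname = hostname.toLowerCase().trim();
  const normalizedBlobHost = blobHost.toLowerCase().trim();
  const isVercelBlob =
    normalizedHostname === normalizedBlobHost ||
    normalizedHostname.endsWith("." + normalizedBlobHost);

  return isNotionSite || isValidNotionDomain || isVercelBlob;
};

const samples = [
  "WWW.NOTION.SO", // mixed case, Notion
  "page.notion.site.", // trailing dot, Notion
  "FOO.VERCEL-STORAGE.COM", // mixed case, Vercel Blob
  "foo.vercel-storage.com.", // trailing dot, Vercel Blob
];

for (const host of samples) {
  console.log(host, currentCheck(host));
}
```

Of these four hostnames, only the mixed-case Vercel Blob host is accepted; the other three are rejected.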

Apply this diff to make the checks robust and consistent:

-// Helper function to validate URL hostnames for different storage types
-const validateUrlHostname = (hostname: string): boolean => {
-  // Valid notion domains
-  const validNotionDomains = ["www.notion.so", "notion.so"];
-
-  // Check for notion.site subdomains (e.g., example-something.notion.site)
-  const isNotionSite = hostname.endsWith(".notion.site");
-  const isValidNotionDomain = validNotionDomains.includes(hostname);
-
-  // Check for vercel blob storage
-  let isVercelBlob = false;
-
-  const normalizedHostname = hostname.toLowerCase().trim();
-
-  if (process.env.VERCEL_BLOB_HOST) {
-
-    const normalizedBlobHost = process.env.VERCEL_BLOB_HOST.toLowerCase().trim();
-    // Use exact match or suffix-with-dot to prevent bypasses
-    isVercelBlob = normalizedHostname === normalizedBlobHost ||
-      normalizedHostname.endsWith("." + normalizedBlobHost);
-  } else {
-    // Fallback: check for common Vercel Blob patterns if env var is not set
-    // Use exact match or suffix-with-dot to prevent bypasses
-    isVercelBlob = normalizedHostname === "vercel-storage.com" ||
-      normalizedHostname.endsWith(".vercel-storage.com");
-  }
-
-  return isNotionSite || isValidNotionDomain || isVercelBlob;
-};
+// Helper function to validate URL hostnames for different storage types
+const validateUrlHostname = (hostname: string): boolean => {
+  // Normalize once: lowercase, trim, and strip leading/trailing dots
+  const h = hostname.toLowerCase().trim().replace(/^\.+|\.+$/g, "");
+
+  // Valid Notion domains
+  const validNotionDomains = ["www.notion.so", "notion.so"];
+
+  // Check for notion.site subdomains (e.g., example-something.notion.site)
+  const isNotionSite = h.endsWith(".notion.site");
+  const isValidNotionDomain = validNotionDomains.includes(h);
+
+  // Check for Vercel Blob storage by exact-or-suffix match
+  const base = (process.env.VERCEL_BLOB_HOST || "vercel-storage.com")
+    .toLowerCase()
+    .trim()
+    .replace(/^\.+|\.+$/g, "");
+  const isVercelBlob = h === base || h.endsWith(`.${base}`);
+
+  return isNotionSite || isValidNotionDomain || isVercelBlob;
+};

If you want, I can add unit tests covering:

  • Mixed-case hosts (WWW.NOTION.SO)
  • Trailing-dot hosts (page.notion.site., foo.vercel-storage.com.)
  • Env override (custom base with/without trailing dot)
  • Negative case (vercel-storage.com.attacker.tld)
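
As a starting point, the cases listed above could be covered with a small table-driven check. This is a sketch under assumptions: `validateHostname` restates the suggested diff with the base host passed as a parameter instead of read from `process.env.VERCEL_BLOB_HOST`, `custom-blob.example` is a made-up override value, and a plain `throw` stands in for whichever test runner the project uses.

```typescript
// Hypothetical restatement of the suggested helper; `blobHost` replaces
// process.env.VERCEL_BLOB_HOST so the cases can run without env setup.
const stripDots = (s: string): string =>
  s.toLowerCase().trim().replace(/^\.+|\.+$/g, "");

const validateHostname = (hostname: string, blobHost?: string): boolean => {
  const h = stripDots(hostname);
  const base = stripDots(blobHost || "vercel-storage.com");
  return (
    h.endsWith(".notion.site") ||
    ["www.notion.so", "notion.so"].includes(h) ||
    h === base ||
    h.endsWith(`.${base}`)
  );
};

// [hostname, env override (if any), expected result]
const cases: Array<[string, string | undefined, boolean]> = [
  ["WWW.NOTION.SO", undefined, true], // mixed case
  ["page.notion.site.", undefined, true], // trailing dot
  ["foo.vercel-storage.com.", undefined, true], // trailing dot
  ["files.custom-blob.example", "custom-blob.example.", true], // override with dot
  ["files.custom-blob.example", "custom-blob.example", true], // override without dot
  ["vercel-storage.com.attacker.tld", undefined, false], // negative case
];

for (const [host, override, expected] of cases) {
  const actual = validateHostname(host, override);
  if (actual !== expected) {
    throw new Error(`${host}: expected ${expected}, got ${actual}`);
  }
}
console.log("all hostname cases passed");
```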
🤖 Prompt for AI Agents
In lib/zod/url-validation.ts around lines 79 to 107, the function currently
mixes raw and normalized hostnames causing case-sensitivity and trailing-dot
issues; normalize the input hostname once (lowercase, trim, and remove any
trailing dot) and use that normalizedHostname for all checks (notion domains,
notion.site suffix, and vercel blob checks); also normalize/process
process.env.VERCEL_BLOB_HOST the same way (lowercase, trim, remove trailing dot)
before comparing and continue to use exact-match or suffix-with-dot comparisons
(e.g., normalizedHostname === normalizedBlobHost ||
normalizedHostname.endsWith("." + normalizedBlobHost)) to avoid bypasses.


 // Custom validator for file paths - either Notion URLs or S3 storage paths
 const createFilePathValidator = () => {
   return z
@@ -88,21 +118,7 @@ const createFilePathValidator = () => {
 try {
   const urlObj = new URL(path);
   const hostname = urlObj.hostname;
-
-  // Valid notion domains
-  const validNotionDomains = ["www.notion.so", "notion.so"];
-
-  // Check for notion.site subdomains (e.g., example-something.notion.site)
-  const isNotionSite = hostname.endsWith(".notion.site");
-  const isValidNotionDomain = validNotionDomains.includes(hostname);
-
-  // Check for vercel blob storage
-  let isVercelBlob = false;
-  if (process.env.VERCEL_BLOB_HOST) {
-    isVercelBlob = hostname.startsWith(process.env.VERCEL_BLOB_HOST);
-  }
-
-  return isNotionSite || isValidNotionDomain || isVercelBlob;
+  return validateUrlHostname(hostname);
 } catch {
   return false;
 }
@@ -227,12 +243,7 @@ export const documentUploadSchema = z
 // Must be a Notion URL for VERCEL_BLOB
 try {
   const urlObj = new URL(data.url);
-  const hostname = urlObj.hostname;
-  return (
-    hostname === "www.notion.so" ||
-    hostname === "notion.so" ||
-    hostname.endsWith(".notion.site")
-  );
+  return validateUrlHostname(urlObj.hostname);
 } catch {
   return false;
 }
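
The move from `startsWith` to an exact-or-dot-suffix comparison matters because a prefix match and a bare suffix match each accept hosts outside the intended domain. A minimal sketch follows; the three matcher names and the sample hostnames are hypothetical and chosen only to illustrate the difference.

```typescript
const BASE = "vercel-storage.com";

// Prefix match, as in the removed code: accepts any host that begins with BASE.
const prefixMatch = (h: string): boolean => h.startsWith(BASE);

// Bare suffix match: accepts unrelated domains whose name ends with BASE.
const bareSuffixMatch = (h: string): boolean => h.endsWith(BASE);

// Exact match or dot-delimited suffix, as in the new helper.
const exactOrDotSuffix = (h: string): boolean =>
  h === BASE || h.endsWith("." + BASE);

const hosts = [
  "foo.vercel-storage.com", // genuine subdomain
  "vercel-storage.com.attacker.tld", // BASE used as leading labels
  "evilvercel-storage.com", // BASE glued onto another name
];

for (const h of hosts) {
  console.log(h, {
    prefix: prefixMatch(h),
    bareSuffix: bareSuffixMatch(h),
    exactOrDotSuffix: exactOrDotSuffix(h),
  });
}
```

Only `exactOrDotSuffix` accepts the genuine subdomain while rejecting both look-alikes; the prefix match accepts `vercel-storage.com.attacker.tld` and rejects the genuine subdomain.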