Pull request overview
This PR adds a comprehensive guide for building a Perplexity-style search engine using Steel, Brave Search API, and OpenAI. It also fixes a broken internal link in the Playwright Node guide. The PR title mentions only "add Perplexity Guide", but the PR also includes a stub file (aeo.mdx) and a link fix; both should be noted.
- Adds a new 469-line tutorial demonstrating how to build an AI-powered search tool that scrapes web content and synthesizes answers
- Fixes an incorrect internal documentation link in the Playwright Node guide
- Includes an empty stub file for a future AEO Scraper guide
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| content/docs/overview/guides/perplexity.mdx | New comprehensive guide showing how to build a Perplexity-style search engine with Steel scraping, Brave Search, and OpenAI synthesis |
| content/docs/overview/guides/meta.json | Adds "perplexity" entry to navigation, making the new guide discoverable |
| content/docs/overview/guides/playwright-node.mdx | Fixes broken internal link from old path format to correct path format |
| content/docs/overview/guides/aeo.mdx | Adds empty stub file with only frontmatter, not included in navigation |
| // Combine + remove the minutes (":00") if you want "7 PM" instead of "7:00 PM" | ||
| const final = `${dateStr}, ${timeStr.replace(/:00/, "")}`; | ||
|
|
||
| const system = `<goal> You are ...` |
The system prompt is truncated to const system = `<goal> You are ...`. This incomplete example could confuse users trying to implement this guide. Consider providing the complete prompt or adding a comment indicating where the full prompt can be found.
| const system = `<goal> You are ...` | |
| // System prompt: instruct the LLM to answer the user's question using the provided context and cite sources. | |
| const system = `<goal> |
| description: Scrape LLM providers with Steel and synthesize answers with OpenAI | ||
| sidebarTitle: AEO Scraper (Node) | ||
| llm: true | ||
| --- |
[nitpick] This file appears to be a stub with only frontmatter and no content. If this is intentional as a placeholder, consider adding a "Coming soon" message or similar content. If it's not ready, consider removing it from this PR.
| --- | |
| --- | |
| > **Coming soon:** This guide is under construction and will be available soon. |
| const now = new Date(); | ||
|
|
||
| // Day of week, month, day, year | ||
| const dateFormatter = new Intl.DateTimeFormat("en-NZ", { | ||
| weekday: "long", | ||
| month: "long", | ||
| day: "2-digit", | ||
| year: "numeric", | ||
| timeZone: "Pacific/Auckland", | ||
| }); | ||
|
|
||
| // Time with hour + timezone abbreviation | ||
| const timeFormatter = new Intl.DateTimeFormat("en-NZ", { | ||
| hour: "numeric", | ||
| minute: "2-digit", | ||
| hour12: true, | ||
| timeZone: "Pacific/Auckland", | ||
| timeZoneName: "short", // gives "NZDT" | ||
| }); | ||
|
|
||
| const dateStr = dateFormatter.format(now); | ||
| const timeStr = timeFormatter.format(now); | ||
|
|
||
| // Combine + remove the minutes (":00") if you want "7 PM" instead of "7:00 PM" | ||
| const final = `${dateStr}, ${timeStr.replace(/:00/, "")}`; | ||
|
|
The final variable is declared but never used. The date/time formatting code (lines 341-365) appears to be included but not actually utilized in the function. Consider removing this unused code or incorporating it into the system prompt if it was intended to provide context to the AI.
| const now = new Date(); | |
| // Day of week, month, day, year | |
| const dateFormatter = new Intl.DateTimeFormat("en-NZ", { | |
| weekday: "long", | |
| month: "long", | |
| day: "2-digit", | |
| year: "numeric", | |
| timeZone: "Pacific/Auckland", | |
| }); | |
| // Time with hour + timezone abbreviation | |
| const timeFormatter = new Intl.DateTimeFormat("en-NZ", { | |
| hour: "numeric", | |
| minute: "2-digit", | |
| hour12: true, | |
| timeZone: "Pacific/Auckland", | |
| timeZoneName: "short", // gives "NZDT" | |
| }); | |
| const dateStr = dateFormatter.format(now); | |
| const timeStr = timeFormatter.format(now); | |
| // Combine + remove the minutes (":00") if you want "7 PM" instead of "7:00 PM" | |
| const final = `${dateStr}, ${timeStr.replace(/:00/, "")}`; |
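If the date was meant as context for the model, one way to use it is sketched below. This is only an illustration under assumptions: `formatAucklandNow` and `withCurrentDate` are hypothetical names, and the base prompt is passed in as a parameter because the guide's full system prompt is not shown in this hunk. The formatter options mirror the ones in the quoted code.

```typescript
// Formats a timestamp for Pacific/Auckland using the same formatter options as the guide's snippet.
function formatAucklandNow(now: Date = new Date()): string {
  const dateStr = new Intl.DateTimeFormat("en-NZ", {
    weekday: "long",
    month: "long",
    day: "2-digit",
    year: "numeric",
    timeZone: "Pacific/Auckland",
  }).format(now);

  const timeStr = new Intl.DateTimeFormat("en-NZ", {
    hour: "numeric",
    minute: "2-digit",
    hour12: true,
    timeZone: "Pacific/Auckland",
    timeZoneName: "short",
  }).format(now);

  // Drop the first ":00" so an on-the-hour time loses its minutes.
  return `${dateStr}, ${timeStr.replace(/:00/, "")}`;
}

// Hypothetical helper (not from the guide): appends the formatted date to a base prompt.
function withCurrentDate(basePrompt: string, now: Date = new Date()): string {
  return `${basePrompt}\n\nCurrent date and time: ${formatAucklandNow(now)}`;
}
```

With something like this, the formatted value is actually consumed instead of being left in an unused `final` variable. If the date is not needed, removing the block remains the simpler fix.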
| topK, | ||
| }); | ||
|
|
||
| // 1) Use Brave to get top relevant URLs (do double to get more relevant results to search) |
Grammar error: "do double to get more relevant results to search" should be rephrased for clarity. Consider: "double the count to get more relevant results" or "multiply by 2 to get more relevant search results".
| // 1) Use Brave to get top relevant URLs (do double to get more relevant results to search) | |
| // 1) Use Brave to get top relevant URLs (double the count to get more relevant results) |
| } | ||
|
|
||
| return { url, markdown, links }; | ||
| } catch { |
The catch block on line 314 silently ignores all errors without any logging or diagnostics. This makes debugging difficult when scraping fails. Consider at least logging the error message or URL that failed to help with troubleshooting.
| } catch { | |
| } catch (err) { | |
| console.error(`Failed to scrape URL: ${url}`, err); |
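The suggestion above only shows the opening of the catch block. A fuller sketch of the same idea follows, assuming the surrounding code wants a failed scrape to be skipped rather than to abort the whole run. `withScrapeLogging` is a hypothetical wrapper, and returning `null` on failure is an assumption, since the guide's actual fallback value is not visible in this hunk.

```typescript
// Hypothetical wrapper (not from the guide): runs a scrape function and logs failures
// with the offending URL instead of swallowing them silently.
async function withScrapeLogging<T>(
  url: string,
  scrape: (url: string) => Promise<T>,
): Promise<T | null> {
  try {
    return await scrape(url);
  } catch (err) {
    // Normalize non-Error throwables so the log line is always readable.
    const message = err instanceof Error ? err.message : String(err);
    console.error(`Failed to scrape URL: ${url}: ${message}`);
    // Assumption: callers filter out null results.
    return null;
  }
}
```

Callers would then drop the `null` entries before synthesis, so one bad page does not hide which URL failed or why.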
| import { | ||
| scrapeUrlsToMarkdown, | ||
| synthesizeWithCitations, | ||
| multiQueryBraveSearch, |
Inconsistent function naming: The import on line 105 uses multiQueryBraveSearch, but line 131 calls singleQueryBraveSearch and line 180 defines singleQueryBraveSearch. Either the import should be changed to singleQueryBraveSearch or the function definition and call should be updated to multiQueryBraveSearch.
| multiQueryBraveSearch, | |
| singleQueryBraveSearch, |
| Step 1: Get relevant URLs | ||
| --------------------------------------- | ||
|
|
||
| - The example calls the Brave API to recieve relevant URLs based on the user query |
Typo: "recieve" should be spelled "receive".
| - The example calls the Brave API to recieve relevant URLs based on the user query | |
| - The example calls the Brave API to receive relevant URLs based on the user query |
| const endpoint = new URL(config.brave.endpoint); | ||
| endpoint.searchParams.set("q", query); | ||
| endpoint.searchParams.set("country", config.brave.country); | ||
| endpoint.searchParams.set("search_lang", config.brave.lang); | ||
| endpoint.searchParams.set("safesearch", config.brave.safesearch); | ||
| endpoint.searchParams.set( | ||
| "count", | ||
| String(Math.min(topK, config.search.topK)), | ||
| ); | ||
|
|
||
| const res = await fetchWithTimeout(endpoint.toString(), { | ||
| headers: { | ||
| Accept: "application/json", | ||
| "X-Subscription-Token": config.brave.apiKey, | ||
| }, |
config.brave.endpoint is used directly to build the request URL and the X-Subscription-Token header (config.brave.apiKey) is sent to whatever host the endpoint points to. If BRAVE_SEARCH_ENDPOINT is misconfigured or attacker‑controlled, credentials will be exfiltrated to a malicious host. Fix by hardcoding the official Brave API host or strictly validating the endpoint (scheme must be https, host must equal api.search.brave.com) before sending requests, and avoid sending the token to untrusted hosts.
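A minimal sketch of the strict validation described above, assuming the check runs before the request is built. `assertTrustedBraveEndpoint` and `BRAVE_ALLOWED_HOST` are hypothetical names introduced for illustration; only the standard `URL` class is used.

```typescript
// Hypothetical allowlist constant; the host value comes from the review comment above.
const BRAVE_ALLOWED_HOST = "api.search.brave.com";

// Hypothetical helper: parses the configured endpoint and rejects anything that is not
// https on the expected host, so the API token is never sent elsewhere.
function assertTrustedBraveEndpoint(rawEndpoint: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(rawEndpoint);
  } catch {
    throw new Error(`Invalid Brave endpoint URL: ${rawEndpoint}`);
  }
  if (parsed.protocol !== "https:") {
    throw new Error(`Brave endpoint must use https, got ${parsed.protocol}`);
  }
  // hostname excludes any userinfo and port, so "https://api.search.brave.com@evil.example" fails here.
  if (parsed.hostname !== BRAVE_ALLOWED_HOST) {
    throw new Error(`Brave endpoint host must be ${BRAVE_ALLOWED_HOST}, got ${parsed.hostname}`);
  }
  return parsed;
}
```

In the quoted code this could replace the first line, for example `const endpoint = assertTrustedBraveEndpoint(config.brave.endpoint);`, so a misconfigured value fails fast before any header is attached.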