feat(docs): add llms.txt and llms-full.txt for AI discoverability#1455
yashdev9274 wants to merge 1 commit into generalaction:main
Conversation
@yashdev9274 is attempting to deploy a commit to the General Action Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary

This PR adds two new Next.js route handlers, GET /llms.txt and GET /llms-full.txt, which expose the documentation as plain markdown for AI agents.
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| docs/app/llms.txt/route.ts | Generates the llms.txt index page; missing required H1 header per the llms.txt spec and uses relative (non-navigable) paths instead of absolute URLs. |
| docs/app/llms-full.txt/route.ts | Concatenates full processed markdown for all docs pages; no fallback for empty output and empty pages are not filtered before joining. |
| docs/lib/get-llm-text.ts | Thin helper that retrieves processed markdown from a fumadocs page; correctly uses optional chaining to guard against missing getText. |
| docs/source.config.ts | Enables includeProcessedMarkdown in the fumadocs-mdx postprocessor so the getText API is available at runtime. |
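Based on the table's description of docs/lib/get-llm-text.ts, the helper might look like the following sketch. The LLMPage shape is a minimal stand-in, not the real fumadocs page type; only the optional-chaining guard and the 'processed' variant come from the review notes.

```ts
// Sketch of docs/lib/get-llm-text.ts; LLMPage is an illustrative stand-in
// for the fumadocs page type, whose getText API is enabled by
// includeProcessedMarkdown in source.config.ts.
type LLMPage = {
  data: {
    getText?: (variant: 'processed') => string | undefined | Promise<string | undefined>;
  };
};

export async function getLLMText(page: LLMPage): Promise<string> {
  // Optional chaining guards against builds where getText is absent;
  // coalesce to '' so callers can filter empty results out.
  return (await page.data.getText?.('processed')) ?? '';
}
```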
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as AI Agent / Browser
    participant LLMsTxt as GET /llms.txt
    participant LLMsFullTxt as GET /llms-full.txt
    participant Source as source.getPages()
    participant GetLLMText as getLLMText()
    participant PageData as page.data.getText()
    Client->>LLMsTxt: HTTP GET /llms.txt
    LLMsTxt->>Source: getPages()
    Source-->>LLMsTxt: Page[] (title, description, url)
    LLMsTxt-->>Client: text/markdown index (## Title, > desc, - Source: /url)
    Client->>LLMsFullTxt: HTTP GET /llms-full.txt
    LLMsFullTxt->>Source: getPages()
    Source-->>LLMsFullTxt: Page[]
    loop For each page
        LLMsFullTxt->>GetLLMText: getLLMText(page)
        GetLLMText->>PageData: page.data.getText?.('processed')
        PageData-->>GetLLMText: processed markdown string | undefined
        GetLLMText-->>LLMsFullTxt: string (or '')
    end
    LLMsFullTxt-->>Client: text/markdown full docs (all pages joined with \n\n)
```
Last reviewed commit: 74b7fe8
```ts
const pages = source.getPages();
const lines: string[] = [];

for (const page of pages) {
  const url = page.url;
  const data = page.data as { title?: string; description?: string };
  const title = data.title ?? 'Page';
  const description = data.description ?? '';
  lines.push(`## ${title}`);
  if (description) {
    lines.push(`> ${description}`);
  }
  lines.push(`- Source: ${url}\n`);
}

const result = lines.join('\n');
return new Response(result || '# No pages found', {
  headers: {
    'Content-Type': 'text/markdown',
  },
});
```
Missing top-level H1 header required by llms.txt spec
The llms.txt specification requires a top-level # Project Name H1 heading followed by an optional blockquote tagline at the very top of the file before any ## sections. Without the H1, AI agents and tools that parse llms.txt may fail to identify the project or treat the output as malformed.
The current output starts directly with ## Title sections. A spec-compliant header section should be prepended:
```ts
export function GET() {
  const pages = source.getPages();
  const lines: string[] = [
    '# Emdash',
    '',
    '> An Open Source Agentic Development Environment (ADE)',
    '',
  ];

  for (const page of pages) {
    const url = page.url;
    const data = page.data as { title?: string; description?: string };
    const title = data.title ?? 'Page';
    const description = data.description ?? '';
    lines.push(`## ${title}`);
    if (description) {
      lines.push(`> ${description}`);
    }
    lines.push(`- Source: ${url}\n`);
  }

  const result = lines.join('\n');
  return new Response(result || '# No pages found', {
    headers: {
      'Content-Type': 'text/markdown',
    },
  });
}
```
```ts
if (description) {
  lines.push(`> ${description}`);
}
lines.push(`- Source: ${url}\n`);
```
Relative URL limits AI agent usability
page.url returns a path like /installation, not an absolute URL (e.g., https://docs.emdash.dev/installation). AI agents consuming llms.txt follow the links to retrieve content — a relative path without a known origin is not navigable. Additionally, the standard llms.txt link format uses a markdown hyperlink [Page Title](URL) rather than a bare Source: /path label.
Consider reading the canonical base URL from an environment variable or Next.js config:
```ts
lines.push(`- [${title}](${process.env.NEXT_PUBLIC_SITE_URL ?? ''}${url})\n`);
```
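The base-URL handling above can be factored into small helpers. This is a sketch under assumptions: both helper names are illustrative, and the base would come from an env var such as NEXT_PUBLIC_SITE_URL (the review's suggestion); here it is a plain parameter so the sketch is self-contained.

```ts
// Hypothetical helpers for emitting spec-style llms.txt links.
export function absoluteUrl(path: string, base: string): string {
  // Strip trailing slashes from the base so joining never doubles slashes.
  return `${base.replace(/\/+$/, '')}${path}`;
}

export function linkLine(title: string, path: string, base: string): string {
  // llms.txt lists pages as markdown hyperlinks: - [Title](URL)
  return `- [${title}](${absoluteUrl(path, base)})`;
}
```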
```ts
const scanned = await Promise.all(pages.map(getLLMText));
return new Response(scanned.join('\n\n'), {
```
Empty pages included in concatenated output
getLLMText returns '' when page.data.getText is unavailable or returns undefined. Joining those empty strings with '\n\n' leaves blank sections in the output, which reduces quality for consumers. Unlike llms.txt/route.ts which has a '# No pages found' fallback, this route also has no fallback for an entirely empty result.
Consider filtering and adding a fallback:
| const scanned = await Promise.all(pages.map(getLLMText)); | |
| return new Response(scanned.join('\n\n'), { | |
| const scanned = (await Promise.all(pages.map(getLLMText))).filter(Boolean); | |
| return new Response(scanned.join('\n\n') || '# No content available', { |
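The filter-and-fallback behavior can be isolated as a pure function, which keeps it easy to test outside the route. This is only a sketch: buildFullText is a hypothetical name, and the page shape is a stub standing in for the project's real getLLMText helper; the route would pass the result to a Response with a text/markdown header.

```ts
// Sketch of the filtered llms-full.txt body construction.
export async function buildFullText(
  pages: { text: string }[],
  getLLMText: (p: { text: string }) => Promise<string>,
): Promise<string> {
  // filter(Boolean) drops '' results so the join has no blank sections;
  // the || fallback covers an entirely empty docs set.
  const scanned = (await Promise.all(pages.map(getLLMText))).filter(Boolean);
  return scanned.join('\n\n') || '# No content available';
}
```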
Summary
Add llms.txt and llms-full.txt endpoints to the emdash documentation to improve AI agent discoverability and context retrieval. This follows the industry standard for making documentation accessible to LLMs and AI coding agents.
Fixes
Fixes #1412
Snapshot
Type of change
Mandatory Tasks
Checklist
- pnpm run format
- pnpm run lint