
feat(docs): add llms.txt and llms-full.txt for AI discoverability #1455

Open
yashdev9274 wants to merge 1 commit into generalaction:main from yashdev9274:feat-yd-#1412

Conversation

@yashdev9274
Contributor

Summary

Add llms.txt and llms-full.txt endpoints to the emdash documentation to improve AI agent discoverability and context retrieval. This follows the industry standard for making documentation accessible to LLMs and AI coding agents.

Fixes

Fixes #1412

Snapshot

(screenshot)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g. code style improvements, linting)
  • This change requires a documentation update

Mandatory Tasks

  • I have self-reviewed the code
  • A decent-size PR without a self-review might be rejected

Checklist

  • I have read the contributing guide
  • My code follows the style guidelines of this project (pnpm run format)
  • I have commented my code, particularly in hard-to-understand areas
  • I have checked if my PR needs changes to the documentation
  • I have checked if my changes generate no new warnings (pnpm run lint)
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked if new and existing unit tests pass locally with my changes

@vercel

vercel bot commented Mar 13, 2026

@yashdev9274 is attempting to deploy a commit to the General Action Team on Vercel.

A member of the Team first needs to authorize it.

@greptile-apps
Contributor

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR adds two new Next.js route handlers — GET /llms.txt and GET /llms-full.txt — to expose the documentation content in a format consumable by AI agents and LLMs, following the emerging llms.txt convention. It also enables includeProcessedMarkdown in the fumadocs-mdx config so the raw markdown of each doc page can be retrieved at runtime.

  • docs/source.config.ts: Adds postprocess: { includeProcessedMarkdown: true } to make page.data.getText available for all doc pages.
  • docs/lib/get-llm-text.ts: Thin helper that calls page.data.getText?.('processed') with safe optional chaining.
  • docs/app/llms.txt/route.ts: Generates a summary index of all pages (title, description, relative URL). It is missing the required top-level # Project Name H1 header per the llms.txt spec and emits relative paths instead of absolute URLs, both of which reduce usability for AI consumers.
  • docs/app/llms-full.txt/route.ts: Concatenates the full processed markdown of every page. Empty page results are not filtered before joining, leaving potential blank sections, and there is no fallback for an entirely empty response (unlike the llms.txt route).
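The index generation described above can be condensed into a pure function that is easy to exercise outside Next.js. The DocPage shape and the fallback string mirror the route code quoted later in this review; the name buildLlmsIndex is hypothetical, not from the PR:

```typescript
// Hypothetical pure helper mirroring the llms.txt route's index generation,
// assuming pages expose { url, data: { title?, description? } } as in the PR.
interface DocPage {
  url: string;
  data: { title?: string; description?: string };
}

function buildLlmsIndex(pages: DocPage[]): string {
  const lines: string[] = [];
  for (const page of pages) {
    const title = page.data.title ?? 'Page';
    lines.push(`## ${title}`);
    if (page.data.description) {
      lines.push(`> ${page.data.description}`);
    }
    lines.push(`- Source: ${page.url}\n`);
  }
  // Same fallback as the route handler when no pages are found.
  return lines.join('\n') || '# No pages found';
}
```

Keeping the formatting logic pure like this also makes the spec-compliance issues flagged below (missing H1, relative URLs) straightforward to unit-test.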

Confidence Score: 3/5

  • Safe to merge as a non-breaking feature addition, but the llms.txt output is not spec-compliant and relative URLs limit AI agent usability.
  • The implementation is logically sound and introduces no regressions — revalidate = false ensures static caching, optional chaining guards against missing APIs, and the routes are entirely additive. However, the llms.txt output is missing the required H1 header mandated by the spec, uses non-navigable relative URLs, and llms-full.txt lacks filtering and a fallback for empty output. These issues reduce the effectiveness of the feature for its stated purpose.
  • docs/app/llms.txt/route.ts requires the most attention due to the missing H1 header and relative URL issues.

Important Files Changed

Filename Overview
docs/app/llms.txt/route.ts Generates the llms.txt index page; missing required H1 header per the llms.txt spec and uses relative (non-navigable) paths instead of absolute URLs.
docs/app/llms-full.txt/route.ts Concatenates full processed markdown for all docs pages; no fallback for empty output and empty pages are not filtered before joining.
docs/lib/get-llm-text.ts Thin helper that retrieves processed markdown from a fumadocs page; correctly uses optional chaining to guard against missing getText.
docs/source.config.ts Enables includeProcessedMarkdown in the fumadocs-mdx postprocessor so the getText API is available at runtime.
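A rough sketch of the source.config.ts change described in the table. The defineDocs import is real fumadocs-mdx API, but the exact placement of the postprocess option is assumed from the PR summary and may differ between fumadocs-mdx versions:

```typescript
// docs/source.config.ts (sketch, not the PR's exact file)
import { defineDocs } from 'fumadocs-mdx/config';

export const docs = defineDocs({
  dir: 'content/docs', // assumed content directory
  docs: {
    // Assumed placement: exposes page.data.getText('processed') at runtime,
    // per the PR summary's description of includeProcessedMarkdown.
    postprocess: {
      includeProcessedMarkdown: true,
    },
  },
});
```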

Sequence Diagram

sequenceDiagram
    participant Client as AI Agent / Browser
    participant LLMsTxt as GET /llms.txt
    participant LLMsFullTxt as GET /llms-full.txt
    participant Source as source.getPages()
    participant GetLLMText as getLLMText()
    participant PageData as page.data.getText()

    Client->>LLMsTxt: HTTP GET /llms.txt
    LLMsTxt->>Source: getPages()
    Source-->>LLMsTxt: Page[] (title, description, url)
    LLMsTxt-->>Client: text/markdown index (## Title, > desc, - Source: /url)

    Client->>LLMsFullTxt: HTTP GET /llms-full.txt
    LLMsFullTxt->>Source: getPages()
    Source-->>LLMsFullTxt: Page[]
    loop For each page
        LLMsFullTxt->>GetLLMText: getLLMText(page)
        GetLLMText->>PageData: page.data.getText?.('processed')
        PageData-->>GetLLMText: processed markdown string | undefined
        GetLLMText-->>LLMsFullTxt: string (or '')
    end
    LLMsFullTxt-->>Client: text/markdown full docs (all pages joined with \n\n)

Last reviewed commit: 74b7fe8

Comment on lines +6 to +26
const pages = source.getPages();
const lines: string[] = [];

for (const page of pages) {
  const url = page.url;
  const data = page.data as { title?: string; description?: string };
  const title = data.title ?? 'Page';
  const description = data.description ?? '';
  lines.push(`## ${title}`);
  if (description) {
    lines.push(`> ${description}`);
  }
  lines.push(`- Source: ${url}\n`);
}

const result = lines.join('\n');
return new Response(result || '# No pages found', {
  headers: {
    'Content-Type': 'text/markdown',
  },
});

Missing top-level H1 header required by llms.txt spec

The llms.txt specification requires a top-level # Project Name H1 heading followed by an optional blockquote tagline at the very top of the file before any ## sections. Without the H1, AI agents and tools that parse llms.txt may fail to identify the project or treat the output as malformed.

The current output starts directly with ## Title sections. A spec-compliant header section should be prepended:

Suggested change (the replacement below prepends the spec-required H1 and tagline to the handler body quoted above):

export function GET() {
  const pages = source.getPages();
  const lines: string[] = [
    '# Emdash',
    '',
    '> An Open Source Agentic Development Environment (ADE)',
    '',
  ];

  for (const page of pages) {
    const url = page.url;
    const data = page.data as { title?: string; description?: string };
    const title = data.title ?? 'Page';
    const description = data.description ?? '';
    lines.push(`## ${title}`);
    if (description) {
      lines.push(`> ${description}`);
    }
    lines.push(`- Source: ${url}\n`);
  }

  const result = lines.join('\n');
  return new Response(result || '# No pages found', {
    headers: {
      'Content-Type': 'text/markdown',
    },
  });
}

if (description) {
  lines.push(`> ${description}`);
}
lines.push(`- Source: ${url}\n`);

Relative URL limits AI agent usability

page.url returns a path like /installation, not an absolute URL (e.g., https://docs.emdash.dev/installation). AI agents consuming llms.txt follow the links to retrieve content — a relative path without a known origin is not navigable. Additionally, the standard llms.txt link format uses a markdown hyperlink [Page Title](URL) rather than a bare Source: /path label.

Consider reading the canonical base URL from an environment variable or Next.js config:

Suggested change
- lines.push(`- Source: ${url}\n`);
+ lines.push(`- [${title}](${process.env.NEXT_PUBLIC_SITE_URL ?? ''}${url})\n`);
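A slightly more robust variant of the suggestion above would join the configured origin and the page path with the WHATWG URL constructor instead of string concatenation, which tolerates a trailing slash on the origin. The helper name absoluteDocUrl is hypothetical, not part of the PR:

```typescript
// Hypothetical helper: resolve a root-relative docs path against an origin
// (e.g. the value of NEXT_PUBLIC_SITE_URL). new URL(path, base) normalizes
// a trailing slash on the base for root-relative paths.
function absoluteDocUrl(base: string, path: string): string {
  return new URL(path, base).toString();
}
```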

Comment on lines +8 to +9
const scanned = await Promise.all(pages.map(getLLMText));
return new Response(scanned.join('\n\n'), {

Empty pages included in concatenated output

getLLMText returns '' when page.data.getText is unavailable or returns undefined. Joining those empty strings with '\n\n' leaves blank sections in the output, which reduces quality for consumers. Unlike llms.txt/route.ts which has a '# No pages found' fallback, this route also has no fallback for an entirely empty result.

Consider filtering and adding a fallback:

Suggested change
- const scanned = await Promise.all(pages.map(getLLMText));
- return new Response(scanned.join('\n\n'), {
+ const scanned = (await Promise.all(pages.map(getLLMText))).filter(Boolean);
+ return new Response(scanned.join('\n\n') || '# No content available', {
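The filter-and-fallback behavior suggested above can be isolated as a pure function so it is easy to check in isolation (joinLlmTexts is a hypothetical name, not code from the PR):

```typescript
// Hypothetical pure version of the suggested fix: drop empty page texts,
// join the rest with blank lines, and fall back when nothing remains.
function joinLlmTexts(texts: string[]): string {
  const nonEmpty = texts.filter(Boolean);
  return nonEmpty.join('\n\n') || '# No content available';
}
```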



Development

Successfully merging this pull request may close these issues.

[feat]: add llm.txt support to Emdash documentation

1 participant