# OpenUI Lang Benchmark

Measures token efficiency and estimated generation latency of OpenUI Lang against three structured streaming formats across seven real-world UI scenarios:

- YAML
- Vercel JSON-Render (RFC 6902 patch stream)
- Thesys C1 JSON (normalized component tree)

## Formats

| Format | Description |
| --- | --- |
| OpenUI Lang | Line-oriented DSL generated directly by the LLM |
| YAML | YAML root / elements spec payload |
| Vercel JSON-Render | JSONL stream of JSON Patch (RFC 6902) operations |
| Thesys C1 JSON | Normalized component tree JSON (component + props) |

All four formats encode exactly the same UI. The LLM always generates OpenUI Lang, then the parsed AST is projected into the other three formats.
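For concreteness, here is a minimal sketch of how a JSONL stream of RFC 6902 `add` operations (the Vercel JSON-Render shape) folds into a UI document. The paths and component values below are invented for illustration, and only the `add` op is handled:

```typescript
// Minimal RFC 6902 "add"-only applier for a JSONL patch stream (sketch).
// Real json-render streams may carry more ops; this handles object keys,
// array-index inserts, and "-" (append) per RFC 6902.
type Op = { op: "add"; path: string; value: unknown };

function applyAdd(doc: any, op: Op): any {
  // Decode a JSON Pointer: drop the leading "", unescape ~1 and ~0.
  const parts = op.path
    .split("/")
    .slice(1)
    .map((p) => p.replace(/~1/g, "/").replace(/~0/g, "~"));
  let target: any = doc;
  for (const key of parts.slice(0, -1)) target = target[key];
  const last = parts[parts.length - 1];
  if (Array.isArray(target)) {
    if (last === "-") target.push(op.value); // "-" appends
    else target.splice(Number(last), 0, op.value); // numeric index inserts
  } else {
    target[last] = op.value;
  }
  return doc;
}

// Hypothetical two-line patch stream building a tiny component tree.
const jsonl = [
  `{"op":"add","path":"/elements","value":[]}`,
  `{"op":"add","path":"/elements/-","value":{"type":"Table","props":{"rows":3}}}`,
];
const ui = jsonl.map((l) => JSON.parse(l) as Op).reduce(applyAdd, {} as any);
console.log(JSON.stringify(ui));
// → {"elements":[{"type":"Table","props":{"rows":3}}]}
```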

## Methodology

1. Use a fixed set of seven prompts in `generate-samples.ts` (simple-table, chart-with-data, contact-form, dashboard, pricing-page, settings-panel, e-commerce-product).
2. Generate OpenUI Lang once per prompt with `gpt-5.2` at `temperature: 0` using streaming completions.
3. Parse the OpenUI Lang output with `createParser(schema)` so positional arguments map to named props via `schema.json`.
4. Convert the same parsed AST into:
   - `*.c1.json`: Thesys C1 format via `thesys-c1-converter.ts`.
   - `*.vercel.jsonl`: json-render RFC 6902 patches via `vercel-jsonl-converter.ts`.
   - `*.yaml`: json-render YAML spec payload via `yaml-converter.ts`.
5. Count tokens from all saved artifacts in `samples/` using tiktoken with `encoding_for_model("gpt-5")`.
6. Report token counts and estimated decode latency at a fixed 60 tokens/second.
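The latency estimate in step 6 is a plain division of token count by the fixed decode rate. A sketch using the measured totals from the Results table:

```typescript
// Estimated decode latency at a fixed 60 tokens/second (step 6).
// Token counts are the TOTAL row from the Results table.
const TPS = 60;
const totals = { yaml: 9122, vercel: 10180, c1: 9948, openui: 4800 };

for (const [fmt, tokens] of Object.entries(totals)) {
  const seconds = tokens / TPS;
  console.log(`${fmt}: ${tokens} tok ≈ ${seconds.toFixed(1)}s at ${TPS} tok/s`);
}
// e.g. openui: 4800 tok ≈ 80.0s, versus yaml: 9122 tok ≈ 152.0s
```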

## Results

Measured with tiktoken (`gpt-5` encoder). Samples generated by `gpt-5.2` at temperature 0.

| Scenario | YAML | Vercel JSON-Render | Thesys C1 JSON | OpenUI Lang | vs YAML | vs Vercel | vs C1 |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| simple-table | 316 | 340 | 357 | 148 | -53.2% | -56.5% | -58.5% |
| chart-with-data | 464 | 520 | 516 | 231 | -50.2% | -55.6% | -55.2% |
| contact-form | 762 | 893 | 849 | 294 | -61.4% | -67.1% | -65.4% |
| dashboard | 2128 | 2247 | 2261 | 1226 | -42.4% | -45.4% | -45.8% |
| pricing-page | 2230 | 2487 | 2379 | 1195 | -46.4% | -52.0% | -49.8% |
| settings-panel | 1077 | 1244 | 1205 | 540 | -49.9% | -56.6% | -55.2% |
| e-commerce-product | 2145 | 2449 | 2381 | 1166 | -45.6% | -52.4% | -51.0% |
| TOTAL | 9122 | 10180 | 9948 | 4800 | -47.4% | -52.8% | -51.7% |
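The `vs` columns are the percent token reduction of OpenUI Lang relative to each baseline format, i.e. `(openui - baseline) / baseline`. A short sketch that reproduces the simple-table row:

```typescript
// Derivation of the "vs" columns: relative token reduction of OpenUI Lang
// versus each baseline. Counts are the simple-table row from the table above.
const row = { yaml: 316, vercel: 340, c1: 357, openui: 148 };

const savings = (baseline: number, openui: number) =>
  ((openui - baseline) / baseline) * 100;

console.log(savings(row.yaml, row.openui).toFixed(1));   // "-53.2"
console.log(savings(row.vercel, row.openui).toFixed(1)); // "-56.5"
console.log(savings(row.c1, row.openui).toFixed(1));     // "-58.5"
```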

## Running

### Prerequisites

Export `OPENAI_API_KEY` in your shell:

    export OPENAI_API_KEY=sk-...

### 1. Generate samples (calls OpenAI)

    pnpm generate

This calls OpenAI once per scenario, saves the raw `.oui` output, then converts it to `.c1.json`, `.vercel.jsonl`, and `.yaml`. All files land in `samples/`. It also writes `metrics.json`, recording TTFT (time to first token) and TPS (tokens per second) from the API response.
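The exact shape of `metrics.json` isn't shown here, but as an illustration, this is one common way TTFT and TPS are derived from a streamed completion. The function name, timestamps, and token count below are hypothetical, not taken from `generate-samples.ts`:

```typescript
// Hypothetical sketch: derive TTFT and TPS from streaming chunk timestamps.
// chunkTimesMs: arrival time of each streamed chunk; startMs: request start.
function streamMetrics(chunkTimesMs: number[], startMs: number, tokens: number) {
  const ttftMs = chunkTimesMs[0] - startMs; // time to first token
  // Decode time spans first chunk to last chunk (assumes >= 2 chunks).
  const decodeSec = (chunkTimesMs[chunkTimesMs.length - 1] - chunkTimesMs[0]) / 1000;
  const tps = tokens / decodeSec; // tokens per second during decode
  return { ttftMs, tps };
}

console.log(streamMetrics([1250, 1500, 2250], 1000, 120));
// → { ttftMs: 250, tps: 120 }
```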

### 2. Run the benchmark report (offline)

    pnpm bench

Reads the files in samples/, counts tokens with tiktoken, and prints token and latency tables.

## File Layout

    benchmarks/
    ├── generate-samples.ts        # Calls OpenAI, converts AST to all four formats
    ├── run-benchmark.ts           # Reads samples/, prints token/latency tables
    ├── thesys-c1-converter.ts     # AST -> normalized Thesys C1 JSON converter
    ├── vercel-spec-converter.ts   # AST -> shared json-render spec projection
    ├── vercel-jsonl-converter.ts  # Shared spec -> RFC 6902 JSONL converter
    ├── yaml-converter.ts          # Shared spec -> YAML document converter
    ├── schema.json                # JSON Schema for the default component library
    ├── system-prompt.txt          # System prompt sent to the LLM
    ├── package.json
    ├── pnpm-lock.yaml
    └── samples/
        ├── <scenario>.oui
        ├── <scenario>.c1.json
        ├── <scenario>.vercel.jsonl
        ├── <scenario>.yaml
        └── metrics.json

## Updating the Schema

`schema.json` and `system-prompt.txt` are generated by `library.toJSONSchema()` and `library.prompt()` in `@openuidev/react-ui`. If you add or change components, regenerate both files:

    import { openuiLibrary } from "@openuidev/react-ui";
    import { writeFileSync } from "fs";

    writeFileSync("schema.json", JSON.stringify(openuiLibrary.toJSONSchema(), null, 2));
    writeFileSync("system-prompt.txt", openuiLibrary.prompt());

The parser (`createParser`) reads component definitions from the `$defs` section of `schema.json` to map positional arguments to named props.
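The exact `$defs` layout isn't reproduced in this README. As a hypothetical sketch of the positional-to-named mapping, assuming each component definition carries an ordered argument list (the `args` field and `Button` component below are invented, not the repo's real schema shape):

```typescript
// Hypothetical positional-to-named prop mapping, assuming each $defs entry
// exposes an ordered list of argument names. createParser(schema) may differ.
type Defs = Record<string, { args: string[] }>;

function mapArgs(defs: Defs, component: string, positional: string[]) {
  const names = defs[component]?.args ?? [];
  // Pair each positional value with its schema name; fall back to argN
  // for extra values beyond the declared argument list.
  return Object.fromEntries(positional.map((v, i) => [names[i] ?? `arg${i}`, v]));
}

const defs: Defs = { Button: { args: ["label", "variant"] } };
console.log(mapArgs(defs, "Button", ["Save", "primary"]));
// → { label: "Save", variant: "primary" }
```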