---
description: General information based on the latest ./README.md content
globs:
---
Update it if APIs change:

# ts-syntax-highlighter

A blazing-fast, TypeScript-native syntax highlighter with comprehensive grammar support for modern web languages. Built for performance with both synchronous and asynchronous tokenization modes.

## Features

- ⚡ **Blazing Fast** - Highly optimized tokenization with async and sync modes for maximum performance
- 🎨 **6 Languages** - JavaScript/JSX, TypeScript/TSX, HTML, CSS, JSON, and STX
- 🔥 **Modern Syntax** - Full support for ES2024+, BigInt, numeric separators, optional chaining, and more
- ⚛️ **JSX/TSX Support** - Complete React and TypeScript JSX highlighting
- 🎯 **CSS4 Features** - Modern color functions (hwb, lab, lch, oklab, oklch), container queries, CSS layers
- 🧵 **Dual Modes** - Fast async mode and synchronous mode for different use cases
- 💪 **TypeScript-First** - Fully typed APIs with comprehensive type definitions
- 🧪 **416 Tests** - Extensively tested with high code coverage
- 📦 **Zero Dependencies** - Lightweight with no external runtime dependencies

## Installation

```bash
npm install ts-syntax-highlighter
# or
bun add ts-syntax-highlighter
# or
pnpm add ts-syntax-highlighter
```

## Quick Start

```typescript
import { Tokenizer } from 'ts-syntax-highlighter'

// Create a tokenizer instance for JavaScript
const tokenizer = new Tokenizer('javascript')

// Tokenize code (async mode, faster)
const tokens = await tokenizer.tokenizeAsync(`
const greeting = 'Hello World'
console.log(greeting)
`)

// Or use sync mode (plain JavaScript input, matching the tokenizer's language)
const syncTokens = tokenizer.tokenize(`
function add(a, b) {
  return a + b
}
`)
```

## Supported Languages

### JavaScript/JSX

- ES2024+ features (BigInt, numeric separators, optional chaining, nullish coalescing)
- JSX elements and expressions
- Template literals with expressions
- Regex literals with all flags
- Async/await, generators
- Modern operators: `?.`, `??`, `?.[]`, `?.()`

### TypeScript/TSX

- All JavaScript features plus:
- Type annotations and assertions
- Interfaces, types, enums
- Generics and type parameters
- TypeScript-specific operators: `is`, `keyof`, `infer`
- TSX (TypeScript + JSX)
- Utility types

### HTML

- HTML5 elements
- Data attributes (`data-*`)
- ARIA attributes (`aria-*`)
- Event handlers (`onclick`, `onload`, etc.)
- HTML entities
- DOCTYPE declarations

### CSS

- Modern color functions: `hwb()`, `lab()`, `lch()`, `oklab()`, `oklch()`, `color()`
- Math functions: `calc()`, `min()`, `max()`, `clamp()`, `round()`, `abs()`, `sign()`
- Trigonometric: `sin()`, `cos()`, `tan()`, `asin()`, `acos()`, `atan()`
- Gradients: `linear-gradient()`, `radial-gradient()`, `conic-gradient()`
- At-rules: `@media`, `@keyframes`, `@supports`, `@container`, `@layer`, `@property`
- CSS custom properties (variables): `--custom-property`, `var()`

### JSON

- Objects and arrays
- Strings with proper escape sequences
- Numbers (including scientific notation)
- Booleans and null
- Invalid escape detection
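
The number and escape rules listed above follow the JSON grammar (RFC 8259). As an illustration of what a highlighter has to recognize, the sketch below checks both with regular expressions. `isJsonNumber` and `findInvalidEscapes` are hypothetical helpers written for this example; they are not functions exported by the library, and the library's own grammar may work differently.

```typescript
// JSON number grammar: optional minus, integer part without leading zeros,
// optional fraction, optional exponent (scientific notation).
const JSON_NUMBER = /^-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?$/

// Hypothetical helper: true when `text` is a complete, valid JSON number.
function isJsonNumber(text: string): boolean {
  return JSON_NUMBER.test(text)
}

// Valid escapes inside a JSON string: \" \\ \/ \b \f \n \r \t and \uXXXX.
// Hypothetical helper: returns the offsets of backslashes that start an
// invalid escape within the body of a string literal (quotes excluded).
function findInvalidEscapes(stringBody: string): number[] {
  const invalid: number[] = []
  // Group 1 captures the valid tail of an escape; it is undefined otherwise.
  const escape = /\\(u[0-9a-fA-F]{4}|["\\/bfnrt])?/g
  let m: RegExpExecArray | null
  while ((m = escape.exec(stringBody)) !== null) {
    if (m[1] === undefined)
      invalid.push(m.index)
  }
  return invalid
}
```

A highlighter could use checks like these to assign a distinct scope to malformed input instead of failing on it.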

### STX

- Blade-like templating syntax
- 50+ directives
- Components, layouts, includes
- Control flow, loops
- Authentication, authorization
- And much more

## Performance

ts-syntax-highlighter is built for speed with highly optimized tokenization algorithms:

| Operation | Fast Mode (Async) | Sync Mode |
|-----------|------------------|-----------|
| JavaScript tokenization | ~0.05ms | ~0.08ms |
| TypeScript tokenization | ~0.08ms | ~0.12ms |
| HTML tokenization | ~0.04ms | ~0.06ms |
| CSS tokenization | ~0.03ms | ~0.05ms |

### Performance Characteristics

- **Fast Mode**: Async tokenization with worker-like performance characteristics
- **Sync Mode**: Synchronous processing for simpler integration
- **Optimized Patterns**: Pattern matching ordered by frequency
- **Pre-compiled Regex**: All patterns compiled and cached
- **Minimal Backtracking**: Patterns designed for efficiency
- **Memory Efficient**: Roughly 3x the source code size in memory
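
The bullets above mention pre-compiled patterns that are tried in frequency order. The sketch below illustrates that general technique; it is not the library's actual implementation. The `Rule` and `SketchToken` types, the `rules` table, the scope names, and `tokenizeLine` are all hypothetical. Each pattern is compiled once with the sticky (`y`) flag, so matching starts exactly at the current offset without slicing the string.

```typescript
// Hypothetical rule shape: a scope name plus a regex compiled once at module load.
interface Rule {
  scope: string
  pattern: RegExp
}

interface SketchToken {
  type: string
  content: string
  startIndex: number
}

// Rules are tried in array order; a real grammar would place the most
// frequent token kinds first. The sticky flag anchors each match at lastIndex.
const rules: Rule[] = [
  { scope: 'whitespace', pattern: /\s+/y },
  { scope: 'identifier', pattern: /[A-Za-z_$][\w$]*/y },
  { scope: 'number', pattern: /\d+(?:\.\d+)?/y },
  { scope: 'string', pattern: /'(?:[^'\\]|\\.)*'/y },
  { scope: 'punctuation', pattern: /[^\sA-Za-z_$\d']/y },
]

function tokenizeLine(line: string): SketchToken[] {
  const tokens: SketchToken[] = []
  let pos = 0
  while (pos < line.length) {
    let matched = false
    for (const rule of rules) {
      rule.pattern.lastIndex = pos
      const m = rule.pattern.exec(line)
      if (m !== null && m[0].length > 0) {
        // Whitespace is consumed but not emitted in this sketch.
        if (rule.scope !== 'whitespace')
          tokens.push({ type: rule.scope, content: m[0], startIndex: pos })
        pos += m[0].length
        matched = true
        break
      }
    }
    // Fallback: emit one unknown character so the loop always advances.
    if (!matched) {
      tokens.push({ type: 'unknown', content: line.charAt(pos), startIndex: pos })
      pos += 1
    }
  }
  return tokens
}
```

Because the regex objects are created once and reused, the per-line cost is limited to resetting `lastIndex` and running the match.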

### Comparison with Alternatives

When compared to popular syntax highlighters:

| Library | JavaScript | TypeScript | HTML | CSS |
|---------|-----------|------------|------|-----|
| **ts-syntax-highlighter (Fast)** | **0.05ms** | **0.08ms** | **0.04ms** | **0.03ms** |
| highlight.js | 3.8ms | 1.0ms | 1.2ms | 0.9ms |
| Prism.js | 2.1ms | 0.6ms | 0.8ms | 0.5ms |

Run benchmarks yourself:

```bash
bun run bench
```
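
To get numbers for your own inputs, a small timing loop is enough. The sketch below is a simplified harness, not the project's `bench` script: `measureMs` is a hypothetical helper, and `fakeTokenize` is a stand-in workload so the block runs on its own. In practice you would pass a closure that calls `tokenizer.tokenize(code)`.

```typescript
// Hypothetical helper: returns the mean wall-clock time per call in milliseconds.
function measureMs(fn: () => unknown, iterations = 1000, warmup = 100): number {
  // Warm-up runs give the JIT a chance to optimize before timing starts.
  for (let i = 0; i < warmup; i++) fn()

  const start = performance.now()
  for (let i = 0; i < iterations; i++) fn()
  const elapsed = performance.now() - start

  return elapsed / iterations
}

// Stand-in workload so the sketch is self-contained; replace it with
// () => tokenizer.tokenize(code) when measuring the real library.
const sample = 'const greeting = \'Hello World\'\nconsole.log(greeting)\n'
const fakeTokenize = (): string[] => sample.split(/\b/)

const meanMs = measureMs(fakeTokenize)
console.log(`mean: ${meanMs.toFixed(4)} ms per call`)
```

Sub-millisecond timings are sensitive to input size, runtime, and hardware, so it helps to report those alongside any result.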

## API Reference

### Tokenizer

```typescript
import { Tokenizer } from 'ts-syntax-highlighter'

// Create a tokenizer for a specific language.
// Supported IDs: 'javascript', 'typescript', 'html', 'css', 'json', 'stx'
const tokenizer = new Tokenizer('javascript')

const code = 'const answer = 42'

// Async tokenization (faster, recommended)
const tokens = await tokenizer.tokenizeAsync(code)

// Sync tokenization
const syncTokens = tokenizer.tokenize(code)
```

### Language Detection

```typescript
import { getLanguage, getLanguageByExtension } from 'ts-syntax-highlighter'

// Get language by ID or alias
const lang = getLanguage('js') // Returns JavaScript language
const langTs = getLanguage('tsx') // Returns TypeScript language

// Get language by file extension
const langFromExt = getLanguageByExtension('.jsx') // Returns JavaScript language
```
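
The lookup functions above resolve IDs, aliases, and file extensions to a language. One way such a registry could be structured is sketched below. The `LanguageEntry` shape, the `registry` array, and both `find…` helpers are hypothetical and do not describe the library's internals. Only the `js`, `tsx`, and `.jsx` mappings come from the examples above; the remaining aliases and extensions are illustrative assumptions.

```typescript
// Hypothetical registry entry; the real library's language objects may differ.
interface LanguageEntry {
  id: string
  aliases: string[]
  extensions: string[]
}

// Only js, tsx and .jsx are taken from the README examples;
// the other aliases and extensions are assumptions for illustration.
const registry: LanguageEntry[] = [
  { id: 'javascript', aliases: ['js', 'jsx'], extensions: ['.js', '.jsx'] },
  { id: 'typescript', aliases: ['ts', 'tsx'], extensions: ['.ts', '.tsx'] },
]

// Resolve a language by its ID or one of its aliases (case-insensitive).
function findLanguage(idOrAlias: string): LanguageEntry | undefined {
  const key = idOrAlias.toLowerCase()
  return registry.find(l => l.id === key || l.aliases.includes(key))
}

// Resolve a language by file extension, accepting both "jsx" and ".jsx".
function findLanguageByExtension(ext: string): LanguageEntry | undefined {
  const key = (ext.startsWith('.') ? ext : `.${ext}`).toLowerCase()
  return registry.find(l => l.extensions.includes(key))
}
```

Returning `undefined` for unknown input lets callers decide on a fallback, such as rendering the code as plain text.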

### Token Structure

```typescript
interface Token {
  type: string // Token scope (e.g., 'keyword.control.js', 'string.quoted.double.ts')
  content: string // The actual text content
  line: number // Line number (0-indexed)
  startIndex: number // Character position in the line
}

interface LineTokens {
  line: number
  tokens: Token[]
}
```
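
Given the `Token` and `LineTokens` shapes above, turning tokens into markup is a small mapping step. The sketch below assumes the tokenizer returns an array of `LineTokens` (the README does not state the return type) and that scope names can become CSS classes by replacing dots with hyphens. `escapeHtml`, `scopeToClass`, `renderLine`, and `renderHtml` are hypothetical helpers, not part of the library, and the sample tokens are made up for the example.

```typescript
interface Token { type: string, content: string, line: number, startIndex: number }
interface LineTokens { line: number, tokens: Token[] }

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Turns a scope such as 'keyword.control.js' into 'keyword-control-js'.
function scopeToClass(scope: string): string {
  return scope.replace(/\./g, '-')
}

function renderLine(tokens: Token[]): string {
  let cursor = 0
  let out = ''
  for (const t of tokens) {
    // Assumption: gaps between tokens are whitespace that was not emitted,
    // so they are padded with spaces to keep columns aligned.
    if (t.startIndex > cursor)
      out += ' '.repeat(t.startIndex - cursor)
    out += `<span class="${escapeHtml(scopeToClass(t.type))}">${escapeHtml(t.content)}</span>`
    cursor = t.startIndex + t.content.length
  }
  return out
}

function renderHtml(lines: LineTokens[]): string {
  const body = lines.map(l => renderLine(l.tokens)).join('\n')
  return `<pre><code>${body}</code></pre>`
}

// Made-up sample data in the documented shape.
const sampleLines: LineTokens[] = [
  {
    line: 0,
    tokens: [
      { type: 'keyword.control.js', content: 'return', line: 0, startIndex: 0 },
      { type: 'string.quoted.double.js', content: '"a < b"', line: 0, startIndex: 7 },
    ],
  },
]
const html = renderHtml(sampleLines)
```

Styling then comes down to CSS rules for the generated class names, which keeps theming separate from tokenization.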

## Development

```bash
# Install dependencies
bun install

# Build the library
bun run build

# Run tests
bun test

# Run benchmarks
bun run bench

# Type checking
bun run typecheck

# Linting
bun run lint
```