`.claude/guides/09-expo-ondevice-ai.md` (68 changes: 48 additions, 20 deletions)

Location: `libraries/expo-ondevice-ai/`

Expo module wrapping the Locanara native SDKs for React Native/Expo apps. Provides a TypeScript API for all 8 AI features plus model management, bridging to Locanara chains via native modules on iOS and Android, and to Chrome Built-in AI on web.

## Requirements

- Expo SDK 52+
- Bun 1.1+
- iOS 17+ (for llama.cpp engine)
- Android API 26+ (for ML Kit GenAI)
- Web: Chrome 138+ (Chrome Built-in AI / Gemini Nano)

## Build Commands

```
libraries/expo-ondevice-ai/
├── src/
│   ├── index.ts                     # Public API exports
│   ├── ExpoOndeviceAiModule.ts      # Native module bridge
│   ├── ExpoOndeviceAiModule.web.ts  # Web implementation (Chrome Built-in AI)
│   ├── types.ts                     # TypeScript type definitions
│   ├── log.ts                       # Logging utilities
│   └── __tests__/                   # Unit tests
```

Each TypeScript function maps to a built-in Locanara chain:

| TypeScript API | iOS Chain | Android | Web (Chrome Built-in AI) |
| --------------------------- | ------------------------------------------ | -------------------- | ------------------------------------- |
| `summarize(text, opts)` | `SummarizeChain(bulletCount:).run(text)` | ML Kit Summarization | `Summarizer` API (key-points) |
| `classify(text, opts)` | `ClassifyChain(categories:).run(text)` | Prompt API | `LanguageModel` API |
| `extract(text, opts)` | `ExtractChain(entityTypes:).run(text)` | Prompt API | `LanguageModel` API |
| `chat(message, opts)` | `ChatChain(memory:).run(message)` | Prompt API | `LanguageModel` API |
| `chatStream(message, opts)` | `ChatChain(memory:).streamRun(message)` | Prompt API | `LanguageModel.promptStreaming()` |
| `translate(text, opts)` | `TranslateChain(source:target:).run(text)` | Prompt API | `Translator` API |
| `rewrite(text, opts)` | `RewriteChain(style:).run(text)` | ML Kit Rewriting | `Rewriter` API |
| `proofread(text, opts)` | `ProofreadChain().run(text)` | ML Kit Proofreading | `LanguageModel` API (structured JSON) |
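
For illustration, the TypeScript side of one row might be exercised like this. The stub below stands in for the real module so the call shape can be tried without a device; `bulletCount` is a hypothetical option name (see `types.ts` for the real definitions), and the real implementation dispatches to `SummarizeChain`, ML Kit Summarization, or the Chrome `Summarizer` per the table above.

```typescript
// Stub only: echoes input so the call shape can be exercised offline.
interface SummarizeOptions {
  bulletCount?: number; // hypothetical option name
}

const stub = {
  async summarize(text: string, opts: SummarizeOptions = {}): Promise<string> {
    const n = opts.bulletCount ?? 3;
    // Real impl: SummarizeChain / ML Kit / Chrome Summarizer per platform.
    return text
      .split(". ")
      .slice(0, n)
      .map((s) => `- ${s}`)
      .join("\n");
  },
};

stub
  .summarize("First. Second. Third. Fourth", { bulletCount: 2 })
  .then((r) => console.log(r)); // "- First\n- Second"
```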

### Model Management API (iOS)

| TypeScript API | Native call |
| ----------------------- | ------------------------------------------------------------ |
| `getAvailableModels()` | `LocanaraClient.shared.getAvailableModels()` |
| `getDownloadedModels()` | `LocanaraClient.shared.getDownloadedModels()` |
| `downloadModel(id)` | `LocanaraClient.shared.downloadModelWithProgress(id)` |
| `loadModel(id)` | `LocanaraClient.shared.loadModel(id)` → auto-switches engine |
| `deleteModel(id)` | `LocanaraClient.shared.deleteModel(id)` |
| `getLoadedModel()` | `LocanaraClient.shared.getLoadedModel()` |
| `getCurrentEngine()` | `LocanaraClient.shared.getCurrentEngine()` |
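
The expected call sequence (list → download → load → query) can be sketched with an in-memory mock. Illustrative only: the real methods are Promise-based and backed by `LocanaraClient` on iOS, and the model id below is made up.

```typescript
// Mock of the model-management flow; not the real module.
interface ModelInfo {
  id: string;
  downloaded: boolean;
}

class MockModelManager {
  private models = new Map<string, ModelInfo>([
    // Hypothetical model id, for illustration only.
    ["example-1b-gguf", { id: "example-1b-gguf", downloaded: false }],
  ]);
  private loaded: string | null = null;

  getAvailableModels(): ModelInfo[] {
    return [...this.models.values()];
  }
  getDownloadedModels(): ModelInfo[] {
    return this.getAvailableModels().filter((m) => m.downloaded);
  }
  downloadModel(id: string): void {
    const m = this.models.get(id);
    if (!m) throw new Error(`unknown model: ${id}`);
    m.downloaded = true; // real impl streams download progress events
  }
  loadModel(id: string): void {
    if (!this.models.get(id)?.downloaded) throw new Error("download first");
    this.loaded = id; // real impl also auto-switches the inference engine
  }
  getLoadedModel(): string | null {
    return this.loaded;
  }
}

const mm = new MockModelManager();
mm.downloadModel("example-1b-gguf");
mm.loadModel("example-1b-gguf");
console.log(mm.getLoadedModel()); // "example-1b-gguf"
```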

### Native Module Architecture

- `LocanaraClient` is only used for `initialize()`, `getDeviceCapability()`, and model management
- All AI features use built-in chains directly (not `LocanaraClient.executeFeature()`)
- `PrefilledMemory` adapts JS chat history `[{role, content}]` to the `Memory` protocol
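
The `PrefilledMemory` adaptation amounts to flattening that history into a prefill transcript. A sketch under stated assumptions: the real `Memory` protocol is native and its exact shape is not reproduced here.

```typescript
// Sketch: JS chat history in, a prefill transcript out.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

function toTranscript(history: ChatMessage[]): string {
  return history.map((m) => `${m.role}: ${m.content}`).join("\n");
}

const history: ChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello! How can I help?" },
];
console.log(toTranscript(history));
// "user: Hi\nassistant: Hello! How can I help?"
```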

### Web Implementation (`ExpoOndeviceAiModule.web.ts`)

Metro auto-resolves `.web.ts` over `.ts` for the web platform. The web module uses Chrome Built-in AI APIs (Gemini Nano) directly — no native bridge needed.

**Chrome APIs used:**

- `Summarizer` — text summarization (key-points mode, post-processed to match bullet count)
- `LanguageModel` — classify, extract, chat, chatStream, proofread (via structured JSON prompts)
- `Translator` — language translation
- `Rewriter` — text rewriting (tone/length mapping)
- `Writer` — fallback for proofread if LanguageModel unavailable

**Key implementation details:**

- **Availability detection**: Lenient checks with 3s timeout; accepts `readily`, `available`, `downloadable`, `after-download` statuses; falls back to API object existence
- **Streaming**: Uses `LanguageModel.promptStreaming()` with auto-detection of cumulative vs delta chunk format (varies by Chrome version)
- **Event emitter**: Web polyfill for Expo's native `addListener`/`removeListeners` pattern using a `Map<string, Set<Function>>`
- **Instance caching**: Summarizer, LanguageModel, Translator, Rewriter, Writer instances are cached and reused
- **Model management**: No-op on web (Chrome manages models automatically)
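
The bullets above can be sketched as pure helpers (illustrative only; the real module wraps live `availability()` calls and a `ReadableStream` from `promptStreaming()`). The status strings are the ones this guide lists, and the emitter mirrors the `Map<string, Set<Function>>` shape described.

```typescript
// Lenient availability check: any of these statuses counts as usable.
const USABLE = new Set(["readily", "available", "downloadable", "after-download"]);

function isUsable(status: string | undefined, apiExists: boolean): boolean {
  if (status !== undefined && USABLE.has(status)) return true;
  return apiExists; // fallback: accept mere existence of the API object
}

// Streaming chunks may be cumulative (each chunk extends the text so far)
// or deltas, depending on Chrome version; normalize both to deltas.
function toDeltas(chunks: string[]): string[] {
  const deltas: string[] = [];
  let acc = "";
  for (const chunk of chunks) {
    if (acc !== "" && chunk.startsWith(acc)) {
      deltas.push(chunk.slice(acc.length)); // cumulative format
      acc = chunk;
    } else {
      deltas.push(chunk); // already a delta
      acc += chunk;
    }
  }
  return deltas;
}

// Event-emitter polyfill shape behind addListener/removeListeners.
type Listener = (...args: unknown[]) => void;

class WebEventEmitter {
  private listeners = new Map<string, Set<Listener>>();

  addListener(event: string, fn: Listener): { remove: () => void } {
    if (!this.listeners.has(event)) this.listeners.set(event, new Set());
    this.listeners.get(event)!.add(fn);
    return { remove: () => void this.listeners.get(event)?.delete(fn) };
  }

  removeListeners(event: string): void {
    this.listeners.delete(event);
  }

  emit(event: string, ...args: unknown[]): void {
    this.listeners.get(event)?.forEach((fn) => fn(...args));
  }
}

console.log(toDeltas(["He", "Hello", "Hello wor"])); // ["He", "llo", " wor"]
```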

## Config Plugin (`withOndeviceAi.ts`)

The Expo config plugin automates native setup at prebuild time.
The bridge is discovered at runtime by `LlamaCppBridge.findBridge()` using `NSCl…`
### Key Build Settings

Bridge pod (`pod_target_xcconfig`):

- `SWIFT_INCLUDE_PATHS` / `FRAMEWORK_SEARCH_PATHS` → `$(PODS_CONFIGURATION_BUILD_DIR)` (for SPM modules)
- `IPHONEOS_DEPLOYMENT_TARGET` → `17.0` (LocalLLMClient requirement)
- `OTHER_SWIFT_FLAGS` → `-cxx-interoperability-mode=default -Xcc -std=c++20`

App target (`user_target_xcconfig`):

- `OTHER_LDFLAGS` → `-framework "llama"` (link dynamic framework)
- `FRAMEWORK_SEARCH_PATHS` → `$(PODS_CONFIGURATION_BUILD_DIR)` (find llama.framework)

Embed phase:

- Copies `llama.framework` from `PackageFrameworks/` to app's `Frameworks/`
- Re-signs with `EXPANDED_CODE_SIGN_IDENTITY`

```bash
# Run on iOS device
bun ios --device

# Run on Android
bun android

# Run on Web (Chrome 138+ required for AI features)
bun web
```

### App Structure
`.claude/guides/09-platform-differences.md` (155 changes: 106 additions, 49 deletions)
# Platform Feature Differences

This guide documents feature availability and implementation differences across iOS, Android, and Web platforms.

## Feature Availability Matrix

| Feature | iOS | Android | Web | Notes |
| ------------------------------- | --- | ------- | --- | ------------------------------------------ |
| **Core Framework** | ✅ | ✅ | ❌ | Native SDK only (iOS/Android) |
| Chains (7 built-in) | ✅ | ✅ | ❌ | Native SDK only |
| Pipeline DSL | ✅ | ✅ | ❌ | Native SDK only |
| Memory (Buffer/Summary) | ✅ | ✅ | ❌ | Native SDK only |
| Guardrails | ✅ | ✅ | ❌ | Native SDK only |
| Tools | ✅ | ✅ | ❌ | Native SDK only |
| Agent (ReAct-lite) | ✅ | ✅ | ❌ | Native SDK only |
| Session Management | ✅ | ✅ | ❌ | Native SDK only |
| **AI Features (via Libraries)** | | | | |
| Summarize | ✅ | ✅ | ✅ | Web: Chrome Summarizer API |
| Classify | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Extract | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Chat | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Chat Stream | ✅ | ✅ | ✅ | Web: LanguageModel.promptStreaming() |
| Translate | ✅ | ✅ | ✅ | Web: Chrome Translator API |
| Rewrite | ✅ | ✅ | ✅ | Web: Chrome Rewriter API |
| Proofread | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| **On-Device AI Backends** | | | | |
| Apple Intelligence | ✅ | ❌ | ❌ | iOS 26+, macOS 26+ only |
| Gemini Nano | ❌ | ✅ | ✅ | Android 14+ / Chrome 138+ |
| Chrome Built-in AI | ❌ | ❌ | ✅ | Chrome 138+ (Summarizer, Translator, etc.) |
| **External Model Support** | | | | |
| llama.cpp (GGUF) | ✅ | ❌ | ❌ | iOS 17+ via LocalLLMClient |
| ExecuTorch (GGUF) | ❌ | ✅ | ❌ | Android API 26+ |
| **Engine System** | ✅ | ✅ | ❌ | Native SDK only |
| InferenceRouter | ✅ | ✅ | ❌ | Auto-routing to active engine |
| ModelManager | ✅ | ✅ | ❌ | Download/load/unload GGUF models |
| ModelRegistry | ✅ | ✅ | ❌ | Available model catalog |
| DeviceCapabilityDetector | ✅ | ❌ | ❌ | iOS-only hardware detection |
| **RAG** | ✅ | ✅ | ❌ | Native SDK only |
| VectorStore | ✅ | ✅ | ❌ | In-memory vector storage |
| DocumentChunker | ✅ | ✅ | ❌ | Multiple chunking strategies |
| EmbeddingEngine | ✅ | ✅ | ❌ | Text embedding generation |
| RAGManager | ✅ | ✅ | ❌ | Collection management |
| RAGQueryEngine | ✅ | ✅ | ❌ | Query pipeline |
| **Personalization** | ✅ | ✅ | ❌ | Native SDK only |
| PersonalizationManager | ✅ | ✅ | ❌ | Feedback orchestration |
| FeedbackCollector | ✅ | ✅ | ❌ | User feedback collection |
| PreferenceAnalyzer | ✅ | ✅ | ❌ | Pattern analysis |
| PromptOptimizer | ✅ | ✅ | ❌ | Adaptive prompts |
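
App code consuming these libraries typically gates calls on the current platform. A minimal sketch, with the availability map hand-copied from a few rows of the matrix above (not a real API):

```typescript
// Hand-copied subset of the matrix; illustrative only.
type Platform = "ios" | "android" | "web";

const availability: Record<string, Platform[]> = {
  summarize: ["ios", "android", "web"],
  translate: ["ios", "android", "web"],
  rag: ["ios", "android"],
  "llama.cpp": ["ios"],
  executorch: ["android"],
};

function isAvailable(feature: string, platform: Platform): boolean {
  return availability[feature]?.includes(platform) ?? false;
}

console.log(isAvailable("summarize", "web")); // true
console.log(isAvailable("rag", "web")); // false
```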

## Platform-Specific APIs

```kotlin
val result = SummarizeChain(model).run("text")
```

**Available:** Android API 26+

### Web-Only Features (via expo-ondevice-ai)

#### Chrome Built-in AI

Web support is available via `expo-ondevice-ai` using Chrome Built-in AI APIs. The web module (`ExpoOndeviceAiModule.web.ts`) maps each feature to the appropriate Chrome API:

```typescript
// All 8 AI features work on web via Chrome Built-in AI
import { summarize, classify, chat, translate } from "expo-ondevice-ai";

const result = await summarize("Long text...", { outputType: "THREE_BULLETS" });
const translation = await translate("Hello", { targetLanguage: "ko" });
```

**Chrome APIs used:**

| Chrome API | Features | Notes |
| --------------------------------- | ---------------------------------- | --------------------------------------------- |
| `Summarizer` | summarize | key-points mode, bullet count post-processing |
| `LanguageModel` | classify, extract, chat, proofread | Structured JSON prompts |
| `LanguageModel.promptStreaming()` | chatStream | Auto-detects cumulative vs delta chunks |
| `Translator` | translate | Per-language-pair caching |
| `Rewriter` | rewrite | Tone/length mapping from SDK types |

**Available:** Chrome 138+ with `chrome://flags/#optimization-guide-on-device-model` enabled

**Note:** `react-native-ondevice-ai` does NOT support web (Nitro Modules are native-only). Web users should use `expo-ondevice-ai` or `packages/web` standalone SDK.

## Implementation Differences

### Error Handling
When a feature is only available on one platform, use suffixes.

### iOS

| Requirement | Minimum | Recommended |
| ------------- | ------- | ----------------------------- |
| iOS Version | 17.0 | 26.0 (for Apple Intelligence) |
| macOS Version | 14.0 | 26.0 (for Apple Intelligence) |
| Xcode | 16.0 | 16.0+ |
| Swift | 6.0 | 6.0+ |

### Android

| Requirement | Minimum | Recommended |
| -------------- | -------- | ------------------- |
| Android API | 26 | 34 (for Prompt API) |
| Kotlin | 2.0 | 2.0+ |
| Android Studio | 2024.1.1 | Latest |
| Gradle | 8.0 | 8.0+ |

### Web

| Requirement | Minimum | Recommended |
| ----------- | ----------------- | ----------------- |
| Chrome | 138 | Latest |
| Gemini Nano | Enabled via flags | Enabled via flags |
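
A runtime guard for the Chrome 138+ requirement might parse the user-agent string. Sketch only: as the web module itself does, prefer feature detection (checking that the API objects exist) over version sniffing in real code.

```typescript
// Extract the Chrome/Chromium major version from a UA string.
function chromeMajor(ua: string): number | null {
  const m = ua.match(/Chrom(?:e|ium)\/(\d+)/);
  return m ? parseInt(m[1], 10) : null;
}

function meetsWebRequirement(ua: string): boolean {
  const v = chromeMajor(ua);
  return v !== null && v >= 138;
}

console.log(meetsWebRequirement(
  "Mozilla/5.0 (Macintosh) AppleWebKit/537.36 Chrome/140.0.0.0 Safari/537.36"
)); // true
```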

## Testing Platform-Specific Features

```bash
adb install -r example/build/outputs/apk/debug/example-debug.apk
```
3. Replace `Flow` → `AsyncThrowingStream`
4. Add `@available` annotations for iOS 26+ APIs

### Web (Expo)

```bash
# Run Expo example app on web
cd libraries/expo-ondevice-ai/example
bun web

# Requires Chrome 138+ with chrome://flags/#optimization-guide-on-device-model enabled
```

## Summary

- **Core framework** is identical across iOS and Android (Chains, Pipeline, Memory, Guardrails, Tools, Agent, Session)
- **Engine, RAG, Personalization** layers available on iOS and Android
- **AI features** (summarize, classify, etc.) available on all 3 platforms via library wrappers
- **On-device AI backends** differ by platform (Apple Intelligence / Gemini Nano / Chrome Built-in AI)
- **External models** supported on iOS (llama.cpp) and Android (ExecuTorch), not on web
- **Web support** available via `expo-ondevice-ai` only (not `react-native-ondevice-ai`)
- **API naming** is identical for shared features, suffixed for platform-specific features
- Always test on **real devices** for accurate on-device AI behavior