`.claude/guides/09-expo-ondevice-ai.md` (68 changes: 48 additions, 20 deletions)

Location: `libraries/expo-ondevice-ai/`

Expo module wrapping the Locanara native SDKs for React Native/Expo apps. Provides a TypeScript API for all 8 AI features plus model management, bridging to Locanara chains via native modules on iOS and Android, and to Chrome Built-in AI on web.

## Requirements

- Expo SDK 52+
- Bun 1.1+
- iOS 17+ (for llama.cpp engine)
- Android API 26+ (for ML Kit GenAI)
- Web: Chrome 138+ (Chrome Built-in AI / Gemini Nano)

## Build Commands

```
libraries/expo-ondevice-ai/
├── src/
│   ├── index.ts                     # Public API exports
│   ├── ExpoOndeviceAiModule.ts      # Native module bridge
│   ├── ExpoOndeviceAiModule.web.ts  # Web implementation (Chrome Built-in AI)
│   ├── types.ts                     # TypeScript type definitions
│   ├── log.ts                       # Logging utilities
│   └── __tests__/                   # Unit tests
```

Each TypeScript function maps to a built-in Locanara chain:

| TypeScript API | iOS Chain | Android | Web (Chrome Built-in AI) |
| --------------------------- | ------------------------------------------ | -------------------- | ------------------------------------- |
| `summarize(text, opts)` | `SummarizeChain(bulletCount:).run(text)` | ML Kit Summarization | `Summarizer` API (key-points) |
| `classify(text, opts)` | `ClassifyChain(categories:).run(text)` | Prompt API | `LanguageModel` API |
| `extract(text, opts)` | `ExtractChain(entityTypes:).run(text)` | Prompt API | `LanguageModel` API |
| `chat(message, opts)` | `ChatChain(memory:).run(message)` | Prompt API | `LanguageModel` API |
| `chatStream(message, opts)` | `ChatChain(memory:).streamRun(message)` | Prompt API | `LanguageModel.promptStreaming()` |
| `translate(text, opts)` | `TranslateChain(source:target:).run(text)` | Prompt API | `Translator` API |
| `rewrite(text, opts)` | `RewriteChain(style:).run(text)` | ML Kit Rewriting | `Rewriter` API |
| `proofread(text, opts)` | `ProofreadChain().run(text)` | ML Kit Proofreading | `LanguageModel` API (structured JSON) |
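
For illustration, the TypeScript side of one row might be exercised like this. The stub below stands in for the real module so the call shape can be tried without a device; `bulletCount` is a hypothetical option name (see `types.ts` for the real definitions), and the real implementation dispatches to `SummarizeChain`, ML Kit Summarization, or the Chrome `Summarizer` per the table above.

```typescript
// Stub only: echoes input so the call shape can be exercised offline.
interface SummarizeOptions {
  bulletCount?: number; // hypothetical option name
}

const stub = {
  async summarize(text: string, opts: SummarizeOptions = {}): Promise<string> {
    const n = opts.bulletCount ?? 3;
    // Real impl: SummarizeChain / ML Kit / Chrome Summarizer per platform.
    return text
      .split(". ")
      .slice(0, n)
      .map((s) => `- ${s}`)
      .join("\n");
  },
};

stub
  .summarize("First. Second. Third. Fourth", { bulletCount: 2 })
  .then((r) => console.log(r)); // "- First\n- Second"
```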

### Model Management API (iOS)

| TypeScript API | Native call |
| ----------------------- | ------------------------------------------------------------ |
| `getAvailableModels()` | `LocanaraClient.shared.getAvailableModels()` |
| `getDownloadedModels()` | `LocanaraClient.shared.getDownloadedModels()` |
| `downloadModel(id)` | `LocanaraClient.shared.downloadModelWithProgress(id)` |
| `loadModel(id)` | `LocanaraClient.shared.loadModel(id)` → auto-switches engine |
| `deleteModel(id)` | `LocanaraClient.shared.deleteModel(id)` |
| `getLoadedModel()` | `LocanaraClient.shared.getLoadedModel()` |
| `getCurrentEngine()` | `LocanaraClient.shared.getCurrentEngine()` |
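
The expected call sequence (list → download → load → query) can be sketched with an in-memory mock. Illustrative only: the real methods are Promise-based and backed by `LocanaraClient` on iOS, and the model id below is made up.

```typescript
// Mock of the model-management flow; not the real module.
interface ModelInfo {
  id: string;
  downloaded: boolean;
}

class MockModelManager {
  private models = new Map<string, ModelInfo>([
    // Hypothetical model id, for illustration only.
    ["example-1b-gguf", { id: "example-1b-gguf", downloaded: false }],
  ]);
  private loaded: string | null = null;

  getAvailableModels(): ModelInfo[] {
    return [...this.models.values()];
  }
  getDownloadedModels(): ModelInfo[] {
    return this.getAvailableModels().filter((m) => m.downloaded);
  }
  downloadModel(id: string): void {
    const m = this.models.get(id);
    if (!m) throw new Error(`unknown model: ${id}`);
    m.downloaded = true; // real impl streams download progress events
  }
  loadModel(id: string): void {
    if (!this.models.get(id)?.downloaded) throw new Error("download first");
    this.loaded = id; // real impl also auto-switches the inference engine
  }
  getLoadedModel(): string | null {
    return this.loaded;
  }
}

const mm = new MockModelManager();
mm.downloadModel("example-1b-gguf");
mm.loadModel("example-1b-gguf");
console.log(mm.getLoadedModel()); // "example-1b-gguf"
```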

### Native Module Architecture

- `LocanaraClient` is only used for `initialize()`, `getDeviceCapability()`, and model management
- All AI features use built-in chains directly (not `LocanaraClient.executeFeature()`)
- `PrefilledMemory` adapts JS chat history `[{role, content}]` to the `Memory` protocol
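
The `PrefilledMemory` adaptation amounts to flattening that history into a prefill transcript. A sketch under stated assumptions: the real `Memory` protocol is native and its exact shape is not reproduced here.

```typescript
// Sketch: JS chat history in, a prefill transcript out.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

function toTranscript(history: ChatMessage[]): string {
  return history.map((m) => `${m.role}: ${m.content}`).join("\n");
}

const history: ChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello! How can I help?" },
];
console.log(toTranscript(history));
// "user: Hi\nassistant: Hello! How can I help?"
```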

### Web Implementation (`ExpoOndeviceAiModule.web.ts`)

Metro auto-resolves `.web.ts` over `.ts` for the web platform. The web module uses Chrome Built-in AI APIs (Gemini Nano) directly — no native bridge needed.

**Chrome APIs used:**

- `Summarizer` — text summarization (key-points mode, post-processed to match bullet count)
- `LanguageModel` — classify, extract, chat, chatStream, proofread (via structured JSON prompts)
- `Translator` — language translation
- `Rewriter` — text rewriting (tone/length mapping)
- `Writer` — fallback for proofread if LanguageModel unavailable

**Key implementation details:**

- **Availability detection**: Lenient checks with 3s timeout; accepts `readily`, `available`, `downloadable`, `after-download` statuses; falls back to API object existence
- **Streaming**: Uses `LanguageModel.promptStreaming()` with auto-detection of cumulative vs delta chunk format (varies by Chrome version)
- **Event emitter**: Web polyfill for Expo's native `addListener`/`removeListeners` pattern using a `Map<string, Set<Function>>`
- **Instance caching**: Summarizer, LanguageModel, Translator, Rewriter, Writer instances are cached and reused
- **Model management**: No-op on web (Chrome manages models automatically)
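
The bullets above can be sketched as pure helpers (illustrative only; the real module wraps live `availability()` calls and a `ReadableStream` from `promptStreaming()`). The status strings are the ones this guide lists, and the emitter mirrors the `Map<string, Set<Function>>` shape described.

```typescript
// Lenient availability check: any of these statuses counts as usable.
const USABLE = new Set(["readily", "available", "downloadable", "after-download"]);

function isUsable(status: string | undefined, apiExists: boolean): boolean {
  if (status !== undefined && USABLE.has(status)) return true;
  return apiExists; // fallback: accept mere existence of the API object
}

// Streaming chunks may be cumulative (each chunk extends the text so far)
// or deltas, depending on Chrome version; normalize both to deltas.
function toDeltas(chunks: string[]): string[] {
  const deltas: string[] = [];
  let acc = "";
  for (const chunk of chunks) {
    if (acc !== "" && chunk.startsWith(acc)) {
      deltas.push(chunk.slice(acc.length)); // cumulative format
      acc = chunk;
    } else {
      deltas.push(chunk); // already a delta
      acc += chunk;
    }
  }
  return deltas;
}

// Event-emitter polyfill shape behind addListener/removeListeners.
type Listener = (...args: unknown[]) => void;

class WebEventEmitter {
  private listeners = new Map<string, Set<Listener>>();

  addListener(event: string, fn: Listener): { remove: () => void } {
    if (!this.listeners.has(event)) this.listeners.set(event, new Set());
    this.listeners.get(event)!.add(fn);
    return { remove: () => void this.listeners.get(event)?.delete(fn) };
  }

  removeListeners(event: string): void {
    this.listeners.delete(event);
  }

  emit(event: string, ...args: unknown[]): void {
    this.listeners.get(event)?.forEach((fn) => fn(...args));
  }
}

console.log(toDeltas(["He", "Hello", "Hello wor"])); // ["He", "llo", " wor"]
```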

## Config Plugin (`withOndeviceAi.ts`)

The Expo config plugin automates native setup at prebuild time.
The bridge is discovered at runtime by `LlamaCppBridge.findBridge()` using `NSCl…`
### Key Build Settings

Bridge pod (`pod_target_xcconfig`):

- `SWIFT_INCLUDE_PATHS` / `FRAMEWORK_SEARCH_PATHS` → `$(PODS_CONFIGURATION_BUILD_DIR)` (for SPM modules)
- `IPHONEOS_DEPLOYMENT_TARGET` → `17.0` (LocalLLMClient requirement)
- `OTHER_SWIFT_FLAGS` → `-cxx-interoperability-mode=default -Xcc -std=c++20`

App target (`user_target_xcconfig`):

- `OTHER_LDFLAGS` → `-framework "llama"` (link dynamic framework)
- `FRAMEWORK_SEARCH_PATHS` → `$(PODS_CONFIGURATION_BUILD_DIR)` (find llama.framework)

Embed phase:

- Copies `llama.framework` from `PackageFrameworks/` to app's `Frameworks/`
- Re-signs with `EXPANDED_CODE_SIGN_IDENTITY`

```bash
# Run on iOS device
bun ios --device

# Run on Android
bun android

# Run on Web (Chrome 138+ required for AI features)
bun web
```

### App Structure
`.claude/guides/09-platform-differences.md` (155 changes: 106 additions, 49 deletions)
# Platform Feature Differences

This guide documents feature availability and implementation differences across iOS, Android, and Web platforms.

## Feature Availability Matrix

| Feature | iOS | Android | Web | Notes |
| ------------------------------- | --- | ------- | --- | ------------------------------------------ |
| **Core Framework** | ✅ | ✅ | ❌ | Native SDK only (iOS/Android) |
| Chains (7 built-in) | ✅ | ✅ | ❌ | Native SDK only |
| Pipeline DSL | ✅ | ✅ | ❌ | Native SDK only |
| Memory (Buffer/Summary) | ✅ | ✅ | ❌ | Native SDK only |
| Guardrails | ✅ | ✅ | ❌ | Native SDK only |
| Tools | ✅ | ✅ | ❌ | Native SDK only |
| Agent (ReAct-lite) | ✅ | ✅ | ❌ | Native SDK only |
| Session Management | ✅ | ✅ | ❌ | Native SDK only |
| **AI Features (via Libraries)** | | | | |
| Summarize | ✅ | ✅ | ✅ | Web: Chrome Summarizer API |
| Classify | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Extract | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Chat | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| Chat Stream | ✅ | ✅ | ✅ | Web: LanguageModel.promptStreaming() |
| Translate | ✅ | ✅ | ✅ | Web: Chrome Translator API |
| Rewrite | ✅ | ✅ | ✅ | Web: Chrome Rewriter API |
| Proofread | ✅ | ✅ | ✅ | Web: Chrome LanguageModel API |
| **On-Device AI Backends** | | | | |
| Apple Intelligence | ✅ | ❌ | ❌ | iOS 26+, macOS 26+ only |
| Gemini Nano | ❌ | ✅ | ✅ | Android 14+ / Chrome 138+ |
| Chrome Built-in AI | ❌ | ❌ | ✅ | Chrome 138+ (Summarizer, Translator, etc.) |
| **External Model Support** | | | | |
| llama.cpp (GGUF) | ✅ | ❌ | ❌ | iOS 17+ via LocalLLMClient |
| ExecuTorch (GGUF) | ❌ | ✅ | ❌ | Android API 26+ |
| **Engine System** | ✅ | ✅ | ❌ | Native SDK only |
| InferenceRouter | ✅ | ✅ | ❌ | Auto-routing to active engine |
| ModelManager | ✅ | ✅ | ❌ | Download/load/unload GGUF models |
| ModelRegistry | ✅ | ✅ | ❌ | Available model catalog |
| DeviceCapabilityDetector | ✅ | ❌ | ❌ | iOS-only hardware detection |
| **RAG** | ✅ | ✅ | ❌ | Native SDK only |
| VectorStore | ✅ | ✅ | ❌ | In-memory vector storage |
| DocumentChunker | ✅ | ✅ | ❌ | Multiple chunking strategies |
| EmbeddingEngine | ✅ | ✅ | ❌ | Text embedding generation |
| RAGManager | ✅ | ✅ | ❌ | Collection management |
| RAGQueryEngine | ✅ | ✅ | ❌ | Query pipeline |
| **Personalization** | ✅ | ✅ | ❌ | Native SDK only |
| PersonalizationManager | ✅ | ✅ | ❌ | Feedback orchestration |
| FeedbackCollector | ✅ | ✅ | ❌ | User feedback collection |
| PreferenceAnalyzer | ✅ | ✅ | ❌ | Pattern analysis |
| PromptOptimizer | ✅ | ✅ | ❌ | Adaptive prompts |
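
App code consuming these libraries typically gates calls on the current platform. A minimal sketch, with the availability map hand-copied from a few rows of the matrix above (not a real API):

```typescript
// Hand-copied subset of the matrix; illustrative only.
type Platform = "ios" | "android" | "web";

const availability: Record<string, Platform[]> = {
  summarize: ["ios", "android", "web"],
  translate: ["ios", "android", "web"],
  rag: ["ios", "android"],
  "llama.cpp": ["ios"],
  executorch: ["android"],
};

function isAvailable(feature: string, platform: Platform): boolean {
  return availability[feature]?.includes(platform) ?? false;
}

console.log(isAvailable("summarize", "web")); // true
console.log(isAvailable("rag", "web")); // false
```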

## Platform-Specific APIs

```kotlin
val result = SummarizeChain(model).run("text")
```

**Available:** Android API 26+

### Web-Only Features (via expo-ondevice-ai)

#### Chrome Built-in AI

Web support is available via `expo-ondevice-ai` using Chrome Built-in AI APIs. The web module (`ExpoOndeviceAiModule.web.ts`) maps each feature to the appropriate Chrome API:

```typescript
// All 8 AI features work on web via Chrome Built-in AI
import { summarize, classify, chat, translate } from "expo-ondevice-ai";

const result = await summarize("Long text...", { outputType: "THREE_BULLETS" });
const translation = await translate("Hello", { targetLanguage: "ko" });
```

**Chrome APIs used:**

| Chrome API | Features | Notes |
| --------------------------------- | ---------------------------------- | --------------------------------------------- |
| `Summarizer` | summarize | key-points mode, bullet count post-processing |
| `LanguageModel` | classify, extract, chat, proofread | Structured JSON prompts |
| `LanguageModel.promptStreaming()` | chatStream | Auto-detects cumulative vs delta chunks |
| `Translator` | translate | Per-language-pair caching |
| `Rewriter` | rewrite | Tone/length mapping from SDK types |

**Available:** Chrome 138+ with `chrome://flags/#optimization-guide-on-device-model` enabled

**Note:** `react-native-ondevice-ai` does NOT support web (Nitro Modules are native-only). Web users should use `expo-ondevice-ai` or `packages/web` standalone SDK.

## Implementation Differences

### Error Handling
When a feature is only available on one platform, use suffixes.

### iOS

| Requirement | Minimum | Recommended |
| ------------- | ------- | ----------------------------- |
| iOS Version | 17.0 | 26.0 (for Apple Intelligence) |
| macOS Version | 14.0 | 26.0 (for Apple Intelligence) |
| Xcode | 16.0 | 16.0+ |
| Swift | 6.0 | 6.0+ |

### Android

| Requirement | Minimum | Recommended |
| -------------- | -------- | ------------------- |
| Android API | 26 | 34 (for Prompt API) |
| Kotlin | 2.0 | 2.0+ |
| Android Studio | 2024.1.1 | Latest |
| Gradle | 8.0 | 8.0+ |

### Web

| Requirement | Minimum | Recommended |
| ----------- | ----------------- | ----------------- |
| Chrome | 138 | Latest |
| Gemini Nano | Enabled via flags | Enabled via flags |
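
A runtime guard for the Chrome 138+ requirement might parse the user-agent string. Sketch only: as the web module itself does, prefer feature detection (checking that the API objects exist) over version sniffing in real code.

```typescript
// Extract the Chrome/Chromium major version from a UA string.
function chromeMajor(ua: string): number | null {
  const m = ua.match(/Chrom(?:e|ium)\/(\d+)/);
  return m ? parseInt(m[1], 10) : null;
}

function meetsWebRequirement(ua: string): boolean {
  const v = chromeMajor(ua);
  return v !== null && v >= 138;
}

console.log(meetsWebRequirement(
  "Mozilla/5.0 (Macintosh) AppleWebKit/537.36 Chrome/140.0.0.0 Safari/537.36"
)); // true
```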

## Testing Platform-Specific Features

```bash
adb install -r example/build/outputs/apk/debug/example-debug.apk
```
3. Replace `Flow` → `AsyncThrowingStream`
4. Add `@available` annotations for iOS 26+ APIs

### Web (Expo)

```bash
# Run Expo example app on web
cd libraries/expo-ondevice-ai/example
bun web

# Requires Chrome 138+ with chrome://flags/#optimization-guide-on-device-model enabled
```

## Summary

- **Core framework** is identical across iOS and Android (Chains, Pipeline, Memory, Guardrails, Tools, Agent, Session)
- **Engine, RAG, Personalization** layers available on iOS and Android
- **AI features** (summarize, classify, etc.) available on all 3 platforms via library wrappers
- **On-device AI backends** differ by platform (Apple Intelligence / Gemini Nano / Chrome Built-in AI)
- **External models** supported on iOS (llama.cpp) and Android (ExecuTorch), not on web
- **Web support** available via `expo-ondevice-ai` only (not `react-native-ondevice-ai`)
- **API naming** is identical for shared features, suffixed for platform-specific features
- Always test on **real devices** for accurate on-device AI behavior