diff --git a/docs/FEATURES.md b/docs/FEATURES.md
index 1555c007..ef793cdd 100644
--- a/docs/FEATURES.md
+++ b/docs/FEATURES.md
@@ -15,6 +15,7 @@ Canonical reference for all Osaurus features, their status, and documentation.
 | Remote MCP Providers | Stable | "Key Features" | REMOTE_MCP_PROVIDERS.md | Managers/MCPProviderManager.swift, Tools/MCPProviderTool.swift |
 | MCP Server | Stable | "MCP Server" | (in README) | Networking/OsaurusServer.swift, Services/MCP/MCPServerManager.swift |
 | Tools & Plugins | Stable | "Tools & Plugins" | PLUGIN_AUTHORING.md | Tools/, Managers/Plugin/PluginManager.swift, Services/Plugin/PluginHostAPI.swift, Storage/PluginDatabase.swift, Models/Plugin/PluginHTTP.swift, Views/Plugin/PluginConfigView.swift |
+| Business File Runtime | Experimental | "Tools & Plugins" | OFFICE_RUNTIME.md | Managers/Documents/DocumentFormatRegistry.swift, Models/Documents/DocumentFormatAdapter.swift, Models/Documents/DocumentFormatEmitter.swift |
 | Skills | Stable | "Skills" | SKILLS.md | Managers/SkillManager.swift, Views/Skill/SkillsView.swift, Services/Skill/SkillSearchService.swift |
 | Methods | Stable | "Skills & Methods" | SKILLS.md | Models/Method/Method.swift, Services/Method/MethodService.swift, Services/Method/MethodSearchService.swift, Storage/MethodDatabase.swift |
 | Context Management | Stable | - | SKILLS.md | Services/Context/PreflightCapabilitySearch.swift, Tools/CapabilityTools.swift, Services/Tool/ToolSearchService.swift, Services/Tool/ToolIndexService.swift |
@@ -744,6 +745,29 @@ See [PLUGIN_AUTHORING.md](PLUGIN_AUTHORING.md) for the full reference.
 
 ---
 
+### Business File Runtime
+
+**Purpose:** Support high-fidelity office artifacts with native paths first and optional office-runtime enhancement when available.
+
+**Components:**
+
+- `Managers/Documents/DocumentFormatRegistry.swift` — Process-wide registry for document adapters, emitters, and streamers
+- `Models/Documents/DocumentFormatAdapter.swift` — Read-side format adapter protocol
+- `Models/Documents/DocumentFormatEmitter.swift` — Write-side format emitter protocol
+- `Models/Documents/StructuredDocument.swift` — Typed document representation plus text view for agent context
+
+**PPTX capability model:**
+
+| Capability | User-facing meaning |
+| ---------------------------------- | ------------------------------------------------------------- |
+| **PPTX: Native** | PPTX workflows that do not require external office software |
+| **PPTX: Enhanced with LibreOffice** | Validation, PDF export, and slide previews via office runtime |
+| **LibreOffice not found** | Enhanced preview/export unavailable; native PPTX remains usable |
+
+LibreOffice and OpenOffice are optional. Osaurus should detect them on demand, never bundle them, and never force installation. See [OFFICE_RUNTIME.md](OFFICE_RUNTIME.md) for runtime detection, supported enhanced flows, and install-hint wording.
+
+---
+
 ### Skills
 
 **Purpose:** Import and manage reusable AI capabilities following the Agent Skills specification.
@@ -1125,6 +1149,7 @@ Eight settings total, down from v1's 18. The per-section budget knobs, MMR tunin
 | [MEMORY.md](MEMORY.md) | Memory system and configuration guide |
 | [SANDBOX.md](SANDBOX.md) | Sandbox VM and plugin guide |
 | [PLUGIN_AUTHORING.md](PLUGIN_AUTHORING.md) | Creating custom plugins |
+| [OFFICE_RUNTIME.md](OFFICE_RUNTIME.md) | Optional office runtime detection and enhanced business-file flows |
 | [OpenAI_API_GUIDE.md](OpenAI_API_GUIDE.md) | API usage, tool calling, streaming |
 | [SHARED_CONFIGURATION_GUIDE.md](SHARED_CONFIGURATION_GUIDE.md) | Shared configuration for teams |
 | [CONTRIBUTING.md](CONTRIBUTING.md) | Contribution guidelines |
diff --git a/docs/OFFICE_RUNTIME.md b/docs/OFFICE_RUNTIME.md
new file mode 100644
index 00000000..f6c5a591
--- /dev/null
+++ b/docs/OFFICE_RUNTIME.md
@@ -0,0 +1,60 @@
+# Office Runtime
+
+Osaurus business-file workflows are designed to run in two layers:
+
+1. **Native paths** — Create or inspect supported file formats directly in Osaurus.
+2. **Enhanced paths** — Use a locally installed office runtime for validation, PDF export, and previews when that runtime is available.
+
+LibreOffice and OpenOffice are optional. Osaurus does not bundle either application, does not download them automatically, and does not require users to install them for native document or presentation workflows.
+
+---
+
+## Runtime Detection
+
+Enhanced flows should detect an installed office runtime on demand. Typical discovery locations include:
+
+- The `soffice` command on `PATH`
+- `/Applications/LibreOffice.app/Contents/MacOS/soffice`
+- `/Applications/OpenOffice.app/Contents/MacOS/soffice`
+
+Detection should be fast, local, and non-invasive. If neither runtime is found, Osaurus should continue with native functionality and report that enhanced preview/export is unavailable.
+
+---
+
+## Supported Enhanced Flows
+
+When LibreOffice or OpenOffice is installed, Osaurus can use it for workflows that benefit from an independent office renderer:
+
+| Flow | Purpose |
+| ----------------- | ----------------------------------------------------------- |
+| Validation | Open or convert a generated office file to catch corruption |
+| PDF export | Export DOCX, XLSX, or PPTX artifacts to PDF |
+| Slide previews | Render presentation slides for visual review |
+
+These enhanced flows are best-effort additions to native file support. They should not change the requirement that native generation and analysis paths behave predictably without office software.
+
+---
+
+## PPTX Capability Model
+
+PPTX support is expressed as two capability paths:
+
+| Capability | Meaning |
+| ---------------------------------- | ----------------------------------------------------------------------- |
+| **PPTX: Native** | Generate or inspect PPTX artifacts without external office software |
+| **PPTX: Enhanced with LibreOffice** | Use LibreOffice or OpenOffice for validation, PDF export, and previews |
+| **LibreOffice not found** | Enhanced preview/export unavailable; native PPTX workflows remain usable |
+
+User-facing capability text should be explicit about which path is active. Avoid wording that implies LibreOffice is bundled or required.
+
+---
+
+## Install Hint
+
+Documentation may tell users how to install an office runtime when they want enhanced preview/export:
+
+```bash
+brew install --cask libreoffice
+```
+
+This is a documentation hint only. Osaurus should not force installation, prompt for automatic installation, or make enhanced flows a prerequisite for native business-file work.
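The detection order described under Runtime Detection in `docs/OFFICE_RUNTIME.md` above can be sketched in shell as follows. This is a minimal illustration under stated assumptions, not Osaurus code: the function names `find_office_runtime` and `office_capability_label` are hypothetical, and the sketch assumes a POSIX shell. It checks `soffice` on `PATH` first, then the two app-bundle paths from the list, and maps the result to the labels used in the PPTX Capability Model table.

```bash
#!/bin/sh
# Sketch only: both function names are hypothetical helpers for illustration
# and are not part of Osaurus.

# Print the path of the first office runtime found and return 0.
# Return 1 (printing nothing) when no runtime is found.
# Optional arguments replace the default app-bundle candidates.
find_office_runtime() {
  # 1. Prefer whatever `soffice` resolves to on PATH.
  if command -v soffice >/dev/null 2>&1; then
    command -v soffice
    return 0
  fi

  # 2. Fall back to the app-bundle locations named in the document.
  if [ "$#" -eq 0 ]; then
    set -- \
      "/Applications/LibreOffice.app/Contents/MacOS/soffice" \
      "/Applications/OpenOffice.app/Contents/MacOS/soffice"
  fi
  for candidate in "$@"; do
    # Require a regular, executable file; a directory does not count.
    if [ -f "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done

  return 1
}

# Map the detection result to the user-facing capability text.
# Native PPTX support is unaffected either way; only the enhanced
# flows (validation, PDF export, previews) depend on the result.
office_capability_label() {
  if find_office_runtime "$@" >/dev/null; then
    echo "PPTX: Enhanced with LibreOffice"
  else
    echo "LibreOffice not found"
  fi
}
```

Calling `office_capability_label` with no arguments checks the default locations. Passing explicit paths replaces the app-bundle candidates, which is mainly useful for testing. The lookup only reads `PATH` and the filesystem and never installs or launches anything, in keeping with the requirement that detection be fast, local, and non-invasive.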
diff --git a/docs/PLUGIN_AUTHORING.md b/docs/PLUGIN_AUTHORING.md
index bc9b2114..824cfed5 100644
--- a/docs/PLUGIN_AUTHORING.md
+++ b/docs/PLUGIN_AUTHORING.md
@@ -21,6 +21,7 @@ This document describes how to build external plugins for Osaurus using the Gene
 - [Plugin Skills (SKILL.md)](#plugin-skills-skillmd)
 - [Plugin Documentation](#plugin-documentation)
 - [Artifact Handling](#artifact-handling)
+ - [Business File Formats](#business-file-formats)
 - [Host API Reference](#host-api-reference)
 - [Config Store](#config-store)
 - [Data Store](#data-store)
@@ -1128,6 +1129,47 @@ const char* invoke(osr_plugin_ctx_t ctx, const char* type,
 - Only plugins with ABI version 2 or higher are eligible for artifact handling.
 - Artifacts produced during plugin-initiated inference (`complete` / `complete_stream` with `share_artifact` in the agentic loop) are fully processed and trigger artifact handler notifications, just like artifacts produced from a chat session.
 
+### Business File Formats
+
+Osaurus has an internal document-format registry that splits read-side adapters from write-side emitters. That split is the intended shape for high-fidelity business-file plugins:
+
+- **Parser/adapter side** — Claim a file format, parse bytes into a typed document representation, and provide a text view for the agent.
+- **Emitter side** — Claim a typed document representation and write a concrete file format such as PPTX.
+- **Enhancement side** — Optionally call out to locally installed office software for validation, PDF export, or preview generation.
+
+On current `main`, this split is represented in Swift by `DocumentFormatRegistry.register(adapter:)` and `DocumentFormatRegistry.register(emitter:)`. The plugin C ABI does not yet expose host callbacks named `register_parser` or `register_emitter`, so external plugins should not rely on those names as shipped API. Until that ABI surface exists, use normal plugin tools and artifact handling to create, read, validate, or share business files.
+
+#### Intended PPTX Split
+
+A PPTX plugin should keep native generation separate from enhanced office-runtime work:
+
+```json
+{
+  "plugin_id": "com.acme.pptx",
+  "version": "1.0.0",
+  "description": "Creates and validates PowerPoint presentations",
+  "capabilities": {
+    "tools": [
+      {
+        "id": "create_presentation",
+        "description": "Generate a PPTX file from a structured deck outline"
+      },
+      {
+        "id": "validate_presentation",
+        "description": "Validate a PPTX with an optional local office runtime"
+      },
+      {
+        "id": "export_presentation_pdf",
+        "description": "Export PDF when LibreOffice or OpenOffice is available"
+      }
+    ],
+    "artifact_handler": true
+  }
+}
+```
+
+Native PPTX generation should work without external office software. Enhanced validation, PDF export, and slide previews should detect LibreOffice or OpenOffice at runtime and report a clear unavailable state when neither is installed.
+
 ---
 
 ## Host API Reference
diff --git a/docs/SKILLS.md b/docs/SKILLS.md
index 648ad29b..1003e5b7 100644
--- a/docs/SKILLS.md
+++ b/docs/SKILLS.md
@@ -232,6 +232,25 @@ No per-agent skill configuration is needed. The system automatically matches the
 
 ---
 
+## High-fidelity business files
+
+Business-file workflows use the same on-demand skill activation model as other Osaurus skills. Specialized skills such as `presentation-content-producer` and `document-data-analyst` should be installed and discoverable by default, but they should not be loaded into every chat.
+
+The lightest activation strategy is:
+
+1. Keep the skill package installed so preflight search and runtime discovery can find it.
+2. Give the skill a precise description that names the work it owns, such as presentation narrative, slide structure, spreadsheet analysis, or document data extraction.
+3. Let preflight capability search select the skill only when the user asks for that kind of business-file work.
+4. Use runtime discovery (`capabilities_search` then `capabilities_load`) when a conversation shifts into business-file work after it has already started.
+
+`presentation-content-producer` is intended for presentation planning and content decisions: storyline, audience fit, slide-by-slide structure, speaker notes, and refinement passes before or after a PPTX artifact is produced.
+
+`document-data-analyst` is intended for analytical document work: extracting tables and facts, summarizing business documents, checking consistency, and turning structured document or spreadsheet content into analysis.
+
+This keeps ordinary chats lean while still making high-fidelity business-file skills available at the moment they are useful.
+
+---
+
 ## Troubleshooting
 
 ### Skills not appearing in chat