Fix Ollama media handling, crash RCA, and capability-aware model selection #65
manideepsp wants to merge 2 commits into Prat011:master from
Conversation
Pull request overview
Fixes Ollama-mode media-analysis crashes caused by Gemini-only calls, and introduces capability-aware Ollama model discovery/selection for vision/audio workflows across the Electron main process and the renderer model selector UI.
Changes:
- Added Gemini model guard + provider branching so image/audio flows don’t dereference a null Gemini model in Ollama mode.
- Implemented Ollama `/api/chat` paths for image (and best-effort audio) analysis, plus heuristic capability inference from `/api/tags`.
- Exposed Ollama model capability metadata via IPC/preload and updated the UI to show capability badges and install guidance.
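The heuristic capability inference from `/api/tags` could be sketched as a name-based check, since that endpoint lists model names but does not report modalities. The hint list, interface shape, and function names below are assumptions for illustration, not the PR's actual code:

```typescript
// Hypothetical sketch: infer vision/audio support from well-known model-name hints.
interface OllamaModelCapability {
  name: string;
  vision: boolean;
  audio: boolean;
}

// Illustrative hint list; the PR's real heuristics may differ.
const VISION_HINTS = ["llava", "bakllava", "moondream", "minicpm-v", "vision"];

function inferCapabilities(modelNames: string[]): OllamaModelCapability[] {
  return modelNames.map((name) => {
    const lower = name.toLowerCase();
    return {
      name,
      vision: VISION_HINTS.some((hint) => lower.includes(hint)),
      // Audio support is rare among Ollama models, hence "best-effort".
      audio: false,
    };
  });
}
```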
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| `electron/LLMHelper.ts` | Adds provider-safe execution, Ollama image/audio chat calls, capability inference, and auto-selection logic. |
| `electron/ipcHandlers.ts` | Exposes a new IPC handler to fetch Ollama model capabilities. |
| `electron/preload.ts` | Adds `getOllamaModelCapabilities` to the renderer-facing Electron API. |
| `src/components/ui/ModelSelector.tsx` | Fetches capabilities, shows capability tags/badges, and displays install hints in the model dropdown UI. |
| `src/App.tsx` | Extends the renderer global `window.electronAPI` typing with the new capabilities method. |
| `src/types/electron.d.ts` | Updates shared renderer typings for the expanded Electron API surface. |
Comments suppressed due to low confidence (1)
src/components/ui/ModelSelector.tsx:52
- When the current config is already Ollama, `loadCurrentConfig()` calls `loadOllamaModels()`, and the `useEffect([selectedProvider])` will also call `loadOllamaModels()` after `setSelectedProvider('ollama')`, causing duplicate fetches and potential state races. Consider choosing one mechanism (either the effect or the explicit call) and removing the other, or add a guard to avoid the second call when models are already loaded.
```tsx
useEffect(() => {
  if (selectedProvider === 'ollama') {
    loadOllamaModels();
  }
}, [selectedProvider]);

const loadCurrentConfig = async () => {
  try {
    setIsLoading(true);
    const config = await window.electronAPI.getCurrentLlmConfig();
    setCurrentConfig(config);
    setSelectedProvider(config.provider);
    if (config.isOllama) {
      setSelectedOllamaModel(config.model);
      await loadOllamaModels();
    }
```
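One way to implement the suggested guard is to dedupe the loader itself, so whichever path calls it second reuses the in-flight request. A minimal sketch (the wrapper name is illustrative, not from the PR):

```typescript
// Sketch: wrap an async loader so concurrent/duplicate calls share one promise.
function createSingleFlight<T>(load: () => Promise<T>): () => Promise<T> {
  let inflight: Promise<T> | null = null;
  return () => {
    if (!inflight) {
      // Clear the slot once settled so later calls can refresh.
      inflight = load().finally(() => {
        inflight = null;
      });
    }
    return inflight;
  };
}
```

Both `loadCurrentConfig()` and the `[selectedProvider]` effect could then call the wrapped loader without triggering a second fetch.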
src/components/ui/ModelSelector.tsx (outdated)

```diff
 const loadOllamaModels = async () => {
   try {
-    const models = await window.electronAPI.getAvailableOllamaModels();
+    const capabilities = await window.electronAPI.getOllamaModelCapabilities();
+    const models = capabilities.map((capability) => capability.name);

     setOllamaModelCapabilities(capabilities);
     setAvailableOllamaModels(models);
```
`loadOllamaModels` always queries model capabilities from whatever `ollamaUrl` is currently stored in the main process (defaulting to `http://localhost:11434`). Since the user can edit `ollamaUrl` in this component before clicking “Apply Changes”, the model list/capability badges can be fetched from the wrong host and appear inconsistent with the entered URL. Consider allowing `getOllamaModelCapabilities` (and/or `getAvailableOllamaModels`) to accept a URL argument, or add an IPC method to set the Ollama URL used for discovery prior to switching providers.
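If the URL-argument route is taken, the renderer could resolve the effective base URL before each discovery call. A hypothetical helper (name and behavior are assumptions; the PR may instead choose the IPC set-URL approach):

```typescript
// Hypothetical helper: prefer the URL currently typed in the selector UI over
// the one stored in the main process, so discovery hits the intended host.
function resolveOllamaBaseUrl(enteredUrl: string | undefined, storedUrl: string): string {
  const candidate = (enteredUrl ?? "").trim();
  const base = candidate.length > 0 ? candidate : storedUrl;
  // Strip trailing slashes so paths like `${base}/api/tags` join cleanly.
  return base.replace(/\/+$/, "");
}
```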
```ts
private async ensureOllamaCapability(modality: "vision" | "audio"): Promise<void> {
  if (!this.useOllama) return

  const capabilities = await this.getOllamaModelCapabilities()
  if (capabilities.length === 0) {
    throw new Error(`No Ollama models detected. ${this.getOllamaInstallGuidance(modality)}`)
  }
```
ensureOllamaCapability() calls getOllamaModelCapabilities(), which fetches /api/tags on every image/audio request. For media-heavy flows this adds repeated network round-trips and can become a noticeable latency bottleneck. Consider caching the capabilities for a short TTL (or until the model list changes) and reusing them within a session, invalidating on switchToOllama() / refresh.
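The suggested short-TTL cache could be sketched as a small generic class (names and the time-injection style are illustrative, not the PR's code):

```typescript
// Sketch: cache capability results for a short TTL; `now` is injectable for testing.
class TtlCache<T> {
  private value: T | null = null;
  private fetchedAt = 0;

  constructor(private readonly ttlMs: number) {}

  get(now: number = Date.now()): T | null {
    if (this.value !== null && now - this.fetchedAt < this.ttlMs) {
      return this.value;
    }
    return null;
  }

  set(value: T, now: number = Date.now()): void {
    this.value = value;
    this.fetchedAt = now;
  }

  // Call from switchToOllama() / a manual refresh to force a refetch.
  invalidate(): void {
    this.value = null;
  }
}
```

`ensureOllamaCapability()` would then consult the cache first and only hit `/api/tags` on a miss.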
```ts
} catch (error) {
  console.error("[LLMHelper] Error calling Ollama chat with images:", error)
  throw new Error(`Failed Ollama image analysis: ${error.message}. Ensure selected Ollama model supports vision.`)
}
```
In the catch block, the code assumes error has a .message property (${error.message}), but non-Error throwables (or some fetch failures) can be strings/objects, which can cause a secondary crash while building the error message. Prefer narrowing (error instanceof Error) and falling back to String(error) (or centralize via a getErrorMessage() helper) before interpolating.
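The centralized `getErrorMessage()` helper the review suggests might look like this (a sketch; the exact fallbacks are a judgment call):

```typescript
// Sketch: safely extract a message from any throwable before interpolating it.
function getErrorMessage(error: unknown): string {
  if (error instanceof Error) return error.message;
  if (typeof error === "string") return error;
  try {
    // JSON.stringify(undefined) returns undefined, so fall back to String().
    return JSON.stringify(error) ?? String(error);
  } catch {
    // JSON.stringify can throw on circular structures.
    return String(error);
  }
}
```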
electron/LLMHelper.ts (outdated)

```ts
} catch (error) {
  errors.push(error.message)
```
The catch block pushes error.message into the errors list without narrowing the caught value. If a non-Error is thrown, this will throw again and hide the original failure. Use error instanceof Error ? error.message : String(error) here (and in other error-message interpolations) to avoid secondary exceptions.
```diff
-} catch (error) {
-  errors.push(error.message)
+} catch (error: unknown) {
+  const message = error instanceof Error ? error.message : String(error)
+  errors.push(message)
```
- Introduced `bootstrap_vosk_model.py` for downloading and extracting Vosk models.
- Added `stt_stream.py` to handle real-time speech-to-text streaming using Vosk.
- Implemented audio processing utilities in `audio.ts` for handling audio data conversion and preparation.
- Enhanced audio input handling with support for microphone and system audio capture.
- Added error handling and status reporting for audio stream initialization and processing.
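Audio conversion utilities like those described for `audio.ts` typically include a Float32-to-16-bit-PCM step, since Web Audio capture yields Float32 samples in [-1, 1] while Vosk consumes 16-bit PCM. A hedged sketch (the function name and exact scaling are assumptions, not the commit's code):

```typescript
// Sketch (assumption): clamp Float32 samples to [-1, 1] and rescale to Int16
// so the stream can be fed to a 16-bit PCM consumer such as Vosk.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative range is one step wider (-32768) than positive (32767).
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```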
Summary
This PR fixes repeated runtime crashes in Ollama mode during media analysis and adds capability-aware model handling for image/audio workflows.
Bug Details
In Ollama mode, the media-analysis paths were still issuing Gemini-only calls. When Ollama was selected, the Gemini model instance was null by design, which caused null-dereference crashes during image/audio analysis.
Root Cause Analysis
Fix Implemented
1) Provider-safe execution paths
2) Ollama image support
3) Ollama audio support path
4) Capability detection and auto-selection
5) IPC and UI exposure
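For item 2, Ollama's `/api/chat` endpoint accepts base64-encoded images on a user message. A hedged sketch of how the request body might be assembled (interface and function names are illustrative, not the PR's code):

```typescript
// Sketch: Ollama's /api/chat takes per-message base64 images for vision models.
interface OllamaChatBody {
  model: string;
  stream: boolean;
  messages: Array<{
    role: "system" | "user" | "assistant";
    content: string;
    images?: string[]; // base64-encoded image payloads
  }>;
}

function buildImageChatBody(
  model: string,
  prompt: string,
  imagesBase64: string[]
): OllamaChatBody {
  return {
    model,
    stream: false, // request a single JSON response instead of a stream
    messages: [{ role: "user", content: prompt, images: imagesBase64 }],
  };
}
```

The helper's output would then be POSTed to `${ollamaUrl}/api/chat` by the provider-safe execution path.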
Files Changed
Validation
Behavioral Impact
Notes
Example Install Guidance
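A hypothetical sketch of what the guidance helper referenced in the review snippets (`getOllamaInstallGuidance`) might return; the wording and model names are illustrative, not the PR's actual strings:

```typescript
// Hypothetical guidance strings surfaced when no capable model is installed.
function getOllamaInstallGuidance(modality: "vision" | "audio"): string {
  if (modality === "vision") {
    return "Install a vision-capable model, e.g. `ollama pull llava`, then retry.";
  }
  return "Ollama audio support is best-effort; no dedicated audio model is assumed.";
}
```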
Risk Assessment
Low-to-medium:
Follow-up (Optional)