Fix Ollama media handling, crash RCA, and capability-aware model selection#65

Open
manideepsp wants to merge 2 commits into Prat011:master from manideepsp:fix/ollama-media-capabilities

Conversation

@manideepsp

Summary

This PR fixes repeated runtime crashes in Ollama mode during media analysis and adds capability-aware model handling for image/audio workflows.

Bug Details

In Ollama mode, media analysis paths were still using Gemini-only calls:

  • analyzeAudioFromBase64 and analyzeAudioFile called Gemini generateContent
  • analyzeImageFile and image-debug/image-extraction paths could also route to Gemini-only behavior

When Ollama was selected, the Gemini model instance was null by design, which caused:

  • TypeError: Cannot read properties of null (reading 'generateContent')

Root Cause Analysis

  1. Provider mismatch in media paths
  • Ollama mode sets useOllama=true and does not initialize the Gemini model.
  • Several media methods still dereferenced this.model.generateContent.
  2. Missing capability awareness for Ollama models
  • The app fetched model names but had no concept of modality support.
  • Image and audio calls did not verify whether the selected Ollama model could process those modalities.
  3. No guidance or fallbacks
  • Failures surfaced as generic runtime exceptions rather than actionable remediation.

Fix Implemented

1) Provider-safe execution paths

  • Added Gemini guard helper to prevent null dereference for Gemini-only invocations.
  • Updated generation paths to branch by active provider.
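
A minimal sketch of such a guard, assuming a helper along these lines (the name requireGeminiModel and the error text are illustrative, not the exact code in this PR):

```typescript
// Provider-safe guard: refuse to dereference the Gemini model when Ollama is
// active or the model was never initialized, instead of crashing with a
// null-dereference TypeError. Names here are illustrative assumptions.
type Provider = "gemini" | "ollama";

function requireGeminiModel<T>(model: T | null, provider: Provider): T {
  if (provider !== "gemini") {
    // In Ollama mode the Gemini model is null by design; fail loudly and early.
    throw new Error("Gemini-only call path reached while another provider is active.");
  }
  if (model === null) {
    throw new Error("Gemini model is not initialized. Configure Gemini or switch providers.");
  }
  return model;
}
```

Call sites that previously did `this.model.generateContent(...)` would first pass `this.model` through the guard, so a misrouted call surfaces as an actionable error rather than a TypeError.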

2) Ollama image support

  • Added Ollama multimodal image handling via /api/chat with messages[].images.
  • Image extraction/debug/analysis now work in Ollama mode when the selected model supports vision.
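
The request relies on Ollama's /api/chat accepting base64-encoded images in messages[].images. A sketch of the payload construction (the helper and type names are illustrative assumptions):

```typescript
// Shape of an Ollama /api/chat vision request: base64 image data rides in the
// images array of a user message. Helper name is illustrative.
interface OllamaChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
  images?: string[];
}

function buildImageChatPayload(model: string, prompt: string, imagesBase64: string[]) {
  const messages: OllamaChatMessage[] = [
    { role: "user", content: prompt, images: imagesBase64 },
  ];
  // stream: false yields a single JSON response instead of a token stream.
  return { model, stream: false, messages };
}
```

A caller would POST this JSON to `<ollamaUrl>/api/chat` and read the reply from `message.content` in the response body.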

3) Ollama audio support path

  • Added best-effort Ollama audio analysis via /api/chat.
  • Tries compatible payload variants for broader Ollama/model compatibility.
  • Returns actionable install guidance if audio is unsupported by current installation/model.
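
The variant-trying logic can be sketched generically. In the PR the attempts wrap different /api/chat payload shapes and are async; here they are abstract callables so the error-collection behavior is the focus (names and error text are illustrative):

```typescript
// Best-effort execution: try each payload variant in order, collect failure
// messages, and only give up (with guidance) after all variants fail.
function tryVariants<T>(attempts: Array<() => T>, guidance: string): T {
  const errors: string[] = [];
  for (const attempt of attempts) {
    try {
      return attempt();
    } catch (error: unknown) {
      // Narrow before reading .message so non-Error throwables don't crash here.
      errors.push(error instanceof Error ? error.message : String(error));
    }
  }
  throw new Error(`All variants failed (${errors.join("; ")}). ${guidance}`);
}
```

The first variant that succeeds short-circuits the loop, so the common case pays for only one request.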

4) Capability detection and auto-selection

  • Added capability inference from Ollama /api/tags model metadata (name + families/details) for:
    • supportsVision
    • supportsAudio
  • Before media analysis, the helper now:
    • validates current model capability
    • auto-switches to an installed capability-matching model when available
    • emits clear install guidance when no capable model exists
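
The inference step can be sketched as a keyword match over model name and families; the keyword lists below are illustrative assumptions, not the exact heuristics used in the PR:

```typescript
// Heuristic capability inference from /api/tags metadata (model name plus
// reported families). Keyword lists are illustrative assumptions.
interface OllamaModelCapabilities {
  name: string;
  supportsVision: boolean;
  supportsAudio: boolean;
}

function inferCapabilities(name: string, families: string[] = []): OllamaModelCapabilities {
  const haystack = [name, ...families].join(" ").toLowerCase();
  const visionHints = ["vision", "llava", "clip", "mllama", "bakllava"];
  const audioHints = ["audio"];
  return {
    name,
    supportsVision: visionHints.some((hint) => haystack.includes(hint)),
    supportsAudio: audioHints.some((hint) => haystack.includes(hint)),
  };
}
```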

5) IPC and UI exposure

  • Exposed capability metadata through Electron IPC/preload APIs.
  • Model selector now shows capability badges (vision/audio), selected model capability summary, and install hints when missing.

Files Changed

  • electron/LLMHelper.ts
  • electron/ipcHandlers.ts
  • electron/preload.ts
  • src/components/ui/ModelSelector.tsx
  • src/App.tsx
  • src/types/electron.d.ts

Validation

  • Electron typecheck: npx tsc -p electron/tsconfig.json
  • Workspace typecheck: npx tsc --noEmit
  • Result: no TypeScript errors.

Behavioral Impact

  • Eliminates null dereference crashes in Ollama mode for media-triggered flows.
  • Enables image analysis in Ollama mode when a vision-capable model is installed.
  • Adds best-effort audio path in Ollama mode, with explicit guidance when unsupported.

Notes

  • Capability detection is heuristic-based from Ollama model metadata and naming.
  • Audio support depends on Ollama version and model-specific multimodal support.

Example Install Guidance

  • Vision-capable models:
    • ollama pull llama3.2-vision:11b
    • ollama pull llava:7b
  • Audio-capable models (if available in your Ollama build):
    • ollama pull qwen2-audio:7b

Risk Assessment

Low-to-medium:

  • Adds provider checks and fallback logic but keeps existing API surface largely unchanged.
  • Main risk is false positives/negatives from capability inference heuristics, mitigated by clear error messaging and install hints.

Follow-up (Optional)

  • Replace heuristic capability inference with explicit capability probing against model metadata when Ollama exposes richer modality attributes.

Copilot AI review requested due to automatic review settings March 28, 2026 08:03

Copilot AI left a comment


Pull request overview

Fixes Ollama-mode media-analysis crashes caused by Gemini-only calls, and introduces capability-aware Ollama model discovery/selection for vision/audio workflows across the Electron main process and the renderer model selector UI.

Changes:

  • Added Gemini model guard + provider branching so image/audio flows don’t dereference a null Gemini model in Ollama mode.
  • Implemented Ollama /api/chat paths for image (and best-effort audio) analysis, plus heuristic capability inference from /api/tags.
  • Exposed Ollama model capability metadata via IPC/preload and updated the UI to show capability badges and install guidance.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

Per-file summary:

  • electron/LLMHelper.ts: Adds provider-safe execution, Ollama image/audio chat calls, capability inference, and auto-selection logic.
  • electron/ipcHandlers.ts: Exposes a new IPC handler to fetch Ollama model capabilities.
  • electron/preload.ts: Adds getOllamaModelCapabilities to the renderer-facing Electron API.
  • src/components/ui/ModelSelector.tsx: Fetches capabilities, shows capability tags/badges, and displays install hints in the model dropdown UI.
  • src/App.tsx: Extends the renderer global window.electronAPI typing with the new capabilities method.
  • src/types/electron.d.ts: Updates shared renderer typings for the expanded Electron API surface.
Comments suppressed due to low confidence (1)

src/components/ui/ModelSelector.tsx:52

  • When the current config is already Ollama, loadCurrentConfig() calls loadOllamaModels(), and the useEffect([selectedProvider]) will also call loadOllamaModels() after setSelectedProvider('ollama'), causing duplicate fetches and potential state races. Consider choosing one mechanism (either the effect or the explicit call) and removing the other, or add a guard to avoid the second call when models are already loaded.
  useEffect(() => {
    if (selectedProvider === 'ollama') {
      loadOllamaModels();
    }
  }, [selectedProvider]);

  const loadCurrentConfig = async () => {
    try {
      setIsLoading(true);
      const config = await window.electronAPI.getCurrentLlmConfig();
      setCurrentConfig(config);
      setSelectedProvider(config.provider);
      
      if (config.isOllama) {
        setSelectedOllamaModel(config.model);
        await loadOllamaModels();
      }


Comment on lines 60 to 66
  const loadOllamaModels = async () => {
    try {
-     const models = await window.electronAPI.getAvailableOllamaModels();
+     const capabilities = await window.electronAPI.getOllamaModelCapabilities();
+     const models = capabilities.map((capability) => capability.name);

      setOllamaModelCapabilities(capabilities);
      setAvailableOllamaModels(models);

Copilot AI Mar 28, 2026


loadOllamaModels always queries model capabilities from whatever ollamaUrl is currently stored in the main process (defaulting to http://localhost:11434). Since the user can edit ollamaUrl in this component before clicking “Apply Changes”, the model list/capability badges can be fetched from the wrong host and appear inconsistent with the entered URL. Consider allowing getOllamaModelCapabilities (and/or getAvailableOllamaModels) to accept a URL argument, or add an IPC method to set the Ollama URL used for discovery prior to switching providers.

Comment on lines +145 to +151
private async ensureOllamaCapability(modality: "vision" | "audio"): Promise<void> {
  if (!this.useOllama) return

  const capabilities = await this.getOllamaModelCapabilities()
  if (capabilities.length === 0) {
    throw new Error(`No Ollama models detected. ${this.getOllamaInstallGuidance(modality)}`)
  }

Copilot AI Mar 28, 2026


ensureOllamaCapability() calls getOllamaModelCapabilities(), which fetches /api/tags on every image/audio request. For media-heavy flows this adds repeated network round-trips and can become a noticeable latency bottleneck. Consider caching the capabilities for a short TTL (or until the model list changes) and reusing them within a session, invalidating on switchToOllama() / refresh.

Comment on lines +240 to +243
} catch (error) {
  console.error("[LLMHelper] Error calling Ollama chat with images:", error)
  throw new Error(`Failed Ollama image analysis: ${error.message}. Ensure selected Ollama model supports vision.`)
}

Copilot AI Mar 28, 2026


In the catch block, the code assumes error has a .message property (${error.message}), but non-Error throwables (or some fetch failures) can be strings/objects, which can cause a secondary crash while building the error message. Prefer narrowing (error instanceof Error) and falling back to String(error) (or centralize via a getErrorMessage() helper) before interpolating.
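
A minimal version of the getErrorMessage() helper this comment suggests (sketch only):

```typescript
// Centralized error-message extraction: safe against non-Error throwables
// (strings, objects, numbers), which would otherwise crash the catch block.
function getErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}
```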

Comment on lines +304 to +305
} catch (error) {
  errors.push(error.message)

Copilot AI Mar 28, 2026


The catch block pushes error.message into the errors list without narrowing the caught value. If a non-Error is thrown, this will throw again and hide the original failure. Use error instanceof Error ? error.message : String(error) here (and in other error-message interpolations) to avoid secondary exceptions.

Suggested change:

- } catch (error) {
-   errors.push(error.message)
+ } catch (error: unknown) {
+   const message = error instanceof Error ? error.message : String(error)
+   errors.push(message)

- Introduced `bootstrap_vosk_model.py` for downloading and extracting Vosk models.
- Added `stt_stream.py` to handle real-time speech-to-text streaming using Vosk.
- Implemented audio processing utilities in `audio.ts` for handling audio data conversion and preparation.
- Enhanced audio input handling with support for microphone and system audio capture.
- Added error handling and status reporting for audio stream initialization and processing.