
iOS demo polish #454

Open

VyasGuru wants to merge 7 commits into main from ios-demo-polish

Conversation

Collaborator

@VyasGuru VyasGuru commented Mar 11, 2026

Will post screenshots of testing in a bit.

[Testing screenshots attached: seven WhatsApp images dated 2026-03-12, 03:44 to 03:45]

Summary by CodeRabbit

  • New Features

    • Thinking Mode toggle in Settings, UI badge, per-model support, persisted across launches
    • New base model and LoRA adapter entry with example prompts
    • Voice interruption and resume controls; voice events include thinking content
  • Improvements

    • Strips embedded "thinking" content from visible responses; thinking-aware prompts and generation flow
    • Safer streaming with cancellation, improved UTF‑8 handling, and richer streaming metrics
    • Camera lifecycle handling and enhanced tool-input validation
  • Bug Fixes

    • Safer audio/TTS stop and prevention of overlapping model loads

  • Toggleable thinking mode from Settings
  • Fixed race condition in model loading/cancellation
  • UTF-8 parser now matches the llama.cpp implementation
  • Fixed a VLM bug where the camera stayed on after leaving that section
  • Default system prompt applied on load

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 49ceac21-50b2-4ea2-b2e5-1cbfca317a69

📥 Commits

Reviewing files that changed from the base of the PR and between 8aab1fa and c9c6ca6.

📒 Files selected for processing (1)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelSelectionRows.swift

📝 Walkthrough

Walkthrough

Adds end-to-end "thinking" support: model metadata/registration changes, persisted thinking-mode settings and UI, extraction/handling of embedded content across generation and voice flows, streaming cancellation and UTF‑8 robustness in native backends, plus related UI, tooling, and utility updates.

Changes

Cohort / File(s) Summary
App init & model registry
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
Initialize SettingsViewModel.shared at startup; update RunAnywhere.registerModel(...) signature usages to include supportsThinking: Bool; add/adjust Qwen model registrations.
Chat view models
examples/ios/RunAnywhereAI/.../Features/Chat/LLMViewModel*.swift, .../LLMViewModel+Events.swift, .../LLMViewModel+Generation.swift, .../LLMViewModel+ModelManagement.swift, .../LLMViewModel+ToolCalling.swift
Track loadedModelSupportsThinking, add setter, strip <think> tags for display during streaming and tool-calls, apply thinking-mode prefix to prompts and propagate supportsThinking on model load/status.
Chat UI
examples/ios/RunAnywhereAI/.../Features/Chat/Views/ChatInterfaceView.swift
Add thinkingModeEnabled AppStorage and toolSettingsViewModel; add thinking-mode badge/toggle; adjust scrolling/typing indicator and input padding conditional on thinking/tool/LoRA state.
Models & LoRA
examples/ios/RunAnywhereAI/.../Features/Models/*
Guard against re-entrant model loads (isLoadingModel); add a new LoRA adapter catalog entry and example prompts; show LoRA badge in model rows.
Settings UI & VM
examples/ios/RunAnywhereAI/.../Features/Settings/*
Switch to SettingsViewModel.shared; add thinkingModeEnabled persistence and loadedModelSupportsThinking tracking via SDK notifications; add thinking-mode UI controls and description.
RAG (retrieval-augmented generation)
examples/ios/RunAnywhereAI/.../Features/RAG/*
Introduce RAGMessage with thinkingContent; compute effective question using /no_think\n when appropriate; render expandable thinking summaries in UI and store both display and thinking content.
Voice & VoiceAgent API/flows
examples/ios/RunAnywhereAI/.../Features/Voice/*.swift, sdk/runanywhere-swift/.../VoiceAgent/*
Propagate thinkingContent through voice events/results; add interruptSpeaking/interruptPlayback and resumeListening; include thinkingModeEnabled/maxTokens in session config; refactor voice session flow to transcription→generate (with thinking handling)→synthesize→playback.
Vision / camera lifecycle
examples/ios/RunAnywhereAI/.../Features/Vision/VLMCameraView.swift
Add scenePhase handling to stop/start camera on lifecycle changes and ensure setup only when authorized.
AdaptiveMicButton & gestures
examples/ios/RunAnywhereAI/.../Helpers/AdaptiveLayout.swift, .../Voice/VoiceAssistantView.swift
Add optional onLongPress to AdaptiveMicButton and wire long-press gesture; update voice UI to prioritize interrupt/resume/send flows.
Swift SDK — text generation & tool-calling
sdk/runanywhere-swift/.../RunAnywhere+TextGeneration.swift, .../RunAnywhere+ToolCalling.swift
Add internal ThinkingContentParser; extract thinkingContent/thinkingTokens into generation results; handle /no_think\n prefix; streaming refactor for cancellation, metrics propagation and cleaner error handling.
Swift SDK — TTS & audio
sdk/runanywhere-swift/.../RunAnywhere+TTS.swift, .../TTS/Services/AudioPlaybackManager.swift
TTS streaming context now owns audio Data; free ttsResult after use; compute duration from context; AudioPlaybackManager.stop() now guards on audioPlayer != nil.
Swift SDK — voice session & types
sdk/runanywhere-swift/.../VoiceAgent/*
Add thinkingContent to VoiceAgentResult and events; extend VoiceSessionEvent and VoiceSessionConfig with thinking fields and maxTokens; add public interrupt/resume APIs.
Swift SDK — bridge & archive util
sdk/runanywhere-swift/.../Foundation/Bridge/CppBridge.swift, .../Infrastructure/Download/Utilities/ArchiveUtility.swift
Parallelize destroy() calls for AI components during shutdown; fix compression_stream initialization and finalize parameter typing for decompression.
C++ LLaMA backend & LLM component
sdk/runanywhere-commons/.../backends/llamacpp/llamacpp_backend.cpp, .../features/llm/llm_component.cpp
Add incremental DFA-based UTF‑8 scanner state for streaming validation and safer stop-window trimming; add atomic cancellation flag (cancel_requested) and per-stream cancel pointer, wire cancellation into token callbacks and prefer observed token counts on completion.
C++ tool-calling parsing
sdk/runanywhere-commons/.../features/llm/tool_calling.cpp
Improve JSON key/value extraction to support scalar values and add a fallback that flattens top-level keys into arguments when no wrapper object present.

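The walkthrough above says the chat view models "strip `<think>` tags for display during streaming". As a rough illustration only: the helper name `stripThinkTags` appears in the walkthrough, but this regex-based body is an assumption, not the PR's actual implementation.

```swift
import Foundation

// Hypothetical sketch: remove complete <think>...</think> blocks, then cut
// off any unterminated <think> tag that is still streaming in.
func stripThinkTags(from text: String) -> String {
    var result = text.replacingOccurrences(
        of: "<think>[\\s\\S]*?</think>",
        with: "",
        options: .regularExpression
    )
    // During streaming the closing tag may not have arrived yet.
    if let openRange = result.range(of: "<think>") {
        result = String(result[..<openRange.lowerBound])
    }
    return result.trimmingCharacters(in: .whitespacesAndNewlines)
}
```

Called on every streamed chunk, this yields a display string that never shows partial thinking content.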
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant ChatUI as Chat Interface
    participant Settings as SettingsViewModel
    participant LLMVM as LLMViewModel
    participant LLMSDK as RunAnywhere LLM SDK
    participant CppBackend as C++ Backend

    User->>ChatUI: Toggle Thinking Mode
    ChatUI->>Settings: set thinkingModeEnabled
    Settings->>Settings: persist

    User->>ChatUI: Send prompt
    ChatUI->>LLMVM: startGeneration(prompt)
    LLMVM->>Settings: read thinkingModeEnabled
    LLMVM->>LLMVM: check loadedModelSupportsThinking
    alt thinking enabled & supported
        LLMVM->>LLMSDK: generate(with thinking prefix)
    else
        LLMVM->>LLMSDK: generate(original prompt)
    end
    LLMSDK->>CppBackend: call C++ generate (streaming)
    CppBackend-->>LLMSDK: stream tokens (may include <think> tags)
    LLMSDK->>LLMVM: LLMGenerationResult(text, thinkingContent)
    LLMVM->>ChatUI: update message with stripped display text
```
```mermaid
sequenceDiagram
    participant User
    participant VoiceUI as Voice Assistant
    participant VoiceVM as VoiceAgentViewModel
    participant VoiceSession as Voice Session (SDK)
    participant LLMSDK as RunAnywhere LLM SDK
    participant TTSSDK as TTS SDK

    User->>VoiceUI: Tap mic
    VoiceUI->>VoiceVM: micAction()
    alt currently speaking
        VoiceVM->>VoiceSession: interruptPlayback()
    else
        VoiceVM->>VoiceSession: startConversation()
    end

    VoiceSession->>LLMSDK: generate(effectivePrompt considering /no_think)
    LLMSDK-->>VoiceSession: result (text, thinkingContent)
    VoiceSession->>TTSSDK: synthesize(cleanedResponse)
    TTSSDK-->>VoiceSession: audio
    VoiceSession->>VoiceUI: play audio / emit events with thinkingContent
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested labels

ios-sdk, ios-sample

Suggested reviewers

  • shubhammalhotra28
  • Siddhesh2377

Poem

🐰 I nibble code with eager hops,
Thinking tags tucked in tiny props.
Streams stay safe, and tokens sing,
Voices hush, then loudly ring.
A rabbit grins — now let thoughts spring!

🚥 Pre-merge checks | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

  • Description check (⚠️ Warning): The PR description is largely incomplete. It only contains 'will post ss of testing in a bit' followed by screenshots, missing all required template sections like Type of Change, Testing, Platform-Specific Testing, Labels, and Checklist. Resolution: complete the PR description by filling out all required sections (Type of Change selection, Testing confirmation, Platform-Specific Testing checkboxes, appropriate Labels, Checklist items) and provide a brief summary of the changes made.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 53.52%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check (❓ Inconclusive): The title 'Ios demo polish' is vague and generic; it does not clearly indicate what specific improvements are made to the iOS demo. Resolution: use a more descriptive title that specifies the main features added, such as 'iOS demo: Add thinking mode, voice features, and vision improvements'.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ios-demo-polish
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Comment on lines +225 to 229
```swift
if !config.thinkingModeEnabled {
    effectivePrompt = "/no_think\n\(transcription)"
} else {
    effectivePrompt = transcription
}
```
Contributor


/no_think prefix added for non-thinking models

VoiceSessionConfig.thinkingModeEnabled is set to settings.loadedModelSupportsThinking && settings.thinkingModeEnabled in VoiceAgentViewModel. This collapses two independent flags into one boolean, so the session can no longer distinguish "thinking mode disabled for a model that supports thinking" from "model does not support thinking at all."

As a result, every model that doesn't have thinking mode on (including Llama, Phi, etc. that never support thinking) will have /no_think\n prepended to every single voice prompt. This is inconsistent with how LLMViewModel.applyThinkingModePrefix handles the same scenario:

```swift
// LLMViewModel — correct: skips prefix entirely for non-thinking models
guard loadedModelSupportsThinking else { return prompt }
```

The fix is to add a separate `modelSupportsThinking: Bool` field to `VoiceSessionConfig` and mirror the guard pattern used in `LLMViewModel`:

```swift
if config.modelSupportsThinking && !config.thinkingModeEnabled {
    effectivePrompt = "/no_think\n\(transcription)"
} else {
    effectivePrompt = transcription
}
```

Comment on lines +182 to +189
```swift
Task {
    await LLM.shared.destroy()
    await STT.shared.destroy()
    await TTS.shared.destroy()
    await VAD.shared.destroy()
    await VoiceAgent.shared.destroy()
    await VLM.shared.destroy()
}
```
Contributor


Swift destroy() tasks race with C shutdown

The Task { await … .destroy() } is fired-and-forgotten — it is not awaited before the synchronous C-level shutdown (Telemetry.shutdown(), Events.unregister(), etc.) continues. If the destroy() implementations reach into the C layer (e.g. to release handles), they will do so concurrently with or even after the underlying C objects have already been freed, creating a use-after-free.

Either await all destroy calls before proceeding, or ensure the C shutdown is run at the end of the same task:

```swift
Task {
    await LLM.shared.destroy()
    await STT.shared.destroy()
    await TTS.shared.destroy()
    await VAD.shared.destroy()
    await VoiceAgent.shared.destroy()
    await VLM.shared.destroy()

    // Now safe to run C-level shutdown
    Telemetry.shutdown()
    Events.unregister()
    ...
}
```

Comment on lines +110 to 111
```swift
thinkingTokens: thinkingContent.map { _ in outputTokens } ?? 0,
responseTokens: outputTokens
```
Contributor


thinkingTokens incorrectly set to total output tokens

thinkingContent.map { _ in outputTokens } sets thinkingTokens to the entire outputTokens value whenever thinking content is present. But outputTokens already represents the total completion tokens (thinking + response combined). Setting thinkingTokens = outputTokens and responseTokens = outputTokens means both fields report the same total, which double-counts every token and inflates any downstream metrics that sum the two.

The thinking token count cannot be known exactly without tracking it during generation (e.g. counting the tokens inside the <think> block), but at a minimum the value should not exceed outputTokens. Consider leaving it as 0 until a reliable count is available, or computing a rough estimate from thinkingContent length:

```swift
thinkingTokens: thinkingContent.map { _ in 0 } ?? 0,
```
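If a rough estimate is preferred over a hard-coded zero, one option is to derive a clamped count from the thinking content's length. The ~4 characters per token ratio below is a common heuristic, not something from this PR, and `estimateThinkingTokens` is a hypothetical helper name:

```swift
// Rough estimate only: assumes ~4 characters per token, clamped so the
// thinking count can never exceed the total completion tokens.
func estimateThinkingTokens(thinkingContent: String?, outputTokens: Int) -> Int {
    guard let content = thinkingContent, !content.isEmpty else { return 0 }
    return min(max(content.count / 4, 1), outputTokens)
}
```

This at least preserves the invariant `thinkingTokens <= outputTokens` until an exact count is tracked during generation.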


```swift
emit(.responded(text: cleanedResponse, thinkingContent: thinkingContent))

// Step 4: Synthesize speech from cleaned response (no think tags spoken)
```
Contributor


Missing Step 3 in numbered comment sequence

The inline comments jump from Step 2 to Step 4 with no Step 3, which is confusing for anyone reading through the pipeline:

```
// Step 1: Transcribe audio
// Step 2: Generate LLM response
// Step 4: Synthesize speech         ← should be Step 3
```

Suggested change:

```swift
// Step 3: Synthesize speech from cleaned response (no think tags spoken)
```


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift (1)

208-214: ⚠️ Potential issue | 🟡 Minor

buildFollowUpPrompt loses /no_think prefix positioning in follow-up iterations.

The initial prompt correctly positions /no_think\n at the beginning (lines 179-181). However, buildFollowUpPrompt receives the original prompt parameter and embeds it in the middle of a larger string:

"User: \(prompt)"  // becomes: "User: /no_think\noriginal question..."

If the C++ inference layer requires /no_think at the prompt start, tool-calling loops after the first iteration will not suppress thinking correctly.

Fix: Extract and reapply the /no_think prefix in buildFollowUpPrompt, or pass cleanPrompt to the function instead of prompt.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift`
around lines 208 - 214, The follow-up prompt builder (buildFollowUpPrompt) is
embedding the original prompt which may contain the `/no_think` prefix in the
middle of the assembled message, so the C++ layer loses the required leading
`/no_think`. Fix by ensuring the leading `/no_think` is preserved: either strip
and capture the `/no_think` prefix from prompt into cleanPrompt before calling
buildFollowUpPrompt and pass cleanPrompt instead of prompt, or modify
buildFollowUpPrompt to detect and reapply the `/no_think` prefix at the very
start of the returned string; update the call that sets fullPrompt to pass
cleanPrompt (or reapply prefix) and keep references to buildFollowUpPrompt,
fullPrompt, prompt, and cleanPrompt consistent.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift (1)

417-421: ⚠️ Potential issue | 🟡 Minor

Keep a stable scroll target while streaming.

After the first visible token arrives, the "typing" view disappears, but scrollToBottom(proxy:animated:) later in this file still targets "typing" whenever isGenerating is true. That leaves focus/keyboard-triggered scrolls pointing at an ID that no longer exists mid-generation. Use the last message ID once content is non-empty, or keep a persistent bottom anchor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`
around lines 417 - 421, The scrolling target "typing" disappears once the first
token arrives, so change scroll target logic used by
scrollToBottom(proxy:animated:) to choose a stable anchor: when
viewModel.isGenerating is true but viewModel.messages.last?.content.isEmpty ==
false, use the last message's id (e.g., viewModel.messages.last?.id) instead of
"typing", or create a persistent bottom anchor view ID (e.g., "bottom-anchor")
and always scroll to that; update the TypingIndicatorView id usage
(id("typing")) and the code that calls scrollToBottom(proxy:animated:) to select
the last message ID when content is non-empty or fall back to the persistent
"bottom-anchor" so scroll targets never point to a removed view.
🧹 Nitpick comments (6)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift (1)

136-136: Consider providing feedback when a load is already in progress.

The silent early return prevents race conditions, but users may be confused if they tap a model and nothing visibly happens. Consider either:

  • Setting isLoading = true at the start of selectModel (similar to loadModelsFromRegistry), so the UI can show a loading indicator.
  • Logging or providing minimal feedback that a load is already in progress.

This is minor since the UI may already handle this elsewhere.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift`
at line 136, The early return in selectModel guarded by isLoadingModel makes
taps silently no-op; update selectModel (in ModelListViewModel) to give feedback
by either setting the same loading state used in loadModelsFromRegistry (e.g.,
set isLoading = true / isLoadingModel = true at the start of selectModel) so the
UI shows a spinner, or log/emit a minimal message when isLoadingModel is true to
indicate a load is in progress; ensure you clear the loading flag on
completion/failure just like loadModelsFromRegistry to avoid leaving the view
stuck.
sdk/runanywhere-commons/src/features/llm/llm_component.cpp (1)

772-788: Lock-free cancel is intentional but has a subtle race window.

The design correctly avoids taking the mutex so that cancel can interrupt an in-flight generation that holds the lock. However, accessing component->lifecycle (line 780) without the mutex could race with rac_llm_component_destroy or rac_llm_component_unload if called concurrently.

This is likely acceptable since calling cancel() during component destruction would be a usage error, but consider documenting this constraint in the header or adding an assertion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `sdk/runanywhere-commons/src/features/llm/llm_component.cpp` around lines 772
- 788, The cancel implementation reads component->lifecycle without holding the
component mutex which can race with destruction/unload; to fix, either document
the threading constraint in the public header (declare that
rac_llm_component_cancel must not be called concurrently with
rac_llm_component_destroy or rac_llm_component_unload) or add a defensive
check/assert at the top of rac_llm_component_cancel (e.g.,
assert(component->lifecycle && "cannot cancel during destroy/unload") or return
an error if lifecycle is null) before calling rac_lifecycle_get_service and
rac_llm_cancel, referencing rac_llm_component_cancel, component->lifecycle,
rac_llm_component_destroy, rac_llm_component_unload and rac_llm_cancel.
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (1)

983-1041: Consider extracting duplicated UTF-8 handling logic.

The UTF-8 validation loop, scanner_state usage, and safe-stop trimming are nearly identical to generate_stream (lines 682-752). Additionally, STOP_SEQUENCES and MAX_STOP_LEN are defined twice (lines 665-674 and 966-975).

Extracting this into a helper class or shared method would reduce duplication and ensure both code paths stay synchronized.

♻️ Suggested approach

Consider extracting to a file-level helper or inner class:

```cpp
namespace {
static const std::vector<std::string> kStopSequences = {
    "<|im_end|>", "<|eot_id|>", "</s>", "<|end|>", "<|endoftext|>",
    "\n\nUser:", "\n\nHuman:",
};

static const size_t kMaxStopLen = []{
    size_t m = 0;
    for (const auto& s : kStopSequences) m = std::max(m, s.size());
    return m;
}();
} // namespace
```

The UTF-8 buffering and stop-sequence detection could be encapsulated in a small struct that both generation paths use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 983 - 1041, Extract the duplicated UTF-8 buffering and stop-sequence logic
into a shared helper (e.g., a static anonymous-namespace constants
kStopSequences and kMaxStopLen and a small Utf8StopDetector/Utf8StopBuffer
struct) and replace the inline code in both the streaming loop and
generate_stream with calls to that helper; the helper should own Utf8State,
partial buffer, stop_window and expose methods to feed new bytes (from
common_token_to_piece/new_token_chars), return any fully-valid UTF‑8 chunk,
detect/trim stop sequences (mirroring the current found_stop_pos logic), and
produce safe trimmed output for appending to generated_text so you can remove
duplicated STOP_SEQUENCES, MAX_STOP_LEN, scanner_state, partial_utf8_buffer and
stop_window usages in the loops.
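The buffering idea behind the backend's streaming validation can be illustrated in a few lines. This Swift sketch is a simplified stand-in, not the DFA-based scanner the C++ code uses: the type name and the backwards boundary scan are illustrative assumptions. It holds back a trailing incomplete UTF-8 sequence so each emitted chunk contains only whole code points.

```swift
// Minimal sketch of incremental UTF-8 buffering: hold back a trailing
// incomplete sequence so only whole code points are emitted per chunk.
struct Utf8ChunkBuffer {
    private var pending: [UInt8] = []

    /// Returns the longest whole-code-point prefix; keeps the incomplete tail.
    mutating func feed(_ bytes: [UInt8]) -> String {
        pending.append(contentsOf: bytes)
        var cut = pending.count
        var i = pending.count - 1
        var back = 0
        // Scan backwards past at most 3 continuation bytes to find a lead byte.
        while i >= 0 && back < 4 {
            let b = pending[i]
            if b & 0b1100_0000 != 0b1000_0000 { // ASCII or lead byte
                let needed: Int
                if b < 0x80 { needed = 1 }
                else if b & 0b1110_0000 == 0b1100_0000 { needed = 2 }
                else if b & 0b1111_0000 == 0b1110_0000 { needed = 3 }
                else { needed = 4 }
                if pending.count - i < needed { cut = i } // sequence incomplete
                break
            }
            i -= 1
            back += 1
        }
        let ready = Array(pending[..<cut])
        pending.removeSubrange(..<cut)
        // Any invalid bytes decode to U+FFFD rather than crashing.
        return String(decoding: ready, as: UTF8.self)
    }
}
```

A token like "é" (0xC3 0xA9) split across two callbacks is emitted only once both bytes have arrived.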
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift (1)

103-103: Remove debug print statement.

This print statement will log to the console in production builds. Consider using os.Logger for debug builds only, or remove it entirely.

🧹 Proposed fix
```diff
-                    print("Calculator received args: \(args), using expression: '\(expression)'")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`
at line 103, Remove the debug print statement that logs "Calculator received
args" in ToolSettingsView.swift; replace it either by removing the line entirely
or by switching to a proper logger (os.Logger) and gating it behind a debug-only
build check (e.g., `#if DEBUG`) so it does not emit to production consoles; locate
the occurrence by searching for the string "Calculator received args" or the
print(...) call and update the surrounding function accordingly.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift (1)

446-450: Consider using named parameters for enum destructuring.

The positional destructuring works correctly, but if VoiceSessionEvent associated value order changes, this code would silently receive misaligned data. Using named parameters would be more resilient:

🧹 Suggested improvement
```diff
-        case .responded(let text, _): assistantResponse = text
+        case .responded(text: let text, thinkingContent: _): assistantResponse = text
         case .speaking: sessionState = .speaking; currentStatus = "Speaking..."
-        case let .turnCompleted(transcript, response, _, _):
+        case .turnCompleted(transcript: let transcript, response: let response, thinkingContent: _, audio: _):
             currentTranscript = transcript; assistantResponse = response
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift`
around lines 446 - 450, The pattern match on VoiceSessionEvent uses positional
associated values which is fragile; update the case patterns to use named
parameters so each value is matched by label (e.g., change the case
.turnCompleted(transcript, response, _, _) to a labeled form like case let
.turnCompleted(transcript: transcript, response: response, ...)) and similarly
label .responded and other cases if they have associated value labels; then
assign currentTranscript, assistantResponse, sessionState, and currentStatus
from those named bindings to ensure order changes in VoiceSessionEvent won’t
break the logic.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift (1)

101-127: Prefer Swift-native notification observation here.

This reintroduces selector-based observer management inside a Combine-backed @MainActor view model. Using a NotificationCenter publisher or async notifications would keep delivery/cancellation in the same Swift concurrency model and remove the manual removeObserver path.

As per coding guidelines "Use the latest Swift 6 APIs always".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`
around lines 101 - 127, The view model is using selector-based
NotificationCenter observers (subscribeToModelNotifications,
handleModelLoaded(_:), handleModelUnloaded, and deinit removeObserver) which
reintroduces non-Swift-concurrency observer management; replace them with
Swift-native observation by removing addObserver/removeObserver and instead
subscribe via NotificationCenter publishers or async notifications and store the
subscriptions in a Combine cancellables Set (e.g., a private var cancellables =
Set<AnyCancellable>() or use Task/NotificationCenter.notifications(named:) with
Task cancellation). Specifically, remove the selector-based registrations in
subscribeToModelNotifications, replace them with
NotificationCenter.default.publisher(for: Notification.Name("ModelLoaded")) and
.sink to update loadedModelSupportsThinking (and a separate publisher for
"ModelUnloaded" to set it false), store the AnyCancellable values in
cancellables, and drop the manual removeObserver call in deinit so cancellations
happen via the cancellables lifecycle (or cancel the Tasks in deinit if using
async sequences).
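
A minimal sketch of the publisher-based shape this suggests, assuming the notification names from the prompt and a hypothetical `supportsThinking` userInfo key:

```swift
import Combine
import Foundation

@MainActor
final class SettingsViewModelSketch: ObservableObject {
    @Published var loadedModelSupportsThinking = false
    private var cancellables = Set<AnyCancellable>()

    init() {
        NotificationCenter.default.publisher(for: Notification.Name("ModelLoaded"))
            .receive(on: DispatchQueue.main)
            .sink { [weak self] note in
                // "supportsThinking" is an assumed userInfo key for illustration.
                self?.loadedModelSupportsThinking =
                    (note.userInfo?["supportsThinking"] as? Bool) ?? false
            }
            .store(in: &cancellables)

        NotificationCenter.default.publisher(for: Notification.Name("ModelUnloaded"))
            .receive(on: DispatchQueue.main)
            .sink { [weak self] _ in self?.loadedModelSupportsThinking = false }
            .store(in: &cancellables)
    }
    // No removeObserver in deinit: dropping `cancellables` ends the subscriptions.
}
```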
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Generation.swift`:
- Around line 25-29: The loop is issuing UI updates and notifications on every
token even when Self.stripThinkTags(from:) yields the same displayText; change
the streaming loop (where fullResponse, displayText are computed and
updateMessageContent(at:content:) and NotificationCenter.default.post are
called) to track the last-displayed text (e.g., a local prevDisplayText) and
only call updateMessageContent(at:content:) and post the scroll notification
when displayText != prevDisplayText, updating prevDisplayText after a successful
update; this avoids no-op main-thread churn while still showing visible changes.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`:
- Around line 34-35: Remove the duplicate `@AppStorage("thinkingModeEnabled")`
property in ChatInterfaceView and use the shared
Settings/ToolSettingsViewModel's thinkingModeEnabled instead; replace any uses
or bindings of the local thinkingModeEnabled with
toolSettingsViewModel.thinkingModeEnabled (including the badge binding and other
occurrences mentioned around the 502-518 region) so toggles update
SettingsViewModel.shared and avoid the second source of truth.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`:
- Around line 122-128: The NSExpression(format:) call can raise an Objective-C
exception for malformed syntax (e.g., "((", "1 + + 2"), so replace the direct
NSExpression(format: cleanedExpression) usage with a safe evaluator: implement
an ObjC bridging helper (e.g., evaluateExpressionSafely(_:)) that wraps
NSExpression(format:) and expressionValue(with:context:) inside `@try`/`@catch` and
returns nil on exception; then call that helper from ToolSettingsView (use
cleanedExpression and expression variables) and only build the return dictionary
when the helper returns a valid NSNumber, otherwise handle it as an invalid
expression (e.g., skip returning a result or surface an error).

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Vision/VLMCameraView.swift`:
- Around line 56-63: The scenePhase handler can restart the camera even when
this view is offscreen—add a visibility guard: introduce a Bool property (e.g.,
isVisible) updated in onAppear and onDisappear (or via SwiftUI .task/.onChange
for view visibility), and in the .onChange(of: scenePhase) closure check
isVisible before calling setupCameraIfNeeded(); keep the existing calls to
viewModel.stopAutoStreaming() and viewModel.stopCamera() when moving to
background/inactive, but avoid calling setupCameraIfNeeded() on .active unless
isVisible is true.

In `@sdk/runanywhere-commons/src/features/llm/tool_calling.cpp`:
- Around line 718-722: The fallback branch currently always quotes the scalar in
flat_args, turning raw JSON literals into strings; update the logic in
tool_calling.cpp (the branch using kval_is_obj, kval, escaped_key,
escape_json_string, flat_args) to detect actual JSON literals and emit them
unquoted. Add or call a helper (e.g., is_json_literal(kval)) that returns true
for numbers, true/false, null, and for array/object text (or reuse kval_is_obj
for structures), and only run escape_json_string and wrap in quotes when the
value is a real string; otherwise append the raw kval as-is to preserve typed
literals and follow the structured-types guideline. Ensure the helper is
documented/implemented in the same translation unit and used in the same
conditional chain that currently handles kval and kval_is_obj.
- Around line 401-425: The scalar-value branch currently treats a leading '[' as
a scalar and stops at the first ',' or ']', truncating arrays; modify the
scanner that assigns *out_value and *out_is_object so that if json_obj[pos] ==
'[' you perform a matching-bracket scan (similar to how '{' is handled) to find
the closing ']' including nested brackets and strings, allocate the full
substring (including brackets), set *out_is_object = true (or the same
structured flag used for objects), and return true; otherwise keep the existing
scalar logic for true scalars. Ensure you use the same allocation/memory
conventions as the existing object path and reuse variable names json_obj, pos,
val_start/val_end, *out_value and *out_is_object to locate the change.
- Around line 705-711: The loop is currently marking is_tool_key true for any
alias in TOOL_NAME_KEYS and skipping all such keys, causing legitimate args like
"name" to be dropped; modify the logic in tool_calling.cpp (around the loop over
TOOL_NAME_KEYS) to record which specific alias matched (e.g., matched_key or
matched_index) when comparing k against TOOL_NAME_KEYS, then after the loop only
skip the map entry if k equals that matched key (or index matches matched_index)
instead of skipping whenever any alias exists; update any uses of is_tool_key to
use the matched_key check so only the actual tool-name key is excluded.
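
For the fallback-quoting finding above (the 718-722 range), the classifier could look roughly like this; `is_json_literal` is the name the prompt suggests, not existing SDK code, and the number check is deliberately permissive:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true when the raw argument text is a JSON literal (number,
// true/false/null, array, or object) that should be appended unquoted;
// anything else is treated as a string and must be escaped and quoted.
static bool is_json_literal(const std::string& v) {
    if (v.empty()) return false;
    if (v == "true" || v == "false" || v == "null") return true;
    if (v[0] == '[' || v[0] == '{') return true;  // structured text, emit as-is
    size_t i = (v[0] == '-') ? 1 : 0;             // optional sign
    if (i >= v.size()) return false;
    bool saw_digit = false;
    for (; i < v.size(); ++i) {
        char c = v[i];
        if (std::isdigit(static_cast<unsigned char>(c))) { saw_digit = true; continue; }
        if (c == '.' || c == 'e' || c == 'E' || c == '+' || c == '-') continue;
        return false;                             // letters etc. mean plain string
    }
    return saw_digit;
}
```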

In `@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift`:
- Around line 182-189: The shutdown() function currently spawns a
fire-and-forget Task which returns before native components finish tearing down;
change shutdown() to be async (remove the Task wrapper), sequentially await each
component's destroy() call (LLM.shared.destroy(), STT.shared.destroy(),
TTS.shared.destroy(), VAD.shared.destroy(), VoiceAgent.shared.destroy(),
VLM.shared.destroy()) so teardown completes before proceeding, and only after
all awaits clear _isInitialized and _servicesInitialized; then update the caller
RunAnywhere.reset() to await CppBridge.shutdown() and mark reset() async.
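
Shaped as the comment describes, the teardown might read as follows (component names come from the prompt; `destroy()` is assumed to be async):

```swift
static func shutdown() async {
    // Await each native component so teardown finishes before we return.
    await LLM.shared.destroy()
    await STT.shared.destroy()
    await TTS.shared.destroy()
    await VAD.shared.destroy()
    await VoiceAgent.shared.destroy()
    await VLM.shared.destroy()
    // Only clear the flags once every destroy has completed.
    _isInitialized = false
    _servicesInitialized = false
}

// The caller becomes async as well:
public static func reset() async {
    await CppBridge.shutdown()
}
```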

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift`:
- Around line 100-110: The LLMGenerationResult currently misreports
thinkingTokens (non-streaming sets it to outputTokens when any thinkingContent
exists, while streaming returns 0), causing inconsistent and double-counted
metrics; change both code paths that construct LLMGenerationResult (the
non-streaming return here and the similar block around the 450-460 range) to set
thinkingTokens to 0 (and ensure responseTokens/outputTokens remain unchanged)
until the C++ layer exposes a real thinking-token metric, so both streaming and
non-streaming results return a consistent zero-valued thinkingTokens.
- Around line 312-335: The current ThinkingContentParser.extract(from:) only
removes the first <think>...</think> pair; update extract to scan the entire
input (e.g., with a while loop or regex) to find and remove all complete
<think>...</think> blocks, concatenating their inner contents (trimmed) into a
single thinking string, and rebuild response text from the remaining parts; also
strip any stray "<think>" or "</think>" leftovers after extraction so incomplete
fragments do not appear in the returned text. Ensure you modify
ThinkingContentParser.extract to collect all matches, remove them from the
source, and return (text: cleanedResponseText, thinking: combinedThinkingOrNil).
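
One way to sketch the multi-block extraction (the real parser's API may differ; this assumes plain `<think>`/`</think>` delimiters):

```swift
import Foundation

enum ThinkingContentParser {
    static func extract(from input: String) -> (text: String, thinking: String?) {
        var text = input
        var thinking: [String] = []
        // Remove every complete <think>...</think> pair, collecting the contents.
        while let open = text.range(of: "<think>"),
              let close = text.range(of: "</think>", range: open.upperBound..<text.endIndex) {
            let inner = String(text[open.upperBound..<close.lowerBound])
                .trimmingCharacters(in: .whitespacesAndNewlines)
            if !inner.isEmpty { thinking.append(inner) }
            text.removeSubrange(open.lowerBound..<close.upperBound)
        }
        // Strip stray tags left by incomplete fragments.
        text = text.replacingOccurrences(of: "<think>", with: "")
                   .replacingOccurrences(of: "</think>", with: "")
        let combined = thinking.joined(separator: "\n\n")
        return (text.trimmingCharacters(in: .whitespacesAndNewlines),
                combined.isEmpty ? nil : combined)
    }
}
```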

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift`:
- Around line 189-190: TTSStreamContext currently marked `@unchecked` Sendable
with an unprotected mutable totalData that is mutated from the C callback;
change TTSStreamContext to implement Sendable safely by adding an
OSAllocatedUnfairLock (matching the STT bridge pattern) and use it to guard all
accesses/mutations of totalData (wrap mutations in the C callback and the read
in the finalization path where
Unmanaged<TTSStreamContext>.fromOpaque(...).takeRetainedValue() is used). Mirror
the STT implementation (use the same lock property name and lock/unlock
placement) so totalData is only read/written while holding the
OSAllocatedUnfairLock.
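
A lock-guarded version might look like this (requires iOS 16+ for `OSAllocatedUnfairLock`; method names are illustrative):

```swift
import Foundation
import os

final class TTSStreamContext: Sendable {
    // All access to the accumulated audio goes through this lock.
    private let totalData = OSAllocatedUnfairLock(initialState: Data())

    // Called from the C callback with each synthesized chunk.
    func append(_ chunk: Data) {
        totalData.withLock { $0.append(chunk) }
    }

    // Read once in the finalization path after takeRetainedValue().
    func snapshot() -> Data {
        totalData.withLock { $0 }
    }
}
```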

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift`:
- Around line 224-229: The logic for adding the "/no_think" prefix in
RunAnywhere+VoiceSession.swift currently uses only config.thinkingModeEnabled;
align it with the pattern in RAGViewModel by ensuring the decision also
considers whether the model supports thinking (i.e., the same combined condition
used when building the config elsewhere). Update the construction or usage so
that effectivePrompt is prefixed with "/no_think" only when thinkingMode is
disabled AND the model supports thinking was considered when the config was
created (mirror the check used in RAGViewModel and VoiceAgentViewModel where
thinkingModeEnabled is set as settings.loadedModelSupportsThinking &&
settings.thinkingModeEnabled).
- Around line 258-263: The snippet currently emits .turnCompleted with variables
transcription, cleanedResponse, thinkingContent, and synthesizedAudio even when
an error occurs in the preceding do block; change the control flow so
.turnCompleted is only emitted when the do block completes successfully (i.e.,
move the emit(.turnCompleted(...)) into the success path of the do block or
guard for a successful result before emitting), and ensure the catch path either
emits an error-specific event (or returns/throws) instead of fallthrough
emission; reference the existing symbols transcription, cleanedResponse,
thinkingContent, synthesizedAudio and the emit(.turnCompleted(...)) call when
making the change.

---

Outside diff comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`:
- Around line 417-421: The scrolling target "typing" disappears once the first
token arrives, so change scroll target logic used by
scrollToBottom(proxy:animated:) to choose a stable anchor: when
viewModel.isGenerating is true but viewModel.messages.last?.content.isEmpty ==
false, use the last message's id (e.g., viewModel.messages.last?.id) instead of
"typing", or create a persistent bottom anchor view ID (e.g., "bottom-anchor")
and always scroll to that; update the TypingIndicatorView id usage
(id("typing")) and the code that calls scrollToBottom(proxy:animated:) to select
the last message ID when content is non-empty or fall back to the persistent
"bottom-anchor" so scroll targets never point to a removed view.
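
The persistent-anchor variant could be sketched like this (view and property names are placeholders; `MessageRow` stands in for the app's row view):

```swift
ScrollViewReader { proxy in
    ScrollView {
        LazyVStack {
            ForEach(viewModel.messages) { message in
                MessageRow(message: message).id(message.id)
            }
            // Zero-height view that always exists, so the scroll target
            // never disappears when the typing indicator is removed.
            Color.clear.frame(height: 1).id("bottom-anchor")
        }
    }
    .onChange(of: viewModel.messages.last?.content) { _, _ in
        withAnimation { proxy.scrollTo("bottom-anchor", anchor: .bottom) }
    }
}
```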

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift`:
- Around line 208-214: The follow-up prompt builder (buildFollowUpPrompt) is
embedding the original prompt which may contain the `/no_think` prefix in the
middle of the assembled message, so the C++ layer loses the required leading
`/no_think`. Fix by ensuring the leading `/no_think` is preserved: either strip
and capture the `/no_think` prefix from prompt into cleanPrompt before calling
buildFollowUpPrompt and pass cleanPrompt instead of prompt, or modify
buildFollowUpPrompt to detect and reapply the `/no_think` prefix at the very
start of the returned string; update the call that sets fullPrompt to pass
cleanPrompt (or reapply prefix) and keep references to buildFollowUpPrompt,
fullPrompt, prompt, and cleanPrompt consistent.
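
The strip-and-reapply option could be shaped like this; `buildFollowUpPrompt` here stands in for the existing helper, whose signature may differ:

```swift
func assembleFollowUpPrompt(prompt: String, toolResult: String) -> String {
    let noThink = "/no_think"
    let hadPrefix = prompt.hasPrefix(noThink)
    let cleanPrompt = hadPrefix
        ? String(prompt.dropFirst(noThink.count)).trimmingCharacters(in: .whitespaces)
        : prompt
    var fullPrompt = buildFollowUpPrompt(original: cleanPrompt, toolResult: toolResult)
    if hadPrefix {
        fullPrompt = noThink + " " + fullPrompt  // directive must lead the string
    }
    return fullPrompt
}
```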

---

Nitpick comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift`:
- Line 136: The early return in selectModel guarded by isLoadingModel makes taps
silently no-op; update selectModel (in ModelListViewModel) to give feedback by
either setting the same loading state used in loadModelsFromRegistry (e.g., set
isLoading = true / isLoadingModel = true at the start of selectModel) so the UI
shows a spinner, or log/emit a minimal message when isLoadingModel is true to
indicate a load is in progress; ensure you clear the loading flag on
completion/failure just like loadModelsFromRegistry to avoid leaving the view
stuck.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`:
- Around line 101-127: The view model is using selector-based NotificationCenter
observers (subscribeToModelNotifications, handleModelLoaded(_:),
handleModelUnloaded, and deinit removeObserver) which reintroduces
non-Swift-concurrency observer management; replace them with Swift-native
observation by removing addObserver/removeObserver and instead subscribe via
NotificationCenter publishers or async notifications and store the subscriptions
in a Combine cancellables Set (e.g., a private var cancellables =
Set<AnyCancellable>() or use Task/NotificationCenter.notifications(named:) with
Task cancellation). Specifically, remove the selector-based registrations in
subscribeToModelNotifications, replace them with
NotificationCenter.default.publisher(for: Notification.Name("ModelLoaded")) and
.sink to update loadedModelSupportsThinking (and a separate publisher for
"ModelUnloaded" to set it false), store the AnyCancellable values in
cancellables, and drop the manual removeObserver call in deinit so cancellations
happen via the cancellables lifecycle (or cancel the Tasks in deinit if using
async sequences).

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`:
- Line 103: Remove the debug print statement that logs "Calculator received
args" in ToolSettingsView.swift; replace it either by removing the line entirely
or by switching to a proper logger (os.Logger) and gating it behind a debug-only
build check (e.g., `#if DEBUG`) so it does not emit to production consoles; locate
the occurrence by searching for the string "Calculator received args" or the
print(...) call and update the surrounding function accordingly.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift`:
- Around line 446-450: The pattern match on VoiceSessionEvent uses positional
associated values which is fragile; update the case patterns to use named
parameters so each value is matched by label (e.g., change the case
.turnCompleted(transcript, response, _, _) to a labeled form like case let
.turnCompleted(transcript: transcript, response: response, ...)) and similarly
label .responded and other cases if they have associated value labels; then
assign currentTranscript, assistantResponse, sessionState, and currentStatus
from those named bindings to ensure order changes in VoiceSessionEvent won’t
break the logic.

In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 983-1041: Extract the duplicated UTF-8 buffering and stop-sequence
logic into a shared helper (e.g., a static anonymous-namespace constants
kStopSequences and kMaxStopLen and a small Utf8StopDetector/Utf8StopBuffer
struct) and replace the inline code in both the streaming loop and
generate_stream with calls to that helper; the helper should own Utf8State,
partial buffer, stop_window and expose methods to feed new bytes (from
common_token_to_piece/new_token_chars), return any fully-valid UTF‑8 chunk,
detect/trim stop sequences (mirroring the current found_stop_pos logic), and
produce safe trimmed output for appending to generated_text so you can remove
duplicated STOP_SEQUENCES, MAX_STOP_LEN, scanner_state, partial_utf8_buffer and
stop_window usages in the loops.

In `@sdk/runanywhere-commons/src/features/llm/llm_component.cpp`:
- Around line 772-788: The cancel implementation reads component->lifecycle
without holding the component mutex which can race with destruction/unload; to
fix, either document the threading constraint in the public header (declare that
rac_llm_component_cancel must not be called concurrently with
rac_llm_component_destroy or rac_llm_component_unload) or add a defensive
check/assert at the top of rac_llm_component_cancel (e.g.,
assert(component->lifecycle && "cannot cancel during destroy/unload") or return
an error if lifecycle is null) before calling rac_lifecycle_get_service and
rac_llm_cancel, referencing rac_llm_component_cancel, component->lifecycle,
rac_llm_component_destroy, rac_llm_component_unload and rac_llm_cancel.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f15418f-6046-4e3d-9175-cb636f3efb2f

📥 Commits

Reviewing files that changed from the base of the PR and between cfb7784 and 13dc851.

📒 Files selected for processing (27)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Events.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Generation.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+ModelManagement.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+ToolCalling.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/ViewModels/RAGViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/CombinedSettingsView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Vision/VLMCameraView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift
  • sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp
  • sdk/runanywhere-commons/src/features/llm/llm_component.cpp
  • sdk/runanywhere-commons/src/features/llm/tool_calling.cpp
  • sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Infrastructure/Download/Utilities/ArchiveUtility.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/VoiceAgentTypes.swift

Comment on lines 25 to 29
for try await token in stream {
fullResponse += token
await updateMessageContent(at: messageIndex, content: fullResponse)
let displayText = Self.stripThinkTags(from: fullResponse)
await updateMessageContent(at: messageIndex, content: displayText)
NotificationCenter.default.post(

⚠️ Potential issue | 🟠 Major

Skip no-op UI updates while the model is still inside <think>.

stripThinkTags can keep displayText unchanged for many hidden tokens, but this loop still rebuilds the message and posts a scroll notification on every token. On long thinking traces that becomes avoidable main-thread churn.

Suggested simplification
     func generateStreamingResponse(
         prompt: String,
         options: LLMGenerationOptions,
         messageIndex: Int
     ) async throws {
         var fullResponse = ""
+        var lastDisplayText = ""

         let streamingResult = try await RunAnywhere.generateStream(prompt, options: options)
         let stream = streamingResult.stream
         let metricsTask = streamingResult.result

         for try await token in stream {
             fullResponse += token
             let displayText = Self.stripThinkTags(from: fullResponse)
+            guard displayText != lastDisplayText else { continue }
+            lastDisplayText = displayText
             await updateMessageContent(at: messageIndex, content: displayText)
             NotificationCenter.default.post(
                 name: Notification.Name("MessageContentUpdated"),
                 object: nil
             )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for try await token in stream {
fullResponse += token
await updateMessageContent(at: messageIndex, content: fullResponse)
let displayText = Self.stripThinkTags(from: fullResponse)
await updateMessageContent(at: messageIndex, content: displayText)
NotificationCenter.default.post(
var fullResponse = ""
var lastDisplayText = ""
let streamingResult = try await RunAnywhere.generateStream(prompt, options: options)
let stream = streamingResult.stream
let metricsTask = streamingResult.result
for try await token in stream {
fullResponse += token
let displayText = Self.stripThinkTags(from: fullResponse)
guard displayText != lastDisplayText else { continue }
lastDisplayText = displayText
await updateMessageContent(at: messageIndex, content: displayText)
NotificationCenter.default.post(
name: Notification.Name("MessageContentUpdated"),
object: nil
)

Comment on lines +34 to +35
@ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
@AppStorage("thinkingModeEnabled") private var thinkingModeEnabled = false

⚠️ Potential issue | 🟠 Major

Avoid a second source of truth for thinking mode.

SettingsViewModel already owns thinkingModeEnabled and persists it. Toggling a separate @AppStorage copy here can update UserDefaults without updating SettingsViewModel.shared.thinkingModeEnabled, so generation code that reads the shared settings can stay stale for the rest of the session. Bind the badge to the shared settings view model instead of duplicating the state.

Suggested direction
-    @ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
-    @AppStorage("thinkingModeEnabled") private var thinkingModeEnabled = false
+    @ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
+    @ObservedObject private var settingsViewModel = SettingsViewModel.shared
...
-                if thinkingModeEnabled && viewModel.loadedModelSupportsThinking {
+                if settingsViewModel.thinkingModeEnabled && viewModel.loadedModelSupportsThinking {
                     thinkingModeBadge
                 }
...
-            .padding(.top, ((thinkingModeEnabled && viewModel.loadedModelSupportsThinking) || viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0)
+            .padding(.top, ((settingsViewModel.thinkingModeEnabled && viewModel.loadedModelSupportsThinking) || viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0)
...
     var thinkingModeBadge: some View {
         Button {
-            thinkingModeEnabled.toggle()
+            settingsViewModel.thinkingModeEnabled.toggle()
         } label: {

Also applies to: 502-518


Comment on lines +122 to 128
let exp = NSExpression(format: cleanedExpression)
if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
return [
"result": .number(result.doubleValue),
"expression": .string(expression)
]
}

⚠️ Potential issue | 🟠 Major

NSExpression(format:) can crash on malformed expressions.

The character validation (lines 112-120) prevents injection but doesn't validate syntax. Expressions like "1 + + 2", "1 /", or "((" pass character validation but will crash NSExpression(format:).

🛡️ Proposed fix: wrap NSExpression in a safe evaluation
-                    let exp = NSExpression(format: cleanedExpression)
-                    if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
-                        return [
-                            "result": .number(result.doubleValue),
-                            "expression": .string(expression)
-                        ]
+                    do {
+                        let exp = NSExpression(format: cleanedExpression)
+                        if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
+                            return [
+                                "result": .number(result.doubleValue),
+                                "expression": .string(expression)
+                            ]
+                        }
+                    } catch {
+                        // NSExpression throws on invalid syntax
                     }

Note: NSExpression(format:) doesn't throw but can raise an Objective-C exception. Consider using NSExpression(format:argumentArray:) with proper validation or a dedicated math parser library for robustness.


Comment on lines +56 to +63
.onChange(of: scenePhase) { _, newPhase in
if newPhase == .background || newPhase == .inactive {
viewModel.stopAutoStreaming()
viewModel.stopCamera()
} else if newPhase == .active {
setupCameraIfNeeded()
}
}

⚠️ Potential issue | 🟡 Minor

Potential camera restart when view is not visible.

When the user navigates away, .onDisappear stops the camera. However, if the view remains in the navigation stack's hierarchy, .onChange(of: scenePhase) may still fire when the app becomes active, causing setupCameraIfNeeded() to restart the camera even though the user is on a different screen.

Consider tracking visibility state to guard against this:

🛠️ Proposed fix using visibility tracking
 @State private var showingPhotos = false
 @State private var selectedPhoto: PhotosPickerItem?
+@State private var isViewVisible = false
 @Environment(\.scenePhase) private var scenePhase
-.onAppear { setupCameraIfNeeded() }
+.onAppear {
+    isViewVisible = true
+    setupCameraIfNeeded()
+}
 .onDisappear {
+    isViewVisible = false
     viewModel.stopAutoStreaming()
     viewModel.stopCamera()
 }
 .onChange(of: scenePhase) { _, newPhase in
     if newPhase == .background || newPhase == .inactive {
         viewModel.stopAutoStreaming()
         viewModel.stopCamera()
-    } else if newPhase == .active {
+    } else if newPhase == .active && isViewVisible {
         setupCameraIfNeeded()
     }
 }

Comment on lines +401 to +425
} else {
// Scalar value (number, boolean, null)
// Read until comma, closing brace, or whitespace
size_t val_start = pos;
size_t val_end = pos;
while (val_end < len && json_obj[val_end] != ',' &&
json_obj[val_end] != '}' && json_obj[val_end] != ']' &&
json_obj[val_end] != '\n') {
val_end++;
}
// Trim trailing whitespace
while (val_end > val_start &&
(json_obj[val_end - 1] == ' ' || json_obj[val_end - 1] == '\t')) {
val_end--;
}
if (val_end > val_start) {
size_t val_len = val_end - val_start;
*out_value = static_cast<char*>(malloc(val_len + 1));
if (*out_value) {
memcpy(*out_value, json_obj + val_start, val_len);
(*out_value)[val_len] = '\0';
}
*out_is_object = false;
return true;
}

⚠️ Potential issue | 🔴 Critical

Array-valued arguments are truncated in the new scalar path.

Line 401 sends [ into the scalar branch, but that scanner stops on the first , or ]. Inputs like {"items":[1,2]} get extracted as "[1", so any RAC_TOOL_PARAM_ARRAY argument is corrupted before parsing completes. Handle [ as structured JSON with a matching-bracket scan instead of treating it as a scalar literal.

As per coding guidelines, sdk/runanywhere-commons/**/*.{c,cpp,h,hpp}: "Always use structured types, never use strings directly for consistency and scalability"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/features/llm/tool_calling.cpp` around lines 401 -
425, The scalar-value branch currently treats a leading '[' as a scalar and
stops at the first ',' or ']', truncating arrays; modify the scanner that
assigns *out_value and *out_is_object so that if json_obj[pos] == '[' you
perform a matching-bracket scan (similar to how '{' is handled) to find the
closing ']' including nested brackets and strings, allocate the full substring
(including brackets), set *out_is_object = true (or the same structured flag
used for objects), and return true; otherwise keep the existing scalar logic for
true scalars. Ensure you use the same allocation/memory conventions as the
existing object path and reuse variable names json_obj, pos, val_start/val_end,
*out_value and *out_is_object to locate the change.

Comment on lines 100 to +110
        return LLMGenerationResult(
            text: generatedText,
            thinkingContent: nil,
            thinkingContent: thinkingContent,
            inputTokens: inputTokens,
            tokensUsed: outputTokens,
            modelUsed: modelId,
            latencyMs: totalTimeMs,
            framework: "llamacpp",
            tokensPerSecond: tokensPerSecond,
            timeToFirstTokenMs: nil,
            thinkingTokens: 0,
            thinkingTokens: thinkingContent.map { _ in outputTokens } ?? 0,

⚠️ Potential issue | 🟠 Major

thinkingTokens is misreported and inconsistent across paths.

The non-streaming result uses the entire completion count as thinkingTokens whenever any <think> block exists, while the streaming result always returns 0. Because responseTokens is also set to outputTokens, this can double-count the same completion in analytics. Please return a consistent value in both paths until the C++ layer exposes a dedicated thinking-token metric.

Also applies to: 450-460

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere`+TextGeneration.swift
around lines 100 - 110, The LLMGenerationResult currently misreports
thinkingTokens (non-streaming sets it to outputTokens when any thinkingContent
exists, while streaming returns 0), causing inconsistent and double-counted
metrics; change both code paths that construct LLMGenerationResult (the
non-streaming return here and the similar block around the 450-460 range) to set
thinkingTokens to 0 (and ensure responseTokens/outputTokens remain unchanged)
until the C++ layer exposes a real thinking-token metric, so both streaming and
non-streaming results return a consistent zero-valued thinkingTokens.

Comment on lines +312 to +335
enum ThinkingContentParser {
    /// Extracts `<think>...</think>` content from generated text.
    /// - Returns: Tuple of (responseText, thinkingContent). If no tags found, responseText = original text, thinkingContent = nil.
    static func extract(from text: String) -> (text: String, thinking: String?) {
        guard let startRange = text.range(of: "<think>"),
              let endRange = text.range(of: "</think>"),
              startRange.upperBound <= endRange.lowerBound else {
            return (text: text, thinking: nil)
        }
        let thinkingContent = String(text[startRange.upperBound..<endRange.lowerBound])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        // Include any text before <think> and after </think>
        let textBefore = String(text[..<startRange.lowerBound])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        let textAfter = String(text[endRange.upperBound...])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        let responseText = [textBefore, textAfter]
            .filter { !$0.isEmpty }
            .joined(separator: "\n")
        return (
            text: responseText,
            thinking: thinkingContent.isEmpty ? nil : thinkingContent
        )
    }

⚠️ Potential issue | 🟠 Major

The parser only removes the first <think> block.

extract(from:) stops after one closing tag, so additional or trailing incomplete think sections remain in text. The app-side LLMViewModel.stripThinkTags already handles repeated/incomplete blocks, which means SDK results can still surface raw think markup in non-streaming and final streaming results. Please make this parser consume all think blocks or centralize one shared implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere`+TextGeneration.swift
around lines 312 - 335, The current ThinkingContentParser.extract(from:) only
removes the first <think>...</think> pair; update extract to scan the entire
input (e.g., with a while loop or regex) to find and remove all complete
<think>...</think> blocks, concatenating their inner contents (trimmed) into a
single thinking string, and rebuild response text from the remaining parts; also
strip any stray "<think>" or "</think>" leftovers after extraction so incomplete
fragments do not appear in the returned text. Ensure you modify
ThinkingContentParser.extract to collect all matches, remove them from the
source, and return (text: cleanedResponseText, thinking: combinedThinkingOrNil).

Comment on lines +189 to +190
let finalContext = Unmanaged<TTSStreamContext>.fromOpaque(contextPtr).takeRetainedValue()
let totalAudioData = finalContext.totalData

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the native TTS streaming implementation for callback lifetime/threading.
fd 'rac_tts_component' -HI | while read -r file; do
  rg -n -C5 'rac_tts_component_synthesize_stream|user_data|callback|dispatch_async|std::thread|async|queue|worker' "$file"
done

# Compare Swift-side retain/release patterns used by other streaming bridges.
rg -n -C4 'passRetained|takeRetainedValue|takeUnretainedValue' \
  sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 21787


🏁 Script executed:

# Check if TTSStreamContext and STTStreamingContext have similar definitions and usage
rg -n -A 10 'class TTSStreamContext|struct TTSStreamContext' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/

# Check STT context definition for comparison
rg -n -A 10 'class STTStreamingContext|struct STTStreamingContext' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/

# Check if there are any comments in TTS about synchronous behavior
rg -n -B 5 -A 5 'rac_tts_component_synthesize_stream' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift

# Search for any TODOs or FIXMEs related to threading/concurrency in TTS
rg -n 'TODO|FIXME|SAFETY|thread|async|race' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 3399


🏁 Script executed:

# Check how STT uses finalText in callbacks
rg -n -B 2 -A 2 'finalText' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift | head -40

# Check if there are any comments explaining the synchronization choice in STT
rg -n -B 5 -A 0 'OSAllocatedUnfairLock' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 1068


Protect totalData with a Swift 6 synchronization primitive; mirror the STT implementation.

Line 311 declares TTSStreamContext as @unchecked Sendable with an unguarded mutable totalData, which is mutated directly in the C callback at line 183. The sibling STT bridge (sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift:306-313) solves this correctly using OSAllocatedUnfairLock to protect its mutable state. Replace the @unchecked Sendable with a proper Sendable implementation using OSAllocatedUnfairLock for totalData to guard against concurrent callback access, consistent with Swift 6 APIs and the coding guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere`+TTS.swift
around lines 189 - 190, TTSStreamContext currently marked `@unchecked` Sendable
with an unprotected mutable totalData that is mutated from the C callback;
change TTSStreamContext to implement Sendable safely by adding an
OSAllocatedUnfairLock (matching the STT bridge pattern) and use it to guard all
accesses/mutations of totalData (wrap mutations in the C callback and the read
in the finalization path where
Unmanaged<TTSStreamContext>.fromOpaque(...).takeRetainedValue() is used). Mirror
the STT implementation (use the same lock property name and lock/unlock
placement) so totalData is only read/written while holding the
OSAllocatedUnfairLock.

Comment on lines +224 to 229
        let effectivePrompt: String
        if !config.thinkingModeEnabled {
            effectivePrompt = "/no_think\n\(transcription)"
        } else {
            effectivePrompt = transcription
        }

⚠️ Potential issue | 🟡 Minor

Inconsistent thinking mode logic compared to other files.

This code applies /no_think prefix based solely on config.thinkingModeEnabled, but RAGViewModel (line 112) checks both loadedModelSupportsThinking && !thinkingModeEnabled.

If a model doesn't support thinking at all, the prefix is unnecessary and might confuse it. Consider aligning with the pattern used elsewhere:

🔧 Suggested alignment

The VoiceSessionConfig would need to also carry whether the model supports thinking, or the check should happen at the config construction site (which VoiceAgentViewModel already does at line 392).

If the config is always constructed with proper logic (only setting thinkingModeEnabled=true when the model supports it), the current code is fine. The construction in VoiceAgentViewModel line 392 does handle this correctly:

thinkingModeEnabled: settings.loadedModelSupportsThinking && settings.thinkingModeEnabled

So this is safe, but the semantics are subtly different across files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 224 - 229, The logic for adding the "/no_think" prefix in
RunAnywhere+VoiceSession.swift currently uses only config.thinkingModeEnabled;
align it with the pattern in RAGViewModel by ensuring the decision also
considers whether the model supports thinking (i.e., the same combined condition
used when building the config elsewhere). Update the construction or usage so
that effectivePrompt is prefixed with "/no_think" only when thinkingMode is
disabled AND the model supports thinking was considered when the config was
created (mirror the check used in RAGViewModel and VoiceAgentViewModel where
thinkingModeEnabled is set as settings.loadedModelSupportsThinking &&
settings.thinkingModeEnabled).

Comment on lines +258 to +263
        emit(.turnCompleted(
            transcript: transcription,
            response: cleanedResponse,
            thinkingContent: thinkingContent,
            audio: synthesizedAudio
        ))

⚠️ Potential issue | 🟡 Minor

turnCompleted emitted even after errors with potentially empty/stale values.

If an error occurs in the do block (lines 208-256), the code falls through to emit turnCompleted (lines 258-263) with whatever values transcription, cleanedResponse, thinkingContent, and synthesizedAudio held at the time of failure. This could emit partial results (e.g., transcription succeeded but generation failed).

Consider whether turnCompleted should be skipped on errors, or if partial results are intentionally desired.

🔧 Potential fix if partial results are not desired
         } catch {
             logger.error("Processing failed: \(error)")
             emit(.error(error.localizedDescription))
+            // Skip turnCompleted on error - error event is sufficient
+            if config.continuousMode && isRunning {
+                try? await startListening()
+            }
+            return
         }
 
         emit(.turnCompleted(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 258 - 263, The snippet currently emits .turnCompleted with
variables transcription, cleanedResponse, thinkingContent, and synthesizedAudio
even when an error occurs in the preceding do block; change the control flow so
.turnCompleted is only emitted when the do block completes successfully (i.e.,
move the emit(.turnCompleted(...)) into the success path of the do block or
guard for a successful result before emitting), and ensure the catch path either
emits an error-specific event (or returns/throws) instead of fallthrough
emission; reference the existing symbols transcription, cleanedResponse,
thinkingContent, synthesizedAudio and the emit(.turnCompleted(...)) call when
making the change.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift (1)

67-67: Consider adding a comment explaining the early initialization.

The _ = SettingsViewModel.shared forces the singleton to initialize before initializeSDK() runs. This ensures the loadSettings(), setupObservers(), and subscribeToModelNotifications() setup completes before views access the shared instance. The pattern is valid, but a brief comment would clarify intent for future maintainers.

📝 Suggested clarification
         .task {
+                // Ensure SettingsViewModel singleton is initialized before SDK setup
+                // (subscriptions and settings must be ready before views access it)
                 _ = SettingsViewModel.shared
                 logger.info("🏁 App launched, initializing SDK...")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift` at line
67, Add a brief inline comment above the `_ = SettingsViewModel.shared` line
explaining that this forces early initialization of the SettingsViewModel
singleton so loadSettings(), setupObservers(), and
subscribeToModelNotifications() run before initializeSDK() and any views access
the shared instance; reference the SettingsViewModel.shared singleton and
initializeSDK() in the comment to make the intent clear to future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift`:
- Line 67: Add a brief inline comment above the `_ = SettingsViewModel.shared`
line explaining that this forces early initialization of the SettingsViewModel
singleton so loadSettings(), setupObservers(), and
subscribeToModelNotifications() run before initializeSDK() and any views access
the shared instance; reference the SettingsViewModel.shared singleton and
initializeSDK() in the comment to make the intent clear to future maintainers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5405eb26-148a-41f9-b588-910fe7bbf60d

📥 Commits

Reviewing files that changed from the base of the PR and between 13dc851 and 4a6150f.

📒 Files selected for processing (1)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift (1)

30-33: Consider the implications of sensitive demo prompts.

The example prompts for the abliterated adapter include potentially sensitive content (lock picking instructions). While this demonstrates the adapter's uncensored capabilities, consider whether these specific prompts are appropriate for a public demo application that may be used in presentations or shared screenshots (as seen in this PR's description).

A safer alternative might be prompts that test reduced restrictions without encouraging potentially harmful outputs:

  • "Discuss the philosophical arguments for and against free speech"
  • "What are some morally complex dilemmas in medical ethics?"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift`
around lines 30 - 33, Replace the sensitive demo prompt in
LoraExamplePrompts.swift for the adapter key
"qwen2.5-0.5b-abliterated-lora-f16.gguf": remove or replace the "Explain how
lock picking works in detail" example with a non-harmful test prompt that still
exercises reduced restrictions (e.g., "Discuss the philosophical arguments for
and against free speech" or "What are some morally complex dilemmas in medical
ethics?"); update the array for the "qwen2.5-0.5b-abliterated-lora-f16.gguf"
entry to use the safer prompt(s) so public demos/screenshots do not display
potentially dangerous instructions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift`:
- Around line 71-81: The LoraAdapterCatalogEntry for id "abliterated-lora"
currently sets fileSize to 0 which causes incorrect download progress and
storage calculations; update the fileSize property on that
LoraAdapterCatalogEntry (the instance with id "abliterated-lora" and filename
"qwen2.5-0.5b-abliterated-lora-f16.gguf") to the actual byte size 17_620_224 so
UI progress and storage estimates use the correct value.

---

Nitpick comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift`:
- Around line 30-33: Replace the sensitive demo prompt in
LoraExamplePrompts.swift for the adapter key
"qwen2.5-0.5b-abliterated-lora-f16.gguf": remove or replace the "Explain how
lock picking works in detail" example with a non-harmful test prompt that still
exercises reduced restrictions (e.g., "Discuss the philosophical arguments for
and against free speech" or "What are some morally complex dilemmas in medical
ethics?"); update the array for the "qwen2.5-0.5b-abliterated-lora-f16.gguf"
entry to use the safer prompt(s) so public demos/screenshots do not display
potentially dangerous instructions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1474e7b1-f815-43c6-b174-58a3a0956914

📥 Commits

Reviewing files that changed from the base of the PR and between 4a6150f and e2ab620.

📒 Files selected for processing (3)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift

Comment on lines +71 to +81
        // --- Abliterated adapter (uncensored fine-tune for Qwen 2.5 0.5B base) ---
        LoraAdapterCatalogEntry(
            id: "abliterated-lora",
            name: "Abliterated (Uncensored)",
            description: "Removes content restrictions for unrestricted responses",
            downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
            filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
            compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
            fileSize: 0,
            defaultScale: 1.0
        ),

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Fetch the actual file size of the abliterated adapter from HuggingFace
# Expected: Returns the Content-Length header showing the file size in bytes

curl -sI "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf" | grep -i content-length

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 94


🏁 Script executed:

# First, let's check the actual current state of the file
cat -n examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift | sed -n '70,85p'

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 805


🏁 Script executed:

# The curl command may be returning redirect HTML. Let's try a different approach
# to get the actual binary file size using HuggingFace's API or checking headers differently
curl -sL -o /dev/null -w "Content-Length: %{size_download} bytes\n" "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf"

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 104


Update fileSize to match actual HuggingFace file size.

The fileSize: 0 should be updated to 17_620_224 (the actual size in bytes). A zero file size can cause indeterminate progress indicators during download and inaccurate storage requirement estimates shown to users.

Proposed fix
         LoraAdapterCatalogEntry(
             id: "abliterated-lora",
             name: "Abliterated (Uncensored)",
             description: "Removes content restrictions for unrestricted responses",
             downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
             filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
             compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
-            fileSize: 0,
+            fileSize: 17_620_224,
             defaultScale: 1.0
         ),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
        // --- Abliterated adapter (uncensored fine-tune for Qwen 2.5 0.5B base) ---
        LoraAdapterCatalogEntry(
            id: "abliterated-lora",
            name: "Abliterated (Uncensored)",
            description: "Removes content restrictions for unrestricted responses",
            downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
            filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
            compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
            fileSize: 0,
            defaultScale: 1.0
        ),
        // --- Abliterated adapter (uncensored fine-tune for Qwen 2.5 0.5B base) ---
        LoraAdapterCatalogEntry(
            id: "abliterated-lora",
            name: "Abliterated (Uncensored)",
            description: "Removes content restrictions for unrestricted responses",
            downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
            filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
            compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
            fileSize: 17_620_224,
            defaultScale: 1.0
        ),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift`
around lines 71 - 81, The LoraAdapterCatalogEntry for id "abliterated-lora"
currently sets fileSize to 0 which causes incorrect download progress and
storage calculations; update the fileSize property on that
LoraAdapterCatalogEntry (the instance with id "abliterated-lora" and filename
"qwen2.5-0.5b-abliterated-lora-f16.gguf") to the actual byte size 17_620_224 so
UI progress and storage estimates use the correct value.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift (1)

259-269: ⚠️ Potential issue | 🟠 Major

Do not emit turnCompleted from the error path.

The catch still falls through to emit(.turnCompleted(...)), so consumers receive a success-shaped event with partial or stale fields after STT/LLM/TTS fails. That is especially misleading because examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift treats .turnCompleted as the signal to go back to the ready state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 259 - 269, The current catch block logs and emits .error but
execution still falls through to emit(.turnCompleted(...)), causing a
success-shaped event after failures; update the control flow in
RunAnywhere+VoiceSession (the catch handling around the try that produces
transcription/cleanedResponse/thinkingContent/synthesizedAudio) so that when you
call emit(.error(...)) you do not subsequently call emit(.turnCompleted(...)) —
either move emit(.turnCompleted(...)) into the happy-path (inside the try after
successful STT/LLM/TTS) or ensure the catch returns/throws/early-exits
immediately after logger.error and emit(.error(...)) so .turnCompleted is never
emitted on error.
🧹 Nitpick comments (2)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift (2)

617-642: Minor inconsistency in summary length logic.

When there are 2+ sentences and the first sentence is > 20 chars, the entire first sentence is returned (could be 100+ chars). However, single-sentence text is truncated at 80 chars. This asymmetry may produce unexpectedly long summaries for multi-sentence content with a long first sentence.

Consider capping the first-sentence branch as well, or unifying the truncation approach:

♻️ Optional: Cap first sentence length
 if sentences.count >= 2 {
     let firstSentence = sentences[0].trimmingCharacters(in: .whitespacesAndNewlines)
-    if firstSentence.count > 20 {
-        return firstSentence + "..."
+    if firstSentence.count > 20 && firstSentence.count <= 80 {
+        return firstSentence + "..."
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift`
around lines 617 - 642, The thinkingSummary logic can return an arbitrarily long
first sentence when there are 2+ sentences (via message.thinkingContent ->
sentences -> firstSentence) but trims single-sentence text to 80 chars; unify
this by capping the firstSentence branch to the same 80-char truncation rules:
when sentences.count >= 2 compute firstSentence, then if firstSentence.count >
80 apply the existing truncation (take prefix(80), cut back to last space if
present, append "…") else return firstSentence; keep the existing 80-char
truncation logic for the full-thinking branch unchanged.

513-515: Consider simplifying hasThinking computed property.

The nil check is redundant since optional chaining already handles it.

♻️ Simplified version
 private var hasThinking: Bool {
-    message.thinkingContent != nil && !(message.thinkingContent?.isEmpty ?? true)
+    message.thinkingContent?.isEmpty == false
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift`
around lines 513 - 515, The computed property hasThinking is doing an
unnecessary explicit nil check; simplify it by relying on optional chaining and
a single boolean expression using message.thinkingContent (e.g., check
message.thinkingContent?.isEmpty == false or use
!(message.thinkingContent?.isEmpty ?? true)). Update the hasThinking getter to
remove the redundant "!= nil" and only evaluate the optional's isEmpty result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`:
- Around line 97-114: In SettingsViewModel.init (and before
subscribeToModelNotifications is relied on), synchronously query the SDK for the
current active model/capability and set loadedModelSupportsThinking accordingly
so the flag reflects the present state; e.g. add a call in init that asks
RunAnywhere (or the SDK model API) for the currently loaded model/capability,
update loadedModelSupportsThinking on the main actor, then continue to call
subscribeToModelNotifications() to handle future events via handleSDKEvent;
ensure this check uses the same property/logic as handleSDKEvent so
VoiceAgentViewModel.startConversation sees the correct value even if a model was
loaded before SettingsViewModel was created.

In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift`:
- Around line 511-519: The current micContent uses
.onTapGesture/.onLongPressGesture on a plain view which strips Button
accessibility; instead wrap micContent in a Button (use the existing action as
the Button's primary action) and add the long-press behavior via
.simultaneousGesture(LongPressGesture(minimumDuration: 0.5).onEnded { _ in
(onLongPress ?? action)() }) or use .contextMenu if the long press shows
options, and also add an accessibilityAction for the long-press so
VoiceOver/Full Keyboard Access can invoke it; update the code paths that
reference micContent, onLongPress, and action accordingly.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift:
- Around line 123-125: resumeListening currently calls startListening() with
try? which both hides errors and causes startAudioLevelMonitoring (launched by
startListening) to be started repeatedly; update resumeListening (and related
state) to first cancel any existing audio-level monitoring task (e.g., cancel
audioLevelMonitoringTask or equivalent) or guard against duplicate monitoring,
then call await startListening() without try? so errors are propagated to the
caller; alternatively adjust startListening/startAudioLevelMonitoring to be
idempotent (check a monitoringTask flag) so repeated calls don't spawn
additional monitoring loops—use the concrete symbols resumeListening,
startListening, startAudioLevelMonitoring, isRunning and whichever variable
holds the monitoring Task to implement cancellation/guarding and remove the try?
suppression.

---

Duplicate comments:
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift:
- Around line 259-269: The current catch block logs and emits .error but
execution still falls through to emit(.turnCompleted(...)), causing a
success-shaped event after failures; update the control flow in
RunAnywhere+VoiceSession (the catch handling around the try that produces
transcription/cleanedResponse/thinkingContent/synthesizedAudio) so that when you
call emit(.error(...)) you do not subsequently call emit(.turnCompleted(...)) —
either move emit(.turnCompleted(...)) into the happy-path (inside the try after
successful STT/LLM/TTS) or ensure the catch returns/throws/early-exits
immediately after logger.error and emit(.error(...)) so .turnCompleted is never
emitted on error.
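The control-flow fix described above can be sketched with a self-contained toy; `VoiceEvent`, `TurnRunner`, and the stage closures are illustrative stand-ins, not the SDK's actual `RunAnywhere+VoiceSession` types:

```swift
// Sketch: after emitting .error, return immediately so the success-shaped
// .turnCompleted event is only ever emitted on the happy path.
enum VoiceEvent: Equatable {
    case turnCompleted(response: String)
    case error(message: String)
}

struct TurnError: Error { let message: String }

final class TurnRunner {
    private(set) var emitted: [VoiceEvent] = []
    private func emit(_ event: VoiceEvent) { emitted.append(event) }

    func runTurn(transcribe: () throws -> String,
                 generate: (String) throws -> String) {
        do {
            let transcription = try transcribe()
            let response = try generate(transcription)
            // Happy path: .turnCompleted lives inside the do-block,
            // after every stage has succeeded.
            emit(.turnCompleted(response: response))
        } catch let err as TurnError {
            emit(.error(message: err.message))
            return  // early exit: never follow .error with .turnCompleted
        } catch {
            emit(.error(message: "\(error)"))
            return
        }
    }
}
```

With this shape, a failing LLM stage produces exactly one `.error` event and no trailing `.turnCompleted`.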

---

Nitpick comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift`:
- Around line 617-642: The thinkingSummary logic can return an arbitrarily long
first sentence when there are 2+ sentences (via message.thinkingContent ->
sentences -> firstSentence) but trims single-sentence text to 80 chars; unify
this by capping the firstSentence branch to the same 80-char truncation rules:
when sentences.count >= 2 compute firstSentence, then if firstSentence.count >
80 apply the existing truncation (take prefix(80), cut back to last space if
present, append "…") else return firstSentence; keep the existing 80-char
truncation logic for the full-thinking branch unchanged.
- Around line 513-515: The computed property hasThinking is doing an unnecessary
explicit nil check; simplify it by relying on optional chaining and a single
boolean expression using message.thinkingContent (e.g., check
message.thinkingContent?.isEmpty == false or use
!(message.thinkingContent?.isEmpty ?? true)). Update the hasThinking getter to
remove the redundant "!= nil" and only evaluate the optional's isEmpty result.
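Both nitpicks can be sketched together; the helper names and the sentence-splitting below are illustrative, not the view's exact code:

```swift
import Foundation

// Unified 80-char truncation applied to BOTH branches of thinkingSummary,
// so a long first sentence is capped the same way as single-sentence text.
func truncated(_ text: String, limit: Int = 80) -> String {
    guard text.count > limit else { return text }
    var cut = String(text.prefix(limit))
    if let lastSpace = cut.lastIndex(of: " ") {
        cut = String(cut[..<lastSpace])  // cut back to the last word boundary
    }
    return cut + "…"
}

func thinkingSummary(_ thinking: String) -> String {
    let sentences = thinking
        .split(separator: ".", omittingEmptySubsequences: true)
        .map { $0.trimmingCharacters(in: .whitespaces) }
    if sentences.count >= 2, let first = sentences.first {
        return truncated(first)  // same cap as the full-thinking branch
    }
    return truncated(thinking)
}

// Simplified hasThinking, relying on optional chaining alone:
func hasThinking(_ thinkingContent: String?) -> Bool {
    thinkingContent?.isEmpty == false
}
```

`thinkingContent?.isEmpty == false` is `false` for both `nil` and the empty string, which makes the explicit `!= nil` check redundant.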


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d42810eb-8808-4ba3-b051-bcfa7dca7db1

📥 Commits

Reviewing files that changed from the base of the PR and between e2ab620 and 8aab1fa.

📒 Files selected for processing (7)
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/ViewModels/RAGViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift
  • examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift
  • sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift

Comment on lines 97 to +114
init() {
    loadSettings()
    setupObservers()
    subscribeToModelNotifications()
}

private func subscribeToModelNotifications() {
    // Subscribe to SDK events directly so any LLM model load
    // (from chat, voice agent, or RAG) updates the thinking mode flag.
    RunAnywhere.events.events
        .receive(on: DispatchQueue.main)
        .sink { [weak self] event in
            Task { @MainActor in
                self?.handleSDKEvent(event)
            }
        }
        .store(in: &cancellables)
}

⚠️ Potential issue | 🟠 Major

Sync the current LLM capability before relying on notifications.

This only subscribes to future load/unload events. If SettingsViewModel.shared is first instantiated after an LLM is already loaded, loadedModelSupportsThinking stays false, and VoiceAgentViewModel.startConversation() then disables thinking mode for that session even when the active model supports it. Prime this flag from the current SDK/model state during initialization, then use notifications for updates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`
around lines 97 - 114, In SettingsViewModel.init (and before
subscribeToModelNotifications is relied on), synchronously query the SDK for the
current active model/capability and set loadedModelSupportsThinking accordingly
so the flag reflects the present state; e.g. add a call in init that asks
RunAnywhere (or the SDK model API) for the currently loaded model/capability,
update loadedModelSupportsThinking on the main actor, then continue to call
subscribeToModelNotifications() to handle future events via handleSDKEvent;
ensure this check uses the same property/logic as handleSDKEvent so
VoiceAgentViewModel.startConversation sees the correct value even if a model was
loaded before SettingsViewModel was created.
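The "prime, then subscribe" pattern the prompt asks for can be sketched without the SDK; `ModelEvents` below is a hypothetical closure-based stand-in for `RunAnywhere.events` (the real code subscribes via Combine's `.sink`/`.store`), not the actual API:

```swift
// Hypothetical stand-in for the SDK event bus: like a Combine
// PassthroughSubject, it only delivers FUTURE events to subscribers.
final class ModelEvents {
    private var subscribers: [(Bool) -> Void] = []
    private(set) var currentSupportsThinking = false

    func subscribe(_ handler: @escaping (Bool) -> Void) {
        subscribers.append(handler)
    }
    func modelLoaded(supportsThinking: Bool) {
        currentSupportsThinking = supportsThinking
        subscribers.forEach { $0(supportsThinking) }
    }
}

final class SettingsModel {
    private(set) var loadedModelSupportsThinking: Bool

    init(events: ModelEvents) {
        // 1. Prime from the present state — subscriptions will not replay a
        //    load event that fired before this init ran.
        loadedModelSupportsThinking = events.currentSupportsThinking
        // 2. Then subscribe for future load/unload events.
        events.subscribe { [weak self] supports in
            self?.loadedModelSupportsThinking = supports
        }
    }
}
```

Without step 1, a `SettingsModel` created after the model load would keep the flag `false` for the whole session, which is exactly the bug described above.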

Comment on lines 511 to +519
if #available(iOS 26.0, macOS 26.0, *) {
    Button(action: action) {
        ZStack {
            // Background circle
            Circle()
                .fill(isActive ? activeColor : inactiveColor)
                .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)

            // Pulsing effect when active
            if isPulsing {
                Circle()
                    .stroke(Color.white.opacity(0.4), lineWidth: 2)
                    .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)
                    .scaleEffect(1.3)
                    .opacity(0)
                    .animation(
                        .easeOut(duration: 1.0).repeatForever(autoreverses: false),
                        value: isPulsing
                    )
            }

            // Icon or loading indicator
            if isLoading {
                ProgressView()
                    .progressViewStyle(CircularProgressViewStyle(tint: .white))
                    .scaleEffect(1.2)
            } else {
                Image(systemName: icon)
                    .font(.system(size: AdaptiveSizing.micIconSize))
                    .foregroundColor(.white)
                    .contentTransition(.symbolEffect(.replace))
                    .animation(.smooth(duration: 0.3), value: icon)
            }
        }
    }
    .buttonStyle(.plain)
    .glassEffect(.regular.interactive())
    micContent
        .onLongPressGesture(minimumDuration: 0.5, perform: { onLongPress?() ?? action() })
        .onTapGesture(perform: action)
        .glassEffect(.regular.interactive())
} else {
    Button(action: action) {
        ZStack {
            // Background circle
            Circle()
                .fill(isActive ? activeColor : inactiveColor)
                .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)

            // Pulsing effect when active
            if isPulsing {
                Circle()
                    .stroke(Color.white.opacity(0.4), lineWidth: 2)
                    .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)
                    .scaleEffect(1.3)
                    .opacity(0)
                    .animation(
                        .easeOut(duration: 1.0).repeatForever(autoreverses: false),
                        value: isPulsing
                    )
            }

            // Icon or loading indicator
            if isLoading {
                ProgressView()
                    .progressViewStyle(CircularProgressViewStyle(tint: .white))
                    .scaleEffect(1.2)
            } else {
                Image(systemName: icon)
                    .font(.system(size: AdaptiveSizing.micIconSize))
                    .foregroundColor(.white)
                    .contentTransition(.symbolEffect(.replace))
                    .animation(.smooth(duration: 0.3), value: icon)
            }
        }
    }
    .buttonStyle(.plain)
    micContent
        .onLongPressGesture(minimumDuration: 0.5, perform: { onLongPress?() ?? action() })
        .onTapGesture(perform: action)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does a SwiftUI view using .onTapGesture / .onLongPressGesture provide the same accessibility button traits and keyboard activation as Button, and what is the recommended way to add a long-press action while preserving Button semantics?

💡 Result:

No.

  • A view that you make tappable with .onTapGesture / .onLongPressGesture is still just “a view with gestures”; it doesn’t automatically become an accessible control with Button semantics (button trait, default “activate” action, keyboard/focus activation behavior, etc.). This is commonly called out as something you “lose” vs. using Button. [2] Some tooling even flags .onTapGesture as an accessibility issue unless you explicitly add .isButton/.isLink. [3]
  • By contrast, SwiftUI Button is treated as a button by accessibility by default (it carries the button trait, which is why Apple advises you not to put “button” in the accessibility label). [1], [4]

Recommended way to add a long-press action and keep Button semantics

Keep the primary interaction as a real Button, then layer a long-press interaction on top:

Option A (best when long-press is “more actions”): use contextMenu on the Button

  • iOS’s standard long-press pattern is a context menu, and Apple explicitly recommends that if you “overload a long press” / provide a context menu, you should also expose equivalent actions via accessibility (e.g., VoiceOver actions rotor / keyboard command). [5]

Option B (when long-press is a distinct secondary action): attach a long-press gesture to the Button

  • e.g. Button { primary() } label: { … } .simultaneousGesture(LongPressGesture().onEnded { _ in secondary() }) (a common SwiftUI approach). [6]
  • If that secondary action matters for accessibility, also expose it as an accessibility action (so it’s discoverable/triggerable without performing a long-press). [5]

🏁 Script executed:

cd examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers && wc -l AdaptiveLayout.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 1911


🏁 Script executed:

cd examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers && sed -n '490,530p' AdaptiveLayout.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 1911


Use Button with layered long-press gesture to preserve accessibility.

Replacing a Button with .onTapGesture / .onLongPressGesture on a plain view removes the automatic button accessibility traits, keyboard activation, and VoiceOver support. This makes the primary mic control inaccessible to users with VoiceOver and Full Keyboard Access.

Keep Button as the base and add long-press behavior on top using one of these approaches:

  • Use contextMenu on the Button if long-press shows additional options (iOS standard pattern)
  • Use .simultaneousGesture(LongPressGesture()) on the Button if long-press is a distinct secondary action

If the long-press action is important, also expose it as an accessibility action so it's discoverable and triggerable without requiring a gesture.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift` around
lines 511 - 519, The current micContent uses .onTapGesture/.onLongPressGesture
on a plain view which strips Button accessibility; instead wrap micContent in a
Button (use the existing action as the Button's primary action) and add the
long-press behavior via .simultaneousGesture(LongPressGesture(minimumDuration:
0.5).onEnded { _ in (onLongPress ?? action)() }) or use .contextMenu if the long
press shows options, and also add an accessibilityAction for the long-press so
VoiceOver/Full Keyboard Access can invoke it; update the code paths that
reference micContent, onLongPress, and action accordingly.
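The recommended shape can be sketched as follows; `micContent`, `onLongPress`, and `action` match the names used in `AdaptiveLayout.swift`, but the wrapper view and the "Hold to talk" action name are illustrative and the exact modifiers may need adapting:

```swift
import SwiftUI

// Sketch: keep Button as the base control (preserving the button trait,
// keyboard activation, and VoiceOver support) and layer the long-press on top.
struct MicButton: View {
    let action: () -> Void
    let onLongPress: (() -> Void)?
    let micContent: AnyView  // stand-in for the existing mic visuals

    var body: some View {
        Button(action: action) { micContent }
            .buttonStyle(.plain)
            .simultaneousGesture(
                LongPressGesture(minimumDuration: 0.5)
                    .onEnded { _ in (onLongPress ?? action)() }
            )
            // Expose the long-press so VoiceOver and Full Keyboard Access
            // users can trigger it without performing the gesture.
            .accessibilityAction(named: "Hold to talk") { (onLongPress ?? action)() }
    }
}
```

Because `Button` supplies its own tap handling, only the long-press needs `.simultaneousGesture`; the separate `.onTapGesture` can be dropped entirely.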

Comment on lines +123 to +125
public func resumeListening() async {
guard isRunning else { return }
try? await startListening()

⚠️ Potential issue | 🟠 Major

resumeListening() should not silently stack another monitor loop.

startListening() always launches startAudioLevelMonitoring() again (Lines 146-160), and none of those tasks are cancelled while isRunning stays true. In push-to-talk mode, every resume adds another .listening loop; try? also hides a failed restart from the caller. Track/cancel the monitoring task before restarting and surface failures from this API.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 123 - 125, resumeListening currently calls startListening() with
try? which both hides errors and causes startAudioLevelMonitoring (launched by
startListening) to be started repeatedly; update resumeListening (and related
state) to first cancel any existing audio-level monitoring task (e.g., cancel
audioLevelMonitoringTask or equivalent) or guard against duplicate monitoring,
then call await startListening() without try? so errors are propagated to the
caller; alternatively adjust startListening/startAudioLevelMonitoring to be
idempotent (check a monitoringTask flag) so repeated calls don't spawn
additional monitoring loops—use the concrete symbols resumeListening,
startListening, startAudioLevelMonitoring, isRunning and whichever variable
holds the monitoring Task to implement cancellation/guarding and remove the try?
suppression.
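The guarded resume can be modeled with a small toy class; the names mirror `RunAnywhere+VoiceSession`, but the bodies and the `monitoringTask` property are assumed stand-ins for whatever task actually drives `startAudioLevelMonitoring`:

```swift
// Toy model: cancel any live monitoring task before restarting, and let
// errors propagate instead of hiding them behind try?.
final class VoiceSessionModel {
    var isRunning = true
    private var monitoringTask: Task<Void, Never>?
    private(set) var monitorStartCount = 0

    private func startAudioLevelMonitoring() {
        monitoringTask = Task { /* poll audio levels until cancelled */ }
        monitorStartCount += 1
    }

    func startListening() throws {
        // Real code restarts audio capture here and can throw.
        startAudioLevelMonitoring()
    }

    func resumeListening() throws {
        guard isRunning else { return }
        monitoringTask?.cancel()   // at most one monitor loop alive at a time
        monitoringTask = nil
        try startListening()       // no try? — restart failures reach the caller
    }

    var hasLiveMonitor: Bool {
        guard let task = monitoringTask else { return false }
        return !task.isCancelled
    }
}
```

Repeated resumes now cancel the previous loop before spawning a new one, so push-to-talk never stacks `.listening` loops, and a failed restart surfaces as a thrown error rather than a silent no-op.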
