feat(iii-aura): add on-device multimodal voice + vision worker #19

Open
rohitg00 wants to merge 1 commit into main from feat/iii-aura

Conversation

@rohitg00

@rohitg00 rohitg00 commented Apr 8, 2026

Summary

  • Adds iii-aura, a real-time on-device multimodal AI assistant (voice + vision) wired entirely through iii primitives
  • Python worker runs Gemma 4 E2B for speech/vision understanding and Kokoro TTS for voice output, orchestrated through channels, state, and triggers
  • Browser connects as a full iii worker via iii-browser-sdk — registers functions, creates channels, handles playback

What's included

  • python-worker/ — Gemma 4 E2B inference + Kokoro TTS, registers aura::session::open, aura::ingest::turn, aura::interrupt
  • browser/ — Vite app with VAD, camera capture, audio playback — registers ui::aura::transcript, ui::aura::playback
  • iii-config.example.yaml — Engine config with dual worker-managers (internal + RBAC), state, HTTP, observability
  • README.md — Architecture diagram, quick start, env vars, extending guide

iii primitives used

  • iii-browser-sdk — browser acts as a full worker
  • Channels — binary audio streaming (browser ↔ worker)
  • State — session metadata persistence
  • Triggers (Void) — fire-and-forget push from worker → browser
  • iii-worker-manager (2 ports) — internal for Python, RBAC-filtered for browser

Test plan

  • uv sync && uv run iii-aura starts and connects to the engine
  • Browser at localhost:5180 connects, opens session, captures voice + camera
  • LLM inference returns transcription + response via ui::aura::transcript
  • TTS audio streams back via channel and plays in browser
  • Barge-in (speaking during playback) interrupts and restarts listening

Summary by CodeRabbit

  • New Features

    • Introduced iii-aura: on-device multimodal voice+vision app with a browser UI, real-time transcription, AI responses, camera support, barge-in, and streamed TTS playback.
  • Documentation

    • Added a comprehensive README with setup, runtime flow, environment variables, quick-start commands, and extension guidance.
  • Chores

    • Added packaging, example configuration, browser build config, and a .gitignore to exclude local/generated artifacts.

@coderabbitai

coderabbitai bot commented Apr 8, 2026

📝 Walkthrough

Walkthrough

Adds Aura: an on-device multimodal voice+vision system with a browser UI, a Python worker for LLM inference and TTS, example iii engine config, and supporting build/package files to enable audio/image capture, channel-based ingestion, and streamed TTS playback.

Changes

  • Project metadata & docs (iii-aura/.gitignore, iii-aura/README.md, iii-aura/iii-config.example.yaml): Adds gitignore, comprehensive README describing architecture/flows/env vars/quickstart, and example iii engine YAML (workers, state, HTTP, RBAC, observability, optional queue).
  • Browser app & build (iii-aura/browser/package.json, iii-aura/browser/tsconfig.json, iii-aura/browser/vite.config.ts, iii-aura/browser/index.html): New Vite/ESM browser project config, TypeScript settings, dev server port, and dark-themed HTML UI scaffolding with stable element IDs for video, waveform, messages, controls, and external lib imports.
  • Browser runtime (iii-aura/browser/src/aura.ts): Implements export async function init() wiring DOM, camera/mic capture, Silero VAD, waveform rendering, iii connection, UI functions (ui::aura::transcript, ui::aura::playback), session open, ingestion channel publishing, barge-in/interrupt handling, and streaming playback management.
  • Python packaging (iii-aura/python-worker/pyproject.toml, iii-aura/python-worker/src/iii_aura/__init__.py): Adds Python package metadata (hatchling), deps (iii-sdk, litert-lm, numpy, hf-hub), optional platform groups, console entry point iii-aura, and package docstring.
  • Inference module (iii-aura/python-worker/src/iii_aura/inference.py): Introduces model path resolution (env or HF download), Gemma-4 engine load/unload lifecycle, global engine and tool_result, and respond_to_user() to record LLM tool outputs.
  • TTS module (iii-aura/python-worker/src/iii_aura/tts.py): Provides TTSBackend interface plus MLXBackend (Apple Silicon) and ONNXBackend (fallback) implementations, platform detection, model loading (HF or MLX), and load() to select backend with sample rate exposure.
  • Python worker core (iii-aura/python-worker/src/iii_aura/worker.py): Adds worker main entry and registers engine functions (aura::session::open, aura::ingest::turn, aura::interrupt, aura::session::close). Manages per-session interrupt events, custom executor, multimodal prompt construction, LLM inference, emits ui::aura::transcript, streams sentence-by-sentence TTS over playback channels, and installs an HTTP trigger.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Browser as Browser Client
    participant VAD as VAD Speech Detector
    participant Engine as iii Engine/Manager
    participant Worker as Python Worker
    participant LLM as LLM Inference
    participant TTS as TTS Audio Generator

    Browser->>VAD: Capture mic audio stream
    VAD-->>Browser: Speech start/end events
    Browser->>Browser: Convert to WAV, capture camera frame (optional)
    Browser->>Engine: Create channel, publish audio + image metadata
    Browser->>Engine: Trigger aura::ingest::turn
    Engine->>Worker: Deliver turn data (channel reader)
    Worker->>LLM: Run multimodal inference (transcription + response)
    LLM-->>Worker: Return text response
    Worker->>Engine: Emit ui::aura::transcript
    Worker->>TTS: Generate audio per sentence
    TTS-->>Worker: Stream PCM chunks
    Worker->>Engine: Stream playback channel audio chunks
    Engine-->>Browser: Browser receives audio chunks
    Browser->>Browser: Play via WebAudio
    Browser->>Engine: Trigger aura::interrupt (barge-in)
    Engine->>Worker: Signal cancellation for in-flight inference/TTS
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

"I hop and hum where audio meets sight,
Channels flow softly through day and night.
From browser burrow to worker lair,
Multimodal magic fills the air. 🐇✨"

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 36.59%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the PR title clearly and concisely describes the main feature addition: an on-device multimodal voice and vision worker for the iii-aura project, accurately reflecting the core change across all added files.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

🧹 Nitpick comments (5)
iii-aura/README.md (1)

7-19: Add language identifier to the fenced code block.

The architecture diagram code block is missing a language specifier. While this is an ASCII diagram rather than code, adding a language identifier (or text/plaintext) satisfies the markdown linter and improves rendering consistency.

Suggested fix

````diff
-```
+```text
 Browser (iii-browser-sdk)          iii Engine               Python Worker (iii-sdk)
````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/README.md` around lines 7 - 19, The fenced ASCII diagram block in
the README (the triple-backtick block containing "Browser (iii-browser-sdk)     
iii Engine               Python Worker (iii-sdk)" and the diagram lines) lacks a
language identifier; update that opening fence to include a plain text language
(e.g., "text" or "plaintext") so the markdown linter accepts it and rendering
stays consistent—locate the triple-backtick block in README.md and change ``` to
```text (or ```plaintext).
iii-aura/python-worker/src/iii_aura/worker.py (3)

90-91: Document why the localhost→127.0.0.1 URL replacement is needed.

This URL replacement appears in two places (lines 90-91 and 177-180). Consider extracting this to a helper function and adding a comment explaining why this workaround is necessary (e.g., DNS resolution issues in certain container environments).

Suggested refactor

```diff
+def _fix_localhost_url(obj: Any, attr: str) -> None:
+    """Replace 'localhost' with '127.0.0.1' in channel URLs.
+
+    Workaround for DNS resolution issues in certain container/network
+    configurations where 'localhost' may not resolve correctly.
+    """
+    if hasattr(obj, attr):
+        url = getattr(obj, attr)
+        if '://localhost' in url:
+            setattr(obj, attr, url.replace('://localhost', '://127.0.0.1'))
+
 async def _ingest_turn(data: dict[str, Any]) -> dict[str, Any]:
     # ...
-    if hasattr(reader, '_url'):
-        reader._url = reader._url.replace('://localhost', '://127.0.0.1')
+    _fix_localhost_url(reader, '_url')
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/worker.py` around lines 90 - 91, Extract
the repeated localhost→127.0.0.1 replacement into a small helper (e.g.,
normalize_localhost_url or replace_localhost_with_loopback) that accepts the
reader object, checks hasattr(reader, '_url'), and updates reader._url =
reader._url.replace('://localhost', '://127.0.0.1'); replace the two inline
blocks with calls to that helper and add a concise comment on the helper
explaining the rationale (workaround for DNS/localhost resolution issues in some
container/network environments where localhost does not resolve or bypasses
network stack), so both occurrences (the current inline replacement around
reader._url at the two locations) are consolidated and documented.

146-149: Replace lambda assignment with a named function.

Per Ruff E731, lambda expressions should not be assigned to variables. Use a def statement instead for better readability and debugging.

Suggested fix

```diff
+def _strip_markers(s: str) -> str:
+    return s.replace('<|"|>', "").strip()
+
     response = await loop.run_in_executor(_executor, _infer)
     llm_time = time.time() - t0

-    strip = lambda s: s.replace('<|"|>', "").strip()
     if inference.tool_result:
-        transcription = strip(inference.tool_result.get("transcription", ""))
-        text_response = strip(inference.tool_result.get("response", ""))
+        transcription = _strip_markers(inference.tool_result.get("transcription", ""))
+        text_response = _strip_markers(inference.tool_result.get("response", ""))
     else:
         transcription = None
         text_response = response["content"][0]["text"]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/worker.py` around lines 146 - 149, The
code assigns a lambda to the name `strip` which violates Ruff E731; replace the
lambda assignment with a proper function definition (e.g., define def strip(s):
return s.replace('<|"|>', "").strip()) and keep the subsequent usage unchanged
where `transcription = strip(inference.tool_result.get("transcription", ""))`
and `text_response = strip(inference.tool_result.get("response", ""))`; ensure
the new `strip` function is defined in the same scope so calls from the
`inference.tool_result` handling still work.

127-127: Use asyncio.get_running_loop() instead of deprecated get_event_loop().

asyncio.get_event_loop() is deprecated since Python 3.10 when called from a coroutine. Use asyncio.get_running_loop() which is the recommended approach in async contexts.

Suggested fix

```diff
-    loop = asyncio.get_event_loop()
+    loop = asyncio.get_running_loop()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/worker.py` at line 127, Replace the
deprecated asyncio.get_event_loop() call used to assign loop (the line creating
the variable named loop) with asyncio.get_running_loop() so the code uses the
running event loop in async contexts; ensure this change is made where loop is
set (in worker.py) and only called from within a coroutine or async function (or
adjust the caller to obtain the loop in an appropriate async context) to avoid
RuntimeError.
iii-aura/python-worker/src/iii_aura/tts.py (1)

42-47: Consider using the sample rate returned by the model instead of hardcoding.

ONNXBackend hardcodes sample_rate = 24000 but the create() method returns the actual sample rate as _sr which is discarded. If the model version changes, this could cause audio playback issues.

Suggested fix

```diff
 class ONNXBackend(TTSBackend):
     def __init__(self):
         import kokoro_onnx  # type: ignore[import-not-found]
         from huggingface_hub import hf_hub_download

         model_path = hf_hub_download("fastrtc/kokoro-onnx", "kokoro-v1.0.onnx")
         voices_path = hf_hub_download("fastrtc/kokoro-onnx", "voices-v1.0.bin")

         self._model = kokoro_onnx.Kokoro(model_path, voices_path)
-        self.sample_rate = 24000
+        # Initialize with expected rate; updated on first generate() if different
+        self.sample_rate = 24000

     def generate(self, text: str, voice: str = "af_heart", speed: float = 1.1) -> np.ndarray:
         pcm, _sr = self._model.create(text, voice=voice, speed=speed)
+        if _sr != self.sample_rate:
+            self.sample_rate = _sr
         return pcm
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/tts.py` around lines 42 - 47, The class
currently hardcodes self.sample_rate = 24000 but the model returns the actual
rate from _model.create(text, ...). Change the code so the sample rate comes
from the model: use the _sr returned by kokoro_onnx.Kokoro.create (called in
generate) to set self.sample_rate (or initialize it from the model if
kokoro_onnx exposes a sample rate) instead of keeping the fixed 24000; update
the generate method (and/or __init__) to assign self.sample_rate = _sr after
calling _model.create so playback uses the model-provided rate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@iii-aura/browser/index.html`:
- Around line 348-349: Remove the two CDN script tags that load onnxruntime-web
and `@ricky0123/vad-web` from iii-aura/browser/index.html and instead add them as
npm dependencies (onnxruntime-web and `@ricky0123/vad-web`), import the modules
from those package names in your front-end entry module (e.g., main.ts or
main.js) so Vite will bundle them, and update any global references to use the
imported symbols (e.g., replace window.ort or window.VAD usage with the local
imports) ensuring Vite config does not exclude these packages from the build.

In `@iii-aura/browser/src/aura.ts`:
- Around line 371-377: The addMessage function currently assigns untrusted
`text` and `meta` into `div.innerHTML`, creating an XSS risk; change it to
create and append child nodes instead: set `div.className` as-is, create a
content node and assign `text` to its `textContent` (or use a separate
`renderHtml` boolean parameter to opt into sanitized HTML rendering), create a
separate `metaEl` element and set its `textContent` to `meta` before appending,
then append both to `messagesEl` and update scroll; ensure any caller that
intentionally passes HTML (e.g., the loading-dots usage) either passes
`renderHtml: true` and the content is sanitized or is refactored to use a
dedicated HTML-only helper so untrusted LLM responses always use `textContent`.
- Around line 69-76: The code assigns untrusted data.transcription directly to
lastUserMsg.innerHTML causing an XSS risk; change this to set the transcription
as plain text (use lastUserMsg.textContent or createTextNode) and preserve the
existing meta element by locating meta (from lastUserMsg.querySelector('.meta'))
and appending meta.cloneNode(true) after the text node (or re-attach the
original meta element) instead of concatenating strings into innerHTML; update
the block that references messagesEl, lastUserMsg, meta and innerHTML to use
textContent/createTextNode + appendChild(cloneNode) so transcription is not
parsed as HTML.
- Around line 100-114: The audio playback relies only on receiving the
'audio_end' message via reader.onMessage (inside the audio_end branch), so if
that message is lost the UI can remain in 'speaking' state; add a timeout-based
fallback: when you start playback (where ignoreIncomingAudio is set/stopPlayback
is called), start or reset a playback timeout (e.g., playbackTimeout) and clear
it when reader.onMessage sees parsed.type === 'audio_end' (and when stopPlayback
is called), and on timeout expiry call stopPlayback() and setState('listening')
to recover; ensure the timeout is canceled/cleared whenever ignoreIncomingAudio
is toggled or normal audio_end processing runs to avoid double-handling.

In `@iii-aura/iii-config.example.yaml`:
- Around line 36-44: The CORS default is too permissive: when host is set to
0.0.0.0 and the config key allowed_origins contains "*" any website can call the
HTTP API from browsers; change the example in iii-config.example.yaml so
allowed_origins no longer uses a wildcard and instead restricts to safe defaults
(e.g., localhost entries such as "http://localhost:3111" and
"http://127.0.0.1:3111" or empty list), leaving allowed_methods (GET, POST,
OPTIONS) unchanged; update the allowed_origins value and add a short comment
next to host/port to indicate these are localhost-only defaults for development.

In `@iii-aura/python-worker/pyproject.toml`:
- Around line 10-24: The project is missing the huggingface-hub dependency
required by iii_aura/inference.py and iii_aura/tts.py which import
hf_hub_download; add "huggingface-hub>=0.16.0" to the main dependencies list in
pyproject.toml (alongside "iii-sdk", "litert-lm>=0.10.1", "numpy>=2.0") so fresh
installs won't raise ModuleNotFoundError, using the permissive lower bound
suggested instead of the overly restrictive >=0.23.0.

In `@iii-aura/python-worker/src/iii_aura/inference.py`:
- Around line 44-56: The load() function calls engine.__enter__() on a
litert_lm.Engine instance (created via litert_lm.Engine(...)) but never releases
it; add a corresponding cleanup path by implementing an unload() that calls
engine.__exit__(None, None, None) (and sets engine = None) and register it with
atexit.register(unload) after engine is created, or alternatively ensure
whatever lifecycle manager calls resolve_model_path()/load() will invoke
engine.__exit__(); reference the load(), engine, engine.__enter__(),
engine.__exit__(None, None, None), and atexit to locate the change.

In `@iii-aura/python-worker/src/iii_aura/worker.py`:
- Line 33: The _interrupts dict currently stores asyncio.Event per session and
is never cleaned up; add explicit cleanup to remove the session key when a
session or turn completes (e.g., after handle_turn or at session close) or
implement a session close function that deletes _interrupts[session_id] (and
cancels/sets the Event if needed) to avoid memory leaks; update functions that
create entries (where _interrupts[session_id] = asyncio.Event()) to ensure
corresponding removal and defensively check presence before use.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b4445f09-3423-4303-ba5b-2cd7067d4e36

📥 Commits

Reviewing files that changed from the base of the PR and between b0f6adc and 7aed39e.

📒 Files selected for processing (13)
  • iii-aura/.gitignore
  • iii-aura/README.md
  • iii-aura/browser/index.html
  • iii-aura/browser/package.json
  • iii-aura/browser/src/aura.ts
  • iii-aura/browser/tsconfig.json
  • iii-aura/browser/vite.config.ts
  • iii-aura/iii-config.example.yaml
  • iii-aura/python-worker/pyproject.toml
  • iii-aura/python-worker/src/iii_aura/__init__.py
  • iii-aura/python-worker/src/iii_aura/inference.py
  • iii-aura/python-worker/src/iii_aura/tts.py
  • iii-aura/python-worker/src/iii_aura/worker.py

Real-time voice and vision AI assistant powered by Gemma 4 E2B and
Kokoro TTS, orchestrated entirely through iii primitives. Browser
connects via iii-browser-sdk, Python worker handles LLM inference
and streaming TTS over channels.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
iii-aura/python-worker/src/iii_aura/worker.py (1)

47-59: Fragile workaround accessing CPython internals.

The _SafeExecutor workaround for litert_lm poisoning concurrent.futures.thread._shutdown is well-documented but relies on CPython implementation details. This could break with Python version updates. Consider adding a comment noting which Python versions this has been tested with, and monitor for breakage when upgrading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/worker.py` around lines 47 - 59, Add a
short, clear comment above the _SafeExecutor class (and near the _cft._shutdown
write in submit) documenting which CPython versions this workaround was tested
on and warning that it relies on CPython internals
(concurrent.futures.thread._shutdown) and may break on upgrades; also add a TODO
to monitor or remove this workaround when upgrading Python and reference the
symbols _SafeExecutor, submit, _cft._shutdown, and the module-level _executor so
reviewers can find the implementation easily.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@iii-aura/python-worker/src/iii_aura/worker.py`:
- Around line 158-161: The else branch assumes response["content"][0]["text"]
exists, which can raise KeyError/IndexError when inference.tool_result is falsy
or response has no content; update the logic in the worker processing where
transcription and text_response are set (variables: transcription,
text_response, response, inference.tool_result) to defensively validate response
is a dict with a non-empty "content" list and that the first item has a "text"
key before accessing it, and if validation fails set text_response to a safe
default (e.g., empty string or an error message) and log a warning/error so the
turn fails gracefully instead of raising an exception.
- Around line 282-287: The main() function currently registers handlers (e.g.,
via iii.on_functions_available) then returns, allowing the process to exit while
async handlers still expect a running event loop; add a blocking mechanism in
main() using a threading.Event (e.g., stop = threading.Event()), register signal
handlers for SIGTERM and SIGINT that call stop.set(), call stop.wait() to block
the main thread, and on unblock call iii.shutdown() to cleanly stop the worker
so async handlers can run without the process exiting prematurely.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c66c2808-36eb-4360-9899-36a6cf217aeb

📥 Commits

Reviewing files that changed from the base of the PR and between 7aed39e and 7f3f8b1.

📒 Files selected for processing (13)
  • iii-aura/.gitignore
  • iii-aura/README.md
  • iii-aura/browser/index.html
  • iii-aura/browser/package.json
  • iii-aura/browser/src/aura.ts
  • iii-aura/browser/tsconfig.json
  • iii-aura/browser/vite.config.ts
  • iii-aura/iii-config.example.yaml
  • iii-aura/python-worker/pyproject.toml
  • iii-aura/python-worker/src/iii_aura/__init__.py
  • iii-aura/python-worker/src/iii_aura/inference.py
  • iii-aura/python-worker/src/iii_aura/tts.py
  • iii-aura/python-worker/src/iii_aura/worker.py
✅ Files skipped from review due to trivial changes (8)
  • iii-aura/python-worker/src/iii_aura/__init__.py
  • iii-aura/browser/vite.config.ts
  • iii-aura/browser/tsconfig.json
  • iii-aura/browser/package.json
  • iii-aura/iii-config.example.yaml
  • iii-aura/python-worker/pyproject.toml
  • iii-aura/README.md
  • iii-aura/.gitignore
🚧 Files skipped from review as they are similar to previous changes (2)
  • iii-aura/python-worker/src/iii_aura/inference.py
  • iii-aura/python-worker/src/iii_aura/tts.py

Comment on lines +158 to +161

```python
    else:
        transcription = None
        text_response = response["content"][0]["text"]
```

⚠️ Potential issue | 🟡 Minor

Potential KeyError/IndexError if LLM response structure is unexpected.

If inference.tool_result is empty (falsy) and the LLM response doesn't conform to the expected structure (missing content key or empty list), this will raise an unhandled exception, causing the turn to fail without a graceful error message.

Suggested defensive fix

```diff
     if inference.tool_result:
         transcription = _clean_model_tokens(inference.tool_result.get("transcription", ""))
         text_response = _clean_model_tokens(inference.tool_result.get("response", ""))
     else:
         transcription = None
-        text_response = response["content"][0]["text"]
+        try:
+            text_response = response["content"][0]["text"]
+        except (KeyError, IndexError, TypeError):
+            logger.error("Unexpected LLM response structure", {"response": str(response)[:200]})
+            return {"error": "unexpected_response_format"}
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```python
    else:
        transcription = None
        try:
            text_response = response["content"][0]["text"]
        except (KeyError, IndexError, TypeError):
            logger.error("Unexpected LLM response structure", {"response": str(response)[:200]})
            return {"error": "unexpected_response_format"}
```

Comment on lines +282 to +287
iii.on_functions_available(
lambda fns: logger.info(f"Aura worker ready — {len(fns)} functions available")
)

logger.info("iii-aura worker started")


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if register_worker is expected to block or if there's a wait/run method
rg -n "register_worker|\.wait\(|\.run\(" --type py iii-aura/python-worker/

Repository: iii-hq/workers

Length of output: 300


🏁 Script executed:

# Read the main() function to see how register_worker is used and what happens after
head -300 iii-aura/python-worker/src/iii_aura/worker.py | tail -100

Repository: iii-hq/workers

Length of output: 3549


🏁 Script executed:

# Also search for other worker implementations to understand the expected pattern
find . -name "worker.py" -o -name "*worker*.py" | head -10

Repository: iii-hq/workers

Length of output: 105


🏁 Script executed:

# Search for async/await patterns in worker.py
grep -n "async\|await\|asyncio" iii-aura/python-worker/src/iii_aura/worker.py

Repository: iii-hq/workers

Length of output: 1136


🏁 Script executed:

# Check if there are any imports of asyncio or event loop management
grep -n "import.*asyncio\|import.*loop\|import.*event" iii-aura/python-worker/src/iii_aura/worker.py

Repository: iii-hq/workers

Length of output: 75


🏁 Script executed:

# Look for any blocking calls or event loop starts in the entire repo
rg "asyncio\.|event_loop|\.run\(|\.start\(|while True" --type py -A 2 iii-aura/python-worker/ | head -30

Repository: iii-hq/workers

Length of output: 982


🏁 Script executed:

# Look for other worker implementations
find . -path ./node_modules -prune -o -name "*.py" -type f -exec grep -l "register_worker" {} \;

Repository: iii-hq/workers

Length of output: 138


🏁 Script executed:

# Check for README or documentation about the iii framework
find . -name "README*" -o -name "*.md" | xargs grep -l "register_worker\|event.loop" 2>/dev/null | head -5

Repository: iii-hq/workers

Length of output: 78


🏁 Script executed:

# Look at the full imports in worker.py to understand iii better
head -30 iii-aura/python-worker/src/iii_aura/worker.py

Repository: iii-hq/workers

Length of output: 849


🏁 Script executed:

# Read the todo-worker-python to see how it uses register_worker
cat ./todo-worker-python/src/main.py

Repository: iii-hq/workers

Length of output: 1714


🏁 Script executed:

# Also check the README to see if there's any documentation about the pattern
cat ./iii-aura/README.md | head -50

Repository: iii-hq/workers

Length of output: 2422


Add a blocking mechanism to keep the worker alive.

The main() function registers functions and handlers but returns without blocking. Since asyncio.get_running_loop() is called in async handlers, the event loop is running internally, but the main process will exit when main() completes. Add a blocking call (similar to the todo-worker pattern) and proper signal handling:

import signal
import threading

stop = threading.Event()
signal.signal(signal.SIGTERM, lambda *_: stop.set())
signal.signal(signal.SIGINT, lambda *_: stop.set())
stop.wait()
iii.shutdown()

Without this, the worker process terminates immediately after startup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@iii-aura/python-worker/src/iii_aura/worker.py` around lines 282-287, the
main() function currently registers handlers (e.g., via
iii.on_functions_available) and then returns, allowing the process to exit
while async handlers still expect a running event loop. Add a blocking
mechanism in main(): create a threading.Event (e.g., stop = threading.Event()),
register signal handlers for SIGTERM and SIGINT that call stop.set(), call
stop.wait() to block the main thread, and on unblock call iii.shutdown() to
cleanly stop the worker so async handlers can run without the process exiting
prematurely.

