[02/09] audio: capture lifecycle fix + ALSA stderr suppression #124
Merged
6 commits
f779e4a
[01/09] config: centralize Settings + path-aware load
b92498b
Add config dependency and lib section to Cargo.toml
aa08f41
Update Cargo.lock for config dependency
2d12d08
[02/09] audio: capture lifecycle fix + ALSA stderr suppression
4a44dac
fix(audio): decouple wav_file_loader decl, cfg(unix) + safety notes (…
b0c61f5
Merge main into 02-audio-capture
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,120 @@ | ||
| # Application Configuration | ||
|
|
||
| This directory contains the configuration files for the ColdVox application. | ||
|
|
||
| ## `default.toml` | ||
|
|
||
| This is the primary configuration file for the application. It contains the default settings for all components, including text injection, VAD, and STT. | ||
|
|
||
| The application loads this file at startup. The values in this file can be overridden by environment variables or command-line arguments. | ||
|
|
||
| ## Security Best Practices | ||
|
|
||
| **Important Security Note:** Do not store secrets, API keys, passwords, or any sensitive information in `default.toml` or any committed configuration files. This file is intended for default, non-sensitive values only and should be version-controlled. | ||
|
|
||
| - Use environment variables for overriding sensitive values (e.g., `COLDVOX_STT__PREFERRED=your_secret_plugin`). Refer to [docs/user/runflags.md](docs/user/runflags.md) for all overridable variables. | ||
| - For local development or production overrides, create a `config/overrides.toml` file with your custom settings. Add `config/overrides.toml` to your `.gitignore` to prevent accidental commits of sensitive data. | ||
| - If implementing custom loading, you can extend the config builder in `crates/app/src/main.rs` to include `overrides.toml` after `default.toml` for layered overrides. | ||
| - Always validate and sanitize configuration values at runtime to prevent injection attacks or invalid settings. | ||
|
|
||
| Example `overrides.toml` template (create this file for local use): | ||
|
|
||
| ```toml | ||
| # Local overrides for default.toml - add to .gitignore! | ||
| # This file is not loaded by default; extend Settings::new() if needed. | ||
|
|
||
| # Example: Override injection settings | ||
| [injection] | ||
| fail_fast = true # Maps to COLDVOX_INJECTION__FAIL_FAST=true | ||
| max_total_latency_ms = 500 # Maps to COLDVOX_INJECTION__MAX_TOTAL_LATENCY_MS=500 | ||
|
|
||
| # Example: STT preferences (avoid committing model paths with secrets) | ||
| [stt] | ||
| preferred = "local_whisper" # Maps to COLDVOX_STT__PREFERRED=local_whisper | ||
| max_mem_mb = 2048 # Maps to COLDVOX_STT__MAX_MEM_MB=2048 | ||
| ``` | ||
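The `Maps to` comments in the template above follow a naming convention that can be sketched in plain Rust. This is only an illustration, assuming a `COLDVOX_` prefix and `__` as the nesting separator, as the examples suggest. `env_to_key` is a hypothetical helper, not part of ColdVox or of any configuration library.

```rust
/// Hypothetical helper: translate an environment variable name into a
/// dotted config key, e.g. `COLDVOX_INJECTION__FAIL_FAST` -> `injection.fail_fast`.
/// Assumes the `COLDVOX_` prefix and the `__` section separator.
fn env_to_key(var: &str) -> Option<String> {
    // Variables without the prefix are not ours.
    let rest = var.strip_prefix("COLDVOX_")?;
    if rest.is_empty() {
        return None;
    }
    // `__` separates nesting levels; a single `_` stays inside a key name.
    let key = rest
        .split("__")
        .map(|part| part.to_ascii_lowercase())
        .collect::<Vec<_>>()
        .join(".");
    Some(key)
}

fn main() {
    for var in ["COLDVOX_INJECTION__FAIL_FAST", "COLDVOX_DEVICE", "PATH"] {
        println!("{} -> {:?}", var, env_to_key(var));
    }
}
```

Under these assumptions a root-level setting such as `device` needs no separator, while every setting inside a TOML table needs exactly one `__`.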
|
|
||
| ## Deployment Considerations | ||
|
|
||
| When deploying ColdVox, handle configurations carefully to ensure security, flexibility, and reliability across environments. | ||
|
|
||
| ### Including config/default.toml in Builds and Deployments | ||
| - **Repository**: Always commit `config/default.toml` as it holds safe, default values. Do not modify it for environment-specific needs. | ||
| - **Build Process**: The TOML is loaded at runtime, not embedded. In CI/CD (e.g., via `cargo build --release`), copy `config/default.toml` to the deployment artifact or container. | ||
| - Example in Dockerfile: | ||
| ``` | ||
| COPY config/default.toml /app/config/ | ||
| COPY target/release/coldvox-app /app/ | ||
| WORKDIR /app | ||
| CMD ["./coldvox-app"] | ||
| ``` | ||
| - For binary distributions: Include in a `config/` subdirectory next to the executable. | ||
| - **Runtime Loading**: The app loads `config/default.toml` relative to the working directory. XDG base directory support is not implemented; to add it, extend `Settings::new()` with an XDG path lookup (see deployment docs for details). | ||
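The XDG lookup mentioned in the last bullet could look roughly like the following. This is a sketch under stated assumptions, not the project's loader: `config_candidates` is a hypothetical helper, the `coldvox/default.toml` location under the XDG config directory is an assumed layout, and the absolute-path check assumes a Unix-style filesystem.

```rust
use std::path::PathBuf;

/// Hypothetical helper: candidate config paths, most specific first.
/// Environment values are passed in as arguments so the logic is easy to test.
fn config_candidates(xdg_config_home: Option<&str>, home: Option<&str>) -> Vec<PathBuf> {
    let mut candidates = Vec::new();

    // XDG Base Directory rules: use $XDG_CONFIG_HOME when it is set, non-empty
    // and absolute; otherwise fall back to $HOME/.config.
    let xdg_base = xdg_config_home
        .filter(|v| !v.is_empty())
        .map(PathBuf::from)
        .filter(|p| p.is_absolute())
        .or_else(|| {
            home.filter(|h| !h.is_empty())
                .map(|h| PathBuf::from(h).join(".config"))
        });

    if let Some(base) = xdg_base {
        // Assumed layout: <config dir>/coldvox/default.toml
        candidates.push(base.join("coldvox").join("default.toml"));
    }

    // Current behaviour: path relative to the working directory.
    candidates.push(PathBuf::from("config/default.toml"));
    candidates
}

fn main() {
    let xdg = std::env::var("XDG_CONFIG_HOME").ok();
    let home = std::env::var("HOME").ok();
    for path in config_candidates(xdg.as_deref(), home.as_deref()) {
        println!("{}", path.display());
    }
}
```

A loader would try each candidate in order and use the first file that exists.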
|
|
||
| ### Environment-Specific Configurations | ||
| - **Overrides via Environment Variables**: Preferred for secrets and dynamic settings. Use the `COLDVOX_` prefix, with `__` separating nested keys: | ||
| - Example for production: `export COLDVOX_STT__PREFERRED=cloud_whisper; export COLDVOX_INJECTION__FAIL_FAST=true`. | ||
| - Nested: `COLDVOX_VAD__SENSITIVITY=0.8` overrides `[vad].sensitivity`. | ||
| - Set in deployment tools: Systemd (`Environment=`), Docker (`-e`), Kubernetes (Secrets/ConfigMaps). | ||
| - **Separate TOML Files for Non-Secrets**: Use `overrides.toml` (or env-specific like `staging.toml`) for bulk overrides. Extend the loader in `crates/app/src/main.rs` to support `COLDVOX_CONFIG_OVERRIDE_PATH=/path/to/staging.toml`. | ||
| - Template extension for staging: | ||
| ```toml | ||
| # staging.toml - non-sensitive overrides | ||
| [stt] | ||
| preferred = "vosk" | ||
| language = "en" | ||
|
|
||
| [injection] | ||
| injection_mode = "keystroke" # Staging: Test keystroke reliability | ||
| ``` | ||
| - Current load order: CLI flags > Env vars > default.toml > hardcoded defaults. Note: `overrides.toml` is a template and NOT automatically loaded. To enable, add `.add_source(File::with_name("config/overrides.toml").required(false))` to `Settings::new()`. | ||
| - **Validation**: On deploy, validate configs (see [docs/deployment.md](docs/deployment.md) for steps, including parsing checks and tests). | ||
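The load order stated in this list (CLI flags, then env vars, then `default.toml`, then hardcoded defaults) amounts to "first value present wins". A minimal sketch of that rule for a single setting, with `resolve` as a hypothetical helper rather than anything taken from the ColdVox source:

```rust
/// Hypothetical helper: pick the first value that is present, in the
/// documented order: CLI flag, environment variable, TOML file, built-in default.
fn resolve<T>(cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(env).or(file).unwrap_or(default)
}

fn main() {
    // Illustrative inputs: no CLI flag, an env override of 500,
    // a file value of 800, and a hardcoded default of 1000.
    let max_total_latency_ms = resolve(None, Some(500_u32), Some(800), 1000);
    println!("max_total_latency_ms = {}", max_total_latency_ms);
}
```

An `overrides.toml` layer, if added, would slot in as one more `Option` between the env and file arguments.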
|
|
||
| ### Best Practices | ||
| - **Secrets Management**: Use tools like HashiCorp Vault, AWS Secrets Manager, or env files (`.env` with `dotenv` if extended). | ||
| - **Rollback**: Back up configs before deploying; fall back to env vars if TOML loading fails. | ||
| - **CI Integration**: Test config loading in workflows (e.g., set mock env vars in `.github/workflows/ci.yml`). | ||
| - For full deployment details, including validation and rollback, refer to [docs/deployment.md](docs/deployment.md). | ||
|
|
||
| ## `plugins.json` | ||
|
|
||
| This file contains the configuration for the STT (Speech-to-Text) plugin manager. It defines the preferred plugin, fallback plugins, and other settings related to plugin management. | ||
|
|
||
| While the main application configuration is in `default.toml`, this file is kept separate to potentially allow for dynamic updates or for management by external tools in the future. | ||
|
|
||
| ## For Test Authors | ||
|
|
||
| Tests that need to load configuration should use `Settings::from_path()` with `CARGO_MANIFEST_DIR`: | ||
|
|
||
| ```rust | ||
| use std::path::PathBuf; | ||
|
|
||
| fn get_test_config_path() -> PathBuf { | ||
| // Try workspace root first (for integration tests) | ||
| let workspace_config = PathBuf::from(env!("CARGO_MANIFEST_DIR")) | ||
| .parent() | ||
| .unwrap() | ||
| .parent() | ||
| .unwrap() | ||
| .join("config/default.toml"); | ||
|
|
||
| if workspace_config.exists() { | ||
| return workspace_config; | ||
| } | ||
|
|
||
| // Fallback to relative path from crate root | ||
| PathBuf::from(env!("CARGO_MANIFEST_DIR")) | ||
| .join("../../config/default.toml") | ||
| } | ||
|
|
||
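| #[test] | ||
| fn my_test() { | ||
| let config_path = get_test_config_path(); | ||
| // `?` cannot be used in a test that returns `()`, so fail loudly instead. | ||
| let settings = Settings::from_path(&config_path).expect("failed to load test config"); | ||
| // ... test logic | ||
| } | ||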
| ``` | ||
|
|
||
| This ensures tests work regardless of working directory context. |
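An alternative to hard-coding two `.parent()` hops is to walk up the directory tree until the file is found. The sketch below uses only the standard library. `find_config_upwards` is a hypothetical helper, and in a real test the starting point would presumably be `Path::new(env!("CARGO_MANIFEST_DIR"))` rather than the current directory used here.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: starting at `start`, check each ancestor directory
/// for `config/default.toml` and return the first match.
fn find_config_upwards(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join("config").join("default.toml"))
        .find(|candidate| candidate.is_file())
}

fn main() {
    let start = std::env::current_dir().expect("current directory should be readable");
    match find_config_upwards(&start) {
        Some(path) => println!("found {}", path.display()),
        None => println!("no config/default.toml above {}", start.display()),
    }
}
```

This keeps working if a crate is moved deeper into or higher up the workspace, at the cost of possibly picking up an unrelated `config/default.toml` further up the tree.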
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| # ColdVox default configuration file | ||
| # Root-level app settings | ||
| resampler_quality = "balanced" # "fast", "balanced", "quality" | ||
| activation_mode = "vad" # "vad", "hotkey" | ||
| enable_device_monitor = true | ||
| # device = "Device Name" # Optional: specific device (omit for default) | ||
|
|
||
| [injection] | ||
| # Core behavior | ||
| fail_fast = false # Exit immediately if all injection methods fail | ||
| allow_kdotool = false # Enable kdotool fallback (KDE/X11) | ||
| allow_enigo = false # Enable enigo fallback (input simulation) | ||
| inject_on_unknown_focus = true # Allow injection when focus is unknown | ||
| require_focus = false # Require editable focus for injection | ||
| pause_hotkey = "" # Hotkey to pause/resume injection (e.g., "Ctrl+Alt+P") | ||
| redact_logs = true # Redact text in logs for privacy | ||
|
|
||
| # Timing and latency | ||
| max_total_latency_ms = 800 # Max latency for a single injection call (ms) | ||
| per_method_timeout_ms = 250 # Timeout for each method attempt (ms) | ||
| paste_action_timeout_ms = 200 # Timeout for paste actions (ms) | ||
|
|
||
| # Cooldown/backoff | ||
| cooldown_initial_ms = 10000 # Initial cooldown after failure (ms) | ||
| cooldown_backoff_factor = 2.0 # Exponential backoff factor | ||
| cooldown_max_ms = 300000 # Max cooldown period (ms) | ||
|
|
||
| # Injection behavior | ||
| injection_mode = "auto" # "keystroke", "paste", or "auto" | ||
| keystroke_rate_cps = 20 # Keystroke rate (chars/sec) | ||
| max_burst_chars = 50 # Max chars per burst | ||
| paste_chunk_chars = 500 # Chunk size for paste ops | ||
| chunk_delay_ms = 30 # Delay between paste chunks (ms) | ||
|
|
||
| # Focus/window management | ||
| focus_cache_duration_ms = 200 # Cache duration for focus status (ms) | ||
| enable_window_detection = true # Enable window manager integration | ||
| clipboard_restore_delay_ms = 500 # Delay before restoring clipboard (ms) | ||
| discovery_timeout_ms = 1000 # Timeout for window discovery (ms) | ||
|
|
||
| # App allow/block lists | ||
| allowlist = [] # List of allowed app patterns (regex) | ||
| blocklist = [] # List of blocked app patterns (regex) | ||
|
|
||
| # Success rate tuning | ||
| min_success_rate = 0.3 # Minimum success rate before fallback | ||
| min_sample_size = 5 # Samples before trusting success rate | ||
|
|
||
| [stt] | ||
| # preferred = "vosk" # Preferred STT engine (omit for auto-select) | ||
| fallbacks = [] | ||
| require_local = false | ||
| # max_mem_mb = 1024 # Memory limit in MB (omit for no limit) | ||
| # language = "en-US" # Language code (omit for default) | ||
| failover_threshold = 5 | ||
| failover_cooldown_secs = 10 | ||
| model_ttl_secs = 300 | ||
| disable_gc = false | ||
| metrics_log_interval_secs = 30 | ||
| debug_dump_events = false | ||
| auto_extract = true |
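The three cooldown settings in `[injection]` read like the parameters of a capped exponential backoff. The code that consumes them is not part of this diff, so the following is only a sketch of one plausible reading: the delay starts at `cooldown_initial_ms`, is multiplied by `cooldown_backoff_factor` on each further failure, and never exceeds `cooldown_max_ms`. `cooldown_ms` and its `step` parameter are hypothetical.

```rust
/// Hypothetical helper: cooldown for the given backoff step, where step 0
/// is the first failure. Computes initial * factor^step, capped at `max_ms`.
fn cooldown_ms(initial_ms: u64, backoff_factor: f64, max_ms: u64, step: u32) -> u64 {
    // Clamp the exponent so the cast to i32 cannot wrap.
    let exponent = step.min(i32::MAX as u32) as i32;
    let raw = initial_ms as f64 * backoff_factor.powi(exponent);
    // Non-finite or oversized results fall back to the cap.
    if !raw.is_finite() || raw >= max_ms as f64 {
        max_ms
    } else {
        raw as u64
    }
}

fn main() {
    // Values taken from the defaults above: 10000 ms, factor 2.0, cap 300000 ms.
    for step in 0..7 {
        println!("step {} -> {} ms", step, cooldown_ms(10_000, 2.0, 300_000, step));
    }
}
```

With the default values this reading reaches the five-minute cap on the sixth consecutive failure.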
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| # Local overrides for default.toml - add config/overrides.toml to .gitignore! | ||
| # This file is not loaded by default; extend Settings::new() in crates/app/src/main.rs to include it if desired. | ||
| # Use this for local development or production settings without committing sensitive data. | ||
| # Once this file is loaded, values here override those in default.toml and can themselves be overridden by env vars. | ||
|
|
||
| # Example: Override general settings | ||
| device = "your_preferred_mic" # Maps to COLDVOX_DEVICE=your_preferred_mic | ||
| resampler_quality = "quality" # Maps to COLDVOX_RESAMPLER_QUALITY=quality | ||
| enable_device_monitor = true # Maps to COLDVOX_ENABLE_DEVICE_MONITOR=true | ||
| activation_mode = "hotkey" # Maps to COLDVOX_ACTIVATION_MODE=hotkey | ||
|
|
||
| # Example: Override injection settings | ||
| [injection] | ||
| fail_fast = true # Maps to COLDVOX_INJECTION__FAIL_FAST=true | ||
| allow_kdotool = true # Maps to COLDVOX_INJECTION__ALLOW_KDOTOOL=true | ||
| allow_enigo = false # Maps to COLDVOX_INJECTION__ALLOW_ENIGO=false | ||
| inject_on_unknown_focus = false # Maps to COLDVOX_INJECTION__INJECT_ON_UNKNOWN_FOCUS=false | ||
| require_focus = true # Maps to COLDVOX_INJECTION__REQUIRE_FOCUS=true | ||
| pause_hotkey = "Ctrl+Alt+P" # Maps to COLDVOX_INJECTION__PAUSE_HOTKEY=Ctrl+Alt+P | ||
| redact_logs = true # Maps to COLDVOX_INJECTION__REDACT_LOGS=true | ||
| max_total_latency_ms = 500 # Maps to COLDVOX_INJECTION__MAX_TOTAL_LATENCY_MS=500 | ||
| per_method_timeout_ms = 200 # Maps to COLDVOX_INJECTION__PER_METHOD_TIMEOUT_MS=200 | ||
| paste_action_timeout_ms = 150 # Maps to COLDVOX_INJECTION__PASTE_ACTION_TIMEOUT_MS=150 | ||
| cooldown_initial_ms = 5000 # Maps to COLDVOX_INJECTION__COOLDOWN_INITIAL_MS=5000 | ||
| cooldown_backoff_factor = 1.5 # Maps to COLDVOX_INJECTION__COOLDOWN_BACKOFF_FACTOR=1.5 | ||
| cooldown_max_ms = 60000 # Maps to COLDVOX_INJECTION__COOLDOWN_MAX_MS=60000 | ||
| injection_mode = "paste" # Maps to COLDVOX_INJECTION__INJECTION_MODE=paste | ||
| keystroke_rate_cps = 15 # Maps to COLDVOX_INJECTION__KEYSTROKE_RATE_CPS=15 | ||
| max_burst_chars = 30 # Maps to COLDVOX_INJECTION__MAX_BURST_CHARS=30 | ||
| paste_chunk_chars = 300 # Maps to COLDVOX_INJECTION__PASTE_CHUNK_CHARS=300 | ||
| chunk_delay_ms = 50 # Maps to COLDVOX_INJECTION__CHUNK_DELAY_MS=50 | ||
| focus_cache_duration_ms = 100 # Maps to COLDVOX_INJECTION__FOCUS_CACHE_DURATION_MS=100 | ||
| enable_window_detection = false # Maps to COLDVOX_INJECTION__ENABLE_WINDOW_DETECTION=false | ||
| clipboard_restore_delay_ms = 300 # Maps to COLDVOX_INJECTION__CLIPBOARD_RESTORE_DELAY_MS=300 | ||
| discovery_timeout_ms = 800 # Maps to COLDVOX_INJECTION__DISCOVERY_TIMEOUT_MS=800 | ||
| allowlist = ["firefox", "chrome"] # Maps to COLDVOX_INJECTION__ALLOWLIST=firefox,chrome | ||
| blocklist = ["password_manager"] # Maps to COLDVOX_INJECTION__BLOCKLIST=password_manager | ||
| min_success_rate = 0.5 # Maps to COLDVOX_INJECTION__MIN_SUCCESS_RATE=0.5 | ||
| min_sample_size = 10 # Maps to COLDVOX_INJECTION__MIN_SAMPLE_SIZE=10 | ||
|
|
||
| # Example: STT preferences (avoid committing sensitive model paths or API keys) | ||
| [stt] | ||
| preferred = "vosk" # Maps to COLDVOX_STT__PREFERRED=vosk | ||
| fallbacks = ["whisper", "mock"] # Maps to COLDVOX_STT__FALLBACKS=whisper,mock | ||
| require_local = true # Maps to COLDVOX_STT__REQUIRE_LOCAL=true | ||
| max_mem_mb = 1024 # Maps to COLDVOX_STT__MAX_MEM_MB=1024 | ||
| language = "en" # Maps to COLDVOX_STT__LANGUAGE=en | ||
| failover_threshold = 5 # Maps to COLDVOX_STT__FAILOVER_THRESHOLD=5 | ||
| failover_cooldown_secs = 60 # Maps to COLDVOX_STT__FAILOVER_COOLDOWN_SECS=60 | ||
| model_ttl_secs = 600 # Maps to COLDVOX_STT__MODEL_TTL_SECS=600 | ||
| disable_gc = false # Maps to COLDVOX_STT__DISABLE_GC=false | ||
| metrics_log_interval_secs = 30 # Maps to COLDVOX_STT__METRICS_LOG_INTERVAL_SECS=30 | ||
| debug_dump_events = true # Maps to COLDVOX_STT__DEBUG_DUMP_EVENTS=true | ||
| auto_extract = true # Maps to COLDVOX_STT__AUTO_EXTRACT=true |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| { | ||
| "preferred_plugin": null, | ||
| "fallback_plugins": [], | ||
| "require_local": false, | ||
| "max_memory_mb": null, | ||
| "required_language": null, | ||
| "failover": { | ||
| "failover_threshold": 5, | ||
| "failover_cooldown_secs": 10 | ||
| }, | ||
| "gc_policy": { | ||
| "model_ttl_secs": 300, | ||
| "enabled": true | ||
| }, | ||
| "metrics": { | ||
| "log_interval_secs": 30, | ||
| "debug_dump_events": false | ||
| }, | ||
| "auto_extract_model": true | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,14 +1,3 @@ | ||
| pub mod vad_adapter; | ||
| pub mod vad_processor; | ||
|
|
||
| // Re-export modules from coldvox-audio crate | ||
| pub use coldvox_audio::{ | ||
| capture::CaptureStats, | ||
| chunker::{AudioChunker, ChunkerConfig, ResamplerQuality}, | ||
| frame_reader::FrameReader, | ||
| ring_buffer::{AudioProducer, AudioRingBuffer}, | ||
| }; | ||
|
|
||
| pub use coldvox_audio::AudioFrame; | ||
| pub use vad_adapter::*; | ||
| pub use vad_processor::*; | ||
| pub mod wav_file_loader; |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -86,7 +86,7 @@ impl VadProcessor { | |||||
| timestamp_ms, | ||||||
| energy_db, | ||||||
| } => { | ||||||
| - info!( | ||||||
| + debug!( | ||||||
| "VAD: Speech started at {}ms (energy: {:.2} dB)", | ||||||
| timestamp_ms, energy_db | ||||||
| ); | ||||||
|
|
@@ -96,7 +96,7 @@ impl VadProcessor { | |||||
| duration_ms, | ||||||
| energy_db, | ||||||
| } => { | ||||||
| - info!( | ||||||
| + debug!( | ||||||
| "VAD: Speech ended at {}ms (duration: {}ms, energy: {:.2} dB)", | ||||||
| timestamp_ms, duration_ms, energy_db | ||||||
| ); | ||||||
|
|
@@ -126,14 +126,14 @@ impl VadProcessor { | |||||
|
|
||||||
| self.frames_processed += 1; | ||||||
|
|
||||||
| - if self.frames_processed % 100 == 0 { | ||||||
| + if self.frames_processed.is_multiple_of(100) { | ||||||
| tracing::debug!( | ||||||
| "VAD: Received {} frames, processing active", | ||||||
| self.frames_processed | ||||||
| ); | ||||||
| } | ||||||
|
|
||||||
| - if self.frames_processed % 1000 == 0 { | ||||||
| + if self.frames_processed.is_multiple_of(1000) { | ||||||

Suggested change
| - if self.frames_processed.is_multiple_of(1000) { |
| + if self.frames_processed % 1000 == 0 { |
Primitive integers do not have `is_multiple_of` in stable Rust; this will not compile. Use a modulo check instead: `self.frames_processed % 100 == 0`.
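Whichever spelling the toolchain accepts, the periodic-logging check can be written so that it does not depend on `is_multiple_of` being available. A small sketch, with `is_nth_frame` as a hypothetical helper and `u64` as an assumed type for `frames_processed`:

```rust
/// Hypothetical helper: true when `frames_processed` is an exact multiple of `n`.
/// `checked_rem` returns `None` for `n == 0`, so a zero interval never fires
/// instead of panicking the way `frames_processed % 0` would.
fn is_nth_frame(frames_processed: u64, n: u64) -> bool {
    frames_processed.checked_rem(n) == Some(0)
}

fn main() {
    // Count how often a "log every 100 frames" check fires over 300 frames.
    let fired = (1..=300u64).filter(|&frame| is_nth_frame(frame, 100)).count();
    println!("fired {} times", fired);
}
```

For a non-zero interval this behaves exactly like the `% n == 0` form suggested in the review.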