36 changes: 36 additions & 0 deletions docs/CONFIGURATION.md
@@ -1859,6 +1859,42 @@ If Whisper transcribes "vox type" (or "Vox Type"), it will be replaced with "vox
"omar key" = "Omarchy"
```

### smart_auto_submit

**Type:** Boolean
**Default:** `false`
**Required:** No

When `true`, Voxtype checks whether each transcription ends with the word "submit". If it does, Voxtype strips the word from the output and presses Enter, as if `auto_submit` had fired, but triggered by voice rather than being permanently on. Trailing punctuation is tolerated, so "submit." (e.g., from spoken punctuation) also triggers. The word must stand alone at the end of the transcription: "submitted", or "submit" in the middle of a sentence, does not trigger.
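The trailing-word check can be sketched in Rust roughly as follows. This is a minimal illustration, not Voxtype's actual implementation: the function name `strip_trailing_submit`, the case-insensitive comparison, and the exact set of trailing characters ignored are all assumptions.

```rust
/// Hypothetical helper (not part of Voxtype's API).
/// Returns the text to type and whether Enter should be pressed afterwards.
fn strip_trailing_submit(text: &str) -> (String, bool) {
    // Ignore trailing whitespace and ASCII punctuation, e.g. the "." in "submit."
    let trimmed =
        text.trim_end_matches(|c: char| c.is_whitespace() || c.is_ascii_punctuation());

    // Split the remaining text into everything before the last word, and the last word.
    let (head, last) = match trimmed.rfind(char::is_whitespace) {
        Some(idx) => (&trimmed[..idx], &trimmed[idx..]),
        None => ("", trimmed),
    };

    // Assumption: the comparison ignores ASCII case ("Submit" also counts).
    if last.trim().eq_ignore_ascii_case("submit") {
        (head.trim_end().to_string(), true)
    } else {
        // No trigger word at the end: leave the transcription untouched.
        (text.to_string(), false)
    }
}
```

Because only the final whitespace-delimited word is compared, "I submitted the form" and "please submit this form now" pass through unchanged, which matches the negative cases in the smoke tests.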

**Example:**

```toml
[text]
smart_auto_submit = true
```

Saying "send a reply to Alice submit" types "send a reply to Alice" and presses Enter.

**Per-recording override:**

```bash
# Enable for just this recording (even if config has it off)
voxtype record start --smart-auto-submit
voxtype record toggle --smart-auto-submit

# Disable for just this recording (even if config has it on)
voxtype record start --no-smart-auto-submit
```
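One way to picture the override semantics: the per-recording flag is a tri-state (`Some(true)`, `Some(false)`, or `None` when no flag is given), and a present value wins over the configured one. The sketch below assumes that resolution rule; the function name `effective_smart_auto_submit` is hypothetical, and how the daemon actually combines the two values is not shown in this change.

```rust
/// Hypothetical resolution step (assumed, not taken from Voxtype's source):
/// a per-recording override, when present, takes precedence over the config value.
fn effective_smart_auto_submit(cli_override: Option<bool>, config_value: bool) -> bool {
    // None means neither --smart-auto-submit nor --no-smart-auto-submit was passed,
    // so the configured value applies.
    cli_override.unwrap_or(config_value)
}
```

With this rule, `--smart-auto-submit` enables the feature even when the config has `smart_auto_submit = false`, and `--no-smart-auto-submit` disables it even when the config has it on.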

**Environment variable:**

```bash
VOXTYPE_SMART_AUTO_SUBMIT=true voxtype
```

**Note:** `smart_auto_submit` is conditional: it only fires when you say "submit". The existing `auto_submit` option presses Enter after every transcription. Use `smart_auto_submit` when you want to choose per dictation, and `auto_submit` when you always want Enter pressed.

---

## [vad]
78 changes: 78 additions & 0 deletions docs/SMOKE_TESTS.md
@@ -56,6 +56,84 @@ sleep 2
voxtype record stop
```

## Smart Auto-Submit

Tests the `smart_auto_submit` feature: saying "submit" at the end of dictation
strips the word and presses Enter.

### Config-based

```bash
# 1. Enable in config.toml:
# [text]
# smart_auto_submit = true

# 2. Restart daemon
systemctl --user restart voxtype

# 3. Record and say "hello world submit" (or "hello world submit.")
voxtype record start
sleep 4
voxtype record stop

# 4. Expected: "hello world" is typed and Enter is pressed
#
# To verify via logs, the daemon must be running with debug logging (-v):
# journalctl --user -u voxtype --since "30 seconds ago" | grep "Smart auto-submit triggered"
# At default log level the trigger fires silently - verify by observing Enter being pressed.
```

### CLI override (per-recording)

```bash
# Force on for this recording (even if config has smart_auto_submit = false)
# Say "hello world submit" while recording
voxtype record start --smart-auto-submit
sleep 4
voxtype record stop
# Expected: "hello world" is typed and Enter is pressed

# Force off for this recording (even if config has smart_auto_submit = true)
# Say "hello world submit" while recording
voxtype record start --no-smart-auto-submit
sleep 4
voxtype record stop
# Expected: "submit" remains in the output and Enter is not pressed
```

### Environment variable

```bash
# Stop the managed daemon first to avoid running two daemons simultaneously
systemctl --user stop voxtype

# Start a temporary daemon with the env var
VOXTYPE_SMART_AUTO_SUBMIT=true voxtype daemon &
DAEMON_PID=$!
sleep 2

# Say "hello world submit" while recording
voxtype record start && sleep 4 && voxtype record stop
# Expected: "hello world" is typed and Enter is pressed

# Clean up: stop the temp daemon and restart the managed one
kill "$DAEMON_PID"
systemctl --user start voxtype
```

### Negative cases

```bash
# "submitted" (partial word) should NOT trigger
# Say "I submitted the form" while recording
voxtype record start --smart-auto-submit
sleep 4
voxtype record stop
# Expected: the full text, including "submitted", is typed and Enter is not pressed

# "submit" in the middle should NOT trigger
# Say "please submit this form now" while recording
voxtype record start --smart-auto-submit
sleep 4
voxtype record stop
# Expected: the full text is typed and Enter is not pressed
```

## File Output

Tests the file output mode for writing transcriptions to files instead of typing.
27 changes: 27 additions & 0 deletions docs/USER_MANUAL.md
@@ -1452,6 +1452,33 @@ auto_submit = true # Press Enter after transcription

Useful for chat applications or command lines where you want to submit immediately after dictating.

**Smart auto-submit (say "submit" to press Enter):**

```toml
[text]
smart_auto_submit = true
```

With this enabled, ending your dictation with the word "submit" strips that word from the output and presses Enter. Unlike `auto_submit` (which always presses Enter), this only fires when you choose to say it.

```
# You say: "reply to Alice and cc Bob submit"
# Voxtype types: "reply to Alice and cc Bob" [then presses Enter]
```

Per-recording override (useful with compositor keybindings):

```bash
voxtype record start --smart-auto-submit # force on for this recording
voxtype record start --no-smart-auto-submit # force off for this recording
```

Or via environment variable for the whole session:

```bash
VOXTYPE_SMART_AUTO_SUBMIT=true voxtype
```

**Shift+Enter for newlines:**

```toml
135 changes: 135 additions & 0 deletions src/cli.rs
@@ -202,6 +202,14 @@ pub struct Cli {
#[arg(long, conflicts_with = "shift_enter_newlines", help_heading = "Output")]
pub no_shift_enter_newlines: bool,

/// Enable smart auto-submit (say "submit" to press Enter)
#[arg(long, help_heading = "Output")]
pub smart_auto_submit: bool,

/// Disable smart auto-submit (overrides config)
#[arg(long, conflicts_with = "smart_auto_submit", help_heading = "Output")]
pub no_smart_auto_submit: bool,

/// Delay between typed characters in milliseconds (0 = fastest)
#[arg(long, value_name = "MS", help_heading = "Output")]
pub type_delay: Option<u32>,
@@ -425,6 +433,14 @@ pub enum RecordAction {
/// Disable Shift+Enter newlines for this transcription (overrides config)
#[arg(long, conflicts_with = "shift_enter_newlines")]
no_shift_enter_newlines: bool,

/// Enable smart auto-submit for this recording (say "submit" to press Enter)
#[arg(long, conflicts_with = "no_smart_auto_submit")]
smart_auto_submit: bool,

/// Disable smart auto-submit for this recording (overrides config)
#[arg(long, conflicts_with = "smart_auto_submit")]
no_smart_auto_submit: bool,
},
/// Stop recording and transcribe (send SIGUSR2 to daemon)
Stop {
@@ -483,6 +499,14 @@ pub enum RecordAction {
/// Disable Shift+Enter newlines for this transcription (overrides config)
#[arg(long, conflicts_with = "shift_enter_newlines")]
no_shift_enter_newlines: bool,

/// Enable smart auto-submit for this recording (say "submit" to press Enter)
#[arg(long, conflicts_with = "no_smart_auto_submit")]
smart_auto_submit: bool,

/// Disable smart auto-submit for this recording (overrides config)
#[arg(long, conflicts_with = "smart_auto_submit")]
no_smart_auto_submit: bool,
},
/// Cancel current recording or transcription (discard without output)
Cancel,
@@ -704,6 +728,32 @@ impl RecordAction {
None
}
}

    /// Get the smart auto-submit override from --smart-auto-submit / --no-smart-auto-submit flags
    /// Returns Some(true) to enable, Some(false) to disable, None if not specified
    pub fn smart_auto_submit_override(&self) -> Option<bool> {
        let (enable, disable) = match self {
            RecordAction::Start {
                smart_auto_submit,
                no_smart_auto_submit,
                ..
            } => (*smart_auto_submit, *no_smart_auto_submit),
            RecordAction::Toggle {
                smart_auto_submit,
                no_smart_auto_submit,
                ..
            } => (*smart_auto_submit, *no_smart_auto_submit),
            RecordAction::Stop { .. } | RecordAction::Cancel => return None,
        };

        if enable {
            Some(true)
        } else if disable {
            Some(false)
        } else {
            None
        }
    }
}

#[derive(Subcommand)]
@@ -1715,4 +1765,89 @@ mod tests {
_ => panic!("Expected Record command"),
}
}

    // =========================================================================
    // Smart auto-submit flag tests
    // =========================================================================

    #[test]
    fn test_record_start_smart_auto_submit_enable() {
        let cli = Cli::parse_from(["voxtype", "record", "start", "--smart-auto-submit"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), Some(true));
            }
            _ => panic!("Expected Record command"),
        }
    }

    #[test]
    fn test_record_start_no_smart_auto_submit() {
        let cli = Cli::parse_from(["voxtype", "record", "start", "--no-smart-auto-submit"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), Some(false));
            }
            _ => panic!("Expected Record command"),
        }
    }

    #[test]
    fn test_record_start_smart_auto_submit_mutual_exclusion() {
        let result = Cli::try_parse_from([
            "voxtype",
            "record",
            "start",
            "--smart-auto-submit",
            "--no-smart-auto-submit",
        ]);
        assert!(
            result.is_err(),
            "Should not allow both flags simultaneously"
        );
    }

    #[test]
    fn test_record_start_smart_auto_submit_no_flags_returns_none() {
        let cli = Cli::parse_from(["voxtype", "record", "start"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), None);
            }
            _ => panic!("Expected Record command"),
        }
    }

    #[test]
    fn test_record_toggle_smart_auto_submit_enable() {
        let cli = Cli::parse_from(["voxtype", "record", "toggle", "--smart-auto-submit"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), Some(true));
            }
            _ => panic!("Expected Record command"),
        }
    }

    #[test]
    fn test_record_toggle_no_smart_auto_submit() {
        let cli = Cli::parse_from(["voxtype", "record", "toggle", "--no-smart-auto-submit"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), Some(false));
            }
            _ => panic!("Expected Record command"),
        }
    }

    #[test]
    fn test_record_stop_has_no_smart_auto_submit_override() {
        let cli = Cli::parse_from(["voxtype", "record", "stop"]);
        match cli.command {
            Some(Commands::Record { action }) => {
                assert_eq!(action.smart_auto_submit_override(), None);
            }
            _ => panic!("Expected Record command"),
        }
    }
}
12 changes: 12 additions & 0 deletions src/config.rs
@@ -226,6 +226,10 @@ on_transcription = true
#
# Custom word replacements (case-insensitive)
# replacements = { "vox type" = "voxtype" }
#
# Smart auto-submit: say "submit" at the end of dictation to press Enter.
# The word "submit" is stripped from the output text and Enter is pressed.
# smart_auto_submit = false

# [vad]
# Voice Activity Detection - filters silence-only recordings
@@ -1215,6 +1219,11 @@ pub struct TextConfig {
/// Example: { "vox type" = "voxtype" }
#[serde(default)]
pub replacements: HashMap<String, String>,

/// Smart auto-submit: say "submit" at the end of dictation to press Enter.
/// The word "submit" is stripped from the output and Enter is pressed.
#[serde(default)]
pub smart_auto_submit: bool,
}

/// Meeting transcription configuration
@@ -2083,6 +2092,9 @@ pub fn load_config(path: Option<&Path>) -> Result<Config, VoxtypeError> {
config.output.restore_clipboard_delay_ms = ms;
}
}
if let Ok(val) = std::env::var("VOXTYPE_SMART_AUTO_SUBMIT") {
config.text.smart_auto_submit = parse_bool_env(&val);
}

Ok(config)
}
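The hunk above applies the environment variable after the config file is loaded, so a set `VOXTYPE_SMART_AUTO_SUBMIT` replaces whatever the file specified, including replacing `true` with `false`. The sketch below illustrates that pattern in isolation. The body of `parse_bool_env` is not shown in this diff, so `parse_bool_env_sketch` is a hypothetical stand-in, and the set of accepted spellings is an assumption.

```rust
/// Hypothetical stand-in for `parse_bool_env`. The accepted spellings are an
/// assumption; the real helper is not shown in this diff.
fn parse_bool_env_sketch(val: &str) -> bool {
    matches!(
        val.trim().to_ascii_lowercase().as_str(),
        "1" | "true" | "yes" | "on"
    )
}

/// Mirrors the `if let Ok(val) = std::env::var(...)` pattern: a set variable
/// replaces the configured value, an unset one leaves it untouched.
/// `env_value` stands in for the result of reading the variable.
fn apply_env_override(configured: bool, env_value: Option<&str>) -> bool {
    match env_value {
        Some(val) => parse_bool_env_sketch(val),
        None => configured,
    }
}
```

Passing the variable's value as a parameter, rather than reading the environment inside the function, keeps the sketch deterministic and easy to test.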