Improve App ID guidance, hotkey visibility, and recording startup latency / 优化 App ID 指引、快捷键入口与录音首响延迟 #1

Open
jingchang0623-crypto wants to merge 1 commit into main from codex/appid-hotkey-latency-improvements
@jingchang0623-crypto
Owner

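Description (translated from Chinese)

Background

This PR addresses three user-experience issues:

  1. The Volcano ASR credential field name is unintuitive (App Key is easily confused with API Key)
  2. The hotkey settings entry point is not prominent enough (it appears only on the Modes page)
  3. The delay between pressing the hotkey and being able to speak is too long

Main changes

1) Volcano ASR field label improvements

  • Changed the displayed name of the Volcano credential field from App Key to App ID
  • Added a hint in the ASR settings: the App ID must be obtained from the legacy Doubao Speech console (it is not the API Key)

2) New hotkey settings module on the General page

  • Added a standalone “Hotkey Settings” module to General settings
  • Supports recording a hotkey directly for each mode
  • Supports switching the trigger style: hold to record / press to toggle
  • Detects hotkey conflicts and blocks saving when one is found
  • Keeps a link to the advanced settings on the Modes page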
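
The two trigger styles listed above can be sketched as a small state machine. This is an illustration only: `RecordingTrigger`, `TriggerStyle`, and `KeyEvent` are hypothetical names, not types taken from this PR, and the sketch assumes key-repeat events have already been filtered out before they reach it.

```swift
// Hypothetical types for illustration; they are not from the PR's source.
enum TriggerStyle { case holdToRecord, toggle }
enum KeyEvent { case down, up }

struct RecordingTrigger {
    let style: TriggerStyle
    private(set) var isRecording = false

    init(style: TriggerStyle) {
        self.style = style
    }

    /// Applies one hotkey event and returns whether recording is active afterwards.
    /// Assumes auto-repeat key-down events are dropped by the caller.
    @discardableResult
    mutating func handle(_ event: KeyEvent) -> Bool {
        switch (style, event) {
        case (.holdToRecord, .down):
            isRecording = true       // record only while the key is held
        case (.holdToRecord, .up):
            isRecording = false
        case (.toggle, .down):
            isRecording.toggle()     // each press flips the state
        case (.toggle, .up):
            break                    // releasing the key changes nothing
        }
        return isRecording
    }
}
```

In hold mode the key-up event stops recording, while in toggle mode it is ignored and only the next press stops it.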

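3) Recording startup latency optimization (core)

  • Changed the startup sequence from “connect to ASR, then start recording” to “start recording, then connect to ASR”
  • Buffers the leading audio chunks while the ASR connection is being established
  • Sends the buffered chunks once the ASR connection is ready, reducing the risk of losing the first sentence
  • Adds key logs for observing connection time and buffer flushing

Testing and validation

  • swift build -c release passes
  • Added RecognitionSession test cases for the buffer-and-flush logic (buffering before connection, flushing after connection, buffer cap enforcement)

Scope of impact

  • Volcano ASR configuration UI text
  • General settings page structure and interaction
  • Recording startup sequence and ASR send strategy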

English Summary

Context

This PR addresses three UX issues:

  1. Volcano ASR credential naming is confusing (App Key can be mistaken for API Key)
  2. Hotkey configuration is not visible enough (previously only in Modes)
  3. There is a noticeable delay between pressing the hotkey and being able to speak

What changed

1) Volcano ASR label clarity

  • Renamed Volcano credential label from App Key to App ID
  • Added an in-UI hint that the App ID must be obtained from the legacy Doubao Speech console (it is not the API Key)

2) Dedicated hotkey section in General settings

  • Added a standalone Hotkey Settings module in General
  • Supports per-mode hotkey recording
  • Supports trigger style switching: Hold to record / Toggle
  • Adds duplicate-hotkey conflict detection before save
  • Keeps an advanced navigation entry to the Modes page
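
As a rough illustration of duplicate-hotkey detection before save, the check can be done by grouping bindings by key combination. The types below (`Hotkey`, `ModeBinding`) and the functions `findHotkeyConflicts` and `canSave` are hypothetical and are not taken from this PR; the actual implementation may represent hotkeys differently.

```swift
// Hypothetical model for illustration; field names are assumptions.
struct Hotkey: Hashable {
    let keyCode: UInt16
    let modifiers: UInt   // raw modifier flag bits
}

struct ModeBinding {
    let modeName: String
    let hotkey: Hotkey
}

/// Returns groups of mode names that share the same hotkey.
/// Each group is sorted, and groups are ordered by their first name.
func findHotkeyConflicts(in bindings: [ModeBinding]) -> [[String]] {
    var modesByHotkey: [Hotkey: [String]] = [:]
    for binding in bindings {
        modesByHotkey[binding.hotkey, default: []].append(binding.modeName)
    }
    return modesByHotkey.values
        .filter { $0.count > 1 }        // keep only hotkeys used more than once
        .map { $0.sorted() }
        .sorted { $0[0] < $1[0] }
}

/// Saving is allowed only when no two modes share a hotkey.
func canSave(_ bindings: [ModeBinding]) -> Bool {
    findHotkeyConflicts(in: bindings).isEmpty
}
```

Returning the conflicting groups, rather than a plain Boolean, lets a settings UI name the modes that collide.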

3) Startup latency optimization (core)

  • Reordered the startup flow from “connect to ASR, then start recording” to “start recording, then connect to ASR”
  • Buffered early audio chunks while ASR is connecting
  • Flushed buffered chunks after connection to reduce the risk of losing early speech
  • Added logs for connect latency and buffer flush visibility
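
The buffer-then-flush behavior described above might look roughly like the following. This is a minimal sketch, not the PR's RecognitionSession code: `PendingAudioBuffer`, `maxChunks`, and `connectionDidOpen` are hypothetical names, and dropping the oldest chunk when the cap is exceeded is an assumed policy (the PR mentions a buffer cap but not how it is enforced).

```swift
import Foundation

/// Hypothetical sketch of a pre-connection audio buffer.
/// Not thread-safe: a real session would need to serialize access,
/// for example on a dedicated queue or inside an actor.
final class PendingAudioBuffer {
    private var chunks: [Data] = []
    private let maxChunks: Int
    private let send: (Data) -> Void
    private(set) var isConnected = false

    init(maxChunks: Int, send: @escaping (Data) -> Void) {
        self.maxChunks = maxChunks
        self.send = send
    }

    /// Called for every audio chunk produced by the recorder.
    func handle(chunk: Data) {
        if isConnected {
            send(chunk)              // connection is up: forward immediately
            return
        }
        chunks.append(chunk)         // still connecting: hold the chunk
        if chunks.count > maxChunks {
            // Assumed policy: drop the oldest audio once the cap is exceeded.
            chunks.removeFirst(chunks.count - maxChunks)
        }
    }

    /// Called once the ASR connection is ready; flushes buffered chunks in order.
    func connectionDidOpen() {
        isConnected = true
        let pending = chunks
        chunks.removeAll()
        pending.forEach(send)
    }
}
```

Because recording starts before the connection is ready, chunks captured in that window are held and replayed in order once `connectionDidOpen()` is called.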

Validation

  • swift build -c release passes
  • Added RecognitionSession unit tests covering buffering before connection, flushing after connection, and the buffer cap

Scope

  • Volcano ASR config UI text
  • General settings structure/UX
  • Recording startup sequencing and ASR audio dispatch
