Description
Summary
Greetings! This is Kit's personal scribe, Mr. Claude, distilling Kit's vision into written form.
As Kit eloquently put it in #68: rather than adding isolated options for various text tweaks, we should build a generalized transcription transformation pipeline—a composable sequence of functions with the signature String → String (or async variants String → async String).
This would be a single, flexible system that solves multiple feature requests at once. An extinction event for many birds with one reasonably sized chunk of rock, if you will.
The Vision
Core Concept
Transformations are chained functions that take a transcript string and return a modified string. These can be:
- Synchronous (simple text operations)
- Asynchronous (LLM calls, API requests, etc.)
- Stackable and reorderable by the user
Example Transformations
Simple text operations:
- Append/prepend space (#68)
- Add period at end
- Capitalize first letter
- Convert to uppercase/lowercase/SpongeBob case
- Trim whitespace
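As a rough illustration of how small these operations are, here is a minimal sketch of the simple text operations as plain `String -> String` functions, chained with `reduce`. All function names (`trimWhitespace`, `capitalizeFirstLetter`, `addPeriod`, `appendSpace`, `spongeBobCase`, `applyAll`) are hypothetical and not part of the existing codebase.

```swift
import Foundation

// Hypothetical helper names for illustration; none are defined in this issue.

/// Trims leading and trailing whitespace and newlines.
func trimWhitespace(_ text: String) -> String {
    text.trimmingCharacters(in: .whitespacesAndNewlines)
}

/// Uppercases only the first character and leaves the rest untouched.
func capitalizeFirstLetter(_ text: String) -> String {
    guard let first = text.first else { return text }
    return first.uppercased() + text.dropFirst()
}

/// Appends a period unless the text is empty or already ends in ".", "!" or "?".
func addPeriod(_ text: String) -> String {
    guard let last = text.last, !".!?".contains(last) else { return text }
    return text + "."
}

/// Appends a single trailing space (the request in #68).
func appendSpace(_ text: String) -> String {
    text + " "
}

/// Alternates lower and upper case across letters; non-letters pass through.
func spongeBobCase(_ text: String) -> String {
    var makeUpper = false
    var output = ""
    for character in text {
        if character.isLetter {
            output += makeUpper ? character.uppercased() : character.lowercased()
            makeUpper.toggle()
        } else {
            output.append(character)
        }
    }
    return output
}

/// Runs each step in order, feeding the output of one into the next.
func applyAll(_ steps: [(String) -> String], to text: String) -> String {
    steps.reduce(text) { partial, step in step(partial) }
}

let tidy: [(String) -> String] = [trimWhitespace, capitalizeFirstLetter, addPeriod, appendSpace]
print(applyAll(tidy, to: "  hello world "))
```

Order matters here: trimming has to run before `appendSpace`, otherwise the trailing space would be stripped again. That is one reason the steps should be reorderable by the user.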
Dictionary-based replacements (#118):
- Map of word/phrase replacements
- Domain-specific terminology corrections
- Custom abbreviation expansions
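A dictionary step could be sketched as below, assuming whole-word, case-insensitive matching is what users want. The function name `applyReplacements` is hypothetical, and the matching rules are assumptions rather than anything decided in #118.

```swift
import Foundation

/// Replaces whole-word, case-insensitive occurrences of each key with its value.
/// Hypothetical helper; the matching rules are assumptions, not settled behavior.
func applyReplacements(_ mappings: [String: String], to text: String) -> String {
    var result = text
    // Longest keys first so phrases win over the single words they contain.
    // The key itself is the tie-breaker, which keeps the order deterministic.
    let ordered = mappings.sorted { ($0.key.count, $0.key) > ($1.key.count, $1.key) }
    for (phrase, replacement) in ordered {
        // Escape the phrase so user input is matched literally, then anchor
        // it on word boundaries to avoid rewriting the inside of longer words.
        let pattern = "\\b" + NSRegularExpression.escapedPattern(for: phrase) + "\\b"
        result = result.replacingOccurrences(
            of: pattern,
            // Escape the replacement too, so "$" and "\" are inserted literally.
            with: NSRegularExpression.escapedTemplate(for: replacement),
            options: [.regularExpression, .caseInsensitive]
        )
    }
    return result
}

let corrections = ["swift ui": "SwiftUI", "btw": "by the way"]
print(applyReplacements(corrections, to: "btw I like Swift UI"))
```

Two limitations of this sketch: `\b` will not match around keys that start or end with punctuation (for example "c++"), and because replacements run one after another, the output of one mapping can be picked up by a later one.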
LLM-based transformations (#62):
- Grammar/punctuation cleanup
- Tone adjustment (formal, casual, etc.)
- Expansion (convert notes to full sentences)
- Translation
- Custom prompts
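An LLM step might look roughly like the sketch below. It deliberately does not call any real provider: the `complete` closure is a hypothetical injection point standing in for whatever client ends up being used, and `buildPrompt` and the prompt wording are assumptions. It also swaps the `apiKey` parameter from the implementation notes for that closure, so that credentials stay with the client. The protocol is the one proposed in this issue.

```swift
import Foundation

// Protocol as proposed in this issue's Technical Implementation Notes.
protocol TranscriptionTransformation {
    func transform(_ input: String) async throws -> String
}

/// Hypothetical helper: combines the user's instruction with the transcript.
func buildPrompt(instruction: String, transcript: String) -> String {
    """
    \(instruction)

    Return only the rewritten text.

    Transcript:
    \(transcript)
    """
}

struct LLMTransformation: TranscriptionTransformation {
    /// The user's custom prompt, e.g. "Fix grammar and punctuation."
    let instruction: String
    /// Hypothetical injection point: sends a prompt to some model, returns its reply.
    let complete: (String) async throws -> String

    func transform(_ input: String) async throws -> String {
        // Skip the round trip entirely for empty transcripts.
        guard !input.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty else {
            return input
        }
        let prompt = buildPrompt(instruction: instruction, transcript: input)
        let reply = try await complete(prompt)
        let cleaned = reply.trimmingCharacters(in: .whitespacesAndNewlines)
        // Fall back to the original transcript if the model returns nothing.
        return cleaned.isEmpty ? input : cleaned
    }
}
```

Injecting the completion call keeps the step testable with a stub closure and leaves the choice of local versus cloud model (#14) open.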
Context-Aware Pipelines
Different transformation stacks could be applied based on:
- Active application: Different pipelines for Slack vs. email vs. code editors
- Manual mode switching: Hotkey to toggle between transformation presets
- Explicit selection: UI to pick which pipeline to use
For example:
- Code mode: No capitalization, no periods, snake_case/camelCase conversions
- Email mode: LLM grammar cleanup + capitalize + add period
- Messaging mode: Casual tone + append space for rapid-fire dictation
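The selection logic could be as simple as the sketch below, assuming a manual choice should always beat an app rule. `PipelineSelector` and its fields are hypothetical names, and the bundle identifiers are only illustrative. On macOS the frontmost app's identifier could presumably come from `NSWorkspace.shared.frontmostApplication?.bundleIdentifier`; it is passed in as a plain string here so the logic stays testable.

```swift
import Foundation

/// Hypothetical type: decides which named preset applies right now.
struct PipelineSelector {
    /// Maps an app bundle identifier to a preset name (illustrative values only).
    var appRules: [String: String]
    /// Preset used when no rule matches.
    var defaultPreset: String
    /// Set by a hotkey or menu; takes priority over app rules when non-nil.
    var manualOverride: String? = nil

    func presetName(forFrontmostApp bundleID: String?) -> String {
        // 1. An explicit user choice always wins.
        if let override = manualOverride { return override }
        // 2. Otherwise use the rule for the active application, if any.
        if let bundleID = bundleID, let preset = appRules[bundleID] { return preset }
        // 3. Otherwise fall back to the default.
        return defaultPreset
    }
}

var selector = PipelineSelector(
    appRules: [
        "com.apple.mail": "Email",
        "com.tinyspeck.slackmacgap": "Messaging",
        "com.microsoft.VSCode": "Code",
    ],
    defaultPreset: "Quick Notes"
)
print(selector.presetName(forFrontmostApp: "com.apple.mail"))
selector.manualOverride = "Code"
print(selector.presetName(forFrontmostApp: "com.apple.mail"))
```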
UI/UX Concept
Settings would include:
- Transformation library: All available transformations (built-in + user-created)
- Pipeline builder: Drag-and-drop interface to create and reorder transformation chains
- Pipeline presets: Named configurations ("Code", "Email", "Quick Notes", etc.)
- App-specific rules: Assign pipelines to specific foreground applications
- Mode switcher: Global hotkey or menu bar control to switch active pipeline
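For the builder and presets to survive a relaunch, the pipelines need a storable representation. One possible shape, sketched under the assumption that each step can be described by a kind plus string parameters; `StepConfig`, `PipelinePreset`, and `moveStep` are hypothetical names:

```swift
import Foundation

/// Hypothetical serialized form of one step: which transformation, with what settings.
struct StepConfig: Codable, Equatable {
    var kind: String
    var parameters: [String: String]
}

/// Hypothetical named preset, e.g. "Code", "Email", "Quick Notes".
struct PipelinePreset: Codable, Equatable {
    var name: String
    var steps: [StepConfig]

    /// Moves one step to a new position, as a drag-and-drop reorder would.
    /// Out-of-range indices are ignored.
    mutating func moveStep(from source: Int, to destination: Int) {
        guard steps.indices.contains(source), steps.indices.contains(destination) else {
            return
        }
        let step = steps.remove(at: source)
        steps.insert(step, at: destination)
    }
}

var email = PipelinePreset(
    name: "Email",
    steps: [
        StepConfig(kind: "capitalize", parameters: [:]),
        StepConfig(kind: "llm", parameters: ["prompt": "Fix grammar and punctuation."]),
        StepConfig(kind: "addPeriod", parameters: [:]),
    ]
)

// Run the LLM cleanup first, then the cheap text fixes.
email.moveStep(from: 1, to: 0)

// Round-trip through JSON, e.g. for storing presets in user defaults or a file.
do {
    let data = try JSONEncoder().encode(email)
    let restored = try JSONDecoder().decode(PipelinePreset.self, from: data)
    print(restored.steps.map { $0.kind })
} catch {
    print("Preset round trip failed: \(error)")
}
```

A stringly typed `kind` keeps user-created transformations possible; an enum with associated values would be stricter but harder to extend.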
Technical Implementation Notes
    protocol TranscriptionTransformation {
        func transform(_ input: String) async throws -> String
    }

    struct TransformationPipeline {
        var transformations: [TranscriptionTransformation]

        func apply(to transcript: String) async throws -> String {
            var result = transcript
            for transformation in transformations {
                result = try await transformation.transform(result)
            }
            return result
        }
    }

Example transformations:
- AppendSpaceTransformation
- AddPeriodTransformation
- DictionaryReplacementTransformation(mappings: [String: String])
- LLMTransformation(prompt: String, apiKey: String)
- CaseTransformation(style: .upper | .lower | .spongebob)
Related Issues
This would address or partially address:
- #68 - Option to add space after recording
- #62 - Add the ability to run through LLM
- #118 - Feature request: Add dictionaries for transcription
- #14 - Add cloud models (for LLM transformations)
Benefits
✨ Flexibility: Users can create custom workflows without cluttering the UI with dozens of checkboxes
✨ Composability: Mix and match simple and complex transformations
✨ Extensibility: Easy to add new transformations over time
✨ Power user friendly: Advanced users can build sophisticated pipelines
✨ Simple by default: Ships with sensible presets; complexity is opt-in
As Kit said: "Obviously, there are much simpler ways of solving just the append-a-space problem, but this would allow me to knock out a few birds, many birds, perhaps even kill all birds on planet Earth at once."
Let's build the asteroid. 🪨