Feature: Generalized Transcription Transformation Pipeline #121

@kitlangton

Summary

Greetings! This is Kit's personal scribe, Mr. Claude, distilling Kit's vision into written form.

As Kit eloquently put it in #68: rather than adding isolated options for various text tweaks, we should build a generalized transcription transformation pipeline—a composable sequence of functions with the signature String → String (or async variants String → async String).

This would be a single, flexible system that solves multiple feature requests at once. An extinction event for many birds with one reasonably sized chunk of rock, if you will.

The Vision

Core Concept

Transformations are chained functions that take a transcript string and return a modified string. These can be:

  • Synchronous (simple text operations)
  • Asynchronous (LLM calls, API requests, etc.)
  • Stackable and reorderable by the user
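
A minimal sketch of the core idea, assuming each step is modelled as an async throwing closure. The names Transform, lift, and runPipeline are hypothetical and not part of any existing codebase. It shows a synchronous function and a suspending one sharing the same shape, so both can sit in one reorderable array.

```swift
import Foundation

// Hypothetical alias: one pipeline step, modelled as an async throwing closure.
typealias Transform = @Sendable (String) async throws -> String

// Lift a plain synchronous function into a pipeline step.
func lift(_ f: @escaping @Sendable (String) -> String) -> Transform {
    { input in f(input) }
}

// Run the steps in order, feeding each step's output into the next one.
func runPipeline(_ steps: [Transform], on transcript: String) async throws -> String {
    var result = transcript
    for step in steps {
        result = try await step(result)
    }
    return result
}

// Two synchronous steps.
let trim = lift { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
let addPeriod = lift { $0.hasSuffix(".") ? $0 : $0 + "." }

// A step that suspends briefly, standing in for an LLM call or API request.
let slowUppercase: Transform = { input in
    try await Task.sleep(nanoseconds: 1_000_000)
    return input.uppercased()
}

// Reordering this array reorders the pipeline.
let steps: [Transform] = [trim, slowUppercase, addPeriod]
```

Calling runPipeline(steps, on: transcript) from an async context applies the three steps in array order. An empty array returns the transcript unchanged.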

Example Transformations

Simple text operations:

Dictionary-based replacements (#118):

  • Map of word/phrase replacements
  • Domain-specific terminology corrections
  • Custom abbreviation expansions
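
One way the dictionary step might look, as a sketch rather than a settled design. Whole-word, case-insensitive matching and the longest-key-first ordering are assumptions made here; the issue does not specify matching rules. The protocol is repeated from the implementation notes so the block compiles on its own.

```swift
import Foundation

protocol TranscriptionTransformation {
    func transform(_ input: String) async throws -> String
}

struct DictionaryReplacementTransformation: TranscriptionTransformation {
    var mappings: [String: String]

    // Synchronous core, kept separate so it can be tested without async.
    func applyMappings(to input: String) -> String {
        // Assumption: longest keys first, so "swift ui" is handled before "ui".
        // Ties are broken alphabetically to keep the order deterministic.
        let keys = mappings.keys.sorted {
            $0.count != $1.count ? $0.count > $1.count : $0 < $1
        }
        var result = input
        for key in keys {
            guard let replacement = mappings[key] else { continue }
            // \b limits matches to whole words; this only works for keys
            // that start and end with a word character.
            let pattern = "\\b" + NSRegularExpression.escapedPattern(for: key) + "\\b"
            result = result.replacingOccurrences(
                of: pattern,
                with: NSRegularExpression.escapedTemplate(for: replacement),
                options: [.regularExpression, .caseInsensitive]
            )
        }
        return result
    }

    func transform(_ input: String) async throws -> String {
        applyMappings(to: input)
    }
}
```

Because replacements run one key at a time, the output of an earlier replacement can be matched by a later key. A real implementation would need to decide whether that cascading is wanted.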

LLM-based transformations (#62):

  • Grammar/punctuation cleanup
  • Tone adjustment (formal, casual, etc.)
  • Expansion (convert notes to full sentences)
  • Translation
  • Custom prompts
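
A hedged sketch of an LLM-backed step. To avoid inventing a provider API, the model call is injected as a closure: the complete parameter, the request layout, and the empty-reply fallback are all assumptions for illustration. The issue itself only names LLMTransformation(prompt:apiKey:). The protocol is repeated so the block stands alone.

```swift
protocol TranscriptionTransformation {
    func transform(_ input: String) async throws -> String
}

struct LLMTransformation: TranscriptionTransformation {
    // Instruction for the model, e.g. "Fix grammar and punctuation."
    var prompt: String
    // Hypothetical injection point: any async function that maps a request
    // string to a completion. A real backend would make the network call here.
    var complete: @Sendable (String) async throws -> String

    func transform(_ input: String) async throws -> String {
        let request = prompt + "\n\nTranscript:\n" + input
        let response = try await complete(request)
        // Assumption: an empty reply means "no change", not "erase the transcript".
        return response.isEmpty ? input : response
    }
}

// Stand-in backend for local testing. It makes no network call: it takes the
// last line of the request, capitalizes its first letter, and adds a period.
let fakeBackend: @Sendable (String) async throws -> String = { request in
    let transcript = request.split(separator: "\n").last.map(String.init) ?? ""
    return transcript.prefix(1).uppercased() + String(transcript.dropFirst()) + "."
}
```

Injecting the backend keeps tone adjustment, expansion, translation, and custom prompts down to a choice of prompt string, and lets the pipeline be exercised without credentials.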

Context-Aware Pipelines

Different transformation stacks could be applied based on:

  1. Active application: Different pipelines for Slack vs. email vs. code editors
  2. Manual mode switching: Hotkey to toggle between transformation presets
  3. Explicit selection: UI to pick which pipeline to use

For example:

  • Code mode: No capitalization, no periods, snake_case/camelCase conversions
  • Email mode: LLM grammar cleanup + capitalize + add period
  • Messaging mode: Casual tone + append space for rapid-fire dictation
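
Pipeline selection could be sketched as a small lookup. The type names and the precedence order (manual choice first, then app rule, then a fallback) are assumptions; the issue lists the three triggers without ranking them. On macOS the bundle identifier would likely come from NSWorkspace.shared.frontmostApplication, which is left out here so the sketch has no AppKit dependency.

```swift
// Hypothetical preset names, taken from the modes listed above.
enum PipelinePreset: String {
    case code, email, messaging, plain
}

struct PipelineSelector {
    // Bundle identifier -> preset.
    var appRules: [String: PipelinePreset] = [:]
    // Set by a hotkey or an explicit UI pick; nil means "decide automatically".
    var manualOverride: PipelinePreset? = nil
    // Used when nothing else applies.
    var fallback: PipelinePreset = .plain

    // Assumed precedence: manual choice, then app rule, then fallback.
    func preset(forFrontmostApp bundleID: String?) -> PipelinePreset {
        if let override = manualOverride { return override }
        if let id = bundleID, let rule = appRules[id] { return rule }
        return fallback
    }
}

// Illustrative rules only; the identifiers are examples, not a shipped list.
let exampleSelector = PipelineSelector(appRules: [
    "com.apple.mail": .email,
    "com.example.chat": .messaging,
])
```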

UI/UX Concept

Settings would include:

  • Transformation library: All available transformations (built-in + user-created)
  • Pipeline builder: Drag-and-drop interface to create and reorder transformation chains
  • Pipeline presets: Named configurations ("Code", "Email", "Quick Notes", etc.)
  • App-specific rules: Assign pipelines to specific foreground applications
  • Mode switcher: Global hotkey or menu bar control to switch active pipeline
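
For the settings side, named presets need a form that can be saved and reordered. The following is one possible shape, assuming presets are stored as JSON; StepConfig and PipelinePresetConfig are hypothetical names, and a real app would still need to map each case to a concrete transformation.

```swift
import Foundation

// Hypothetical serializable description of one step in a preset.
enum StepConfig: Codable, Equatable {
    case appendSpace
    case addPeriod
    case dictionary(mappings: [String: String])
    case llm(prompt: String)
}

struct PipelinePresetConfig: Codable, Equatable {
    var name: String
    var steps: [StepConfig]

    // What a drag-and-drop reorder would call: move one step to a new index.
    // Out-of-range indices are ignored.
    mutating func moveStep(from source: Int, to destination: Int) {
        guard steps.indices.contains(source),
              steps.indices.contains(destination) else { return }
        let step = steps.remove(at: source)
        steps.insert(step, at: destination)
    }
}
```

Encoding a preset with JSONEncoder and reading it back with JSONDecoder round-trips the step order, which is the only state the pipeline builder needs to preserve.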

Technical Implementation Notes

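// A single step in the pipeline: takes the current transcript and returns the modified one.
// Declared async throws so LLM calls and API requests fit the same shape as simple text operations.
protocol TranscriptionTransformation {
    func transform(_ input: String) async throws -> String
}

// An ordered chain of transformations; array order is execution order.
struct TransformationPipeline {
    var transformations: [any TranscriptionTransformation]

    // Feeds the transcript through each step in turn.
    // The first step that throws aborts the rest of the chain.
    func apply(to transcript: String) async throws -> String {
        var result = transcript
        for transformation in transformations {
            result = try await transformation.transform(result)
        }
        return result
    }
}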

Example transformations:

  • AppendSpaceTransformation
  • AddPeriodTransformation
  • DictionaryReplacementTransformation(mappings: [String: String])
  • LLMTransformation(prompt: String, apiKey: String)
  • CaseTransformation(style: .upper | .lower | .spongebob)
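
Three of the simpler entries might be implemented roughly as follows. The edge-case behaviour (skipping a period after existing terminal punctuation, not doubling a trailing space, starting the spongebob style in lowercase) is assumed here, not specified in the issue. The protocol is repeated so the block compiles on its own.

```swift
protocol TranscriptionTransformation {
    func transform(_ input: String) async throws -> String
}

struct AppendSpaceTransformation: TranscriptionTransformation {
    func transform(_ input: String) async throws -> String {
        // Assumption: do not double up an existing trailing space.
        input.hasSuffix(" ") ? input : input + " "
    }
}

struct AddPeriodTransformation: TranscriptionTransformation {
    func transform(_ input: String) async throws -> String {
        // Assumption: leave empty text and text that already ends in . ! ? alone.
        guard let last = input.last, !".!?".contains(last) else { return input }
        return input + "."
    }
}

struct CaseTransformation: TranscriptionTransformation {
    enum Style { case upper, lower, spongebob }
    var style: Style

    func transform(_ input: String) async throws -> String {
        switch style {
        case .upper:
            return input.uppercased()
        case .lower:
            return input.lowercased()
        case .spongebob:
            // Alternate case on letters only, starting with lowercase.
            var makeUpper = false
            var out = ""
            for ch in input {
                if ch.isLetter {
                    out += makeUpper ? ch.uppercased() : ch.lowercased()
                    makeUpper.toggle()
                } else {
                    out.append(ch)
                }
            }
            return out
        }
    }
}
```

Order matters when these are stacked: running AddPeriodTransformation before AppendSpaceTransformation yields "hello. ", while the reverse order would place the period after the space.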

Related Issues

This would address or partially address:

Benefits

  • Flexibility: Users can create custom workflows without cluttering the UI with dozens of checkboxes
  • Composability: Mix and match simple and complex transformations
  • Extensibility: Easy to add new transformations over time
  • Power user friendly: Advanced users can build sophisticated pipelines
  • Simple by default: Ships with sensible presets; complexity is opt-in


As Kit said: "Obviously, there are much simpler ways of solving just the append-a-space problem, but this would allow me to knock out a few birds, many birds, perhaps even kill all birds on planet Earth at once."

Let's build the asteroid. 🪨

Labels: enhancement (New feature or request)
