feat(speechmatics): add max_speakers parameter for speaker diarization #3524

nsepehr · 2025-09-29T06:02:47Z

Summary

This PR adds support for the max_speakers parameter to the Speechmatics STT plugin, allowing developers to limit the number of unique speakers detected during diarization.

Problem

Currently, when using the Speechmatics STT plugin with diarization enabled, there's no way to specify the maximum number of speakers. The transcription_config parameter (which is deprecated) accepts a speaker_diarization_config with max_speakers, but this value is not preserved when the plugin processes the configuration.

Solution

Added max_speakers as a direct parameter to the STT __init__ method
Updated the STTOptions dataclass to include the max_speakers field
Modified _process_config to include max_speakers in the speaker_diarization_config when sending to the Speechmatics API
Added proper handling for extracting max_speakers from the deprecated transcription_config parameter for backward compatibility
Updated documentation to explain the new parameter
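
The precedence described above (an explicit max_speakers argument wins, otherwise the value is read from the deprecated transcription_config) could be sketched roughly as follows. This is only an illustration, not the plugin's code: NOT_GIVEN, DiarizationConfig, LegacyTranscriptionConfig and resolve_max_speakers are hypothetical stand-ins invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


class _NotGiven:
    """Hypothetical sentinel meaning 'the caller did not pass this argument'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()


@dataclass
class DiarizationConfig:
    # Hypothetical stand-in, not the real Speechmatics class.
    max_speakers: Optional[int] = None


@dataclass
class LegacyTranscriptionConfig:
    # Hypothetical stand-in for the deprecated transcription_config value.
    speaker_diarization_config: Optional[DiarizationConfig] = None


def resolve_max_speakers(
    max_speakers: Union[int, _NotGiven] = NOT_GIVEN,
    legacy_config: Optional[LegacyTranscriptionConfig] = None,
) -> Optional[int]:
    """Return the explicit argument if given, else the legacy config value."""
    if not isinstance(max_speakers, _NotGiven):
        return max_speakers
    if legacy_config is not None:
        dz_cfg = legacy_config.speaker_diarization_config
        if dz_cfg is not None and dz_cfg.max_speakers is not None:
            return dz_cfg.max_speakers
    # Neither source supplied a limit.
    return None
```

With this shape, existing callers that only pass the deprecated config keep their limit, while new callers can override it directly.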

Use Case

This parameter is particularly useful for scenarios where the number of participants is known in advance, such as:

Two-person interviews or conversations
Small group discussions with a fixed number of participants
Customer service calls (agent and customer)
Educational settings with known speaker counts

Testing

Tested locally with a multi-speaker agent implementation
Verified that the parameter is correctly passed to the Speechmatics API configuration
Confirmed backward compatibility with the deprecated transcription_config parameter

Example Usage

stt = speechmatics.STT(
    language="en",
    enable_diarization=True,
    max_speakers=2,  # Limit to 2 speakers
    diarization_sensitivity=0.5,
    speaker_active_format="@[{speaker_id}]: {text}",
)

Breaking Changes

None - this is a backward-compatible addition.

CLAassistant · 2025-09-29T06:02:54Z

All committers have signed the CLA.

longcw · 2025-09-29T07:46:46Z

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

+            if not is_given(max_speakers) and hasattr(config, "speaker_diarization_config"):
+                if (
+                    config.speaker_diarization_config
+                    and "max_speakers" in config.speaker_diarization_config
+                ):
+                    max_speakers = config.speaker_diarization_config["max_speakers"]


Isn't speaker_diarization_config a dataclass? If it requires a specific version of speechmatics, we can specify it in pyproject.toml.

Suggested change

-            if not is_given(max_speakers) and hasattr(config, "speaker_diarization_config"):
-                if (
-                    config.speaker_diarization_config
-                    and "max_speakers" in config.speaker_diarization_config
-                ):
-                    max_speakers = config.speaker_diarization_config["max_speakers"]
+            if (
+                not is_given(max_speakers)
+                and (dz_cfg := config.speaker_diarization_config)
+                and dz_cfg.max_speakers is not None
+            ):
+                max_speakers = dz_cfg.max_speakers

longcw · 2025-09-29T07:50:47Z

livekit-plugins/livekit-plugins-speechmatics/livekit/plugins/speechmatics/stt.py

            if self._stt_options.diarization_sensitivity is not None:
                dz_cfg["speaker_sensitivity"] = self._stt_options.diarization_sensitivity
+            if self._stt_options.max_speakers is not None:
+                dz_cfg["max_speakers"] = self._stt_options.max_speakers


We should also refactor _process_config so that it replaces the dataclass value of transcription_config instead of assigning a dict to it.
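
One possible reading of this suggestion is sketched below: build a typed diarization config and swap it into a copy of the transcription config with dataclasses.replace, rather than assigning a plain dict. The two dataclasses and the with_diarization helper are hypothetical stand-ins written for this example; the real Speechmatics classes and their fields may differ.

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeakerDiarizationConfig:
    # Hypothetical stand-in; field names mirror the keys used in the diff above.
    speaker_sensitivity: Optional[float] = None
    max_speakers: Optional[int] = None


@dataclass
class TranscriptionConfig:
    # Hypothetical stand-in for the config object sent to the API.
    language: str = "en"
    speaker_diarization_config: Optional[SpeakerDiarizationConfig] = None


def with_diarization(
    config: TranscriptionConfig,
    diarization_sensitivity: Optional[float],
    max_speakers: Optional[int],
) -> TranscriptionConfig:
    """Return a copy of config whose diarization settings are a dataclass."""
    dz_cfg = SpeakerDiarizationConfig(
        speaker_sensitivity=diarization_sensitivity,
        max_speakers=max_speakers,
    )
    # dataclasses.replace copies the instance and overrides only the named field.
    return dataclasses.replace(config, speaker_diarization_config=dz_cfg)
```

Keeping the value typed means attribute access such as dz_cfg.max_speakers works uniformly, which is what the earlier suggested change relies on.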

- Added max_speakers parameter to STT __init__ method
- Updated STTOptions dataclass to include max_speakers field
- Modified _process_config to include max_speakers in speaker_diarization_config
- Added handling for extracting max_speakers from deprecated transcription_config
- Updated documentation to explain the new parameter
- Fixed compatibility with livekit-agents 1.2.6 (removed diarization from STTCapabilities)
- Updated minimum livekit-agents version to 1.2.6

This parameter allows limiting the number of unique speakers detected during diarization, which is useful for scenarios with a known number of participants (e.g., 2-person interviews, small group meetings with fixed participants).

nsepehr force-pushed the feat/speechmatics-max-speakers branch 2 times, most recently from a3a7974 to c395ae5 on September 29, 2025 06:18

longcw reviewed Sep 29, 2025

nsepehr force-pushed the feat/speechmatics-max-speakers branch from c395ae5 to 400f516 on September 30, 2025 05:25

nsepehr force-pushed the feat/speechmatics-max-speakers branch from 400f516 to 8400a67 on September 30, 2025 05:32
