
Commit 963b6cf

feat(pyannoteAI): add wrapper around pyannoteAI SDK
1 parent: ce49576

File tree: 7 files changed, +263 −67 lines

CHANGELOG.md (+12 −3)
@@ -4,17 +4,25 @@

### TL;DR

- #### Quality of life improvements
+ #### Quality-of-Life improvements

Models can now be stored alongside their pipelines in the same repository, streamlining the gating mechanism (see the sketch right after this list):
- accept `pyannote/speaker-diarization-x.x` pipeline user agreement
- ~~accept `pyannote/segmentation-3.0` model user agreement~~
- ~~accept `pyannote/wespeaker-voxceleb-resnet34-LM` model user agreement~~
- load pipeline with `Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", token=True)`
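
Editor's sketch (not part of the commit): with the models bundled in the pipeline repository, a single user agreement covers everything, and `token=True` is assumed to reuse a Hugging Face token saved locally (e.g. via `huggingface-cli login`).

```python
from pyannote.audio import Pipeline

# Only the pipeline repository is gated now; token=True is assumed to
# pick up the Hugging Face token cached by `huggingface-cli login`.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", token=True)
diarization = pipeline("audio.wav")
```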

- #### Improve speech separation quality
+ #### [pyannoteAI](https://www.pyannote.ai) premium speaker diarization

- Clipping and speaker/source alignment issues in speech separation pipeline have been fixed.
+ Change one line of code to use [pyannoteAI](https://docs.pyannote.ai) and enjoy **more accurate speaker diarization**.

```diff
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained(
-     "pyannote/speaker-diarization-3.1", token="huggingface-access-token")
+     "pyannoteAI/speaker-diarization-precision", token="pyannoteAI-api-key")
diarization = pipeline("/path/to/conversation.wav")
```

### Breaking changes

@@ -31,6 +39,7 @@ Clipping and speaker/source alignment issues in speech separation pipeline have

### New features

+ - feat(pyannoteAI): add wrapper around pyannoteAI SDK
- improve(hub): add support for pipeline repos that also include underlying models
- feat(clustering): add support for `k-means` clustering
- feat(model): add `wav2vec_frozen` option to freeze/unfreeze `wav2vec` in `SSeRiouSS` architecture (sketched below)
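
Editor's sketch of the `wav2vec_frozen` option named above; the constructor-argument placement and the `wav2vec="WAVLM_BASE"` default are assumptions, not shown in this diff:

```python
from pyannote.audio.models.segmentation import SSeRiouSS

# Assumed usage: keep the self-supervised wav2vec/WavLM feature extractor
# frozen so that only the downstream segmentation layers are trained.
model = SSeRiouSS(wav2vec="WAVLM_BASE", wav2vec_frozen=True)
```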

README.md (+63 −34)
@@ -1,36 +1,46 @@
Using the `pyannote.audio` open-source toolkit in production?
Consider switching to [pyannoteAI](https://www.pyannote.ai) for better and faster options.

- # `pyannote.audio` speaker diarization toolkit
+ # `pyannote` speaker diarization toolkit

`pyannote.audio` is an open-source toolkit written in Python for speaker diarization. Based on the [PyTorch](https://pytorch.org) machine learning framework, it comes with state-of-the-art [pretrained models and pipelines](https://hf.co/pyannote) that can be further fine-tuned to your own data for even better performance.

<p align="center">
 <a href="https://www.youtube.com/watch?v=37R_R82lfwA"><img src="https://img.youtube.com/vi/37R_R82lfwA/0.jpg"></a>
</p>

12-
## TL;DR
12+
13+
## Highlights
14+
15+
- :exploding_head: state-of-the-art performance (see [Benchmark](#benchmark))
16+
- :hugs: pretrained [pipelines](https://hf.co/models?other=pyannote-audio-pipeline) (and [models](https://hf.co/models?other=pyannote-audio-model)) on [:hugs: model hub](https://huggingface.co/pyannote)
17+
- :rocket: built-in support for [pyannoteAI](https://pyannote.ai) premium speaker diarization
18+
- :snake: Python-first API
19+
- :zap: multi-GPU training with [pytorch-lightning](https://pytorchlightning.ai/)
20+
21+
## Open-source speaker diarization pipeline
1322

1. Install [`pyannote.audio`](https://github.com/pyannote/pyannote-audio) with `pip install pyannote.audio`
2. Accept [`pyannote/segmentation-3.0`](https://hf.co/pyannote/segmentation-3.0) user conditions
3. Accept [`pyannote/speaker-diarization-3.1`](https://hf.co/pyannote/speaker-diarization-3.1) user conditions
- 4. Create access token at [`hf.co/settings/tokens`](https://hf.co/settings/tokens).
+ 4. Create a Hugging Face access token at [`hf.co/settings/tokens`](https://hf.co/settings/tokens).

```diff
+ import torch
from pyannote.audio import Pipeline
from pyannote.audio.pipelines.utils.hook import ProgressHook

+ # Open-source pyannote speaker diarization pipeline
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
-     token="HUGGINGFACE_ACCESS_TOKEN_GOES_HERE")
+     token="HUGGINGFACE_ACCESS_TOKEN")

# send pipeline to GPU (when available)
- import torch
pipeline.to(torch.device("cuda"))

# apply pretrained pipeline (with optional progress hook)
with ProgressHook() as hook:
-     diarization = pipeline("audio.wav", hook=hook)
+     diarization = pipeline("audio.wav", hook=hook)  # runs locally

# print the result
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s {speaker}")
# start=1.8s stop=3.9s speaker_1
# start=4.2s stop=5.7s speaker_0
# ...
```

- ## Highlights
+ ## Premium pyannoteAI speaker diarization pipeline
+
+ 1. Install [`pyannote.audio`](https://github.com/pyannote/pyannote-audio) with `pip install pyannote.audio`
+ 2. Create a pyannoteAI API key at [`dashboard.pyannote.ai`](https://dashboard.pyannote.ai)

```python
from pyannote.audio import Pipeline

# Premium pyannoteAI speaker diarization service
pipeline = Pipeline.from_pretrained(
    "pyannoteAI/speaker-diarization-precision", token="PYANNOTEAI_API_KEY")

diarization = pipeline("audio.wav")  # runs on pyannoteAI servers

# print the result
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s {speaker}")
# start=0.2s stop=1.6s SPEAKER_00
# start=1.8s stop=4.0s SPEAKER_01
# start=4.2s stop=5.6s SPEAKER_00
# ...
```

Visit [`docs.pyannote.ai`](https://docs.pyannote.ai) to learn about other pyannoteAI features (voiceprinting, confidence scores, ...).

## Benchmark

Out of the box, the `pyannote.audio` speaker diarization [pipeline v3.1](https://hf.co/pyannote/speaker-diarization-3.1) is expected to be much better (and faster) than v2.x. The [`pyannoteAI`](https://www.pyannote.ai) premium model goes one step further. The numbers below are diarization error rates (in %): the lower, the better.

| Benchmark (2025-03) | [v2.1](https://hf.co/pyannote/speaker-diarization-2.1) | [v3.1](https://hf.co/pyannote/speaker-diarization-3.1) | <a href="https://docs.pyannote.ai"><img src="https://avatars.githubusercontent.com/u/162698670" width="32" /></a> |
| --- | --- | --- | --- |
| [AISHELL-4](https://arxiv.org/abs/2104.03603) | 14.1 | 12.2 | 12.1 |
| [AliMeeting](https://www.openslr.org/119/) (channel 1) | 27.4 | 24.5 | 19.8 |
| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (IHM) | 18.9 | 18.8 | 15.8 |
| [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (SDM) | 27.1 | 22.7 | 18.3 |
| [AVA-AVD](https://arxiv.org/abs/2111.14448) | 66.3 | 49.7 | 45.3 |
| [CALLHOME](https://catalog.ldc.upenn.edu/LDC2001S97) ([part 2](https://github.com/BUTSpeechFIT/CALLHOME_sublists/issues/1)) | 31.6 | 28.4 | 20.1 |
| [DIHARD 3](https://catalog.ldc.upenn.edu/LDC2022S14) ([full](https://arxiv.org/abs/2012.01477)) | 26.9 | 21.4 | 17.2 |
| [Earnings21](https://github.com/revdotcom/speech-datasets) | 17.0 | 9.4 | 9.0 |
| [Ego4D](https://arxiv.org/abs/2110.07058) (dev.) | 61.5 | 51.2 | 45.8 |
| [MSDWild](https://github.com/X-LANCE/MSDWILD) | 32.8 | 25.4 | 19.7 |
| [RAMC](https://www.openslr.org/123/) | 22.5 | 22.2 | 11.1 |
| [REPERE](https://www.islrn.org/resources/360-758-359-485-0/) (phase2) | 8.2 | 7.9 | 7.6 |
| [VoxConverse](https://github.com/joonson/voxconverse) (v0.3) | 11.2 | 11.2 | 9.9 |

[Diarization error rate](http://pyannote.github.io/pyannote-metrics/reference.html#diarization) (in %)

- - :hugs: pretrained [pipelines](https://hf.co/models?other=pyannote-audio-pipeline) (and [models](https://hf.co/models?other=pyannote-audio-model)) on [:hugs: model hub](https://huggingface.co/pyannote)
- - :exploding_head: state-of-the-art performance (see [Benchmark](#benchmark))
- - :snake: Python-first API
- - :zap: multi-GPU training with [pytorch-lightning](https://pytorchlightning.ai/)

## Documentation

@@ -78,29 +130,6 @@ for turn, _, speaker in diarization.itertracks(yield_label=True):
- 2024-04-05 > [Offline speaker diarization (speaker-diarization-3.1)](tutorials/community/offline_usage_speaker_diarization.ipynb) by [Simon Ottenhaus](https://github.com/simonottenhauskenbun)
- 2024-09-24 > [Evaluating `pyannote` pretrained speech separation pipelines](tutorials/community/eval_separation_pipeline.ipynb) by [Clément Pagés](https://github.com/)

- ## Benchmark
-
- Out of the box, `pyannote.audio` speaker diarization [pipeline](https://hf.co/pyannote/speaker-diarization-3.1) v3.1 is expected to be much better (and faster) than v2.x.
- Those numbers are diarization error rates (in %):
-
- | Benchmark | [v2.1](https://hf.co/pyannote/speaker-diarization-2.1) | [v3.1](https://hf.co/pyannote/speaker-diarization-3.1) | [pyannoteAI](https://www.pyannote.ai) |
- | --- | --- | --- | --- |
- | [AISHELL-4](https://arxiv.org/abs/2104.03603) | 14.1 | 12.2 | 11.9 |
- | [AliMeeting](https://www.openslr.org/119/) (channel 1) | 27.4 | 24.4 | 22.5 |
- | [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (IHM) | 18.9 | 18.8 | 16.6 |
- | [AMI](https://groups.inf.ed.ac.uk/ami/corpus/) (SDM) | 27.1 | 22.4 | 20.9 |
- | [AVA-AVD](https://arxiv.org/abs/2111.14448) | 66.3 | 50.0 | 39.8 |
- | [CALLHOME](https://catalog.ldc.upenn.edu/LDC2001S97) ([part 2](https://github.com/BUTSpeechFIT/CALLHOME_sublists/issues/1)) | 31.6 | 28.4 | 22.2 |
- | [DIHARD 3](https://catalog.ldc.upenn.edu/LDC2022S14) ([full](https://arxiv.org/abs/2012.01477)) | 26.9 | 21.7 | 17.2 |
- | [Earnings21](https://github.com/revdotcom/speech-datasets) | 17.0 | 9.4 | 9.0 |
- | [Ego4D](https://arxiv.org/abs/2110.07058) (dev.) | 61.5 | 51.2 | 43.8 |
- | [MSDWild](https://github.com/X-LANCE/MSDWILD) | 32.8 | 25.3 | 19.8 |
- | [RAMC](https://www.openslr.org/123/) | 22.5 | 22.2 | 18.4 |
- | [REPERE](https://www.islrn.org/resources/360-758-359-485-0/) (phase2) | 8.2 | 7.8 | 7.6 |
- | [VoxConverse](https://github.com/joonson/voxconverse) (v0.3) | 11.2 | 11.3 | 9.4 |
-
- [Diarization error rate](http://pyannote.github.io/pyannote-metrics/reference.html#diarization) (in %)
## Citations

If you use `pyannote.audio`, please use the following citations:

pyproject.toml (+1)
@@ -29,6 +29,7 @@ dependencies = [
"torchmetrics>=1.6.1",
"soundfile>=0.13.1",
"matplotlib>=3.10.0",
+ "pyannoteai.sdk>=0.1.0",
]

[project.scripts]
`__init__.py` (new file, +25; pyannoteAI wrapper package — full path not shown in this excerpt)
@@ -0,0 +1,25 @@
# The MIT License (MIT)
#
# Copyright (c) 2025- pyannoteAI
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

from .speaker_diarization import PremiumSpeakerDiarization

__all__ = ["PremiumSpeakerDiarization"]
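
Editor's note (not part of the diff): per the README changes above, the exported `PremiumSpeakerDiarization` wrapper is reached through the regular `Pipeline` API, so switching providers only swaps the checkpoint name and token:

```python
from pyannote.audio import Pipeline

# Loading the pyannoteAI checkpoint routes inference through the SDK
# wrapper added by this commit; audio is processed on pyannoteAI servers.
pipeline = Pipeline.from_pretrained(
    "pyannoteAI/speaker-diarization-precision", token="PYANNOTEAI_API_KEY")
diarization = pipeline("conversation.wav")
```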
