Azure OpenAI client transcribes or translates speech depending on the deployment used, not the library function #1910

Open
1 task done
s-zanella opened this issue Nov 29, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@s-zanella

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

The behaviour of the AzureOpenAI client depends on the deployment used and not on the library function called.

For an OpenAI client named client, client.audio.transcriptions transcribes speech and client.audio.translations translates it, as expected.

For an AzureOpenAI client named azure_client, however, whether you get a transcription or a translation depends on the endpoint path specified in azure_deployment, regardless of the library function called. The function only determines the class of the result object, not its text.

To Reproduce

  1. Observe OpenAI client behavior.
import os
from openai import OpenAI, AzureOpenAI
from azure.identity import get_bearer_token_provider, DefaultAzureCredential

# Example file: https://upload.wikimedia.org/wikipedia/commons/b/b1/Candide_01_voltaire.mp3
audio_file = "./Candide_01_voltaire.mp3"

openai_client = OpenAI()

with open(audio_file, 'rb') as f:
    result = openai_client.audio.transcriptions.create(
        file=f,            
        model="whisper-1",
    )

    print(f"openai_client.audio.transcriptions: {result.__class__.__name__, result.text[:60]}")

    result = openai_client.audio.translations.create(
        file=f,            
        model="whisper-1",
    )

    print(f"openai_client.audio.translations: {result.__class__.__name__, result.text[:60]}")

Output:
openai_client.audio.transcriptions: ('Transcription', "Chapitre premier de Candide ou l'optimisme, de Voltaire, enregistré pour LibriVo")
openai_client.audio.translations: ('Translation', 'Chapter 1 of Candide or Optimism, by Voltaire, recorded for LibriVox.org by Bern')
  2. Observe that AzureOpenAI client behavior does not depend on the library function used.
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
azure_endpoint = os.environ['AZURE_ENDPOINT']
azure_deployment = "whisper/audio/transcriptions?api-version=2024-06-01"

azure_client = AzureOpenAI(
    azure_ad_token_provider=token_provider,
    azure_endpoint=azure_endpoint,
    azure_deployment=azure_deployment,
    api_version="2024-06-01"
)

with open(audio_file, 'rb') as f:
    result = azure_client.audio.transcriptions.create(
        file=f,            
        model="whisper-1",
    )

    print(f"azure_client.audio.transcriptions: {result.__class__.__name__, result.text[:60]}")

    result = azure_client.audio.translations.create(
        file=f,            
        model="whisper-1",
    )

    print(f"azure_client.audio.translations: {result.__class__.__name__, result.text[:60]}")

Output:
azure_client.audio.transcriptions: ('Transcription', "Chapitre premier de Candide ou l'optimisme, de Voltaire, enr")
azure_client.audio.translations: ('Translation', "Chapitre premier de Candide ou l'optimisme, de Voltaire, enr")

# Repeat with the translations path in azure_deployment instead
azure_deployment = "whisper/audio/translations?api-version=2024-06-01"

azure_client = AzureOpenAI(
    azure_ad_token_provider=token_provider,
    azure_endpoint=azure_endpoint,
    azure_deployment=azure_deployment,
    api_version="2024-06-01"
)

with open(audio_file, 'rb') as f:
    result = azure_client.audio.transcriptions.create(
        file=f,            
        model="whisper-1",
    )

    print(f"azure_client.audio.transcriptions: {result.__class__.__name__, result.text[:60]}")

    result = azure_client.audio.translations.create(
        file=f,            
        model="whisper-1",
    )

    print(f"azure_client.audio.translations: {result.__class__.__name__, result.text[:60]}")

Output:
azure_client.audio.transcriptions: ('Transcription', 'Chapter 1 of Candide or Optimism, by Voltaire, recorded for ')
azure_client.audio.translations: ('Translation', 'Chapter 1 of Candide or Optimism, by Voltaire, recorded for ')

Code snippets

No response

OS

Ubuntu

Python version

Python v3.12.7

Library version

openai v1.55.3

@s-zanella s-zanella added the bug Something isn't working label Nov 29, 2024
@RobertCraigie
Collaborator

cc @kristapratico

@kristapratico
Contributor

@s-zanella can you share the reason for including the full path + API version ("whisper/audio/translations?api-version=2024-06-01") in the azure_deployment parameter? It is expected that only the deployment name, i.e. whisper, is passed and the client will build the URL.
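
To illustrate why the extra path matters, here is a simplified sketch that uses only the standard library. It assumes the client joins the endpoint, the deployment value, and the method's path by plain concatenation. That is an assumption made for illustration, not a description of the library's actual code; build_url and the example.openai.azure.com host are hypothetical.

```python
from urllib.parse import urlsplit


def build_url(endpoint: str, deployment: str, method_path: str) -> str:
    """Hypothetical URL builder: endpoint + deployment + method path."""
    return f"{endpoint.rstrip('/')}/openai/deployments/{deployment}{method_path}"


endpoint = "https://example.openai.azure.com"  # placeholder host (assumption)

# Deployment name only: the method path decides the operation.
good = urlsplit(build_url(endpoint, "whisper", "/audio/translations"))
print(good.path)

# Full path and query pasted into the deployment value: the method path is
# appended after '?', so it ends up in the query string and the path still
# points at the operation named in the deployment value.
bad = urlsplit(
    build_url(
        endpoint,
        "whisper/audio/transcriptions?api-version=2024-06-01",
        "/audio/translations",
    )
)
print(bad.path)
print(bad.query)
```

Under that assumption, the second URL keeps /audio/transcriptions as its path even though the translations method was called, which would match the behaviour reported above.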

@s-zanella
Author

I see. https://{endpoint}/openai/deployments/whisper/audio/translations?api-version=2024-06-01 is the endpoint URI shown in the Azure OpenAI Service portal. For other services, copying this URI and passing it as azure_endpoint usually works, which is what I did initially. The code above is refactored to use the azure_endpoint and azure_deployment parameters, and in retrospect that makes it more evident that I should have used azure_deployment = "whisper".
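
For anyone else who starts from the portal URI, a small helper along these lines could split it into the three values the client expects. This is only a sketch: split_portal_uri is a hypothetical name, the host is a placeholder, and it assumes the URI has the openai/deployments/<name>/... shape shown above.

```python
from urllib.parse import parse_qs, urlsplit


def split_portal_uri(uri: str) -> dict:
    """Split a portal-style URI into client arguments (hypothetical helper)."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed shape: openai/deployments/<name>/...
    if "deployments" not in segments or segments[-1] == "deployments":
        raise ValueError(f"no deployment name found in {uri!r}")
    deployment = segments[segments.index("deployments") + 1]
    # api-version is optional in this sketch; None when absent.
    api_version = parse_qs(parts.query).get("api-version", [None])[0]
    return {
        "azure_endpoint": f"{parts.scheme}://{parts.netloc}",
        "azure_deployment": deployment,
        "api_version": api_version,
    }


args = split_portal_uri(
    "https://example.openai.azure.com/openai/deployments/whisper"
    "/audio/translations?api-version=2024-06-01"
)
print(args)
```

The resulting dictionary could then be unpacked into the AzureOpenAI constructor alongside the credential arguments.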

I tried removing the API version, and also using whisper/audio from the original URI, but neither worked. Using just whisper works as expected. Still, the behaviour is puzzling and I feel that there should be a guard against using an endpoint that does not match the library function.
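
As a rough idea of what such a guard might look like on the caller's side, here is a minimal sketch. check_deployment_name is a hypothetical helper, not part of the openai package, and it assumes that a bare deployment name never contains "/" or "?".

```python
def check_deployment_name(deployment: str) -> str:
    """Reject values that look like a URL path or query instead of a bare name."""
    # Assumption: a deployment name never contains '/' or '?'.
    if "/" in deployment or "?" in deployment:
        raise ValueError(
            "azure_deployment should be a deployment name such as 'whisper', "
            f"got {deployment!r}"
        )
    return deployment


print(check_deployment_name("whisper"))

try:
    check_deployment_name("whisper/audio/translations?api-version=2024-06-01")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Calling it on the value before passing azure_deployment to the client would turn the silent mismatch into an immediate error.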
