ensure multiple audio formats are supported #374

Open
rachwalk opened this issue Jan 17, 2025 · 0 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@rachwalk
Contributor

This task contains two features:

  1. A preprocess_audio function, similar to preprocess_image, to handle conversion of various audio formats (e.g., .mp3, .wav, np.array with sampling rate) into a standard format accepted by multimodal vendors.
  2. Validation to ensure the model can process and understand the provided audio content (e.g., compatibility with gpt-4o-audio-preview).
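
A minimal sketch of how the two features might look, using only the Python standard library. Everything here is an assumption for illustration, not the project's API: the names `AudioInput`, `AUDIO_CAPABLE_MODELS`, and `validate_audio_support`, the choice of base64-encoded 16-bit mono WAV as the "standard format", and the hard-coded model allow-list. MP3 input is rejected in this sketch because decoding it needs an external decoder (for example ffmpeg), which is left out. The sample branch accepts any sequence of floats, so a 1-D NumPy array should work as well, though that is not exercised here.

```python
import base64
import io
import wave
from pathlib import Path
from typing import Sequence, Tuple, Union

# Hypothetical input type: a path, raw WAV bytes, or (samples, sample_rate).
AudioInput = Union[str, Path, bytes, Tuple[Sequence[float], int]]

# Hypothetical allow-list; a real implementation would likely look this up
# from vendor or model metadata instead of hard-coding it.
AUDIO_CAPABLE_MODELS = {"gpt-4o-audio-preview"}


def _samples_to_wav_bytes(samples: Sequence[float], sample_rate: int) -> bytes:
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    pcm = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm += int(clipped * 32767).to_bytes(2, "little", signed=True)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(pcm))
    return buffer.getvalue()


def preprocess_audio(audio: AudioInput) -> str:
    """Return base64-encoded WAV data for the supported input kinds."""
    if isinstance(audio, tuple):
        samples, sample_rate = audio
        data = _samples_to_wav_bytes(samples, sample_rate)
    elif isinstance(audio, bytes):
        data = audio
    else:
        path = Path(audio)
        if path.suffix.lower() != ".wav":
            # Other formats (.mp3, ...) would need a decoder; not covered here.
            raise ValueError(f"unsupported audio format: {path.suffix!r}")
        data = path.read_bytes()
    # Check the RIFF/WAVE header so malformed input fails early.
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("audio data is not a WAV container")
    return base64.b64encode(data).decode("utf-8")


def validate_audio_support(model_name: str) -> None:
    """Raise if the model is not known to accept audio input."""
    if model_name not in AUDIO_CAPABLE_MODELS:
        raise ValueError(f"model {model_name!r} is not known to support audio")
```

With this shape, a caller would run `validate_audio_support(model_name)` before sending the string returned by `preprocess_audio(...)` to the vendor, so unsupported models fail before any request is made.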
@rachwalk added the enhancement, good first issue, and help wanted labels on Jan 17, 2025