> ## Documentation Index
> Fetch the complete documentation index at: https://docs.camel-ai.org/llms.txt
> Use this file to discover all available pages before exploring further.

# Camel.models.openai audio models

<a id="camel.models.openai_audio_models" />

<a id="camel.models.openai_audio_models.OpenAIAudioModels" />

## OpenAIAudioModels

```python theme={"system"}
class OpenAIAudioModels(BaseAudioModel):
```

Provides access to OpenAI's Text-to-Speech (TTS) and Speech\_to\_Text
(STT) models.

<a id="camel.models.openai_audio_models.OpenAIAudioModels.__init__" />

### **init**

```python theme={"system"}
def __init__(
    self,
    api_key: Optional[str] = None,
    url: Optional[str] = None,
    timeout: Optional[float] = None
):
```

Initialize an instance of OpenAI.

<a id="camel.models.openai_audio_models.OpenAIAudioModels.text_to_speech" />

### text\_to\_speech

```python theme={"system"}
def text_to_speech(self, input: str, **kwargs: Any):
```

Convert text to speech using OpenAI's TTS model. This method
converts the given input text to speech using the specified model and
voice.

**Parameters:**

* **input** (str): The text to be converted to speech.
* **model\_type** (AudioModelType, optional): The TTS model to use. Defaults to `AudioModelType.TTS_1`.
* **voice** (VoiceType, optional): The voice to be used for generating speech. Defaults to `VoiceType.ALLOY`.
* **storage\_path** (str, optional): The local path to store the generated speech file if provided, defaults to `None`. \*\*kwargs (Any): Extra kwargs passed to the TTS API.

**Returns:**

Union\[List\[\_legacy\_response.HttpxBinaryResponseContent],
\_legacy\_response.HttpxBinaryResponseContent]: List of response
content object from OpenAI if input characters more than 4096,
single response content if input characters less than 4096.

<a id="camel.models.openai_audio_models.OpenAIAudioModels._split_audio" />

### \_split\_audio

```python theme={"system"}
def _split_audio(self, audio_file_path: str, chunk_size_mb: int = 24):
```

Split the audio file into smaller chunks. Since the Whisper API
only supports files that are less than 25 MB.

**Parameters:**

* **audio\_file\_path** (str): Path to the input audio file.
* **chunk\_size\_mb** (int, optional): Size of each chunk in megabytes. Defaults to `24`.

**Returns:**

list: List of paths to the split audio files.

<a id="camel.models.openai_audio_models.OpenAIAudioModels.speech_to_text" />

### speech\_to\_text

```python theme={"system"}
def speech_to_text(
    self,
    audio_file_path: str,
    translate_into_english: bool = False,
    **kwargs: Any
):
```

Convert speech audio to text.

**Parameters:**

* **audio\_file\_path** (str): The audio file path, supporting one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
* **translate\_into\_english** (bool, optional): Whether to translate the speech into English. Defaults to `False`. \*\*kwargs (Any): Extra keyword arguments passed to the Speech-to-Text (STT) API.

**Returns:**

str: The output text.

<a id="camel.models.openai_audio_models.OpenAIAudioModels.audio_question_answering" />

### audio\_question\_answering

```python theme={"system"}
def audio_question_answering(
    self,
    audio_file_path: str,
    question: str,
    model: str = 'gpt-4o-mini-audio-preview',
    **kwargs: Any
):
```

Answer a question directly using the audio content.

**Parameters:**

* **audio\_file\_path** (str): The path to the audio file.
* **question** (str): The question to ask about the audio content.
* **model** (str, optional): The model to use for audio question answering. (default: :obj:`"gpt-4o-mini-audio-preview"`) \*\*kwargs (Any): Extra keyword arguments passed to the chat completions API.

**Returns:**

str: The model's response to the question.
