download_file

def download_file(url: str, cache_dir: str):

Download a file from a URL to a local cache directory.

Parameters:

  • url (str): The URL of the file to download.
  • cache_dir (str): The directory to save the downloaded file.

Returns:

str: The path to the downloaded file.

Raises:

  • Exception: If the download fails.
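The contract above (fetch a URL, store it under cache_dir, return the local path, raise on failure) can be illustrated with a short standard-library sketch. This is not the toolkit's actual implementation: the name download_file_sketch, the fallback file name, and the use of urllib are assumptions made for illustration only.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_file_sketch(url: str, cache_dir: str) -> str:
    """Download `url` into `cache_dir` and return the local file path."""
    # Create the cache directory if it does not exist yet.
    os.makedirs(cache_dir, exist_ok=True)

    # Derive a file name from the URL path; the fallback name is an assumption.
    file_name = os.path.basename(urlparse(url).path) or "downloaded_file"
    local_path = os.path.join(cache_dir, file_name)

    try:
        # urlopen is evaluated first, so no empty file is left behind
        # when the request itself fails.
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            out.write(response.read())
    except Exception as exc:
        # Mirror the documented behavior: a generic Exception on failure.
        raise Exception(f"Failed to download {url}: {exc}") from exc

    return local_path
```

The sketch reads the whole response into memory, which is acceptable for short clips; a streaming copy would suit large audio files better.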

AudioAnalysisToolkit

class AudioAnalysisToolkit(BaseToolkit):

__init__

def __init__(
    self,
    cache_dir: Optional[str] = None,
    transcribe_model: Optional[BaseAudioModel] = None,
    audio_reasoning_model: Optional[BaseModelBackend] = None,
    timeout: Optional[float] = None
):

A toolkit for audio processing and analysis. This class provides methods for processing, transcribing, and extracting information from audio data, including direct question answering about audio content.

Parameters:

  • cache_dir (Optional[str]): Directory path for caching downloaded audio files. If not provided, 'tmp/' is used. (default: None)
  • transcribe_model (Optional[BaseAudioModel]): Model used for audio transcription. If not provided, OpenAIAudioModels is used. (default: None)
  • audio_reasoning_model (Optional[BaseModelBackend]): Model used for audio reasoning and question answering. If not provided, the default model from ChatAgent is used. (default: None)
  • timeout (Optional[float]): The timeout for API requests, in seconds. If None, no timeout is applied. (default: None)

audio2text

def audio2text(self, audio_path: str):

Transcribe audio to text.

Parameters:

  • audio_path (str): The path to the audio file or URL.

Returns:

str: The transcribed text.
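Because audio_path may be either a local path or a URL, a caller sometimes needs to tell the two apart, for example to decide whether a download step is required first. The check below is a minimal sketch of one way to do that; is_url is a hypothetical helper, not part of the toolkit, and treating only http and https as URLs is an assumption.

```python
from urllib.parse import urlparse


def is_url(audio_path: str) -> bool:
    """Return True when `audio_path` looks like an http(s) URL."""
    parsed = urlparse(audio_path)
    # Require both a web scheme and a host, so that Windows drive letters
    # such as "C:\\audio\\a.mp3" are not mistaken for URLs.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this check, "https://example.com/clip.mp3" is classified as a URL, while "/data/clip.mp3" and "clip.mp3" are treated as local paths.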

ask_question_about_audio

def ask_question_about_audio(self, audio_path: str, question: str):

Ask a question about the audio and get an answer from a multimodal model.

Parameters:

  • audio_path (str): The path to the audio file.
  • question (str): The question to ask about the audio.

Returns:

str: The answer to the question.

get_tools

def get_tools(self):

Return the functions in the toolkit as FunctionTool objects.

Returns:

List[FunctionTool]: A list of FunctionTool objects representing the functions in the toolkit.
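The methods documented above might be combined as in the following sketch. The constructor arguments and method names come from the signatures in this reference; the import path camel.toolkits, the wrapper name run_audio_demo, and the 60-second timeout are assumptions. Running it also presumes the default models are configured (for example, with valid API credentials).

```python
def run_audio_demo(audio_path: str, question: str) -> dict:
    """Transcribe an audio file, ask one question about it, and list the tools."""
    # Imported lazily so this module loads even if the package is absent.
    # The import path is an assumption, not confirmed by this reference.
    from camel.toolkits import AudioAnalysisToolkit

    # Rely on the default transcription and reasoning models.
    toolkit = AudioAnalysisToolkit(cache_dir="tmp/", timeout=60.0)

    transcript = toolkit.audio2text(audio_path)
    answer = toolkit.ask_question_about_audio(audio_path, question)

    # get_tools() exposes the toolkit's functions as FunctionTool objects,
    # which is the form an agent would consume.
    tools = toolkit.get_tools()

    return {
        "transcript": transcript,
        "answer": answer,
        "tool_count": len(tools),
    }
```

A call such as run_audio_demo("meeting.wav", "Who is speaking?") would return the transcript, the model's answer, and the number of exposed tools; the file name and question here are placeholders.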