VolcanoModel

class VolcanoModel(OpenAICompatibleModel):
Wraps the Volcano Engine API in a unified OpenAICompatibleModel interface. Parameters:
  • model_type (Union[ModelType, str]): Model for which a backend is created.
  • model_config_dict (Optional[Dict[str, Any]], optional): A dictionary that will be fed into the API call. If None, an empty dictionary ({}) will be used. (default: None)
  • api_key (Optional[str], optional): The API key for authenticating with the Volcano Engine service. (default: None)
  • url (Optional[str], optional): The URL of the Volcano Engine service. (default: https://ark.cn-beijing.volces.com/api/v3)
  • token_counter (Optional[BaseTokenCounter], optional): Token counter to use for the model. If not provided, OpenAITokenCounter will be used. (default: None)
  • timeout (Optional[float], optional): The timeout value in seconds for API calls. If not provided, it falls back to the MODEL_TIMEOUT environment variable or defaults to 180 seconds. (default: None)
  • max_retries (int, optional): Maximum number of retries for API calls. (default: 3)
  • **kwargs (Any): Additional arguments to pass to the client initialization.

init

def __init__(
    self,
    model_type: Union[ModelType, str],
    model_config_dict: Optional[Dict[str, Any]] = None,
    api_key: Optional[str] = None,
    url: Optional[str] = None,
    token_counter: Optional[BaseTokenCounter] = None,
    timeout: Optional[float] = None,
    max_retries: int = 3,
    **kwargs: Any
):
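A minimal construction sketch follows. The import path, the model id, and the VOLCANO_API_KEY environment variable name are assumptions for illustration, not guaranteed by this page:

# Minimal construction sketch; import path, model id, and the
# VOLCANO_API_KEY variable name are illustrative assumptions.
import os

from camel.models import VolcanoModel

model = VolcanoModel(
    model_type="doubao-seed-1-6",                  # illustrative Volcano Engine model id
    model_config_dict={"temperature": 0.2},        # passed straight into the API call
    api_key=os.environ.get("VOLCANO_API_KEY"),     # None if the variable is unset
    url="https://ark.cn-beijing.volces.com/api/v3",
    timeout=60.0,                                  # overrides the MODEL_TIMEOUT fallback
    max_retries=3,
)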

_inject_reasoning_content

def _inject_reasoning_content(self, messages: List[OpenAIMessage]):
Inject the last reasoning_content into assistant messages. For Volcano Engine’s doubao-seed models with deep thinking enabled, the reasoning_content from the model response needs to be passed back in subsequent requests for proper context management. Parameters:
  • messages: The original messages list.
Returns: Messages with reasoning_content added to the last assistant message that has tool_calls.
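A hedged sketch of the injection idea, not the actual implementation; only the reasoning_content and tool_calls fields come from this page, and the standalone function shape is an assumption:

# Sketch: copy the cached reasoning_content onto the newest assistant
# message that carries tool_calls, so the next request keeps the
# model's deep-thinking context. Everything beyond the field names is assumed.
from typing import Any, Dict, List


def inject_reasoning_content(
    messages: List[Dict[str, Any]], last_reasoning: str
) -> List[Dict[str, Any]]:
    patched = [dict(m) for m in messages]      # shallow-copy so the caller's list is untouched
    for msg in reversed(patched):              # walk from the newest message backwards
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            msg.setdefault("reasoning_content", last_reasoning)
            break
    return patched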

_extract_reasoning_content

def _extract_reasoning_content(self, response: ChatCompletion):
Extract reasoning_content from the model response. Parameters:
  • response: The model response.
Returns: The reasoning_content if available, None otherwise.
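A sketch of the extraction step, assuming the response follows the OpenAI ChatCompletion layout and that Volcano Engine attaches reasoning_content to the choice's message:

# Sketch: read reasoning_content off the first choice, returning None when absent.
from typing import Optional

from openai.types.chat import ChatCompletion


def extract_reasoning_content(response: ChatCompletion) -> Optional[str]:
    if not response.choices:
        return None
    # getattr keeps this safe when the provider omits the extra field.
    return getattr(response.choices[0].message, "reasoning_content", None)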

run

def run(
    self,
    messages: List[OpenAIMessage],
    response_format: Optional[Type[BaseModel]] = None,
    tools: Optional[List[Dict[str, Any]]] = None
):
Runs inference of Volcano Engine chat completion. Overrides the base run method to inject reasoning_content from previous responses into subsequent requests, as required by Volcano Engine’s doubao-seed models with deep thinking enabled. Parameters:
  • messages: Message list with the chat history in OpenAI API format.
  • response_format: The format of the response.
  • tools: The schema of the tools to use for the request.
Returns: ChatCompletion in the non-stream mode, or Stream[ChatCompletionChunk] in the stream mode.
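An illustrative call, reusing the model instance from the construction sketch above; the message contents and the get_weather tool schema are hypothetical:

# Illustrative run() call; the tool schema and prompts are made up.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Beijing today?"},
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = model.run(messages, tools=tools)   # ChatCompletion unless streaming is configured
print(response.choices[0].message)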