camel.models.sglang_model
SGLangModel
SGLang service interface.
Parameters:
- model_type (Union[ModelType, str]): Model for which a backend is created.
- model_config_dict (Optional[Dict[str, Any]], optional): A dictionary that will be fed into `openai.ChatCompletion.create()`. If `None`, `SGLangConfig().as_dict()` will be used. (default: `None`)
- api_key (Optional[str], optional): The API key for authenticating with the model service. SGLang does not require an API key; if one is set, it will be ignored. (default: `None`)
- url (Optional[str], optional): The URL to the model service. If not provided, `"http://127.0.0.1:30000/v1"` will be used. (default: `None`)
- token_counter (Optional[BaseTokenCounter], optional): Token counter to use for the model. If not provided, `OpenAITokenCounter(ModelType.GPT_4O_MINI)` will be used. (default: `None`)
- timeout (Optional[float], optional): The timeout value in seconds for API calls. If not provided, it falls back to the MODEL_TIMEOUT environment variable or defaults to 180 seconds. (default: `None`)

Reference: https://sgl-project.github.io/backend/openai_api_completions.html
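As a usage sketch (the model name and sampling settings below are illustrative assumptions, not values from this page), the backend can be constructed directly with the defaults described above:

```python
from camel.configs import SGLangConfig
from camel.models import SGLangModel

# Illustrative sketch: the model name and temperature are assumptions.
model = SGLangModel(
    model_type="meta-llama/Llama-3.1-8B-Instruct",
    model_config_dict=SGLangConfig(temperature=0.0).as_dict(),
    url="http://127.0.0.1:30000/v1",  # default local SGLang endpoint
)
```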
__init__
_start_server
_ensure_server_running
Ensures that the server is running. If not, starts the server.
_monitor_inactivity
Monitor whether the server process has been inactive for over 10 minutes.
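A minimal sketch of what such an inactivity watchdog can look like; the attribute names (`_server_process`, `_last_run_time`) and the 30-second polling interval are assumptions for illustration, not the actual implementation:

```python
import time

def _monitor_inactivity(self) -> None:
    # Illustrative sketch: periodically check when the last request was
    # served, and shut the server down after 10 minutes (600 s) of idleness.
    while self._server_process is not None:  # hypothetical attribute
        time.sleep(30)
        if time.time() - self._last_run_time > 600:  # hypothetical attribute
            self.cleanup()
            break
```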
token_counter
Returns:
BaseTokenCounter: The token counter following the model’s tokenization style.
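For example, the counter can be used to estimate the prompt size before a call (a sketch; `count_tokens_from_messages` is assumed here as the usual `BaseTokenCounter` entry point):

```python
# Estimate how many tokens a prompt will consume before sending it.
messages = [{"role": "user", "content": "Hello, SGLang!"}]
n_tokens = model.token_counter.count_tokens_from_messages(messages)
print(n_tokens)
```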
check_model_config
_run
Runs inference of an OpenAI-compatible chat completion.
Parameters:
- messages (List[OpenAIMessage]): Message list with the chat history in OpenAI API format.
Returns:
Union[ChatCompletion, Stream[ChatCompletionChunk]]: `ChatCompletion` in the non-stream mode, or `Stream[ChatCompletionChunk]` in the stream mode.
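`_run` is invoked through the backend's public `run()` entry point; a sketch of non-stream usage, assuming the `model` constructed earlier:

```python
# Non-stream sketch: pass OpenAI-format chat messages and read the reply.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain SGLang in one sentence."},
]
response = model.run(messages)  # ChatCompletion when streaming is off
print(response.choices[0].message.content)
```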
stream
Returns:
bool: Whether the model is in stream mode.
__del__
Properly clean up resources when the model is destroyed.
cleanup
Terminate the server process and clean up resources.
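A sketch of explicit cleanup around a call, continuing the example above, so a locally launched server does not outlive the program:

```python
try:
    response = model.run(messages)
finally:
    model.cleanup()  # terminate the locally started SGLang server, if any
```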
_terminate_process
_kill_process_tree
Kill the process and all its child processes.
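A sketch of the behavior described above; the use of `psutil` here is an assumption for illustration, and the real helper may rely on other primitives:

```python
import psutil

def _kill_process_tree(pid: int) -> None:
    # Illustrative sketch: kill all descendants first, then the parent,
    # so no orphaned worker processes are left behind.
    try:
        parent = psutil.Process(pid)
    except psutil.NoSuchProcess:
        return
    for child in parent.children(recursive=True):
        child.kill()
    parent.kill()
```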
_execute_shell_command
Execute a shell command and return the process handle.
Parameters:
- command: Shell command as a string (can include \ line continuations).
Returns:
subprocess.Popen: Process handle
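A sketch of this behavior using `subprocess` directly (an illustration, not necessarily the actual implementation):

```python
import subprocess

def _execute_shell_command(command: str) -> subprocess.Popen:
    # Illustrative sketch: run the command through a shell (so multi-line
    # commands with "\" continuations work) and return the live handle.
    return subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
```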
_wait_for_server
Wait for the server to be ready by polling the /v1/models endpoint.
Parameters:
- base_url (str): The base URL of the server.
- timeout (Optional[float]): Maximum time to wait in seconds. (default: `30`)
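A sketch of such a readiness poll, assuming the `requests` library (the actual helper may use a different HTTP client):

```python
import time
import requests

def _wait_for_server(base_url: str, timeout: float = 30) -> None:
    # Illustrative sketch: poll GET {base_url}/v1/models until it answers
    # with HTTP 200 or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(f"{base_url}/v1/models", timeout=5).ok:
                return
        except requests.RequestException:
            pass  # server not up yet; retry
        time.sleep(1)
    raise TimeoutError(f"Server at {base_url} did not become ready in {timeout}s")
```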