⭐ Star us on GitHub, join our Discord, or follow us on X

This notebook provides a comprehensive guide to generating user queries and structured tool call data using CAMEL's ChatAgent framework. By utilizing real tools and the Hermes JSON format for function calls, the tutorial demonstrates a structured approach to scalable and flexible data generation.

In this notebook, you'll explore:
CAMEL’s ChatAgent Framework: A multi-agent system for generating human-like queries and structured tool call data, leveraging its modular and adaptable design.
Hermes Function Calling Format: A standardized JSON-based format for encoding function calls, ensuring consistency and interoperability in structured data outputs (a minimal example appears just after this list).
OpenAI API Integration: Enabling advanced natural language understanding and generation for crafting user queries and processing tool responses.
Toolkits Integration: Leveraging tools such as MathToolkit, SearchToolkit, and others to provide diverse functionalities for realistic scenarios.
Automated Data Generation: End-to-end pipeline for generating tool-specific user queries, structuring their outputs, and saving them as JSON files.
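To make the target format concrete, here is a minimal, hypothetical sketch of what a Hermes-style function call and its result look like inside a generated conversation. The tool name and argument values below are illustrative only (borrowed from a simple math tool), not output from the pipeline.

# Illustrative only: in the Hermes format, each function call is a JSON object
# wrapped in <tool_call> XML tags, and the tool's result is later returned
# wrapped in <tool_response> tags. The name and arguments are hypothetical.
hermes_tool_call_example = (
    "<tool_call>\n"
    "{'name': 'add', 'arguments': {'a': 2, 'b': 3}}\n"
    "</tool_call>"
)
hermes_tool_response_example = (
    "<tool_response>\n"
    "{'name': 'add', 'content': 5}\n"
    "</tool_response>"
)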
import os
from getpass import getpass

# Prompt for the OpenAI API key securely
openai_api_key = getpass('Enter your OpenAI API key: ')
os.environ["OPENAI_API_KEY"] = openai_api_key
Alternatively, if running on Colab, you can save your API keys and tokens as Colab Secrets and use them across notebooks. To do so, comment out the manual API key prompt above and uncomment the following code block. ⚠️ Don't forget to grant the current notebook access to the API key you will be using.
# import os
# from google.colab import userdata
# os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
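The helper functions defined in the following steps rely on a handful of imports from the Python standard library and the CAMEL package. The block below is a sketch of the imports used in this notebook; the exact module paths assume a recent camel-ai release and may need adjusting for your installed version.

import json
from typing import Any, Callable, List, Union

from camel.agents import ChatAgent
from camel.messages import (
    FunctionCallingMessage,
    HermesFunctionFormatter,
    ShareGPTConversation,
    ShareGPTMessage,
)
from camel.models import ModelFactory
from camel.toolkits import FunctionTool, MathToolkit, SearchToolkit
from camel.types import ModelPlatformType, ModelType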
Step 2: Define Function to Generate User Queries
This function leverages specific tools to generate human-like queries. We'll set up a ChatAgent to create relevant, tool-specific user queries.
def generate_user_query(selected_tool: Union[Callable, FunctionTool],
                        n: int = 1) -> List[str]:
    r"""Generates user queries by leveraging specific tools, helping the
    ChatAgent craft human-like queries that take advantage of each tool's
    functionality.

    Args:
        selected_tool (Union[Callable, FunctionTool]): The tool to leverage
            for query generation.
        n (int, optional): The number of queries to generate. Defaults to 1.
    """
    tool_call_sys_msg = (
        "You are a smart query generator designed to utilize specific tools "
        "based on user needs. "
        "Formulate queries as a human user would.\n"
        "Instructions:\n"
        "1. Envision a real-world scenario, but don't say it out loud.\n"
        "2. Craft a realistically phrased, actionable query fitting that "
        "scenario that could be satisfied with the provided tool(s).\n"
        "3. With the tool(s) in mind, phrase the query naturally and "
        "informatively.\n"
        "4. Only rely on information the tools appear likely to provide.\n"
        "5. Pose the query as if the user doesn't know what tools are "
        "available."
    )

    # Convert to FunctionTool if necessary
    if not isinstance(selected_tool, FunctionTool):
        selected_tool = FunctionTool(selected_tool)

    # Create a model instance for generating queries
    query_model = ModelFactory.create(
        model_platform=ModelPlatformType.OPENAI,
        model_type=ModelType.GPT_4O_MINI,
        model_config_dict={"temperature": 1},
    )

    # Initialize ChatAgent with the system message
    query_agent = ChatAgent(
        system_message=tool_call_sys_msg,
        model=query_model,
    )

    queries = []

    # Prepare a tools info message to guide query generation
    tools_info_message = (
        "Generate a relevant query based on the following tool details:\n\n"
        f"Tool Schema: {selected_tool.get_openai_tool_schema()}"
    )

    # Generate queries
    for _ in range(n):
        response = query_agent.step(tools_info_message)
        # Extract the generated query
        queries.append(response.msgs[0].content)

    return queries
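As a quick sanity check, you could run the function on a single tool and inspect the generated queries. The snippet below is a minimal usage sketch (it assumes the imports above and a valid OPENAI_API_KEY); the choice of tool and the number of queries are arbitrary.

# Pick one real tool (the first tool exposed by MathToolkit) and generate
# two example queries for it.
math_tool = MathToolkit().get_tools()[0]
example_queries = generate_user_query(selected_tool=math_tool, n=2)
for query in example_queries:
    print(query)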
Step 3: Define Function to Generate Structured Tool Call Data
This function will structure tool call data based on user queries by leveraging each selected tool.
def generate_tool_call_data(user_messages: List[str],
                            selected_tool: Union[Callable, FunctionTool]
                            ) -> List[Any]:
    r"""Generates structured tool call data for a list of user messages by
    using the specified tool."""
    # Convert to FunctionTool if necessary
    if not isinstance(selected_tool, FunctionTool):
        selected_tool = FunctionTool(selected_tool)

    # System messages: a plain one for the agent and a Hermes-format one for
    # the exported ShareGPT conversation
    base_system_msg = "You are a function calling AI model. "
    hermes_tool_call_sys_msg = (
        "You are a function calling AI model. You are provided with "
        "function signatures within <tools> </tools> XML tags. You may call "
        "one or more functions to assist with the user query. If available "
        "tools are not relevant in assisting with user query, just respond "
        "in natural conversational language. Don't make assumptions about "
        "what values to plug into functions. After calling & executing the "
        "functions, you will be provided with function results within "
        "<tool_response> </tool_response> XML tags."
        "\n <tools> \n"
        f"{[selected_tool.get_openai_tool_schema()]}"
        "\n </tools>\n"
        "For each function call return a JSON object, with the following "
        "pydantic model json schema: {'title': 'FunctionCall', 'type': "
        "'object', 'properties': {'name': {'title': 'Name', 'type': "
        "'string'}, 'arguments': {'title': 'Arguments', 'type': 'object'}}, "
        "'required': ['arguments', 'name']}\n"
        "Each function call should be enclosed within <tool_call> "
        "</tool_call> XML tags.\n"
        "Example:\n"
        "<tool_call>\n"
        "{'name': <function-name>, 'arguments': <args-dict>}\n"
        "</tool_call>"
    )
    sys_hermes = ShareGPTMessage(from_='system',
                                 value=hermes_tool_call_sys_msg)

    # Initialize model for tool call data generation
    tool_model = ModelFactory.create(
        model_platform=ModelPlatformType.OPENAI,
        model_type=ModelType.GPT_4O_MINI,
    )

    tool_output_data = []
    for user_message in user_messages:
        # Set up ChatAgent with the system message, model, and the single tool
        tool_agent = ChatAgent(
            system_message=base_system_msg,
            model=tool_model,
            tools=[selected_tool],
        )

        # Generate a response (and any tool calls) for the user message
        try:
            tool_agent.step(user_message)
        except Exception as e:
            print(f"Error: {e}")
            continue

        # Rebuild the conversation from the agent's memory and convert it to
        # Hermes-formatted ShareGPT messages
        messages = [record.memory_record.message
                    for record in tool_agent.memory.retrieve()]
        sharegpt_hermes_msgs = (
            [sys_hermes]
            + [msg.to_sharegpt(HermesFunctionFormatter())
               for msg in messages[1:]]
        )

        # Only include conversations that contain function calls
        if any(isinstance(message, FunctionCallingMessage)
               for message in messages):
            tool_output_data.append(
                json.loads(
                    ShareGPTConversation(sharegpt_hermes_msgs)
                    .model_dump_json(by_alias=True)
                )
            )

    return tool_output_data
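A minimal single-tool run that chains the two helpers might look like the sketch below; the tool, the number of queries, and the printing are illustrative choices, not part of the function definitions above.

# Generate queries for one search tool, then structure the resulting tool calls
search_tool = FunctionTool(SearchToolkit().search_wiki)
queries = generate_user_query(selected_tool=search_tool, n=2)
tool_data = generate_tool_call_data(queries, search_tool)
print(json.dumps(tool_data, indent=2))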
# API keys can be set up in the notebook like this:
# google_api_key = getpass('Enter your API key: ')
# os.environ["GOOGLE_API_KEY"] = google_api_key
# weather_api_key = getpass('Enter your API key: ')
# os.environ["OPENWEATHERMAP_API_KEY"] = weather_api_key

selected_tools = [
    *MathToolkit().get_tools(),
    # Add more tools as needed, though they require API keys
    *[
        # Search tools with no API keys required
        FunctionTool(SearchToolkit().search_duckduckgo),
        FunctionTool(SearchToolkit().search_wiki),
    ],
    # FunctionTool(SearchToolkit().search_google),
    # *ArxivToolkit().get_tools(),
    # *GoogleMapsToolkit().get_tools(),
    # *WeatherToolkit().get_tools(),
]
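With the tools selected, the end-to-end pipeline mentioned in the introduction loops over each tool, generates queries, structures the tool call data, and writes everything to disk as JSON. The loop below is a sketch; the number of queries per tool and the output filename are arbitrary choices.

all_tool_data = {}
for tool in selected_tools:
    # Ensure each entry is a FunctionTool before passing it to the helpers
    tool = tool if isinstance(tool, FunctionTool) else FunctionTool(tool)
    queries = generate_user_query(selected_tool=tool, n=5)
    all_tool_data[tool.func.__name__] = generate_tool_call_data(queries, tool)

# Save the structured conversations, keyed by tool name, to a JSON file
with open("hermes_tool_call_data.json", "w") as f:
    json.dump(all_tool_data, f, indent=2)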
In this tutorial, you learned how to generate user queries and structure tool call data by leveraging multiple tools with the CAMEL library. This setup enables scalable and flexible data generation for a variety of toolkits and tasks.
⭐ Star us on GitHub, join our Discord, or follow us on X