Embodied Agents
Philosophical Bits
We believe the essence of intelligence emerges from its dynamic interactions with the external environment, where the use of various tools becomes a pivotal factor in its development and manifestation.
The EmbodiedAgent() in CAMEL is an advanced conversational agent that leverages code interpreters and tool agents (e.g., HuggingFaceToolAgent()) to execute diverse tasks efficiently. This agent represents a blend of advanced programming and AI capabilities, and is able to interact and respond within a dynamic environment.
Quick Start
Let’s first play with a ChatAgent instance by simply initializing it with a system message and interacting with it through user messages.
🕹 Step 0: Preparations
Setting Up API Keys
You’ll need to set up your API keys for OpenAI.
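A minimal sketch of the manual key setup, assuming the key is read from the OPENAI_API_KEY environment variable. The helper name ensure_api_key and its prompt parameter are hypothetical; they are only there so the key is requested once and never echoed on screen.

```python
import os
from getpass import getpass


def ensure_api_key(name="OPENAI_API_KEY", prompt=getpass):
    """Return the key stored in the environment, prompting only if it is missing."""
    if not os.environ.get(name):
        # getpass hides the typed value so the key is not echoed in the notebook.
        os.environ[name] = prompt(f"Enter your {name}: ")
    return os.environ[name]


# In a notebook cell you would then call:
# ensure_api_key()
```

Calling the helper a second time is a no-op, which makes the cell safe to re-run.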
Alternatively, if you are running on Colab, you can save your API keys and tokens as Colab Secrets and use them across notebooks.
To do so, read the key from Colab Secrets instead of prompting for it manually.
⚠️ Don’t forget to grant the current notebook access to the API key you will be using.
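The Colab Secrets route could look roughly like the sketch below. It assumes the secret is stored under the name OPENAI_API_KEY; the helper name load_key_from_colab_secrets is hypothetical, and the function simply reports False when it is not running inside Colab.

```python
import os


def load_key_from_colab_secrets(name="OPENAI_API_KEY"):
    """Copy a Colab Secret into the environment; return False outside Colab."""
    try:
        # Only importable inside a Google Colab runtime.
        from google.colab import userdata
    except ImportError:
        return False
    os.environ[name] = userdata.get(name)
    return True
```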
🕹 Step 1: Define the Role
We first need to set up the necessary information.
The meta_dict and role_type will be used to generate the system message.
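As a sketch of this step: the role name and task below are illustrative values, and the generator call assumes a CAMEL version that exposes SystemMessageGenerator.from_dict and RoleType.EMBODIMENT, with the role type passed as a (role name, role type) tuple. The wrapper function build_system_message is a hypothetical helper, and CAMEL is imported lazily inside it.

```python
# Illustrative role and task; replace them with your own.
role_name = "Programmer"
task = "Writing and executing codes."

# meta_dict supplies the values substituted into the system-message template.
meta_dict = {"role": role_name, "task": task}


def build_system_message(meta_dict, role_name):
    """Generate the system message (assumed CAMEL API, see the note above)."""
    # Lazy imports: these require `pip install camel-ai`.
    from camel.generators import SystemMessageGenerator
    from camel.types import RoleType

    role_tuple = (role_name, RoleType.EMBODIMENT)
    return SystemMessageGenerator().from_dict(
        meta_dict=meta_dict, role_tuple=role_tuple
    )


# sys_msg = build_system_message(meta_dict, role_name)
```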
🕹 Step 2: Initialize the Agent 🐫
Based on the system message, we are ready to initialize our embodied agent.
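Initialization might look like the following sketch. It assumes EmbodiedAgent is importable from camel.agents and accepts the keyword arguments shown; the wrapper make_embodied_agent is a hypothetical helper, and sys_msg stands for the system message generated in Step 1.

```python
def make_embodied_agent(sys_msg, verbose=True):
    """Build an EmbodiedAgent from a system message (assumed CAMEL API)."""
    # Lazy import: requires `pip install camel-ai` and a configured API key.
    from camel.agents import EmbodiedAgent

    return EmbodiedAgent(
        system_message=sys_msg,
        tool_agents=None,       # no tool agents
        code_interpreter=None,  # fall back to the default interpreter
        verbose=verbose,        # print the agent's intermediate output
    )


# embodied_agent = make_embodied_agent(sys_msg)
```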
Be aware that the default values of tool_agents and code_interpreter are None, and that the underlying code interpreter is the SubProcessInterpreter(), which executes Python and Bash code within a subprocess.
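To make "execution within a subprocess" concrete, here is a standard-library sketch of the general idea. It is not CAMEL's implementation: the function name run_python_in_subprocess and the timeout default are assumptions for illustration only.

```python
import subprocess
import sys


def run_python_in_subprocess(code, timeout=30):
    """Run a Python snippet in a child process and return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same Python executable, separate process
        capture_output=True,           # collect stdout and stderr
        text=True,                     # decode bytes to str
        timeout=timeout,               # stop waiting on runaway code
    )
    return result.returncode, result.stdout, result.stderr


exit_code, stdout, stderr = run_python_in_subprocess("print('hello from a subprocess')")
```

Running code in a child process keeps crashes and exit calls in the generated code from taking down the notebook kernel. A Bash variant would pass the script to a shell executable instead of sys.executable.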
🕹 Step 3: Interact with the Agent with .step()
Use the base message wrapper to generate the user message.
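A sketch of building the user message, assuming BaseMessage is importable from camel.messages and provides make_user_message(role_name=..., content=...). The request text and the helper make_user_message defined here are illustrative.

```python
# Illustrative request; any natural-language task description works here.
USER_REQUEST = (
    "1. write a bash script to install numpy. "
    "2. then write a python script to compute "
    "the dot product of [8, 9] and [5, 4], "
    "and print the result."
)


def make_user_message(content=USER_REQUEST):
    """Wrap plain text as a user message (assumed CAMEL API)."""
    # Lazy import: requires `pip install camel-ai`.
    from camel.messages import BaseMessage

    return BaseMessage.make_user_message(role_name="user", content=content)


# usr_msg = make_user_message()
```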
Then feed that message into your agent:
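The call itself can be sketched as below, assuming the object returned by .step() exposes the reply as response.msg.content. The helper name ask_agent is hypothetical; agent and usr_msg stand for the objects created in the previous steps.

```python
def ask_agent(agent, usr_msg):
    """Send one user message to the agent and return the reply text."""
    response = agent.step(usr_msg)
    return response.msg.content


# print(ask_agent(embodied_agent, usr_msg))
```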
Under the hood, the agent will perform multiple actions within its action space in the OS to fulfill the user request. It will compose code to implement each action. No worries: it will ask for your permission before executing anything.
Ideally, if you allow the agent to perform its actions, you should get output similar to this: