Critic Agents and Tree Search
Philosophical Bits
What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle.
-- Marvin Minsky, The Society of Mind, p. 308
In this section, we will take a look at the task-oriented RolePlaying() class. We design this in an instruction-following manner. The essence is that, to solve a complex task, you can have two communicative agents work together collaboratively, step by step, to reach a solution. The main concepts include:
- Task: a task can be as simple as an idea, initialized by an inception prompt.
- AI User: the agent who is expected to provide instructions.
- AI Assistant: the agent who is expected to respond with solutions that fulfill the instructions.
Prerequisite: We assume that you have read the section on intro to role-playing.
How do agents accomplish hard tasks? While reasoning can naturally emerge from next-token-prediction pretraining, it is still difficult for agents to solve complex tasks which require lots of intermediate steps. To tackle this issue, tree search is a simple and effective framework.
A typical tree search includes node expansion and node selection. In its March 2023 paper, CAMEL introduces a heuristic tree search approach with a critic in the loop, where the expansion and selection are presented below:
To put it simply, a critic agent is a helper agent in the role-playing session. It is capable of selecting among proposals and providing informative verbal feedback to the role-playing agents.
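To make expansion and selection concrete, here is a minimal sketch in plain Python. It is not CAMEL's implementation: the names critic_tree_search, toy_propose and toy_critic are hypothetical, and the proposer and critic are toy stand-ins for what would be LLM-backed agents. At each depth the proposer expands the current state into several candidates, and the critic selects one and returns verbal feedback that is appended to the context for the next step.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases for this sketch (not part of CAMEL).
Proposer = Callable[[str], List[str]]
Critic = Callable[[str, List[str]], Tuple[int, str]]


def critic_tree_search(
    task: str, propose: Proposer, critic: Critic, depth: int
) -> List[Tuple[str, str]]:
    """Greedy tree search: expand candidates, then let a critic pick one."""
    state = task
    path: List[Tuple[str, str]] = []
    for _ in range(depth):
        candidates = propose(state)  # node expansion
        if not candidates:
            break
        index, feedback = critic(state, candidates)  # node selection
        chosen = candidates[index]
        path.append((chosen, feedback))
        # The chosen proposal and the critic's feedback become new context.
        state = f"{state}\n{chosen}\n[critic] {feedback}"
    return path


def toy_propose(state: str) -> List[str]:
    """Toy proposer: always offers three labelled options."""
    step = state.count("[critic]") + 1
    return [f"step {step}: option {label}" for label in ("A", "B", "C")]


def toy_critic(state: str, candidates: List[str]) -> Tuple[int, str]:
    """Toy critic: always prefers the option labelled B."""
    index = next(i for i, c in enumerate(candidates) if c.endswith("B"))
    return index, f"picked option {index + 1} of {len(candidates)}"


path = critic_tree_search("plan a trip", toy_propose, toy_critic, depth=2)
for chosen, feedback in path:
    print(chosen, "|", feedback)
```

The search here is greedy: only the selected branch is kept at each level. A real critic would score proposals against the task rather than match a label.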
Quick Start
🔹 Step 0: Preparations
Setting Up API Keys
You'll need to set up your API keys for OpenAI.
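One way to do this is sketched below, using only the Python standard library. The helper name ensure_api_key and its reader parameter are our own, hypothetical choices; OPENAI_API_KEY is the environment variable the OpenAI client conventionally reads. The helper prompts only when the variable is not already set, so re-running the cell does not ask again.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    name: str = "OPENAI_API_KEY",
    reader: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting only if it is missing.

    `reader` defaults to getpass so the key is not echoed to the screen;
    it is injectable (a hypothetical convenience) to ease testing.
    """
    key = os.environ.get(name)
    if not key:
        key = reader(f"Enter your {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = key
    return key
```

In a notebook, calling ensure_api_key() once before creating any agents should be enough.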
Alternatively, if running on Colab, you could save your API keys and tokens as Colab Secrets, and use them across notebooks.
To do so, comment out the manual API key prompt code block(s) above, and uncomment the following code block.
⚠️ Don't forget to grant the current notebook access to the API key you will be using.
🔹 Step 1: Configure the Specifications for Critic Agents
🔹 Step 2: Get the Critic Agents
With the above arguments, we have:
Let's take a look at the default system message:
You may overwrite the system message and configure the critic differently based on your own needs.
🔹 Step 3: Using Critic Agents for Task Solving
Our RolePlaying() class provides a simple way for you to add the critic in the loop. Below we provide a basic pipeline.
We then need to set the kwargs for the task and each agent:
Putting them together:
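As a hedged sketch of how the keyword groups can be defined and merged, the snippet below builds plain dictionaries without importing CAMEL. The key names (task_prompt, with_task_specify, user_role_name, assistant_role_name, with_critic_in_the_loop, critic_criteria) are what we believe RolePlaying() accepts; please verify them against your installed version. The task prompt, the role names, and the merge_kwargs helper are illustrative placeholders of our own.

```python
from typing import Any, Dict

# Assumed keyword names; check them against your CAMEL version.
task_kwargs: Dict[str, Any] = {
    "task_prompt": "Draft a three-day itinerary for a city trip.",  # placeholder
    "with_task_specify": True,
}
user_role_kwargs: Dict[str, Any] = {"user_role_name": "a curious traveler"}
assistant_role_kwargs: Dict[str, Any] = {"assistant_role_name": "a travel planner"}
critic_role_kwargs: Dict[str, Any] = {
    "with_critic_in_the_loop": True,
    "critic_criteria": "improve the task performance",
}


def merge_kwargs(*groups: Dict[str, Any]) -> Dict[str, Any]:
    """Merge keyword groups, refusing silently overwritten keys."""
    merged: Dict[str, Any] = {}
    for group in groups:
        overlap = merged.keys() & group.keys()
        if overlap:
            raise ValueError(f"duplicate keys: {sorted(overlap)}")
        merged.update(group)
    return merged


society_kwargs = merge_kwargs(
    task_kwargs, user_role_kwargs, assistant_role_kwargs, critic_role_kwargs
)
# With CAMEL installed, these would be unpacked into the constructor,
# e.g. RolePlaying(**society_kwargs), assuming the names above are accepted.
print(sorted(society_kwargs))
```

Keeping the groups separate makes it easy to switch the critic off by dropping a single dictionary.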
And the helper functions to run our society:
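The loop below is a minimal sketch of such a helper, written against a stub so that it runs without CAMEL or an API key. It assumes the society exposes init_chat() and step(), that each response carries a terminated flag and a msg with a content field, and that the user agent signals completion with a CAMEL_TASK_DONE marker. These shapes follow what we recall from CAMEL's role-playing examples and should be checked against your version; Message, Response and StubSociety are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Tuple

DONE_TOKEN = "CAMEL_TASK_DONE"  # assumed completion marker


@dataclass
class Message:
    content: str


@dataclass
class Response:
    msg: Message
    terminated: bool = False


def run(society, round_limit: int = 10) -> List[Tuple[str, str]]:
    """Drive the society until termination, completion, or the round limit."""
    transcript: List[Tuple[str, str]] = []
    input_msg = society.init_chat()
    for _ in range(round_limit):
        assistant_response, user_response = society.step(input_msg)
        if assistant_response.terminated or user_response.terminated:
            break
        transcript.append(
            (user_response.msg.content, assistant_response.msg.content)
        )
        if DONE_TOKEN in user_response.msg.content:
            break
        # The assistant's reply becomes the input for the next round.
        input_msg = assistant_response.msg
    return transcript


class StubSociety:
    """Hypothetical stand-in that declares the task done on round three."""

    def __init__(self) -> None:
        self.round = 0

    def init_chat(self) -> Message:
        return Message("start")

    def step(self, input_msg: Message) -> Tuple[Response, Response]:
        self.round += 1
        user_text = DONE_TOKEN if self.round == 3 else f"instruction {self.round}"
        assistant = Response(Message(f"solution {self.round}"))
        return assistant, Response(Message(user_text))


transcript = run(StubSociety())
print(len(transcript), "rounds recorded")
```

The round limit is a safety net: without it, two agents that never emit the completion marker would keep talking indefinitely.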
Now let's set our code in motion:
In this setting, the AI User and AI Assistant will generate different options when responding (you can simply change the temperature in model_config to somewhat control the diversity). The AI Critic will respond with its option selection and reasoning; this additional context is fed to the two other agents and helps them form better subsequent responses.
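To illustrate why temperature affects diversity, the sketch below applies a temperature-scaled softmax to a toy set of scores. It is a generic illustration of the sampling idea, not code from CAMEL or any model provider; the function name and the example logits are our own.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert scores to probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy scores for three candidate continuations
sharp = softmax_with_temperature(logits, 0.2)
flat = softmax_with_temperature(logits, 2.0)
print("low temperature: ", [round(p, 3) for p in sharp])
print("high temperature:", [round(p, 3) for p in flat])
```

At a low temperature almost all probability mass sits on the top candidate, so repeated samples look alike; at a higher temperature the mass spreads out, which gives the critic more varied options to choose from.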
Remarks
While we see some performance gains from critic-in-the-loop, it may not really solve the fundamental extrapolation problem (and self-consistency remains a strong baseline for many tasks). It is debatable whether these agents can extrapolate by self-play at their current scale. A more practical question is how we can efficiently introduce informative feedback and rewards when agents are connected to external environments and are endowed with tools and memories. Such agents are expected to have a good world model and to know how to abstract and draw analogies when necessary. Stay tuned for our next update.