RAG Cookbook#
You can also check this cookbook in colab here
⭐ Star us on Github, join our Discord or follow our X
Overview#
In this notebook, we show the useage of CAMEL Retrieve Module in both customized way and auto way. We will also show how to combine AutoRetriever
with ChatAgent
, and further combine AutoRetriever
with RolePlaying
by using Function Calling
.
4 main parts included:
Customized RAG
Auto RAG
Single Agent with Auto RAG
Role-playing with Auto RAG
Installation#
Ensure you have CAMEL AI installed in your Python environment:
[ ]:
!pip install "camel-ai[all]==0.2.16"
Load Data#
Let’s first load the CAMEL paper from https://arxiv.org/pdf/2303.17760.pdf. This will be our local example data.
[2]:
import os
import requests
os.makedirs('local_data', exist_ok=True)
url = "https://arxiv.org/pdf/2303.17760.pdf"
response = requests.get(url)
with open('local_data/camel_paper.pdf', 'wb') as file:
file.write(response.content)
1. Customized RAG#
In this section we will set our customized RAG pipeline, we will take VectorRetriever
as an example. Set embedding model, we will use OpenAIEmbedding
as the embedding model, so we need to set the OPENAI_API_KEY
in below.
[3]:
from getpass import getpass
# Prompt for the OpenAI API key securely
openai_api_key = getpass('Enter your API key: ')
os.environ["OPENAI_API_KEY"] = openai_api_key
Enter your API key: ··········
Import and set the embedding instance:
[4]:
from camel.embeddings import OpenAIEmbedding
from camel.types import EmbeddingModelType
embedding_instance = OpenAIEmbedding(model_type=EmbeddingModelType.TEXT_EMBEDDING_3_LARGE)
Import and set the vector storage instance:
[5]:
from camel.storages import QdrantStorage
storage_instance = QdrantStorage(
vector_dim=embedding_instance.get_output_dim(),
path="local_data",
collection_name="camel_paper",
)
Import and set the retriever instance:
[6]:
from camel.retrievers import VectorRetriever
vector_retriever = VectorRetriever(embedding_model=embedding_instance,
storage=storage_instance)
We use integrated Unstructured Module
to splite the content into small chunks, the content will be splited automacitlly with its chunk_by_title
function, the max character for each chunk is 500 characters, which is a suitable length for OpenAIEmbedding
. All the text in the chunks will be embed and stored to the vector storage instance, it will take some time, please wait..
[7]:
vector_retriever.process(
content="local_data/camel_paper.pdf",
)
Now we can retrieve information from the vector storage by giving a query. By default it will give you back the text content from top 1 chunk with highest Cosine similarity score, and the similarity score should be higher than 0.75 to ensure the retrieved content is relevant to the query. You can also change the top_k
value and similarity_threshold
value with your needs.
The returned dictionary list includes:
similarity score
content path
metadata
text
[8]:
retrieved_info = vector_retriever.query(
query="To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named role-playing .",
top_k=1
)
print(retrieved_info)
[{'similarity score': '0.8732195101415544', 'content path': 'local_data/camel_paper.pdf', 'metadata': {'filetype': 'application/pdf', 'languages': ['eng'], 'page_number': 10, 'orig_elements': 'eJx1VMtu5DYQ/JWGzuPJvDxj5xYssMBekgDZm7EwWmJrRJgiGT5mPFjsv6dasmIDyd7EYj+rSnz63oiTUXx5tqb5lZrTvt+Z7WHfbbp9//hwlPZ06h83B+n7rRzMqVlRM0phw4UR/73pQkjGei6Sp7PjW6jleRB7HgqQ0+MOKW/o1ZoyADxuFYzB+qJZT0/bzWl9XNH95mG9+7ai5Xw87Ofz/eZ+vfvveY4H0ORbLjLqBn/aV3F/Re6k+YGL3joptyh6xTE623Gxwf8STd9Mg/lz5fM0/FMj/txotQjk2dexlQR8u5mg9M7Sgfv+eGgP+81x89Cbdnt46Dre7A/70+5xf7+ZOhd5VQKaI30KvnM1oy198VQGmylylLSiq5C8RheSABaKoaCJZUehJ64l+DCGmgkkI3wanHgM/gxkHKuflrkIYVwwSewNxRRiyIDIh4u499QljPrEo1xDeiGPD0MpOLmLUMii7pr+qIlAVArcDSSeWyf5/9uVANw5bgPqy4d53Q13V05G8yL418qF80um6wA9KMnf1SZFR+vtiH2HOrIn+EHSRSkIfkVO2EyZgVopuKEcXNU7rArGsBcFTFuGkEI9Y9oLuzqxlNf0dZhBjWDP7pZtngjPQ7gihwthQysXbfETriFUN7Bz8IVGmSo6jM25ghNnXwTR/iIpz+EGxXieXUml3tkYkbialMECo/4oGok0Y98GVcLfNQl9j3qQL3fslPx3MfoABko1k1Bql4/atjLwxYY0u6DjyK11aIFB4aWxumLvZv3nX+Vft1ysQUwuquFZw7UNG5Mk57dGsNNCg2DgzzUBTCNcu5roxRT+LuOrg52cbROnG7SE57W0VQvoEzMro+NcOFnlejbSivQ1IXwvvEcbxVkvy9WiH9gPLs90qvWkK+ioIVm0DsTJNcaQCiyWhRNYm9j+yc/Syg1CzBJAkpJsW5chIQNNflINrM/6nmV1aJi472up8CCWcZzOQstDAmcW29tOf2K1s3MW/TqhMRhxb+p80O23L4sga32Plqfqd05zwFd9RX58+wcj7AKm'}, 'text': '6 Conclusion In this paper, we explore the potential of autonomous cooperation among communicative agents and propose a novel cooperative agent framework named role-playing . Our approach enables communicative agents to collaborate autonomously toward completing tasks while requiring minimal human intervention, leading to better solutions are per our thorough evaluations. Through our analysis, we show that achieving autonomous cooperation is challenging due to issues like conversation deviation,'}]
Let’s try an irrelevant query:
[9]:
retrieved_info_irrevelant = vector_retriever.query(
query="Compared with dumpling and rice, which should I take for dinner?",
top_k=1,
)
print(retrieved_info_irrevelant)
[{'text': 'No suitable information retrieved from local_data/camel_paper.pdf with similarity_threshold = 0.7.'}]
2. Auto RAG#
In this section we will run the AutoRetriever
with default settings. It uses OpenAIEmbedding
as default embedding model and Qdrant
as default vector storage.
What you need to do is:
Set content input paths, which can be local paths or remote urls
Give a query
The Auto RAG pipeline would create collections for given content input paths, the collection name will be set automaticlly based on the content input path name, if the collection exists, it will do the retrieve directly.
[10]:
from camel.retrievers import AutoRetriever
from camel.types import StorageType
auto_retriever = AutoRetriever(
vector_storage_local_path="local_data2/",
storage_type=StorageType.QDRANT,
embedding_model=embedding_instance)
retrieved_info = auto_retriever.run_vector_retriever(
query="If I'm interest in contributing to the CAMEL projec, what should I do?",
contents=[
"local_data/camel_paper.pdf", # example local path
"https://github.com/camel-ai/camel/wiki/Contributing-Guidlines", # example remote url
],
top_k=1,
return_detailed_info=True,
similarity_threshold=0.5
)
print(retrieved_info)
{'Original Query': "If I'm interest in contributing to the CAMEL projec, what should I do?", 'Retrieved Context': [{'similarity score': '0.7025470476009851', 'content path': 'https://github.com/camel-ai/camel/wiki/Contributing-Guidlines', 'metadata': {'filetype': 'text/html', 'languages': ['eng'], 'link_texts': ['slack'], 'link_urls': ['https://join.slack.com/t/camel-kwr1314/shared_invite/zt-1vy8u9lbo-ZQmhIAyWSEfSwLCl2r2eKA'], 'url': 'https://github.com/camel-ai/camel/wiki/Contributing-Guidlines', 'orig_elements': 'eJxNU9tu2zAM/RXOL3uJ4ybOtW9BUQzDugFDBxRbVxS0RMVqbMnQJU5W9N9H2WvXF0sQycNzDun754waasmERy2zS8hIiXKtcLtelZvljOblVlTVSi4289VsTptZNoGspYASA3L+c6Z0Q+HcUSoOdApFHdomZTVo9hH35Dlyn5HZZw/pVZvDow/ouKGRdBrDi+3FWzCBjDW+QXH4XxVdM77XIXT+siierDbTIWkqbFuEQmBLTX7o3aycLQpfoyPJbY46UPEn5LPjeRO3TWXzX9/b+vPufHd7rW77m6tm7ub0ZTe06rjozY01XSzXVC5Lsa7m5UpsK7WcqVJcLOe4XAuVdDKtlPpKaq9DHauB0EgH9Xgpen3QxZU1wekqBm32+aeoJUtjD14mo3sJ6UeN5gBnG0FZl04H2gRy5ANfQLxDgGAh1ARXu6/XN9A5+0QifIDfUW5KwV+12cIdfXQEdBLsgkwFNR5phPWx66wLU9h5QAO2I5N7DgjiRjpoDPqYroDgsGOuZ6CjbY6pMxo5FhCPUYLS1MgJ9AR9bRuqiSdMqaCnhr2gd7St8WAVYNPAgXfAT+GuJlYxaIUeTUgsWbKzMjIVQz0owhDZgQmQYXv4NcnWRjn0wUWRghPQLVvAjKUVMe00pmYTQH9IjLX3MSGglNDawRNsuyY96XSmCuDVDJRblTN+zhqAm7IWUYOWhJzKI1H6BFXc+0Eudp0jwVb985RB6ug1+nawiBQPMbDGYSaSv7QQAD9ZKS/amzssOG0zDFNJGz3MfmSplBY6cePMNhotBlnvEAWKtIivf+E3dG6Y3I+0UC8PfwFbuVpV'}, 'text': "Thank you for your interest in contributing to the CAMEL project! 🎉 We're excited to have your support. As an open-source initiative in a rapidly evolving and open-ended field, we wholeheartedly welcome contributions of all kinds. Whether you want to introduce new features, enhance the infrastructure, improve documentation, asking issues, add more examples, implement state-of-the-art research ideas, or fix bugs, we appreciate your enthusiasm and efforts. 🙌 You are welcome to join our slack for"}]}
3. Single Agent with Auto RAG#
In this section we will show how to combine the AutoRetriever
with one ChatAgent
.
Let’s set an agent function, in this function we can get the response by providing a query to this agent.
[11]:
from camel.agents import ChatAgent
from camel.messages import BaseMessage
from camel.types import RoleType
from camel.retrievers import AutoRetriever
from camel.types import StorageType
def single_agent(query: str) ->str :
# Set agent role
assistant_sys_msg = """You are a helpful assistant to answer question,
I will give you the Original Query and Retrieved Context,
answer the Original Query based on the Retrieved Context,
if you can't answer the question just say I don't know."""
# Add auto retriever
auto_retriever = AutoRetriever(
vector_storage_local_path="local_data2/",
storage_type=StorageType.QDRANT,
embedding_model=embedding_instance)
retrieved_info = auto_retriever.run_vector_retriever(
query=query,
contents=[
"local_data/camel_paper.pdf", # example local path
"https://github.com/camel-ai/camel/wiki/Contributing-Guidlines", # example remote url
],
top_k=1,
return_detailed_info=False,
similarity_threshold=0.5
)
# Pass the retrieved infomation to agent
user_msg = str(retrieved_info)
agent = ChatAgent(assistant_sys_msg)
# Get response
assistant_response = agent.step(user_msg)
return assistant_response.msg.content
print(single_agent("If I'm interest in contributing to the CAMEL projec, what should I do?"))
If you're interested in contributing to the CAMEL project, you can introduce new features, enhance the infrastructure, improve documentation, address issues, add more examples, implement state-of-the-art research ideas, or fix bugs. Additionally, you are welcome to join their Slack for further engagement and collaboration.
4. Role-playing with Auto RAG#
In this section we will show how to combine the RETRIEVAL_FUNCS
with RolePlaying
by applying Function Calling
.
[12]:
from typing import List
from colorama import Fore
from camel.agents.chat_agent import FunctionCallingRecord
from camel.configs import ChatGPTConfig
from camel.toolkits import (
MathToolkit,
RetrievalToolkit,
)
from camel.societies import RolePlaying
from camel.types import ModelType, ModelPlatformType
from camel.utils import print_text_animated
from camel.models import ModelFactory
def role_playing_with_rag(
task_prompt,
model_platform=ModelPlatformType.OPENAI,
model_type=ModelType.GPT_4O,
chat_turn_limit=5,
) -> None:
task_prompt = task_prompt
tools_list = [
*MathToolkit().get_tools(),
*RetrievalToolkit().get_tools(),
]
role_play_session = RolePlaying(
assistant_role_name="Searcher",
user_role_name="Professor",
assistant_agent_kwargs=dict(
model=ModelFactory.create(
model_platform=model_platform,
model_type=model_type,
),
tools=tools_list,
),
user_agent_kwargs=dict(
model=ModelFactory.create(
model_platform=model_platform,
model_type=model_type,
),
),
task_prompt=task_prompt,
with_task_specify=False,
)
print(
Fore.GREEN
+ f"AI Assistant sys message:\n{role_play_session.assistant_sys_msg}\n"
)
print(
Fore.BLUE + f"AI User sys message:\n{role_play_session.user_sys_msg}\n"
)
print(Fore.YELLOW + f"Original task prompt:\n{task_prompt}\n")
print(
Fore.CYAN
+ "Specified task prompt:"
+ f"\n{role_play_session.specified_task_prompt}\n"
)
print(Fore.RED + f"Final task prompt:\n{role_play_session.task_prompt}\n")
n = 0
input_msg = role_play_session.init_chat()
while n < chat_turn_limit:
n += 1
assistant_response, user_response = role_play_session.step(input_msg)
if assistant_response.terminated:
print(
Fore.GREEN
+ (
"AI Assistant terminated. Reason: "
f"{assistant_response.info['termination_reasons']}."
)
)
break
if user_response.terminated:
print(
Fore.GREEN
+ (
"AI User terminated. "
f"Reason: {user_response.info['termination_reasons']}."
)
)
break
# Print output from the user
print_text_animated(
Fore.BLUE + f"AI User:\n\n{user_response.msg.content}\n"
)
# Print output from the assistant, including any function
# execution information
print_text_animated(Fore.GREEN + "AI Assistant:")
tool_calls: List[FunctionCallingRecord] = [
FunctionCallingRecord(**call.as_dict())
for call in assistant_response.info['tool_calls']
]
for func_record in tool_calls:
print_text_animated(f"{func_record}")
print_text_animated(f"{assistant_response.msg.content}\n")
if "CAMEL_TASK_DONE" in user_response.msg.content:
break
input_msg = assistant_response.msg
Run the role-playing with defined retriever function:
[14]:
role_playing_with_rag(task_prompt = """If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary""")
AI Assistant sys message:
BaseMessage(role_name='Searcher', role_type=<RoleType.ASSISTANT: 'assistant'>, meta_dict={'task': "If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary", 'assistant_role': 'Searcher', 'user_role': 'Professor'}, content="===== RULES OF ASSISTANT =====\nNever forget you are a Searcher and I am a Professor. Never flip roles! Never instruct me!\nWe share a common interest in collaborating to successfully complete a task.\nYou must help me to complete the task.\nHere is the task: If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary. Never forget our task!\nI must instruct you based on your expertise and my needs to complete the task.\n\nI must give you one instruction at a time.\nYou must write a specific solution that appropriately solves the requested instruction and explain your solutions.\nYou must decline my instruction honestly if you cannot perform the instruction due to physical, moral, legal reasons or your capability and explain the reasons.\nUnless I say the task is completed, you should always start with:\n\nSolution: <YOUR_SOLUTION>\n\n<YOUR_SOLUTION> should be very specific, include detailed explanations and provide preferable detailed implementations and examples and lists for task-solving.\nAlways end <YOUR_SOLUTION> with: Next request.", video_bytes=None, image_list=None, image_detail='auto', video_detail='low')
AI User sys message:
BaseMessage(role_name='Professor', role_type=<RoleType.USER: 'user'>, meta_dict={'task': "If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary", 'assistant_role': 'Searcher', 'user_role': 'Professor'}, content='===== RULES OF USER =====\nNever forget you are a Professor and I am a Searcher. Never flip roles! You will always instruct me.\nWe share a common interest in collaborating to successfully complete a task.\nI must help you to complete the task.\nHere is the task: If I\'m interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don\'t generate the answer by yourself, adjust the similarity threshold to lower value is necessary. Never forget our task!\nYou must instruct me based on my expertise and your needs to solve the task ONLY in the following two ways:\n\n1. Instruct with a necessary input:\nInstruction: <YOUR_INSTRUCTION>\nInput: <YOUR_INPUT>\n\n2. Instruct without any input:\nInstruction: <YOUR_INSTRUCTION>\nInput: None\n\nThe "Instruction" describes a task or question. The paired "Input" provides further context or information for the requested "Instruction".\n\nYou must give me one instruction at a time.\nI must write a response that appropriately solves the requested instruction.\nI must decline your instruction honestly if I cannot perform the instruction due to physical, moral, legal reasons or my capability and explain the reasons.\nYou should instruct me not ask me questions.\nNow you must start to instruct me using the two ways described above.\nDo not add anything else other than your instruction and the optional corresponding input!\nKeep giving me instructions and necessary inputs until you think the task is completed.\nWhen the task is completed, you must only reply with a single word <CAMEL_TASK_DONE>.\nNever say <CAMEL_TASK_DONE> unless my responses have solved your task.', video_bytes=None, image_list=None, image_detail='auto', video_detail='low')
Original task prompt:
If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary
Specified task prompt:
None
Final task prompt:
If I'm interest in contributing to the CAMEL projec and I encounter some challenges during the setup process, what should I do? You should refer to the content in url https://github.com/camel-ai/camel/wiki/Contributing-Guidlines to answer my question, don't generate the answer by yourself, adjust the similarity threshold to lower value is necessary
AI User:
Instruction: Visit the URL provided to access the CAMEL project's Contributing Guidelines.
Input: https://github.com/camel-ai/camel/wiki/Contributing-Guidlines
AI Assistant:
Function Execution: information_retrieval
Args: {'query': 'contributing challenges', 'contents': 'https://github.com/camel-ai/camel/wiki/Contributing-Guidlines', 'similarity_threshold': 0.5}
Result: {'Original Query': 'contributing challenges', 'Retrieved Context': ['No suitable information retrieved from https://github.com/camel-ai/camel/wiki/Contributing-Guidlines with similarity_threshold = 0.5.']}
Solution: It seems that I was unable to retrieve specific information regarding challenges during the setup process for contributing to the CAMEL project from the provided URL. The content might not directly address this query, or the similarity threshold might need further adjustment.
If you are facing challenges during the setup process, I recommend checking the following general steps:
1. **Review the Documentation**: Ensure that you have thoroughly read the contributing guidelines and any related setup instructions provided in the repository.
2. **Check for Common Issues**: Look for a FAQ section or common issues page that might address your specific problem.
3. **Community Support**: Engage with the community by checking forums, discussion boards, or reaching out via the project's communication channels like Slack, Discord, or mailing lists.
4. **Open an Issue**: If you cannot find a solution, consider opening an issue on the project's GitHub page, providing detailed information about the problem you're encountering.
5. **Seek Help from Maintainers**: Contact the maintainers or contributors directly if possible, as they might provide insights or solutions.
If you need further assistance or have another request, please let me know.
Next request.
AI User:
Instruction: Adjust the similarity threshold to a lower value if necessary to find relevant information on the setup process challenges from the provided URL.
Input: None
AI Assistant:
Function Execution: information_retrieval
Args: {'query': 'setup process challenges', 'contents': 'https://github.com/camel-ai/camel/wiki/Contributing-Guidlines', 'similarity_threshold': 0.3}
Result: {'Original Query': 'setup process challenges', 'Retrieved Context': ["Getting Help 🆘\n\nOur aim is to make the developer setup as straightforward as possible. If you encounter any challenges during the setup process, don't hesitate to reach out to a maintainer. We're here to assist you and ensure that the experience is smooth not just for you but also for future contributors. 😊"]}
Solution: I have successfully retrieved relevant information regarding challenges during the setup process for contributing to the CAMEL project. Here's what you can do if you encounter any challenges:
- **Getting Help**: The CAMEL project aims to make the developer setup as straightforward as possible. If you encounter any challenges during the setup process, you are encouraged to reach out to a maintainer. They are available to assist you and ensure that the experience is smooth not only for you but also for future contributors.
This information suggests that the maintainers are open to providing support, so don't hesitate to contact them if you need help.
If you have any more questions or need further assistance, feel free to ask.
Next request.
AI User:
Instruction: Follow the guidance provided by the CAMEL project's contributing guidelines to contact a maintainer if you encounter challenges during the setup process.
Input: None
AI Assistant:
Solution: To follow the guidance provided by the CAMEL project's contributing guidelines and contact a maintainer if you encounter challenges during the setup process, you can take the following steps:
1. **Identify Maintainers**: Visit the CAMEL project's GitHub repository and look for the "Contributors" or "Maintainers" section. This is often found in the README file or in the repository's "People" tab.
2. **Contact Information**: Check if the maintainers have provided any contact information such as email addresses or social media handles. Sometimes, maintainers may also be reachable through the project's communication channels like Slack, Discord, or mailing lists.
3. **Open an Issue**: If direct contact information is not available, consider opening an issue on the GitHub repository. Clearly describe the challenge you are facing during the setup process. Use a descriptive title and provide as much detail as possible, including error messages, steps you have taken, and any relevant logs.
4. **Use Project Communication Channels**: If the project has a dedicated communication channel (e.g., Slack, Discord), join it and post your query there. This can be a quick way to get help from maintainers or other contributors.
5. **Be Respectful and Patient**: When reaching out, be respectful and patient. Remember that maintainers are often volunteers and may take some time to respond.
By following these steps, you can effectively reach out to a maintainer for assistance with setup challenges.
If you need further guidance or have another request, please let me know.
Next request.
AI User:
CAMEL_TASK_DONE
AI Assistant:
Great! If you have any more questions or need further assistance in the future, feel free to reach out. Have a wonderful day!