PersonaHub
PersonaHub is adapted from “Scaling Synthetic Data Creation with 1,000,000,000 Personas”.
PersonaHub proposes a persona-driven data synthesis methodology
that leverages the various perspectives within a large language model (LLM) to
create diverse synthetic data. The authors showcase its use in
synthesizing high-quality mathematical and logical reasoning problems,
instructions (i.e., user prompts), knowledge-rich texts, game NPCs, and
tools (functions) at scale. From these use cases they argue that persona-driven data
synthesis is versatile, scalable, flexible, and easy to use, and that it could
drive a paradigm shift in how synthetic data is created and applied in
practice, with a potentially profound impact on LLM research and development.
Please refer to the paper for more details: https://arxiv.org/pdf/2406.20094.
Parameters:
- model (BaseModelBackend, optional): The model to use for persona generation and manipulation. (default: None)
__init__
def __init__(self, model: Optional[BaseModelBackend] = None):
__setitem__
def __setitem__(self, persona: Persona):
Add a persona to the group.
Parameters:
- persona (Persona): The persona to add.
__delitem__
def __delitem__(self, persona_id: uuid.UUID):
Remove a persona from the group by ID.
Parameters:
- persona_id (uuid.UUID): The ID of the persona to remove.
__getitem__
def __getitem__(self, persona_id: uuid.UUID):
Get a persona by ID.
Parameters:
- persona_id (uuid.UUID): The ID of the persona to retrieve.
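The add, remove, and get operations above amount to a mapping from persona ID to persona. The following is a minimal, self-contained sketch of that pattern only. `ToyPersona`, `ToyPersonaGroup`, and the `add` method are hypothetical names introduced for illustration and are not the library's classes or API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyPersona:
    """Hypothetical stand-in for a persona: a description plus a unique ID."""
    description: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class ToyPersonaGroup:
    """Hypothetical container that stores personas keyed by their UUID."""

    def __init__(self) -> None:
        self._personas: Dict[uuid.UUID, ToyPersona] = {}

    def add(self, persona: ToyPersona) -> None:
        # Assumption: adding a persona stores it under its own ID.
        self._personas[persona.id] = persona

    def __getitem__(self, persona_id: uuid.UUID) -> ToyPersona:
        # Raises KeyError if the ID is unknown, like a plain dict.
        return self._personas[persona_id]

    def __delitem__(self, persona_id: uuid.UUID) -> None:
        del self._personas[persona_id]

    def __len__(self) -> int:
        return len(self._personas)


group = ToyPersonaGroup()
teacher = ToyPersona("A high-school chemistry teacher")
group.add(teacher)
fetched = group[teacher.id]   # look up by ID
del group[teacher.id]         # remove by ID
remaining = len(group)
```

Keying on the UUID rather than the description keeps lookups and removals unambiguous even when two personas have identical text.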
text_to_persona
def text_to_persona(
self,
text: str,
action: Literal['read', 'write', 'like', 'dislike'] = 'read'
):
Infers a specific persona who is likely to [read|write|like|dislike|…] the given text.
Parameters:
- text (str): The input text for which to infer a persona.
- action (str): The action associated with the persona (default is “read”).
Returns:
Persona: The inferred persona.
persona_to_persona
def persona_to_persona(self, persona: Persona):
Derives additional personas based on interpersonal relationships
from this persona.
Parameters:
- persona (Persona): The persona from which to derive related personas.
Returns:
Dict[uuid.UUID, Persona]: A dictionary of related personas.
deduplicate
def deduplicate(
self,
embedding_model: Optional[BaseEmbedding] = None,
similarity_threshold: float = 0.85
):
Remove similar personas from the group.
Parameters:
- embedding_model (BaseEmbedding, optional): The embedding model used for similarity comparison. (default: None)
- similarity_threshold (float): The similarity threshold for deduplication. (default: 0.85)
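The source does not spell out the deduplication procedure, so the following is only a sketch of one plausible approach: a greedy pass that keeps a persona if its cosine similarity to every already-kept persona is below the threshold. The `toy_embeddings` table and both helper names are hypothetical, and the hand-written vectors stand in for the output of a real embedding model.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_deduplicate(
    embeddings: Dict[str, List[float]],
    similarity_threshold: float = 0.85,
) -> List[str]:
    """Keep each persona only if it is not too similar to one already kept."""
    kept: List[str] = []
    for name, vector in embeddings.items():
        # Assumption: "similar" means similarity >= threshold.
        if all(cosine(vector, embeddings[k]) < similarity_threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical embeddings: the two nurse personas point in nearly the same direction.
toy_embeddings = {
    "pediatric nurse": [0.9, 0.1, 0.0],
    "children's hospital nurse": [0.88, 0.12, 0.01],
    "jazz drummer": [0.0, 0.2, 0.95],
}
unique_personas = greedy_deduplicate(toy_embeddings)
```

With these toy vectors the second nurse persona is dropped as a near-duplicate of the first, while the drummer is kept. Raising the threshold makes deduplication more permissive; lowering it removes more personas.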
_get_embedding
def _get_embedding(embedding_model: BaseEmbedding, description: Optional[str]):
Cache embeddings to reduce recomputation.
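One common way to avoid recomputing embeddings is to memoize on the description text. The sketch below uses `functools.lru_cache` with a hypothetical `fake_embed` function standing in for an expensive embedding model call. Whether the library caches in this way is not stated in the source.

```python
from functools import lru_cache
from typing import Dict, Tuple

# Counts how often the (hypothetical) expensive call actually runs.
calls: Dict[str, int] = {"count": 0}


def fake_embed(description: str) -> Tuple[float, ...]:
    """Hypothetical stand-in for an embedding model call."""
    calls["count"] += 1
    return (float(len(description)), float(description.count(" ")))


@lru_cache(maxsize=128)
def get_embedding(description: str) -> Tuple[float, ...]:
    # Results are cached per description string, so repeated lookups
    # for the same text do not call fake_embed again.
    return fake_embed(description)


first = get_embedding("A retired marine biologist")
second = get_embedding("A retired marine biologist")  # served from the cache
```

Returning a tuple keeps the cached value immutable, so a caller cannot accidentally modify an embedding that later lookups will reuse.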
_cosine_similarity
def _cosine_similarity(vec1: np.ndarray, vec2: np.ndarray):
Compute the cosine similarity of two vectors.
Parameters:
- vec1 (np.ndarray): Vector 1
- vec2 (np.ndarray): Vector 2
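As a minimal sketch of what this helper computes (not necessarily the library's implementation), cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. The function name `cosine_similarity` and the handling of zero vectors below are assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(vec1: np.ndarray, vec2: np.ndarray) -> float:
    """Return the cosine of the angle between vec1 and vec2."""
    norm_product = np.linalg.norm(vec1) * np.linalg.norm(vec2)
    if norm_product == 0.0:
        # Assumption: a zero vector is treated as dissimilar to everything.
        return 0.0
    return float(np.dot(vec1, vec2) / norm_product)


# Parallel vectors have similarity 1; orthogonal vectors have similarity 0.
same_direction = cosine_similarity(np.array([1.0, 2.0]), np.array([2.0, 4.0]))
orthogonal = cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Because the result depends only on direction and not on magnitude, scaling either vector leaves the similarity unchanged.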
_is_similar
def _is_similar(
self,
persona1: Persona,
persona2: Persona,
similarity_threshold: float,
embedding_model: BaseEmbedding
):
Check whether two personas are similar, based on the cosine similarity
of the embeddings of their descriptions.
Parameters:
- persona1 (Persona): A persona.
- persona2 (Persona): The other persona.
- similarity_threshold (float): The threshold on cosine similarity used to determine whether the two personas are similar.
- embedding_model (BaseEmbedding): The embedding model used for similarity comparison.
__len__
__iter__
get_all_personas
def get_all_personas(self):
Return a list of all personas.