camel.personas.persona_hub
PersonaHub
PersonaHub, adapted from “Scaling Synthetic Data Creation with 1,000,000,000 Personas”.
PersonaHub proposes a persona-driven data synthesis methodology that leverages the various perspectives within a large language model (LLM) to create diverse synthetic data. The authors showcase its use in synthesizing high-quality mathematical and logical reasoning problems, instructions (i.e., user prompts), knowledge-rich texts, game NPCs, and tools (functions) at scale. They argue that persona-driven data synthesis is versatile, scalable, flexible, and easy to use, and that it could drive a paradigm shift in synthetic data creation and its practical applications, with a potentially profound impact on LLM research and development. For more details, see the paper: https://arxiv.org/pdf/2406.20094.
Parameters:
- model (BaseModelBackend, optional): The model to use for persona generation and manipulation. (default: None)
__init__
__setitem__
Add a persona to the group.
Parameters:
- persona (Persona): The persona to add.
__delitem__
Remove a persona from the group by ID.
Parameters:
- persona_id (uuid.UUID): The ID of the persona to remove.
__getitem__
Get a persona by ID.
Parameters:
- persona_id (uuid.UUID): The ID of the persona to retrieve.
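The three item methods above amount to a mapping from persona ID to persona. The following is a minimal sketch of that idea, not the library's implementation: `Persona` and `PersonaGroup` are hypothetical stand-ins defined here, and raising `KeyError` for an unknown ID is an assumption. Note that the documented `__setitem__` takes only the persona, so the sketch calls it explicitly; the `group[key] = value` syntax would pass two arguments.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Persona:
    """Hypothetical stand-in for the library's Persona class."""
    name: str
    description: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


class PersonaGroup:
    """Illustrative container keyed by persona ID; not the CAMEL class."""

    def __init__(self) -> None:
        self.personas: Dict[uuid.UUID, Persona] = {}

    def __setitem__(self, persona: Persona) -> None:
        # Mirrors the documented one-argument signature: the key is persona.id.
        self.personas[persona.id] = persona

    def __delitem__(self, persona_id: uuid.UUID) -> None:
        # Assumption: removing an unknown ID raises KeyError.
        del self.personas[persona_id]

    def __getitem__(self, persona_id: uuid.UUID) -> Persona:
        # Assumption: looking up an unknown ID raises KeyError.
        return self.personas[persona_id]


group = PersonaGroup()
nurse = Persona(name="Nurse", description="A pediatric nurse.")
group.__setitem__(nurse)   # explicit call; group[k] = v would pass two arguments
fetched = group[nurse.id]  # lookup by UUID
del group[nurse.id]        # removal by UUID
remaining = len(group.personas)
```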
text_to_persona
Infers a specific persona who is likely to [read|write|like|dislike|…] the given text.
Parameters:
- text (str): The input text for which to infer a persona.
- action (str): The action associated with the persona (default is “read”).
Returns:
Persona: The inferred persona.
persona_to_persona
Derives additional personas based on interpersonal relationships from this persona.
Parameters:
- persona (Persona): The persona from which to derive related personas.
Returns:
Dict[uuid.UUID, Persona]: A dictionary of related personas.
deduplicate
Remove similar personas from the group.
Parameters:
- embedding_model (BaseEmbedding): The embedding model for similarity comparison. (default is None)
- similarity_threshold (float): The similarity threshold for deduplication. (default is 0.85)
_get_embedding
Cache embeddings to reduce recomputation.
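The source does not say how the cache is keyed or stored. One common way to avoid recomputing an embedding for the same description is memoization, sketched below with `functools.lru_cache`. The `fake_embed` function and the `calls` counter are hypothetical and stand in for a real embedding model call.

```python
from functools import lru_cache
from typing import Tuple

calls = {"count": 0}  # counts how often the (fake) model is actually invoked


def fake_embed(text: str) -> Tuple[float, ...]:
    """Hypothetical stand-in for an expensive embedding model call."""
    calls["count"] += 1
    return (float(len(text)), float(text.count(" ")))


@lru_cache(maxsize=128)
def get_embedding(description: str) -> Tuple[float, ...]:
    # Assumption: the cache is keyed by the description string.
    return fake_embed(description)


first = get_embedding("a pediatric nurse")
second = get_embedding("a pediatric nurse")  # served from the cache
```

After the two calls, `calls["count"]` is 1 because the second lookup never reaches `fake_embed`.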
_cosine_similarity
Compute the cosine similarity of two vectors.
Parameters:
- vec1 (np.ndarray): Vector 1
- vec2 (np.ndarray): Vector 2
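Cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. The sketch below uses plain Python sequences instead of `np.ndarray` so that it runs with the standard library only. The function name and the choice to raise on a zero vector are assumptions, not the library's behavior.

```python
import math
from typing import Sequence


def cosine_similarity(vec1: Sequence[float], vec2: Sequence[float]) -> float:
    """Return dot(vec1, vec2) / (||vec1|| * ||vec2||)."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: a zero vector is rejected, since the ratio is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm1 * norm2)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
```

Parallel vectors score (up to floating-point rounding) 1, and perpendicular vectors score 0.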
_is_similar
Check if two personas are similar by the cosine similarity of the embeddings of their descriptions.
Parameters:
- persona1 (Persona): A persona.
- persona2 (Persona): The other persona.
- similarity_threshold (float): The threshold on cosine similarity to determine whether the two personas are similar.
- embedding_model (BaseEmbedding): The embedding model for similarity comparison.
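To show how a similarity check and a threshold can combine into deduplication, here is a sketch of one plausible greedy strategy: keep a persona only if it is not similar to any persona already kept. The source does not specify the algorithm, so this is an assumption, as are the `>=` comparison, the bag-of-words `toy_embed` function, and the use of plain description strings in place of persona objects.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
VOCAB = ("nurse", "hospital", "chef", "restaurant")  # hypothetical toy vocabulary


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding: word counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(v1: Vector, v2: Vector) -> float:
    dot = sum(a * b for a, b in zip(v1, v2))
    norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norms if norms else 0.0  # treat zero vectors as dissimilar


def is_similar(desc1: str, desc2: str, threshold: float,
               embed: Callable[[str], Vector]) -> bool:
    # Assumption: "similar" means similarity at or above the threshold.
    return cosine(embed(desc1), embed(desc2)) >= threshold


def deduplicate(descriptions: Dict[str, str],
                embed: Callable[[str], Vector],
                threshold: float = 0.85) -> Dict[str, str]:
    """Greedy pass: keep an entry only if no already-kept entry is similar."""
    unique: Dict[str, str] = {}
    for key, desc in descriptions.items():
        if not any(is_similar(desc, kept, threshold, embed)
                   for kept in unique.values()):
            unique[key] = desc
    return unique


personas = {
    "p1": "a nurse working in a hospital",
    "p2": "a hospital nurse",
    "p3": "a chef running a restaurant",
}
kept = deduplicate(personas, toy_embed)
```

Because the pass is greedy, the result depends on iteration order: the first persona of each similar cluster survives. Here `p2` is dropped as a near-duplicate of `p1`.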
__len__
__iter__
get_all_personas
Return a list of all personas.