camel.datasets.self_instruct_generator

SelfInstructGenerator

class SelfInstructGenerator(BaseGenerator):

A generator that creates synthetic datapoints with the self-instruct method. It draws on both a human-provided dataset (seed_dataset) and generated machine instructions (machine_instructions) to produce new datapoints, each consisting of a question, a computed rationale (code), and a final answer obtained from a verifier.

__init__

def __init__(
    self,
    seed_dataset: StaticDataset,
    verifier: BaseVerifier,
    instruction_agent: Optional[ChatAgent] = None,
    rationale_agent: Optional[ChatAgent] = None,
    seed: int = 42,
    **kwargs
):

Initialize the self-instruct generator. Parameters:

seed_dataset (StaticDataset): Dataset containing seed instructions.
verifier (BaseVerifier): Verifier instance to validate generated solutions.
instruction_agent (Optional[ChatAgent]): Agent for generating instructions. If not provided, a default agent will be created.
rationale_agent (Optional[ChatAgent]): Agent for generating rationales. If not provided, a default agent will be created.
seed (int): Random seed for reproducibility. (default: :obj:42)
**kwargs: Additional keyword arguments passed to the BaseGenerator.
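
The fallback to default agents and the seeded randomness described above can be sketched as follows. This is a minimal illustration with stand-in classes: `StubAgent` and `SketchGenerator` are hypothetical names, not CAMEL classes, and keeping a private `random.Random` instance is an assumption about how the seed might be used.

```python
import random
from typing import Optional


class StubAgent:
    """Hypothetical stand-in for ChatAgent; it only stores its prompt."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt


class SketchGenerator:
    """Hypothetical sketch of the constructor's fallback logic."""

    def __init__(
        self,
        instruction_agent: Optional[StubAgent] = None,
        rationale_agent: Optional[StubAgent] = None,
        seed: int = 42,
    ):
        # Fall back to a default agent when the caller supplies none.
        self.instruction_agent = (
            instruction_agent or self.default_instruction_agent()
        )
        self.rationale_agent = rationale_agent or self.default_rationale_agent()
        # Assumption: a private RNG makes sampling reproducible per seed.
        self.rng = random.Random(seed)

    def default_instruction_agent(self) -> StubAgent:
        return StubAgent("default instruction prompt")

    def default_rationale_agent(self) -> StubAgent:
        return StubAgent("default rationale prompt")


custom = StubAgent("my own instruction prompt")
gen = SketchGenerator(instruction_agent=custom)
```

Here `gen` keeps the caller's instruction agent and creates only the missing rationale agent.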

default_instruction_agent

def default_instruction_agent(self):

Returns: ChatAgent: An agent with the default instruction prompt.

default_rationale_agent

def default_rationale_agent(self):

Returns: ChatAgent: An agent with the rationale prompt.

format_support_block

def format_support_block(dp: DataPoint):

Format a DataPoint into a few-shot example block. Parameters:

dp (DataPoint): A data point.

Returns: str: A formatted string containing the question and its corresponding code block in Markdown-style Python format.
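
One way such a few-shot block could be rendered is sketched below. `SketchDataPoint` is a hypothetical stand-in for DataPoint, and the field names (`question`, `rationale`) and the "Question:" / "Code:" labels are assumptions for illustration; the library's exact layout may differ.

```python
from dataclasses import dataclass

# Built at runtime so the example itself contains no literal fence.
FENCE = "`" * 3


@dataclass
class SketchDataPoint:
    """Hypothetical stand-in for DataPoint with only the fields used here."""

    question: str
    rationale: str  # assumed to hold the solution code


def format_support_block(dp: SketchDataPoint) -> str:
    """Render the question followed by its code in a Markdown Python block."""
    return (
        f"Question:\n{dp.question}\n\n"
        f"Code:\n{FENCE}python\n{dp.rationale}\n{FENCE}"
    )


block = format_support_block(
    SketchDataPoint(question="What is 2 + 3?", rationale="print(2 + 3)")
)
```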

generate_new_instruction

def generate_new_instruction(
    self,
    agent: ChatAgent,
    support_human_dps: list[DataPoint],
    support_machine_dps: list[DataPoint]
):

Generate a new instruction using self-instruct prompting. Parameters:

agent (ChatAgent): The agent to use for generating the instruction.
support_human_dps (list[DataPoint]): List of human examples to sample.
support_machine_dps (list[DataPoint]): List of machine examples to sample.

Returns: str: The newly generated question.
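
The sampling step described above can be sketched without a model call. In this illustration the agent is replaced by a plain callable, the examples are plain strings rather than DataPoint objects, and the sample sizes (two human, one machine) and prompt wording are assumptions, not the library's values.

```python
import random
from typing import Callable


def build_instruction_prompt(
    human_examples: list[str],
    machine_examples: list[str],
    rng: random.Random,
    num_human: int = 2,    # assumed sample size
    num_machine: int = 1,  # assumed sample size
) -> str:
    """Sample few-shot examples from both pools and assemble a prompt."""
    # Never request more examples than a pool holds.
    picked = rng.sample(human_examples, min(num_human, len(human_examples)))
    picked += rng.sample(
        machine_examples, min(num_machine, len(machine_examples))
    )
    shots = "\n\n".join(
        f"Example {i}:\n{ex}" for i, ex in enumerate(picked, start=1)
    )
    return f"{shots}\n\nWrite one new question in the same style."


def generate_new_instruction(
    ask: Callable[[str], str],  # hypothetical stand-in for the agent
    human_examples: list[str],
    machine_examples: list[str],
    rng: random.Random,
) -> str:
    """Build the few-shot prompt and return the agent's cleaned reply."""
    prompt = build_instruction_prompt(human_examples, machine_examples, rng)
    return ask(prompt).strip()


human_pool = ["Q: What is 2 + 3?", "Q: What is 10 / 4?", "Q: What is 9 - 5?"]
machine_pool = ["Q: What is 3 ** 2?", "Q: What is 8 % 3?"]
prompt = build_instruction_prompt(human_pool, machine_pool, random.Random(42))
new_question = generate_new_instruction(
    lambda p: "  What is 7 * 6?  ", human_pool, machine_pool, random.Random(42)
)
```

Mixing human and machine examples in one prompt is what lets later generations build on earlier machine output while staying anchored to the seed data.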

generate_rationale

def generate_rationale(
    self,
    question: str,
    agent: Optional[ChatAgent] = None,
    support_human_dps: Optional[list[DataPoint]] = None
):

Generate rationale code (solution) for the given question. Parameters:

question (str): The question to be solved.
agent (Optional[ChatAgent]): The agent to use for generating the rationale. If None is provided, the default rationale agent will be used. (default: :obj:None)
support_human_dps (Optional[list[DataPoint]]): List of human examples to sample. (default: :obj:None)

Returns: str: The generated code solution as a string.
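
The optional arguments above suggest a fallback pattern along these lines. This is a sketch under stated assumptions: agents are modeled as plain callables, support examples as pre-formatted strings, and `default_rationale_agent` here returns a canned reply instead of calling a model.

```python
from typing import Callable, Optional


def default_rationale_agent(prompt: str) -> str:
    """Hypothetical default agent: a canned reply instead of a model call."""
    return "print(6 * 7)"


def generate_rationale(
    question: str,
    agent: Optional[Callable[[str], str]] = None,
    support_blocks: Optional[list[str]] = None,
) -> str:
    """Return solution code for the question, using defaults where needed."""
    agent = agent or default_rationale_agent
    support_blocks = support_blocks or []
    # Few-shot examples come first, then the question to be solved.
    parts = [
        *support_blocks,
        f"Question:\n{question}",
        "Reply with Python code only.",
    ]
    return agent("\n\n".join(parts))


seen_prompts: list[str] = []


def recording_agent(prompt: str) -> str:
    """Stub agent that records the prompt it receives."""
    seen_prompts.append(prompt)
    return "print(1 + 1)"


code_default = generate_rationale("What is 6 * 7?")
code_custom = generate_rationale(
    "What is 1 + 1?",
    agent=recording_agent,
    support_blocks=["Question:\nWhat is 2 + 3?\n\nCode:\nprint(2 + 3)"],
)
```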

QuestionSchema

class QuestionSchema(BaseModel):

Schema for the generated question. Parameters:

question (str): The question generated by the model.

RationaleSchema

class RationaleSchema(BaseModel):

Schema for the generated rationale code. Parameters:

code (str): The generated code without any formatting.
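
To illustrate what the two schemas constrain, the sketch below validates a JSON reply against a single string field. It uses standard-library dataclasses rather than pydantic's BaseModel so that it runs without extra dependencies; `QuestionSketch`, `RationaleSketch`, and `parse_reply` are hypothetical names, and the assumption that replies arrive as JSON objects is made for illustration only.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class QuestionSketch:
    question: str  # the question generated by the model


@dataclass
class RationaleSketch:
    code: str  # the generated code without any formatting


def parse_reply(raw: str, schema):
    """Parse a JSON reply and check it matches the schema's string fields."""
    data = json.loads(raw)
    expected = {f.name for f in fields(schema)}
    if not isinstance(data, dict) or set(data) != expected:
        raise ValueError(f"expected an object with fields {sorted(expected)}")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("all fields must be strings")
    return schema(**data)


q = parse_reply('{"question": "What is 7 * 6?"}', QuestionSketch)
r = parse_reply('{"code": "print(7 * 6)"}', RationaleSketch)
```

A reply with a missing or unexpected field, such as `{"answer": "42"}` parsed against `RationaleSketch`, raises `ValueError` in this sketch.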
