camel.datasets.self_instruct_generator
SelfInstructGenerator
A generator for creating synthetic datapoints using self-instruct.
It utilizes both a human-provided dataset (seed_dataset) and generated machine instructions (machine_instructions) to produce new, synthetic datapoints that include a question, a computed rationale (code), and a final answer (from a verifier).
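The flow described above (question, then rationale code, then a verified answer) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the library's implementation: `run_code`, `make_datapoint`, and the two callables are hypothetical stand-ins for the verifier and the agents, and the dictionary keys are assumed names.

```python
import contextlib
import io
from typing import Callable, Dict


def run_code(code: str) -> str:
    """Toy verifier (assumption): execute the code and return what it printed.

    No sandboxing is done here; a real verifier would isolate execution.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue().strip()


def make_datapoint(
    propose_question: Callable[[], str],
    write_code: Callable[[str], str],
    verify: Callable[[str], str] = run_code,
) -> Dict[str, str]:
    """Chain the three stages: question -> rationale code -> verified answer."""
    question = propose_question()
    code = write_code(question)
    answer = verify(code)
    return {"question": question, "rationale": code, "final_answer": answer}


# Stub callables stand in for the instruction and rationale agents.
dp = make_datapoint(
    propose_question=lambda: "What is the sum of the integers 1 to 10?",
    write_code=lambda q: "print(sum(range(1, 11)))",
)
print(dp["final_answer"])  # prints 55
```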
__init__
Initialize the self-instruct generator.
Parameters:
- seed_dataset (StaticDataset): Dataset containing seed instructions.
- verifier (BaseVerifier): Verifier instance to validate generated solutions.
- instruction_agent (Optional[ChatAgent]): Agent for generating instructions. If not provided, a default agent will be created.
- rationale_agent (Optional[ChatAgent]): Agent for generating rationales. If not provided, a default agent will be created.
- seed (int): Random seed for reproducibility. (default: 42)
- **kwargs: Additional keyword arguments passed to the BaseGenerator.
default_instruction_agent
Returns:
ChatAgent: An agent with the default instruction prompt.
default_rationale_agent
Returns:
ChatAgent: An agent with the default rationale prompt.
format_support_block
Format a DataPoint into a few-shot example block.
Parameters:
- dp (DataPoint): A data point.
Returns:
str: A formatted string containing the question and its corresponding code block in Markdown-style Python format.
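A minimal sketch of what such a formatter might produce, assuming a data point exposes a question and its solution code. The `ExamplePoint` class, its field names, and the exact "Question:/Code:" layout are hypothetical; the library's actual output format may differ.

```python
from dataclasses import dataclass


@dataclass
class ExamplePoint:
    """Hypothetical stand-in for DataPoint; the field names are assumptions."""

    question: str
    rationale: str  # the code that solves the question


def format_support_block(dp: ExamplePoint) -> str:
    """Render one data point as a few-shot example with a fenced Python block."""
    fence = "`" * 3  # build the Markdown fence without writing it literally
    return (
        f"Question:\n{dp.question}\n\n"
        f"Code:\n{fence}python\n{dp.rationale}\n{fence}"
    )


block = format_support_block(ExamplePoint("What is 2 + 3?", "print(2 + 3)"))
print(block)
```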
generate_new_instruction
Generate a new instruction using self-instruct prompting.
Parameters:
- agent (ChatAgent): The agent to use for generating the instruction.
- support_human_dps (list[DataPoint]): List of human examples to sample from.
- support_machine_dps (list[DataPoint]): List of machine examples to sample from.
Returns:
str: The newly generated question.
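Self-instruct prompting of this kind can be sketched as: sample a few human and machine examples, list them as numbered questions, and leave the last slot open for the model to fill. The sketch below uses plain strings instead of DataPoint objects, and the sample counts, prompt wording, and `ask_model` callable are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional


def build_instruction_prompt(
    human_examples: List[str],
    machine_examples: List[str],
    num_human: int = 3,
    num_machine: int = 1,
    rng: Optional[random.Random] = None,
) -> str:
    """Sample from both pools and leave one open slot for a new question."""
    rng = rng or random.Random(42)
    picked = rng.sample(human_examples, min(num_human, len(human_examples)))
    picked += rng.sample(machine_examples, min(num_machine, len(machine_examples)))
    lines = [f"Question {i}: {q}" for i, q in enumerate(picked, start=1)]
    lines.append(f"Question {len(picked) + 1}:")  # the slot the model completes
    return "\n".join(lines)


def generate_new_instruction(
    ask_model: Callable[[str], str],
    human_examples: List[str],
    machine_examples: List[str],
) -> str:
    """Build the few-shot prompt and return the model's cleaned-up reply."""
    prompt = build_instruction_prompt(human_examples, machine_examples)
    return ask_model(prompt).strip()


# A stub replaces the real agent so the sketch runs offline.
new_question = generate_new_instruction(
    lambda prompt: "  What is 7 * 6?  ",
    ["What is 2 + 2?", "What is 10 / 5?"],
    ["What is 3 ** 2?"],
)
print(new_question)
```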
generate_rationale
Generate rationale code (solution) for the given question.
Parameters:
- question (str): The question to be solved.
- agent (Optional[ChatAgent]): The agent to use for generating the rationale. If None is provided, the default rationale agent will be used. (default: None)
- support_human_dps (Optional[list[DataPoint]]): List of human examples to sample from. (default: None)
Returns:
str: The generated code solution as a string.
QuestionSchema
Schema for the generated question.
Attributes:
- question (str): The question generated by the model.
RationaleSchema
Schema for the generated rationale code.
Attributes:
- code (str): The generated code without any formatting.
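To illustrate how single-field schemas like these constrain a model's structured reply, here is a small sketch using only the standard library. The `QuestionRecord` and `RationaleRecord` classes and the `parse_rationale` helper are hypothetical stand-ins; this page does not state which base class the real schemas use.

```python
import json
from dataclasses import dataclass


@dataclass
class QuestionRecord:
    """Hypothetical stand-in for QuestionSchema."""

    question: str


@dataclass
class RationaleRecord:
    """Hypothetical stand-in for RationaleSchema."""

    code: str  # raw code, with no Markdown fences around it


def parse_rationale(raw_json: str) -> RationaleRecord:
    """Parse a JSON reply and check that it carries a string 'code' field."""
    data = json.loads(raw_json)
    code = data["code"]
    if not isinstance(code, str):
        raise TypeError("'code' must be a string")
    return RationaleRecord(code=code)


record = parse_rationale('{"code": "print(1 + 1)"}')
print(record.code)
```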