# camel.datagen.self\_improving\_cot

## SelfImprovingCoTPipeline

```python theme={"system"}
class SelfImprovingCoTPipeline:
```

Pipeline for generating self-taught reasoning traces using the self-improving methodology.

This implements the STaR paper's approach of:

1. Initial reasoning trace generation
2. Self-evaluation
3. Feedback-based improvement
4. Iterative refinement

### \_\_init\_\_

```python theme={"system"}
def __init__(
    self,
    reason_agent: ChatAgent,
    problems: List[Dict],
    max_iterations: int = 3,
    score_threshold: Union[float, Dict[str, float]] = 0.7,
    rejection_sampling_n: Optional[int] = None,
    evaluate_agent: Optional[ChatAgent] = None,
    reward_model: Optional[BaseRewardModel] = None,
    output_path: Optional[str] = None,
    few_shot_examples: Optional[str] = None,
    batch_size: Optional[int] = None,
    max_workers: Optional[int] = None,
    solution_pattern: str = '\\\\boxed{(.*?)}',
    trace_pattern: Optional[str] = None
):
```

Initialize the self-improving cot pipeline.

**Parameters:**

* **reason\_agent** (ChatAgent): The chat agent used for generating and improving reasoning traces.
* **problems** (List\[Dict]): List of problem dictionaries to process.
* **max\_iterations** (int, optional): Maximum number of improvement iterations. If set to `0`, the pipeline will generate an initial trace without any improvement iterations. (default: :obj:`3`)
* **score\_threshold** (Union\[float, Dict\[str, float]], optional): Quality threshold. Can be either a single float value applied to the average score, or a dictionary mapping score dimensions to their thresholds, for example `{"correctness": 0.8, "coherence": 0.7}`. When a reward model is used and no threshold is specified for a dimension, the default value 0.7 is used.
  (default: :obj:`0.7`)
* **rejection\_sampling\_n** (int, optional): Number of samples to draw when using rejection sampling, where samples are accepted or rejected based on a predefined condition to achieve a desired distribution. (default: :obj:`None`)
* **evaluate\_agent** (Optional\[ChatAgent]): The chat agent used for evaluating reasoning traces. (default: :obj:`None`)
* **reward\_model** (BaseRewardModel, optional): Model used to evaluate reasoning traces. If `None`, uses Agent self-evaluation. (default: :obj:`None`)
* **output\_path** (str, optional): Output path for saving traces. If `None`, results will only be returned without saving to file. (default: :obj:`None`)
* **few\_shot\_examples** (str, optional): Examples to use for few-shot generation. (default: :obj:`None`)
* **batch\_size** (int, optional): Batch size for parallel processing. (default: :obj:`None`)
* **max\_workers** (int, optional): Maximum number of worker threads. (default: :obj:`None`)
* **solution\_pattern** (str, optional): Regular expression pattern with one capture group to extract answers from solution text. (default: :obj:`r'\\boxed{(.*?)}'`)
* **trace\_pattern** (str, optional): Regular expression pattern with one capture group to extract answers from trace text. If `None`, uses the same pattern as solution\_pattern. (default: :obj:`None`)

### safe\_write\_json

```python theme={"system"}
def safe_write_json(self, file_path, data):
```

### clean\_json

```python theme={"system"}
def clean_json(self, data):
```

### \_check\_score\_threshold

```python theme={"system"}
def _check_score_threshold(self, scores: Dict[str, float]):
```

Check if scores meet the threshold requirements.

**Parameters:**

* **scores** (Dict\[str, float]): Dictionary of scores for different dimensions.

**Returns:**

bool: True if scores meet threshold requirements, False otherwise.
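
The two forms of `score_threshold` described above (a single float compared against the average score, or a per-dimension dictionary with a 0.7 fallback) can be illustrated with a minimal standalone sketch. This is not the library's implementation: the function name `meets_threshold`, the use of `>=` for the comparison, and the choice to apply the 0.7 fallback to every unspecified dimension are assumptions made for illustration.

```python
from typing import Dict, Union

# Fallback for dimensions without an explicit threshold (0.7, following the
# parameter description; applying it in all cases is an assumption).
DEFAULT_THRESHOLD = 0.7


def meets_threshold(
    scores: Dict[str, float],
    threshold: Union[float, Dict[str, float]] = DEFAULT_THRESHOLD,
) -> bool:
    """Hypothetical helper: return True if the scores satisfy the threshold."""
    if not scores:
        # Nothing was scored, so nothing can pass.
        return False
    if isinstance(threshold, dict):
        # Per-dimension mode: every scored dimension must reach its own
        # threshold, or the fallback when none is given.
        return all(
            value >= threshold.get(dimension, DEFAULT_THRESHOLD)
            for dimension, value in scores.items()
        )
    # Single-float mode: compare the average score against the threshold.
    average = sum(scores.values()) / len(scores)
    return average >= threshold
```

With a float threshold a weak dimension can be offset by a strong one, while the dictionary form rejects a trace as soon as any single dimension falls short.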
### \_generate\_feedback

```python theme={"system"}
def _generate_feedback(self, scores: Dict[str, float]):
```

Generate feedback based on which dimensions need improvement.

**Parameters:**

* **scores** (Dict\[str, float]): Dictionary of scores for different dimensions.

**Returns:**

str: Feedback message indicating which dimensions need improvement.

### generate\_reasoning\_trace

```python theme={"system"}
def generate_reasoning_trace(self, problem: str):
```

Generate initial reasoning trace for a given problem.

**Parameters:**

* **problem** (str): The problem text to generate reasoning for.

**Returns:**

str: Generated reasoning trace.

### evaluate\_trace

```python theme={"system"}
def evaluate_trace(
    self,
    problem: str,
    trace: str,
    solution: Optional[str] = None
):
```

Evaluate the quality of a reasoning trace.

**Parameters:**

* **problem** (str): The original problem text to evaluate against.
* **trace** (str): The reasoning trace to evaluate.
* **solution** (Optional\[str]): The solution to the problem, if provided. (default: :obj:`None`)

**Returns:**

Dict\[str, Any]: Evaluation results containing:

* scores: Dict of evaluation dimensions and their scores
* feedback: Detailed feedback for improvement

For Agent self-evaluation, the scores will include:

* correctness: Score for logical correctness
* clarity: Score for clarity of explanation
* completeness: Score for completeness of reasoning

For reward model evaluation, the scores will depend on the model's evaluation dimensions.

### generate\_reasoning\_trace\_rejection

```python theme={"system"}
def generate_reasoning_trace_rejection(self, problem: str):
```

Generate multiple candidate reasoning traces for a problem and select the best one based on evaluation.

**Parameters:**

* **problem** (str): The problem text for generating a reasoning trace.

**Returns:**

str: The best candidate trace that meets quality criteria, or the first candidate if none qualify.
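
The selection rule described for `generate_reasoning_trace_rejection` (pick the best qualifying candidate, otherwise fall back to the first one) can be sketched without any agent at all. The names `select_trace`, `average_score`, `generate`, and `evaluate` below are hypothetical stand-ins, and ranking candidates by their average score is an assumption; the source does not say how "best" is determined.

```python
from typing import Callable, Dict, List, Optional


def average_score(scores: Dict[str, float]) -> float:
    """Mean of all dimension scores, or 0.0 when nothing was scored."""
    return sum(scores.values()) / len(scores) if scores else 0.0


def select_trace(
    problem: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str, str], Dict[str, float]],
    n: int,
    threshold: float = 0.7,
) -> str:
    """Hypothetical sketch: draw n candidates and keep the best qualifying one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw all candidates first so the fallback can return the first of them.
    candidates: List[str] = [generate(problem) for _ in range(n)]
    best: Optional[str] = None
    best_score = float("-inf")
    for candidate in candidates:
        score = average_score(evaluate(problem, candidate))
        # A candidate qualifies only if it reaches the threshold; among
        # qualifying candidates the highest average wins (assumed ranking).
        if score >= threshold and score > best_score:
            best, best_score = candidate, score
    # No candidate qualified: fall back to the first one, as documented.
    return best if best is not None else candidates[0]
```

In the real pipeline the `generate` and `evaluate` roles would presumably be filled by `generate_reasoning_trace` and `evaluate_trace`, with `n` taken from `rejection_sampling_n`.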
### improve\_trace

```python theme={"system"}
def improve_trace(
    self,
    problem: str,
    trace: str,
    feedback: str,
    solution: Optional[str] = None
):
```

Generate improved reasoning trace based on feedback.

**Parameters:**

* **problem** (str): The original problem text.
* **trace** (str): The current reasoning trace.
* **feedback** (str): Feedback for improving the trace.
* **solution** (Optional\[str]): The solution to the problem, if provided. (default: :obj:`None`)

**Returns:**

str: Improved reasoning trace.

### validate\_problem\_format

```python theme={"system"}
def validate_problem_format(self, problem: Dict):
```

Validate that a problem dictionary has the required format.

**Parameters:**

* **problem** (Dict): Problem dictionary to validate.

### \_check\_boxed\_answers

```python theme={"system"}
def _check_boxed_answers(self, solution: str, trace: str):
```

Check if the answer in the trace matches the solution using the configured patterns.

**Parameters:**

* **solution** (str): The problem solution string.
* **trace** (str): The reasoning trace string.

**Returns:**

bool: True if answers match, False otherwise

### process\_problem

```python theme={"system"}
def process_problem(self, problem: Dict, rationalization: bool = False):
```

Process a single problem through the self-improving cot pipeline.

**Parameters:**

* **problem** (Dict): Problem dictionary containing the problem text.
* **rationalization** (bool, optional): Whether to use rationalization. (default: :obj:`False`)

**Returns:**

ProblemResult: Results with final trace and history.

### generate

```python theme={"system"}
def generate(self, rationalization: bool = False):
```

Execute the self-improving cot pipeline on all problems. Process problems and return results. If output\_path is specified, also save results to file.

**Parameters:**

* **rationalization** (bool, optional): Whether to use rationalization. (default: :obj:`False`)

**Returns:**

List\[Dict\[str, Any]]: List of processed results
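
To make the role of `solution_pattern` and `trace_pattern` in `_check_boxed_answers` more concrete, here is a minimal sketch of answer matching with one-capture-group regular expressions. It uses the documented default pattern, but `extract_answer` and `answers_match` are hypothetical helpers, and details such as stripping whitespace and passing `re.DOTALL` are assumptions rather than confirmed library behavior.

```python
import re
from typing import Optional

# Documented default for solution_pattern: a literal backslash, "boxed",
# and one capture group between braces.
DEFAULT_PATTERN = r'\\boxed{(.*?)}'


def extract_answer(text: str, pattern: str) -> Optional[str]:
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


def answers_match(
    solution: str,
    trace: str,
    solution_pattern: str = DEFAULT_PATTERN,
    trace_pattern: Optional[str] = None,
) -> bool:
    """Hypothetical sketch: compare the extracted answers of both texts."""
    # As documented, a missing trace pattern reuses the solution pattern.
    trace_pattern = trace_pattern or solution_pattern
    expected = extract_answer(solution, solution_pattern)
    found = extract_answer(trace, trace_pattern)
    # If the solution has no extractable answer, nothing can match.
    return expected is not None and expected == found
```

Because the capture group is lazy, the default pattern stops at the first closing brace, so a nested answer such as `\boxed{\frac{1}{2}}` would be captured only partially; a custom pattern may be needed for such cases.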