RagasFields
annotate_dataset
- dataset (Dataset): The input dataset to annotate.
- context_call (Optional[Callable[[Dict[str, Any]], List[str]]]): Function to generate context for each example.
- answer_call (Optional[Callable[[Dict[str, Any]], str]]): Function to generate answer for each example.
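The two callables above can be sketched from their type hints alone. This is a minimal illustration, not the library's own implementation: the function names and the example keys "question" and "documents" are hypothetical, and a real dataset may use different field names.

```python
from typing import Any, Dict, List


def example_context_call(example: Dict[str, Any]) -> List[str]:
    # Hypothetical: assumes candidate passages are stored under "documents".
    # Returns at most the first two passages as the context.
    return [str(doc) for doc in example.get("documents", [])][:2]


def example_answer_call(example: Dict[str, Any]) -> str:
    # Hypothetical: a real answer_call would typically query a model here.
    return f"Placeholder answer for: {example.get('question', '')}"


# A single made-up example, used only to show the expected input/output shapes.
example = {
    "question": "Who wrote Hamlet?",
    "documents": ["Doc A", "Doc B", "Doc C"],
}
contexts = example_context_call(example)
answer = example_answer_call(example)
```

Both functions take one example as a dictionary; the first returns a list of strings and the second a single string, matching the annotated signatures.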
rmse
- input_trues (Sequence[float]): Ground truth values.
- input_preds (Sequence[float]): Predicted values.
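For reference, root mean squared error over paired values can be sketched as below. This is a standalone illustration of the standard formula under the assumption of equal-length, non-empty inputs; the name rmse_sketch is hypothetical, and the documented function may handle edge cases (such as missing predictions) differently.

```python
import math
from typing import Sequence


def rmse_sketch(input_trues: Sequence[float], input_preds: Sequence[float]) -> float:
    # Assumption: both sequences are non-empty and have the same length.
    if len(input_trues) != len(input_preds):
        raise ValueError("input_trues and input_preds must have the same length")
    if len(input_trues) == 0:
        raise ValueError("at least one value is required")
    # Mean of squared differences, then square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(input_trues, input_preds)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Squared errors are 0, 0, and 4, so the result is sqrt(4 / 3).
score = rmse_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```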
auroc
- trues (Sequence[bool]): Ground truth binary values.
- preds (Sequence[float]): Predicted probability values.
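The area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The name auroc_sketch is hypothetical, and the documented function may rely on a library routine and treat degenerate inputs differently.

```python
from typing import Sequence


def auroc_sketch(trues: Sequence[bool], preds: Sequence[float]) -> float:
    # Assumption: equal-length inputs with at least one positive and one negative.
    if len(trues) != len(preds):
        raise ValueError("trues and preds must have the same length")
    positives = [p for t, p in zip(trues, preds) if t]
    negatives = [p for t, p in zip(trues, preds) if not t]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ranked correctly.
area = auroc_sketch([True, False, True, False], [0.9, 0.1, 0.4, 0.6])
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation is the usual choice for large inputs.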
ragas_calculate_metrics
- dataset (Dataset): The dataset containing predictions and ground truth.
- pred_context_relevance_field (Optional[str]): Field name for predicted context relevance.
- pred_faithfulness_field (Optional[str]): Field name for predicted faithfulness.
- metrics_to_evaluate (Optional[List[str]]): List of metrics to evaluate.
- ground_truth_context_relevance_field (str): Field name for ground truth context relevance.
- ground_truth_faithfulness_field (str): Field name for ground truth faithfulness (adherence).
ragas_evaluate_dataset
- dataset (Dataset): Input dataset to evaluate.
- contexts_field_name (Optional[str]): Field name containing contexts.
- answer_field_name (Optional[str]): Field name containing answers.
- metrics_to_evaluate (Optional[List[str]]): List of metrics to evaluate.
RAGBenchBenchmark
- processes (int, optional): Number of processes for parallel processing.
- subset (str, optional): Dataset subset to use (e.g., "hotpotqa").
- split (str, optional): Dataset split to use (e.g., "test").
__init__
download
load
- force_download (bool, optional): Whether to force download the data.
run
- agent (ChatAgent): Chat agent for generating answers.
- auto_retriever (AutoRetriever): Retriever for finding relevant contexts.