# Camel.datasets.base generator

## BaseGenerator

```python theme={"system"}
class BaseGenerator(ABC, IterableDataset):
```

Abstract base class for data generators.

This class defines the interface for generating synthetic datapoints. Concrete implementations should provide specific generation strategies.

### \_\_init\_\_

```python theme={"system"}
def __init__(
    self,
    seed: int = 42,
    buffer: int = 20,
    cache: Union[str, Path, None] = None,
    data_path: Union[str, Path, None] = None,
    **kwargs
):
```

Initialize the base generator.

**Parameters:**

* **seed** (int): Random seed for reproducibility. (default: :obj:`42`)
* **buffer** (int): Number of DataPoints to generate when the iterator runs out of DataPoints in `self._data`. (default: :obj:`20`)
* **cache** (Union\[str, Path, None]): Optional path to save generated datapoints during iteration. If None is provided, datapoints will be discarded every 100 generations.
* **data\_path** (Union\[str, Path, None]): Optional path to a JSONL file to initialize the dataset from.
* **\*\*kwargs**: Additional generator parameters.

### \_\_aiter\_\_

```python theme={"system"}
def __aiter__(self):
```

Async iterator that yields datapoints dynamically.

If a `data_path` was provided during initialization, those datapoints are yielded first. When `self._data` is empty, 20 new datapoints are generated. Every 100 yields, the batch is appended to the JSONL file or discarded if `cache` is None.

**Yields:**

DataPoint: A single datapoint.

### \_\_iter\_\_

```python theme={"system"}
def __iter__(self):
```

Synchronous iterator for PyTorch IterableDataset compatibility.

If a `data_path` was provided during initialization, those datapoints are yielded first. When `self._data` is empty, 20 new datapoints are generated.
Every 100 yields, the batch is appended to the JSONL file or discarded if `cache` is None.

**Yields:**

DataPoint: A single datapoint.

### sample

```python theme={"system"}
def sample(self):
```

**Returns:**

DataPoint: The next DataPoint.

**Note:**

This method is intended for synchronous contexts. Use `async_sample` in asynchronous contexts to avoid blocking or runtime errors.

### save\_to\_jsonl

```python theme={"system"}
def save_to_jsonl(self, file_path: Union[str, Path]):
```

Saves the generated datapoints to a JSONL (JSON Lines) file.

Each datapoint is stored as a separate JSON object on a new line.

**Parameters:**

* **file\_path** (Union\[str, Path]): Path to save the JSONL file.

**Note:**

* Uses `self._data`, which contains the generated datapoints.
* Appends to the file if it already exists.
* Ensures compatibility with large datasets by using JSONL format.

### flush

```python theme={"system"}
def flush(self, file_path: Union[str, Path]):
```

Flush the current data to a JSONL file and clear the data.

**Parameters:**

* **file\_path** (Union\[str, Path]): Path to save the JSONL file.

**Note:**

* Uses `save_to_jsonl` to save `self._data`.

### \_init\_from\_jsonl

```python theme={"system"}
def _init_from_jsonl(self, file_path: Path):
```

Load and parse a dataset from a JSONL file.

**Parameters:**

* **file\_path** (Path): Path to the JSONL file.

**Returns:**

List\[Dict\[str, Any]]: A list of datapoint dictionaries.
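
### Example: buffered iteration with a JSONL cache

The iteration behaviour described for `__iter__` above (datapoints loaded from `data_path` come first, an empty buffer is refilled, and every 100 datapoints are appended to the cache or discarded) can be sketched without CAMEL at all. The block below is a minimal standalone illustration, not the library's implementation: `ToyGenerator`, `_generate_new`, `_append_jsonl`, `demo`, and the addition-problem datapoints are all hypothetical names and choices, and the refill size is assumed to follow `buffer`.

```python
import itertools
import json
import random
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterator, List, Union


class ToyGenerator:
    """Hypothetical stand-in imitating the documented behaviour.

    It does not import or subclass the CAMEL BaseGenerator.
    """

    def __init__(
        self,
        seed: int = 42,
        buffer: int = 20,
        cache: Union[str, Path, None] = None,
        data_path: Union[str, Path, None] = None,
    ) -> None:
        self._rng = random.Random(seed)  # seeded for reproducibility
        self._buffer = buffer
        self._cache = Path(cache) if cache is not None else None
        self._data: List[Dict[str, Any]] = []
        if data_path is not None:
            self._data = self._init_from_jsonl(Path(data_path))

    def _init_from_jsonl(self, file_path: Path) -> List[Dict[str, Any]]:
        # One JSON object per non-empty line.
        with file_path.open("r", encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def _generate_new(self, n: int) -> None:
        # Assumed generation strategy: trivial addition problems.
        for _ in range(n):
            a, b = self._rng.randint(0, 9), self._rng.randint(0, 9)
            self._data.append({"question": f"{a} + {b}", "answer": a + b})

    def _append_jsonl(self, records: List[Dict[str, Any]]) -> None:
        # Append mode, so an existing cache file is extended.
        with self._cache.open("a", encoding="utf-8") as f:
            for record in records:
                f.write(json.dumps(record) + "\n")

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        batch: List[Dict[str, Any]] = []
        while True:
            if not self._data:
                self._generate_new(self._buffer)  # refill the empty buffer
            datapoint = self._data.pop(0)
            batch.append(datapoint)
            if len(batch) == 100:
                if self._cache is not None:
                    self._append_jsonl(batch)
                batch = []  # written or discarded, the batch is dropped
            yield datapoint


def demo() -> tuple:
    """Pull 250 datapoints and report what reached the cache file."""
    with tempfile.TemporaryDirectory() as tmp:
        seed_file = Path(tmp) / "seed.jsonl"
        seed_file.write_text(
            '{"question": "1 + 1", "answer": 2}\n', encoding="utf-8"
        )
        cache = Path(tmp) / "cache.jsonl"
        gen = ToyGenerator(seed=0, cache=cache, data_path=seed_file)
        points = list(itertools.islice(gen, 250))
        cached = cache.read_text(encoding="utf-8").splitlines()
        return points[0], len(points), len(cached)
```

In this sketch, `demo()` returns the datapoint loaded from the seed file as the first element, and the cache holds 200 lines after 250 yields because only two full batches of 100 were completed. Flushing the batch before the `yield` is a design choice of the sketch, so the cache is already up to date when the consumer stops pulling.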