camel.retrievers.auto_retriever
AutoRetriever
Facilitates the automatic retrieval of information using a query-based approach with pre-defined elements.
Attributes:
url_and_api_key (Optional[Tuple[str, str]]): URL and API key for accessing the vector storage remotely.
vector_storage_local_path (Optional[str]): Local path for vector storage, if applicable.
storage_type (Optional[StorageType]): The type of vector storage to use. Defaults to StorageType.QDRANT.
embedding_model (Optional[BaseEmbedding]): Model used for embedding queries and documents. Defaults to OpenAIEmbedding().
__init__
_initialize_vector_storage
Sets up and returns a vector storage instance with specified parameters.
Parameters:
- collection_name (Optional[str]): Name of the collection in the vector storage.
Returns:
BaseVectorStorage: Configured vector storage instance.
_collection_name_generator
Generates a valid collection name from a given file path or URL.
Parameters:
- content (Union[str, Element]): Local file path, remote URL, string content or Element object.
Returns:
str: A sanitized, valid collection name suitable for use.
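To illustrate what "sanitized, valid collection name" can mean in practice, here is a minimal sketch. It is not the library's implementation: the function name `sketch_collection_name`, the 30-character limit, the underscore replacement rule, and the `"collection"` fallback are all assumptions made for this example, and it handles only string input (not Element objects).

```python
import re


def sketch_collection_name(content: str, max_length: int = 30) -> str:
    """Hypothetical helper: derive a storage-safe name from a path or URL.

    The rules here (underscore replacement, 30-character cap) are
    assumptions for illustration, not AutoRetriever's actual behavior.
    """
    # Collapse every run of characters other than letters and digits
    # into a single underscore.
    name = re.sub(r"[^0-9A-Za-z]+", "_", content)
    # Drop underscores left over at either end.
    name = name.strip("_")
    # Truncate to the assumed length limit, then remove a trailing
    # underscore that truncation may have exposed.
    name = name[:max_length].rstrip("_")
    # Fall back to a fixed placeholder if nothing usable remains.
    return name or "collection"


if __name__ == "__main__":
    print(sketch_collection_name("local/data/report.pdf"))
    print(sketch_collection_name("https://example.com/docs/page.html"))
```

Deriving the name deterministically from the input means the same file path or URL always maps to the same collection, which is what would let a retriever reuse previously stored embeddings.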
run_vector_retriever
Executes the automatic vector retriever process using vector storage.
Parameters:
- query (str): Query string for information retrieval.
- contents (Union[str, List[str], Element, List[Element]]): Local file paths, remote URLs, string contents or Element objects.
- top_k (int, optional): The number of top results to return during retrieval. Must be a positive integer. Defaults to DEFAULT_TOP_K_RESULTS.
- similarity_threshold (float, optional): The similarity threshold for filtering results. Defaults to DEFAULT_SIMILARITY_THRESHOLD.
- return_detailed_info (bool, optional): Whether to return detailed information including similarity score, content path and metadata. Defaults to False.
- max_characters (int): Max number of characters in each chunk. Defaults to 500.
Returns:
dict[str, Sequence[Collection[str]]]: By default, returns only the text information. If return_detailed_info is True, returns detailed information including similarity score, content path and metadata.
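The interaction between top_k, similarity_threshold and return_detailed_info can be sketched with plain Python. This is an illustrative model only, not the library's code: the function `sketch_select_results`, the input record fields (`text`, `score`, `source`), and the output keys (`query`, `results`) are hypothetical names chosen for this example. No defaults are given for top_k or similarity_threshold because the values of DEFAULT_TOP_K_RESULTS and DEFAULT_SIMILARITY_THRESHOLD are not stated here.

```python
from typing import Any, Dict, List


def sketch_select_results(
    query: str,
    scored_chunks: List[Dict[str, Any]],
    top_k: int,
    similarity_threshold: float,
    return_detailed_info: bool = False,
) -> Dict[str, Any]:
    """Hypothetical model of filtering retrieved chunks.

    Each item in scored_chunks is assumed to look like
    {"text": str, "score": float, "source": str}.
    """
    if top_k <= 0:
        raise ValueError("top_k must be a positive integer")

    # Keep only chunks whose similarity meets the threshold.
    kept = [c for c in scored_chunks if c["score"] >= similarity_threshold]
    # Rank the survivors from most to least similar, then cap at top_k.
    kept.sort(key=lambda c: c["score"], reverse=True)
    kept = kept[:top_k]

    if return_detailed_info:
        # Detailed mode: return the full records (score, source, text).
        results: List[Any] = kept
    else:
        # Default mode: return only the text of each chunk.
        results = [c["text"] for c in kept]
    return {"query": query, "results": results}


if __name__ == "__main__":
    chunks = [
        {"text": "alpha", "score": 0.9, "source": "a.txt"},
        {"text": "beta", "score": 0.5, "source": "b.txt"},
        {"text": "gamma", "score": 0.8, "source": "c.txt"},
    ]
    print(sketch_select_results("demo", chunks, top_k=2, similarity_threshold=0.6))
```

In this sketch the threshold is applied before the top_k cap, so a call can return fewer than top_k results (or none) when few chunks are similar enough. Whether the real method applies the two in that order is an assumption.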