camel.storages.vectordb_storages.faiss
FaissStorage
An implementation of BaseVectorStorage using FAISS, Facebook AI’s similarity search library for efficient vector search.
Detailed information about FAISS is available at https://github.com/facebookresearch/faiss.
Parameters:
- vector_dim (int): The dimension of the vectors to store.
- index_type (str, optional): Type of FAISS index to create. Options include ‘Flat’, ‘IVF’, ‘HNSW’, etc. (default: 'Flat')
- collection_name (Optional[str], optional): Name for the collection. If not provided, it is set to the current time in ISO format. (default: None)
- storage_path (Optional[str], optional): Path to the directory where the index will be stored. If None, the index exists only in memory. (default: None)
- distance (VectorDistance, optional): The distance metric for vector comparison. (default: VectorDistance.COSINE)
- nlist (int, optional): Number of cluster centroids for IVF indexes. Only used if index_type includes ‘IVF’. (default: 100)
- m (int, optional): HNSW parameter. Number of connections per node. Only used if index_type includes ‘HNSW’. (default: 16)
- **kwargs (Any): Additional keyword arguments.
Notes:
- FAISS offers various index types optimized for different use cases:
  - ‘Flat’: Exact search, but slowest for large datasets
  - ‘IVF’: Inverted file index, good balance of speed and recall
  - ‘HNSW’: Hierarchical Navigable Small World, fast with high recall
  - ‘PQ’: Product Quantization for memory-efficient storage
- The choice of index should be based on your specific requirements for search speed, memory usage, and accuracy.
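The relationship between index_type and the tuning parameters can be sketched in plain Python. FaissConfigSketch and active_index_params are hypothetical names used only for illustration; they are not part of the library, and the distance argument is left out so the sketch runs without any imports from it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FaissConfigSketch:
    """Hypothetical stand-in for the constructor arguments listed above."""

    vector_dim: int
    index_type: str = "Flat"
    collection_name: Optional[str] = None
    storage_path: Optional[str] = None
    nlist: int = 100  # only meaningful when index_type includes 'IVF'
    m: int = 16  # only meaningful when index_type includes 'HNSW'

    def active_index_params(self) -> Dict[str, int]:
        """Return only the tuning parameters that apply to index_type."""
        params: Dict[str, int] = {}
        if "IVF" in self.index_type:
            params["nlist"] = self.nlist
        if "HNSW" in self.index_type:
            params["m"] = self.m
        return params


flat = FaissConfigSketch(vector_dim=4)
ivf = FaissConfigSketch(vector_dim=4, index_type="IVF", nlist=50)
hnsw = FaissConfigSketch(vector_dim=4, index_type="HNSW")
```

With these defaults, a ‘Flat’ configuration ignores both nlist and m, while an ‘IVF’ or ‘HNSW’ configuration picks up only the parameter that applies to it.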
__init__
Initialize the FAISS vector storage.
Parameters:
- vector_dim: Dimension of vectors to be stored
- index_type: FAISS index type (‘Flat’, ‘IVF’, ‘HNSW’, etc.)
- collection_name: Name of the collection (default: timestamp)
- storage_path: Directory to save the index (None for in-memory only)
- distance: Vector distance metric
- nlist: Number of clusters for IVF indexes
- m: HNSW parameter for connections per node
- **kwargs: Additional parameters
_generate_collection_name
Generates a collection name if the user doesn’t provide one.
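A minimal sketch of this fallback, assuming the generated name is simply the current time in ISO format as described for collection_name. The function below is a standalone illustration, not the library's private method.

```python
from datetime import datetime
from typing import Optional


def generate_collection_name(collection_name: Optional[str] = None) -> str:
    """Return the given name, or an ISO-format timestamp when none is given."""
    if collection_name is not None:
        return collection_name
    # Assumption: local time via datetime.now(); the real method may differ.
    return datetime.now().isoformat()


explicit_name = generate_collection_name("docs")
generated_name = generate_collection_name()
```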
_get_index_path
Returns the path to the index file
_get_metadata_path
Returns the path to the metadata file
_create_index
Returns:
A FAISS index object configured according to the parameters.
_save_to_disk
Save the index and metadata to disk if storage_path is provided.
_load_from_disk
Loads the index and metadata from disk if they exist.
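The save and load behavior can be illustrated for the metadata half using only the standard library. The file name pattern and the JSON format below are assumptions made for this sketch; the actual on-disk layout is not specified here. The FAISS index itself would typically be persisted separately, for example with faiss.write_index and faiss.read_index.

```python
import json
import os
import tempfile
from typing import Any, Dict, Optional


def metadata_path(storage_path: str, collection_name: str) -> str:
    # Hypothetical file naming; the real layout may differ.
    return os.path.join(storage_path, f"{collection_name}.meta.json")


def save_metadata(
    storage_path: Optional[str],
    collection_name: str,
    metadata: Dict[str, Any],
) -> bool:
    """Write metadata to disk; do nothing when storage_path is None."""
    if storage_path is None:
        return False  # in-memory only
    os.makedirs(storage_path, exist_ok=True)
    path = metadata_path(storage_path, collection_name)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f)
    return True


def load_metadata(
    storage_path: Optional[str], collection_name: str
) -> Dict[str, Any]:
    """Read metadata back if the file exists; otherwise return an empty dict."""
    if storage_path is None:
        return {}
    path = metadata_path(storage_path, collection_name)
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved = save_metadata(tmp_dir, "demo", {"id-1": {"source": "demo"}})
    round_trip = load_metadata(tmp_dir, "demo")
    absent = load_metadata(tmp_dir, "other")
```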
add
Adds a list of vectors to the index.
Parameters:
- records (List[VectorRecord]): List of vector records to be added.
- **kwargs (Any): Additional keyword arguments.
update_payload
Updates the payload of the vectors identified by their IDs.
Parameters:
- ids (List[str]): List of unique identifiers for the vectors to be updated.
- payload (Dict[str, Any]): Payload to be updated for all specified IDs.
- **kwargs (Any): Additional keyword arguments.
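As a rough illustration, the same payload is applied to every listed ID. The sketch below uses a plain dictionary as a stand-in for the stored metadata and assumes the new payload is merged into the existing one; whether the library merges or replaces, and how it treats unknown IDs, is not stated here.

```python
from typing import Any, Dict, List


def update_payload(
    store: Dict[str, Dict[str, Any]],
    ids: List[str],
    payload: Dict[str, Any],
) -> List[str]:
    """Merge payload into each known ID; return the IDs that were not found."""
    missing: List[str] = []
    for vector_id in ids:
        if vector_id in store:
            # Assumption: merge keys rather than replace the whole payload.
            store[vector_id].update(payload)
        else:
            missing.append(vector_id)
    return missing


store = {"a": {"lang": "en"}, "b": {"lang": "fr"}}
missing = update_payload(store, ["a", "c"], {"reviewed": True})
```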
delete_collection
Deletes the entire collection (index and metadata).
delete
Deletes vectors from the index based on either IDs or payload filters.
Parameters:
- ids (Optional[List[str]], optional): List of unique identifiers for the vectors to be deleted.
- payload_filter (Optional[Dict[str, Any]], optional): A filter for the payload to delete points matching specific conditions.
- **kwargs (Any): Additional keyword arguments.
status
Returns:
VectorDBStatus: Current status of the vector database.
query
Searches for similar vectors in the storage based on the provided query.
Parameters:
- query (VectorDBQuery): The query object containing the search vector and the number of top similar vectors to retrieve.
- filter_conditions (Optional[Dict[str, Any]], optional): A dictionary specifying conditions to filter the query results.
- **kwargs (Any): Additional keyword arguments.
Returns:
List[VectorDBQueryResult]: A list of query results ordered by similarity.
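To make the query semantics concrete, here is a dependency-free sketch of an exact (brute-force) top-k search with cosine similarity and optional payload filtering, which is conceptually what a ‘Flat’ index does. It uses plain dictionaries and tuples instead of VectorDBQuery and VectorDBQueryResult, and the function names are illustrative only.

```python
import math
from typing import Any, Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between a and b (0.0 if either is a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(
    vectors: Dict[str, List[float]],
    payloads: Dict[str, Dict[str, Any]],
    query_vector: List[float],
    top_k: int,
    filter_conditions: Optional[Dict[str, Any]] = None,
) -> List[Tuple[str, float]]:
    """Return up to top_k (id, similarity) pairs, most similar first."""
    scored: List[Tuple[str, float]] = []
    for vector_id, vector in vectors.items():
        payload = payloads.get(vector_id, {})
        # Skip vectors whose payload fails any of the filter conditions.
        if filter_conditions and any(
            payload.get(key) != value
            for key, value in filter_conditions.items()
        ):
            continue
        scored.append((vector_id, cosine_similarity(query_vector, vector)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
payloads = {"a": {"lang": "en"}, "b": {"lang": "en"}, "c": {"lang": "fr"}}
results = query(vectors, payloads, [1.0, 0.0], top_k=2)
filtered = query(
    vectors, payloads, [1.0, 0.0], top_k=2, filter_conditions={"lang": "en"}
)
```

Without a filter the two closest vectors are "a" and "c"; restricting to payloads with lang equal to "en" drops "c" and returns "a" and "b" instead.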
clear
Remove all vectors from the storage.
load
Load the index from disk if storage_path is provided.
client
Provides access to the underlying FAISS client.
_matches_filter
Checks if a vector’s payload matches the filter conditions.
Parameters:
- vector_id (str): ID of the vector to check.
- filter_conditions (Dict[str, Any]): Conditions to match against.
Returns:
bool: True if the payload matches all conditions, False otherwise.
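The matching rule described above (every condition must hold) can be sketched as a simple equality check over the payload. The payloads dictionary is a stand-in for however the storage keeps metadata, and treating unknown IDs as non-matching is an assumption of this sketch.

```python
from typing import Any, Dict


def matches_filter(
    payloads: Dict[str, Dict[str, Any]],
    vector_id: str,
    filter_conditions: Dict[str, Any],
) -> bool:
    """Return True only if the payload satisfies every filter condition."""
    payload = payloads.get(vector_id)
    if payload is None:
        return False  # assumption: unknown IDs never match
    return all(
        key in payload and payload[key] == value
        for key, value in filter_conditions.items()
    )


payloads = {"a": {"lang": "en", "year": 2024}}
match_one = matches_filter(payloads, "a", {"lang": "en"})
match_conflict = matches_filter(payloads, "a", {"lang": "en", "year": 2023})
match_unknown = matches_filter(payloads, "missing", {"lang": "en"})
```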
_normalize_vector
Normalizes a vector to unit length for cosine similarity.
Parameters:
- vector (ndarray): Vector to normalize, either 1D or 2D array.
Returns:
ndarray: Normalized vector with the same shape as input.
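Normalizing to unit length matters because the inner product of two unit vectors equals their cosine similarity. The sketch below shows the idea with plain Python lists standing in for ndarray, handling both a single vector and a list of row vectors; leaving zero vectors unchanged is an assumption of this sketch.

```python
import math
from typing import List, Union

Vector = List[float]


def normalize_vector(
    vector: Union[Vector, List[Vector]],
) -> Union[Vector, List[Vector]]:
    """Scale a 1D vector, or each row of a 2D list, to unit L2 norm."""

    def _normalize_row(row: Vector) -> Vector:
        norm = math.sqrt(sum(x * x for x in row))
        if norm == 0.0:
            return list(row)  # assumption: zero vectors are left unchanged
        return [x / norm for x in row]

    # A nested list is treated as 2D input and normalized row by row.
    if vector and isinstance(vector[0], (list, tuple)):
        return [_normalize_row(list(row)) for row in vector]
    return _normalize_row(vector)


unit = normalize_vector([3.0, 4.0])
rows = normalize_vector([[3.0, 4.0], [0.0, 0.0]])
```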