# camel.retrievers.vector\_retriever

<a id="camel.retrievers.vector_retriever" />

<a id="camel.retrievers.vector_retriever.VectorRetriever" />

## VectorRetriever

```python theme={"system"}
class VectorRetriever(BaseRetriever):
```

An implementation of `BaseRetriever` that uses a vector storage and an
embedding model.

This class facilitates the retrieval of relevant information using a
query-based approach, backed by vector embeddings.

**Parameters:**

* **embedding\_model** (BaseEmbedding): Embedding model used to generate vector embeddings.
* **storage** (BaseVectorStorage): Vector storage to query.
* **unstructured\_modules** (UnstructuredIO): A module for parsing files and URLs and chunking content based on specified parameters.
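
To make the idea of query-based retrieval over vector embeddings concrete, here is a minimal, self-contained sketch. It does not use CAMEL's classes: `toy_embed`, `cosine_similarity`, and `rank` are hypothetical helpers, and the letter-count "embedding" only stands in for a real embedding model. Cosine similarity is assumed as the scoring function for illustration.

```python
import math
from typing import List, Tuple


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: counts of letters a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: str, documents: List[str]) -> List[Tuple[float, str]]:
    """Score every document against the query, most similar first."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


documents = ["vector storage", "zzz qqq", "vector store"]
ranked = rank("vector storage", documents)
```

In this toy setup the document identical to the query ranks first and the document sharing no letters with it ranks last. A real retriever would replace `toy_embed` with the configured embedding model and the in-memory list with the vector storage.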

<a id="camel.retrievers.vector_retriever.VectorRetriever.__init__" />

### \_\_init\_\_

```python theme={"system"}
def __init__(
    self,
    embedding_model: Optional[BaseEmbedding] = None,
    storage: Optional[BaseVectorStorage] = None
):
```

Initializes the retriever with an optional embedding model and an optional vector storage.

**Parameters:**

* **embedding\_model** (Optional\[BaseEmbedding]): The embedding model instance. Defaults to `OpenAIEmbedding` if not provided.
* **storage** (Optional\[BaseVectorStorage]): Vector storage to query. Defaults to `None`.

<a id="camel.retrievers.vector_retriever.VectorRetriever.process" />

### process

```python theme={"system"}
def process(
    self,
    content: Union[str, 'Element', IO[bytes]],
    chunk_type: str = 'chunk_by_title',
    max_characters: int = 500,
    embed_batch: int = 50,
    should_chunk: bool = True,
    extra_info: Optional[dict] = None,
    metadata_filename: Optional[str] = None,
    chunker: Optional[BaseChunker] = None,
    **kwargs: Any
):
```

Processes content from a local file path, remote URL, string
content, `Element` object, or binary file object, divides it into
chunks using `Unstructured IO`, and stores the chunk embeddings in the
specified vector storage.

**Parameters:**

* **content** (Union\[str, Element, IO\[bytes]]): Local file path, remote URL, string content, Element object, or a binary file object.
* **chunk\_type** (str): Type of chunking to apply. Defaults to "chunk\_by\_title".
* **max\_characters** (int): Max number of characters in each chunk. Defaults to `500`.
* **embed\_batch** (int): Size of batch for embeddings. Defaults to `50`.
* **should\_chunk** (bool): If True, divide the content into chunks, otherwise skip chunking. Defaults to True.
* **extra\_info** (Optional\[dict]): Extra information to be added to the payload. Defaults to None.
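* **metadata\_filename** (Optional\[str]): The metadata filename to be used for storing metadata. Defaults to None.
* **chunker** (Optional\[BaseChunker]): Chunker used to divide the content. Defaults to None.
* **\*\*kwargs** (Any): Additional keyword arguments for content parsing.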
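
The interplay of `max_characters` and `embed_batch` can be sketched as follows. This is an illustration only, not the library's implementation: the real method delegates chunking to Unstructured IO (for example with the `chunk_by_title` strategy), whereas `split_into_chunks` and `batched` below are hypothetical helpers that use a simple greedy word-packing rule.

```python
from typing import Iterator, List


def split_into_chunks(text: str, max_characters: int = 500) -> List[str]:
    """Greedily pack whitespace-separated words into chunks.

    Each chunk holds at most max_characters characters, except that a
    single word longer than the limit becomes its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_characters or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def batched(items: List[str], embed_batch: int = 50) -> Iterator[List[str]]:
    """Yield consecutive groups of at most embed_batch items."""
    for start in range(0, len(items), embed_batch):
        yield items[start:start + embed_batch]


text = "alpha beta gamma delta epsilon"
chunks = split_into_chunks(text, max_characters=11)
batches = list(batched(chunks, embed_batch=2))
# chunks  -> ["alpha beta", "gamma delta", "epsilon"]
# batches -> [["alpha beta", "gamma delta"], ["epsilon"]]
```

Each batch would then be passed to the embedding model in a single call, which is why a larger `embed_batch` means fewer embedding calls at the cost of larger requests.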

<a id="camel.retrievers.vector_retriever.VectorRetriever.query" />

### query

```python theme={"system"}
def query(
    self,
    query: str,
    top_k: int = Constants.DEFAULT_TOP_K_RESULTS,
    similarity_threshold: float = Constants.DEFAULT_SIMILARITY_THRESHOLD
):
```

Executes a query in vector storage and compiles the retrieved
results into a list of dictionaries.

**Parameters:**

* **query** (str): Query string for information retrieval.
* **similarity\_threshold** (float, optional): The similarity threshold for filtering results. Defaults to `DEFAULT_SIMILARITY_THRESHOLD`.
* **top\_k** (int, optional): The number of top results to return during retrieval. Must be a positive integer. Defaults to `DEFAULT_TOP_K_RESULTS`.

**Returns:**

List\[Dict\[str, Any]]: Concatenated list of the query results.
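
The roles of `top_k` and `similarity_threshold` can be illustrated with a small, self-contained sketch. It is not the library's code: `select_results` is a hypothetical helper, the `"text"` and `"score"` keys are invented for the example, the numeric values are arbitrary, and treating the threshold as inclusive (`>=`) is an assumption.

```python
from typing import Any, Dict, List


def select_results(
    scored: List[Dict[str, Any]],
    top_k: int,
    similarity_threshold: float,
) -> List[Dict[str, Any]]:
    """Keep results at or above the threshold, then return the best top_k."""
    if top_k <= 0:
        raise ValueError("top_k must be a positive integer.")
    # Assumption: the threshold is inclusive.
    kept = [r for r in scored if r["score"] >= similarity_threshold]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:top_k]


# Hypothetical scored candidates, as a vector search might produce them.
scored = [
    {"text": "a", "score": 0.9},
    {"text": "b", "score": 0.4},
    {"text": "c", "score": 0.8},
]

best_two = select_results(scored, top_k=2, similarity_threshold=0.5)
none_pass = select_results(scored, top_k=2, similarity_threshold=0.95)
```

Here `best_two` contains the entries for `"a"` and `"c"`, in that order, and `none_pass` is empty because no candidate reaches the stricter threshold. A caller should therefore be prepared for fewer than `top_k` results.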
