# camel.retrievers.bm25\_retriever

<a id="camel.retrievers.bm25_retriever" />

<a id="camel.retrievers.bm25_retriever.BM25Retriever" />

## BM25Retriever

```python theme={"system"}
class BM25Retriever(BaseRetriever):
```

An implementation of the `BaseRetriever` using the `BM25` model.

This class facilitates the retrieval of relevant information using a
query-based approach. It ranks documents based on the occurrence and
frequency of the query terms.

**Parameters:**

* **bm25** (BM25Okapi): An instance of the BM25Okapi class used for calculating document scores.
* **content\_input\_path** (str): The path to the content that has been processed and stored.
* **unstructured\_modules** (UnstructuredIO): A module for parsing files and URLs and chunking content based on specified parameters.
* **References**: https://github.com/dorianbrown/rank_bm25
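
To make "ranks documents based on the occurrence and frequency of the query terms" concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is an illustration under stated assumptions, not the code this class runs: the function name `bm25_scores`, the whitespace tokenization, the defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all choices made for this example, and `rank_bm25`'s `BM25Okapi` may handle IDF differently.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query_tokens: List[str],
    corpus_tokens: List[List[str]],
    k1: float = 1.5,  # assumed default: term-frequency saturation
    b: float = 0.75,  # assumed default: length-normalization strength
) -> List[float]:
    """Score every tokenized document against the query (Okapi BM25 sketch)."""
    if not corpus_tokens:
        return []
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Saturating term frequency, normalized by document length.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "the stock market fell sharply today".split(),
]
scores = bm25_scores("cat mat".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
```

Only the first document contains the exact tokens `cat` and `mat`, so it is the only one with a non-zero score. Matching is purely lexical here: `cats` in the second document does not match `cat`.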

<a id="camel.retrievers.bm25_retriever.BM25Retriever.__init__" />

### \_\_init\_\_

```python theme={"system"}
def __init__(self):
```

Initializes the BM25Retriever.

<a id="camel.retrievers.bm25_retriever.BM25Retriever.process" />

### process

```python theme={"system"}
def process(
    self,
    content_input_path: str,
    chunk_type: str = 'chunk_by_title',
    **kwargs: Any
):
```

Processes content from a file or URL, divides it into chunks using
`Unstructured IO`, and stores the chunks internally. This method must be
called before executing queries with the retriever.

**Parameters:**

* **content\_input\_path** (str): File path or URL of the content to be processed.
* **chunk\_type** (str): The type of chunking to apply. Defaults to "chunk\_by\_title".
* **\*\*kwargs** (Any): Additional keyword arguments for content parsing.
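
As a rough intuition for what title-based chunking produces, the sketch below starts a new chunk at every Markdown-style heading. This is a deliberate simplification and not how `Unstructured IO` works internally: the function `chunk_by_title_sketch`, the `#` heading rule, and the `title`/`text` keys are hypothetical names invented for this example.

```python
from typing import Dict, List


def chunk_by_title_sketch(text: str) -> List[Dict[str, str]]:
    """Split text into chunks, starting a new chunk at each '#' heading."""
    chunks: List[Dict[str, str]] = []
    title: str = ""
    body: List[str] = []

    def flush() -> None:
        # Store the chunk collected so far, skipping empty leading state.
        if title or body:
            chunks.append({"title": title, "text": "\n".join(body).strip()})

    for line in text.splitlines():
        if line.startswith("#"):
            flush()  # a heading closes the previous chunk
            title, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    flush()  # close the final chunk
    return chunks


sample = "# Intro\nBM25 ranks documents.\n# Usage\nCall process, then query."
chunks = chunk_by_title_sketch(sample)
```

Each resulting chunk becomes one "document" for ranking purposes, which is why the chunking strategy affects which passages a query can surface.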

<a id="camel.retrievers.bm25_retriever.BM25Retriever.query" />

### query

```python theme={"system"}
def query(self, query: str, top_k: int = DEFAULT_TOP_K_RESULTS):
```

Executes a query and compiles the results.

**Parameters:**

* **query** (str): Query string for information retrieval.
* **top\_k** (int, optional): The number of top results to return during retrieval. Must be a positive integer. Defaults to `DEFAULT_TOP_K_RESULTS`.

**Returns:**

List\[Dict\[str]]: Concatenated list of the query results.
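
The contract described above (call `process` first, pass a positive `top_k`, receive a list of result dictionaries) can be sketched with a toy stand-in. Everything here is hypothetical: `ToyRetriever`, its term-count scoring, the `score`/`text` keys, and the choice of `ValueError` are assumptions for illustration and do not reflect the actual result fields or exceptions of `BM25Retriever`.

```python
import heapq
from typing import Dict, List, Optional, Union


class ToyRetriever:
    """Hypothetical stand-in that scores chunks by counting query-term hits."""

    def __init__(self) -> None:
        self.chunks: Optional[List[str]] = None  # filled in by process()

    def process(self, chunks: List[str]) -> None:
        self.chunks = list(chunks)

    def query(
        self, query: str, top_k: int = 1
    ) -> List[Dict[str, Union[str, int]]]:
        if top_k <= 0:
            raise ValueError("top_k must be a positive integer.")
        if self.chunks is None:
            raise ValueError("Call process() before query().")
        terms = query.lower().split()
        scored = [
            (sum(chunk.lower().split().count(t) for t in terms), chunk)
            for chunk in self.chunks
        ]
        # Keep only the top_k highest-scoring chunks, best first.
        top = heapq.nlargest(top_k, scored, key=lambda pair: pair[0])
        return [{"score": score, "text": chunk} for score, chunk in top]


retriever = ToyRetriever()
retriever.process(["bm25 ranks text", "cats chase mice", "bm25 bm25 scoring"])
results = retriever.query("bm25", top_k=2)
```

With `top_k=2`, the chunk mentioning `bm25` twice comes first, followed by the chunk mentioning it once; the unrelated chunk is dropped.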
