camel.loaders.firecrawl_reader

Firecrawl

class Firecrawl:

Firecrawl allows you to turn entire websites into LLM-ready markdown.

Parameters:

api_key (Optional[str]): API key for authenticating with the Firecrawl API.
api_url (Optional[str]): Base URL for the Firecrawl API.
References:
https://docs.firecrawl.dev/introduction

__init__

def __init__(
    self,
    api_key: Optional[str] = None,
    api_url: Optional[str] = None
):
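
Because both constructor arguments are optional, one common pattern is to collect them from the environment before building the client. The following is a minimal sketch, not the library's own behavior: the environment variable names `FIRECRAWL_API_KEY` and `FIRECRAWL_API_URL`, the helper names, and the import path `camel.loaders` are all assumptions made for illustration.

```python
import os
from typing import Any, Dict, Mapping, Optional


def firecrawl_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect constructor arguments from environment variables.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, str] = {}
    if env.get("FIRECRAWL_API_KEY"):
        kwargs["api_key"] = env["FIRECRAWL_API_KEY"]
    if env.get("FIRECRAWL_API_URL"):
        kwargs["api_url"] = env["FIRECRAWL_API_URL"]
    return kwargs


def build_client() -> Any:
    """Create the client. Requires the camel-ai package to be installed."""
    from camel.loaders import Firecrawl  # assumed import path

    return Firecrawl(**firecrawl_kwargs())
```

Unset variables are simply omitted, so the constructor falls back to its own `None` defaults.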

crawl

def crawl(
    self,
    url: str,
    params: Optional[Dict[str, Any]] = None,
    **kwargs: Any
):

Crawl a URL and all accessible subpages. Customize the crawl by setting different parameters, and receive the full response or a job ID based on the specified options.

Parameters:

url (str): The URL to crawl.
params (Optional[Dict[str, Any]]): Additional parameters for the crawl request. Defaults to None.
**kwargs (Any): Additional keyword arguments, such as poll_interval and idempotency_key.

Returns: Any: The crawl job ID, or the crawl results if the call waits until completion.
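
To illustrate how `params` and `**kwargs` travel through `crawl`, here is a small sketch of a wrapper that merges caller overrides into a set of defaults. The wrapper name, the `DEFAULT_CRAWL_PARAMS` dictionary, and the `limit` key are hypothetical; the keys that are actually accepted are defined by the Firecrawl API, not by this example.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults; the accepted keys are defined by the Firecrawl API.
DEFAULT_CRAWL_PARAMS: Dict[str, Any] = {"limit": 10}


def crawl_with_defaults(
    client: Any,
    url: str,
    overrides: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Any:
    """Call client.crawl with defaults merged under the caller's overrides."""
    params = {**DEFAULT_CRAWL_PARAMS, **(overrides or {})}
    # Extra keyword arguments (for example poll_interval) pass straight through.
    return client.crawl(url, params=params, **kwargs)
```

The `client` argument is any object exposing the `crawl` method documented above, which keeps the helper easy to exercise with a stub.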

check_crawl_job

def check_crawl_job(self, job_id: str):

Check the status of a crawl job.

Parameters:

job_id (str): The ID of the crawl job.

Returns: Dict: The response including status of the crawl job.
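
When `crawl` returns a job ID rather than results, the job can be polled with `check_crawl_job`. The sketch below shows one way to do that under stated assumptions: it assumes the response dictionary has a `"status"` key and that `"completed"` and `"failed"` are terminal values, neither of which is confirmed by this page. The helper name and its `timeout` and `sleep` parameters are hypothetical.

```python
import time
from typing import Any, Callable, Dict


def wait_for_crawl(
    client: Any,
    job_id: str,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll check_crawl_job until the job reaches a terminal state.

    Assumes the response has a "status" key and that "completed" and
    "failed" are terminal values; verify this against a real response.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = client.check_crawl_job(job_id)
        if response.get("status") in ("completed", "failed"):
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl job {job_id} did not finish in {timeout}s")
        sleep(poll_interval)
```

Injecting `sleep` makes the loop testable without real delays; in normal use the default `time.sleep` applies.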

scrape

def scrape(self, url: str, params: Optional[Dict[str, str]] = None):

Scrape a single URL. This function supports advanced scraping through additional parameters and returns the full scraped data as a dictionary.

Reference: https://docs.firecrawl.dev/advanced-scraping-guide

Parameters:

url (str): The URL to read.
params (Optional[Dict[str, str]]): Additional parameters for the scrape request.

Returns: Dict[str, str]: The scraped data.
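
Since `scrape` handles one URL per call, a batch can be built by looping over it. The following sketch collects results and failures separately, both keyed by URL. The helper name is hypothetical, and because this page does not document which exception `scrape` raises, the sketch catches `Exception` broadly.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def scrape_many(
    client: Any,
    urls: Iterable[str],
    params: Optional[Dict[str, str]] = None,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Scrape each URL once and return (results, errors), both keyed by URL."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for url in urls:
        try:
            results[url] = client.scrape(url, params=params)
        except Exception as exc:  # the raised exception type is not documented here
            errors[url] = str(exc)
    return results, errors
```

A failure on one URL does not stop the rest of the batch, which is usually what a loader pipeline wants.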

structured_scrape

def structured_scrape(self, url: str, response_format: BaseModel):

Use an LLM to extract structured data from the given URL.

Parameters:

url (str): The URL to read.
response_format (BaseModel): A pydantic model whose value types and field descriptions define the schema the LLM uses to generate a structured response, that is, the expected output format.

Returns: Dict: The structured content extracted from the URL.

map_site

def map_site(self, url: str, params: Optional[Dict[str, Any]] = None):

Map a website to retrieve all accessible URLs.

Parameters:

url (str): The URL of the site to map.
params (Optional[Dict[str, Any]]): Additional parameters for the map request. Defaults to None.

Returns: list: A list containing the URLs found on the site.
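
The list returned by `map_site` is often narrowed before crawling or scraping. As a rough sketch, the helper below keeps only URLs whose path starts with a given prefix, then de-duplicates and sorts them. It assumes the list contains plain URL strings; the helper name and the `path_prefix` parameter are hypothetical.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse


def map_section(
    client: Any,
    url: str,
    path_prefix: str,
    params: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Return sorted, de-duplicated URLs from map_site under path_prefix."""
    found = client.map_site(url, params=params)
    # Compare on the parsed path so scheme and host do not affect the match.
    keep = {u for u in found if urlparse(u).path.startswith(path_prefix)}
    return sorted(keep)
```

For example, `map_section(client, "https://example.com", "/docs/")` would keep only the documentation pages of a mapped site.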
