Camel.loaders.firecrawl reader
Firecrawl
Firecrawl allows you to turn entire websites into LLM-ready markdown.
Parameters:
- api_key (Optional[str]): API key for authenticating with the Firecrawl API.
- api_url (Optional[str]): Base URL for the Firecrawl API.
- References:
- https://docs.firecrawl.dev/introduction
init
crawl
Crawl a URL and all accessible subpages. Customize the crawl by setting different parameters, and receive the full response or a job ID based on the specified options.
Parameters:
- url (str): The URL to crawl.
- params (Optional[Dict[str, Any]]): Additional parameters for the crawl request. Defaults to None.
- **kwargs (Any): Additional keyword arguments, such as poll_interval and idempotency_key.
Returns:
Any: The crawl job ID or the crawl results if waiting until completion.
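A minimal usage sketch, assuming the `camel` package is installed and a Firecrawl API key is configured; the example URL and the `limit` key in `params` are illustrative, and the exact accepted keys depend on the Firecrawl API version:

```python
import os


def crawl_docs(url: str):
    # Imported inside the function so this sketch can be read and loaded
    # even in an environment without camel installed.
    from camel.loaders import Firecrawl

    firecrawl = Firecrawl()  # reads FIRECRAWL_API_KEY from the environment
    # `params` is passed through to the Firecrawl crawl endpoint;
    # "limit" (max pages) is an assumed, illustrative option.
    return firecrawl.crawl(url=url, params={"limit": 5})


# Only hit the network when an API key is actually configured.
if os.environ.get("FIRECRAWL_API_KEY"):
    response = crawl_docs("https://docs.firecrawl.dev")
    print(response)
```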
check_crawl_job
Check the status of a crawl job.
Parameters:
- job_id (str): The ID of the crawl job.
Returns:
Dict: The response, including the status of the crawl job.
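A sketch of polling a crawl job with check_crawl_job until it finishes. The "completed" status value is an assumption about the Firecrawl API's status strings; the helper name and polling limits are illustrative:

```python
import time


def wait_for_crawl(firecrawl, job_id: str, poll_seconds: float = 2.0, max_polls: int = 30):
    """Poll check_crawl_job until the job reports completion (sketch)."""
    for _ in range(max_polls):
        status = firecrawl.check_crawl_job(job_id)
        # "completed" is the assumed terminal status; other values
        # (e.g. "scraping") mean the job is still running.
        if status.get("status") == "completed":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Crawl job {job_id} did not complete in time")
```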
scrape
Scrape a single URL. This method supports advanced scraping by setting different parameters and returns the full scraped data as a dictionary.
Reference: https://docs.firecrawl.dev/advanced-scraping-guide
Parameters:
- url (str): The URL to read.
- params (Optional[Dict[str, Any]]): Additional parameters for the scrape request.
Returns:
Dict: The scraped data.
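A hedged sketch of scraping one page. The `formats` key in `params` is an assumed Firecrawl scrape option; treat the exact keys as version-dependent and see the advanced-scraping guide linked above:

```python
import os


def scrape_page(url: str):
    # Imported inside the function so the sketch loads without camel installed.
    from camel.loaders import Firecrawl

    firecrawl = Firecrawl()
    # Request markdown output; "formats" is an assumed, illustrative option.
    return firecrawl.scrape(url, params={"formats": ["markdown"]})


# Guard the network call behind an API-key check.
if os.environ.get("FIRECRAWL_API_KEY"):
    data = scrape_page("https://docs.firecrawl.dev/advanced-scraping-guide")
    print(data)
```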
structured_scrape
Use an LLM to extract structured data from a given URL.
Parameters:
- url (str): The URL to read.
- response_format (BaseModel): A pydantic model that includes value types and field descriptions used to generate a structured response by LLM. This schema helps in defining the expected output format.
Returns:
Dict: The structured data extracted from the URL.
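A sketch of structured extraction with a pydantic schema. `ArticleInfo` and its fields are hypothetical; the field names and types in your model drive what the LLM extracts:

```python
import os

from pydantic import BaseModel


class ArticleInfo(BaseModel):
    """Hypothetical schema: field names/types define the expected output."""
    title: str
    summary: str


def extract_article(url: str):
    # Imported inside the function so the sketch loads without camel installed.
    from camel.loaders import Firecrawl

    firecrawl = Firecrawl()
    return firecrawl.structured_scrape(url, response_format=ArticleInfo)


if os.environ.get("FIRECRAWL_API_KEY"):
    info = extract_article("https://docs.firecrawl.dev/introduction")
    print(info)
```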
map_site
Map a website to retrieve all accessible URLs.
Parameters:
- url (str): The URL of the site to map.
- params (Optional[Dict[str, Any]]): Additional parameters for the map request. Defaults to None.
Returns:
list: A list containing the URLs found on the site.
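A sketch of mapping a site and post-filtering the returned URLs. The filtering helper and the "/docs" substring are illustrative, not part of the loader's API:

```python
import os


def filter_urls(urls, substring):
    # Pure helper: keep only URLs containing the given substring.
    return [u for u in urls if substring in u]


def list_doc_urls(url: str):
    # Imported inside the function so the sketch loads without camel installed.
    from camel.loaders import Firecrawl

    firecrawl = Firecrawl()
    # map_site returns a list of URL strings found on the site.
    return filter_urls(firecrawl.map_site(url), "/docs")


if os.environ.get("FIRECRAWL_API_KEY"):
    print(list_doc_urls("https://docs.firecrawl.dev"))
```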