class MarkItDownToolkit(BaseToolkit):
A class representing a toolkit for MarkItDown.
.. deprecated::
MarkItDownToolkit is deprecated. Use FileToolkit instead: its
read_file method provides the same functionality and accepts either
a single file or a list of files.
Example migration:
Old way
from camel.toolkits import MarkItDownToolkit
toolkit = MarkItDownToolkit()
content = toolkit.read_files(['file1.pdf', 'file2.docx'])
New way
from camel.toolkits import FileToolkit
toolkit = FileToolkit()
content = toolkit.read_file(['file1.pdf', 'file2.docx'])
__init__
def __init__(self, timeout: Optional[float] = None):
read_files
def read_files(self, file_paths: List[str]):
Scrapes content from a list of files and converts it to Markdown.
This function takes a list of local file paths, attempts to convert
each file into Markdown format, and returns the converted content.
The conversion is performed in parallel for efficiency.
Supported file formats include:
- PDF (.pdf)
- Microsoft Office: Word (.doc, .docx), Excel (.xls, .xlsx),
PowerPoint (.ppt, .pptx)
- EPUB (.epub)
- HTML (.html, .htm)
- Images (.jpg, .jpeg, .png) for OCR
- Audio (.mp3, .wav) for transcription
- Text-based formats (.csv, .json, .xml, .txt)
- ZIP archives (.zip)
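The format list above can be turned into a pre-check before passing paths to the toolkit. The following is a minimal sketch: `SUPPORTED_EXTENSIONS` and `split_by_support` are hypothetical names introduced here for illustration and are not part of the toolkit, and the toolkit itself may decide support differently (for example by inspecting file content rather than the extension).

```python
from pathlib import Path
from typing import List, Tuple

# Extensions copied from the supported-formats list in this section.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".epub", ".html", ".htm", ".jpg", ".jpeg", ".png",
    ".mp3", ".wav", ".csv", ".json", ".xml", ".txt", ".zip",
}


def split_by_support(file_paths: List[str]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: partition paths into (supported, unsupported)."""
    supported: List[str] = []
    unsupported: List[str] = []
    for path in file_paths:
        # Compare the lowercased final suffix so "REPORT.PDF" matches ".pdf".
        if Path(path).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported
```

Only the final suffix is checked, so a name such as `archive.tar.gz` is treated as `.gz` and lands in the unsupported group.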
Parameters:
- file_paths (List[str]): A list of local file paths to be converted.
Returns:
Dict[str, str]: A dictionary where keys are the input file paths
and values are the corresponding content in Markdown format.
If conversion of a file fails, the value will contain an
error message.
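The behavior described above (parallel conversion, one dictionary entry per input path, an error message in place of content on failure) can be sketched with the standard library alone. This is an illustration of the pattern, not the toolkit's actual implementation: `convert_all`, `fake_convert`, the `max_workers` default, and the error-message wording are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def convert_all(
    file_paths: List[str],
    convert_one: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Convert files concurrently; failures become error-message strings."""

    def safe_convert(path: str) -> str:
        try:
            return convert_one(path)
        except Exception as exc:
            # The toolkit's real error text is not documented here;
            # this format is assumed for illustration only.
            return f"Error converting {path}: {exc}"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so zip pairs them correctly.
        results = list(pool.map(safe_convert, file_paths))
    return dict(zip(file_paths, results))


def fake_convert(path: str) -> str:
    """Stand-in converter used only to make the sketch runnable."""
    if path.endswith(".bad"):
        raise ValueError("unsupported format")
    return f"# Contents of {path}"


if __name__ == "__main__":
    for path, content in convert_all(["a.pdf", "b.bad"], fake_convert).items():
        print(path, "->", content)
```

Catching the exception inside each worker means one failing file does not abort the batch, which matches the documented contract that every input path appears as a key in the result.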
get_tools
def get_tools(self):
Returns:
List[FunctionTool]: A list of FunctionTool objects
representing the functions in the toolkit.