MarkItDownToolkit
Old way
    from camel.toolkits import MarkItDownToolkit

    toolkit = MarkItDownToolkit()
    content = toolkit.read_files(['file1.pdf', 'file2.docx'])
New way
    from camel.toolkits import FileToolkit

    toolkit = FileToolkit()
    content = toolkit.read_file(['file1.pdf', 'file2.docx'])
__init__
read_files
Supported file formats:
- PDF (.pdf)
- Microsoft Office: Word (.doc, .docx), Excel (.xls, .xlsx), PowerPoint (.ppt, .pptx)
- EPUB (.epub)
- HTML (.html, .htm)
- Images (.jpg, .jpeg, .png) for OCR
- Audio (.mp3, .wav) for transcription
- Text-based formats (.csv, .json, .xml, .txt)
- ZIP archives (.zip)

Parameters:
- file_paths (List[str]): A list of local file paths to be converted.
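
Because the conversion only covers the formats listed above, it can help to separate convertible paths from the rest before building `file_paths`. The following is a minimal sketch using only the standard library. `SUPPORTED_EXTENSIONS` and `split_by_support` are hypothetical names introduced for illustration and are not part of the toolkit; the extension set is copied from the list above and may not match what a given installed version accepts.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Assumption: extensions copied by hand from the supported-formats list above.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf",
    ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".epub",
    ".html", ".htm",
    ".jpg", ".jpeg", ".png",
    ".mp3", ".wav",
    ".csv", ".json", ".xml", ".txt",
    ".zip",
})


def split_by_support(file_paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: partition paths into (supported, unsupported)."""
    supported: List[str] = []
    unsupported: List[str] = []
    for path in file_paths:
        # Compare case-insensitively so "REPORT.PDF" is treated like "report.pdf".
        if Path(path).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["file1.pdf", "file2.docx", "notes.md"])
    print("convertible:", ok)
    print("skipped:", skipped)
    # With CAMEL installed, the convertible paths could then be passed on:
    # from camel.toolkits import MarkItDownToolkit
    # content = MarkItDownToolkit().read_files(ok)
```

The check looks only at the file extension, so it says nothing about whether a file exists or is well formed; the toolkit call itself remains the place where conversion errors surface.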