extract

Method
This is a method of the OCRAgent class in the ocr_agent module.
Extract text from a document or image.

Signature

def extract(source: str, include_image_base64: Optional[bool], pages: Optional[List[int]], image_limit: Optional[int], model: Optional[str]) -> Any

Parameters

source
str
required
URL or path to the document or image
include_image_base64
Optional[bool]
Include base64-encoded images in the response
pages
Optional[List[int]]
Specific pages to extract (for PDFs)
image_limit
Optional[int]
Maximum number of images per page
model
Optional[str]
Override the model for this call
**kwargs
Additional parameters
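
One way to pass the optional parameters is sketched below. It builds a keyword-argument dict that leaves out anything unset, on the assumption that extract falls back to its own defaults for omitted options. `build_extract_kwargs` is a hypothetical helper and not part of OCRAgent. The parameter names come from the signature above, while the values and the `report.pdf` path are made up for illustration. Whether page numbers start at 0 or 1 is not stated here, so treat `[0, 1]` as a placeholder.

```python
from typing import Any, Dict, List, Optional


def build_extract_kwargs(
    include_image_base64: Optional[bool] = None,
    pages: Optional[List[int]] = None,
    image_limit: Optional[int] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect only the options that were actually set."""
    candidates = {
        "include_image_base64": include_image_base64,
        "pages": pages,
        "image_limit": image_limit,
        "model": model,
    }
    # Drop unset (None) options so extract() can apply its own defaults.
    return {name: value for name, value in candidates.items() if value is not None}


# Illustrative values only.
options = build_extract_kwargs(pages=[0, 1], image_limit=2)
# With a real agent (assumed usage): result = agent.extract("report.pdf", **options)
```

Filtering on `is not None` rather than truthiness keeps an explicit `include_image_base64=False` in the call.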

Returns

Any
OCRResponse with pages, markdown content, and metadata

Usage

# Assumes OCRAgent has already been imported from the ocr_agent module.
agent = OCRAgent(llm="mistral/mistral-ocr-latest")
result = agent.extract("https://arxiv.org/pdf/2201.04234")
for page in result.pages:
    print(f"Page {page.index}: {page.markdown}")
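
The per-page loop above could also feed a single combined document. The sketch below shows one possible way to do that, assuming only that the response exposes `pages` and that each page has `index` and `markdown`, as in the usage example. `FakePage`, `FakeResponse`, and `join_markdown` are hypothetical stand-ins so the snippet runs without a network call. They are not part of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakePage:
    """Stand-in for one page of the response (hypothetical, for illustration)."""
    index: int
    markdown: str


@dataclass
class FakeResponse:
    """Stand-in for OCRResponse, modelling only the `pages` attribute."""
    pages: List[FakePage]


def join_markdown(response, separator: str = "\n\n") -> str:
    """Concatenate page markdown in page-index order."""
    ordered = sorted(response.pages, key=lambda page: page.index)
    return separator.join(page.markdown for page in ordered)


# Pages deliberately out of order to show the sort.
sample = FakeResponse(pages=[FakePage(1, "Second page"), FakePage(0, "# Title")])
combined = join_markdown(sample)
print(combined)
```

With a real result, `join_markdown(result)` should work the same way as long as those attribute names hold.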