Automation
OCR (Optical Character Recognition)
OCR is the technology that converts images of printed or handwritten text into machine-readable characters. In AI document pipelines, OCR is the first stage of intelligent document processing: it extracts raw text from scanned invoices, contracts, and forms before natural language processing (NLP) and large language model (LLM) layers classify, extract, and route the content.
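The staging described above (OCR first, then classification, extraction, and routing) can be sketched as follows. This is a minimal illustration, not a reference implementation. The names `OcrEngine`, `classify`, `extract_fields`, `ROUTES`, `process`, and `fake_ocr` are hypothetical, and the keyword and regex rules stand in for the NLP and LLM layers a real pipeline would use. The OCR step is passed in as a callable so that a real engine (for example Tesseract) could be substituted for the stub.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that turns image bytes into raw text.
OcrEngine = Callable[[bytes], str]


@dataclass
class ProcessedDocument:
    doc_type: str
    fields: dict[str, str]
    queue: str


def classify(text: str) -> str:
    """Keyword-based stand-in for an NLP/LLM document classifier."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "agreement" in lowered or "contract" in lowered:
        return "contract"
    return "unknown"


def extract_fields(text: str, doc_type: str) -> dict[str, str]:
    """Regex-based stand-in for an NLP/LLM field extractor."""
    fields: dict[str, str] = {}
    if doc_type == "invoice":
        number = re.search(
            r"invoice\s*(?:no\.?|number|#)\s*:?\s*([A-Z0-9-]+)", text, re.IGNORECASE
        )
        if number:
            fields["invoice_number"] = number.group(1)
        total = re.search(
            r"total\s*:?\s*\$?\s*([0-9][0-9,]*\.\d{2})", text, re.IGNORECASE
        )
        if total:
            fields["total"] = total.group(1)
    return fields


# Assumed routing table: document type -> downstream queue name.
ROUTES = {"invoice": "accounts_payable", "contract": "legal_review"}


def process(image: bytes, ocr: OcrEngine) -> ProcessedDocument:
    text = ocr(image)                                 # stage 1: OCR yields raw text
    doc_type = classify(text)                         # stage 2: classify
    fields = extract_fields(text, doc_type)           # stage 3: extract
    queue = ROUTES.get(doc_type, "manual_review")     # stage 4: route
    return ProcessedDocument(doc_type, fields, queue)


def fake_ocr(image: bytes) -> str:
    """Stub OCR engine returning fixed text, so the sketch runs without an OCR install."""
    return "ACME Corp\nInvoice No: INV-1042\nTotal: $1,250.00\n"


if __name__ == "__main__":
    print(process(b"", fake_ocr))
```

Unrecognized documents fall through to a `manual_review` queue in this sketch, which is one simple way to keep a human checkpoint for text the later stages cannot place.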
Related terms
- Intelligent Document Processing — Intelligent document processing (IDP) is the automated extraction of structured data from unstructured or semi-structured documents — invoices, contracts, clinical notes — using OCR, NLP, and large language models, then routing the data to downstream systems.
- Workflow Automation — Workflow automation is the replacement of repetitive multi-step business processes with software that executes them deterministically, with optional human checkpoints at decision points.