Automation

OCR (Optical Character Recognition)

OCR is the technology that converts images of printed or handwritten text into machine-readable characters. In AI document pipelines, OCR is the first stage of intelligent document processing: it extracts raw text from scanned invoices, contracts, and forms before NLP and LLM layers classify the document, extract its fields, and route the content.
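The stage ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `fake_ocr` is a hypothetical stand-in for a real OCR engine (it only decodes bytes), the keyword classifier and regex extractor stand in for the NLP and LLM layers, and the queue names in `ROUTES` are invented for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Document:
    image_bytes: bytes
    text: str = ""
    doc_type: str = "unknown"
    fields: Dict[str, str] = field(default_factory=dict)
    queue: str = ""


def fake_ocr(image_bytes: bytes) -> str:
    """Hypothetical OCR stand-in: a real engine would read pixels, not decode bytes."""
    return image_bytes.decode("utf-8")


def classify(text: str) -> str:
    """Toy keyword classifier standing in for an NLP/LLM classification layer."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "agreement" in lowered:
        return "contract"
    return "unknown"


def extract_fields(text: str) -> Dict[str, str]:
    """Toy regex extractor standing in for an NLP/LLM extraction layer."""
    fields: Dict[str, str] = {}
    match = re.search(r"total[:\s]+\$?([\d,]+\.\d{2})", text, re.IGNORECASE)
    if match:
        fields["total"] = match.group(1)
    return fields


# Assumed routing table; the queue names are illustrative only.
ROUTES = {"invoice": "accounts_payable", "contract": "legal_review"}


def process(image_bytes: bytes, ocr: Callable[[bytes], str] = fake_ocr) -> Document:
    doc = Document(image_bytes=image_bytes)
    doc.text = ocr(doc.image_bytes)            # stage 1: OCR produces raw text
    doc.doc_type = classify(doc.text)          # stage 2: classify the document
    doc.fields = extract_fields(doc.text)      # stage 3: extract structured fields
    doc.queue = ROUTES.get(doc.doc_type, "manual_review")  # stage 4: route
    return doc


if __name__ == "__main__":
    result = process(b"INVOICE #1\nTotal: $1,234.50")
    print(result.doc_type, result.fields, result.queue)
```

Passing the OCR function as a parameter keeps the later stages independent of any particular engine, so a real OCR library could be swapped in without changing classification, extraction, or routing.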

Related terms

  • Intelligent Document Processing (IDP): the automated extraction of structured data from unstructured or semi-structured documents (invoices, contracts, clinical notes) using OCR, NLP, and large language models, followed by routing the data to downstream systems.
  • Workflow Automation: the replacement of repetitive multi-step business processes with software that executes them deterministically, with optional human checkpoints at decision points.
