Inference
Inference is the process of running a trained AI model to produce outputs from inputs. Inference cost often dominates the operating budget of LLM deployments and grows with input length, output length, and model size.
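To make the scaling concrete, here is a minimal sketch of a per-request cost estimate. It assumes token-based pricing with separate input and output rates, and treats model size as a choice between price tiers. The names `TokenPricing` and `estimate_request_cost` and all prices are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical per-1,000-token prices in USD; real rates vary by provider and model."""
    usd_per_1k_input_tokens: float
    usd_per_1k_output_tokens: float


def estimate_request_cost(input_tokens: int, output_tokens: int,
                          pricing: TokenPricing) -> float:
    """Estimate the cost of one inference request under linear token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1000) * pricing.usd_per_1k_input_tokens
    output_cost = (output_tokens / 1000) * pricing.usd_per_1k_output_tokens
    return input_cost + output_cost


# Illustrative tiers only: a larger model is assumed to cost more per token.
SMALL_MODEL = TokenPricing(usd_per_1k_input_tokens=0.0005, usd_per_1k_output_tokens=0.0015)
LARGE_MODEL = TokenPricing(usd_per_1k_input_tokens=0.01, usd_per_1k_output_tokens=0.03)

if __name__ == "__main__":
    for name, tier in (("small", SMALL_MODEL), ("large", LARGE_MODEL)):
        cost = estimate_request_cost(input_tokens=2000, output_tokens=500, pricing=tier)
        print(f"{name}: ${cost:.5f} per request")
```

Under these assumptions, cost rises linearly with either token count, and moving to a larger model multiplies the whole estimate. Self-hosted deployments are usually costed by hardware time rather than per token, so this sketch would not apply to them directly.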
Related terms
- LLM (Large Language Model) — A large language model (LLM) is a neural network trained on broad text corpora that can generate, summarize, translate, classify, and reason about natural language.
- AI ROI — AI ROI is the measurable business outcome (revenue, cost reduction, cycle time, error rate, customer satisfaction) attributable to an AI deployment, divided by the total cost of that deployment over a defined period — typically 12–36 months.
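The AI ROI ratio defined above can be sketched as a single division. The function name `ai_roi` and the dollar figures are hypothetical. The sketch follows the ratio as worded in this entry (outcome divided by cost); some definitions of ROI subtract cost from the outcome first.

```python
def ai_roi(attributable_outcome_usd: float, total_cost_usd: float) -> float:
    """Ratio of attributable business outcome to total deployment cost.

    Both amounts are assumed to cover the same period (for example 12 to 36
    months) and the total cost is assumed to include inference spend.
    """
    if total_cost_usd <= 0:
        raise ValueError("total cost must be positive")
    return attributable_outcome_usd / total_cost_usd


if __name__ == "__main__":
    # Illustrative figures only, not measurements.
    print(ai_roi(attributable_outcome_usd=300_000, total_cost_usd=120_000))
```

A ratio above 1 means the attributed outcome exceeded the total cost over the period.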