Data
Embedding (AI)
An embedding is a dense numerical vector that represents the semantic content of a piece of text, an image, or an audio clip in a continuous vector space. Items with similar meaning have vectors that are close together under a distance measure such as cosine similarity or Euclidean distance, which enables similarity search, clustering, and classification without hand-crafted features.
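To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The three 4-dimensional vectors are invented for illustration; real embeddings come from a model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, NOT produced by any real model.
cat = [0.9, 0.1, 0.0, 0.3]
kitten = [0.8, 0.2, 0.1, 0.4]
invoice = [0.0, 0.9, 0.7, 0.1]

# Related concepts should score higher than unrelated ones.
print("cat vs kitten: ", round(cosine_similarity(cat, kitten), 3))
print("cat vs invoice:", round(cosine_similarity(cat, invoice), 3))
```

With these made-up vectors, "cat" scores much closer to "kitten" than to "invoice", which is the property that similarity search relies on.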
Embeddings are produced by encoder models (for example, OpenAI's text-embedding-3 family, Cohere Embed, and Google's textembedding-gecko) and are typically stored in vector databases for retrieval. Embedding quality sets a ceiling on RAG retrieval quality: if the vectors capture meaning poorly, nearest-neighbor search returns irrelevant documents, and the LLM then generates its answer from the wrong context. Embedding model selection is therefore one of the highest-leverage decisions in RAG system design.
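The retrieval step described above can be sketched as a brute-force nearest-neighbor search. This is only an illustration under stated assumptions: the document IDs, the 3-dimensional vectors, and the `top_k` helper are hypothetical, and a real vector database would normally use an approximate nearest-neighbor index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=10):
    """Return the k (doc_id, score) pairs most similar to query_vec, best first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical in-memory "vector store": doc_id -> embedding (made-up values).
index = {
    "refund-policy": [0.1, 0.9, 0.2],
    "shipping-times": [0.8, 0.1, 0.3],
    "return-window": [0.2, 0.8, 0.3],
}

# Stands in for the embedding of a user question about refunds.
query = [0.15, 0.85, 0.25]

results = top_k(query, index, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

In a RAG pipeline, the text of the returned documents would be passed to the LLM as context. If the embedding model placed "shipping-times" near the refund question instead, the LLM would be answering from the wrong material, however capable it is.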
Related terms
- Embedding — An embedding is a fixed-length vector of numbers that represents the semantic meaning of a piece of text (or image, audio).
- Vector Database — A vector database is a database optimized for storing embeddings and answering similarity queries ("give me the 10 most similar items to this one").
- RAG (Retrieval-Augmented Generation) — RAG is a pattern in which an AI model retrieves relevant documents from a knowledge base at query time and uses them as additional context to generate its response.
- Semantic Search — Semantic search is a retrieval approach that finds documents by meaning rather than keyword overlap.