A vector embedding is a list of numbers that represents a piece of text in a way that captures its meaning, so that passages with similar meanings end up close together and unrelated ones end up far apart.
What it actually is
A model reads a sentence and outputs an array of numbers, often hundreds or thousands of them, called a vector. You can think of each vector as coordinates for a point in a very high-dimensional space. The key property is that distance in that space reflects meaning, not wording: "cancel my subscription" and "how do I end my plan" land near each other even though they share almost no words, while "cancel my subscription" and "renew for another year" sit farther apart even though both are about the same subscription. That is what lets software compare ideas rather than just match keywords.
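To make the idea of closeness concrete, here is a minimal sketch using cosine similarity, one common way to compare two vectors. The three-number vectors are hand-written for illustration and are not output from a real model (real embeddings have far more dimensions), and the variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors.

    Close to 1.0 means the vectors point the same way; close to 0.0 means
    they are unrelated.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, made up by hand. A real model would produce these.
cancel = [0.9, 0.1, 0.2]    # "cancel my subscription"
end_plan = [0.8, 0.2, 0.3]  # "how do I end my plan"
renew = [0.1, 0.9, 0.2]     # "renew for another year"

sim_close = cosine_similarity(cancel, end_plan)
sim_far = cosine_similarity(cancel, renew)

print(f"cancel vs end plan: {sim_close:.2f}")
print(f"cancel vs renew:    {sim_far:.2f}")
```

With these toy numbers the first pair scores much higher than the second, which is the behavior a real embedding model is trained to produce for phrases with similar meaning.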
Why it matters
Keyword search fails the moment the reader and the document use different words for the same thing. Embeddings fix this. By turning both the question and every passage into vectors, a system can rank passages by closeness in meaning — the basis of semantic search.
How it underpins retrieval
Embeddings are the engine behind retrieval-augmented generation:
- Each chunk of a document is embedded once and stored.
- Your question is embedded at query time.
- The system finds the passages whose vectors are nearest to the question's vector and feeds those to the model as context.
This is how an assistant answers from a 90-page PDF without reading all 90 pages every time.
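The three steps above can be sketched in a few lines. This is an illustration under loud assumptions, not how any particular product works: `embed` is a hypothetical stand-in that maps words onto a hand-made table of concepts, where a real system would call a trained embedding model, and `build_index` and `retrieve` are illustrative names.

```python
import math

# ASSUMPTION: a tiny hand-made lexicon standing in for a real embedding model.
# Words with the same meaning share an axis, so synonyms produce similar vectors.
CONCEPTS = {
    "cancel": 0, "end": 0, "stop": 0,
    "subscription": 1, "plan": 1,
    "renew": 2, "extend": 2,
    "refund": 3, "money": 3,
}

def embed(text):
    """Toy embedding: count how often each concept appears in the text."""
    vec = [0.0] * 4
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word in CONCEPTS:
            vec[CONCEPTS[word]] += 1.0
    return vec

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # all-zero vectors match nothing

def build_index(chunks):
    """Step 1: embed each chunk once and store it next to its text."""
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, question, k=2):
    """Steps 2 and 3: embed the question, then return the k nearest chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

chunks = [
    "To stop your subscription, open Settings and choose Billing.",
    "You can extend your plan for another year at a discount.",
    "Refund requests are reviewed within five business days.",
]
index = build_index(chunks)
context = retrieve(index, "How do I end my plan?", k=1)
print(context)
```

The question and the top-ranked chunk share no content words, yet the chunk about stopping a subscription still wins because both map to the same concepts. Only the retrieved text would then be handed to the model as context, and the index is built once rather than on every question.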
When you ask Sidenote about a document, embeddings help locate the exact passages that bear on your question, so the answer is built from real text in front of you — and every claim is backed by a citation that scrolls to its source.