Glossary

Cosine similarity

Cosine similarity scores how closely two embeddings point in the same direction — the standard way semantic search ranks which passages are most relevant to a query.

Cosine similarity is a measure of how closely two vectors point in the same direction. In the context of text search, those vectors are embeddings — numeric representations of meaning — and a high cosine similarity score means two pieces of text are semantically close, regardless of whether they share any words.
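Written out, for two vectors a and b with components a_i and b_i (symbols chosen here for illustration, not taken from any particular system), the standard definition is:

```latex
\[
\operatorname{cos\_sim}(\mathbf{a}, \mathbf{b})
  = \cos\theta
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2} \, \sqrt{\sum_i b_i^2}}
\]
```

The score ranges from -1 to 1 and depends only on direction: scaling either vector by a positive number leaves it unchanged.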

Why it matters

When a semantic search system embeds your query and every passage in a document, it needs a way to decide which passages are most relevant. Cosine similarity provides that: it computes the cosine of the angle between the query vector and each passage vector. Passages that sit close to the query in meaning, expressing the same idea in different words, score near 1.0; passages on unrelated topics score near 0.
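As a rough illustration of that ranking step, here is a minimal Python sketch. The three-dimensional vectors and passage labels are invented for the example (real embeddings come from a model and have hundreds or thousands of dimensions), and returning 0.0 for a zero-length vector is a convention assumed here, not a standard.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: a zero vector matches nothing
    return dot / (norm_a * norm_b)

# Toy "embeddings", made up for illustration only.
query = [0.9, 0.1, 0.0]
passages = {
    "terminating a contract": [0.8, 0.2, 0.1],
    "office holiday schedule": [0.0, 0.1, 0.9],
}

# Rank passage labels from most to least similar to the query.
ranked = sorted(
    passages,
    key=lambda name: cosine_similarity(query, passages[name]),
    reverse=True,
)

for name in ranked:
    print(f"{cosine_similarity(query, passages[name]):.3f}  {name}")
```

With these toy numbers the contract passage scores close to 1 and the holiday passage close to 0, so the contract passage ranks first.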

The intuition is geometric. A passage about "terminating a contract" and a query for "cancelling an agreement" encode similar meanings, so their vectors point roughly the same way through the high-dimensional space embeddings live in. Cosine similarity captures that alignment directly, which is why it's the standard scoring function for vector-based retrieval.

For most retrieval systems, cosine similarity produces a good but imperfect first-pass ranking. That's where reranking comes in: a second-pass model re-scores the top candidates by reading query and passage together, correcting the cases where cosine similarity ranked a topically-adjacent-but-wrong passage ahead of the one that actually answers the question.
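The two-pass flow can be sketched roughly as follows. Everything here is hypothetical: the function names, the toy vectors, and especially `stand_in_reranker`, which looks up made-up scores in place of a real reranking model. It shows only the shape of the pipeline, not how any particular system implements it.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def first_pass(query_vec, passages):
    """Order all passages by cosine similarity to the query vector."""
    return sorted(
        passages,
        key=lambda p: cosine_similarity(query_vec, p["vec"]),
        reverse=True,
    )

def retrieve_then_rerank(query_text, query_vec, passages, rerank_score, top_k=2):
    """Keep the top_k first-pass candidates, then re-order them with rerank_score."""
    candidates = first_pass(query_vec, passages)[:top_k]
    return sorted(
        candidates,
        key=lambda p: rerank_score(query_text, p["text"]),
        reverse=True,
    )

# Toy data, invented for illustration.
passages = [
    {"text": "How to cancel an agreement early", "vec": [0.8, 0.3, 0.0]},
    {"text": "History of contract law", "vec": [0.9, 0.1, 0.0]},
    {"text": "Office holiday schedule", "vec": [0.0, 0.1, 0.9]},
]
query_text = "cancelling an agreement"
query_vec = [0.9, 0.1, 0.1]

# Placeholder for a reranking model: a real one would read the query and
# the passage together and return a relevance score. These are made up.
made_up_scores = {
    "How to cancel an agreement early": 0.95,
    "History of contract law": 0.40,
}

def stand_in_reranker(query_text, passage_text):
    return made_up_scores.get(passage_text, 0.0)

cosine_order = [p["text"] for p in first_pass(query_vec, passages)]
results = retrieve_then_rerank(query_text, query_vec, passages, stand_in_reranker)
final_order = [p["text"] for p in results]
```

In this toy setup the first pass puts the topically adjacent "History of contract law" on top, and the second pass moves the passage that actually answers the question ahead of it. Reranking only the top few candidates keeps the slower second pass cheap.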

Together, cosine similarity and reranking form the relevance backbone that lets a system surface the right passage — and cite the sentence behind the answer.
