Semantic search finds text by meaning rather than by matching exact words. Instead of looking for the literal keywords you typed, it compares the meaning of your query with the meaning of each passage and retrieves the passages that express the same idea, even when they share no words with your query.
Why it matters
Traditional keyword search is brittle. Search a document for "cancellation policy" and a keyword index will miss a paragraph headed "ending your subscription," because the words don't match. Semantic search closes that gap: it matches concepts, so a search for "how do I leave" can surface the right passage whatever phrasing the author used. For long documents, wikis, and research papers, where the same idea is written a dozen different ways, this is the difference between finding the answer and scrolling forever.
How it works
Semantic search relies on vector embeddings: a model converts each chunk of text into a list of numbers (a vector) that captures its meaning, and converts your query the same way. Closeness between vectors is usually measured with a similarity score such as cosine similarity. Passages whose vectors sit closest to the query vector are treated as the most relevant, and those are returned first.
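To make "closest vector" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The three-dimensional vectors are hand-made stand-ins, not output from a real embedding model, and the names `cosine_similarity`, `passages`, and `query_vector` are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical vectors: real embeddings have hundreds or thousands of dimensions.
passages = {
    "ending your subscription": [0.9, 0.1, 0.0],
    "shipping times": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # pretend embedding of "how do I leave"

# Pick the passage whose vector points in the most similar direction.
best = max(passages, key=lambda p: cosine_similarity(query_vector, passages[p]))
print(best)  # ending your subscription
```

The query and the winning passage share no words; they match only because their (assumed) vectors point in nearly the same direction.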
The typical pipeline is:
- Index — split the document into passages and embed each one.
- Query — embed the question and find the nearest passages by vector similarity.
- Rank — return the best-matching passages, often re-scored for precision.
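The three steps above can be sketched in a few lines. This is a shape-of-the-pipeline example only: `toy_embed` is a placeholder that counts words, so it does not capture meaning the way a real embedding model would, and the optional re-scoring step is left out. All function names here are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Placeholder embedding (word counts). A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(document, embed=toy_embed):
    """Index: split into passages (blank-line separated here) and embed each one."""
    passages = [p.strip() for p in document.split("\n\n") if p.strip()]
    return [(p, embed(p)) for p in passages]

def search(index, question, embed=toy_embed, top_k=2):
    """Query and rank: embed the question, score every passage, keep the best top_k."""
    q = embed(question)
    scored = [(cosine(q, vec), passage) for passage, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

document = (
    "You can end your subscription at any time from the account page.\n\n"
    "Orders ship within three business days.\n\n"
    "Refunds are issued to the original payment method."
)
index = build_index(document)
results = search(index, "How do I end my subscription?", top_k=1)
print(results[0][1])
```

Swapping `toy_embed` for a call to a real embedding model is the only change needed to turn this keyword-flavoured sketch into semantic search; the index, query, and rank steps keep the same shape.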
Where it fits in Sidenote
Semantic search is the retrieval step that makes grounded answers possible. When you ask Sidenote a question, it semantically searches the document you are reading to pull the passages that actually address it, then feeds only those to the model — the same retrieve-then-answer pattern as retrieval-augmented generation. Because the answer is built from real passages, every claim can carry a citation that scrolls straight to the source sentence. Good search is what lets Sidenote say "here is the exact line," rather than guessing.
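As a rough illustration of the retrieve-then-answer pattern with citations, the sketch below numbers the retrieved passages in a prompt and maps citation markers in the answer back to positions in the document. It is not Sidenote's actual implementation: the `Passage` type, the prompt wording, the `[n]` citation format, and the offsets are all hypothetical, and the model call is replaced by a hard-coded answer.

```python
import re
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    start: int  # character offset of the passage in the source document

def build_prompt(question, passages):
    """Number each retrieved passage so the model can cite it as [1], [2], ..."""
    sources = "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below and cite them like [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

def resolve_citations(answer, passages):
    """Map each valid [n] marker in the answer to the offset of passage n."""
    offsets = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        n = int(match.group(1))
        if 1 <= n <= len(passages):
            offsets.append(passages[n - 1].start)
    return offsets

retrieved = [
    Passage("You can end your subscription at any time.", start=120),
    Passage("Refunds go to the original payment method.", start=480),
]
prompt = build_prompt("How do I leave?", retrieved)
answer = "You can cancel whenever you like [1]."  # stand-in for a model response
print(resolve_citations(answer, retrieved))  # [120]
```

Because only retrieved passages appear in the prompt, each citation marker resolves to a real location in the document, which is what lets an interface scroll to the source.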