Semantic Search
Definition
Semantic search uses embedding models to represent both queries and documents as dense numerical vectors in a high-dimensional semantic space. Documents with similar meanings are close together in this space, regardless of the specific words used. When a user asks a question, the query is embedded and compared to all document embeddings using cosine similarity. Documents with the highest similarity scores are retrieved. This approach handles paraphrasing, synonyms, and conceptual relationships that keyword search misses — the core challenge of real-world information retrieval.
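The similarity comparison described above can be sketched in a few lines. The tiny 3-dimensional vectors here are stand-ins for real embeddings, which typically have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; real models produce far more dimensions.
query = [0.2, 0.9, 0.1]
doc_close = [0.25, 0.85, 0.05]   # similar meaning, nearby vector
doc_far = [0.9, 0.1, 0.4]        # unrelated meaning, distant vector
```

Documents are then ranked by this score, highest first; `doc_close` scores well above `doc_far` against the query.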
Why It Matters
Users rarely search using the exact terminology in documentation. They describe problems in their own words, use informal language, ask questions rather than using keyword phrases, and may not know the technical name for what they are experiencing. Semantic search bridges this gap: it finds the right article based on what the user means, not just what they typed. For AI chatbots, semantic search is essential — it enables the retrieval step of RAG to find relevant content regardless of how the user phrased the question.
How It Works
At index time, each document chunk is passed through an embedding model (e.g., OpenAI text-embedding-3-small, Cohere embed, or a self-hosted model) which produces a fixed-dimension vector representation. These vectors are stored in a vector database (Pinecone, Weaviate, pgvector, Qdrant). At query time, the query is embedded using the same model. A nearest-neighbor search (often approximate, using HNSW or similar indexes) retrieves the K document chunks with the highest cosine similarity to the query vector.
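A minimal end-to-end sketch of this index-then-query pipeline. The `embed()` function here is a hypothetical stand-in (a hashed bag-of-words vector) for a real embedding model, so unlike a real model it cannot place paraphrases near each other; the brute-force scan stands in for an ANN index such as HNSW:

```python
import math
from collections import Counter

def embed(text, dim=512):
    # Hypothetical stand-in for a real embedding model (e.g. text-embedding-3-small):
    # a hashed bag-of-words vector. Python salts str hashes per process, but the
    # vectors stay consistent within a single run, which is all this sketch needs.
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so dot product == cosine similarity

def top_k(query_vec, index, k=2):
    # Brute-force nearest-neighbor scan; vector databases replace this with
    # approximate indexes such as HNSW to stay fast at millions of vectors.
    scored = [(sum(q * d for q, d in zip(query_vec, vec)), chunk)
              for chunk, vec in index]
    return [chunk for _, chunk in sorted(scored, reverse=True)[:k]]

# Index time: embed and store each chunk.
chunks = [
    "cancel your subscription",
    "pricing plans overview",
    "cancel subscription refund policy",
]
index = [(c, embed(c)) for c in chunks]

# Query time: embed the query with the same model, then search.
results = top_k(embed("cancel my subscription"), index, k=2)
```

The crucial detail is that the same model embeds both documents and queries; mixing models produces vectors in incompatible spaces.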
Semantic vs. Keyword Search
User query: "how do I cancel my subscription?"

Keyword search (matches exact words only):
- Cancel subscription page
- Subscription pricing plans
- Cancel event registration
- Misses "unsubscribe" articles

Semantic search (understands meaning and intent):
- How to unsubscribe from a plan
- Account closure and data deletion
- Downgrade or pause your subscription
- Finds cancellation, unsubscribe, and closure articles
Real-World Example
A user asks the chatbot "I keep getting kicked out of the app". No article uses the phrase "kicked out". But the embedding of this query is semantically close to embeddings of articles about "session expiry", "automatic logout", and "authentication timeout". Semantic search retrieves these articles and the AI gives a relevant, accurate answer about session management — something keyword search alone could not achieve.
Common Mistakes
- ✕ Using semantic search for exact lookup queries (error codes, product names) where keyword search is more reliable — semantic search should complement, not replace, keyword search.
- ✕ Not monitoring embedding model quality — embedding models vary significantly in their understanding of domain-specific vocabulary.
- ✕ Neglecting the choice of vector database — different databases have different performance characteristics for different scales and query patterns.
Related Terms
Full-Text Search
Full-text search is the capability to find documents by searching across the complete content of all articles — not just titles or metadata. It uses algorithms like BM25 to rank results by term frequency and relevance, enabling users to find articles using any keywords that appear anywhere in the content.
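The BM25 ranking mentioned here follows directly from the standard Okapi BM25 formula; `k1` and `b` are its usual free parameters, shown with common default values:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    # Okapi BM25: term-frequency relevance with a document-length penalty.
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        n = sum(1 for d in corpus if term in d)          # docs containing term
        idf = math.log(1 + (len(corpus) - n + 0.5) / (n + 0.5))
        freq = tf[term]
        score += idf * freq * (k1 + 1) / (
            freq + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

corpus = [doc.split() for doc in [
    "cancel your subscription anytime",
    "subscription pricing and plans",
    "contact support for billing help",
]]
query = "cancel subscription".split()
scores = [bm25_score(query, doc, corpus) for doc in corpus]
```

The first document matches both query terms and scores highest; the third matches neither and scores zero.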
Knowledge Base Search
Knowledge base search is the capability that enables users to find relevant articles, and enables AI systems to retrieve relevant content to answer questions. Effective search combines full-text keyword matching with semantic understanding — finding relevant content even when users phrase things differently from the articles.
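One common way to combine keyword and semantic result lists is reciprocal rank fusion (RRF); a minimal sketch, where the two input rankings are hypothetical outputs of a BM25 search and a vector search:

```python
def reciprocal_rank_fusion(rankings, k=60):
    # Merge several ranked lists into one; each document earns 1/(k + rank)
    # from every list it appears in. k=60 is the commonly used constant.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword (BM25) search and a vector search.
keyword_ranking = ["cancel-subscription", "pricing-plans", "event-registration"]
semantic_ranking = ["cancel-subscription", "unsubscribe-guide", "account-closure"]
fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
```

RRF needs only ranks, not raw scores, which sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales.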
Text Chunking
Text chunking is the process of splitting long documents into smaller, focused segments before indexing them in a knowledge base. Chunk size and overlap strategy directly affect retrieval quality — chunks that are too large lose precision, while chunks that are too small lose context. Finding the right balance is a key knowledge base engineering decision.
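A minimal sketch of fixed-size chunking with overlap (word-based for simplicity; production systems often chunk by tokens or by document structure such as headings):

```python
def chunk_text(words, chunk_size=200, overlap=50):
    # Slide a fixed-size window with overlap so that context spanning a
    # chunk boundary is preserved in both neighboring chunks.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break
    return chunks

words = [f"w{i}" for i in range(500)]
chunks = chunk_text(words)   # defaults: 200-word chunks, 50-word overlap
```

With these defaults, each chunk's last 50 words are repeated as the next chunk's first 50, trading some index size for retrieval robustness at boundaries.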
Knowledge Base
A knowledge base is a centralized repository of structured information — articles, FAQs, guides, and documentation — that an AI chatbot or support system uses to answer user questions accurately. It is the foundation of any AI-powered self-service experience, directly determining how accurate and comprehensive the bot's answers are.