Full-Text Search
Definition
Full-text search indexes the entire text of every document and runs queries against that index. When a user searches for 'export CSV', full-text search finds all articles containing those words and ranks them by how frequently and prominently the words appear. The most widely used ranking algorithm is BM25 (Best Matching 25), which combines term frequency (how often the term appears in the document), inverse document frequency (how rare the term is across all documents), and document length normalization (so that long documents are not favored merely for containing more words). Full-text search excels at exact matches, such as product names, error codes, and feature names, that semantic search might not handle as precisely.
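To make the three BM25 ingredients concrete, here is a minimal sketch of the scoring function in Python. It assumes documents are already tokenized, uses the commonly cited defaults k1 = 1.2 and b = 0.75, and uses the Lucene-style IDF variant; the sample documents are invented for illustration, and production engines differ in details.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Inverse document frequency: rarer terms weigh more.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are damped.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Illustrative documents (assumption), already lowercased and split.
docs = [
    "export csv from the reports page".split(),
    "export your data as csv or export as json".split(),
    "reset your password from the login page".split(),
]
scores = bm25_scores(["export", "csv"], docs)
```

The third document contains neither query term, so its score is zero; the first two both score above zero, with the exact order depending on how term frequency trades off against document length.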
Why It Matters
Full-text search is the foundation of any knowledge base discovery system. It handles the large category of queries where users know the exact terminology — searching for a specific error code, a product feature name, or an exact phrase from the documentation. Without full-text search, these precise queries would either fail or require semantic search to make lucky connections. Full-text search and semantic search are complementary — each covers the cases the other misses.
How It Works
Full-text search is typically implemented with a search engine such as Elasticsearch, OpenSearch, or Apache Solr. At ingestion, documents are tokenized, stemmed (reducing words to root forms), and stored in an inverted index that maps each term to the IDs of the documents containing it and the positions where it occurs. At query time, the query terms go through the same processing, so that query and document word forms line up, and are matched against the index. BM25 scoring then ranks the matching documents by relevance. The search engine also supports boolean operators (AND, OR, NOT), phrase queries, and field-specific queries (for example, searching only in titles).
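The inverted index described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular engine is implemented: the `stem` function is a toy suffix stripper standing in for a real stemmer, and the function names and sample articles are hypothetical.

```python
import re
from collections import defaultdict

def stem(token):
    # Toy stemmer (assumption): real engines use Porter/Snowball-style stemmers.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text):
    """Lowercase, split on non-alphanumerics, then stem each token."""
    return [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]

def build_index(docs):
    """Build an inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def search_and(index, query):
    """Boolean AND: IDs of documents containing every query term."""
    postings = [set(index.get(t, {})) for t in analyze(query)]
    return set.intersection(*postings) if postings else set()

def search_phrase(index, query):
    """Phrase query: terms must appear at consecutive positions."""
    terms = analyze(query)
    hits = set()
    for doc_id in search_and(index, query):
        starts = index[terms[0]][doc_id]
        if any(all(p + i in index[t][doc_id] for i, t in enumerate(terms))
               for p in starts):
            hits.add(doc_id)
    return hits

docs = {
    1: "How to reset passwords",
    2: "Password reset emails not arriving",
    3: "Exporting reports as CSV",
}
index = build_index(docs)
print(search_and(index, "reset password"))     # {1, 2}
print(search_phrase(index, "reset password"))  # {1}
print(search_and(index, "exported report"))    # {3}
```

Storing positions alongside document IDs is what makes the phrase query possible: document 2 contains both terms, but not in the order "reset password". A field-specific query could be sketched the same way by keeping one such index per field.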
Full-Text Search Mechanics
User query ("reset password") → Tokenize → Stop-word removal (no stop words in this query) → Stemming → Index lookup (match stemmed tokens) → BM25 scoring (term frequency + inverse document frequency) → Ranked results
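The query-analysis stages of this flow (tokenize, remove stop words, stem) can be traced with a short Python sketch. The stop-word list and the suffix-stripping stemmer here are deliberately tiny stand-ins, chosen for illustration only; real engines ship fuller versions of both.

```python
import re

# Tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "the", "how", "do", "i", "my", "to"}

def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Drop very common words that carry little search signal."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    # Toy suffix stripper standing in for a Porter/Snowball-style stemmer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze_query(query):
    """Run the analysis stages in order and return the index-ready terms."""
    return [stem(t) for t in remove_stop_words(tokenize(query))]

print(analyze_query("reset password"))                # ['reset', 'password']
print(analyze_query("How do I reset my passwords?"))  # ['reset', 'password']
```

Both queries reduce to the same two terms, which is why a natural-language phrasing can still hit the same index entries as the bare keywords.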
Real-World Example
A developer searches the knowledge base for 'ERR_SSL_CERT_INVALID' — a precise error code. Full-text search finds the one article that contains this exact string and returns it immediately at the top of results. Semantic search would struggle with this because an error code has no semantic meaning beyond its exact string — there is nothing to reason about conceptually. Full-text search handles these precise lookups reliably.
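Whether an error code like this is indexed as a single exact term depends on how the analyzer tokenizes it. The sketch below contrasts two hypothetical tokenizers, one that splits on underscores and one that keeps them, to show why the configuration matters for exact lookups; neither is claimed to match a specific engine's default.

```python
import re

def split_tokenizer(text):
    # Splits on anything that is not a letter or digit, so underscores break tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def code_preserving_tokenizer(text):
    # Keeps underscores inside tokens, so identifiers survive as single terms.
    return re.findall(r"[a-z0-9_]+", text.lower())

query = "ERR_SSL_CERT_INVALID"
print(split_tokenizer(query))            # ['err', 'ssl', 'cert', 'invalid']
print(code_preserving_tokenizer(query))  # ['err_ssl_cert_invalid']
```

With the splitting tokenizer, the query would also match any article that merely mentions "ssl", "cert", and "invalid" separately; keeping the code as one term makes the lookup exact.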
Common Mistakes
- Relying only on full-text search without semantic search: users who describe problems in their own words, without knowing the exact terminology, get poor results.
- Not configuring synonym handling: users might search 'invoice' when articles use 'billing statement', and without synonym mapping these queries miss relevant content.
- Not applying stemming and normalization: a search for 'integrating' should match articles about 'integration', not require the exact word form.
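As an illustration of the synonym point, here is a minimal query-expansion sketch in Python. The synonym group is the 'invoice' / 'billing statement' pair from the list above; the function name and the approach (generating query variants before search) are assumptions for illustration, and many search engines instead apply synonyms inside the analyzer.

```python
import re

# Hypothetical synonym groups; a real deployment would curate these.
SYNONYM_GROUPS = [("invoice", "billing statement")]

def expand_query(query, groups=SYNONYM_GROUPS):
    """Return the normalized query plus variants with synonyms swapped in."""
    normalized = " ".join(query.lower().split())
    variants = {normalized}
    for group in groups:
        for phrase in group:
            # Match whole words or phrases only, not substrings of longer words.
            pattern = rf"\b{re.escape(phrase)}\b"
            if re.search(pattern, normalized):
                for alt in group:
                    variants.add(re.sub(pattern, alt, normalized))
    return sorted(variants)

print(expand_query("download invoice"))
# ['download billing statement', 'download invoice']
```

Each variant can then be run through the normal full-text pipeline and the results merged, so an article that only says 'billing statement' is still found.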
Related Terms
Semantic Search
Semantic search finds knowledge base articles based on the meaning of a query — not just the words used. By converting both queries and documents into vector embeddings, it identifies conceptually similar content even when users use different terminology than the articles, enabling more natural and accurate information retrieval.
Knowledge Base Search
Knowledge base search is the capability that enables users to find relevant articles, and enables AI systems to retrieve relevant content to answer questions. Effective search combines full-text keyword matching with semantic understanding — finding relevant content even when users use different words than those in the articles.
Knowledge Base
A knowledge base is a centralized repository of structured information — articles, FAQs, guides, and documentation — that an AI chatbot or support system uses to answer user questions accurately. It is the foundation of any AI-powered self-service experience, directly determining how accurate and comprehensive the bot's answers are.
Faceted Search
Faceted search is a search interface that lets users filter results by multiple metadata attributes simultaneously — such as category, product, date range, or content type. It helps users narrow large result sets to the most relevant subset without requiring precise keyword queries.