Query Expansion
Definition
Query expansion is a pre-retrieval technique that enriches the original query with additional terms or alternative formulations to improve retrieval coverage. In traditional keyword search, query expansion adds synonyms, related terms, and common abbreviations to cast a wider net. In neural RAG systems, query expansion can take forms such as: generating hypothetical answers (HyDE), rephrasing the query in multiple ways (multi-query retrieval), extracting explicit sub-questions from a complex query, or adding domain-specific terminology the user may not know. Query expansion increases recall (finding more relevant documents) at the cost of slightly lower precision (the expanded results may include more irrelevant documents).
Why It Matters
Query expansion addresses the vocabulary mismatch problem: users describe their questions in their own language, which may differ from the language used in the knowledge base. A customer asking 'why was I charged twice?' may not know to search for 'duplicate invoice' or 'double billing'. Query expansion can automatically add these synonymous terms, ensuring retrieval covers documents written in the knowledge base's technical vocabulary. For complex multi-part questions, query expansion that decomposes the question into sub-queries and retrieves context for each sub-query enables the LLM to answer questions requiring information from multiple documents.
How It Works
Query expansion is implemented at the retrieval stage, before embedding and search. Common approaches:
- Term-based expansion: look up the query terms in a synonym dictionary or word association graph and add related terms.
- LLM-based expansion: prompt an LLM to generate 3-5 alternative phrasings of the user's query, retrieve separately for each phrasing, and merge the results (removing duplicates).
- HyDE: prompt the LLM to write a hypothetical document that would answer the query, then embed and search with that hypothetical document rather than the query itself (document embeddings tend to sit closer to other document embeddings than query embeddings do).
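The LLM-based expansion described above can be sketched as follows. This is a minimal illustration, not a production implementation: `generate_paraphrases` and `search` are hypothetical stand-ins for a real LLM call and a real vector-store query, with hard-coded outputs so the merging logic can be shown end to end.

```python
def generate_paraphrases(query: str) -> list[str]:
    # Placeholder: a real system would prompt an LLM for 3-5 rephrasings.
    return [
        query,
        f"how do I {query}",
        f"{query} instructions",
    ]

def search(query: str, k: int = 5) -> list[str]:
    # Placeholder: a real system would embed the query and search an index.
    corpus = {
        "reset password": ["doc_reset", "doc_login"],
        "how do I reset password": ["doc_reset", "doc_recovery"],
        "reset password instructions": ["doc_guide", "doc_reset"],
    }
    return corpus.get(query, [])[:k]

def expanded_retrieve(query: str, k: int = 5) -> list[str]:
    """Retrieve for each paraphrase and merge the result sets,
    de-duplicating while preserving first-seen order."""
    seen: dict[str, None] = {}
    for variant in generate_paraphrases(query):
        for doc_id in search(variant, k):
            seen.setdefault(doc_id, None)
    return list(seen)

print(expanded_retrieve("reset password"))
# ['doc_reset', 'doc_login', 'doc_recovery', 'doc_guide']
```

Note that the merged list contains documents ("doc_recovery", "doc_guide") that the original query alone would have missed, which is exactly the recall gain expansion aims for.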
[Diagram: Query Expansion — Broadening Retrieval Coverage. An original query ("reset password") is expanded via synonym expansion, related terms, and LLM variations into a set of query variants; retrieval runs for each variant, the result sets are unioned and de-duplicated (overlapping docs removed), yielding better recall and fewer missed relevant docs.]
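The HyDE variant can be sketched the same way. This is a toy illustration under stated assumptions: `llm_write_answer` stands in for an LLM generating a hypothetical answer, and `embed` is a bag-of-words counter standing in for a neural sentence encoder. The key point is only that similarity search runs on the hypothetical document's embedding, not the query's.

```python
import math

def llm_write_answer(query: str) -> str:
    # Placeholder for the hypothetical document an LLM might produce.
    return ("To reset your password, open account settings, "
            "choose security, and follow the password recovery link.")

def embed(text: str) -> dict[str, float]:
    # Toy bag-of-words "embedding"; real systems use a neural encoder.
    vec: dict[str, float] = {}
    for token in text.lower().split():
        vec[token] = vec.get(token, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hyde_search(query: str, docs: dict[str, str]) -> str:
    # Embed the hypothetical answer, not the query itself.
    hypo_vec = embed(llm_write_answer(query))
    return max(docs, key=lambda d: cosine(hypo_vec, embed(docs[d])))

docs = {
    "doc_reset": "Password recovery: open account settings, then security, "
                 "then follow the password recovery link.",
    "doc_billing": "Billing FAQ: invoices, refunds, and duplicate charges.",
}
print(hyde_search("reset password", docs))  # doc_reset
```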
Real-World Example
A 99helpers customer finds that customer queries about account permissions often use imprecise language: 'who can see my stuff', 'sharing settings', 'privacy controls'. Their knowledge base uses the technical term 'access management'. After implementing LLM-based query expansion that rewrites each query into 3 alternative phrasings, retrieval for permission-related queries improves significantly because the expanded queries include both the customer's informal language and the technical terms from the knowledge base. Recall@5 for permission queries improves from 56% to 83%.
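Recall@5, the metric cited above, is straightforward to compute: the fraction of known-relevant documents that appear in the top 5 retrieved results, typically averaged over a query set. A minimal sketch, using made-up document IDs to mimic a before/after comparison for a single query:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant docs that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

# Illustrative data: two relevant docs for one permission-related query.
relevant = {"doc_access_mgmt", "doc_sharing"}
before = ["doc_privacy", "doc_access_mgmt", "doc_faq", "doc_login", "doc_api"]
after  = ["doc_access_mgmt", "doc_sharing", "doc_privacy", "doc_faq", "doc_api"]

print(recall_at_k(before, relevant, 5))  # 0.5
print(recall_at_k(after, relevant, 5))   # 1.0
```

In practice you would average this over a labeled evaluation set of queries, and measure precision alongside it to catch the dilution effect described under Common Mistakes.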
Common Mistakes
- ✕Over-expanding queries with too many additional terms — excessive expansion retrieves irrelevant documents that dilute the LLM context with noise
- ✕Expanding queries without measuring the precision impact — while expansion improves recall, it may reduce precision; measure both before and after
- ✕Using generic synonym expansion without domain knowledge — domain-specific query expansion based on your actual customer language and knowledge base vocabulary outperforms generic expansion
Related Terms
Query Rewriting
Query rewriting is a technique that transforms a user's original query into an improved version — clearer, more complete, or better suited for retrieval — using an LLM to improve recall and relevance before searching the knowledge base.
Hypothetical Document Embedding
Hypothetical Document Embedding (HyDE) is a RAG technique that improves retrieval by having an LLM generate a hypothetical document that would answer the user's query, then using that document's embedding rather than the query embedding for similarity search.
Multi-Query Retrieval
Multi-query retrieval generates multiple alternative phrasings of the user's question and retrieves documents for each phrasing separately, then merges results to achieve higher recall than any single query formulation would provide.
Dense Retrieval
Dense retrieval is a retrieval approach that encodes both queries and documents into dense embedding vectors and finds relevant documents by computing vector similarity, enabling semantic matching beyond exact keyword overlap.
Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) is an AI architecture that enhances large language model responses by first retrieving relevant documents from an external knowledge base and then using that retrieved content as context when generating an answer.