Parent Document Retrieval
Definition
Parent document retrieval is a two-level indexing and retrieval strategy that addresses the tension between chunking granularity for retrieval and context richness for generation. Small chunks (100-200 tokens) are ideal for retrieval: they are semantically focused and match queries precisely. But small chunks often lack sufficient context for the LLM to generate a complete answer. Parent document retrieval solves this by indexing small child chunks for retrieval, then returning the larger parent section (500-2,000 tokens) to the LLM as context whenever one of its child chunks is retrieved. The child chunk identifies the right location in the knowledge base; the parent provides the full context.
Why It Matters
Parent document retrieval is an elegant solution to a fundamental RAG tradeoff. Without it, engineers must choose between small chunks (good retrieval precision, poor context) and large chunks (rich context, poor retrieval precision). Parent document retrieval eliminates this tradeoff by decoupling retrieval granularity from the granularity of the context handed to the LLM. This is especially valuable for dense technical documentation, where a single sentence ('The API rate limit is 100 requests per minute') is the precise retrieval target but the surrounding paragraph provides essential context for a complete answer.
How It Works
Parent document retrieval is implemented with a two-level document store: small child chunks are embedded and stored in the vector database with a reference to their parent document ID, while larger parent documents (or sections) are stored in a separate document store (an in-memory dict, Redis, or a database). At retrieval time:
1. Embed the query.
2. Search the vector database of child chunk embeddings to find the most similar small chunks.
3. Look up the parent document ID for each retrieved child chunk.
4. Retrieve the full parent document from the document store.
5. Pass the parent document text (not the small chunk) to the LLM as context.
LangChain's ParentDocumentRetriever provides this functionality.
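The steps above can be sketched in plain Python. A toy bag-of-words "embedding" and in-memory dicts stand in for a real embedding model, vector database, and document store; all names and contents below are illustrative, not a real API.

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: lowercase bag-of-words counts (stand-in for a real model).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Document store: parent_id -> full parent section (500-2,000 tokens in practice).
parent_store = {
    "doc_a": "Rate limiting. The API rate limit is 100 requests per minute. "
             "Exceeding it returns HTTP 429; clients should back off exponentially.",
    "doc_b": "Authentication. Requests must include a bearer token in the header.",
}

# Vector index: each small child chunk stores its embedding plus a parent reference.
child_index = [
    {"text": "The API rate limit is 100 requests per minute.", "parent_id": "doc_a"},
    {"text": "Exceeding the limit returns HTTP 429.", "parent_id": "doc_a"},
    {"text": "Requests must include a bearer token.", "parent_id": "doc_b"},
]
for chunk in child_index:
    chunk["vector"] = embed(chunk["text"])

def retrieve(query, k=2):
    qv = embed(query)  # 1) embed the query
    # 2) rank child chunks by similarity
    ranked = sorted(child_index, key=lambda c: cosine(qv, c["vector"]), reverse=True)
    # 3-4) map top child chunks to their parents, deduplicating parent IDs
    parent_ids = []
    for chunk in ranked[:k]:
        if chunk["parent_id"] not in parent_ids:
            parent_ids.append(chunk["parent_id"])
    # 5) return full parent documents, not the small chunks
    return [parent_store[pid] for pid in parent_ids]

docs = retrieve("what is the API rate limit?")
```

Here the two best-matching child chunks both point at the same parent, so the retriever returns one 800-token-style section rather than two fragments; in production the dicts would be replaced by a vector database and a document store such as Redis.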
Parent Document Retrieval — Two-Stage Strategy
Two stores back the strategy:
- Small chunk index (vector DB): embedded child chunks, each with a parent reference
- Full document store (doc DB / Redis): the larger parent documents
Retrieval Flow
1. Embed query: user question → query vector
2. Search small chunk index: top-k child chunks by similarity
3. Look up parent IDs: e.g., chunk A2.parentId = Doc A
4. Fetch full parent documents: return the 800-token section, not the 100-token chunk
- Flat chunking: lower quality
- Parent retrieval: best of both

Real-World Example
A 99helpers customer tests three chunking strategies: large 1,000-token chunks (poor retrieval precision but rich context), small 100-token chunks (good retrieval precision but insufficient context), and parent document retrieval (100-token child chunks for retrieval, 800-token parent sections for context). On their evaluation set, large chunks: precision@5 0.61, answer completeness 0.78. Small chunks: precision@5 0.88, answer completeness 0.54. Parent document retrieval: precision@5 0.87, answer completeness 0.82. Parent document retrieval achieves near-small-chunk retrieval precision with near-large-chunk answer completeness.
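The precision@5 figures above can be computed with a small helper; the chunk IDs below are hypothetical, chosen so that 4 of the top 5 retrieved chunks are relevant.

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved items that are actually relevant."""
    top = retrieved_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / len(top)

# Hypothetical query result: c9 is the only irrelevant chunk in the top 5.
score = precision_at_k(["c1", "c7", "c3", "c9", "c4"], {"c1", "c3", "c4", "c7"})
# score == 0.8
```

Averaging this score over every query in the evaluation set yields the precision@5 numbers quoted above.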
Common Mistakes
- ✕ Making parent documents too large — very large parent documents exceed context window budgets and dilute relevant information with irrelevant content from other sections
- ✕ Not maintaining the parent-child relationship accurately during indexing — if child chunks are not correctly mapped to their parent, the retrieved parent may contain wrong information
- ✕ Applying parent document retrieval uniformly — for short FAQ-style articles, the added complexity is unnecessary; apply it to long-form documentation where the context tradeoff is most acute
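The second mistake can be caught with a cheap integrity check at index time. A sketch, assuming each child chunk carries a parent_id field (the field name and sample data are illustrative):

```python
def validate_parent_links(child_index, parent_store):
    """Return child chunks whose parent_id is missing from the document store."""
    return [c for c in child_index if c.get("parent_id") not in parent_store]

# Illustrative data: one child chunk points at a parent that was never stored.
parents = {"doc_a": "full section text"}
children = [
    {"text": "chunk one", "parent_id": "doc_a"},
    {"text": "orphan chunk", "parent_id": "doc_z"},  # broken link
]
orphans = validate_parent_links(children, parents)
```

Running a check like this after every re-indexing job ensures no retrieved child chunk can ever dereference to a missing or wrong parent.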
Related Terms
Document Chunking
Document chunking is the process of splitting large documents into smaller text segments before embedding and indexing for RAG, balancing chunk size to preserve context while staying within embedding model limits and enabling precise retrieval.
Chunk Overlap
Chunk overlap is a chunking strategy where consecutive document chunks share a portion of overlapping text, ensuring that information spanning chunk boundaries is captured in at least one complete chunk.
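As a word-level illustration of the idea (not any particular library's splitter), consecutive chunks can share a fixed number of words:

```python
def chunk_with_overlap(words, chunk_size=100, overlap=20):
    """Split a word list into chunks where consecutive chunks share `overlap` words."""
    step = chunk_size - overlap
    return [words[i:i + chunk_size] for i in range(0, max(len(words) - overlap, 1), step)]

# 250 "words" -> three chunks covering [0:100], [80:180], [160:250].
chunks = chunk_with_overlap(list(range(250)), chunk_size=100, overlap=20)
```

Each pair of adjacent chunks shares 20 words, so any sentence spanning a boundary appears whole in at least one chunk.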
Context Window
A context window is the maximum amount of text (measured in tokens) that a language model can process in a single inference call, determining how much retrieved content, conversation history, and instructions can be included in a RAG prompt.
Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) is an AI architecture that enhances large language model responses by first retrieving relevant documents from an external knowledge base and then using that retrieved content as context when generating an answer.
Retrieval Precision
Retrieval precision measures the fraction of retrieved documents that are actually relevant to the query. In RAG systems, high precision means the context passed to the LLM contains mostly useful information rather than noise.