Prompt Engineering

Few-Shot Prompting

Definition

Few-shot prompting is a prompt engineering technique that includes 2-10 worked examples of the target task directly in the prompt, establishing a pattern the model then applies to new inputs. Each example shows an input-output pair that illustrates the desired format, style, reasoning approach, or output structure. The model's in-context learning ability allows it to generalize from these demonstrations without updating its weights. Few-shot prompting is highly effective for tasks with specific output formats, classification with nuanced labels, and domains where phrasing matters. Example quality and diversity significantly affect performance.

Why It Matters

Few-shot prompting bridges the gap between a model's general capabilities and specific task requirements, often achieving near-fine-tuned performance without the cost and delay of training. For applications like ticket classification, data extraction, or structured output generation, including 3-5 well-chosen examples in the prompt can increase accuracy by 15-30% compared to zero-shot instructions alone. It's also faster to iterate: adding or changing examples takes minutes while fine-tuning takes hours. For edge cases that the model handles poorly, adding a targeted example directly addresses the failure mode.

How It Works

A few-shot prompt includes examples in a consistent format before the actual query. For a sentiment classification task: 'Review: The battery died after 2 hours. Sentiment: Negative. Review: Absolutely love the build quality. Sentiment: Positive. Review: It works fine for basic tasks. Sentiment: Neutral. Review: [new review]'. The model learns the input-output pattern and applies it consistently. Example selection strategies include: representative coverage of all labels, inclusion of boundary cases the model gets wrong, and diverse phrasing to prevent overfitting to surface patterns.
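The assembly described above can be sketched in a few lines of Python. This is a minimal illustration, not any specific provider's API: the `examples` list and `build_few_shot_prompt` helper are hypothetical names, and the resulting string would be sent to whatever model endpoint you use.

```python
# Worked examples as (input, label) pairs — one per sentiment class.
examples = [
    ("The battery died after 2 hours.", "Negative"),
    ("Absolutely love the build quality.", "Positive"),
    ("It works fine for basic tasks.", "Neutral"),
]

def build_few_shot_prompt(new_review):
    """Assemble instruction + demonstrations + the new query in one consistent format."""
    lines = ["Classify the sentiment of the review as Positive, Negative, or Neutral.", ""]
    for review, label in examples:
        lines += [f"Review: {review}", f"Sentiment: {label}", ""]
    # The query repeats the pattern but leaves the label for the model to fill in.
    lines += [f"Review: {new_review}", "Sentiment:"]
    return "\n".join(lines)

print(build_few_shot_prompt("Best purchase I've made all year!"))
```

Keeping every example in the exact same `Review: ... / Sentiment: ...` format matters: the model completes the pattern it sees, so format drift between examples degrades consistency.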

Few-Shot Prompting — Prompt Structure

Prompt (sent to model)

Task Instruction: Classify the sentiment of the customer review as Positive, Negative, or Neutral.

Examples (2-shot):

  • Input 1: "The delivery arrived two days late." → Output 1: Negative
  • Input 2: "Setup was simple and it works perfectly." → Output 2: Positive

Actual Query: "Best purchase I've made all year — highly recommend!"

Model Output: Positive

The pattern is generalized from the 2 in-prompt examples; no fine-tuning is required.

Example selection tips

  • Cover all label classes with roughly equal representation
  • Include boundary / ambiguous cases the model gets wrong
  • Keep phrasing diverse to avoid surface-level overfitting
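The first tip, equal coverage of label classes, can be implemented as a simple stratified draw from a labeled pool. This is a sketch under assumptions: `pool` is a hypothetical list of `(text, label)` pairs you maintain, and the boundary-case and diversity tips would still need human judgment on top.

```python
import random

def select_examples(pool, per_label, seed=0):
    """Pick up to per_label examples for each label, so every class is represented."""
    rng = random.Random(seed)  # fixed seed keeps the prompt stable across runs
    by_label = {}
    for text, label in pool:
        by_label.setdefault(label, []).append((text, label))
    selected = []
    for label in sorted(by_label):  # deterministic label order
        items = by_label[label]
        selected.extend(rng.sample(items, min(per_label, len(items))))
    return selected
```

A seeded, deterministic selection also makes prompt changes auditable: when accuracy shifts, you know whether the examples changed.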

Real-World Example

A legal document classifier needed to categorize contracts into 8 clause types with high precision on technical legal language. Zero-shot prompting achieved 71% accuracy. Adding 3 examples per clause type (24 examples total) in a few-shot prompt raised accuracy to 89%, without any fine-tuning. The team selected examples that covered ambiguous boundary cases (e.g., limitation-of-liability clauses that overlap with indemnification), directly targeting the most common errors. The 24-example prompt cost $0.02 per document to run but saved 3 hours of manual review per contract.

Common Mistakes

  • Using unrepresentative examples—examples that don't reflect real production input distributions teach the model the wrong patterns
  • Including too many examples without checking context length—large few-shot prompts may exceed context limits or increase latency unacceptably
  • Treating few-shot prompting as a substitute for fine-tuning at scale—at very high volumes, fine-tuning on the few-shot examples is more cost-efficient
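The second mistake, blowing the context budget, is easy to guard against mechanically. The sketch below uses the rough ~4-characters-per-token heuristic; real token counts vary by model and tokenizer, so a production check should use the provider's actual tokenizer instead.

```python
def fit_examples_to_budget(examples, max_tokens):
    """Keep a prefix of examples whose estimated token cost fits max_tokens."""
    # Crude ~4 chars/token estimate — model-dependent; swap in a real tokenizer in practice.
    estimate = lambda s: max(1, len(s) // 4)
    kept, used = [], 0
    for ex in examples:
        cost = estimate(ex)
        if used + cost > max_tokens:
            break  # dropping later examples beats truncating mid-example
        kept.append(ex)
        used += cost
    return kept
```

Ordering the list by importance before calling this means the examples you can least afford to lose survive the trim.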
