Large Language Models (LLMs)

Few-Shot Learning

Definition

Few-shot learning (also called few-shot prompting or in-context learning) is a prompting technique where you include 2-10 examples of the desired task directly in the prompt before the actual query. The examples serve as demonstrations: by showing the model '[input] → [output]' pairs, you communicate the task format, output structure, and stylistic expectations without any weight updates. The LLM generalizes from these examples to produce a consistent output for the new input. Few-shot is the middle ground between zero-shot (no examples) and fine-tuning (thousands of examples with weight updates). It works because pre-training exposed the model to implicit few-shot patterns across millions of NLP tasks described in text.

Why It Matters

Few-shot prompting is one of the most powerful prompt engineering techniques for improving LLM output consistency and format adherence. When you need an LLM to return outputs in a specific structure—JSON with particular fields, a specific rating scale, a constrained classification—few-shot examples communicate the expected format far more reliably than verbal instructions alone. For 99helpers teams building structured extraction or classification features, few-shot prompting can dramatically improve output consistency compared to zero-shot, often matching fine-tuned model quality on narrow tasks without the overhead of model training.

How It Works

A few-shot prompt follows this structure: [System: task description] [User: example input 1] [Assistant: example output 1] [User: example input 2] [Assistant: example output 2] ... [User: actual input] [Assistant:]. The examples should be representative of the real input distribution, cover edge cases, and demonstrate the exact format required. Ordering matters: later examples exert more influence than earlier ones (recency bias). Use 3-8 examples for most tasks; beyond that, additional examples yield diminishing returns while still consuming context tokens. For classification tasks, balance examples across classes, and select examples that cover the variety of expected inputs rather than duplicating similar ones.
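The structure above can be sketched as a chat-style message list, following the widespread role/content convention used by chat completion APIs (the function name and example task are illustrative, not part of any specific SDK):

```python
# Build a few-shot prompt as a chat message list: a system task
# description, then alternating user/assistant example pairs, and
# finally the real input awaiting the model's completion.
def build_few_shot_messages(task, examples, query):
    messages = [{"role": "system", "content": task}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages

messages = build_few_shot_messages(
    task="Classify the sentiment of each review as Positive or Negative.",
    examples=[
        ("I love this product!", "Positive"),
        ("Worst experience ever.", "Negative"),
    ],
    query="The delivery was late but the product was great.",
)
```

The resulting list would be passed to a chat completion endpoint as-is; because the examples appear as prior assistant turns, the model treats them as its own earlier answers and imitates their format.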

Few-Shot Learning vs Zero-Shot

Zero-Shot

Prompt:
Classify sentiment: "The delivery was late but the product was great."

Output:
Mixed sentiment — negative delivery, positive product.

No examples are provided; the model relies on its training alone.

2-Shot

Prompt (with examples):
Q: I love this product!
A: Positive
Q: Worst experience ever.
A: Negative
Q: "The delivery was late but the product was great."
A:

Output:
Mixed

The examples define the expected task and output format.

0-shot: 62%
1-shot: 74%
2-shot: 82%
5-shot: 91%

Typical accuracy improvement with more examples (sentiment task)
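The 2-shot prompt shown above can also be rendered as a single completion-style string, with the final answer slot left blank for the model to fill (the helper name is illustrative):

```python
def render_qa_prompt(examples, query):
    # Render Q/A example pairs followed by the new question with an
    # empty answer slot for the model to complete.
    lines = []
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)

prompt = render_qa_prompt(
    [("I love this product!", "Positive"), ("Worst experience ever.", "Negative")],
    '"The delivery was late but the product was great."',
)
```

This flat Q/A format suits text-completion endpoints; for chat endpoints, the message-list form shown earlier is the more idiomatic equivalent.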

Real-World Example

A 99helpers team wants the LLM to categorize support tickets into one of five categories (billing, technical, general, feature-request, urgent). Zero-shot classification with just the category list misclassifies ~15% of tickets. Adding five few-shot examples—one per category showing a representative ticket and its correct label—drops misclassification to 4%. The examples communicate that 'My card was declined' → billing (not technical), even though both involve product problems. The model generalizes from these examples to new tickets it has never seen.
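The ticket-categorization setup described above can be sketched as one example per category prepended to each classification request (the ticket texts are invented for illustration; only the five category labels come from the scenario):

```python
# One representative example per category; ticket texts are
# hypothetical placeholders, not real 99helpers data.
TICKET_EXAMPLES = [
    ("My card was declined when renewing.", "billing"),
    ("The app crashes when I upload a file.", "technical"),
    ("What are your support hours?", "general"),
    ("Could you add dark mode to the dashboard?", "feature-request"),
    ("Production is down for all our users!", "urgent"),
]

def build_ticket_prompt(ticket):
    # Task description, then the five labeled examples, then the new
    # ticket with an empty label slot for the model to complete.
    lines = ["Categorize the ticket as one of: billing, technical, "
             "general, feature-request, urgent."]
    for text, label in TICKET_EXAMPLES:
        lines.append(f"Ticket: {text}\nCategory: {label}")
    lines.append(f"Ticket: {ticket}\nCategory:")
    return "\n\n".join(lines)
```

Because 'My card was declined' is shown mapping to billing, the model learns the intended boundary between billing and technical issues from the demonstration rather than from a verbal rule.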

Common Mistakes

  • Including examples that don't represent the actual input distribution—cherry-picked easy examples don't help with edge cases.
  • Using too many few-shot examples and running out of context window budget before the actual query.
  • Assuming few-shot always outperforms zero-shot—for simple, well-defined tasks like translation, zero-shot frontier models often match few-shot quality.
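One way to guard against the distribution and context-budget pitfalls above is to draw the same number of examples from each class and cap the total prompt size. A minimal sketch, using character count as a rough proxy for tokens (a real implementation would use a tokenizer):

```python
import random
from collections import defaultdict

def select_balanced_examples(labeled_pool, per_class=1, max_chars=2000, seed=0):
    # Group candidate (text, label) pairs by label, then draw the same
    # number from each class so no label dominates the prompt.
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in labeled_pool:
        by_label[label].append((text, label))
    chosen = []
    for label, items in sorted(by_label.items()):
        chosen.extend(rng.sample(items, min(per_class, len(items))))
    # Enforce a rough prompt budget (characters as a token proxy),
    # dropping trailing examples once the budget is exhausted.
    kept, used = [], 0
    for text, label in chosen:
        cost = len(text) + len(label)
        if used + cost > max_chars:
            break
        kept.append((text, label))
        used += cost
    return kept
```

Sampling with a fixed seed keeps the prompt stable across runs, which matters when comparing model outputs during evaluation.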
