Structured Output
Definition
Structured output (also called JSON mode, guided generation, or constrained decoding) ensures LLM responses conform to a predefined schema—a JSON object with required fields and types, a CSV row, an XML document, or another specified format. Without structured output, LLM responses are free-form text that requires fragile parsing logic (regex, string splitting) prone to breaking whenever the model changes its phrasing. With structured output, the model is constrained (by instruction, grammar-based decoding, or API enforcement) to produce valid, schema-compliant output that can be parsed with standard JSON parsers. OpenAI's API supports a response_format: {type: 'json_object'} mode for syntactically valid JSON and a response_format: {type: 'json_schema', json_schema: {...}} mode for strict schema enforcement.
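As a minimal sketch of the strict mode, the request body below follows OpenAI's published Chat Completions shape; the sentiment schema itself is an illustrative example, not part of any API:

```python
import json

# Sketch of an OpenAI Chat Completions request body using strict
# JSON-schema enforcement. The response_format structure follows
# OpenAI's documented API; the schema is an illustrative example.
request_body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Classify the sentiment of: 'Great service!'"}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "sentiment_classification",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "neutral", "negative"],
                    },
                    "confidence": {"type": "number"},
                },
                "required": ["sentiment", "confidence"],
                "additionalProperties": False,
            },
        },
    },
}

print(json.dumps(request_body, indent=2))
```

With strict: True, the API rejects any completion that does not conform to the schema, so the response can be handed straight to json.loads without defensive parsing.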
Why It Matters
Structured output is essential for integrating LLMs with downstream systems. A sentiment analysis feature that returns free text like 'The customer seems quite upset about the billing issue' requires custom parsing; a structured output returning {sentiment: 'negative', confidence: 0.92, category: 'billing', urgency: 'high'} can be directly consumed by a routing system, database, or dashboard without parsing. For 99helpers customers building data extraction, classification, or enrichment features, structured output dramatically increases reliability and reduces the engineering effort needed to handle LLM responses programmatically.
How It Works
Structured output implementation options: (1) OpenAI JSON schema mode—define a JSON schema in the API call; the model is constrained to produce valid schema-conforming JSON (guaranteed valid, may still have semantic errors); (2) Instructor library—wraps the LLM call with Pydantic models for automatic validation and retry on parsing failure; (3) Outlines/LMQL—grammar-constrained generation that guarantees output conforms to a formal grammar, even for locally-hosted models; (4) function calling—define a function schema and the model produces arguments matching the schema. For Anthropic Claude, structured output has typically been achieved via tool use (defining a tool whose input schema matches the desired output) or via detailed prompt instructions plus response parsing.
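The validate-and-retry pattern behind libraries like Instructor can be sketched with the standard library alone. Here call_llm is a hypothetical stand-in for a real LLM API call, and the field check is a deliberately simplified substitute for full Pydantic or JSON Schema validation:

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call; it returns a
    # canned response so this sketch is self-contained and runnable.
    return '{"category": "billing", "urgency": "medium"}'

# Simplified schema: each required field mapped to its expected type.
REQUIRED_FIELDS = {"category": str, "urgency": str}

def extract_structured(prompt: str, max_retries: int = 3) -> dict:
    """Call the LLM, validate the response, and retry on failure."""
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON syntax: retry the call
        # Check every required field exists with the expected type.
        if all(isinstance(data.get(k), t) for k, t in REQUIRED_FIELDS.items()):
            return data
    raise ValueError(f"No schema-conforming output after {max_retries} attempts")

result = extract_structured("Classify this ticket: 'I was double-charged.'")
```

Instructor performs the same loop with real Pydantic models, feeding validation errors back into the retry prompt so the model can correct itself.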
Structured Output — Freeform vs JSON Schema Constrained
User Prompt
Extract the product name, price, and availability from this text: "The UltraTab Pro is currently in stock at $349."
Freeform Output
The product name is UltraTab Pro. Its price is $349 and it is currently available in stock.
JSON Schema Output
{ "product": "UltraTab Pro", "price_usd": 349, "in_stock": true }
JSON Schema Constraint
{ "type": "object", "properties": { "product": { "type": "string" }, "price_usd": { "type": "number" }, "in_stock": { "type": "boolean" } }, "required": ["product","price_usd","in_stock"] }
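Because the constrained output above is guaranteed to be valid JSON with the required fields present, downstream code can consume it with a standard parser and no regex or string splitting (a sketch using Python's stdlib json module):

```python
import json

# The schema-conforming response from the example above.
raw = '{ "product": "UltraTab Pro", "price_usd": 349, "in_stock": true }'

record = json.loads(raw)  # standard parser; no custom parsing logic

# Fields can be used directly by downstream routing, storage, or display.
label = f"{record['product']}: ${record['price_usd']}"
if record["in_stock"]:
    label += " (in stock)"

print(label)
```

Contrast this with the freeform output, where extracting the price would require pattern matching that breaks the moment the model rephrases its answer.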
Real-World Example
A 99helpers team builds a ticket classification system. Initial implementation: LLM returns free text like 'This is a billing question about subscription changes.' Parsing this is fragile—different phrasings break the regex. After switching to structured output with schema {category: string, subcategory: string, urgency: 'low'|'medium'|'high', sentiment: 'positive'|'neutral'|'negative'}, the API returns {category: 'billing', subcategory: 'subscription', urgency: 'medium', sentiment: 'neutral'} every time. Downstream routing logic uses these fields directly, reducing classification errors by 85% and eliminating all parsing failures.
Common Mistakes
- ✕ Using JSON mode without defining a specific schema—JSON mode ensures valid JSON syntax but not specific fields; schema enforcement ensures the exact structure needed.
- ✕ Expecting structured output to guarantee semantic correctness—the output structure is enforced, but field values can still be semantically wrong (e.g., sentiment classified incorrectly).
- ✕ Not implementing retry logic for structured output failures—even with schema enforcement, some providers may occasionally return non-conforming output; retry with increased temperature or rephrased instructions.
Related Terms
Function Calling
Function calling enables LLMs to request the execution of predefined functions with structured arguments, allowing AI systems to interact with external APIs, databases, and tools rather than just generating text.
Tool Use
Tool use is the broader capability of LLMs to interact with external systems—executing code, browsing the web, querying databases, reading files—by calling tools during generation to retrieve information or take actions.
LLM API
An LLM API is a cloud service interface that provides programmatic access to large language models, allowing developers to send prompts and receive completions without managing model infrastructure.
JSON Mode
JSON mode is an LLM API feature that guarantees the model's output is valid JSON, ensuring reliable programmatic parsing without worrying about prose text surrounding the JSON object.
Few-Shot Learning
Few-shot learning provides an LLM with a small number of input-output examples within the prompt, demonstrating the desired task format and behavior without updating model weights.
Ready to build your AI chatbot?
Put these concepts into practice with 99helpers — no code required.
Start free trial →