Prompt Engineering

LLM Security

Definition

LLM security is the discipline of identifying, assessing, and mitigating security risks specific to large language model applications. The OWASP Top 10 for LLMs identifies the most critical threat categories: prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, excessive agency, overreliance, and model theft. Unlike traditional software security, LLM security must contend with the inherent ambiguity of natural language—there is no equivalent to SQL parameterization that definitively prevents language-based attacks.

Why It Matters

LLM security is a critical but frequently neglected discipline as organizations rush to deploy AI features. Unlike API security or input validation—which have well-established patterns—LLM security requires new mental models: the input and instruction channels are the same text stream; model behavior is probabilistic, not deterministic; and 'fixing' one attack vector often doesn't prevent novel variants. Security teams that treat LLM components as black boxes with standard input validation will miss the LLM-specific attack surface. Dedicated LLM security review is required for any customer-facing or sensitive-data-handling AI deployment.

How It Works

LLM security practice covers: (1) threat modeling—identifying what an attacker could cause the model to do and what data they could extract; (2) red teaming—systematically testing adversarial inputs before deployment; (3) defense-in-depth—layering input classifiers, output filters, and agentic permission systems rather than relying on any single control; (4) monitoring—logging all LLM interactions and alerting on anomalous patterns; (5) access control—applying principle of least privilege to agentic tool access; (6) output handling—treating LLM outputs as untrusted user content in downstream systems (sanitize before rendering HTML, validate before executing code).

LLM Security — Threat Matrix

criticalhighmedium

Prompt InjectionInput Attack

Malicious instructions override system prompt or tool calls

critical

Data LeakageInformation Disclosure

System prompt or training data extracted via crafted queries

critical

JailbreakingSafety Bypass

Roleplay or framing manipulates model past content policies

high

Model InversionPrivacy Attack

Fine-tuning data inferred from model outputs at scale

high

Indirect InjectionSupply Chain

Malicious instructions embedded in documents / web pages AI reads

critical

Denial of ServiceAvailability

Adversarial inputs maximize token usage to exhaust quotas

medium

Defense Layers

•Input classifier

•Prompt hardening

•Output filtering

•Least-privilege tool access

•Audit logging

•Rate limiting

Real-World Example

A financial services firm conducted an LLM security review before deploying an AI research assistant with access to internal financial data. The review uncovered: (1) a prompt injection vulnerability in the document reader that could cause the model to exfiltrate document contents; (2) insufficient access controls that allowed the model to retrieve any document rather than only those relevant to the current query; (3) a hallucination risk where the model would confidently fabricate financial figures not in its retrieved context. Remediation—adding content isolation, retrieval filtering, and grounding instructions—took 3 weeks but prevented what would have been a significant security incident.

Common Mistakes

✕Treating LLM security as identical to traditional web application security—the attack vectors and defenses are fundamentally different
✕Relying on model safety training as a security boundary—safety training reduces risk but is not a security guarantee; business logic must not depend on it
✕Ignoring security of agentic systems—an LLM that can take actions (send email, query databases, call APIs) requires especially rigorous security review of what actions it can take and under what conditions

Ready to build your AI chatbot?

Put these concepts into practice with 99helpers — no code required.

Start free trial →

LLM Security

Definition

Why It Matters

How It Works

Real-World Example

Common Mistakes

Related Terms

Prompt Injection

Guardrails

Adversarial Prompting

Prompt Leaking

Prompt Engineering

Ready to build your AI chatbot?