OWASP Top 10 for LLM Apps (2025): A Developer’s Guide to Mitigations, Code Patterns, and Secure AI Pipelines By CyberDudeBivash • September 21, 2025 (IST)
Executive Summary
-
Treat all prompts and retrieved context as untrusted. Assume direct and indirect prompt injection—from web pages, PDFs, or “helpful” tool outputs. Bind LLMs behind capability caps, allowlisted tools, and schema-validated outputs before anything executes. OWASP places Prompt Injection and (Improper) Output Handling at the top of 2025 risks for good reason. OWASP GenAI+1
-
Privacy & data exposure are now first-class risks. LLMs can leak PII, secrets, or system prompts—sometimes via model behavior you didn’t intend. Build redaction at ingest, context filters for RAG, and tenant isolation by default; don’t rely on model “politeness.” OWASP GenAI
-
Ship a secure pipeline, not just a prompt. Lock model and tool versions, publish an SBOM for your LLM stack, pin dependencies, and policy-gate releases. Align your program with OWASP LLM Top-10 and NIST AI RMF + Generative AI Profile for governance and audits. OWASP+1
Table of Contents
-
The LLM Security Model (why traditional appsec alone isn’t enough)
-
The 2025 OWASP LLM Top-10 — plain English, abuse cases, and fixes
-
Reference Architecture (policy → router → retrieval firewall → model → output validator → tool sandbox)
-
Code Patterns (output schemas, tool allowlists, retrieval filters, token budgets)
-
CI/CD & Supply-Chain Controls (SBOM, signatures, policy gates)
-
Red-Team & Test Harness (prompt-injection suites, tool-abuse tests)
-
SRE SLOs for LLM Systems (cost, latency, safety)
-
30/60/90 Day Rollout Plan
-
Checklists, RFP Questions, and Runbooks
-
FAQ
Note: This edition is condensed to keep it readable here while still being production-useful. If you want this turned into a printable long-form, I can split each risk into a chapter with deeper code and test suites.
1) The LLM Security Model (in 5 minutes)
A modern LLM app is not just a model call. It’s a pipeline:
User → Prompt Router → Retrieval/Memory → Model → Output Validator → Tools/Plugins → Data Stores & External APIs → User
New attack realities
-
Inputs are programs. Natural language can smuggle instructions. So treat prompts and retrieved text like code: parse, constrain, and never auto-execute. OWASP formalizes this as LLM01 Prompt Injection. OWASP GenAI
-
Outputs can be payloads. URLs, commands, and JSON the model emits must be schema-validated and run in sandboxes with budgets. OWASP calls this Improper/Insecure Output Handling. OWASP GenAI
-
Supply chain is bigger. Models, embeddings, RAG corpora, tools, model routers, and guardrails are all dependencies. You need versioning + SBOM like you do for containers. OWASP elevates LLM Supply Chain and Poisoning in 2025. OWASP GenAI
-
Governance matters. Map your controls to NIST AI RMF and the Generative AI Profile to pass audits without slowing teams. NIST Publications
2) The 2025 OWASP LLM Top-10 — Abuse Cases & How to Fix
Below are concise, developer-first treatments of each OWASP 2025 risk. For exact wording and evolving details, refer to the official OWASP pages. OWASP GenAI
LLM01 — Prompt Injection (direct & indirect)
What it is. Inputs—typed by users or embedded in third-party content—steer your model to ignore rules, exfiltrate data, or trigger tools. That includes PDFs, web pages, emails, and “internal notes.” Attacks are often indirect via RAG or browsing. OWASP GenAI
Fixes you can ship this sprint
-
Separate channels: keep instructions, user input, and retrieval context as distinct structured fields.
-
Tool allowlists with argument schemas: the model can suggest, but your policy engine decides.
-
Context firewalls: strip secrets/keys/URLs; label untrusted text; apply domain & URL allowlists for any follow-up fetch.
-
HITL for high-risk actions; token/time budgets per request.
-
Red-team prompts in CI.
Why it stays hard: Even major labs say there’s no silver bullet; behavior can be influenced by non-obvious patterns (e.g., hidden instructions). Defense is layered. OWASP GenAI+1
LLM02 — Sensitive Information Disclosure
What it is. LLMs leak PII, secrets, system prompts, or proprietary methods via outputs or side-effects. Risk rises with memory, logs, RAG corpora, and “helpful” tools. OWASP GenAI
Fixes
-
Redact at ingest (mask emails, credit cards, keys); store hash + vault ref, not plaintext.
-
Retrieval filters: tenant-scoped, label-based selects; avoid global embeddings for multi-tenant systems.
-
Prompt hardening: the model may ignore; treat as advisory, not a control.
-
Data processing terms: opt-out of training/retention where applicable.
LLM03 — Supply Chain
What it is. Your LLM app depends on models, embeddings, datasets, tool plugins, vector DBs, and guardrail libraries. Drifts and tampered artifacts cause safety and integrity regressions. (OWASP 2025 elevates this risk.) OWASP GenAI
Fixes
-
SBOM for AI: include model name, build hash/commit, tokenizer version, safety config, guardrail models.
-
Pin & sign everything (models, weights, prompts, policies).
-
Repro builds: fast rollbacks; store provenance manifests with Merkle roots.
-
Integration tests with canary prompts and regression suites.
LLM04 — Data & Model Poisoning
What it is. Malicious or low-quality content lands in pretraining, fine-tuning, or RAG data; outputs shift (bias/backdoors/exfil). OWASP GenAI
Fixes
-
Signed corpora with provenance; hash lists for inclusion.
-
Poison detectors and outlier filters on new data; human sampling.
-
Gated retrains: require evals to pass before shipping.
-
Isolation: keep customer and public data separate unless explicitly approved.
LLM05 — Improper (Insecure) Output Handling
What it is. Your app trusts LLM outputs too much—rendering HTML, following URLs, executing code, or calling tools without validation or sandbox. OWASP calls out this failure mode directly. OWASP GenAI
Fixes
-
Require JSON-only responses with JSON Schema + strict parser.
-
No implicit execution: outputs feed a policy gate; risky ops need human approval or a hardened sandbox.
-
Content-type guards and URL/domain allowlists.
LLM06 — Excessive Agency
What it is. Agents with too many tools/permissions (email, tickets, code) chain actions to do unintended things. OWASP GenAI
Fixes
-
Capability caps (max tools/actions per session).
-
Spend/time budgets; kill switches.
-
Step-up approvals when writing data, spending money, or escalating privileges.
LLM07 — System Prompt Leakage
What it is. Hidden system instructions (and policy details) get exposed via clever queries or logs. Attackers then tailor injections. OWASP GenAI
Fixes
-
Avoid reflective Q&A about system roles.
-
Split policy from prompt in infra; redact in logs and analytics.
LLM08 — Vector & Embedding Weaknesses
What it is. Poisoned or low-quality embeddings, cross-tenant vector leakage, or semantic collisions degrade retrieval and enable indirect injection. OWASP GenAI
Fixes
-
Per-tenant indexes (or strong row-level security).
-
Pre-filter context; strip tool-like tokens (“ignore previous”, “call_tool”, “system: …”).
-
Embed features not secrets; no raw PII in vectors.
LLM09 — Misinformation
What it is. Unfounded statements used as facts—dangerous when outputs drive business processes or code. OWASP GenAI
Fixes
-
Citations required for high-risk answers; verify before action.
-
Uncertainty surfacing: ask for confidence bands; block actions under thresholds.
LLM10 — Unbounded Consumption
What it is. Token/cost amplification, tool loops, or adversarial long prompts causing DoS and runaway bills. OWASP GenAI
Fixes
-
Token & time ceilings per stage; guard recursion depth.
-
Cache hot prompts; small-model fallbacks; rate-limit per principal.
3) A Secure Reference Architecture (drop-in blueprint)
Policy & Identity Plane
-
Tenant, role, and budget limits; model & tool allowlists; approval matrix.
Prompt Router
-
Splits system, user, context. Applies normalizers (strip control-like words, collapse whitespace), assigns budgets.
Retrieval Firewall
-
Per-tenant filtering; PII redaction; domain/URL allowlists for follow-ups; poison/quality scans.
Model Gateway
-
Model and safety config pinned; JSON-only; forced max tokens; retry/backoff with audit.
Output Validator
-
JSON Schema parsing; typed conversion; policy gate to decide if any tool call is allowed.
Tool Sandbox
-
Idempotent, scoped functions; mTLS, short-lived tokens, read-only where possible; egress allowlists.
Observability & Forensics
-
Structured logs (no secrets/system prompts), prompt+context hashes, replayable traces, cost & safety metrics.
4) Code Patterns
4.1 Force JSON + Validate with JSON Schema (Python)
Why it matters: You never trust LLM strings. Parse → validate → type-check first; then consider taking action. OWASP lists Improper/Insecure Output Handling as a core failure. OWASP GenAI
4.2 Tool-Call Policy Gate (TypeScript)
4.3 Retrieval Firewall (Python)
4.4 Budgets & DoS Controls (server middleware)
Map to OWASP: LLM10 Unbounded Consumption / LLM04 DoS via complexity; apply ceilings and fallbacks. OWASP GenAI
4.5 SBOM for Your LLM Stack (YAML example)
Why: OWASP 2025 emphasizes Supply Chain; auditors will ask you to prove what ran in prod and where it came from. OWASP GenAI
5) CI/CD & Supply-Chain Controls
-
Policy as code: store prompts, policies, and tool allowlists in version control; sign releases.
-
Canary prompts: run a stable suite (harmful, jailbreaking, injection) on every build; block on regression.
-
Model pinning: upgrade behind a feature flag; re-run evals before rollout.
-
GenAI SBOM: attach to artifacts; publish to security portal.
-
Governance mapping: tag controls to OWASP LLM Top-10 and NIST AI RMF (GenAI Profile) categories for audit. NIST Publications
6) Red-Team Playbook (ship with your repo)
-
Prompt-Injection Suite: direct, indirect (web/PDF), obfuscated (emoji/Base64), multilingual, adversarial suffix. Use MITRE ATLAS patterns for coverage. OWASP GenAI
-
Output-Abuse Suite: URLs to non-allowlisted domains; HTML/JS injection; code snippets that try to execute.
-
Tool-Abuse Suite: attempts to chain tools, escalate, or spend; verify HITL triggers.
-
Leakage Suite: attempts to reveal system prompts, API keys, or another tenant’s data.
-
Consumption Suite: long recursive prompts; measure ceilings and backoffs.
7) SRE SLOs for LLM Systems
-
Safety SLO: < 0.1% blocked events that bypass policy in staging; 0 in prod (monitor).
-
Cost SLO: p95 cost per request below threshold; alert on spikes (possible injection).
-
Latency SLO: p95 response under X ms with caching & fallbacks.
-
Drift SLO: model/router/guardrail drift = 0 without signed release.
8) 30/60/90 Day Rollout
Day 0–30 (Stabilize):
-
Force JSON-only + schema validation; institute tool allowlist and HITL for high-risk.
-
Add retrieval firewall with PII redaction + domain allowlist.
-
Pin model version; attach AI-SBOM; add canary prompts to CI.
Day 31–60 (Harden):
-
Split per-tenant indexes; add budgets (tokens/time/tools).
-
Build policy gate UI for approvals; wire cost & safety metrics.
Day 61–90 (Operate):
-
Run red-team exercises; pass auditors with NIST AI RMF mapping & OWASP LLM Top-10 controls. NIST Publications
9) Checklists & RFP Questions
Secure-by-Default Checklist (ship now)
-
JSON-only outputs + JSON Schema parsing
-
Tool allowlist + typed args + HITL for “write/spend”
-
Retrieval firewall (PII redaction, domain allowlists)
-
Model pinning + AI-SBOM + signed releases
-
Budgets (tokens/time/tools) + kill switch
-
Red-team suites in CI; block on regression
-
Structured logs without secrets/system prompts
Vendor/RFP Questions (copy/paste)
-
Do you provide pass/fail eval suites for injection, leakage, and output handling?
-
Can you pin versions of models, tokenizers, and safety configs; do you produce an SBOM?
-
How do you enforce domain/tool allowlists and budgets?
-
What’s your tenant isolation model for embeddings and memories?
-
How do you map to OWASP LLM Top-10 (2025) and NIST AI RMF (GenAI)?
10) FAQ
Q: Can I stop prompt injection completely?
Realistically, no—you can mitigate and contain it with structured inputs, output validation, capability caps, and HITL. OWASP calls out the inherent difficulty; Wired’s reporting underscores why “indirect” injections are stubborn. OWASP GenAI+1
Q: What about privacy laws?
Build data minimization, redaction, opt-outs, and transparent retention. Use the NIST AI RMF + GenAI Profile as a governance scaffold alongside your regional regs. NIST Publications
Q: Which three fixes give the biggest ROI?
-
JSON-only + schema; 2) Tool allowlists + HITL; 3) Retrieval firewall with PII redaction + allowlisted fetch.
Sources (official and current)
-
OWASP Top 10 for LLM Applications (2025) — project hub & risk pages. OWASP GenAI
-
OWASP LLM01 — Prompt Injection (2025) — direct/indirect definitions & mitigations. OWASP GenAI
-
OWASP LLM05 — Improper/Insecure Output Handling (2025) — validation before action. OWASP GenAI
-
OWASP LLM02 — Sensitive Information Disclosure (2025) — data exposure risks & practices. OWASP GenAI
-
OWASP Prompt Injection Prevention Cheat Sheet — concrete do/don’t for devs. OWASP Cheat Sheet Series
-
NIST AI RMF + Generative AI Profile (2024/2025) — governance scaffolding to pair with OWASP controls. NIST Publications
Comments
Post a Comment