OWASP Top 10 for LLM Apps (2025): A Developer’s Guide to Mitigations, Code Patterns, and Secure AI Pipelines By CyberDudeBivash • September 21, 2025 (IST)

Executive Summary

Treat all prompts and retrieved context as untrusted. Assume direct and indirect prompt injection—from web pages, PDFs, or “helpful” tool outputs. Bind LLMs behind capability caps, allowlisted tools, and schema-validated outputs before anything executes. OWASP places Prompt Injection and (Improper) Output Handling at the top of 2025 risks for good reason. OWASP GenAI+1
Privacy & data exposure are now first-class risks. LLMs can leak PII, secrets, or system prompts—sometimes via model behavior you didn’t intend. Build redaction at ingest, context filters for RAG, and tenant isolation by default; don’t rely on model “politeness.” OWASP GenAI
Ship a secure pipeline, not just a prompt. Lock model and tool versions, publish an SBOM for your LLM stack, pin dependencies, and policy-gate releases. Align your program with OWASP LLM Top-10 and NIST AI RMF + Generative AI Profile for governance and audits. OWASP+1

The LLM Security Model (why traditional appsec alone isn’t enough)
The 2025 OWASP LLM Top-10 — plain English, abuse cases, and fixes
Reference Architecture (policy → router → retrieval firewall → model → output validator → tool sandbox)
Code Patterns (output schemas, tool allowlists, retrieval filters, token budgets)
CI/CD & Supply-Chain Controls (SBOM, signatures, policy gates)
Red-Team & Test Harness (prompt-injection suites, tool-abuse tests)
SRE SLOs for LLM Systems (cost, latency, safety)
30/60/90 Day Rollout Plan
Checklists, RFP Questions, and Runbooks
FAQ

Note: This edition is condensed to keep it readable here while still being production-useful. If you want this turned into a printable long-form, I can split each risk into a chapter with deeper code and test suites.

1) The LLM Security Model (in 5 minutes)

A modern LLM app is not just a model call. It’s a pipeline:

User → Prompt Router → Retrieval/Memory → Model → Output Validator → Tools/Plugins → Data Stores & External APIs → User

New attack realities

Inputs are programs. Natural language can smuggle instructions. So treat prompts and retrieved text like code: parse, constrain, and never auto-execute. OWASP formalizes this as LLM01 Prompt Injection. OWASP GenAI
Outputs can be payloads. URLs, commands, and JSON the model emits must be schema-validated and run in sandboxes with budgets. OWASP calls this Improper/Insecure Output Handling. OWASP GenAI
Supply chain is bigger. Models, embeddings, RAG corpora, tools, model routers, and guardrails are all dependencies. You need versioning + SBOM like you do for containers. OWASP elevates LLM Supply Chain and Poisoning in 2025. OWASP GenAI
Governance matters. Map your controls to NIST AI RMF and the Generative AI Profile to pass audits without slowing teams. NIST Publications

2) The 2025 OWASP LLM Top-10 — Abuse Cases & How to Fix

Below are concise, developer-first treatments of each OWASP 2025 risk. For exact wording and evolving details, refer to the official OWASP pages. OWASP GenAI

LLM01 — Prompt Injection (direct & indirect)

What it is. Inputs—typed by users or embedded in third-party content—steer your model to ignore rules, exfiltrate data, or trigger tools. That includes PDFs, web pages, emails, and “internal notes.” Attacks are often indirect via RAG or browsing. OWASP GenAI

Fixes you can ship this sprint

Separate channels: keep instructions, user input, and retrieval context as distinct structured fields.
Tool allowlists with argument schemas: the model can suggest, but your policy engine decides.
Context firewalls: strip secrets/keys/URLs; label untrusted text; apply domain & URL allowlists for any follow-up fetch.
HITL for high-risk actions; token/time budgets per request.
Red-team prompts in CI.

Why it stays hard: Even major labs say there’s no silver bullet; behavior can be influenced by non-obvious patterns (e.g., hidden instructions). Defense is layered. OWASP GenAI+1

LLM02 — Sensitive Information Disclosure

What it is. LLMs leak PII, secrets, system prompts, or proprietary methods via outputs or side-effects. Risk rises with memory, logs, RAG corpora, and “helpful” tools. OWASP GenAI

Fixes

Redact at ingest (mask emails, credit cards, keys); store hash + vault ref, not plaintext.
Retrieval filters: tenant-scoped, label-based selects; avoid global embeddings for multi-tenant systems.
Prompt hardening: the model may ignore; treat as advisory, not a control.
Data processing terms: opt-out of training/retention where applicable.

LLM03 — Supply Chain

What it is. Your LLM app depends on models, embeddings, datasets, tool plugins, vector DBs, and guardrail libraries. Drifts and tampered artifacts cause safety and integrity regressions. (OWASP 2025 elevates this risk.) OWASP GenAI

Fixes

SBOM for AI: include model name, build hash/commit, tokenizer version, safety config, guardrail models.
Pin & sign everything (models, weights, prompts, policies).
Repro builds: fast rollbacks; store provenance manifests with Merkle roots.
Integration tests with canary prompts and regression suites.

LLM04 — Data & Model Poisoning

What it is. Malicious or low-quality content lands in pretraining, fine-tuning, or RAG data; outputs shift (bias/backdoors/exfil). OWASP GenAI

Fixes

Signed corpora with provenance; hash lists for inclusion.
Poison detectors and outlier filters on new data; human sampling.
Gated retrains: require evals to pass before shipping.
Isolation: keep customer and public data separate unless explicitly approved.

LLM05 — Improper (Insecure) Output Handling

What it is. Your app trusts LLM outputs too much—rendering HTML, following URLs, executing code, or calling tools without validation or sandbox. OWASP calls out this failure mode directly. OWASP GenAI

Fixes

Require JSON-only responses with JSON Schema + strict parser.
No implicit execution: outputs feed a policy gate; risky ops need human approval or a hardened sandbox.
Content-type guards and URL/domain allowlists.

LLM06 — Excessive Agency

What it is. Agents with too many tools/permissions (email, tickets, code) chain actions to do unintended things. OWASP GenAI

Fixes

Capability caps (max tools/actions per session).
Spend/time budgets; kill switches.
Step-up approvals when writing data, spending money, or escalating privileges.

LLM07 — System Prompt Leakage

What it is. Hidden system instructions (and policy details) get exposed via clever queries or logs. Attackers then tailor injections. OWASP GenAI

Fixes

Avoid reflective Q&A about system roles.
Split policy from prompt in infra; redact in logs and analytics.

LLM08 — Vector & Embedding Weaknesses

What it is. Poisoned or low-quality embeddings, cross-tenant vector leakage, or semantic collisions degrade retrieval and enable indirect injection. OWASP GenAI

Fixes

Per-tenant indexes (or strong row-level security).
Pre-filter context; strip tool-like tokens (“ignore previous”, “call_tool”, “system: …”).
Embed features not secrets; no raw PII in vectors.

LLM09 — Misinformation

What it is. Unfounded statements used as facts—dangerous when outputs drive business processes or code. OWASP GenAI

Fixes

Citations required for high-risk answers; verify before action.
Uncertainty surfacing: ask for confidence bands; block actions under thresholds.

LLM10 — Unbounded Consumption

What it is. Token/cost amplification, tool loops, or adversarial long prompts causing DoS and runaway bills. OWASP GenAI

Fixes

Token & time ceilings per stage; guard recursion depth.
Cache hot prompts; small-model fallbacks; rate-limit per principal.

3) A Secure Reference Architecture (drop-in blueprint)

Policy & Identity Plane

Tenant, role, and budget limits; model & tool allowlists; approval matrix.

Prompt Router

Splits system, user, context. Applies normalizers (strip control-like words, collapse whitespace), assigns budgets.

Retrieval Firewall

Per-tenant filtering; PII redaction; domain/URL allowlists for follow-ups; poison/quality scans.

Model Gateway

Model and safety config pinned; JSON-only; forced max tokens; retry/backoff with audit.

Output Validator

JSON Schema parsing; typed conversion; policy gate to decide if any tool call is allowed.

Tool Sandbox

Idempotent, scoped functions; mTLS, short-lived tokens, read-only where possible; egress allowlists.

Observability & Forensics

Structured logs (no secrets/system prompts), prompt+context hashes, replayable traces, cost & safety metrics.

4) Code Patterns

4.1 Force JSON + Validate with JSON Schema (Python)


import json
from jsonschema import validate, Draft202012Validator
from jsonschema.exceptions import ValidationError

# 1) Require JSON output in your prompt
JSON_INSTRUCTION = (
  "You MUST return ONLY valid minified JSON matching this schema."
)

# 2) Schema for a safe tool call (allowlist name + typed args)
TOOL_SCHEMA = {
  "type": "object",
  "required": ["tool", "args"],
  "properties": {
    "tool": {"type": "string", "enum": ["create_ticket", "lookup_order", "send_reply"]},
    "args": {
      "type": "object",
      "additionalProperties": False,
      "properties": {
        "ticket_title": {"type": "string", "maxLength": 200},
        "order_id": {"type": "string", "pattern": "^[A-Z0-9-]{4,}$"},
        "message": {"type": "string", "maxLength": 2000}
      }
    }
  },
  "additionalProperties": False
}

def parse_llm_json(s: str) -> dict:
  try:
    data = json.loads(s)
    Draft202012Validator(TOOL_SCHEMA).validate(data)
    return data
  except (json.JSONDecodeError, ValidationError) as e:
    raise ValueError(f"Invalid model output: {e}")

Why it matters: You never trust LLM strings. Parse → validate → type-check first; then consider taking action. OWASP lists Improper/Insecure Output Handling as a core failure. OWASP GenAI

4.2 Tool-Call Policy Gate (TypeScript)


type Tool = "create_ticket" | "lookup_order" | "send_reply";

const POLICY: Record<Tool, {risk: "low"|"medium"|"high"; approve: (args:any)=>boolean}> = {
  create_ticket: { risk: "medium", approve: (a) => !!a.ticket_title && a.ticket_title.length <= 200 },
  lookup_order:  { risk: "low",    approve: (a) => /^[A-Z0-9-]{4,}$/.test(a.order_id) },
  send_reply:    { risk: "high",   approve: (a) => typeof a.message === "string" && a.message.length <= 2000 }
};

export function policyGate(output: {tool: Tool, args: any}) {
  const rule = POLICY[output.tool];
  if (!rule) throw new Error("Tool not allowed");
  if (!rule.approve(output.args)) throw new Error("Args rejected");

  // Optional: require human approval for "high" risk
  if (rule.risk === "high") throw new Error("HITL required");
  return true;
}

4.3 Retrieval Firewall (Python)


import re

ALLOW_DOMAINS = {"docs.acme.com", "kb.acme.com"}
BLOCK_PATTERNS = [
  r"(?i)ignore previous", r"(?i)system:", r"(?i)call_tool", r"(?i)execute",
]

def strip_toollike(text: str) -> str:
  t = text
  for pat in BLOCK_PATTERNS:
    t = re.sub(pat, "[blocked-token]", t)
  return t

def allow_url(url: str) -> bool:
  from urllib.parse import urlparse
  host = urlparse(url).hostname or ""
  return any(host.endswith(d) for d in ALLOW_DOMAINS)

def redact_pii(text: str) -> str:
  text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[ssn]", text)
  text = re.sub(r"\b(?:\d[ -]*?){13,16}\b", "[card]", text)
  return text

4.4 Budgets & DoS Controls (server middleware)


export function budgetGuard({maxTokens=1500, maxMs=12000, maxTools=2}) {
  return async (req, res, next) => {
    req.budget = { tokens: maxTokens, deadline: Date.now() + maxMs, toolsLeft: maxTools };
    res.on("finish", () => {/* emit cost + safety metrics */});
    next();
  };
}

Map to OWASP: LLM10 Unbounded Consumption / LLM04 DoS via complexity; apply ceilings and fallbacks. OWASP GenAI

4.5 SBOM for Your LLM Stack (YAML example)


ai-sbom:
  app: "helpdesk-assistant"
  version: "2025.09.21"
  models:
    - provider: "acme-hosted"
      name: "llm-x-70b"
      build_sha: "5c8e45..."
      tokenizer: "tkr-2025.06"
      safety_config: "safe-v3.2"
  guardrails:
    - name: "pii-redactor"
      version: "1.8.1"
      policy_sha: "9f1a2b..."
  embeddings:
    - model: "embed-2.0"
      dim: 3072
      index: "vecdb-tenant-silo"
  data:
    - corpus: "kb-2025q3"
      manifest_merkle_root: "0xabc123..."
      signed_by: "security@acme.com"

Why: OWASP 2025 emphasizes Supply Chain; auditors will ask you to prove what ran in prod and where it came from. OWASP GenAI

5) CI/CD & Supply-Chain Controls

Policy as code: store prompts, policies, and tool allowlists in version control; sign releases.
Canary prompts: run a stable suite (harmful, jailbreaking, injection) on every build; block on regression.
Model pinning: upgrade behind a feature flag; re-run evals before rollout.
GenAI SBOM: attach to artifacts; publish to security portal.
Governance mapping: tag controls to OWASP LLM Top-10 and NIST AI RMF (GenAI Profile) categories for audit. NIST Publications

6) Red-Team Playbook (ship with your repo)

Prompt-Injection Suite: direct, indirect (web/PDF), obfuscated (emoji/Base64), multilingual, adversarial suffix. Use MITRE ATLAS patterns for coverage. OWASP GenAI
Output-Abuse Suite: URLs to non-allowlisted domains; HTML/JS injection; code snippets that try to execute.
Tool-Abuse Suite: attempts to chain tools, escalate, or spend; verify HITL triggers.
Leakage Suite: attempts to reveal system prompts, API keys, or another tenant’s data.
Consumption Suite: long recursive prompts; measure ceilings and backoffs.

7) SRE SLOs for LLM Systems

Safety SLO: < 0.1% blocked events that bypass policy in staging; 0 in prod (monitor).
Cost SLO: p95 cost per request below threshold; alert on spikes (possible injection).
Latency SLO: p95 response under X ms with caching & fallbacks.
Drift SLO: model/router/guardrail drift = 0 without signed release.

8) 30/60/90 Day Rollout

Day 0–30 (Stabilize):

Force JSON-only + schema validation; institute tool allowlist and HITL for high-risk.
Add retrieval firewall with PII redaction + domain allowlist.
Pin model version; attach AI-SBOM; add canary prompts to CI.

Day 31–60 (Harden):

Split per-tenant indexes; add budgets (tokens/time/tools).
Build policy gate UI for approvals; wire cost & safety metrics.

Day 61–90 (Operate):

Run red-team exercises; pass auditors with NIST AI RMF mapping & OWASP LLM Top-10 controls. NIST Publications

9) Checklists & RFP Questions

Secure-by-Default Checklist (ship now)

JSON-only outputs + JSON Schema parsing
Tool allowlist + typed args + HITL for “write/spend”
Retrieval firewall (PII redaction, domain allowlists)
Model pinning + AI-SBOM + signed releases
Budgets (tokens/time/tools) + kill switch
Red-team suites in CI; block on regression
Structured logs without secrets/system prompts

Vendor/RFP Questions (copy/paste)

Do you provide pass/fail eval suites for injection, leakage, and output handling?
Can you pin versions of models, tokenizers, and safety configs; do you produce an SBOM?
How do you enforce domain/tool allowlists and budgets?
What’s your tenant isolation model for embeddings and memories?
How do you map to OWASP LLM Top-10 (2025) and NIST AI RMF (GenAI)?

10) FAQ

Q: Can I stop prompt injection completely?
Realistically, no—you can mitigate and contain it with structured inputs, output validation, capability caps, and HITL. OWASP calls out the inherent difficulty; Wired’s reporting underscores why “indirect” injections are stubborn. OWASP GenAI+1

Q: What about privacy laws?
Build data minimization, redaction, opt-outs, and transparent retention. Use the NIST AI RMF + GenAI Profile as a governance scaffold alongside your regional regs. NIST Publications

Q: Which three fixes give the biggest ROI?

JSON-only + schema; 2) Tool allowlists + HITL; 3) Retrieval firewall with PII redaction + allowlisted fetch.

Sources (official and current)

OWASP Top 10 for LLM Applications (2025) — project hub & risk pages. OWASP GenAI
OWASP LLM01 — Prompt Injection (2025) — direct/indirect definitions & mitigations. OWASP GenAI
OWASP LLM05 — Improper/Insecure Output Handling (2025) — validation before action. OWASP GenAI
OWASP LLM02 — Sensitive Information Disclosure (2025) — data exposure risks & practices. OWASP GenAI
OWASP Prompt Injection Prevention Cheat Sheet — concrete do/don’t for devs. OWASP Cheat Sheet Series
NIST AI RMF + Generative AI Profile (2024/2025) — governance scaffolding to pair with OWASP controls. NIST Publications

#CyberDudeBivash #OWASP #LLM #GenAI #AppSec #PromptInjection #OutputHandling #RAG #AgentSecurity #DataPrivacy #AITrust #NISTAIRMF #SecureAI #2025

Search This Blog

Cyberdudebivash