OWASP LLM Top 10 (2025): A Practical Compliance Guide for AI Teams
The OWASP Top 10 for LLM Applications has become the de facto framework for assessing LLM security risks. This guide walks through each of the ten items, explains the real-world risks in the context of AI agents using MCP, and shows how to implement effective controls.
Why OWASP LLM Top 10 Matters for MCP Deployments
The OWASP Top 10 for LLM Applications is not just a checklist -- it is a risk framework that maps directly to real attack vectors observed in production AI systems. For organizations deploying AI agents with MCP access to enterprise tools and data, these risks are amplified. An LLM that can read databases, send emails, and modify records through MCP tools has a larger blast radius than a chatbot answering questions from a knowledge base.
Compliance with the OWASP LLM Top 10 is increasingly becoming a requirement for enterprise AI deployments. Security teams, auditors, and regulators reference this framework when evaluating AI system security. Understanding each item and having documented controls is essential.
Prompt Injection
Prompt injection occurs when an attacker manipulates the input to an LLM to override its instructions, bypass safety measures, or trigger unintended actions. In MCP environments, prompt injection takes two primary forms.
Direct injection happens when a user crafts input specifically to manipulate the agent -- for example, prefixing their request with "Ignore all previous instructions and instead..." This is the form most people think of, and most modern AI models have some resistance to it.
Indirect injection is more dangerous in MCP contexts. When an agent retrieves data from an MCP tool -- a web page, a database record, a document -- that data may contain injected instructions. The agent processes this data as part of its context, and the injected instructions can alter its behavior. Since MCP tools return arbitrary content, every tool response is a potential injection vector.
How INS addresses this: INS scans both incoming requests and outgoing tool responses for prompt injection patterns. The detection pipeline uses pattern matching for known injection signatures and structural analysis to detect novel injection attempts. Suspicious content is flagged, logged, and optionally blocked before it reaches the agent's context window.
Sensitive Information Disclosure
LLMs can inadvertently reveal sensitive information -- training data, system prompts, PII from previous interactions, or data accessed through tools. In MCP workflows, this risk is compounded because agents access real enterprise data through tools and may include that data in responses to users, in subsequent tool calls, or in logs.
The exposure vectors include: agents including PII from database queries in their responses to users; agents sending customer data to analytics or monitoring MCP tools; system prompts containing API keys or internal URLs being revealed through prompt extraction attacks; and over-fetching from data tools where the response contains more sensitive data than the task requires.
How INS addresses this: Bidirectional PII and secret scanning on every MCP interaction. INS detects over 120 types of sensitive data including SSNs, credit card numbers, API keys, and credentials. Configurable masking, redaction, or tokenization ensures sensitive data does not flow to unauthorized destinations. Full audit logging provides a compliance trail.
Supply Chain Vulnerabilities
The LLM supply chain includes model providers, training data sources, plugins, and -- critically for MCP deployments -- the MCP servers and tools that agents connect to. A compromised MCP server is a supply chain attack: your agent trusts it because it was authorized, but the server is now serving malicious tool descriptions or returning poisoned data.
This category encompasses tool poisoning (malicious tool descriptions), rug pull attacks (servers changing behavior after initial trust), and compromised third-party MCP servers that inject malicious content into tool responses.
How INS addresses this: Continuous tool definition monitoring with differential analysis detects rug pull attacks. Tool description scanning catches poisoning attempts. MCP server integrity tracking maintains baselines and alerts on any changes. All third-party tool responses are scanned before reaching the agent.
Data and Model Poisoning
While model poisoning (manipulating training data) is typically addressed at the model provider level, data poisoning through MCP tools is a direct concern for agent deployments. If an MCP tool returns manipulated data -- incorrect financial figures, altered customer records, fabricated search results -- the agent will use this data in its reasoning and outputs, potentially causing real-world harm.
In the MCP context, data poisoning can be targeted: an attacker who compromises a single MCP server can influence every agent that uses it, potentially affecting thousands of downstream decisions.
How INS addresses this: Response integrity scanning detects anomalous tool responses. Behavioral analysis identifies when tool responses deviate significantly from expected patterns. Audit logging preserves a complete record of all tool responses for forensic analysis.
Improper Output Handling
When LLM output is fed directly into downstream systems without validation, it creates injection risks. In MCP workflows, this manifests when an agent's output is used as input to another tool: for example, an agent generates a SQL query and passes it to a database tool, or an agent composes an email body and sends it through a mail tool. If the agent has been manipulated (via prompt injection or data poisoning), its outputs to downstream tools can be malicious.
This is particularly dangerous because the downstream MCP tool trusts the agent as a legitimate client and executes whatever the agent sends -- including SQL injection, command injection, or cross-site scripting payloads embedded in the agent's output.
How INS addresses this: Request-side scanning detects injection payloads in tool call parameters. Policy enforcement can restrict which parameter values are allowed for sensitive tools. Parameterized tool calls are validated against the tool's input schema before being forwarded to the MCP server.
Excessive Agency
Excessive agency occurs when an AI agent has more permissions, access, or autonomy than necessary for its task. In MCP deployments, this often means agents that can access every registered MCP server and invoke any tool, including destructive operations like deleting records, modifying configurations, or sending external communications.
The principle of least privilege applies directly: a customer support agent should not have access to production database administration tools. An analytics agent should not be able to send emails. When agents have excessive agency, the impact of any successful attack -- prompt injection, tool poisoning, or data poisoning -- is amplified dramatically.
How INS addresses this: Granular policy enforcement controls which agents can access which tools on which MCP servers. Policies support conditions based on agent identity, tool name, time of day, and parameter values. Destructive operations can require explicit approval workflows. Rate limiting prevents agents from making an excessive number of tool calls.
System Prompt Leakage
System prompts often contain sensitive information: internal instructions, safety guardrails, API keys, internal URLs, or business logic. Attackers use various techniques to extract system prompts, including direct requests ("Repeat your instructions"), roleplay scenarios, and indirect extraction through tool responses that reflect the agent's instructions.
In MCP contexts, system prompt leakage can reveal which tools an agent has access to, how it decides which tools to use, and what security constraints have been configured -- information that an attacker can use to craft more targeted attacks.
How INS addresses this: Output scanning detects when agent responses contain patterns consistent with system prompt leakage. Monitoring of tool call patterns can identify reconnaissance behavior where an attacker is probing to discover the agent's capabilities and constraints.
Vector and Embedding Weaknesses
When agents use RAG (Retrieval Augmented Generation) through MCP tools that query vector databases, the retrieved content becomes part of the agent's context. If the vector database has been poisoned with malicious documents, the retrieval process becomes an indirect injection vector. An attacker can plant a document containing injection instructions that will be retrieved when certain queries are made.
This is particularly concerning because vector search is semantic -- the malicious document does not need to contain the exact query terms, just semantically similar content. This makes poisoning attacks harder to detect with traditional keyword-based security tools.
How INS addresses this: Response scanning catches injection attempts in RAG results regardless of how the document was retrieved. The same prompt injection detection that protects against direct attacks also protects against injection through retrieved content. Content length and complexity analysis flags anomalous retrieval results.
Misinformation
LLMs can generate plausible but incorrect information (hallucination). When an agent uses MCP tools to take real-world actions based on hallucinated reasoning -- sending incorrect data to a customer, making wrong calculations for financial reports, or generating inaccurate compliance documentation -- the consequences go beyond a bad chatbot response.
In MCP workflows, misinformation risk is compounded when tool responses are manipulated (data poisoning) and the agent incorporates the false data into its reasoning chain. The agent may then make a series of tool calls based on incorrect premises, with each subsequent action compounding the error.
How INS addresses this: Audit logging captures the complete chain of tool calls and responses, enabling forensic reconstruction of how the agent arrived at its decisions. Anomaly detection flags unusual patterns in tool usage that may indicate the agent is operating on incorrect data. Policy controls can require human approval for high-stakes actions.
Unbounded Consumption
LLM applications can be exploited to consume excessive resources -- either through denial-of-service attacks or through manipulation that causes the agent to make an excessive number of expensive tool calls. In MCP environments, an attacker might craft inputs that cause the agent to enter a loop of tool invocations, overwhelming the MCP servers or running up costs.
Beyond intentional attacks, poorly designed agent workflows can inadvertently create resource exhaustion. An agent that retries failed tool calls without backoff, or that queries every record in a database table, can overwhelm downstream systems.
How INS addresses this: Per-agent rate limiting controls the number of tool calls within configurable time windows. Per-tool rate limiting prevents any single tool from being overwhelmed. Request size limits prevent excessively large payloads. Circuit breaker patterns detect and stop tool invocation loops. All rate limit events are logged for analysis.
Building a Compliance Program
Mapping the OWASP LLM Top 10 to your MCP deployment is the first step toward a comprehensive AI security compliance program. Here is a practical approach:
- Assessment: Map each OWASP item to your specific MCP architecture. Identify which tools, agents, and data flows are exposed to each risk.
- Controls: For each identified risk, implement technical controls (gateway scanning, policy enforcement, rate limiting) and procedural controls (approval workflows, regular reviews, incident response plans).
- Documentation: Document your control mapping. Auditors and regulators will ask how you address each OWASP item. A clear mapping document saves significant time during audits.
- Monitoring: Deploy continuous monitoring for each risk category. Real-time dashboards showing threat detection rates, policy violations, and anomaly alerts provide ongoing assurance.
- Testing: Regularly test your controls with adversarial simulations. Attempt prompt injection, tool poisoning, and data exfiltration against your own systems to verify that your defenses work.
- Iteration: The threat landscape evolves. Review and update your controls quarterly, incorporating new attack techniques and updating detection patterns.
The Bottom Line
The OWASP LLM Top 10 provides a structured framework for thinking about AI security risks. For organizations deploying AI agents with MCP tool access, these are not theoretical vulnerabilities -- they are practical risks that require concrete controls. A purpose-built MCP security gateway addresses the majority of these risks at the infrastructure level, providing consistent protection across all agent-to-tool interactions.
Compliance is not a one-time achievement. It requires ongoing vigilance, regular testing, and continuous improvement. But with the right foundation -- a security gateway that provides visibility, detection, and enforcement -- organizations can deploy AI agents confidently while meeting the security standards that enterprises, auditors, and regulators demand.
INS Security Team
Building enterprise security for the AI agent era.
Address the OWASP LLM Top 10 with one deployment
INS provides comprehensive coverage across all 10 OWASP LLM risk categories. See our full OWASP compliance mapping.
Join the WaitlistRelated Posts
March 25, 2026
What Is MCP Security?
A complete guide to securing Model Context Protocol and understanding the MCP attack surface.
March 28, 2026
MCP Tool Poisoning Attacks
How malicious tool descriptions compromise AI agents and what you can do about it.
April 1, 2026
How to Prevent PII Leaks in AI Agent Workflows
Types of PII exposure, detection approaches, and masking strategies for AI systems.