Microsoft: Goal Hijacking and Zero-Click RCE via Poisoned MCP Tool Descriptions

Published July 5, 2026

Vulnerability Analysis🏢 Microsoft #Microsoft #MCP #LLMSecurity #RCE #AgenticAI

Microsoft's AI Red Team and Lakera AI have identified a critical vulnerability in agentic AI systems utilizing the Model Context Protocol (MCP). Adversaries can poison the natural language descriptions of MCP tools to deceive AI agents into "Goal Hijacking," redirecting the agent from its intended objective to attacker-defined tasks. This vulnerability enables zero-click exploit chains where agents autonomously execute malicious actions, including remote code execution (RCE) in agentic IDEs and unauthorized data exfiltration, without requiring user interaction beyond the agent's initial deployment. This mechanism effectively bypasses traditional human-in-the-loop safeguards by exploiting the agent's inherent trust in tool metadata.

Threat Model & MCP Overview
- Model Context Protocol (MCP) enables AI agents to dynamically discover and interact with external tools and data sources.
- Agentic systems rely on natural language tool descriptions to determine tool selection and parameter execution.
- The attack surface exists in the trust relationship between the LLM and the tool's descriptive metadata.
Attack Mechanics: Goal Hijacking
- Poisoned Tool Descriptions: Attackers inject deceptive instructions into the "description" field of an MCP tool definition.
- Goal Redirection: The LLM interprets the poisoned description as a high-priority directive, overriding the user's original prompt.
- Zero-Click Execution: Malicious goals are triggered automatically upon tool selection, removing the need for further social engineering or user interaction.
Technical Impact & Exploitation
- RCE in IDEs: Validation of Remote Code Execution capabilities within agentic integrated development environments.
- EchoLeak Exploit: Demonstrated capability to exploit Microsoft Copilot to facilitate unauthorized data movement.
- Lateral Movement: Agents can be manipulated to pivot from external inputs to internal network resources autonomously.
Systemic Risk & Threat Landscape
- Failure Modes: Identification of seven new systemic failure modes specifically affecting agentic AI architectures.
- Actor Activity: Observation of 31 commercially operating threat groups actively targeting agentic AI interfaces.
- Oversight Erosion: Traditional human-in-the-loop (HITL) controls are bypassed when the agent perceives a malicious action as a legitimate tool function.
Mitigation & Defensive Strategy
- Tool Validation: Implementation of rigorous schema validation and sanitization for all MCP tool descriptions.
- Least Privilege Access: Restricting agent tool access based on strict context and minimal necessary permissions.
- Behavioral Monitoring: Implementing detection for anomalies in tool selection patterns and unexpected output goals.

techtarget.com — The agentic AI 'lethal trifecta': What CISOs should know
techjacksolutions.com — AI Infrastructure and Software Supply Chain Under Coordinated Pressure: Credential Theft, Agent Hijacking, and MFA Bypass Converge Across Enterprise Technology Stack
techjacksolutions.com — Agentic AI Attack Surface Confirmed: Red Team Data Validates 7 New Failure Modes as Zero-Click Exploit Chains Emerge
Microsoft Security Blog — Updating the taxonomy of failure modes in agentic AI systems: What a year of red teaming taught us
Arxiv
Lakera
kiteworks.com — Microsoft Named 7 Ways Your AI Agents Can Be Hacked. One Is Already Happening in Production.
Cdn-dynmedia-1
Letsdatascience
Crowdstrike
Thehackernews
Reddit
Cambridgeanalytica
Openreview
Mdpi
Okta
Reddit
Hiddenlayer
Cdw
Workos
Getastra
Obsidiansecurity
Covertswarm
Zenity
Grokipedia
Dokumen
Dark Reading — Attackers Hijack Exposed AI Endpoints to Power Offensive Ops

FlagThis

Microsoft: Goal Hijacking and Zero-Click RCE via Poisoned MCP Tool Descriptions

Related posts

Microsoft: Goal Hijacking and Zero-Click RCE via Poisoned MCP Tool Descriptions

Related posts

SHARE INTELLIGENCE WIRE