Attackers are leveraging Indirect Prompt Injection (IPI) to hijack AI agents from OpenAI, Anthropic, and Google by weaponizing the Retrieval-Augmented Generation (RAG) process. Through SEO poisoning, malicious sites are prioritized in agent grounding searches, delivering hidden payloads via CSS (display:none, opacity:0) and zero-width characters. These invisible instructions override system prompts to execute unauthorized tool-use functions, enabling cryptojacking via WebAssembly and the exfiltration of sensitive session data to attacker-controlled endpoints. This vulnerability shifts the primary attack vector from direct user input to external, untrusted data sources utilized for agentic autonomy.
-
Threat Model & Attack Vector
- Transition from Direct Prompt Injection (user-to-model) to Indirect Prompt Injection (data-to-model) using untrusted web sources.
- Use of SEO poisoning to manipulate search rankings, ensuring AI agents prioritize malicious pages during the "grounding" phase.
- Exploitation of the semantic discrepancy between human visual perception and LLM tokenization of HTML/CSS.
-
Technical Execution & Stealth
- Deployment of stealth payloads using
display:none, white-on-white text, and zero-font sizes to hide instructions from human users. - Utilization of natural language "Instructional Overrides" (e.g., "Ignore previous instructions") to hijack the LLM's operational logic.
- Injection of malicious context directly into the LLM's active window during the retrieval phase of the RAG workflow.
- Deployment of stealth payloads using
-
Operational Impact & Payload Delivery
- Execution of unauthorized tool-use capabilities to deploy cryptojacking scripts, specifically utilizing WebAssembly-based miners.
- Silent exfiltration of sensitive user session data and PII via API calls triggered by the compromised agent.
- Bypass of safety guardrails to redirect users to malicious domains or perform unauthorized system actions.
-
Defensive Strategies & Mitigation
- Implementation of aggressive input sanitization to strip hidden HTML elements and invisible CSS before data reaches the LLM context.
- Adoption of a Dual-LLM architecture, utilizing a restricted "inspector" model to screen retrieved content for prompt injections.
- Enforcement of strict contextual segregation to maintain clear boundaries between system prompts, user inputs, and external data.
-
Ecosystem Risk Assessment
- Demonstrated vulnerability across the three leading AI ecosystems (OpenAI, Anthropic, Google) supporting web-browsing.
- Increased risk profile for "agentic" AI systems possessing autonomous write, execute, or API-calling permissions.
- Systemic erosion of trust in enterprise RAG deployments due to the inherent unreliability of untrusted web-grounding.
Related posts
- Cybersecurity News — Hackers Abuse SEO Poisoning and Hidden HTML to Trick AI Agents Into Following Malicious Instructions
- Forcepoint
- Securityboulevard
- Digitaljournal
- Scworld
- Hackread
- Cybernews
- Blog
- Artificialintelligence-news
- Arxiv
- Cyberdefensemagazine
- Infosecurity-magazine