SOURCE: arXiv (Computer Science - Cryptography and Security)

Shared-Embedding Sequence Models: The Instruction-Data Conflation Vulnerability

Published July 1, 2026

Vulnerability Analysis#LLMSecurity #PromptInjection #AIArchitecture #VulnerabilityAnalysis #Cybersecurity

Research detailed in arXiv:2606.27567 identifies a fundamental architectural flaw in shared-embedding sequence models where instructions and data are processed via a unified attention-aggregation pipeline. This "instruction-data conflation" mirrors the Von Neumann architecture's overlap of code and data, rendering prompt injection a structural vulnerability rather than a patchable alignment bug. Mathematical proofs utilizing Total Variation Distance (TVD) demonstrate the impossibility of Semantic-Faithful Control (SFC), proving that trusted instructions and untrusted data are statistically inseparable. This flaw enables authoritative action hijacking, including refusal bypasses and unauthorized tool execution, effectively neutralizing current in-pipeline classifiers and alignment-based defenses.

Threat Model: Instruction-Data Conflation
- Defines Prompted Action Models (PAMs) as systems where control signals (instructions) and variable inputs (data) share a single embedding space.
- Draws a direct parallel to the Von Neumann architecture, where the lack of separation between code and data enabled decades of buffer overflow exploits.
- Establishes that prompt injection is a systemic property of the architecture, not a failure of the training set or RLHF alignment.
Attack Mechanics: The Invariance Gap
- Utilizes Total Variation Distance (TVD) to prove the mathematical impossibility of provenance recovery, meaning the model cannot reliably distinguish the source of a token.
- Identifies the "Invariance Gap," where semantic-equivalence classes allow adversarial inputs to bypass finite training sets and classifiers.
- Employs attention-map analysis to demonstrate "Control-Path Exposure," showing how untrusted data can hijack the model's internal attention mechanism to trigger authoritative actions.
Systemic Security Impact
- High success rates in hijacking "authoritative actions," resulting in unauthorized tool execution, memory writes, and refusal bypasses.
- Demonstrates that current state-of-the-art in-pipeline classifiers fail when faced with semantic-equivalent adversarial inputs.
- Quantifies representation overlap across production tokenizers, proving that trusted and untrusted streams are processed identically at the latent level.
Countermeasures and Architectural Requirements
- Argues that "in-pipeline" defenses (filtering and alignment) are insufficient due to the finite-coverage invariance gap.
- Proposes a fundamental shift toward the physical or logical separation of instruction and data channels.
- Suggests implementing architectural boundaries similar to Data Execution Prevention (DEP) or ASLR to isolate control paths from user-supplied data.
Conclusion: The Path to Robust Agentic AI
- Concludes that shared-embedding models are inherently insecure for high-stakes agentic workflows.
- Asserts that true security requires a move away from single-stream sequence processing for control logic.
- Warns that until architectural separation is achieved, prompt injection remains a permanent risk factor.

arXiv (Computer Science - Cryptography and Security) — TEMPO-Diffusion: Temporally Exposed Malicious Poisoning of Diffusion Models
arXiv (Computer Science - Cryptography and Security) — HauntAttack: When Attack Follows Reasoning as a Shadow
arXiv (Computer Science - Cryptography and Security) — On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models
arXiv (Computer Science - Cryptography and Security) — Decomposing Memorization Reduction in Privacy-Preserving Fine-Tuning of SLMs for CSIRTs
Hacking-and-security
Themoonlight
Rdworldonline
Youtube
Assets
Xhan77
Proceedings
Tempo
Huggingface
Roboticscenter
Emergentmind
Github
Openaccess
Openreview
Alphaxiv
Dailysecurity
Roboticscenter
Stat
Aclanthology
Mansisak
Openreview

FlagThis

Shared-Embedding Sequence Models: The Instruction-Data Conflation Vulnerability

Related posts

Shared-Embedding Sequence Models: The Instruction-Data Conflation Vulnerability

Related posts

SHARE INTELLIGENCE WIRE