Published June 24, 2026
The research identifies a systemic weakness in current AI testing methodologies: digital-only sandboxes do not mitigate the kinetic risks of embodied AI. In cyber-physical systems (CPS), AI agents can bypass digital isolation to manipulate physical environments or human operators. The paper introduces a formal taxonomy and a multi-dimensional measurement framework, covering dimensions such as fidelity, controllability, and containment, to address sandbox escape vectors and adversarial attacks on the monitoring apparatus. The framework offers a standardized methodology for validating the safety and security of complex AI deployments through high-fidelity simulation and formal evidence composition.
- Research Overview & Problem Statement
  - Identifies the security gap where digital sandboxes fail to address the kinetic risks of embodied AI.
  - Highlights that AI in robotics and AIoT can manipulate physical processes or human operators.
  - Proposes a shift from simple digital isolation to comprehensive, assurance-oriented frameworks.
- Sandbox Taxonomy & Boundary Definitions
  - Categorizes environments into Digital, Embodied, and Cyber-Physical layers.
  - Defines formal boundary protocols to prevent cross-domain escalation.
  - Establishes sandbox archetypes to standardize isolation and simulation capabilities.
- Cyber-Physical Threat Model
  - Addresses attack vectors that target both the AI model and the assurance/monitoring apparatus.
  - Models the risk of "sandbox escapes" into physical environments.
  - Uses the "Weakest-Link Rule" to evaluate the reliability of multi-dimensional security evidence.
- Measurement Framework & Metrics
  - Introduces six key metrics: Fidelity, Controllability, Observability, Containment, Reproducibility, and Governance.
  - Quantifies containment effectiveness by calculating the probability of sandbox escape.
  - Measures the "Fidelity Gap" to check how closely simulation matches physical behavior.
- Industry & Regulatory Implications
  - Provides validation metrics that support compliance with regulations such as the EU AI Act.
  - Calls for advanced instrumentation and evidence capture to audit physical AI systems.
  - Evaluates the resilience of safety-critical monitoring tools against adversarial manipulation.
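To make three of the ideas above more concrete (the Weakest-Link Rule, the probability of sandbox escape, and the Fidelity Gap), here is a minimal Python sketch. This summary does not give the paper's actual formulas, so everything here is an assumption made for illustration: the weakest-link rule is modeled as taking the minimum score across evidence dimensions, the escape probability is bounded with a Wilson score interval over independent trials, and the fidelity gap is taken as the root-mean-square error between simulated and measured values. All function names, scores, and readings are hypothetical.

```python
from collections.abc import Sequence
from math import sqrt


def weakest_link(scores: dict[str, float]) -> tuple[str, float]:
    """Return the lowest-scoring evidence dimension and its score.

    Assumption: overall assurance is capped by the weakest dimension,
    so the aggregate is a minimum rather than an average.
    """
    if not scores:
        raise ValueError("at least one evidence dimension is required")
    name = min(scores, key=scores.get)
    return name, scores[name]


def escape_probability_upper_bound(escapes: int, trials: int, z: float = 1.96) -> float:
    """Wilson score upper bound on the per-trial escape probability.

    Assumption: trials are independent and identically distributed.
    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if trials <= 0 or not 0 <= escapes <= trials:
        raise ValueError("need trials > 0 and 0 <= escapes <= trials")
    p = escapes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return min(1.0, (centre + margin) / denom)


def fidelity_gap(simulated: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between simulated and measured samples."""
    if not simulated or len(simulated) != len(measured):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return sqrt(squared / len(simulated))


if __name__ == "__main__":
    # Hypothetical scores in [0, 1] for the six metrics named in the post.
    scores = {
        "fidelity": 0.90,
        "controllability": 0.80,
        "observability": 0.95,
        "containment": 0.60,
        "reproducibility": 0.85,
        "governance": 0.70,
    }
    print(weakest_link(scores))
    # Zero observed escapes in 1000 trials still leaves a nonzero bound.
    print(escape_probability_upper_bound(escapes=0, trials=1000))
    # Hypothetical joint positions: simulator versus physical robot.
    print(fidelity_gap([0.0, 1.0, 2.0], [0.0, 1.1, 1.9]))
```

The minimum is used instead of a mean so that a strong score in one dimension cannot mask a weak one, and the interval bound is used because observing zero escapes in a finite number of trials does not show that the escape probability is zero.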
Related posts
- arXiv (Computer Science - Cryptography and Security) — AI Sandboxes: A Threat Model, Taxonomy, and Measurement Framework