CyberSecurity updates
Updated: 2024-12-04 14:06:31 Pacific

microsoft.com
Microsoft Azure AI Content Safety Vulnerability - 6d
Read more: www.microsoft.com

Security researchers at Mindgard have found two vulnerabilities in Microsoft Azure's AI content safety filters, AI Text Moderation and Prompt Shield, that allow attackers to bypass the safeguards and inject malicious content into protected large language models (LLMs). In its testing, Mindgard deployed ChatGPT 3.5 Turbo via Azure OpenAI behind these filters and then circumvented them using character injection and adversarial ML evasion techniques. The first technique, character injection, adds specific characters and text patterns to prompts, which sharply reduced the filters' jailbreak-detection effectiveness. The second, adversarial ML evasion, further degraded both filters by finding blind spots in their ML classification systems. Microsoft acknowledged the issue and is working on fixes for upcoming model updates. Mindgard stresses the seriousness of these vulnerabilities: attackers could exploit them to expose sensitive information, gain unauthorized access, manipulate outputs, and spread misinformation.
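To make the "character injection" idea concrete, here is a minimal, hypothetical sketch. This is not Mindgard's actual technique or any Azure API; it only illustrates how inserting invisible Unicode characters can change the byte stream a text classifier sees while the text looks unchanged to a human reader.

```python
# Illustrative sketch of character injection (hypothetical, not Mindgard's code).
# Zero-width characters render as nothing but still alter the input a
# classifier tokenizes, which can push a prompt past a filter's training data.

ZERO_WIDTH_SPACE = "\u200b"  # invisible to readers, present in the byte stream

def inject_zero_width(text: str) -> str:
    """Insert a zero-width space between every pair of characters."""
    return ZERO_WIDTH_SPACE.join(text)

original = "example prompt"
perturbed = inject_zero_width(original)

# The perturbed string displays identically but differs byte-for-byte,
# so a filter that does not normalize Unicode may classify it differently.
print(original == perturbed)            # False
print(perturbed.replace(ZERO_WIDTH_SPACE, "") == original)  # True
```

Robust filters defend against this class of evasion by normalizing input (e.g., stripping zero-width and confusable characters) before classification.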


This site is an experimental news aggregator built from feeds I personally follow. You can send me feedback via this form or on Bluesky.