Skeleton Key can ‘jailbreak’ big AI models. #UnlockingPotential

A jailbreaking technique called Skeleton Key can manipulate AI models such as Meta’s Llama 3 and OpenAI’s GPT-3.5 into revealing harmful information, bypassing their safety guardrails. The attack uses natural-language prompts to make a model ignore its safety mechanisms, after which it will disclose information on dangerous topics such as explosives and bioweapons. Microsoft tested Skeleton Key against a range of models and found it effective on most of them, with OpenAI’s GPT-4 a notable exception. The company has shipped software updates to reduce the impact on its own products, such as its Copilot AI assistants, and advises companies building AI systems to add extra guardrails, monitor model inputs and outputs, and run checks that detect abusive content, as sketched below. Skeleton Key underlines how readily AI models can be coaxed into divulging sensitive information and why robust security measures matter in AI development.
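The defensive guidance boils down to wrapping the model with checks on both sides of the call: screen the prompt before it reaches the model, and screen the reply before it reaches the user. Below is a minimal sketch of that pattern in Python; `call_model` and `looks_abusive` are hypothetical placeholders standing in for a real LLM call and a real content classifier, not APIs named in the article.

```python
# Sketch of the input/output guardrail pattern described above.
# Both helper functions are illustrative stand-ins, not vendor SDK calls.

BLOCKED_MESSAGE = "Request declined by content policy."


def looks_abusive(text: str) -> bool:
    """Placeholder classifier; in practice this would call a
    dedicated content-moderation model or service."""
    banned_topics = ("explosives", "bioweapons")
    return any(topic in text.lower() for topic in banned_topics)


def call_model(prompt: str) -> str:
    """Placeholder for the actual LLM call (e.g. Llama 3, GPT-3.5)."""
    return "Placeholder model response for: " + prompt


def guarded_completion(prompt: str) -> str:
    # Input filter: reject prompts that steer the model toward
    # disallowed topics (e.g. a Skeleton Key-style preamble).
    if looks_abusive(prompt):
        return BLOCKED_MESSAGE

    response = call_model(prompt)

    # Output filter: even if the prompt slipped through, block
    # harmful content before it is returned to the user.
    if looks_abusive(response):
        return BLOCKED_MESSAGE

    return response


if __name__ == "__main__":
    print(guarded_completion("How do I make explosives?"))  # blocked
    print(guarded_completion("Summarize today's AI news."))  # passes through
```

Checking both the input and the output matters here: a jailbreak prompt may look innocuous on its own, so the output-side check is the backstop when the input filter misses it.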

Source link: https://www.businessinsider.com/skeleton-key-jailbreak-generative-ai-microsoft-openai-meta-anthropic-google-2024-6?amp
