
AI systems’ vulnerability exposed in UK Government report


The UK’s AI Safety Institute has found that AI systems are highly susceptible to basic jailbreaks, with some models producing harmful outputs even without any attempt to bypass their safeguards. In testing, models answered between 98 and 100 percent of harmful questions when subjected to simple attacks, and some responded to harmful queries without any jailbreak effort at all. The evaluation measured both compliance (whether a model answered a harmful question) and correctness (whether the elicited information was accurate), with attacks either embedding harmful questions into prompts or using multi-step procedures to generate them. Without attacks, compliance rates on harmful questions were relatively low, but reached up to 28 percent for some models on a private set of harmful questions. The study also found that attacks did not significantly reduce the correctness of responses to benign questions. The institute plans to extend testing to other AI models and to develop more robust evaluation metrics to improve the safety and reliability of AI systems. With offices in London and plans to open one in San Francisco, the institute aims to strengthen its relationship with the US AI Safety Institute and to collaborate with leading AI companies such as Anthropic and OpenAI.


Source link: https://www.computing.co.uk/news/4212708/uk-government-report-reveals-ai-systems-vulnerability

