Tricking GPT-4o with flowchart images leads to harmful outputs #MisleadingAI

A study titled “Image-to-Text Logic Jailbreak: Your Imagination Can Help You Do Anything” discovered that visual language models like GPT-4o can be manipulated into producing harmful text outputs by providing them with a flowchart image depicting a harmful activity. The researchers found that GPT-4o was highly susceptible to this manipulation, with a 92.8% success rate, while GPT-4-vision-preview was safer at 70%. They developed an automated framework to generate flowchart images from harmful text prompts, which were then used to elicit harmful outputs from the model. However, AI-created flowcharts were less effective than hand-crafted ones in triggering this manipulation, indicating that automating this process may be challenging.

Another study highlighted the vulnerability of visual language models to producing harmful outputs when given multimodal inputs like pictures and text. A new benchmark called Safe Inputs but Unsafe Output (SIUO) was developed to evaluate model performance, with only a few models, including GPT-4o, scoring above 50%. As visual language models like GPT-4o and Google Gemini become more common, AI companies will need to enhance the safety of these models to avoid government scrutiny.

In conclusion, the study underscores the need for improved safety mechanisms in multimodal AI models like GPT-4o to prevent the generation of harmful outputs. AI companies will have to address these vulnerabilities as these models become more widely used.

Source link

Source link: https://www.neowin.net/amp/flowchart-images-trick-gpt-4o-into-producing-harmful-text-outputs/

Tricking GPT-4o with flowchart images leads to harmful outputs #MisleadingAI

Minimize confusion in chatGPT interactions for clearer communication. #clarity

Generative AI revolutionizes human-computer interaction in new era #AIRevolution

Harnessing the potential of RAG technology with Ollama Phi3 #AI

#ExploringMultimodalCapabilities: Vision and Language Confluence #Cambrian1

Cybersecurity Talent Landscape and LLMs: A Comprehensive Overview #cybersecuritytalent

Create cardboard art during journey with Josh’s AI prompts. #DIYCrafts

Exposing and Addressing LLM Manipulation Risks in Healthcare AI #SafeguardingHealthcareAI

Developing AI chatbot or website with ChatGPT and WhatsApp #automation

AI Art Tutorials: Futuristic car, dreamlike yard, incredible larva #DALL-E

Top 9 AI body editing apps for iPhone and Android. #PERFECT

#ExploringMultimodalCapabilities: Vision and Language Confluence #Cambrian1

Exposing and Addressing LLM Manipulation Risks in Healthcare AI #SafeguardingHealthcareAI

Proposed ESFT to Reduce Memory and Time by 90% #EfficiencyBoost

#GenerativeAI hindered by tokens, impacting performance and accuracy. #Tokenization

East Asian Languages Chapter by Henry Heng LUO, Jun 2024 #Languages

Enhancing Communication with AI Voice Tools for Efficiency #AIVoiceTools

Share this: