AI Researchers Develop Dangerous Poetic Prompts Capable of Bypassing AI Safeguards
Researchers from Icaro Lab in Italy, working with the safety group DexAI and Sapienza University of Rome, have discovered a method for bypassing the safety measures of advanced AI chatbots using 'adversarial poetry.' The technique involves recasting harmful requests as poetic prompts, which can trick AI models into generating dangerous content, such as instructions for building a nuclear bomb.

The study, which is awaiting peer review, tested 25 AI models from companies including OpenAI, Google, xAI, Anthropic, and Meta. The poetic prompts succeeded in 63% of cases, and some models, such as Google's Gemini 2.5, were fooled every time. Smaller models proved more resistant: OpenAI's GPT-5 nano did not fall for the prompts at all. Overall, the researchers found that poetic prompts were far more effective than equivalent prose, with success rates up to 18 times higher.