ChatGPT and Gemini have guardrails against giving harmful suggestions, but there have been instances where these models were manipulated into doing so anyway. Now, new research
has found that these models share a deeper, systematic weakness that lets attackers bypass their safety mechanisms and extract harmful information. According to researchers at Italy's Icaro Lab, wrapping harmful requests in poetry can act as a 'universal single-turn jailbreak' and lead AI models to answer prompts they would otherwise refuse.
Details About the Research
The researchers said they tested 20 manually curated harmful requests rewritten as poems and recorded an attack success rate of 62 percent across 25 frontier closed- and open-weight models. The models tested in the research came from Moonshot AI, Google (Gemini), Anthropic, DeepSeek, Meta, Qwen, xAI, and Mistral AI. The results also showed that even when an AI model was used to automatically rewrite harmful prompts into poetry, the attack still achieved a 43 percent success rate.
The study suggests that harmful questions framed poetically can draw responses out of the chatbots, giving attackers up to 18 times more success than straightforward harmful prompts. The research also found that smaller models were more resilient to these prompts than larger ones. For example, GPT-5 Nano avoided responding to the harmful prompts, while Gemini 2.5 Pro responded to all of them.

The reason poetic framing works could be that AI models are trained to recognize patterns when deciding whether a question is harmful. In poetry, the language is stacked with rhythm, unusual syntax, and other devices, which appears to be the main reason the models get confused.