What is the story about?
Technology companies have been steadily expanding into healthcare, pitching AI as a tool that can simplify access to medical information.
From OpenAI’s ChatGPT Health to Google’s MedGamma 1.5 and Anthropic’s Claude for Healthcare, a growing number of platforms now promise users quicker, more personalised health guidance. While many users report positive experiences, new research suggests the reality may be far more concerning.
Study finds widespread inaccuracies in AI health responses
A recent study published in BMJ Open has raised fresh concerns about the reliability of AI-generated medical advice. The research, first reported by Bloomberg, examined five widely used AI chatbots (ChatGPT, Gemini, Meta AI, Grok and DeepSeek) to assess how accurately they respond to health-related queries.
Researchers tested the systems using 10 questions spanning five medical categories. The findings were stark: around 50 per cent of all responses contained problematic or incorrect information. More alarmingly, nearly one in five answers was deemed highly problematic, posing potential risks if followed without professional guidance.
The study also revealed that these AI models tend to perform better when dealing with straightforward, closed-ended questions on well-established topics such as cancer or vaccines. However, their reliability drops significantly when faced with open-ended or complex issues, including nutrition and emerging fields like stem cell therapy.
Another key issue highlighted in the report is the lack of transparency and supporting evidence. In many cases, the chatbots failed to provide complete or verifiable medical references, raising doubts about the credibility of their outputs.
Confidence without clinical judgement raises red flags
Perhaps the most troubling aspect identified by researchers is the tone of certainty adopted by these systems. Despite lacking medical training, licensing or clinical judgement, the chatbots frequently presented their answers with authority and confidence.
This mismatch between tone and accuracy could mislead users into trusting flawed guidance. Across all responses analysed in the study, there were only two instances where a chatbot refused to answer a question, both involving Meta AI, suggesting that most systems prioritise responsiveness over caution.
The researchers warned that deploying such tools at scale without adequate safeguards could amplify the spread of medical misinformation. “These systems can generate authoritative-sounding but potentially flawed responses,” the authors noted, adding that the findings underscore the need to rethink how AI is used in public-facing health communication.
The timing of the study is significant. AI firms are increasingly positioning their platforms as healthcare companions, capable of offering insights based on user-provided data.
OpenAI’s ChatGPT Health, for instance, allows individuals to share personal health information to receive more tailored responses, a move that presents both opportunities and risks.
As AI continues to integrate into everyday decision-making, the study serves as a reminder that convenience does not always guarantee accuracy. In the case of healthcare, the stakes are particularly high, making reliability and oversight critical.