You’ve probably done it: typed a symptom into an AI chatbot instead of Googling it, hoping for a quick, clear answer. The response feels reassuring, well-worded, even authoritative. But a new study suggests that confidence may be exactly what makes it risky. According to recent research published in BMJ Open and highlighted in analysis from The Conversation, nearly half of all AI-generated health answers are inaccurate or incomplete, even though they sound convincing. In other words, roughly half the answers you get may be wrong.
The study evaluated five widely used AI chatbots: ChatGPT, Gemini, Grok, Meta AI, and DeepSeek. Researchers tested them with 250 health-related questions across topics such as cancer, vaccines, nutrition, stem cells, and athletic performance, areas where misinformation is already common. The results were difficult to ignore:
- 49–50% of responses were flagged as problematic
- Around 30% were somewhat misleading
- Nearly 20% were considered highly problematic or potentially harmful
What’s more concerning isn’t just the errors; it’s how they’re delivered.
The Confidence Problem
One of the study’s most striking findings is that AI doesn’t hesitate. Even when answers were wrong or incomplete, they were often presented with certainty and little to no warning. Out of 250 responses, chatbots refused to answer only twice. This creates what researchers call a 'false sense of reliability.' The language is polished, structured, and easy to follow, making it harder for users to question the accuracy. As highlighted in The Conversation, AI can sound authoritative on health topics while quietly getting key details wrong.
That said, not all topics were equally problematic. The chatbots performed relatively well on vaccines and cancer but struggled significantly with nutrition, stem cell treatments, and athletic performance advice. These are exactly the categories where people often look for quick fixes, alternative treatments, or lifestyle guidance, making the risk of misinformation even more real.
Why This Happens
The issue isn’t that AI lacks information; it’s how it generates answers. AI models don’t 'know' facts the way humans do. They predict responses based on patterns in their training data, which can include a mix of scientific research, online forums, and general web content. That means a response might blend accurate medical knowledge with outdated, incomplete, or even misleading information, without clearly separating the two.
Another problem is weak or fabricated citations. The study found that references provided by chatbots were often incomplete or unreliable, with an average completeness score of just 40%.
For casual queries, like understanding a term or getting general health tips, AI can still be useful. But when it comes to decision-making, the risks become clearer. Separate research from the University of Oxford found that people using AI for health advice were no better at identifying medical conditions or deciding when to seek care than those relying on traditional methods. In some cases, users were actively misled by a mix of correct and incorrect suggestions, making it difficult to identify the right course of action.
So, Should You Stop Using AI for Health?
Not necessarily, but how you use it matters. Think of AI as a starting point, not a final answer. It can help explain medical jargon, summarise information, or prepare you for a doctor’s visit. But it shouldn’t replace professional advice, especially when symptoms, treatments, or diagnoses are involved. If this study shows anything, it’s this: AI doesn’t just get things wrong, it gets them wrong convincingly. And when it comes to your health, that distinction matters more than ever.