The growing use of artificial intelligence in academic writing is raising fresh concerns over the reliability of scientific research, particularly as AI-generated hallucinations begin slipping into published papers unnoticed.
An associate professor who had used AI tools to polish the grammar of a manuscript recently heard back from a publisher questioning several of its references. The AI system he used had reportedly inserted a fabricated source into the paper without flagging it.
“I felt deeply embarrassed,” said Rafael Topaz, the associate professor in question, who leads a team at Columbia University developing AI applications in healthcare.
“I’m an AI researcher. I know about hallucinations,” he said. “If this is happening to me, an AI expert, what happens to other people?”
The close call prompted Topaz to investigate how frequently experts were being misled by AI-generated misinformation. A study published in The Lancet by Topaz and his colleagues audited nearly 2.5 million biomedical papers and 97 million citations indexed on PubMed Central, a major repository used by clinicians and researchers worldwide.
The researchers identified more than 4,000 fabricated references embedded across nearly 3,000 papers. While not all the false references were AI-generated, Topaz noted that the rise in fake citations accelerated sharply in 2024.
The study found that the rate of fabricated references in biomedical literature has increased more than twelvefold over the past three years. In 2023, one in every 2,828 papers contained at least one fake reference. By last year, that figure had reportedly surged to one in every 458 papers.
AI hallucinations occur because language models are built to predict plausible sequences of words rather than to verify facts, so they can produce fluent but false output. While such errors are often harmless in casual use, the risks are far greater when hallucinations make their way into academic research, where they can undermine scientific credibility and trust.
Experts getting the process wrong
Fake AI-generated research papers are already a growing problem in academia: they are increasingly difficult to identify and threaten to overwhelm the peer-review system. Hallucinated references embedded in legitimate, human-written studies, however, could prove even more widespread and harder to detect.
Experts say the real issue is unverified AI-generated output making its way into final academic work. The solution, they argue, is to build stronger verification systems directly into research workflows. Because hallucinations are introduced unintentionally and may go unnoticed by the researchers themselves, robust verification mechanisms could help keep such inaccuracies from reaching publication.
Verification practices currently vary widely across academic journals. While some publishers use software tools to examine citations and identify AI-generated material, oversight remains inconsistent. There is also no straightforward way to trace and review the broader evidence chain to uncover the original fabricated studies or references.
According to Topaz’s analysis, publishers have struggled to identify fabricated citations effectively, with 98.4% of papers containing false references still unretracted at the time of the audit.