Artificial intelligence is growing rapidly, and several tech companies now rely on AI tools. However, a new study suggests that AI may not yet be reliable enough for disease-related work. Anthropic tested several advanced AI models on virus data retrieval tasks and found that they made mistakes. The company examined how AI agents perform when collecting viral sequence data from scientific databases and how retrieval errors could eventually affect overall medical research results.
Here’s What Happened
The researchers evaluated scientific AI agents, including Claude, GPT-based systems, Biomni Open Source and Edison Analysis. These agents were used to retrieve virus sequence data from NCBI Virus, a popular database that scientists depend on for tracking outbreaks and studying infectious diseases. The study found that the AI models often struggled to retrieve complete and accurate datasets. Depending on the system used, average accuracy ranged from 16.9 per cent to 91.3 per cent. Moreover, the report asserts that scientific data retrieval needs near-perfect accuracy, as even a single error can influence the major conclusions drawn from the data.
A major finding was inconsistency. Reportedly, AI systems produced different results when given the same query multiple times. In one example involving Ebola virus data, an AI model retrieved 106 matching sequences in one attempt, 15 in a second and only five in a third, even though it was given the same instructions each time. These inconsistencies can create problems for scientific studies that rely on reproducible results. If researchers cannot obtain the same dataset repeatedly, it becomes difficult to verify findings. The study shows that AI can make critical mistakes that affect larger scientific discoveries. Without reliable access to accurate information, even the most advanced AI systems risk producing convincing answers that may not withstand scientific scrutiny.