AI Models Surpass Benchmarks in Autonomous Cybersecurity Tasks, Indicating Rapid Advancement
Two advanced artificial intelligence models, Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5, have significantly exceeded earlier benchmark results on autonomous cybersecurity tasks. According to the United Kingdom’s AI Security Institute (AISI) and Palo Alto Networks, both models have exceeded the expected pace of progress for AI systems completing cybersecurity tasks autonomously. The AISI noted that the models have outpaced the doubling trend in task completion time that had been observed since late 2024. The models demonstrated their capabilities in structured simulations of multi-stage attacks, where Claude Mythos Preview completed complex tasks that no model had previously solved. Palo Alto Networks corroborated these findings, highlighting the models’ ability to identify vulnerabilities and turn them into critical exploit paths in near real time. Together, the results suggest rapid growth in AI’s capacity to handle cybersecurity challenges autonomously.