An AI agent named ARTEMIS has successfully hacked Stanford University’s computer network, performing better than professional human hackers who earn six-figure salaries. The experiment shows how fast artificial intelligence is evolving and how it may change the way cybersecurity testing is done in the future.

According to reports, ARTEMIS was allowed to operate on Stanford’s private and public computer science networks for 16 hours during the test. In that time, the AI scanned nearly 8,000 devices, including servers and various smart systems used across the university. By the end of the trial, ARTEMIS had uncovered nine valid security vulnerabilities with an accuracy rate of 82 per cent. The researchers noted that this performance was better than nine out
of ten human penetration testers.
The study was carried out by researchers Justin Lin, Eliot Jones and Donovan Jasper. According to them, ARTEMIS is built differently from typical AI tools. Many AI systems struggle with long-duration tasks or complex system analysis, but ARTEMIS is designed to run autonomously for hours while scanning and studying network behaviour.

One of the biggest takeaways from the study is the large cost advantage of using an AI security agent. ARTEMIS costs just $18 (around Rs 1,630) per hour to operate. In comparison, a professional penetration tester in the US earns more than $125,000 per year (over Rs 1.13 crore). This difference could push organisations to rethink how they spend on cybersecurity audits and testing in the coming years.

ARTEMIS also showed unique threat-detection capabilities. The AI can launch smaller sub-agents whenever it finds suspicious behaviour, allowing it to investigate several issues at once, something human testers cannot easily do (a simplified illustration of this pattern appears below). In one case, it found a vulnerability in an outdated server that human testers had previously missed because the system was hard to access.

However, the AI is not perfect. ARTEMIS struggled with tasks that required clicking through graphical interfaces, which led to a few missed flaws. It also flagged some harmless activity as possible threats, known as false positives. Even so, the researchers say the AI performs extremely well in environments that mostly rely on text-based data.
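To make the sub-agent idea concrete, here is a minimal Python sketch of how a main agent might fan out smaller workers to investigate several suspicious hosts at the same time. The ARTEMIS researchers have not published their implementation, so every name here (Finding, investigate, dispatch_sub_agents, the sample IP addresses) is hypothetical and only illustrates the general pattern described in the study.

```python
# Illustrative sketch only: names and behaviour are assumptions, not the ARTEMIS code.
# Pattern shown: when the main agent spots suspicious activity, it spawns sub-agents
# that investigate several leads concurrently instead of one at a time.

from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class Finding:
    host: str
    note: str


def investigate(host: str) -> Finding:
    """Hypothetical sub-agent task: probe one suspicious host and report back."""
    # A real agent would run scans or queries here; this stand-in just returns a note.
    return Finding(host=host, note=f"checked {host}: placeholder result")


def dispatch_sub_agents(suspicious_hosts: list[str], max_agents: int = 4) -> list[Finding]:
    """Fan out one sub-agent per suspicious host and collect their findings."""
    findings = []
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        futures = {pool.submit(investigate, h): h for h in suspicious_hosts}
        for future in as_completed(futures):
            findings.append(future.result())
    return findings


if __name__ == "__main__":
    # Example: three leads investigated in parallel by separate sub-agents.
    for finding in dispatch_sub_agents(["10.0.0.12", "10.0.0.47", "10.0.0.89"]):
        print(finding)
```

The design point is simply that parallel dispatch lets one coordinating agent pursue many leads at once, which is the capability the researchers highlight as hard for individual human testers to match.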