What is the story about?
From being a tool that assists in various tasks, artificial intelligence (AI) has significantly evolved into systems capable of acting independently, making decisions and even executing commands without human approval, according to a recent study.
The new research, titled ‘Agents of Chaos,’ sawAI agents getting into a sealed lab environment and provided access to email accounts, Discord, as well as the power to run code on their own machines. It was later found that they leaked secrets, erased files from systems and got stuck in repetitive nine-day loops.
The experience was carried out by a total of 20 researchers over a period of two weeks. It was led by Northeastern University in Boston in collaboration with Harvard University, Stanford University, Massachusetts Institute of Technology, University of British Columbia and others.
Its focus was to stress-test the AI systems and see what happens when they are allowed to act. The overall result was not a single dramatic collapse. The team pushed the systems into difficult and unusual situations to witness their behaviour. This means that these AI chatbots were allowed to act on their own and even respond to prompts.
The AI agents were tested in a controlled lab with the help of OpenClaw, an open-source framework that connects language models to tools such as email, shell commands, and file systems.
As a result, the team documented 11 examples of problematic behaviour. The study showed how AI systems can make unexpected mistakes when allowed to act on their own rather than just answer questions.
A major issue witnessed during the research was that a few agents followed instructions from people who were not their owners or authorised users. These agents even went on to share confidential data, including internal prompts and sensitive information.
In one case, an AI agent was asked by a person to perform tasks like running shell commands, showing file lists or transferring data, with the system following most of these instructions.
"Observed behaviour includes execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover," read the study.
Some of these agents took dangerous commands that went on to damage the system. Among these included deleting files, changing system settings and running harmful scripts.
In another case, a non-owner asked one of the AI agents to keep a fictional password confidential. Later on, the person pressed it to delete the email containing the secret, while the agent lacked a proper deletion tool. It was seen that it did not escalate the issue, but rather disabled its own local email setup.
The researchers then found that the original email that contained the password was still on the server and had not been deleted.
There were instances where AI agents got more control over systems than intended. A few of them even reported task completion when the underlying system state contradicted such reports.
The new research, titled ‘Agents of Chaos,’ sawAI agents getting into a sealed lab environment and provided access to email accounts, Discord, as well as the power to run code on their own machines. It was later found that they leaked secrets, erased files from systems and got stuck in repetitive nine-day loops.
The experience was carried out by a total of 20 researchers over a period of two weeks. It was led by Northeastern University in Boston in collaboration with Harvard University, Stanford University, Massachusetts Institute of Technology, University of British Columbia and others.
Its focus was to stress-test the AI systems and see what happens when they are allowed to act. The overall result was not a single dramatic collapse. The team pushed the systems into difficult and unusual situations to witness their behaviour. This means that these AI chatbots were allowed to act on their own and even respond to prompts.
????
BREAKING: Stanford and Harvard just published the most unsettling AI paper of the year.
It’s called “Agents of Chaos,” and it proves that when autonomous AI agents are placed in open, competitive environments, they don't just optimize for performance. They naturally drift… pic.twitter.com/xllU3xp3ld
— Simplifying AI (@simplifyinAI) March 6, 2026
The AI agents were tested in a controlled lab with the help of OpenClaw, an open-source framework that connects language models to tools such as email, shell commands, and file systems.
As a result, the team documented 11 examples of problematic behaviour. The study showed how AI systems can make unexpected mistakes when allowed to act on their own rather than just answer questions.
A major issue witnessed during the research was that a few agents followed instructions from people who were not their owners or authorised users. These agents even went on to share confidential data, including internal prompts and sensitive information.
In one case, an AI agent was asked by a person to perform tasks like running shell commands, showing file lists or transferring data, with the system following most of these instructions.
"Observed behaviour includes execution of destructive system-level actions, denial-of-service conditions, uncontrolled resource consumption, identity spoofing vulnerabilities, cross-agent propagation of unsafe practices, and partial system takeover," read the study.
Some of these agents took dangerous commands that went on to damage the system. Among these included deleting files, changing system settings and running harmful scripts.
In another case, a non-owner asked one of the AI agents to keep a fictional password confidential. Later on, the person pressed it to delete the email containing the secret, while the agent lacked a proper deletion tool. It was seen that it did not escalate the issue, but rather disabled its own local email setup.
The researchers then found that the original email that contained the password was still on the server and had not been deleted.
There were instances where AI agents got more control over systems than intended. A few of them even reported task completion when the underlying system state contradicted such reports.














