What is the story about?
Artificial intelligence is rapidly transforming how information can be analysed online, but a new study suggests the technology may also be quietly eroding one of the internet’s long-standing protections: anonymity.
Researchers have found that modern large language models (LLMs), the technology powering tools such as ChatGPT, can often identify the real-world identity of anonymous social media users by analysing small fragments of personal information they share online.
The findings highlight growing fears among cybersecurity experts that AI could make sophisticated privacy attacks far easier and cheaper to carry out.
How AI can unmask anonymous users
The research, conducted by AI specialists Simon Lermen and Daniel Paleka, explored how language models could gather publicly available information and match it across different platforms.
In their experiment, the researchers provided anonymous online profiles to an AI system and allowed it to collect every piece of information it could find. Even seemingly harmless details, such as personal anecdotes or location hints, proved surprisingly revealing.
The team described a hypothetical example in which an anonymous user mentioned struggling at school and regularly walking their dog, Biscuit, through “Dolores Park”. The AI then searched across the web and social media platforms for similar details.
Using those clues, the system was able to link the anonymous account to a known online identity with a high level of confidence.
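As a purely illustrative sketch (not the researchers’ actual LLM-driven pipeline), the linking step can be thought of as scoring candidate public profiles by how many of an anonymous account’s distinctive details they share. Everything below, the profile data, the field names and the scoring rule, is invented for illustration:

```python
# Toy illustration of cross-platform detail matching.
# All data and the scoring rule are hypothetical; the actual study
# used an LLM with web search, not this hand-rolled logic.

# Distinctive details extracted from the anonymous account's posts.
ANON_DETAILS = {"dog named biscuit", "dolores park", "struggling at school"}

# Hypothetical candidate profiles gathered from public platforms.
CANDIDATES = {
    "user_a": {"dog named biscuit", "dolores park", "sf giants fan"},
    "user_b": {"dolores park", "coffee lover"},
    "user_c": {"hiking", "cat owner"},
}

def overlap_score(anon: set[str], candidate: set[str]) -> float:
    """Fraction of the anonymous account's details found in a candidate profile."""
    return len(anon & candidate) / len(anon)

# Rank candidates by how many of the anonymous account's details they share.
ranked = sorted(CANDIDATES.items(),
                key=lambda item: overlap_score(ANON_DETAILS, item[1]),
                reverse=True)

for name, details in ranked:
    print(f"{name}: {overlap_score(ANON_DETAILS, details):.2f}")
# user_a scores highest (two of three details match), mirroring how shared
# fragments like a dog's name and a local park can single someone out.
```

The point of the sketch is that no single detail is identifying on its own; it is the intersection of several details that narrows the field.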
According to the researchers, tools like these dramatically reduce the cost and complexity of privacy attacks.
They said LLMs now make it feasible for attackers to carry out large-scale identity tracing, forcing a “fundamental reassessment of what can be considered private online”.
The risks extend beyond individual privacy. The authors warned that governments could potentially use AI to track anonymous activists or dissidents, while cybercriminals could deploy the technology to create highly personalised scams.
Rising fears over AI surveillance
The research highlights a growing field sometimes described as AI surveillance. These systems use language models to gather and combine large amounts of online information about a person, a process that would normally require extensive manual investigation.
According to Lermen, information already available online can easily be exploited for targeted attacks.
Publicly accessible data can be “misused straightforwardly” for scams such as spear-phishing, where hackers impersonate trusted contacts in order to trick victims into clicking malicious links.
The danger, researchers say, is that AI tools dramatically lower the level of expertise needed to carry out such attacks. With nothing more than a publicly available language model and an internet connection, even relatively inexperienced attackers could apply advanced data-analysis techniques.
However, experts also caution that the technology is not flawless.
Peter Bentley warned that AI systems often make mistakes when linking identities.
“People are going to be accused of things they haven’t done,” he said.
Another concern is that AI systems may use a wider range of public data than people expect. According to Marc Juárez from the University of Edinburgh, datasets such as hospital records, admissions data and public statistics may not be sufficiently anonymised in the age of AI.
“It is quite alarming. I think this paper is showing that we should reconsider our practices,” said Juárez.
Still, the technology does have limits. If there is not enough information online, or if too many people match the same details, identifying a specific individual becomes far more difficult.
“They can only link across platforms where someone consistently shares the same bits of information in both places,” said Marti Hearst.
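To make that limit concrete, a hypothetical back-of-the-envelope calculation (not taken from the paper) shows how the pool of matching users shrinks, or fails to shrink, as details are combined. The population size and match rates below are invented for illustration:

```python
# Back-of-the-envelope sketch of why rare detail combinations are identifying.
# All figures are invented, and the details are assumed to match independently,
# a simplification that real populations will not satisfy exactly.

population = 1_000_000  # hypothetical number of users on a platform

# Assumed fraction of users matching each leaked detail.
match_rates = {
    "walks a dog": 0.30,
    "dog named Biscuit": 0.001,
    "posts about Dolores Park": 0.005,
}

expected_matches = float(population)
for detail, rate in match_rates.items():
    expected_matches *= rate
    print(f"after '{detail}': ~{expected_matches:,.1f} candidates")

# Under these toy assumptions, three small details shrink a million users
# to roughly one or two candidates. Conversely, if every detail were common,
# thousands of users would still match and identification would fail.
```

This also explains the other limit the article notes: if too little is shared, or the shared details are generic, the candidate pool never gets small enough to point to one person.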
Researchers say the findings should encourage both organisations and individuals to rethink how they approach online anonymity. Measures such as limiting automated data scraping and restricting bulk access to user information could help reduce the risk.
At the same time, users may also need to become more cautious about the small details they share online, as even seemingly harmless posts could one day reveal far more than intended.