AI Agent Breaches McKinsey's System

AI agent breached McKinsey’s Lilli platform in two hours, exposing vulnerabilities.
The breach accessed 46.5 million chat messages & 728,000 files used by 40,000+ staff.
McKinsey patched endpoints & revoked API access after CodeWall reported the AI security flaw.

Summarized by AI ⓘ

Mastering AI

SEE ALL

NewsBytes

Google, Accel pick 5 startups for Atoms AI Cohort 2026

Feedpost Specials

AI's Odd Vision: Why Bald Cats Look Like Elephants to Machines

Feedpost Specials

Unlock Google Maps' Ask Maps: Your AI-Powered Planning Companion

What is the story about?

Discover how an AI agent autonomously breached McKinsey's Lilli platform in a mere two hours, revealing new cybersecurity threats where AI itself becomes the attacker. Learn about the discovered vulnerabilities and their broader implications.

The AI Onslaught

In a startling demonstration of evolving cyber threats, a security startup named CodeWall has reported that its autonomous AI agent successfully penetrated

McKinsey's internal AI system, Lilli, within an astonishingly short timeframe of two hours. This controlled test, conducted as a responsible security exercise, highlights a nascent and potentially alarming trend: the capability of artificial intelligence tools to autonomously identify and exploit vulnerabilities in other AI systems. As businesses increasingly integrate AI into their daily operations for tasks ranging from data analysis to document searching, this incident serves as a critical wake-up call about the dual-edged nature of this powerful technology. The implications are far-reaching, suggesting a future where AI-driven attacks could become a significant and rapid cybersecurity challenge.

Unpacking Lilli's Vulnerabilities

The target of this AI intrusion was Lilli, McKinsey's proprietary generative AI platform, which was launched in July 2023 to enhance employee access to company knowledge and internal research. This platform is extensively utilized, with over 70% of McKinsey's workforce, exceeding 40,000 employees, interacting with it monthly, processing more than 500,000 prompts. CodeWall's research agent identified McKinsey as a target by scanning publicly available information, including the company's disclosure policies. Without any insider knowledge or credentials, the AI agent began by mapping the platform's attack surface and analyzing its documentation. It uncovered API documentation revealing over 200 system endpoints, critically identifying 22 that lacked authentication. One such vulnerable endpoint handled user search queries, allowing the AI to manipulate the request structure and inject malicious instructions into the database query. Through iterative testing and analysis of server error responses, the agent was able to access live production data.

Massive Data Exposure

The breach at Lilli reportedly exposed a substantial volume of sensitive internal information. CodeWall's findings indicated that the compromised database contained approximately 46.5 million chat messages generated by employees using the AI tool. These conversations allegedly encompassed strategic discussions, financial planning details, merger and acquisition information, client work, and internal research findings, all stored in plain text and accessible via the vulnerability. Furthermore, the system held about 728,000 files, including documents in PDF, spreadsheet, presentation, and Word formats. The AI agent also gained visibility into over 57,000 user accounts linked to the platform, as well as thousands of internal workspaces and AI assistants employed by McKinsey staff. The full autonomy of the attack, from reconnaissance to reporting, underscores the potential for AI to conduct sophisticated cyber operations without direct human intervention.

Beyond Data Access

The discovered vulnerability extended beyond mere data retrieval; it presented the potential for malicious actors to alter the core operational instructions of the AI system itself. Crucially, Lilli's system prompts—the set of rules governing the chatbot's responses and safeguards—were stored within the same database. The vulnerability's read and write capabilities meant an attacker could potentially modify these prompts, influencing the AI's output and possibly embedding misinformation or circumventing built-in security measures without immediate detection. This aspect raises significant concerns about the integrity and trustworthiness of AI systems when their fundamental guiding principles can be manipulated, potentially leading to compromised decision-making processes within organizations.

Response and Remediation

Upon being notified of the vulnerability by CodeWall at the beginning of March, McKinsey reportedly took swift action to address the security lapse. The consulting firm implemented several measures, including patching the affected API endpoints, revoking public access to the API documentation, and temporarily taking portions of the development environment offline. A spokesperson for McKinsey stated that an investigation, supported by a third-party forensics firm, found no evidence of client data or confidential client information being accessed by the researcher or any other unauthorized party. The company reiterated its commitment to cybersecurity, emphasizing the robustness of its systems and its top priority of protecting entrusted client data and information.