An AI Agent Broke Into McKinsey's Internal Chatbot. Here's What That Means for Every Enterprise Deploying AI Workers.

Earlier this month, security startup CodeWall pointed an autonomous AI agent at Lilli, McKinsey’s internal generative AI platform used by more than 40,000 employees.

Earlier this month, security startup CodeWall pointed an autonomous AI agent at Lilli, McKinsey's internal generative AI platform used by more than 40,000 employees. Within two hours, the agent had gained full read and write access to the production database, accessing tens of millions of chatbot conversations and hundreds of thousands of files tied to corporate consulting work.

According to research published by CodeWall and reported by The Register and Inc., the agent accessed 46.5 million chat messages covering strategy, mergers and acquisitions, and client engagements, all in plaintext, along with 728,000 files containing confidential client data, 57,000 user accounts, and 95 system prompts controlling the AI's behavior.

What made the breach particularly significant was the write access. Because Lilli's internal system prompts were stored in the same database, an attacker could have altered them without deploying new code or triggering standard security alerts. As CodeWall noted in their research: no deployment needed, no code change, just a single UPDATE statement wrapped in a single HTTP call.

The vulnerability itself was not sophisticated. SQL injection is one of the oldest bug classes in security. Lilli had been running in production for over two years and McKinsey's own internal scanners had failed to find any issues. An autonomous AI agent found it because it does not follow checklists. It maps, probes, chains, and escalates continuously and at machine speed.
For enterprises deploying AI agents today, this incident makes the stakes concrete. AI systems wired into weakly governed APIs create blast radii that can expand very quickly. The attack surface is no longer just the model. It is the full connected infrastructure around it.

This is precisely the problem Insygna was built to address. When AI agents operate without verified identity, auditable permissions, and independent oversight, enterprises have no reliable way to know what an agent is authorized to do, who is accountable when something goes wrong, or whether their own AI infrastructure is being used against them. The McKinsey incident is not an edge case. It is a preview of what enterprise AI governance without infrastructure looks like at scale.

Create Your Free Account