When Your AI Becomes the Attacker

The McKinsey Incident and What It Reveals About Enterprise AI Risk

📋 Incident — 28 February 2026: Security startup CodeWall pointed an autonomous offensive AI agent at the open internet and let it choose its own target. It chose McKinsey & Company's internal AI platform, Lilli. Two hours later, the agent had full read-write access to the production database. McKinsey patched all vulnerabilities within hours of responsible disclosure on 1 March and confirmed no evidence that client data was accessed by unauthorised parties. This analysis draws on public reporting only.
The Incident

An AI Agent Picked Its Own Target. Then It Got In.

Lilli launched in 2023 as McKinsey's institutional brain — a retrieval-augmented generative AI platform serving 43,000+ consultants, processing over 500,000 prompts every month, and sitting on top of decades of proprietary research, client engagement data, and internal strategy work. By early 2026, 70% of the firm's global workforce used it daily.

On 28 February 2026, CodeWall's autonomous offensive agent began its reconnaissance. It found the API documentation publicly accessible — over 200 endpoints, fully documented, available to anyone who looked. Of those, 22 required no authentication at all. The agent probed one that accepted search queries, noticed the JSON field names were being concatenated directly into SQL — not parameterised like the values — and in 15 iterative requests, reverse-engineered the query structure through error responses.

That is SQL injection. Documented in 1998. Still working in a production AI platform in 2026.

2 hrs · From first probe to full read-write database access — no credentials, no insider knowledge
46.5M · Internal chat messages exposed — covering strategy, M&A, and live client engagements
22 · API endpoints with zero authentication in a production system used by 40,000+ people
95 · Writable system prompts — the AI's behavioural instructions, stored in the same vulnerable database

Timeline:

T + 0 min · Reconnaissance: Agent scans internet, selects McKinsey's Lilli. API docs publicly accessible. 200+ endpoints mapped.
T + ~30 min · Injection found: SQL injection in JSON field names — not values. Standard scanners (OWASP ZAP) did not flag it.
T + ~90 min · Escalation via IDOR: Chains SQL injection with Insecure Direct Object Reference. Increments user IDs. 57,000 employee records traversed.
T + 2 hrs · Full database access: Read-write access to 46.5M messages, 728K files, 57K accounts — and all 95 writable system prompts.
1 Mar · Disclosure & patch: CodeWall discloses to McKinsey. CISO acknowledges within hours. All endpoints patched. API docs restricted.

What Was Exposed

The Scale Reflects How Deeply AI Was Integrated

Lilli was not a chatbot sitting in a corner of the organisation. It was McKinsey's institutional memory — connected to decades of proprietary research, live client work, and the behavioural configuration that governed how 43,000 consultants interacted with it every day. The database it sat on top of was not a peripheral system. It was the intellectual core of the firm's AI deployment.

46.5 million · Internal chat messages: covering active client engagements, corporate strategy, M&A — the firm's working intelligence, fully accessible
728K · Internal files: with S3 paths intact — attackers knew exactly where originals lived in cloud storage
57K · Employee accounts: full user profiles for McKinsey's global consulting workforce
3.68M · RAG document chunks: proprietary methodologies and client frameworks accumulated over decades
95 · Writable system prompts ⚠: the AI's behavioural instructions — stored in the same database, modifiable with a single UPDATE statement

That last figure deserves its own paragraph. An attacker with write access to those 95 system prompts could have rewritten the instructions governing how Lilli responds to every query from every consultant — silently, persistently, and indistinguishably from a legitimate configuration update. Strategic advice poisoned at the source. That attack did not happen. This time.

How It Happened

The Vulnerability Was Not New. The Context Was.

McKinsey's developers did the standard thing. They parameterised user input values in their SQL queries — the textbook defence against injection attacks. What they missed was that JSON field names were also being concatenated into SQL, an unusual vector that standard automated scanners, including OWASP ZAP, do not typically test for.

lilli-api — SQL injection vector (simplified)
/* Standard parameterisation — correctly secured ✓ */
SELECT * FROM documents WHERE value = $1

/* The blind spot — field NAMES concatenated directly ✗ */
SELECT * FROM documents WHERE {fieldName} = $1

/* CodeWall's agent: 15 iterations, error-based inference
   OWASP ZAP result: no flag — it tests values, not key names */

/* Once in: IDOR via sequential ID increment */
GET /api/users/1001/history  → employee 1001 search history
GET /api/users/1002/history  → employee 1002 search history
                            → repeat × 57,000

/* The most dangerous query available */
UPDATE system_prompts SET content = '[attacker-controlled]'
  WHERE model_type = 'all'  -- all 95 prompts, rewritten
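
Field names cannot be bound as query parameters (database drivers only parameterise values), so the defence is an explicit allow-list, optionally combined with an identifier-quoting API. Below is a minimal sketch in Python with psycopg2; the endpoint shape, column names, and function names are illustrative assumptions, not details from the disclosure.

field-name allow-listing (defensive sketch, Python)
# Values are parameterised as usual; identifiers are checked against an
# explicit allow-list and quoted with psycopg2's identifier-safe API.
from psycopg2 import sql

# Only the columns the search API is meant to expose. Anything else is
# rejected before it can reach the query builder.
ALLOWED_FIELDS = {"title", "author", "created_at", "source"}

def build_search_query(field_name: str, value: str):
    if field_name not in ALLOWED_FIELDS:
        raise ValueError(f"unknown search field: {field_name!r}")
    # sql.Identifier safely quotes the column name; the value stays a
    # bind parameter, exactly as in the correctly secured case above.
    query = sql.SQL("SELECT * FROM documents WHERE {} = %s").format(
        sql.Identifier(field_name)
    )
    return query, (value,)

Either check alone closes the vector; together, a field name that is not on the list never reaches the SQL string at all.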

"McKinsey's team did the standard thing. They followed the textbook. The real failure was not that a developer missed an edge case — it was that the architecture had no independent layers of defence between the internet and the production database."

Traefik Security Analysis — March 2026

Where the attack-surface exposure actually sits:

Unauthenticated endpoints · CRITICAL · 22 of 200+
Public API documentation · CRITICAL · Full exposure
Writable system prompts · CRITICAL · 95 prompts
AI config + user data co-located · HIGH · Same DB
No production API monitoring · HIGH · Zero alerts
IDOR on user records · HIGH · 57K accounts
Scanner blind spot (field names) · MEDIUM · Not detected

The Real Lesson

The AI Was Not the Problem. The Infrastructure It Sat On Was.

Most commentary on the Lilli breach focused on the SQL injection. That is the wrong lens. SQL injection was the mechanism. The root cause was architectural: a production AI platform with no independent layers of defence between the public internet and its production database — no gateway authentication, no separation between AI configuration and user data, no behavioural monitoring in production that would flag 15 sequential requests to the same endpoint with modified key names.

The model-level safety controls that most organisations invest in — guardrails, output filtering, jailbreak resistance — are entirely bypassed when an attacker goes around the model and directly into the infrastructure it depends on. The AI is not the whole attack surface. The action layer is.

GATE 01 🔐 · Authenticate every endpoint — no exceptions
API gateways with OAuth enforcement close 22 unauthenticated endpoints in one pass. This gate alone would have stopped the Lilli breach entirely.
Breach stopped here
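
What "no exceptions" looks like in code is deny-by-default: authentication enforced once, at the gateway or application level, so no individual route can ship without it. A minimal sketch using FastAPI; the framework choice and the token-validation stub are illustrative assumptions, not details from the incident.

deny-by-default authentication (sketch, Python/FastAPI)
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

bearer = HTTPBearer()

def is_valid_token(token: str) -> bool:
    # Placeholder: verify the OAuth2/JWT token's signature, issuer,
    # audience, and expiry against your identity provider.
    raise NotImplementedError

def require_auth(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    if not is_valid_token(creds.credentials):
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)

# Applied at construction time: every route inherits the check, so
# "someone forgot to add auth to this endpoint" stops being a possible
# failure mode. The 22 open endpoints could not have existed.
app = FastAPI(dependencies=[Depends(require_auth)])

@app.get("/api/search")
def search(q: str) -> dict:
    return {"query": q, "results": []}
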
GATE 02 🗄️ · Separate AI config from user data
System prompts, RAG settings, and model parameters must not share a database with user records. One injection should not give write access to behavioural configuration.
Prompt poisoning stopped
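
One concrete way to enforce that separation is at the credential level: the serving tier holds a read-only connection to the prompt store, and writes happen only through a versioned deployment pipeline under a separate, audited role. A sketch of the idea; the database names, roles, and GRANT shown are illustrative assumptions.

least-privilege config separation (sketch, Python)
import psycopg2

# Serving tier: read-only role, config database only. Even a successful
# injection through this connection cannot UPDATE a system prompt,
# because the role was never granted write permissions:
#   GRANT SELECT ON system_prompts TO lilli_serving;
#   -- no INSERT/UPDATE/DELETE granted
config_read = psycopg2.connect("dbname=ai_config user=lilli_serving")

# User data lives in a separate database behind a separate role, so one
# compromised query path does not expose the other store.
user_data = psycopg2.connect("dbname=user_data user=lilli_app")
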
GATE 03 📡 · Monitor API behaviour in production
15 sequential requests to the same endpoint with modified field names. Sequential ID traversal across 57,000 accounts. Neither pattern resembles legitimate user behaviour. Detection gives a second line of defence when scanners miss something.
IDOR stopped here
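
Neither pattern needs sophisticated analytics to catch: a sliding window of recent requests per client is enough to flag both behaviours the agent exhibited. The log schema and thresholds in this sketch are illustrative assumptions.

API behaviour monitoring (sketch, Python)
from collections import defaultdict, deque

WINDOW = 20           # recent requests remembered per client
PROBE_THRESHOLD = 10  # hits on one endpoint before we look closer

history = defaultdict(lambda: deque(maxlen=WINDOW))

def observe(client: str, endpoint: str, object_id: int, params: frozenset) -> None:
    h = history[client]
    h.append((endpoint, object_id, params))

    same = [r for r in h if r[0] == endpoint]
    if len(same) < PROBE_THRESHOLD:
        return
    ids = [r[1] for r in same]
    # Strictly increasing object IDs: the IDOR traversal pattern
    if ids == sorted(ids) and len(set(ids)) == len(ids):
        alert(client, endpoint, "sequential object-ID traversal")
    # Many distinct parameter-name sets on one endpoint: the
    # field-name mutation pattern the injection probe produced
    if len({r[2] for r in same}) >= 5:
        alert(client, endpoint, "parameter-name mutation probing")

def alert(client: str, endpoint: str, pattern: str) -> None:
    print(f"ALERT [{pattern}] client={client} endpoint={endpoint}")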

"The next generation of AI incidents will come from agents sitting on top of weak action layers: exposed APIs, unauthenticated services, forgotten integrations, and misconfigured MCP servers. The model is not the whole attack surface. The API layer is."

Salt Security — March 2026

Gartner estimates 40% of enterprise applications will integrate AI agents by end of 2026 — up from less than 5% in 2025. The MCP ecosystem has grown to over 10,000 published servers. Every AI deployment is an action layer. Every action layer is an attack surface. The question is not whether your organisation has exposure. It is whether you can see it — before an autonomous agent does.

CodeWall's agent needed two hours. That is the window between "everything looks fine" and "full read-write access to your production database." If an autonomous agent targeted your AI platform today — what would it find?

What Secompass Does

We Secure the Action Layer — Before Someone Else Finds It.

The Lilli breach is not a unique event — it is a preview. Most organisations deploying AI in 2026 are making the same connection decisions McKinsey made, with the same gaps in their action layer. Secompass works with organisations across Australia and New Zealand to find those gaps and close them before an incident forces the issue.

Secompass AI Security Controls (OWASP ASI-01 · LLM06)

1. AI Agent Inventory & Discovery: a live map of every agent, MCP server, and API connection in your environment — including deployments by individual teams without central IT review. You cannot govern what you cannot see.
2. API Authentication Audit: surface every unauthenticated or under-authenticated endpoint your AI agents can reach, before a CodeWall-style autonomous scan does it for you. Every endpoint requires authentication — no exceptions.
3. System Prompt Integrity Controls: treat system prompts like source code, meaning versioned, access-controlled, separated from user data, and monitored for unauthorised modification. Any write operation triggers an alert; a minimal sketch of the mechanism follows this list.
4. Tool Invocation & API Behaviour Monitoring: log and baseline every tool call and API request. Alert on anomalous patterns — sequential ID traversal, repeated requests with modified parameters, out-of-scope access — in production, in real time.
5. AI Governance Framework: build governance infrastructure aligned to OWASP Top 10 for LLMs 2025, OWASP Agentic AI Top 10 2026, CIS Controls v8.1, ISO 27001, and SOC 2 — making your AI deployment auditable and defensible.
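
The detection half of control 3 is simple enough to sketch: pin a hash of each prompt at deployment time, then compare the live store against those hashes on a schedule and alert on drift. The prompt IDs and truncated hashes below are placeholders, not real values.

system-prompt integrity check (sketch, Python)
import hashlib

def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Pinned at deploy time from the version-controlled prompt repository.
# Hash values truncated here for illustration.
DEPLOYED_HASHES = {
    "research-assistant": "9f2c...",
    "client-summary": "41ab...",
}

def drifted_prompts(fetch_live_prompt) -> list[str]:
    """Return IDs of prompts whose live content no longer matches the
    hash recorded at deployment, i.e. modified outside the pipeline."""
    out = []
    for prompt_id, expected in DEPLOYED_HASHES.items():
        if digest(fetch_live_prompt(prompt_id)) != expected:
            out.append(prompt_id)  # raise an alert here in production
    return out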

Work with Secompass

Find Out What an Autonomous Agent Would See in Your Environment — Before It Does.

We help organisations across Australia and New Zealand audit their AI infrastructure, identify action-layer exposure, and build governance frameworks that hold up to scrutiny.

  • Do you have a complete inventory of every AI agent and API connection in your environment?
  • Are your AI-connected endpoints authenticated and monitored in production?
  • Are your system prompts version-controlled and separated from your user data?
Book a Free Consultation →
Sources: CodeWall technical disclosure, March 2026 · The Register, 9 March 2026 · 1Kosmos McKinsey breach analysis, 24 March 2026 · Salt Security blog, 13 March 2026 · Hathr.AI breach breakdown · Traefik Security analysis, 20 March 2026 · PointGuard AI incident report · Swept AI enterprise lessons, 12 March 2026 · Outpost24 research · Treblle API security analysis, 18 March 2026 · Gartner Cybersecurity Trends 2025 · OWASP Top 10 for Agentic Applications 2026

This post is for general informational and educational purposes only. It does not constitute legal, technical, or professional cybersecurity advice. Secompass recommends engaging a qualified adviser before making decisions based on this content.

📂 Browse our blog for more insights on cybersecurity, AI governance, and data protection.
