When Your AI Becomes the Attacker

The McKinsey Incident and What It Reveals About Enterprise AI Risk

📋 Incident — 28 February 2026: Security startup CodeWall pointed an autonomous offensive AI agent at the open internet and let it choose its own target. It chose McKinsey & Company's internal AI platform, Lilli. Two hours later, the agent had full read-write access to the production database. McKinsey patched all vulnerabilities within hours of responsible disclosure on 1 March and confirmed no evidence that client data was accessed by unauthorised parties. This analysis draws on public reporting only.
The Incident

An AI Agent Picked Its Own Target. Then It Got In.

Lilli launched in 2023 as McKinsey's institutional brain — a retrieval-augmented generative AI platform serving 43,000+ consultants, processing over 500,000 prompts every month, and sitting on top of decades of proprietary research, client engagement data, and internal strategy work. By early 2026, 70% of the firm's global workforce used it daily.

On 28 February 2026, CodeWall's autonomous offensive agent began its reconnaissance. It found the API documentation publicly accessible — over 200 endpoints, fully documented, available to anyone who looked. Of those, 22 required no authentication at all. The agent probed one that accepted search queries, noticed the JSON field names were being concatenated directly into SQL — not parameterised like the values — and in 15 iterative requests, reverse-engineered the query structure through error responses.

That is SQL injection. Documented in 1998. Still working in a production AI platform in 2026.

2 hrs · From first probe to full read-write database access — no credentials, no insider knowledge
46.5M · Internal chat messages exposed — covering strategy, M&A, and live client engagements
22 · API endpoints with zero authentication in a production system used by 40,000+ people
95 · Writable system prompts — the AI's behavioural instructions, stored in the same vulnerable database

Timeline:

T + 0 min · Reconnaissance: Agent scans internet, selects McKinsey's Lilli. API docs publicly accessible. 200+ endpoints mapped.
T + ~30 min · Injection found: SQL injection in JSON field names — not values. Standard scanners (OWASP ZAP) did not flag it.
T + ~90 min · Escalation via IDOR: Chains SQL injection with Insecure Direct Object Reference. Increments user IDs. 57,000 employee records traversed.
T + 2 hrs · Full database access: Read-write access to 46.5M messages, 728K files, 57K accounts — and all 95 writable system prompts.
1 Mar · Disclosure & patch: CodeWall discloses to McKinsey. CISO acknowledges within hours. All endpoints patched. API docs restricted.

What Was Exposed

The Scale Reflects How Deeply AI Was Integrated

Lilli was not a chatbot sitting in a corner of the organisation. It was McKinsey's institutional memory — connected to decades of proprietary research, live client work, and the behavioural configuration that governed how 43,000 consultants interacted with it every day. The database it sat on top of was not a peripheral system. It was the intellectual core of the firm's AI deployment.

46.5 million · Internal chat messages: covering active client engagements, corporate strategy, M&A — the firm's working intelligence, fully accessible
728K · Internal files: with S3 paths intact — attackers knew exactly where originals lived in cloud storage
57K · Employee accounts: full user profiles for McKinsey's global consulting workforce
3.68M · RAG document chunks: proprietary methodologies and client frameworks accumulated over decades
95 · Writable system prompts ⚠: the AI's behavioural instructions — stored in the same database, modifiable with a single UPDATE statement

That last figure deserves its own paragraph. An attacker with write access to those 95 system prompts could have rewritten the instructions governing how Lilli responds to every query from every consultant — silently, persistently, and indistinguishably from a legitimate configuration update. Strategic advice poisoned at the source. That attack did not happen. This time.

How It Happened

The Vulnerability Was Not New. The Context Was.

McKinsey's developers did the standard thing. They parameterised user input values in their SQL queries — the textbook defence against injection attacks. What they missed was that JSON field names were also being concatenated into SQL, an unusual vector that standard automated scanners, including OWASP ZAP, do not typically test for.

lilli-api — SQL injection vector (simplified)
/* Standard parameterisation — correctly secured ✓ */
SELECT * FROM documents WHERE value = $1

/* The blind spot — field NAMES concatenated directly ✗ */
SELECT * FROM documents WHERE {fieldName} = $1

/* CodeWall's agent: 15 iterations, error-based inference
   OWASP ZAP result: no flag — it tests values, not key names */

/* Once in: IDOR via sequential ID increment */
GET /api/users/1001/history  → employee 1001 search history
GET /api/users/1002/history  → employee 1002 search history
                            → repeat × 57,000

/* The most dangerous query available */
UPDATE system_prompts SET content = '[attacker-controlled]'
  WHERE model_type = 'all'  -- all 95 prompts, rewritten
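
Field names cannot be bound as query parameters (database drivers only parameterise values), so the defence is an explicit allow-list, optionally combined with an identifier-quoting API. Below is a minimal sketch in Python with psycopg2; the endpoint shape, column names, and function names are illustrative assumptions, not details from the disclosure.

field-name allow-listing (defensive sketch, Python)
# Values are parameterised as usual; identifiers are checked against an
# explicit allow-list and quoted with psycopg2's identifier-safe API.
from psycopg2 import sql

# Only the columns the search API is meant to expose. Anything else is
# rejected before it can reach the query builder.
ALLOWED_FIELDS = {"title", "author", "created_at", "source"}

def build_search_query(field_name: str, value: str):
    if field_name not in ALLOWED_FIELDS:
        raise ValueError(f"unknown search field: {field_name!r}")
    # sql.Identifier safely quotes the column name; the value stays a
    # bind parameter, exactly as in the correctly secured case above.
    query = sql.SQL("SELECT * FROM documents WHERE {} = %s").format(
        sql.Identifier(field_name)
    )
    return query, (value,)

Either check alone closes the vector; together, a field name that is not on the list never reaches the SQL string at all.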

"McKinsey's team did the standard thing. They followed the textbook. The real failure was not that a developer missed an edge case — it was that the architecture had no independent layers of defence between the internet and the production database."

Traefik Security Analysis — March 2026

Where the attack-surface exposure actually sits:

Unauthenticated endpoints · CRITICAL · 22 of 200+
Public API documentation · CRITICAL · Full exposure
Writable system prompts · CRITICAL · 95 prompts
AI config + user data co-located · HIGH · Same DB
No production API monitoring · HIGH · Zero alerts
IDOR on user records · HIGH · 57K accounts
Scanner blind spot (field names) · MEDIUM · Not detected

The Real Lesson

The AI Was Not the Problem. The Infrastructure It Sat On Was.

Most commentary on the Lilli breach focused on the SQL injection. That is the wrong lens. SQL injection was the mechanism. The root cause was architectural: a production AI platform with no independent layers of defence between the public internet and its production database — no gateway authentication, no separation between AI configuration and user data, no behavioural monitoring in production that would flag 15 sequential requests to the same endpoint with modified key names.

The model-level safety controls that most organisations invest in — guardrails, output filtering, jailbreak resistance — are entirely bypassed when an attacker goes around the model and directly into the infrastructure it depends on. The AI is not the whole attack surface. The action layer is.

GATE 01 🔐 · Authenticate every endpoint — no exceptions
API gateways with OAuth enforcement close 22 unauthenticated endpoints in one pass. This gate alone would have stopped the Lilli breach entirely.
Breach stopped here
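
What "no exceptions" looks like in code is deny-by-default: authentication enforced once, at the gateway or application level, so no individual route can ship without it. A minimal sketch using FastAPI; the framework choice and the token-validation stub are illustrative assumptions, not details from the incident.

deny-by-default authentication (sketch, Python/FastAPI)
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

bearer = HTTPBearer()

def is_valid_token(token: str) -> bool:
    # Placeholder: verify the OAuth2/JWT token's signature, issuer,
    # audience, and expiry against your identity provider.
    raise NotImplementedError

def require_auth(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    if not is_valid_token(creds.credentials):
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)

# Applied at construction time: every route inherits the check, so
# "someone forgot to add auth to this endpoint" stops being a possible
# failure mode. The 22 open endpoints could not have existed.
app = FastAPI(dependencies=[Depends(require_auth)])

@app.get("/api/search")
def search(q: str) -> dict:
    return {"query": q, "results": []}
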
GATE 02 🗄️ · Separate AI config from user data
System prompts, RAG settings, and model parameters must not share a database with user records. One injection should not give write access to behavioural configuration.
Prompt poisoning stopped
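
One concrete way to enforce that separation is at the credential level: the serving tier holds a read-only connection to the prompt store, and writes happen only through a versioned deployment pipeline under a separate, audited role. A sketch of the idea; the database names, roles, and GRANT shown are illustrative assumptions.

least-privilege config separation (sketch, Python)
import psycopg2

# Serving tier: read-only role, config database only. Even a successful
# injection through this connection cannot UPDATE a system prompt,
# because the role was never granted write permissions:
#   GRANT SELECT ON system_prompts TO lilli_serving;
#   -- no INSERT/UPDATE/DELETE granted
config_read = psycopg2.connect("dbname=ai_config user=lilli_serving")

# User data lives in a separate database behind a separate role, so one
# compromised query path does not expose the other store.
user_data = psycopg2.connect("dbname=user_data user=lilli_app")
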
GATE 03 📡 · Monitor API behaviour in production
15 sequential requests to the same endpoint with modified field names. Sequential ID traversal across 57,000 accounts. Neither pattern resembles legitimate user behaviour. Detection gives a second line of defence when scanners miss something.
IDOR stopped here
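
Neither pattern needs sophisticated analytics to catch: a sliding window of recent requests per client is enough to flag both behaviours the agent exhibited. The log schema and thresholds in this sketch are illustrative assumptions.

API behaviour monitoring (sketch, Python)
from collections import defaultdict, deque

WINDOW = 20           # recent requests remembered per client
PROBE_THRESHOLD = 10  # hits on one endpoint before we look closer

history = defaultdict(lambda: deque(maxlen=WINDOW))

def observe(client: str, endpoint: str, object_id: int, params: frozenset) -> None:
    h = history[client]
    h.append((endpoint, object_id, params))

    same = [r for r in h if r[0] == endpoint]
    if len(same) < PROBE_THRESHOLD:
        return
    ids = [r[1] for r in same]
    # Strictly increasing object IDs: the IDOR traversal pattern
    if ids == sorted(ids) and len(set(ids)) == len(ids):
        alert(client, endpoint, "sequential object-ID traversal")
    # Many distinct parameter-name sets on one endpoint: the
    # field-name mutation pattern the injection probe produced
    if len({r[2] for r in same}) >= 5:
        alert(client, endpoint, "parameter-name mutation probing")

def alert(client: str, endpoint: str, pattern: str) -> None:
    print(f"ALERT [{pattern}] client={client} endpoint={endpoint}")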

"The next generation of AI incidents will come from agents sitting on top of weak action layers: exposed APIs, unauthenticated services, forgotten integrations, and misconfigured MCP servers. The model is not the whole attack surface. The API layer is."

Salt Security — March 2026

Gartner estimates 40% of enterprise applications will integrate AI agents by end of 2026 — up from less than 5% in 2025. The MCP ecosystem has grown to over 10,000 published servers. Every AI deployment is an action layer. Every action layer is an attack surface. The question is not whether your organisation has exposure. It is whether you can see it — before an autonomous agent does.

CodeWall's agent needed two hours. That is the window between "everything looks fine" and "full read-write access to your production database." If an autonomous agent targeted your AI platform today — what would it find?

What Secompass Does

We Secure the Action Layer — Before Someone Else Finds It.

The Lilli breach is not a unique event — it is a preview. Most organisations deploying AI in 2026 are making the same connection decisions McKinsey made, with the same gaps in their action layer. Secompass works with organisations across Australia and New Zealand to find those gaps and close them before an incident forces the issue.

Secompass AI Security Controls (OWASP ASI-01 · LLM06)

1. AI Agent Inventory & Discovery: a live map of every agent, MCP server, and API connection in your environment — including deployments by individual teams without central IT review. You cannot govern what you cannot see.
2. API Authentication Audit: surface every unauthenticated or under-authenticated endpoint your AI agents can reach, before a CodeWall-style autonomous scan does it for you. Every endpoint requires authentication — no exceptions.
3. System Prompt Integrity Controls: treat system prompts like source code, meaning versioned, access-controlled, separated from user data, and monitored for unauthorised modification. Any write operation triggers an alert; a minimal sketch of the mechanism follows this list.
4. Tool Invocation & API Behaviour Monitoring: log and baseline every tool call and API request. Alert on anomalous patterns — sequential ID traversal, repeated requests with modified parameters, out-of-scope access — in production, in real time.
5. AI Governance Framework: build governance infrastructure aligned to OWASP Top 10 for LLMs 2025, OWASP Agentic AI Top 10 2026, CIS Controls v8.1, ISO 27001, and SOC 2 — making your AI deployment auditable and defensible.
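
The detection half of control 3 is simple enough to sketch: pin a hash of each prompt at deployment time, then compare the live store against those hashes on a schedule and alert on drift. The prompt IDs and truncated hashes below are placeholders, not real values.

system-prompt integrity check (sketch, Python)
import hashlib

def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Pinned at deploy time from the version-controlled prompt repository.
# Hash values truncated here for illustration.
DEPLOYED_HASHES = {
    "research-assistant": "9f2c...",
    "client-summary": "41ab...",
}

def drifted_prompts(fetch_live_prompt) -> list[str]:
    """Return IDs of prompts whose live content no longer matches the
    hash recorded at deployment, i.e. modified outside the pipeline."""
    out = []
    for prompt_id, expected in DEPLOYED_HASHES.items():
        if digest(fetch_live_prompt(prompt_id)) != expected:
            out.append(prompt_id)  # raise an alert here in production
    return out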

Work with Secompass

Find Out What an Autonomous Agent Would See in Your Environment — Before It Does.

We help organisations across Australia and New Zealand audit their AI infrastructure, identify action-layer exposure, and build governance frameworks that hold up to scrutiny.

  • Do you have a complete inventory of every AI agent and API connection in your environment?
  • Are your AI-connected endpoints authenticated and monitored in production?
  • Are your system prompts version-controlled and separated from your user data?
Book a Free Consultation →
Sources: CodeWall technical disclosure, March 2026 · The Register, 9 March 2026 · 1Kosmos McKinsey breach analysis, 24 March 2026 · Salt Security blog, 13 March 2026 · Hathr.AI breach breakdown · Traefik Security analysis, 20 March 2026 · PointGuard AI incident report · Swept AI enterprise lessons, 12 March 2026 · Outpost24 research · Treblle API security analysis, 18 March 2026 · Gartner Cybersecurity Trends 2025 · OWASP Top 10 for Agentic Applications 2026

This post is for general informational and educational purposes only. It does not constitute legal, technical, or professional cybersecurity advice. Secompass recommends engaging a qualified adviser before making decisions based on this content.

📂 Browse our blog for more insights on cybersecurity, AI governance, and data protection.
