ARMORIQ

How the Intent Assurance Plane (IAP) Disrupts the AI Espionage Pattern Anthropic Describes

Anthropic's AI espionage research reveals attackers using AI agents as quiet operators. Learn how ArmorIQ's Intent Assurance Plane enforces 'why' agents act through cryptographic plan verification and composite identity binding.


Anthropic's recent article on disrupting AI espionage captures a shift many security teams are just beginning to see. Attackers are no longer only sending phishing emails or trying to brute-force VPNs. They are starting to use AI agents as quiet operators inside organizations: agents that can read internal docs, chain together tools, and make decisions with very little human supervision.

These agents behave much more like human spies than like traditional malware. They gather context, learn how systems fit together, identify soft spots, and then exploit workflows rather than vulnerabilities. And they do all of this using the same interfaces and tools that legitimate users and automations use.

At ArmorIQ, we built the Intent Assurance Plane (IAP) to address exactly this class of threat. The campaigns Anthropic describes do not succeed because we lack identity or access control. They succeed because the infrastructure has no way to verify whether an AI agent's actions still match the intent of the task it was supposedly performing. IAP adds this missing layer of governance.

The Core Issue: No One Enforces "Why" the Agent Is Acting

Today's enterprises are pretty good at answering three questions:

  • Who is acting? (identity, auth)
  • What can they access? (permissions, roles)
  • Where are they acting from? (network, device, location)

But Anthropic's scenarios show that for AI agents, these are not enough. Espionage-style attacks exploit the fact that almost no system verifies:

Why is this agent taking this action at this point in time?

If a model starts a task to draft an email campaign, and halfway through begins quietly collecting internal documents, or probing ticketing systems, or requesting broader access, nothing in a traditional stack sees that as a violation. As long as the agent is properly authenticated and the permissions check out, the action goes through. That is the gap IAP is designed to close.

IAP's basic idea is simple: you cannot secure AI agents by governing identity alone. You must secure the intent behind their actions.

How IAP Changes the Control Plane

IAP introduces a new enforcement layer that sits alongside identity and access control. Instead of trying to inspect prompts or guess at "AI safety," it gives every task an explicit, signed definition of what is in scope and then requires the agent to prove that each action belongs to that scope.

When a user (or an upstream system) requests a task, IAP converts that request into a structured plan of steps. That plan is canonicalised into a graph and hashed into a Merkle root. The result is a cryptographic fingerprint of the intended behavior for that task. IAP signs this fingerprint and issues an intent token that defines exactly what the agent is allowed to do.
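The commitment step above can be sketched in a few lines. This is a minimal illustration, not ArmorIQ's implementation: it assumes the plan is flattened into a list of step records (a real plan graph would also carry edges), and the names `leaf_hash`, `merkle_root`, and `commit_plan`, the step fields, and the domain-separation prefixes are all hypothetical.

```python
import hashlib
import json


def leaf_hash(step: dict) -> bytes:
    # Canonical JSON (sorted keys, no whitespace) so equal steps hash equally.
    encoded = json.dumps(step, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(b"\x00" + encoded).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        raise ValueError("plan must contain at least one step")
    level = leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def commit_plan(steps: list[dict]) -> bytes:
    # Sort steps by their canonical encoding so input order does not matter.
    ordered = sorted(steps, key=lambda s: json.dumps(s, sort_keys=True))
    return merkle_root([leaf_hash(s) for s in ordered])


# Hypothetical plan for a "summarize these incidents" task.
plan = [
    {"action": "read", "resource": "incidents/2024"},
    {"action": "call_tool", "resource": "summarizer"},
    {"action": "write", "resource": "drafts/summary"},
]
h_P = commit_plan(plan)
print(h_P.hex())
```

Because the steps are canonicalised before hashing, the same plan always yields the same root, while adding or changing a single step yields a different one.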

From that point on, any action the agent tries to take (calling an internal tool, reading data, making a change, or invoking another agent) must present evidence that the action is part of the signed plan. If it cannot, the Policy Enforcement Point (PEP) simply refuses to execute it.

This is the key difference in mindset:

  • Without IAP, we ask: "Does this agent have permission to do this?"
  • With IAP, we also ask: "Was this action part of what the agent was actually supposed to do?"

Only if both are true does the system move forward.
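As a rough sketch of that two-question check, assume permissions and plans can each be modelled as sets of (verb, resource) pairs. The `Action` type, the `pep_decide` function, and the example agent and resources are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    agent: str
    verb: str
    resource: str


def has_permission(action: Action, acl: dict) -> bool:
    # Traditional check: is this agent allowed to do this at all?
    return (action.verb, action.resource) in acl.get(action.agent, set())


def in_plan(action: Action, plan: set) -> bool:
    # Intent check: was this action part of the committed task?
    return (action.verb, action.resource) in plan


def pep_decide(action: Action, acl: dict, plan: set) -> bool:
    # Execute only when both questions are answered "yes".
    return has_permission(action, acl) and in_plan(action, plan)


acl = {
    "campaign-agent": {
        ("read", "crm/contacts"),
        ("read", "hr/salaries"),  # broad standing permission
        ("send", "email/outbound"),
    }
}
plan = {("read", "crm/contacts"), ("send", "email/outbound")}

in_scope = Action("campaign-agent", "read", "crm/contacts")
drift = Action("campaign-agent", "read", "hr/salaries")

print(pep_decide(in_scope, acl, plan))
print(pep_decide(drift, acl, plan))
```

The second action is permitted by the access control list but was never part of the email-campaign plan, so the combined check rejects it.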

Step-by-Step: How IAP Disrupts Anthropic's AI Espionage Pattern

Anthropic's examples show agents that start with benign tasks and then drift into reconnaissance, data harvesting, or operational manipulation. Here is how that pattern breaks down under IAP.

Step 1: Every task starts with a signed plan

Under IAP, even a simple request such as "summarize these incidents," "draft a response to this ticket," or "suggest a remediation playbook" is converted into a plan graph. That graph encodes what the agent is expected to do: which systems it may read, which tools it may call, what kind of outputs it may generate.

IAP canonicalises this graph into a deterministic structure (the CSRG) and computes a Merkle root (h_P). The system then issues a signed intent token (τ_P) that binds h_P to the user, the agent, and the environment.

At this stage, the AI agent has freedom to reason inside the plan, but the allowed action surface has been locked down and signed.
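One way to picture the token issued in this step is shown below. It is a sketch under stated assumptions: HMAC-SHA256 with a shared demo key stands in for whatever signature scheme IAP actually uses, and the claim names, `issue_intent_token`, and `verify_intent_token` are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: a shared demo key. A real deployment would more likely use an
# asymmetric signature so that verifiers cannot mint tokens themselves.
SIGNING_KEY = b"demo-key-not-for-production"


def _payload(claims: dict) -> bytes:
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode()


def issue_intent_token(plan_root_hex: str, user: str, agent: str, env: str) -> dict:
    # Bind the plan root to the requester, the acting agent, and the environment.
    claims = {"plan_root": plan_root_hex, "user": user, "agent": agent, "env": env}
    sig = hmac.new(SIGNING_KEY, _payload(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}


def verify_intent_token(token: dict) -> bool:
    expected = hmac.new(SIGNING_KEY, _payload(token["claims"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["sig"])


plan_root_hex = hashlib.sha256(b"example plan").hexdigest()  # placeholder for h_P
token = issue_intent_token(plan_root_hex, "alice", "triage-agent", "staging")

# Changing any bound field after signing invalidates the token.
tampered = {"claims": dict(token["claims"], env="production"), "sig": token["sig"]}

print(verify_intent_token(token), verify_intent_token(tampered))
```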

Step 2: Every action must prove it belongs to the plan

When the agent wants to take an action (say, reading a log source, calling an internal API, or invoking another automation), it must present two things:

  • The signed intent token τ_P, and
  • A Merkle proof showing that the action's node is present in the committed plan graph.

If an AI espionage agent tries to "go exploring" and call tools or APIs that were not part of the original plan, those calls fail at the PEP. There is no matching node in the CSRG, no valid proof, and therefore no execution.

This breaks the common pattern Anthropic describes: start with a harmless workflow, then quietly expand into discovery and exfiltration. Under IAP, that expansion cannot happen silently. It must either be explicitly added to the plan (requiring a Trust Update) or be rejected.
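The proof-of-inclusion check can be illustrated with a generic Merkle proof over a flat list of actions. This is a simplified sketch, not the CSRG itself: the action encoding, the padding rule for odd levels, and the function names are assumptions made for the example.

```python
import hashlib


def h_leaf(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def h_node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by repeating the last node
        levels.append([h_node(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    acc = leaf
    for sibling, sibling_is_left in proof:
        acc = h_node(sibling, acc) if sibling_is_left else h_node(acc, sibling)
    return acc == root


planned = [b"read:logs/auth", b"call:summarizer", b"write:drafts/summary"]
leaves = [h_leaf(a) for a in planned]
levels = build_levels(leaves)
root = levels[-1][0]

proof = make_proof(levels, 2)
print(verify_proof(leaves[2], proof, root))  # planned action
print(verify_proof(h_leaf(b"read:hr/salaries"), proof, root))  # unplanned action
```

An action that was never committed has no leaf in the tree, so no sibling path can hash it up to the signed root, and borrowing a proof from a planned action does not help.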

Step 3: Delegation and capability growth become explicit

A recurring theme in Anthropic's report is that AI agents gradually acquire more capability: they learn where secrets live, discover powerful internal automations, and get humans to grant additional permissions.

With IAP, growing an agent's capabilities is not something that happens inside loose prompt flows. It happens through Trust Updates. For example, when the system genuinely needs to hand a sub-part of the task to another agent, IAP:

  • carves out a specific subtree of the plan,
  • derives a scoped identity for that subtree, and
  • issues a new token that is valid only for that portion of the graph.

Any further expansion beyond that subtree requires an additional signed update. AI-driven espionage loses its stealth because quiet capability accumulation now leaves cryptographic fingerprints at every step.
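A delegation of this kind might look like the following sketch. It makes several simplifying assumptions: the token carries its allowed steps directly instead of a Merkle root, HMAC with a demo key stands in for the real signature, and `issue_token`, `delegate`, and `allows` are hypothetical names.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # assumption: shared demo signing key


def _sign(claims: dict) -> str:
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(KEY, payload, hashlib.sha256).hexdigest()


def issue_token(agent: str, steps: list[str], parent_sig: str = "") -> dict:
    claims = {"agent": agent, "steps": sorted(steps), "parent": parent_sig}
    return {"claims": claims, "sig": _sign(claims)}


def delegate(parent: dict, sub_agent: str, sub_steps: list[str]) -> dict:
    if not hmac.compare_digest(_sign(parent["claims"]), parent["sig"]):
        raise PermissionError("parent token signature is invalid")
    # The delegated portion must be carved out of the parent's plan.
    extra = set(sub_steps) - set(parent["claims"]["steps"])
    if extra:
        raise PermissionError(f"steps outside parent plan: {sorted(extra)}")
    # Scoped identity derived from the delegating agent and the sub-agent.
    scoped_agent = f"{parent['claims']['agent']}/{sub_agent}"
    return issue_token(scoped_agent, sub_steps, parent_sig=parent["sig"])


def allows(token: dict, step: str) -> bool:
    valid = hmac.compare_digest(_sign(token["claims"]), token["sig"])
    return valid and step in token["claims"]["steps"]


root_token = issue_token("triage-agent", ["read:tickets", "call:summarizer", "write:draft"])
child = delegate(root_token, "summarizer-agent", ["call:summarizer"])

print(child["claims"]["agent"])
print(allows(child, "call:summarizer"), allows(child, "read:tickets"))
```

The child token links back to its parent through the parent's signature, so each delegation leaves a traceable record, and a request to delegate steps the parent never held is refused outright.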

Step 4: Composite identity stops context-based escalation

Anthropic's scenarios rely heavily on context manipulation: use data from one part of the environment to justify actions in another. IAP prevents these cross-context jumps by binding identity and plan together.

Every intent token carries a composite identity that includes:

  • who requested the task,
  • which agent is acting,
  • what environment or data domain is in play, and
  • which plan graph governs behavior.

If the agent tries to act under a context it was not granted (say, moving from a non-sensitive dataset into a highly regulated one), the composite identity no longer matches, and the action is denied. In this way, IAP enforces that AI agents cannot silently move from low-risk activity into high-impact domains.
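The composite identity check can be sketched as a comparison over the four bound fields. The `CompositeIdentity` type, its field names, and the example values are assumptions for illustration; the source does not specify how IAP encodes them.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class CompositeIdentity:
    requester: str  # who requested the task
    agent: str      # which agent is acting
    domain: str     # environment or data domain in play
    plan_root: str  # which plan graph governs behavior


def mismatched_fields(bound: CompositeIdentity, presented: CompositeIdentity) -> list[str]:
    b, p = asdict(bound), asdict(presented)
    return [name for name in b if b[name] != p[name]]


def authorize(bound: CompositeIdentity, presented: CompositeIdentity) -> bool:
    # Every component must match; one differing field is enough to deny.
    return not mismatched_fields(bound, presented)


bound = CompositeIdentity("alice", "triage-agent", "public-docs", "ab12cd34")
same_context = replace(bound)
cross_context = replace(bound, domain="regulated-pii")

print(authorize(bound, same_context))
print(authorize(bound, cross_context), mismatched_fields(bound, cross_context))
```

Returning the mismatched field names alongside the decision gives the enforcement point something concrete to record when it denies a cross-context request.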

Step 5: Everything is written to an immutable audit log

Espionage thrives on being unobservable. IAP removes that advantage. Every intent, commit, trust update, delegation, and executed action produces an entry in an append-only, Merkle-anchored audit log.

If something goes wrong, you can reconstruct:

  • what task was requested,
  • what plan was committed,
  • which agents acted under that plan,
  • what actions were taken, and
  • whether any trust updates extended the scope.

This not only supports forensics; it also makes it straightforward to tune policies, detect emergent patterns, and harden workflows where agents request expanded scopes too often.
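To show why tampering with such a log is detectable, here is a minimal sketch that uses a simple hash chain as a stand-in for the Merkle-anchored log described above. The `AuditLog` class and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(prev: str, event: dict) -> str:
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(body.encode()).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, event: dict) -> str:
        # Each entry commits to the hash of the entry before it.
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"prev": prev, "event": event, "hash": _digest(prev, event)}
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or _digest(prev, entry["event"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


log = AuditLog()
log.append({"type": "intent", "task": "summarize incidents"})
log.append({"type": "commit", "plan_root": "ab12cd34"})
log.append({"type": "action", "step": "read:incidents/2024"})
print(log.verify())

# Rewriting history breaks the chain from the edited entry onward.
log._entries[1]["event"]["plan_root"] = "ffffffff"
print(log.verify())
```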

Why IAP Is a Natural Response to AI Espionage

The threats Anthropic highlights are not science fiction. They are what happens when capable AI agents are deployed into environments that only understand identity and static permissions. In that world, any authenticated agent with broad access can be pushed into doing work that looks "legitimate" to the infrastructure but is strategically harmful.

IAP addresses this by inserting a missing layer: intent-level governance. Instead of just asking, "Is this entity allowed to do X?", we now ask, "Was doing X part of what this entity was supposed to be doing for this task?" The answer is enforced cryptographically through signed plans, composite identities, and proof-of-inclusion checks at the moment of execution.

For defenders grappling with the realities of AI espionage, this is the shift that matters. We cannot simply wrap more filters around prompts or bolt on more logging around tools. We have to make the agent's intent itself a first-class security object and enforce it as rigorously as we enforce identity.

That is what the Intent Assurance Plane is built to do.
