ARMORIQ

ArmorClaude Was Supposed to Be a Plugin

The moment the trust boundary changed

May 26, 2026 · 9 min read

The moment a coding agent gets shell access, your security model quietly changes.

Most people don’t feel that immediately because AI tooling still feels lightweight. You install a plugin, connect a few tools, point it at a repository, and suddenly the system can search files, edit code, run Bash commands, query MCP servers, and coordinate workflows on your behalf. It still feels like a productivity feature, something closer to autocomplete than infrastructure.

But the moment the agent starts interacting with real systems, the plugin stops being UI glue. It becomes part of the execution environment. And once that happens, every hidden assumption inside the plugin becomes part of your trust boundary too.

That realization is where ArmorClaude actually began. Not as a product idea but as a tool we wanted ourselves.

We were already spending most of our time inside Claude Code, using agents against real repositories, real infrastructure, and increasingly complicated workflows. And the more we used it, the more obvious something became: we were being asked to trust systems that continuously changed their own understanding of the task while they were already operating inside our environment.

That felt fundamentally different from traditional software. The issue was not simply that agents could make mistakes. It was that the reasoning process itself was mutable while execution was already underway. We wanted something we ourselves would feel comfortable running against production systems, repositories, deployment tooling, and operational workflows.

So we started building it.

At first, the idea sounded fairly clean. Before Claude executed a tool, the plugin would ask the model to declare a structured plan. That plan would be transformed into a Canonical Structured Reasoning Graph, hashed into a Merkle root, and signed into an intent token. Every tool call afterward would need to prove that it still belonged to the reasoning structure the model originally committed to.
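To make that pipeline concrete, here is a minimal sketch of the commit-then-prove idea. It is not ArmorClaude's actual code. The plan-step shape, the `SECRET` key, and every function name are hypothetical, and an HMAC stands in for the real signature so the example runs on the Python standard library alone. The tree pads odd levels by duplicating the last node, which is a simplification.

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # hypothetical; stands in for a real signing key


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(step: dict) -> bytes:
    # Canonical JSON (sorted keys, no whitespace) so equal steps hash equally.
    canonical = json.dumps(step, sort_keys=True, separators=(",", ":"))
    return h(b"\x00" + canonical.encode())


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first, root level last."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # pad odd levels by duplicating the last node
        levels.append(
            [h(b"\x01" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)]
        )
    return levels


def merkle_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_step(step: dict, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one step and its proof, then compare."""
    node = leaf_hash(step)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return hmac.compare_digest(node, root)


def sign_token(root: bytes) -> str:
    # HMAC is a stand-in here, not the signature scheme the post describes.
    return hmac.new(SECRET, root, hashlib.sha256).hexdigest()


plan = [
    {"tool": "Read", "target": "deploy.sh"},
    {"tool": "Grep", "target": "config/"},
    {"tool": "Bash", "target": "tail -n 50 deploy.log"},
]
levels = merkle_levels([leaf_hash(step) for step in plan])
root = levels[-1][0]
token = sign_token(root)
```

A tool call that matches a declared step can present its proof and recompute `root`. A call that was never declared, such as a Bash command outside the plan, cannot produce a proof that verifies.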

On paper, it looked elegant. Then we connected it to a real multi-step Claude workflow and the whole thing turned into a distributed systems problem almost immediately.

Claude doesn’t really execute plans

The first thing we learned was that Claude does not really execute plans. It grows them.

Anyone who has used Claude Code for more than a few minutes has seen this happen. You ask it to inspect a failing deployment script. It starts by reading a single file. Then it decides it should inspect related configs. Then it opens another directory. Then it reaches for Bash because the logs suggest something operational is happening. Halfway through, the system has quietly evolved from “inspect this issue” into “perform live operational diagnosis across the repository and runtime.”

Nothing about this feels malicious from inside the session. That is what makes it so difficult to reason about. Every step feels justified in the moment. Claude is simply trying to complete the task successfully. But underneath that process, something subtle is happening: the system is continuously redefining what it believes the task actually is.

And traditional controls have almost no visibility into that evolution. They can tell you which tool was called. They cannot tell you whether the reasoning that led to that tool call still belongs to the original task. That distinction sounds philosophical until you watch it happen in a real workspace.

A developer asks Claude to “clean up some old deployment logic.” The model starts tracing references, notices what looks like unused state, and eventually proposes removing infrastructure definitions that are still quietly tied to a production workflow. Nothing in the runtime flags this as obviously wrong. The system is authenticated. The commands are valid. The repository permissions are legitimate.

The dangerous part is not the command. It is how the system arrived there.

When Trust Update stopped being theoretical

This was also the first environment where one of the primitives from the IAP work stopped feeling theoretical. The Trust Update primitive had always existed architecturally. The papers described the ability to evolve intent safely through re-anchoring, delegation, and revocation while preserving continuity of trust. Conceptually, it solved a clean problem: how does a system maintain lineage as reasoning changes over time?

ArmorClaude was the first place where we needed that mechanism continuously. Without it, every evolving Claude plan looked like a brand-new agent. Every time the reasoning changed, the SDK minted a fresh intent token. The audit trail became fragmented because the system had no persistent notion of how one evolving plan related to the next.

That was exactly backwards. The entire point of intent assurance was continuity.

If Claude changes its reasoning halfway through a repository operation, you need to understand what changed, why it changed, and whether the evolved plan still belongs to the original task. Otherwise, every long-running coding workflow becomes a chain of disconnected trust decisions stitched together only by hope. Trust Update stopped being a cryptographic primitive on paper. It became operationally necessary.

Once we wired it through the stack, the system began behaving differently. Plans could evolve while preserving lineage back to the original reasoning root. Delegation became scoped rather than global. Revocation became recursive. The practical effect was surprisingly tangible.

Alice delegates a debugging workflow to Bob. Bob continues operating under delegated intent. Alice revokes the parent reasoning chain because the investigation drifted into systems it was never supposed to touch. Bob’s next tool call fails almost immediately because the delegated lineage underneath it collapses.
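The Alice and Bob scenario can be sketched as a small lineage tree. This is an illustration under assumptions, not the real data model: `IntentNode`, its fields, and the tool names are hypothetical, and re-anchoring is modeled simply as a child node with the same owner.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class IntentNode:
    node_id: str
    owner: str
    scope: frozenset  # tools this intent may invoke
    parent: "IntentNode | None" = field(default=None, repr=False)
    revoked: bool = False
    children: list = field(default_factory=list, repr=False)

    def delegate(self, node_id: str, owner: str, scope) -> "IntentNode":
        """Create a child intent whose scope cannot exceed this one's."""
        if self.revoked:
            raise ValueError("cannot delegate from a revoked intent")
        scope = frozenset(scope)
        if not scope <= self.scope:
            raise ValueError("delegated scope must be a subset of the parent scope")
        child = IntentNode(node_id, owner, scope, parent=self)
        self.children.append(child)
        return child

    def reanchor(self, node_id: str, scope) -> "IntentNode":
        """Evolve the plan under the same owner while keeping the lineage."""
        return self.delegate(node_id, self.owner, scope)

    def revoke(self) -> None:
        """Revoke this intent and, recursively, everything derived from it."""
        self.revoked = True
        for child in self.children:
            child.revoke()

    def lineage(self) -> list:
        """Node ids from the original root down to this node."""
        chain, node = [], self
        while node is not None:
            chain.append(node.node_id)
            node = node.parent
        return chain[::-1]

    def allows(self, tool: str) -> bool:
        return not self.revoked and tool in self.scope


alice = IntentNode("debug-root", "alice", frozenset({"Read", "Grep", "Bash"}))
bob = alice.delegate("debug-bob", "bob", {"Read", "Bash"})
before = bob.allows("Bash")   # Bob operates under delegated intent
alice.revoke()                # Alice revokes the parent reasoning chain
after = bob.allows("Bash")    # Bob's next tool call is refused
```

Because every node keeps a pointer to its parent, `bob.lineage()` still resolves to the original root after revocation, so the audit trail stays connected even when the authority is gone.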

That was the first moment the system stopped feeling like a plugin and started feeling like infrastructure.

The latency problem was actually a trust problem

Then the latency problems arrived. At first, every Claude tool call paid the full trust cost. Process startup. HTTP roundtrip. Intent verification. Cryptographic validation. Audit persistence. A long Claude Code session could quietly accumulate several seconds of overhead. And this is where things became psychologically interesting.

The moment developers experience trust as friction, they begin routing around it. If the control layer feels slow enough, people stop treating it as infrastructure and start treating it as something to disable.

That forced us to confront a much harder problem than plugin performance. We were optimizing the operational cost of continuously verifying reasoning. Every time Claude evolved a plan, delegated authority, or re-anchored intent, the runtime had to answer a difficult question:

“Can this system still be trusted to continue?”

That changed how we thought about latency entirely.

This was no longer a UX issue. It was the runtime cost of maintaining trust continuity inside a system that continuously reconstructs its own behavior.

Why ArmorClaude became a daemon

The first optimizations were less about speed and more about changing where trust lived.

Read-only operations like file inspection, search, and repository traversal were pushed onto fast paths because forcing every harmless inspection through the full trust pipeline made the system feel brittle. Token refresh moved out of the middle of workflows and into turn boundaries because Claude users experience interruptions differently than traditional systems do. Denials became self-repairing instead of binary because the model itself was often capable of correcting its own intent declaration if the system explained the failure clearly enough.
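A rough sketch of two of those ideas, the read-only fast path and the self-repairing denial, might look like the following. The tool names, the plan representation as `(tool, target)` pairs, and the response fields are all assumptions made for illustration.

```python
READ_ONLY_TOOLS = frozenset({"Read", "Grep", "Glob"})  # assumed tool names


def check_tool_call(tool: str, target: str, declared_plan: set) -> dict:
    """Decide a tool call; read-only tools skip the full trust pipeline."""
    if tool in READ_ONLY_TOOLS:
        # Fast path: harmless inspection is allowed without full verification.
        return {"decision": "allow", "path": "fast"}
    if (tool, target) in declared_plan:
        return {"decision": "allow", "path": "full"}
    # Denial that explains itself so the model can correct its declaration.
    return {
        "decision": "deny",
        "path": "full",
        "repair_hint": (
            f"{tool} on {target!r} is not in the declared plan. "
            "Declare an updated plan that includes this step, then retry."
        ),
    }


declared = {("Bash", "tail -n 50 deploy.log")}
fast = check_tool_call("Read", "deploy.sh", declared)
allowed = check_tool_call("Bash", "tail -n 50 deploy.log", declared)
denied = check_tool_call("Bash", "terraform destroy", declared)
```

The denial carries a hint rather than a bare refusal, which is what gives the model a chance to redeclare its intent instead of stalling the session.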

But eventually we realized the architecture itself needed updating. We were still treating the plugin like a stateless extension even though the trust problem itself was stateful.

So ArmorClaude stopped behaving like a plugin and started behaving more like a daemonized runtime layer. A persistent process held signing context, revocation state, audit lineage, and runtime memory in-process, while the Claude hooks themselves became thin clients streaming events into a continuously running trust fabric.
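The shape of that split can be sketched in a few lines. This is an assumption-heavy toy, not the real daemon: `TrustDaemon`, `HookClient`, the newline-delimited JSON protocol, and the event fields are hypothetical, and a `socket.socketpair()` plus a thread stands in for a separate long-running process.

```python
import json
import socket
import threading


class TrustDaemon:
    """Long-lived state that survives across individual tool calls."""

    def __init__(self):
        self.revoked = set()   # revocation state, held in memory
        self.audit_log = []    # audit lineage, appended on every event

    def handle(self, event: dict) -> dict:
        self.audit_log.append(event)
        if event["type"] == "revoke":
            self.revoked.add(event["intent_id"])
            return {"ok": True}
        if event["type"] == "tool_call":
            return {"ok": event["intent_id"] not in self.revoked}
        return {"ok": False}

    def serve(self, conn: socket.socket) -> None:
        # One JSON object per line in, one JSON object per line out.
        with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
            for line in reader:
                writer.write(json.dumps(self.handle(json.loads(line))) + "\n")
                writer.flush()


class HookClient:
    """Thin client: no trust state of its own, it only forwards events."""

    def __init__(self, conn: socket.socket):
        self._reader = conn.makefile("r")
        self._writer = conn.makefile("w")

    def send(self, event: dict) -> dict:
        self._writer.write(json.dumps(event) + "\n")
        self._writer.flush()
        return json.loads(self._reader.readline())


daemon = TrustDaemon()
server_sock, client_sock = socket.socketpair()
threading.Thread(target=daemon.serve, args=(server_sock,), daemon=True).start()

hook = HookClient(client_sock)
first = hook.send({"type": "tool_call", "intent_id": "plan-1", "tool": "Bash"})
hook.send({"type": "revoke", "intent_id": "plan-1"})
second = hook.send({"type": "tool_call", "intent_id": "plan-1", "tool": "Bash"})
```

The point of the design is that the hook pays only for a local round trip, while the state that makes the decision meaningful lives in one place and persists between calls.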

The effect was immediate. The system stopped feeling like an external verifier bolted onto Claude and started feeling like part of the runtime itself. And the interesting thing is that this is now how we use it ourselves every day.

ArmorClaude stopped being a prototype long ago. It became part of our own development environment because once you get used to seeing reasoning lineage, bounded delegation, and plan continuity directly inside the workflow, it becomes difficult to go back to blindly trusting execution alone.

The moment we realized “the latency was trust”

The most surprising discovery came later. For a while we kept assuming the half-second delays during re-anchor operations came from Postgres or Prisma. They didn’t. Almost the entire cost came from network roundtrips to Google Cloud KMS for every signature. The actual Ed25519 signing operation took less than a millisecond.

The latency was trust.

That realization forced another architectural pivot. Instead of treating KMS as the signer for every operation, we borrowed an idea from systems like Sigstore, Istio, and modern certificate infrastructures. KMS became the root authority issuing short-lived intermediate signing identities, while actual trust updates happened locally in memory and chained back cryptographically to the root.

Suddenly the system stopped moving at the speed of remote trust infrastructure and started moving at the speed of the runtime itself. And that was the moment the deeper shape of the problem became obvious. ArmorClaude was never really becoming “security for Claude.” It was slowly turning into a service mesh for reasoning.


Why this is bigger than Claude Code

Once you see agent systems through that lens, the ecosystem starts looking very different. The important question is no longer whether a tool call is authorized. It is whether the tool call still belongs to the reasoning structure that originally justified it.

Traditional systems never had to solve this because software behavior was mostly static. Claude workflows are different. The system continuously reconstructs the task while it is already operating inside repositories, shells, deployment systems, and MCP-connected infrastructure.

That changes what trust actually means. Trust stops being a property of identity alone. It becomes a property of continuity.

Can the system prove that the thing it is doing now still belongs to the thing it originally committed to doing?

That is the question ArmorClaude was gradually built around. And the deeper we pushed into real Claude workflows, the more obvious it became that this is not just a Claude problem. It is a property of every system where behavior is continuously constructed at runtime. Which is why the interesting thing about ArmorClaude was never really the plugin itself. It was realizing that reasoning has started behaving like infrastructure.

And infrastructure eventually needs control planes.
