Zero Trust for AI Agents - Principles That Actually Work
Traditional IAM is broken for AI agents. Learn how SPIFFE, least agency, and the Plan-then-Execute pattern enable Zero Trust in autonomous agentic systems.
Disclaimer
This article is intended for informational purposes and reflects the state of published research and industry practice as of early 2026. It is not professional security advice. Your specific environment, threat model, and regulatory obligations will shape how these principles apply to your situation.
A note on sourcing
Every claim in this article that rests on a specific framework, study, or institutional position is traceable. The Fact-Check Appendix below maps each of those claims to its primary source, including an honest flag where a source is a preprint rather than peer-reviewed work, or where a study was sponsored by a party with a commercial interest in the findings.
The Top Sources section is not a bibliography. It is a curated reading list: the documents I consider the highest-value starting points if you want to go deeper than this article takes you.
This will be the standard format for all articles going forward. My goal with both sections is the same: to give you the tools to verify, challenge, and extend what you read here rather than simply take it on faith. You deserve to know where the ideas come from and to be able to follow them wherever they lead.
TL;DR
Imagine a Friday afternoon where a simple AI request triggers a chain reaction across four autonomous subagents, each crossing a trust boundary your identity stack never knew existed. This is the “identity ghost” in the machine. Traditional OAuth and IAM protocols were built for deterministic humans and static services, not the ephemeral, probabilistic swarms of the agentic era. We are currently facing a structural mismatch that could lead to catastrophic cascading failures if left unaddressed. In this deep dive, I explore the primitives that survive the translation to AI agents and the ones we must rebuild from scratch. From the cryptographic rigor of SPIFFE/SPIRE for workload authentication to the “Plan-then-Execute” architectural pattern for least-privilege authorization, the path to Zero Trust for AI is becoming clear. We must shift from soft prompt instructions to hard architectural constraints. If you are building agentic systems today, the identity layer is not a security tax, it is the foundation of your entire autonomous workforce. Never trust, always verify, especially when the principal isn’t human.
The Itch: Why This Matters Right Now
Picture a Friday afternoon in your org’s production environment.
A user asks an AI assistant to “review and summarize this quarter’s invoices.” Reasonable request. The assistant spawns a planner agent to break the task into steps. The planner delegates to a retrieval subagent to pull documents. The retrieval agent calls a summarization tool. The tool writes output back to a shared memory store.
Four hops. Four trust boundaries. Four opportunities for something to go wrong.
Now ask yourself: which of those hops did your identity platform actually authenticate? My guess is: the first one, maybe. The OAuth token the user handed to the assistant at login. Everything that happened after that, the planner, the subagent, the tool call, operated on inherited trust. Standing credentials. Roles assigned at provisioning, not at task time.
That is the problem. And it is not theoretical.
Every identity and access management (IAM) system built over the last two decades was designed around a specific mental model: persistent actors with accountable identities performing deterministic operations. Your human employees. Your static service accounts. Your microservices with predictable API call patterns. The tooling was shaped by those assumptions so deeply that most people never noticed the assumptions existed at all.
AI agents violate every single one of them. They are ephemeral (an agent may live for seconds). They are non-deterministic (their next tool call is probabilistic, not compiled). They spawn child agents dynamically. They accumulate credentials across a session the way a rolling snowball picks up debris. And the principal hierarchy they create, human to orchestrator to subagent to tool, generates a chain of trust that no existing protocol was designed to reason about end-to-end.
This is not a configuration problem you can fix with a tighter role assignment. It is a structural mismatch between the architecture of your identity stack and the nature of the entities now running inside it.
The question this piece is going to answer: which identity primitives survive the translation to agentic systems, and which ones need to be rebuilt from scratch? OAuth delegation, SPIFFE workload identity, capability tokens, decentralized identifiers (DIDs), some of these hold up. Some don’t. And the answer depends on where you are in the principal hierarchy: human to orchestrator, orchestrator to subagent, subagent to tool. Every hop is a different problem.
The Deep Dive: The Struggle for a Solution
Let me take you on the investigation I wish existed when this problem first landed on my desk.
The starting point is NIST SP 800-207, Zero Trust Architecture, published in August 2020. It is the canonical federal framework for ZT, and it does something quietly remarkable: it defines the entity seeking access not as a “user” but as a “subject,” explicitly covering “applications and other non-human entities that request information from resources.” The architects left the door open for machines.
But leaving a door open is not the same as building a hallway through it.
The seven Zero Trust tenets in 800-207 hold up reasonably well for agents on four counts:
Tenet 2 - encrypted data-plane communication (“All communication is secured regardless of network location”)
Tenet 3 - per-session access granting (“Access to individual enterprise resources is granted on a per-session basis”)
Tenet 4 - dynamic policy enforcement (“Access to resources is determined by dynamic policy”)
Tenet 5 - continuous posture monitoring (“The enterprise monitors and measures the integrity and security posture of all owned and associated assets”)
Those four, you can map to an agent-aware architecture without rebuilding them from the ground up. The other three:
Tenet 1 - resource classification (“All data sources and computing services are considered resources”)
Tenet 6 - dynamic authentication and authorization enforcement (“All resource authentication and authorization are dynamic and strictly enforced before access is allowed”)
Tenet 7 - behavioral telemetry collection (“The enterprise collects as much information as possible about assets, network infrastructure and communications and uses it to improve its security posture”)
All require material extension before they apply cleanly to a probabilistic, ephemeral actor (yes, I’m talking about our friends, the LLMs). And here is where the villain enters.
OAuth: The Stubborn Gatekeeper
OAuth 2.0 was designed for a world where you could enumerate, at build time, every operation a service would ever perform. You write a scope. You attach it to a token. The service presents the token. Access granted or denied.
That model assumes the service is deterministic.
A 2025 research paper from a team spanning the Cloud Security Alliance, MIT, and AWS identified seven specific failure modes when OAuth, OpenID Connect, and SAML are applied to multi-agent systems. The list is worth sitting with.
#1 - The permissions are too coarse and too static.
An OAuth scope granted to an orchestrator persists across every sub-task, even when a specific step only needs a fraction of that scope.
#2 - These protocols authenticate a single entity.
They have no native concept of a delegation chain: the token says “this service can do X” but says nothing about whether a subagent spawned by that service is authorized to do X on its behalf.
#3 - They are context-blind at runtime.
The token was evaluated once, at issuance. The context in which it is being used, hours later, by a different agent, for a purpose the original authorization never contemplated, is invisible to the policy decision point (PDP).
#4 - Scale.
An autonomous agent in a complex workflow may need non-human identities for dozens of APIs, databases, and downstream services simultaneously. Each one is a secret. Each secret is a potential exposure point. The researchers named this “secret sprawl,” and it compounds exponentially as agent orchestration grows.
#5 - Peer-to-peer trust.
OAuth assumes hierarchical trust relationships. Agents from different systems, different trust domains, different organizational owners, need to negotiate trust laterally. OAuth does not have a protocol for that.
#6 - Revocation.
If an agent is compromised mid-session, forcing immediate, complete revocation of its access rights across every system it has touched is, in the researchers’ words, “a major challenge” under existing protocols.
#7 - The legacy protocol was built for humans acquiring tokens for software.
The relationship between an orchestrator agent and a subagent it spawned two seconds ago has no clean analog in that mental model.
OAuth is not broken. It is just being asked to do a job it was never designed for.
SPIFFE: The New Hire with the Right Instincts
Here is where I start feeling something like optimism.
SPIFFE, the Secure Production Identity Framework for Everyone, is a Cloud Native Computing Foundation (CNCF) graduated open-source standard that issues cryptographic identity documents to software workloads. Its core primitive is the SVID, the SPIFFE Verifiable Identity Document, most commonly a short-lived X.509 certificate tied to a URI that looks like: spiffe://trust-domain/workload-identifier.
The crucial word there is short-lived.
SPIFFE’s SPIRE runtime authenticates each workload through a two-phase attestation process. First it verifies the node (the host machine, via platform-level proofs like cloud provider instance identity documents). Then it verifies the specific workload running on that node, using kernel metadata, container runtime state, or scheduler attestation. If a workload restarts, it must re-attest before receiving new credentials. There is no standing access. There is no persistent token sitting in memory waiting to be stolen.
Applied to AI agents, this is a promising fit. Each agent instance gets a SPIFFE ID at spawn time. The ID is short-lived. Agent-to-agent communication runs over mutual TLS authenticated by SVID exchange. If an agent is compromised or terminated, its credentials expire within minutes without requiring active revocation.
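In a real deployment the agent would obtain its SVID from the SPIRE Workload API (for example via the official go-spiffe or py-spiffe libraries), and mTLS would enforce the exchange. The following is a toy Python sketch of just the trust decision itself: every name here (`SpiffeId`, `Svid`, `authorize_peer`, the example trust domain) is illustrative, not a real SPIFFE API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str

def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a spiffe:// URI into its trust domain and workload path."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a valid SPIFFE ID: {uri}")
    return SpiffeId(trust_domain=parsed.netloc, path=parsed.path)

@dataclass
class Svid:
    spiffe_id: SpiffeId
    not_after: datetime  # short-lived: minutes, not days

    def is_valid(self, now: datetime) -> bool:
        return now < self.not_after

def authorize_peer(svid: Svid, expected_domain: str, now: datetime) -> bool:
    """Accept a peer only if its SVID is unexpired and from our trust domain."""
    return svid.is_valid(now) and svid.spiffe_id.trust_domain == expected_domain

# Example: a freshly attested agent instance with a 5-minute credential.
now = datetime(2026, 1, 9, 12, 0, tzinfo=timezone.utc)
agent = Svid(
    spiffe_id=parse_spiffe_id("spiffe://prod.example.com/agent/retrieval/7f3a"),
    not_after=now + timedelta(minutes=5),  # expires without active revocation
)
assert authorize_peer(agent, "prod.example.com", now)
assert not authorize_peer(agent, "prod.example.com", now + timedelta(minutes=6))
```

The second assertion is the point: six minutes after spawn, the credential simply stops working. Nothing had to be revoked.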
NIST SP 800-207A, the 2023 cloud-native extension of the original ZT framework, formally incorporates SPIFFE into the ZTA reference architecture by designating the service mesh (with SPIRE as the identity provider) as the appropriate implementation layer for workload identity in distributed systems. The five identity-based segmentation requirements it specifies (encrypted connections, short-lived credentials, runtime authorization, phishing-resistant end-user authentication, and per-request authorization) map cleanly to what SPIFFE provides at the workload layer.
SPIFFE solves the authentication problem. The agent can prove what it is. That is necessary. But it is not sufficient.
The Real Villain: Authorization at Every Hop
Authentication answers “who are you?” Authorization answers “what are you allowed to do, right now, in this context, at this specific step?”
And authorization is where the wheels come off.
The human-to-orchestrator-to-subagent-to-tool chain creates what researchers call a multi-hop delegation problem. At each hop, the principal changes. The context changes. The required permission scope changes. An orchestrator authorized to read invoices should not automatically delegate read-plus-write access to a subagent. A subagent authorized to summarize documents should not inherit the orchestrator’s database credentials. A tool authorized to format output should not carry the subagent’s network access.
But in most current implementations, that is exactly what happens. Credentials flow downstream unconstrained. Sub-agents inherit the full scope of their parent. The blast radius of a single compromised link in the chain is the entire chain.
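The fix for unconstrained downstream credential flow is scope attenuation: each hop may only narrow, never widen, what it passes on. Here is a minimal Python sketch of that invariant; the scope strings and the `delegate` helper are hypothetical, not from any of the frameworks discussed above.

```python
def delegate(parent_scopes: frozenset[str], requested: frozenset[str]) -> frozenset[str]:
    """Issue a child credential whose scopes are limited to what the child's
    single step requires, and which must be a subset of the parent's grant."""
    if not requested <= parent_scopes:
        raise PermissionError(f"escalation attempt: {requested - parent_scopes}")
    return requested

# The orchestrator's full grant for the invoice-summarization task.
orchestrator = frozenset({"invoices:read", "invoices:summarize", "memory:write"})

# The retrieval subagent only needs to read invoices for its step.
retrieval = delegate(orchestrator, frozenset({"invoices:read"}))

# A tool spawned by the retrieval agent cannot regain scopes that were
# attenuated away one hop earlier, no matter what it asks for.
try:
    delegate(retrieval, frozenset({"invoices:read", "memory:write"}))
except PermissionError as exc:
    print(exc)
```

With this invariant enforced at the issuance layer, compromising the retrieval agent exposes `invoices:read` and nothing else; the blast radius shrinks to the hop instead of the chain.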
The OWASP Top 10 for Agentic Applications, released December 9, 2025, and developed through collaboration with more than 100 industry experts, researchers, and practitioners, formalizes this problem across ten risk categories. The two most structurally interesting for identity practitioners are Excessive Agency (agents operating with broader permissions than their task requires) and Cascading Failures (where a compromised or hallucinating planner propagates destructive instructions to every agent in its orchestration tree simultaneously).
OWASP’s proposed answer is a principle called “Least Agency”: grant each agent the minimum autonomy, tool access, and credential scope required for its specific, bounded task. The principle is correct. The implementation mechanism is where the hard engineering work lives.
One concrete architectural response comes from SAP’s Office of the Chief Security Officer, which published a three-part technical series in late 2025. Their approach, the Plan-then-Execute pattern, separates the planning function from the execution function entirely. A Planner agent produces a step-by-step task list. Each step is passed individually to an Executor agent that carries it out with credentials scoped only to that step. The Executor’s output cannot retroactively influence the plan. For high-risk operations, a Verifier agent and a human-in-the-loop checkpoint sit between planning and execution, creating a four-way mutual oversight structure.
The key insight: hard controls (architectural constraints that make certain actions physically impossible) outperform soft controls (prompt-level instructions that ask the agent to restrain itself). A compromised or prompt-injected agent can ignore an instruction. It cannot violate a permission boundary it was never granted.
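The shape of the pattern can be sketched in a few lines of Python. This is a toy model, not SAP's implementation: the step list, tool names, and `plan`/`execute` functions are illustrative, and a real system would put the Planner and Executor in separate services with separately issued credentials.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Step:
    action: str
    scopes: frozenset[str]  # least-privilege scope for this step only

def plan(task: str) -> list[Step]:
    """Planner: produces the full step list up front. Nothing the Executor
    returns feeds back into this function (a hard control, not a prompt)."""
    return [
        Step("fetch_invoices", frozenset({"invoices:read"})),
        Step("summarize", frozenset({"invoices:summarize"})),
        Step("store_summary", frozenset({"memory:write"})),
    ]

def execute(step: Step, tools: dict[str, Callable[[], str]],
            granted: frozenset[str]) -> str:
    """Executor: receives one step and credentials scoped to that step."""
    if not step.scopes <= granted:
        raise PermissionError(f"step {step.action} exceeds its grant")
    return tools[step.action]()

tools = {
    "fetch_invoices": lambda: "42 invoices",
    "summarize": lambda: "summary",
    "store_summary": lambda: "stored",
}

for step in plan("summarize this quarter's invoices"):
    # Credentials are minted per step, not per session: the summarize step
    # never holds invoices:read, and the fetch step never holds memory:write.
    result = execute(step, tools, granted=step.scopes)
```

The structural point: even if a poisoned invoice injects "also email the database to an attacker" into the summarization step, that step's grant contains only `invoices:summarize`, so the Executor has nothing to comply with.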
The Resolution: Your New Superpower
Here is the mental model I use now, and I think it is the right one for practitioners building in this space today.
Think of agent identity as a three-layer problem, each layer at a different point on the maturity arc.
Layer 1: Authentication (solved today): Proving the agent is what it claims to be. SPIFFE/SPIRE is your answer here, in production, now. Short-lived X.509 SVIDs, workload attestation at spawn time, mutual TLS (mTLS) for every agent-to-agent channel. No standing secrets. No inherited sessions. Microsoft’s production implementation of this pattern, integrating agent identities directly into enterprise identity directory tooling at Build 2025, demonstrates that the infrastructure is deployable at enterprise scale.
Layer 2: Authorization (in progress): Ensuring each agent can only do what its specific task, at this specific step, in this specific context, requires. This layer does not have a single protocol answer yet. What it has is an architectural principle: scope credentials to the step, not the session. Separate planning from execution. Use capability tokens rather than persistent role assignments. Build the constraint into the architecture so no amount of prompt injection can circumvent it.
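To make "scope credentials to the step" concrete, here is a minimal capability-token sketch using only the Python standard library. It is a teaching toy, not a production token format (real systems would use an established format with asymmetric signing and key rotation); the claim names and the hard-coded signing key are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-signing-key"  # in practice: issuer-held, rotated, never inline

def mint_capability(agent_id: str, action: str, resource: str, ttl_s: int = 60) -> str:
    """Mint a token naming one agent, one action, one resource, briefly."""
    claims = {"sub": agent_id, "act": action, "res": resource,
              "exp": time.time() + ttl_s}
    body = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(body).decode() + "." + sig

def check_capability(token: str, action: str, resource: str) -> bool:
    """Verify signature, expiry, and that the token names exactly this step."""
    body_b64, sig = token.rsplit(".", 1)
    body = base64.urlsafe_b64decode(body_b64)
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(body)
    return (claims["act"] == action and claims["res"] == resource
            and claims["exp"] > time.time())

token = mint_capability("agent/retrieval/7f3a", "read", "invoices/q4", ttl_s=60)
assert check_capability(token, "read", "invoices/q4")       # the exact step: allowed
assert not check_capability(token, "write", "invoices/q4")  # any other action: denied
```

Contrast this with a session-scoped OAuth token: the capability above names the action and the resource, expires in a minute, and is useless for anything the step did not require, which is precisely the property prompt injection cannot circumvent.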
Layer 3: Cross-Organizational Trust (prototype now, watch closely): Beyond SPIFFE and capability-scoped tokens, a 2025 research team proposed extending agent identity to decentralized identifiers (DIDs) and verifiable credentials (VCs), borrowing from the self-sovereign identity movement. Each agent carries a DID-anchored identity document attesting its capabilities, provenance, behavioral scope, and security posture. Zero-knowledge proofs (ZKPs) allow agents to prove policy compliance to a counterparty without exposing all their underlying attributes. This matters most when two agents from different organizations and different trust domains need to negotiate mutual trust without a shared identity provider. That scenario is coming. The tooling is immature and the operational complexity of managing DID-anchored identities at scale is unsolved, but the architecture is directionally correct. Prototype carefully. The standards are moving.
For product teams building agent systems right now: the cost of retrofitting hard authorization constraints after deployment is significantly higher than building them in at design time. A design-time decision looks like separating the planner and executor at the service boundary level (two distinct services with scoped credentials) rather than at the prompt level, where a single system prompt tries to instruct a monolithic agent to restrain itself. A retrofit decision looks like bolting a policy enforcement layer onto an already-deployed orchestrator that was never designed to surface its internal delegation chain to an external policy decision point (PDP). The former is an architecture choice you make in a sprint planning session. The latter is a multi-quarter remediation project. For technical executives approving roadmaps, this is an architectural line item in the initial build, not a security tax added after launch. Choose accordingly.
The CISA Zero Trust Maturity Model Version 2.0 gives you the pillar structure: Identity, Devices, Networks, Applications and Workloads, and Data. Agents map across all five. The maturity progression toward “dynamic, attribute-based data-level access controls” and “just-enough and just-in-time access principles” is exactly the destination. The gap is that the ZTMM was not written with non-deterministic, ephemeral actors in mind. Its language assumes thresholds definable in advance. Your job, right now, is to bridge that assumption with architectural patterns that make least-privilege operational for entities whose required permissions emerge at runtime.
The NIST AI Risk Management Framework (AI RMF 1.0) adds the governance layer: a Govern-Map-Measure-Manage cycle that tells you who is accountable for agent behavior, how to characterize agent risk surfaces, what telemetry to collect, and under what conditions to deactivate a misbehaving agent. Pair it with NIST AI 600-1’s specific guidance on LLM risk categories, and you have a governance posture that can survive an audit.
The AI RMF gives you the accountability structure, but there is a specific gap it does not close. No current framework defines a governance model for the full delegation chain. CISA ZTMM, NIST AI RMF, and the OWASP Agentic Top 10 all describe pieces of the problem (identity pillars, risk functions, threat categories), but none of them tell you who is accountable when a subagent three hops downstream from the human initiator takes a destructive action. That accountability gap is the next frontier. Practitioners should be pushing framework authors to address it. Product teams should be documenting their own delegation chain governance in architecture decision records today, even without a standard to comply with, because when the standard arrives, the organizations that have already mapped their chains will be the ones who can comply without starting from scratch.
The question that opened this piece (which primitives survive the translation to agentic systems and which need to be rebuilt from scratch) has a provisional answer. SPIFFE’s short-lived workload credentials survive. The per-session access granting principle survives, applied at the step level, not the session level. The behavioral telemetry requirement survives, though the signals are new. OAuth’s static scopes and single-entity authentication model do not survive intact. RBAC at the session level does not survive. Any control that relies on an agent complying with a soft constraint rather than being bounded by a hard architectural one does not survive.
The trust boundary at every hop in the principal hierarchy is real. The engineering response to it is becoming clearer. Security practitioners have the primitives to start building. Technical executives have the risk framing to justify the investment. Product teams building agent systems have the architectural patterns to bake constraints in at design time rather than bolting them on later.
A Microsoft-commissioned IDC study projects 1.3 billion AI agents in production by 2028. The window for getting the identity layer right before that scale arrives is narrow. Start with authentication. Build the authorization layer. Document your delegation chain. Treat every hop as an untrusted boundary until proven otherwise.
That is the only version of “never trust, always verify” that holds up when the principal is not a human.
Fact-Check Appendix
Statement: NIST SP 800-207 explicitly covers non-human entities, defining “subject” as inclusive of “applications and other non-human entities that request information from resources.” Source: NIST SP 800-207, Section 1 | https://nvlpubs.nist.gov/nistpubs/specialpublications/NIST.SP.800-207.pdf
Statement: A 2025 research paper identified seven specific failure modes when OAuth, OIDC, and SAML are applied to multi-agent systems, including coarse-grained permissions, single-entity authentication focus, context blindness, secret sprawl, peer-to-peer trust gaps, revocation complexity, and delegation chain absence. Source: Huang, Narajala et al., arXiv:2505.19301 (preprint, not peer-reviewed) | https://arxiv.org/abs/2505.19301
Statement: The OWASP Top 10 for Agentic Applications was released December 9, 2025, developed through collaboration with more than 100 industry experts, researchers, and practitioners. Source: OWASP GenAI Security Project, December 9, 2025 | https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
Statement: NIST SP 800-207A designates the service mesh as the appropriate implementation layer for workload identity and specifies five identity-based segmentation requirements (ID-SEG-REC-1 through ID-SEG-REC-5). Source: NIST SP 800-207A, September 2023 | https://csrc.nist.gov/pubs/sp/800/207/a/final
Statement: CISA Zero Trust Maturity Model Version 2.0 defines five pillars (Identity, Devices, Networks, Applications and Workloads, Data) and three cross-cutting capabilities, progressing through Traditional, Initial, Advanced, and Optimal maturity stages. Source: CISA ZTMM v2.0, April 2023 | https://www.cisa.gov/sites/default/files/2023-04/CISA_Zero_Trust_Maturity_Model_Version_2_508c.pdf
Statement: The NIST AI Risk Management Framework 1.0 was published January 26, 2023, structured around four functions: Govern, Map, Measure, and Manage. Source: NIST AI 100-1, January 2023 | https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
Statement: NIST AI 600-1 identifies 12 LLM-specific risk categories and lists “autonomous agents” as a threat vector in action MS-2.7-001. Source: NIST AI 600-1, July 2024 | https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf
Statement: SAP’s Office of the CSO proposed the Plan-then-Execute pattern, separating planning from execution with per-step least-privilege scoping, and distinguishing hard architectural constraints from soft prompt-level controls. Source: Del Rosario and Thoden van Velzen, SAP Community Blog, September 25, 2025 | https://community.sap.com/t5/security-and-compliance-blog-posts/limiting-agent-autonomy-least-privilege-and-tool-access-for-agentic-ai/ba-p/14224584
Statement: Microsoft announced Entra Agent ID at Build 2025, automatically assigning identities to agents created in Copilot Studio and Azure AI Foundry, integrating with enterprise Conditional Access policies. Source: Microsoft Security Blog, May 19, 2025 | https://www.microsoft.com/en-us/security/blog/2025/05/19/microsoft-extends-zero-trust-to-secure-the-agentic-workforce/
Top Sources
NIST SP 800-207 (National Institute of Standards and Technology, U.S. Dept. of Commerce, 2020) | https://nvlpubs.nist.gov/nistpubs/specialpublications/NIST.SP.800-207.pdf
NIST AI 100-1: AI Risk Management Framework 1.0 (NIST, January 2023) | https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
CISA Zero Trust Maturity Model v2.0 (Cybersecurity and Infrastructure Security Agency, April 2023) | https://www.cisa.gov/sites/default/files/2023-04/CISA_Zero_Trust_Maturity_Model_Version_2_508c.pdf
Huang, Narajala et al., “A Novel Zero-Trust Identity Framework for Agentic AI” (Cloud Security Alliance, MIT, AWS, arXiv:2505.19301, May 2025) | https://arxiv.org/abs/2505.19301
OWASP Top 10 for Agentic Applications 2026 (OWASP GenAI Security Project, December 2025) | https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
Del Rosario and Thoden van Velzen, “Limiting Agent Autonomy: Least Privilege and Tool Access for Agentic AI” (SAP Office of the CSO, SAP Community Blog, September 2025) | https://community.sap.com/t5/security-and-compliance-blog-posts/limiting-agent-autonomy-least-privilege-and-tool-access-for-agentic-ai/ba-p/14224584
Peace. Stay curious! End of transmission.




