
AI systems no longer work alone. Today, a single user request can trigger a chain of agents: one agent researches, another writes, another deploys. Each hand-off passes instructions to the next agent.
That hand-off is where security breaks down.
When Agent A tells Agent B to do something, Agent B has no built-in way to verify whether that instruction is legitimate. It cannot confirm who originally authorized the request, or whether Agent A has been compromised along the way.
This is a known problem in traditional security called the "confused deputy." A program with elevated privileges gets tricked into misusing them on behalf of an attacker. In multi-agent AI, the problem is worse. Agents reason autonomously, act on ambiguous instructions, and operate across long chains with no central authority checking their work.
Security researchers have identified four specific attack categories that target this gap.
There is also a lesser-known risk called Agent Card Poisoning. Agents often advertise their capabilities through metadata called an Agent Card. If that metadata is unsigned and unverified, an attacker can spoof it, advertising fake capabilities to intercept legitimate task delegations.
Two protocols dominate multi-agent communication today, and neither fully addresses the identity problem.
Model Context Protocol (MCP), developed by Anthropic, defines how agents discover and call tools. It handles the connection between an agent and a resource. It does not verify who the agent is or who authorized it to act.
Agent2Agent (A2A) Protocol, developed by Google, standardizes how agents communicate and collaborate. Early versions support Agent Card signing but do not enforce it. There is no central registry to verify agent identity, and the protocol does not enforce short-lived tokens or strict delegation chains. That leaves two open doors: token theft and stream hijacking.
The IETF is drafting a new standard called the Agent Identity Protocol (AIP), designed specifically as the identity layer that sits beneath protocols like MCP and A2A.
AIP introduces five mechanisms that together close the trust gap:
Decentralized Identifiers. Every agent gets a globally unique ID derived from its public key. Identity is cryptographic, not claimed.
Signed Delegation Chains. When Agent A delegates to Agent B, it issues a signed Principal Token. Agent B passes a new token to Agent C, and so on. Any agent in the chain can verify the full path back to the original human or organizational principal.
Scoped Permissions. Agents receive only the permissions they need. An agent with email.read access cannot grant a sub-agent email.send access. Permissions cannot escalate through delegation.
Depth Limits. AIP enforces a maximum delegation depth. This directly prevents chain obfuscation attacks where long delegation chains hide the original authorization context.
Short-Lived Credentials. Agents authenticate with tokens that expire quickly. If a token is intercepted, the window of exploitation is narrow.
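The interplay of these mechanisms is easiest to see in code. The sketch below is hypothetical, not taken from the AIP draft: the token format, field names, and the `sign_token` / `verify_chain` helpers are all illustrative, and HMAC with per-agent secret keys stands in for the asymmetric signatures a real implementation would use. It shows how a verifier can walk a delegation chain back to its root while enforcing signatures, expiry, scope attenuation, and a depth limit in one pass.

```python
# Illustrative sketch of an AIP-style delegation chain verifier.
# Assumptions: token layout, AGENT_KEYS, MAX_DEPTH, and TOKEN_TTL are
# invented for this example; real deployments would use asymmetric
# signatures (e.g. Ed25519), not HMAC with shared secrets.
import hashlib
import hmac
import json
import time

AGENT_KEYS = {  # stand-ins for each agent's signing key
    "agent-a": b"key-a",
    "agent-b": b"key-b",
}
MAX_DEPTH = 3    # depth limit: caps how long a delegation chain may grow
TOKEN_TTL = 60   # seconds: short-lived credentials

def sign_token(issuer, subject, scopes, parent=None):
    """Issuer delegates `scopes` to `subject`, chained to `parent`."""
    body = {
        "iss": issuer, "sub": subject, "scopes": sorted(scopes),
        "iat": time.time(), "parent": parent,
    }
    payload = json.dumps(body, sort_keys=True).encode()
    sig = hmac.new(AGENT_KEYS[issuer], payload, hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify_chain(token, now=None):
    """Walk the chain back to the root principal, enforcing every rule."""
    now = now or time.time()
    depth, child_scopes = 0, None
    while token is not None:
        body = token["body"]
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(AGENT_KEYS[body["iss"]], payload,
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, token["sig"]):
            return False, "bad signature"
        if now - body["iat"] > TOKEN_TTL:
            return False, "token expired"
        # Scoped permissions: a child may hold only a subset of its parent's.
        if child_scopes is not None and not set(child_scopes) <= set(body["scopes"]):
            return False, "scope escalation"
        child_scopes = body["scopes"]
        token, depth = body["parent"], depth + 1
        if depth > MAX_DEPTH:
            return False, "chain too deep"
    return True, "root reached"

# A legitimate chain narrows scope at each hop and verifies cleanly;
# a hop that tries to widen its scope is rejected at verification time.
root = sign_token("agent-a", "agent-b", ["email.read", "email.send"])
child = sign_token("agent-b", "agent-c", ["email.read"], parent=root)
escalated = sign_token("agent-b", "agent-c", ["files.write"], parent=root)
```

Note the direction of the subset check: each token's scopes are tested against its parent's, so permissions can only shrink as the chain grows, never escalate.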
AIP is still in draft. Most production multi-agent systems are running without it.
Until adoption is widespread, organizations building these systems need to treat trust boundaries as a first-class engineering problem. That means validating inter-agent communication at each hop, issuing scoped credentials rather than shared keys, and building circuit breakers that stop cascading failures when one agent in a chain is compromised.
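One way to combine per-hop validation with a circuit breaker is sketched below. This is an assumption-laden illustration, not a prescribed design: the `HopGuard` class and its parameters are invented here, and `validate_credential` is a placeholder for whatever token check a deployment actually uses, such as verifying a scoped, short-lived token.

```python
# Hypothetical sketch: validate every inter-agent hop and stop accepting
# traffic from a peer after repeated failures (a simple circuit breaker).
from collections import defaultdict

class HopGuard:
    """Checks each hop's credential; blocks a peer after repeated failures."""

    def __init__(self, validate_credential, max_failures=3):
        self.validate = validate_credential  # deployment-specific token check
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def accept(self, peer_id, credential):
        # Circuit breaker: a peer with too many failures is cut off,
        # so a compromised agent cannot keep probing downstream agents.
        if self.failures[peer_id] >= self.max_failures:
            return False, "circuit open: peer blocked"
        if not self.validate(credential):
            self.failures[peer_id] += 1
            return False, "invalid credential"
        self.failures[peer_id] = 0  # a healthy hop resets the counter
        return True, "accepted"

# Toy validator for demonstration only; a real one would verify a
# signed, scoped, short-lived token rather than compare a string.
guard = HopGuard(lambda cred: cred == "valid-token", max_failures=2)
```

The point of the reset-on-success is a design choice: transient failures (clock skew, an expired token that gets refreshed) do not permanently block a peer, while a sustained stream of bad credentials trips the breaker.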
The last agent in a chain should never trust the first agent's word alone. Right now, most of them do.