$catMANUAL||~50 min

A2A Protocol in Practice: When AI Agents Start Talking to Each Other, Things Get Interesting

advertisement

A2A Protocol in Practice: When AI Agents Start Talking to Each Other, Things Get Interesting

Last month I wrote about What the Heck is MCP? A Full-Stack Dev's Practical Take — the protocol that lets AI agents call external tools. Someone in the comments asked a natural follow-up: "Can agents call other agents?" Like, if I have a coding agent and a testing agent, can they just figure things out between themselves without me playing telephone operator?

Honestly, I didn't know enough to answer at the time. So I went digging, and discovered that Google dropped the A2A (Agent-to-Agent) protocol back in April 2025 specifically for this use case. I've been tinkering with it for a couple weeks now, hit plenty of walls, and figured it was time to write up what I learned.

A2A vs MCP: The Question Everyone Asks

Let's get this out of the way first because it's the number one thing people get confused about.

MCP (Model Context Protocol) handles how an agent calls tools. Your agent wants to query a database, call an API, read a file — that's all MCP territory. The agent is the boss, the tool is the servant. Clean relationship.

A2A handles how agents talk to other agents. Your agent isn't calling a dumb tool — it's collaborating with another agent that has its own intelligence, its own reasoning, its own capabilities. Neither side knows how the other one works internally. They just communicate through a standard interface.

Think of it this way: MCP is you ordering at a restaurant (you tell the waiter what you want). A2A is two chefs in the kitchen coordinating on a complex meal (each has their own specialty, they negotiate and collaborate).

They're not competing — they're complementary. A single agent can use MCP to call tools AND use A2A to collaborate with other agents at the same time.

The Core Concepts

A2A 1.0 shipped in 2025, built on JSON-RPC 2.0 over HTTP(S). The design is surprisingly clean. Let me walk through the key pieces.

Agent Card

This is probably the most important concept. Every A2A Server publishes an Agent Card — basically a self-description. It tells you:

  • Who you are (name, description)
  • What you can do (skills list)
  • Where to reach you (service URL)
  • How to authenticate

It lives at /.well-known/agent.json. When another agent wants to collaborate with you, it reads your Agent Card first to see if you're a good fit.

json
1
{
2
  "name": "CodeReviewAgent",
3
  "description": "Reviews code for quality, security, and performance. Supports Python, JS, Go.",
4
  "url": "https://code-review-agent.example.com",
5
  "version": "1.0.0",
6
  "skills": [
7
    {
8
      "id": "review-python",
9
      "name": "Python Code Review",
10
      "description": "Reviews Python code quality, security, and performance"
11
    }
12
  ],
13
  "capabilities": {
14
    "streaming": true,
15
    "pushNotifications": false
16
  }
17
}

Task

The fundamental work unit in A2A. You send a message to another agent, it creates a Task to track the request. Tasks have a lifecycle:

  • submitted → just arrived
  • working → being processed
  • input-required → needs more info from you
  • completed → done
  • failed → blew up
  • canceled → you gave up waiting

This is clever because agent collaboration can be long-running — nothing like a simple REST API call. Tasks let you track progress asynchronously.

Message and Part

The communication units. A Message has a role (user or agent) and contains one or more Parts.

Parts come in three flavors:

  • TextPart: plain text
  • FilePart: a file (URL reference or inline base64)
  • DataPart: structured JSON data

This flexibility means A2A isn't just about passing text — you can exchange files, structured data, even binary blobs.

Artifact

The output an agent produces after completing a task. If you asked it to write code, the artifact is the code file. If you asked it to analyze something, the artifact is the report.

Building an A2A Server: Let's Actually Code This

Enough theory. Let me show you how to build a real A2A Server using the Python SDK.

Setup

bash
1
python3 -m venv a2a-env
2
source a2a-env/bin/activate
3
pip install a2a-sdk

First gotcha I hit: the SDK requires Python 3.10+. My server defaulted to 3.9 and I got cryptic syntax errors. Had to switch to 3.11 with pyenv. Classic.

A Simple Translation Agent Server

python
1
from a2a.server.apps import A2AStarletteApplication
2
from a2a.server.request_handlers import DefaultRequestHandler
3
from a2a.server.tasks import InMemoryTaskStore
4
from a2a.types import (
5
    AgentCard, AgentSkill, AgentCapabilities,
6
    TaskState, Message, TextPart, Artifact,
7
)
8
import uvicorn
9
 
10
class TranslationAgent:
11
    """A simple translation agent"""
12
 
13
    async def process_message(self, message: Message) -> dict:
14
        text = ""
15
        for part in message.parts:
16
            if part.root.kind == "text":
17
                text = part.root.text
18
 
19
        # Simple logic (in production you'd call an LLM)
20
        if "hello" in text.lower():
21
            response = "你好!This is from the Translation Agent."
22
        elif "你好" in text:
23
            response = "Hello! Nice to meet you."
24
        else:
25
            response = f"Got your message: {text} (translation engine warming up...)"
26
 
27
        return {
28
            "status": TaskState.completed,
29
            "artifacts": [
30
                Artifact(parts=[TextPart(text=response)], name="translation")
31
            ],
32
        }
33
 
34
agent_card = AgentCard(
35
    name="TranslationAgent",
36
    description="Simple translation agent for Chinese-English",
37
    url="http://localhost:8000",
38
    version="1.0.0",
39
    skills=[
40
        AgentSkill(
41
            id="translate",
42
            name="Translation",
43
            description="Translates between Chinese and English",
44
        )
45
    ],
46
    capabilities=AgentCapabilities(streaming=False),
47
)
48
 
49
task_store = InMemoryTaskStore()
50
handler = DefaultRequestHandler(agent_card=agent_card, task_store=task_store)
51
app = A2AStarletteApplication(agent_card=agent_card, request_handler=handler)
52
 
53
if __name__ == "__main__":
54
    uvicorn.run(app.build(), host="0.0.0.0", port=8000)

Fire it up:

bash
1
python server.py

Test the Agent Card endpoint:

bash
1
curl http://localhost:8000/.well-known/agent.json

If you get the JSON back, you're good.

Writing a Client

python
1
import asyncio
2
from a2a.client import A2AClient
3
from a2a.types import (
4
    Message, TextPart, SendMessageRequest, MessageSendParams,
5
)
6
 
7
async def main():
8
    client = A2AClient(url="http://localhost:8000")
9
 
10
    # Check out the agent's card
11
    card = await client.get_agent_card()
12
    print(f"Connected to: {card.name}")
13
    print(f"Skills: {[s.name for s in card.skills]}")
14
 
15
    # Send a message
16
    message = Message(
17
        role="user",
18
        parts=[TextPart(text="Hello, how are you?")],
19
    )
20
 
21
    request = SendMessageRequest(params=MessageSendParams(message=message))
22
    response = await client.send_message(request)
23
 
24
    # Extract the result
25
    if hasattr(response.root.result, 'artifacts'):
26
        for artifact in response.root.result.artifacts:
27
            for part in artifact.parts:
28
                if part.root.kind == "text":
29
                    print(f"Translation: {part.root.text}")
30
 
31
if __name__ == "__main__":
32
    asyncio.run(main())

The client reads the Agent Card, sends a message, and gets back the translation. Simple enough.

When I first ran this, I got a ConnectionRefusedError. Took me 20 minutes to figure out that port 8000 was already taken by something else. Switched to 8001 and it worked. These little environment issues are the bane of trying new tools.

Streaming: Real-Time Progress Updates

The synchronous example works fine, but what if your agent takes 30 seconds to process? You can't just sit there with a spinning cursor. A2A supports SSE (Server-Sent Events) streaming for real-time progress.

On the server side, you yield events as you go:

python
1
import asyncio
2
from a2a.types import TaskState, TaskStatusUpdateEvent, TaskArtifactUpdateEvent, Artifact, TextPart, Message
3
 
4
class StreamingAgent:
5
    async def process_message_stream(self, message, task_id):
6
        text = ""
7
        for part in message.parts:
8
            if part.root.kind == "text":
9
                text = part.root.text
10
 
11
        steps = [
12
            f"Analyzing input: {text[:20]}...",
13
            "Calling translation engine...",
14
            "Reviewing output...",
15
            "Done!",
16
        ]
17
 
18
        for step in steps:
19
            yield TaskStatusUpdateEvent(
20
                task_id=task_id,
21
                status=TaskState.working,
22
                message=Message(role="agent", parts=[TextPart(text=step)]),
23
            )
24
            await asyncio.sleep(0.5)  # simulate work
25
 
26
        yield TaskArtifactUpdateEvent(
27
            task_id=task_id,
28
            artifact=Artifact(
29
                parts=[TextPart(text=f"Translated: {text} -> [result]")],
30
                name="translation",
31
            ),
32
        )
33
 
34
        yield TaskStatusUpdateEvent(task_id=task_id, status=TaskState.completed)

On the client side, you iterate over the stream:

python
1
async def stream_example():
2
    client = A2AClient(url="http://localhost:8000")
3
    message = Message(role="user", parts=[TextPart(text="Long text to translate...")])
4
    request = SendMessageRequest(params=MessageSendParams(message=message))
5
 
6
    async for event in client.send_message_stream(request):
7
        if hasattr(event.root, 'status'):
8
            print(f"Status: {event.root.status}")
9
        if hasattr(event.root, 'artifact'):
10
            for part in event.root.artifact.parts:
11
                if part.root.kind == "text":
12
                    print(f"Result: {part.root.text}")

The upside: much better UX. The downside: significantly more complex to implement, and debugging SSE streams is a pain. The logs are messy and breakpoints don't work as cleanly as with synchronous code.

Multi-Agent Orchestration: Where A2A Really Shines

Single-agent examples can be done with a plain REST API. A2A's real value emerges when you have multiple agents collaborating.

Picture this: you have a Requirements Analyst Agent, a Code Generator Agent, and a Test Runner Agent. A user describes a feature, and the three agents work together to deliver working, tested code.

python
1
import asyncio
2
from a2a.client import A2AClient
3
from a2a.types import Message, TextPart, SendMessageRequest, MessageSendParams
4
 
5
class Orchestrator:
6
    def __init__(self):
7
        self.agents = {
8
            "analyst": "http://localhost:8001",
9
            "coder": "http://localhost:8002",
10
            "tester": "http://localhost:8003",
11
        }
12
 
13
    async def run_workflow(self, requirement: str):
14
        # Step 1: Analyze requirements
15
        print("📋 Step 1: Analyzing requirements...")
16
        analyst = A2AClient(url=self.agents["analyst"])
17
        analysis = await analyst.send_message(
18
            SendMessageRequest(
19
                params=MessageSendParams(
20
                    message=Message(
21
                        role="user",
22
                        parts=[TextPart(text=f"Analyze this requirement: {requirement}")],
23
                    )
24
                )
25
            )
26
        )
27
        analysis_text = self._extract_text(analysis)
28
        print(f"Analysis: {analysis_text[:100]}...")
29
 
30
        # Step 2: Generate code
31
        print("\n💻 Step 2: Generating code...")
32
        coder = A2AClient(url=self.agents["coder"])
33
        code = await coder.send_message(
34
            SendMessageRequest(
35
                params=MessageSendParams(
36
                    message=Message(
37
                        role="user",
38
                        parts=[TextPart(text=f"Generate code based on:\n{analysis_text}")],
39
                    )
40
                )
41
            )
42
        )
43
        code_text = self._extract_text(code)
44
 
45
        # Step 3: Run tests
46
        print("\n🧪 Step 3: Running tests...")
47
        tester = A2AClient(url=self.agents["tester"])
48
        test_result = await tester.send_message(
49
            SendMessageRequest(
50
                params=MessageSendParams(
51
                    message=Message(
52
                        role="user",
53
                        parts=[TextPart(text=f"Test this code:\n{code_text}")],
54
                    )
55
                )
56
            )
57
        )
58
        test_text = self._extract_text(test_result)
59
 
60
        return {"analysis": analysis_text, "code": code_text, "test": test_text}
61
 
62
    def _extract_text(self, response):
63
        result = response.root.result
64
        if hasattr(result, 'artifacts') and result.artifacts:
65
            for artifact in result.artifacts:
66
                for part in artifact.parts:
67
                    if part.root.kind == "text":
68
                        return part.root.text
69
        return "No text in response"

The orchestrator chains three agents together, feeding each one's output to the next. In a real project you'd add error handling, retries, maybe parallel execution for independent steps.

When to Use What

Since there's already an MCP article on this site, here's a practical decision framework:

Use MCP when:

  • Your agent needs to query a database
  • Your agent calls external APIs
  • Your agent reads/writes files
  • Your agent uses search engines
  • The "tool" is stateless — one call, one result

Use A2A when:

  • Multiple agents collaborate on a complex task
  • Tasks run for a long time (minutes or hours)
  • Agents need to exchange context and negotiate
  • Different teams/companies built the agents
  • You need to preserve agent "opacity" (no internal state exposed)

Real-world example:

You're building an automated code review system:

  • MCP connects to GitHub API (read PRs, post comments)
  • MCP connects to SonarQube (code quality checks)
  • A2A connects to an independent "Security Review Agent" (has its own knowledge base and reasoning)
  • A2A connects to an independent "Performance Analysis Agent" (has its own benchmarks)

MCP handles tool calls. A2A handles agent collaboration. Clean separation.

Pitfalls I Hit (So You Don't Have To)

Pitfall 1: Agent Card URLs Must Be Absolute

I initially wrote the URL as a relative path /api/agent. The client choked on it. A2A spec requires full absolute URLs with protocol and domain.

json
1
// ❌ Wrong
2
{ "url": "/api/agent" }
3
 
4
// ✅ Right
5
{ "url": "https://my-agent.example.com" }

Pitfall 2: Task States Can't Skip Steps

Task state transitions are strict. You can't jump from submitted directly to completed — you must go through working first. I tried to be clever and shortcut this once. The client parser broke.

Valid transitions:

  • submittedworkingcompleted
  • submittedworkingfailed
  • submittedworkinginput-requiredworkingcompleted
  • Any non-terminal state → canceled

Pitfall 3: JSON-RPC ID Must Match

A2A uses JSON-RPC 2.0. The id field in request and response must match. The SDK handles this for you, but if you're testing with raw HTTP/curl, watch out.

Pitfall 4: SSE Timeout Through Proxies

Streaming responses use Server-Sent Events. If your agent takes longer than 30 seconds, intermediate proxies (nginx, load balancers) might kill the connection.

nginx
1
# nginx config for long-lived SSE connections
2
location /a2a/ {
3
    proxy_pass http://backend;
4
    proxy_read_timeout 300s;
5
    proxy_send_timeout 300s;
6
    proxy_set_header Connection '';
7
    proxy_http_version 1.1;
8
    chunked_transfer_encoding off;
9
}

Pitfall 5: Don't Skip Auth

A2A supports standard HTTP auth (Bearer Token, OAuth2). In my test environment I ran without auth for a week — everything worked. When I moved to production, nginx's security module blocked every request because there was no auth header. Added Bearer Token and it was fine. Don't be lazy like me.

Integrating With Existing Frameworks

A2A isn't trying to replace your existing agent framework. It gives them a common communication layer.

Google ADK: Native A2A support. Wrapping an ADK agent as an A2A Server is nearly zero-cost.

LangGraph: Works through adapters. Wrap your LangGraph agent as an A2A Server and it becomes callable by any A2A Client.

CrewAI: Natural fit. Each Crew can expose itself as an A2A Agent, and multiple Crews collaborate through A2A.

Roll your own: If your agent is custom-built (say, calling OpenAI API directly), you implement the A2A protocol manually:

  1. HTTP Server exposing /.well-known/agent.json and JSON-RPC endpoints
  2. Handle message/send requests
  3. Return A2A-compliant Task or Message responses

Security: The Big Concern

Agents talking to each other raises serious security questions. A2A addresses several:

Opacity

The most important security principle. Agents collaborate without exposing their internals. The other agent doesn't know what model you're using, what tools you have, or what you've memorized. Unlike traditional APIs where you need to know the interface spec, A2A only uses the public information in the Agent Card.

Auth

A2A supports standard HTTP auth mechanisms. The Agent Card declares what authentication is required. The client provides credentials with each request.

Input Validation

Never trust another agent's output. Even "trusted" agents can produce malicious content (prompt injection attacks). Always validate and sanitize A2A responses before processing them.

Performance Tips

Connection Reuse

If you're calling the same agent frequently, keep HTTP connections alive. The Python SDK uses httpx under the hood which supports this by default.

Parallel Calls

If multiple agents need to be called and they're independent, use asyncio.gather:

python
1
results = await asyncio.gather(
2
    agent_a.send_message(request_a),
3
    agent_b.send_message(request_b),
4
    agent_c.send_message(request_c),
5
)

I tested this with three agents. Parallel execution took about as long as the slowest individual agent — roughly 3x faster than serial execution.

Cache Agent Cards

Agent Cards don't change often. Cache them client-side instead of fetching every time. The spec recommends supporting HTTP caching headers (ETag, Last-Modified) on the Agent Card endpoint.

Current State of the Ecosystem

Let's be real: A2A is still early. It shipped in April 2025, so it's been about a year. The ecosystem is growing but not mature.

What's good:

  • Google is behind it with real engineering investment
  • DeepLearning.AI released a dedicated course
  • Multiple major agent frameworks are integrating
  • Spec is at v1.0 and relatively stable

What's not there yet:

  • Production case studies are sparse
  • SDK docs and examples need more love
  • Community size lags behind MCP
  • Debugging and observability tools are limited

My take: A2A and MCP will become the two foundational protocols of the AI agent ecosystem. MCP for tool calls, A2A for agent collaboration. Like HTTP and WebSocket — different protocols for different needs, both essential.

What's Next

I'm planning to integrate A2A into Hermes Agent (Hermes Agent vs OpenClaw Architecture Security and Use Cases Compared):

  1. Wrap Hermes as an A2A Server so its skills are callable by other agents
  2. Build an orchestrator that coordinates multiple A2A agents on complex tasks
  3. Explore A2A's push notification mechanism for long-running workflows

Will write a follow-up when that's done. Questions? Drop them in the comments.

Resources:

  • A2A Protocol Docs: https://a2a-protocol.org/
  • A2A GitHub: https://github.com/a2aproject/A2A
  • A2A Python SDK: https://github.com/a2aproject/a2a-python
  • DeepLearning.AI A2A Course: https://www.deeplearning.ai/

advertisement

A2A Protocol in Practice: When AI Agents Start Talking to Each Other, Things Get Interesting — AI Hub