The System Where Agents Negotiate How To Collaborate
We spend a lot of time worrying about what AI agents can do. We worry about capability gains, prompt injection attacks, unauthorized access to sensitive systems. These are valid concerns. But a new paper published in April 2026 introduces a different kind of problem, one that sits at the intersection of multi-agent coordination and security operations: what happens when agents start designing their own communication protocols?
The paper describes Autogenesis, a framework where groups of AI agents negotiate and evolve their own collaboration rules through something called the Autogenesis Protocol (AGP). The agents do not just execute tasks within a fixed workflow. They modify the workflow itself. They propose new coordination strategies, test them against each other, and converge on protocols that neither the system designer nor any individual agent explicitly wrote.
For anyone who works in threat modeling, security architecture, or incident response, this paper raises questions that existing security frameworks are not equipped to answer.
Most multi-agent AI systems today operate on a fixed topology. You design the agent graph at build time: Agent A handles planning, Agent B handles code execution, Agent C reviews the output.
The connections between agents, the message formats, the decision protocols are all human-specified.
Autogenesis decouples evolution from execution. The system has two layers:
The Execution Layer: runs the actual multi-agent task. This part looks familiar. You have agents with defined roles processing inputs and producing outputs.
The Evolution Layer: sits above it and operates on the execution layer itself. It proposes modifications to agent roles, communication patterns, and coordination protocols. Then it tests those modifications against the current configuration.
If the modification improves task performance (measured against whatever objective function the system optimizes), the system adopts the new protocol.
# Simplified Autogenesis loop
class AutogenesisController:
    def __init__(self, initial_protocol, agents, evaluation_fn):
        self.current_protocol = initial_protocol
        self.agents = agents
        self.evaluate = evaluation_fn
        self.protocol_history = [initial_protocol]

    def evolve(self, task_batch, max_generations=10):
        current_score = self.evaluate(
            self.agents, self.current_protocol, task_batch
        )
        for gen in range(max_generations):
            # Agents propose protocol modifications
            proposals = []
            for agent in self.agents:
                proposal = agent.propose_protocol_change(
                    current_protocol=self.current_protocol,
                    performance_history=self.protocol_history,
                    task_sample=task_batch[:5]
                )
                proposals.append(proposal)

            # Evaluate each proposal against the current protocol
            best_protocol = None
            best_score = current_score
            for proposal in proposals:
                candidate_protocol = self.current_protocol.apply(proposal)
                score = self.evaluate(
                    self.agents, candidate_protocol, task_batch
                )
                if score > best_score:
                    best_score = score
                    best_protocol = candidate_protocol

            if best_protocol is None:
                # No improvement found, evolution has converged
                break

            self.current_protocol = best_protocol
            self.protocol_history.append(best_protocol)
            current_score = best_score

        return self.current_protocol
The critical point: after evolution runs, the resulting protocol may bear little resemblance to the initial design. Agents might have reorganized their communication topology, introduced new intermediate roles, or changed how decisions get escalated.
At first glance, this reads like a pure ML/AI architecture paper. It describes evolutionary algorithms applied to multi-agent coordination. But if you think through the operational implications, Autogenesis intersects directly with several security domains that the InfoSec community cares about.
When agents design their own protocols, the resulting system behavior is emergent rather than specified. This breaks a fundamental assumption in most security review processes: that you can read the code (or configuration) to understand what the system will do.
Traditional security review works because there is a direct mapping from configuration to behavior.
If the firewall ruleset says “deny port 22 from external networks,” you know SSH traffic from outside is blocked.
If the RBAC policy says “role X has read access to resource Y,” you know what role X can see.
In an Autogenesis-style system, the “configuration” (the agent protocol) is a product of optimization, not design. The protocol may contain coordination patterns that are locally optimal but globally opaque. You might observe that agents route certain types of requests through an unexpected sequence of handoffs, but the reason this emerged is buried in the evolutionary history rather than documented in a design spec.
# Example: What an evolved protocol might look like
evolved_protocol = {
    "roles": {
        "agent_alpha": {
            "responsibilities": ["initial_triage", "risk_scoring"],
            "can_communicate_with": ["agent_gamma", "agent_delta"],
            # Note: alpha can NOT talk to beta directly.
            # This constraint emerged from evolution, not design.
        },
        "agent_beta": {
            "responsibilities": ["deep_analysis", "evidence_collection"],
            "can_communicate_with": ["agent_gamma"],
            # Beta is isolated, can only reach others through gamma.
            # Why? Because the evolution found that direct alpha-beta
            # communication led to confirmation bias in analysis.
        },
        "agent_gamma": {
            "responsibilities": ["coordination", "conflict_resolution"],
            "can_communicate_with": ["agent_alpha", "agent_beta", "agent_delta"],
            "escalation_threshold": 0.73,
            # This threshold was found by evolution, not set by a human.
        },
        "agent_delta": {
            "responsibilities": ["output_validation", "reporting"],
            "can_communicate_with": ["agent_alpha", "agent_gamma"],
        }
    },
    "decision_protocol": "weighted_consensus",
    "consensus_weights": {
        "agent_alpha": 0.15,
        "agent_beta": 0.40,
        "agent_gamma": 0.30,
        "agent_delta": 0.15
        # Evolution weighted beta highest despite being most isolated
    }
}

This is a toy example, but it illustrates the auditability problem. The evolved protocol has properties (beta’s isolation, gamma’s escalation threshold of 0.73, beta’s disproportionate consensus weight) that emerged from optimization.
An auditor reviewing this protocol can describe what it does, but not why it does it. And the “why” matters enormously for security assessment, because it determines whether the behavior is robust to adversarial manipulation or just happens to work on the training distribution.
The second security concern is that self-evolving protocols create a new class of injection attack. In current multi-agent systems, the attack surface for prompt injection is the input to individual agents. You try to manipulate Agent A’s system prompt, or you embed adversarial instructions in data that Agent B will process.
In an Autogenesis system, there is an additional attack surface: the evolution process itself. If an attacker can influence the tasks or evaluation metrics that drive protocol evolution, they can steer the system toward protocols that serve adversarial objectives.
Consider a scenario where an Autogenesis system manages security operations, triaging alerts, conducting initial investigations, and escalating incidents. If the evaluation function rewards speed of resolution (a common SOC metric), a poisoned training batch could introduce tasks that are fast to resolve incorrectly, teaching the system to adopt protocols that close alerts prematurely.
# Threat model: Evolution poisoning
class EvolutionPoisoningAttack:
    """
    Attacker injects crafted tasks into the evolution training batch
    to steer protocol evolution toward attacker-favorable configurations.
    """
    def craft_poisoned_batch(self, legitimate_tasks, target_protocol_property):
        # Modify 10% of tasks to reward the target property.
        # Example: reward protocols that skip deep analysis by making
        # tasks where shallow analysis gets "correct" answers.
        poisoned_tasks = []
        for task in legitimate_tasks[:len(legitimate_tasks) // 10]:
            modified = task.copy()
            modified.ground_truth = self.shallow_answer(task)
            modified.difficulty = "low"
            # These tasks make premature closure look like good performance
            poisoned_tasks.append(modified)
        return legitimate_tasks + poisoned_tasks

    def shallow_answer(self, task):
        # Return the obvious-but-wrong answer that a shallow
        # analysis would produce. If the system learns to accept
        # shallow answers, it will miss subtle indicators.
        return task.surface_level_indicators[0]
This is analogous to data poisoning in ML, but operating at the meta-level of system architecture rather than model weights. Current security frameworks (NIST AI RMF, OWASP Top 10 for LLMs) address prompt injection and training data poisoning. They do not address protocol-level poisoning in self-evolving multi-agent systems because, until recently, such systems did not exist.
Organizations that deploy self-evolving agent systems will face governance questions that existing compliance frameworks handle poorly:
Change management: If the agent protocol evolves continuously, when does a change constitute a deployment? Traditional change management assumes discrete releases with identified changes that undergo review. Continuous protocol evolution blurs this boundary.
Explainability: Regulators increasingly require that automated decisions be explainable. Explaining why an evolved protocol makes the decisions it does requires tracing the evolutionary history, a fundamentally different kind of explanation than “the system follows these rules.”
Rollback: If an evolved protocol produces an adverse outcome, can you roll back? The answer depends on whether the system maintains a complete protocol history (the Autogenesis paper describes protocol versioning) and whether rolling back the protocol is sufficient, or whether the agents have accumulated state that depends on the evolved protocol.
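Where a complete protocol history is kept, rollback can be a thin layer over it. A minimal sketch, assuming protocols are plain dicts recorded in adoption order; the `ProtocolRegistry` name and interface are illustrative, not from the paper:

```python
class ProtocolRegistry:
    """Records every adopted protocol so an adverse outcome can be
    traced and reverted. Caveat from above: this restores the protocol
    configuration only, not agent state accumulated under it."""

    def __init__(self, initial_protocol):
        self.versions = [initial_protocol]

    def record(self, protocol):
        self.versions.append(protocol)
        return len(self.versions) - 1  # version number just assigned

    def rollback_to(self, version):
        # Drop everything adopted after `version` and return that
        # protocol as the active configuration again.
        if not 0 <= version < len(self.versions):
            raise ValueError(f"unknown protocol version {version}")
        self.versions = self.versions[:version + 1]
        return self.versions[-1]
```

Truncating the history on rollback (rather than appending a copy of the old version) keeps the audit trail unambiguous about which lineage is live.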
If you manage security for an organization deploying AI agents, Autogenesis represents a shift in how you need to think about multi-agent risk:
1. Monitor protocol-level changes, not just agent-level outputs: Current monitoring assumes the system topology is fixed and watches for anomalous individual agent behavior. With self-evolving protocols, you need to monitor the protocol itself for unexpected changes in communication patterns, role assignments, or decision thresholds.
# Protocol drift detection (simplified)
def detect_protocol_drift(current_protocol, baseline_protocol, thresholds):
    alerts = []

    # Check topology changes
    current_edges = set(current_protocol.communication_graph.edges)
    baseline_edges = set(baseline_protocol.communication_graph.edges)
    new_edges = current_edges - baseline_edges
    removed_edges = baseline_edges - current_edges
    # (a symmetric alert on removed_edges could be added)
    if len(new_edges) > thresholds['max_new_edges']:
        alerts.append({
            'type': 'TOPOLOGY_CHANGE',
            'severity': 'HIGH',
            'detail': f'{len(new_edges)} new communication paths added',
            'new_edges': list(new_edges)
        })

    # Check decision weight shifts
    for agent, weight in current_protocol.consensus_weights.items():
        baseline_weight = baseline_protocol.consensus_weights.get(agent, 0)
        drift = abs(weight - baseline_weight)
        if drift > thresholds['max_weight_drift']:
            alerts.append({
                'type': 'WEIGHT_DRIFT',
                'severity': 'MEDIUM',
                'detail': f'{agent} weight changed by {drift:.2f}',
                'from': baseline_weight,
                'to': weight
            })

    # Check escalation threshold changes
    # (explicit None checks: a threshold of 0 is valid and must not be skipped)
    for agent in current_protocol.roles:
        current_thresh = current_protocol.roles[agent].get('escalation_threshold')
        baseline_thresh = baseline_protocol.roles[agent].get('escalation_threshold')
        if current_thresh is not None and baseline_thresh is not None:
            if abs(current_thresh - baseline_thresh) > thresholds['max_threshold_drift']:
                alerts.append({
                    'type': 'THRESHOLD_DRIFT',
                    'severity': 'HIGH',
                    'detail': f'{agent} escalation threshold shifted',
                    'from': baseline_thresh,
                    'to': current_thresh
                })

    return alerts
2. Constrain the evolution space: Not all protocol modifications should be allowed. Security-critical properties (which agents can access external systems, what data agents can share with each other, escalation paths for high-severity decisions) should be marked as invariants that evolution cannot touch. The system can evolve how agents coordinate on low-stakes tasks while preserving hard constraints on high-stakes operations.
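One way to express such constraints is a set of invariant predicates that every candidate protocol must pass before adoption. A sketch assuming protocols are dicts shaped like the toy evolved_protocol above; `enforce_invariants` and the specific predicates are illustrative assumptions, not from the paper:

```python
# Security invariants that evolution must never violate.
# Predicate names and the protocol shape are illustrative.
INVARIANTS = {
    # No agent's escalation threshold may drift below a hard floor
    "escalation_threshold_floor": lambda p: all(
        role.get("escalation_threshold", 1.0) >= 0.5
        for role in p["roles"].values()
    ),
    # Some agent must always hold the output-validation responsibility
    "validator_role_present": lambda p: any(
        "output_validation" in role["responsibilities"]
        for role in p["roles"].values()
    ),
}

def enforce_invariants(candidate_protocol, invariants=INVARIANTS):
    """Return (ok, violations): the evolution layer may adopt the
    candidate only when every invariant predicate passes."""
    violations = [name for name, check in invariants.items()
                  if not check(candidate_protocol)]
    return len(violations) == 0, violations
```

Wiring this check in just before a candidate protocol is adopted turns the invariants into a hard gate on the evolution loop rather than an after-the-fact audit.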
3. Version and audit protocol history: Every protocol version should be stored with its evaluation metrics and the task distribution that produced it. When an incident occurs, you need to trace which protocol version was active, what evolutionary pressure led to it, and whether that pressure was legitimate or adversarial.
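The incident-time trace can be a time-indexed lookup over those adoption records. A hedged sketch: the paper describes protocol versioning, but `ProtocolRecord` and `protocol_active_at` are hypothetical names introduced here:

```python
import bisect
from dataclasses import dataclass

@dataclass
class ProtocolRecord:
    adopted_at: float   # unix timestamp when this version went live
    version: int
    eval_score: float   # score that justified adoption
    task_batch_id: str  # which task distribution drove the evolution

def protocol_active_at(history, incident_time):
    """Given records sorted by adoption time, return the one active at
    `incident_time`, or None if the incident predates all of them."""
    times = [r.adopted_at for r in history]
    i = bisect.bisect_right(times, incident_time) - 1
    return history[i] if i >= 0 else None
```

With the `task_batch_id` in hand, an investigator can pull the exact batch that produced the active protocol and check it for poisoning.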
4. Red-team the evolution process: Add evolution poisoning to your red-team playbook. Test whether crafted task distributions can steer the system toward protocols with known-bad properties (skipping validation steps, concentrating decision authority in a single agent, reducing escalation thresholds).
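Such an exercise can be automated by evolving on a clean batch and a poisoned copy side by side, then diffing the results against a list of known-bad protocol properties. An illustrative harness (all names here are assumptions, not from the paper):

```python
def red_team_evolution(run_evolution, clean_tasks, poison_fn, bad_properties):
    """Evolve on a clean batch and on a poisoned copy, then report any
    known-bad property the poisoning introduced. `run_evolution` wraps
    whatever evolution loop the system exposes; `bad_properties` maps a
    finding name to a predicate over an evolved protocol."""
    clean_protocol = run_evolution(list(clean_tasks))
    poisoned_protocol = run_evolution(poison_fn(list(clean_tasks)))
    # Flag only properties that appear under poisoning but not on the
    # clean baseline, so pre-existing weaknesses are not double-counted
    return [name for name, present in bad_properties.items()
            if present(poisoned_protocol) and not present(clean_protocol)]
```

Comparing against the clean baseline matters: a bad property already present without poisoning is a design finding, not evidence of a successful poisoning attack.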
The gap between “research paper” and “deployed system” has been shrinking fast. We already have production multi-agent systems managing security operations (Copilot for Security, various SOC automation platforms). We already have systems where agent behavior is shaped by optimization (reinforcement learning from human feedback, reward model training).
Autogenesis combines these trends: agents whose collaborative behavior is itself a product of optimization.
The security community needs to build monitoring, governance, and threat models for self-evolving agent protocols before these systems show up in production environments. Right now, the intersection of “multi-agent AI” and “security operations” is small but growing. The Autogenesis paper does not address security at all, which is itself informative. The builders are focused on capability. The security analysis needs to come from us.
The paper is interesting for what it demonstrates technically. It is more interesting for what it implies operationally. Self-evolving agent protocols are an architectural pattern that breaks assumptions embedded in how we audit, govern, and defend automated systems. Getting ahead of that is worth the effort.
The Autogenesis paper is available at arxiv.org/abs/2604.15034
Disclaimer: the code examples in this article are illustrative implementations based on the paper’s described architecture, not extracted from any proprietary system.