Self-Evolving AI Agents Are Here and They Write Their Own Protocols!
The article examines Autogenesis, a framework through which AI agents autonomously negotiate and evolve their own collaboration rules, and the security risks that follow: the framework lets agents modify workflows and optimize coordination strategies, but the resulting system behavior is hard to predict and audit, opening the door to protocol injection attacks and governance problems. 2026-04-21 06:05:10 · Source: infosecwriteups.com

Raviteja Nekkalapu

The System Where Agents Negotiate How To Collaborate


We spend a lot of time worrying about what AI agents can do. We worry about capability gains, prompt injection attacks, unauthorized access to sensitive systems. These are valid concerns. But a new paper published in April 2026 introduces a different kind of problem, one that sits at the intersection of multi-agent coordination and security operations: what happens when agents start designing their own communication protocols?

The paper describes Autogenesis, a framework where groups of AI agents negotiate and evolve their own collaboration rules through something called the Autogenesis Protocol (AGP). The agents do not just execute tasks within a fixed workflow. They modify the workflow itself. They propose new coordination strategies, test them against each other, and converge on protocols that neither the system designer nor any individual agent explicitly wrote.

For anyone who works in threat modeling, security architecture, or incident response, this paper raises questions that existing security frameworks are not equipped to answer.

What Autogenesis Actually Does

Most multi-agent AI systems today operate on a fixed topology. You design the agent graph at build time: Agent A handles planning, Agent B handles code execution, Agent C reviews the output.

The connections between agents, the message formats, the decision protocols are all human-specified.
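A hand-specified graph like this is fully auditable precisely because it can be written down in one place. A minimal sketch of what that looks like (the role names, edge list, and helper function are illustrative assumptions, not taken from any particular framework):

```python
# A hand-specified agent graph: every role, edge, and message format
# is fixed at build time, so reading this dict tells you exactly how
# the system coordinates. (Illustrative names, not from the paper.)
FIXED_TOPOLOGY = {
    "roles": {
        "planner": {"handles": "task_decomposition"},
        "executor": {"handles": "code_execution"},
        "reviewer": {"handles": "output_review"},
    },
    # Directed edges: who may send messages to whom.
    "edges": [
        ("planner", "executor"),
        ("executor", "reviewer"),
        ("reviewer", "planner"),  # feedback loop back to planning
    ],
    "message_format": "json",
}

def allowed(sender: str, receiver: str) -> bool:
    """Auditing the system reduces to reading the config: a message
    is permitted iff its edge appears in the human-written graph."""
    return (sender, receiver) in FIXED_TOPOLOGY["edges"]
```

Here `allowed("planner", "executor")` is True and `allowed("planner", "reviewer")` is False, and an auditor can verify both by inspection. Autogenesis removes exactly this property.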

Autogenesis decouples evolution from execution. The system has two layers:

The Execution Layer: runs the actual multi-agent task. This part looks familiar. You have agents with defined roles processing inputs and producing outputs.

The Evolution Layer: sits above it and operates on the execution layer itself. It proposes modifications to agent roles, communication patterns, and coordination protocols. Then it tests those modifications against the current configuration.

If the modification improves task performance (measured against whatever objective function the system optimizes), the system adopts the new protocol.

# Simplified Autogenesis loop
class AutogenesisController:
    def __init__(self, initial_protocol, agents, evaluation_fn):
        self.current_protocol = initial_protocol
        self.agents = agents
        self.evaluate = evaluation_fn
        self.protocol_history = [initial_protocol]

    def evolve(self, task_batch, max_generations=10):
        current_score = self.evaluate(
            self.agents, self.current_protocol, task_batch
        )

        for gen in range(max_generations):
            # Agents propose protocol modifications
            proposals = []
            for agent in self.agents:
                proposal = agent.propose_protocol_change(
                    current_protocol=self.current_protocol,
                    performance_history=self.protocol_history,
                    task_sample=task_batch[:5]
                )
                proposals.append(proposal)

            # Evaluate each proposal against the current protocol
            best_protocol = None
            best_score = current_score

            for proposal in proposals:
                candidate_protocol = self.current_protocol.apply(proposal)
                score = self.evaluate(
                    self.agents, candidate_protocol, task_batch
                )
                if score > best_score:
                    best_score = score
                    best_protocol = candidate_protocol

            if best_protocol is None:
                # No improvement found, evolution has converged
                break

            self.current_protocol = best_protocol
            self.protocol_history.append(best_protocol)
            current_score = best_score

        return self.current_protocol

The critical point: after evolution runs, the resulting protocol may bear little resemblance to the initial design. Agents might have reorganized their communication topology, introduced new intermediate roles, or changed how decisions get escalated.

The Security Implications Are Not Hypothetical

At first glance, this reads like a pure ML/AI architecture paper. It describes evolutionary algorithms applied to multi-agent coordination. But if you think through the operational implications, Autogenesis intersects directly with several security domains that the InfoSec community cares about.

Emergent Behavior and Auditability

When agents design their own protocols, the resulting system behavior is emergent rather than specified. This breaks a fundamental assumption in most security review processes: that you can read the code (or configuration) to understand what the system will do.

Traditional security review works because there is a direct mapping from configuration to behavior.

If the firewall ruleset says “deny port 22 from external networks,” you know SSH traffic from outside is blocked.

If the RBAC policy says “role X has read access to resource Y,” you know what role X can see.

In an Autogenesis-style system, the “configuration” (the agent protocol) is a product of optimization, not design. The protocol may contain coordination patterns that are locally optimal but globally opaque. You might observe that agents route certain types of requests through an unexpected sequence of handoffs, but the reason this emerged is buried in the evolutionary history rather than documented in a design spec.

# Example: What an evolved protocol might look like
evolved_protocol = {
    "roles": {
        "agent_alpha": {
            "responsibilities": ["initial_triage", "risk_scoring"],
            "can_communicate_with": ["agent_gamma", "agent_delta"],
            # Note: alpha can NOT talk to beta directly.
            # This constraint emerged from evolution, not design.
        },
        "agent_beta": {
            "responsibilities": ["deep_analysis", "evidence_collection"],
            "can_communicate_with": ["agent_gamma"],
            # Beta is isolated and can only reach others through gamma.
            # Why? Because evolution found that direct alpha-beta
            # communication led to confirmation bias in analysis.
        },
        "agent_gamma": {
            "responsibilities": ["coordination", "conflict_resolution"],
            "can_communicate_with": ["agent_alpha", "agent_beta", "agent_delta"],
            "escalation_threshold": 0.73,
            # This threshold was found by evolution, not set by a human.
        },
        "agent_delta": {
            "responsibilities": ["output_validation", "reporting"],
            "can_communicate_with": ["agent_alpha", "agent_gamma"],
        },
    },
    "decision_protocol": "weighted_consensus",
    "consensus_weights": {
        "agent_alpha": 0.15,
        "agent_beta": 0.40,
        "agent_gamma": 0.30,
        "agent_delta": 0.15,
        # Evolution weighted beta highest despite it being the most isolated.
    },
}

This is a toy example, but it illustrates the auditability problem. The evolved protocol has properties (beta’s isolation, gamma’s escalation threshold of 0.73, beta’s disproportionate consensus weight) that emerged from optimization.


An auditor reviewing this protocol can describe what it does, but not why it does it. And the “why” matters enormously for security assessment, because it determines whether the behavior is robust to adversarial manipulation or just happens to work on the training distribution.

The Protocol Injection Surface

The second security concern is that self-evolving protocols create a new class of injection attack. In current multi-agent systems, the attack surface for prompt injection is the input to individual agents. You try to manipulate Agent A’s system prompt, or you embed adversarial instructions in data that Agent B will process.

In an Autogenesis system, there is an additional attack surface: the evolution process itself. If an attacker can influence the tasks or evaluation metrics that drive protocol evolution, they can steer the system toward protocols that serve adversarial objectives.

Consider a scenario where an Autogenesis system manages security operations: triaging alerts, conducting initial investigations, and escalating incidents. If the evaluation function rewards speed of resolution (a common SOC metric), a poisoned training batch could introduce tasks that are fast to resolve incorrectly, teaching the system to adopt protocols that close alerts prematurely.

# Threat model: Evolution poisoning
class EvolutionPoisoningAttack:
    """
    Attacker injects crafted tasks into the evolution training batch
    to steer protocol evolution toward attacker-favorable configurations.
    """
    def craft_poisoned_batch(self, legitimate_tasks, target_protocol_property):
        poisoned_tasks = []
        for task in legitimate_tasks[:len(legitimate_tasks) // 10]:
            # Modify 10% of tasks to reward the target property
            modified = task.copy()

            # Example: reward protocols that skip deep analysis
            # by making tasks where shallow analysis gets "correct" answers
            modified.ground_truth = self.shallow_answer(task)
            modified.difficulty = "low"
            # These tasks make premature closure look like good performance

            poisoned_tasks.append(modified)

        return legitimate_tasks + poisoned_tasks

    def shallow_answer(self, task):
        # Return the obvious-but-wrong answer that a shallow
        # analysis would produce. If the system learns to accept
        # shallow answers, it will miss subtle indicators.
        return task.surface_level_indicators[0]

This is analogous to data poisoning in ML, but operating at the meta-level of system architecture rather than model weights. Current security frameworks (NIST AI RMF, OWASP Top 10 for LLMs) address prompt injection and training data poisoning. They do not address protocol-level poisoning in self-evolving multi-agent systems because, until recently, such systems did not exist.

Implications for Governance and Compliance

Organizations that deploy self-evolving agent systems will face governance questions that existing compliance frameworks handle poorly:

Change management : If the agent protocol evolves continuously, when does a change constitute a deployment? Traditional change management assumes discrete releases with identified changes that undergo review. Continuous protocol evolution blurs this boundary.

Explainability : Regulators increasingly require that automated decisions be explainable. Explaining why an evolved protocol makes the decisions it does requires tracing the evolutionary history, a fundamentally different kind of explanation than “the system follows these rules.”

Rollback : If an evolved protocol produces an adverse outcome, can you roll back? The answer depends on whether the system maintains a complete protocol history (the Autogenesis paper describes protocol versioning) and whether rolling back the protocol is sufficient, or whether the agents have accumulated state that depends on the evolved protocol.
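Both the explainability question and the rollback question reduce to operations over the protocol history that the paper's versioning implies. A minimal sketch of those two operations, assuming protocols are stored as plain dicts (the data shapes and function names here are my assumptions, not the paper's API):

```python
from typing import Callable, Optional

def trace_property(protocol_history: list,
                   has_property: Callable[[dict], bool]) -> Optional[int]:
    """Return the index of the generation that first exhibited a given
    property (e.g. 'beta is isolated'), or None if it never appeared.
    This is the 'why' trace an auditor needs: which evolutionary step
    introduced the behavior under investigation."""
    for gen, protocol in enumerate(protocol_history):
        if has_property(protocol):
            return gen
    return None

def rollback_to(protocol_history: list, generation: int) -> dict:
    """Return the last version before the flagged generation.
    Sufficient only if agents hold no accumulated state that depends
    on the evolved protocol -- the open question raised above."""
    if generation <= 0:
        raise ValueError("nothing to roll back to")
    return protocol_history[generation - 1]
```

For example, given a history where beta's consensus weight drifted upward, `trace_property(history, lambda p: p["weights"]["beta"] > 0.3)` pinpoints the adopting generation, and `rollback_to` recovers the version before the drift.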

What This Means For Security Teams

If you manage security for an organization deploying AI agents, Autogenesis represents a shift in how you need to think about multi-agent risk:

1. Monitor protocol-level changes, not just agent-level outputs : Current monitoring assumes the system topology is fixed and watches for anomalous individual agent behavior. With self-evolving protocols, you need to monitor the protocol itself for unexpected changes in communication patterns, role assignments, or decision thresholds.

# Protocol drift detection (simplified)
def detect_protocol_drift(current_protocol, baseline_protocol, thresholds):
    alerts = []

    # Check topology changes
    current_edges = set(current_protocol.communication_graph.edges)
    baseline_edges = set(baseline_protocol.communication_graph.edges)
    new_edges = current_edges - baseline_edges
    removed_edges = baseline_edges - current_edges

    if len(new_edges) > thresholds['max_new_edges']:
        alerts.append({
            'type': 'TOPOLOGY_CHANGE',
            'severity': 'HIGH',
            'detail': f'{len(new_edges)} new communication paths added',
            'new_edges': list(new_edges)
        })

    # Check decision weight shifts
    for agent, weight in current_protocol.consensus_weights.items():
        baseline_weight = baseline_protocol.consensus_weights.get(agent, 0)
        drift = abs(weight - baseline_weight)
        if drift > thresholds['max_weight_drift']:
            alerts.append({
                'type': 'WEIGHT_DRIFT',
                'severity': 'MEDIUM',
                'detail': f'{agent} weight changed by {drift:.2f}',
                'from': baseline_weight,
                'to': weight
            })

    # Check escalation threshold changes
    # (compare against None, not truthiness, so a threshold of 0.0
    # is still checked)
    for agent in current_protocol.roles:
        current_thresh = current_protocol.roles[agent].get('escalation_threshold')
        baseline_thresh = baseline_protocol.roles[agent].get('escalation_threshold')
        if current_thresh is not None and baseline_thresh is not None:
            if abs(current_thresh - baseline_thresh) > thresholds['max_threshold_drift']:
                alerts.append({
                    'type': 'THRESHOLD_DRIFT',
                    'severity': 'HIGH',
                    'detail': f'{agent} escalation threshold shifted',
                    'from': baseline_thresh,
                    'to': current_thresh
                })

    return alerts

2. Constrain the evolution space : Not all protocol modifications should be allowed. Security-critical properties (which agents can access external systems, what data agents can share with each other, escalation paths for high-severity decisions) should be marked as invariants that evolution cannot touch. The system can evolve how agents coordinate on low-stakes tasks while preserving hard constraints on high-stakes operations.
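One way to enforce such invariants is a validation gate that every candidate protocol must pass before the evolution loop may adopt it. A sketch of that gate; the invariant predicates and protocol fields below are illustrative assumptions, not anything the paper specifies:

```python
# Hard constraints that protocol evolution may never relax.
# Each predicate returns True if the candidate protocol is acceptable.
# (Field names like "external_access" are illustrative assumptions.)
INVARIANTS = {
    # Only the validation agent may reach external systems.
    "external_access": lambda p: set(
        p.get("external_access", [])) <= {"agent_delta"},
    # Evolution may tune escalation sensitivity, but only down to a floor.
    "escalation_floor": lambda p: all(
        role.get("escalation_threshold", 1.0) >= 0.5
        for role in p.get("roles", {}).values()
    ),
}

def validate_candidate(protocol: dict) -> list:
    """Return the names of violated invariants (empty list = safe).
    The evolution loop rejects any candidate with violations, so
    optimization can only explore the constrained region."""
    return [name for name, ok in INVARIANTS.items() if not ok(protocol)]
```

A candidate that lowered gamma's escalation threshold to 0.2, for instance, would come back flagged with `["escalation_floor"]` and never reach evaluation, regardless of how well it scored on the task batch.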

3. Version and audit protocol history : Every protocol version should be stored with its evaluation metrics and the task distribution that produced it. When an incident occurs, you need to trace which protocol version was active, what evolutionary pressure led to it, and whether the pressure was legitimate or adversarial.
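Recording that context at adoption time is a small amount of bookkeeping. A sketch of what an audit record could hold; the field names are assumptions about what an incident investigation would need, not a prescribed schema:

```python
import hashlib
import json
import time

def record_adoption(history: list, protocol: dict, score: float,
                    task_batch: list) -> dict:
    """Append an audit record for a newly adopted protocol version.
    Hashing the task batch lets an investigator later check whether
    the evolutionary pressure came from the expected task distribution
    or from an injected (poisoned) one."""
    entry = {
        "version": len(history),
        "adopted_at": time.time(),
        "protocol": protocol,
        "evaluation_score": score,
        # Fingerprint of the tasks that drove this adoption.
        "task_batch_sha256": hashlib.sha256(
            json.dumps(task_batch, sort_keys=True).encode()
        ).hexdigest(),
    }
    history.append(entry)
    return entry
```

During an incident, the stored `task_batch_sha256` values can be compared against the hashes of batches known to come from trusted pipelines; a version whose fingerprint matches nothing legitimate is a candidate for evolution poisoning.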

4. Red-team the evolution process : Add evolution poisoning to your red team playbook. Test whether crafted task distributions can steer the system toward protocols with known-bad properties (skipping validation steps, concentrating decision authority in a single agent, reducing escalation thresholds).

This Is Not Science Fiction

The gap between “research paper” and “deployed system” has been shrinking fast. We already have production multi-agent systems managing security operations (Copilot for Security, various SOC automation platforms). We already have systems where agent behavior is shaped by optimization (reinforcement learning from human feedback, reward model training).

Autogenesis combines these trends: agents whose collaborative behavior is itself a product of optimization.

The security community needs to build monitoring, governance, and threat models for self-evolving agent protocols before these systems show up in production environments. Right now, the intersection of “multi-agent AI” and “security operations” is small but growing. The Autogenesis paper does not address security at all, which is itself informative. The builders are focused on capability. The security analysis needs to come from us.

The paper is interesting for what it demonstrates technically. It is more interesting for what it implies operationally. Self-evolving agent protocols are an architectural pattern that breaks assumptions embedded in how we audit, govern, and defend automated systems. Getting ahead of that is worth the effort.

The Autogenesis paper is available at arxiv.org/abs/2604.15034

Disclaimer: the code examples in this article are illustrative implementations based on the paper’s described architecture, not extracted from any proprietary system.


Source: https://infosecwriteups.com/self-evolving-ai-agents-are-here-and-they-write-their-own-protocols-f3bcf13012a8?source=rss----7b722bfd1b8d---4