A colleague shared a paper this week that named something I’ve been building around without having clean language for.
The paper is “Autogenesis: A Self-Evolving Agent Protocol” from Nanyang Technological University, Stanford, and Princeton. The core argument: agent self-improvement fails not because the ideas are wrong, but because there’s no protocol governing how change happens.
The typical evolution loop today is: edit a prompt, rerun, hope it improves, overwrite the previous version. No audit trail. No rollback. No evaluation gate. No lineage. The system changes, but you have no idea whether it got better or just different.
The paper’s solution is to treat agent components – prompts, tools, memory, plans – as first-class resources with state and version lineage. Every change must pass through a defined control loop: reflect, select, improve, evaluate, commit. A change doesn’t happen unless it passes. If it fails, it rolls back.
That’s not an AI workflow. That’s a transaction system.
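Here is a minimal sketch of that loop read as a transaction, in Python. The names (Resource, evolve, the toy evaluator) are mine and not the paper’s, and I’ve collapsed reflect, select, and improve into a single function to keep it short. Treat it as an illustration of the shape, not the protocol itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Resource:
    """A versioned agent component (prompt, tool, memory entry, plan)."""
    name: str
    versions: List[str] = field(default_factory=list)  # full lineage, oldest first

    @property
    def current(self) -> str:
        return self.versions[-1]


def evolve(resource: Resource,
           improve: Callable[[str], str],
           evaluate: Callable[[str], float]) -> bool:
    """Try one change; commit only if it scores strictly higher than the current version."""
    baseline = evaluate(resource.current)
    candidate = improve(resource.current)    # reflect/select/improve, collapsed (assumption)
    if evaluate(candidate) > baseline:
        resource.versions.append(candidate)  # commit: new version appended, old one kept
        return True
    return False                             # rollback: candidate discarded, state untouched


# Toy usage: the evaluator only rewards prompts that mention citing sources.
prompt = Resource("system_prompt", ["Answer the question."])
score = lambda text: float("cite" in text)

rejected = evolve(prompt, lambda p: p + " Be brief.", score)
accepted = evolve(prompt, lambda p: p + " Always cite sources.", score)
```

The rejected change leaves no trace in the lineage, and the accepted one is appended rather than written over the original, so the previous version stays available to roll back to.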
The reported numbers are substantial. On the GAIA benchmark, the paper reports that iterative tool evolution within this controlled loop improves task success from 79% to 89%, with the largest gains on the complex tasks where static toolchains break.
But the benchmark isn’t the point. The design principle is.
I’ve been building something similar with Alice – my personal agentic assistant – without having the academic framing for it.
The principle I landed on: treat memory, instructions, and skills like objects in a Git repository.
Every component has a version. Nothing overwrites anything directly. When a reflection cycle or a compaction event generates a proposed change – to how Alice processes context, to a skill’s logic, to a piece of long-term memory – that change starts as a branch, not an edit.
From there it follows the same discipline I apply to code. Is the change free of regressions? Is it compatible with existing behavior? Does it actually move the system toward the goal it’s supposed to serve? If the answer to all three is yes: PR and review, then merge to main. If not: the branch stays open or gets closed. The previous state is untouched.
The agent can propose. It cannot overwrite itself.
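A rough sketch of that rule in Python, using an in-memory stand-in rather than real Git. Everything here (Store, Proposal, the check signature, the component name) is hypothetical and not how Alice is actually implemented. It only shows the constraint: a proposal reaches main when its checks pass and someone other than its author approves it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A check receives the accepted history and the proposed content (assumed signature).
Check = Callable[[List[str], str], bool]


@dataclass
class Proposal:
    """A proposed change to one component; plays the role of a branch."""
    component: str
    new_content: str
    author: str
    status: str = "open"  # open -> merged | closed


@dataclass
class Store:
    """Hypothetical versioned store: main holds the accepted history per component."""
    main: Dict[str, List[str]] = field(default_factory=dict)
    proposals: List[Proposal] = field(default_factory=list)

    def propose(self, component: str, new_content: str, author: str) -> Proposal:
        proposal = Proposal(component, new_content, author)
        self.proposals.append(proposal)  # main is not touched here
        return proposal

    def merge(self, proposal: Proposal, checks: List[Check],
              approved_by: Optional[str]) -> bool:
        if proposal.status != "open":
            return False
        if approved_by is None or approved_by == proposal.author:
            return False                 # no independent review: branch stays open
        history = self.main.get(proposal.component, [])
        if not all(check(history, proposal.new_content) for check in checks):
            proposal.status = "closed"   # failed a check: closed, main untouched
            return False
        self.main.setdefault(proposal.component, []).append(proposal.new_content)
        proposal.status = "merged"
        return True


store = Store(main={"skill/summarize": ["Summarize in five bullets."]})
change = store.propose("skill/summarize", "Summarize in three bullets.", author="agent")
not_empty: Check = lambda history, new: bool(new.strip())

unreviewed = store.merge(change, [not_empty], approved_by=None)
self_approved = store.merge(change, [not_empty], approved_by="agent")
reviewed = store.merge(change, [not_empty], approved_by="human")
```

Only the last call merges. The first two leave the proposal open and main exactly as it was.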
The part that isn’t automated is the most important part.
Changes have to be goal-oriented. They have to be questioned against principles. There has to be a human in the loop – not to approve every micro-decision, but to evaluate whether the direction of change is actually desirable.
This is the constraint the paper also identifies: if your evaluation surface is weak or noisy, the protocol enforces stability but can’t guarantee meaningful improvement. Structure alone doesn’t make an agent smarter. It just makes the changes legible enough to govern.
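To make the weak-evaluation caveat concrete, here is a small simulation in Python. The setup is entirely my assumption, not the paper’s: a candidate that is exactly as good as the baseline, Gaussian noise on every score, and two hypothetical gates. The naive gate compares one noisy score against another. The stricter gate averages repeated runs and requires a margin.

```python
import random
from statistics import mean


def noisy_score(true_quality: float, rng: random.Random, noise: float = 0.05) -> float:
    """One evaluation run: the true quality plus Gaussian noise (assumed noise model)."""
    return true_quality + rng.gauss(0.0, noise)


def naive_gate(base: float, cand: float, rng: random.Random) -> bool:
    """Commit if a single noisy candidate score beats a single noisy baseline score."""
    return noisy_score(cand, rng) > noisy_score(base, rng)


def strict_gate(base: float, cand: float, rng: random.Random,
                runs: int = 20, margin: float = 0.05) -> bool:
    """Commit only if the mean over repeated runs beats the baseline mean by a margin."""
    base_mean = mean(noisy_score(base, rng) for _ in range(runs))
    cand_mean = mean(noisy_score(cand, rng) for _ in range(runs))
    return cand_mean - base_mean > margin


def false_commits(gate, trials: int = 1000, seed: int = 0) -> int:
    """Count commits when the candidate is no better than the baseline."""
    rng = random.Random(seed)
    return sum(gate(0.5, 0.5, rng) for _ in range(trials))


naive_count = false_commits(naive_gate)
strict_count = false_commits(strict_gate)
```

With these assumed parameters the naive gate commits roughly half of the no-op changes, while the stricter gate commits very few. Both gates are equally structured. Only the second has an evaluation signal strong enough to separate improvement from noise.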
An agent that can change without governance isn’t improving. It’s drifting.
Most of what I’ve seen in production agent stacks ignores this entirely. Prompts get tweaked. Memory gets appended. Skills get patched. Each change feels small, so nobody formalizes it. Over time the system becomes something nobody fully understands – not because it’s complex, but because its history was never tracked.
The tooling exists. Git is twenty years old. The concept of a merge request is not exotic.
The gap isn’t technical. It’s discipline.
Agents become reliable the same way codebases do: not by changing, but by changing carefully, with structure, with lineage, and with someone accountable for what gets merged.
Paper: Autogenesis: A Self-Evolving Agent Protocol – Nanyang Technological University, Stanford, Princeton, 2025.