Self-Organizing AI Agent Swarms. On Demand.
k8s orchestrates containers. a8s orchestrates AI agents. Inspired by Kubernetes.

Agentnetes

Zero to a Self-Discovering AI Agency.

One goal. A swarm of agents spins up, each in its own sandbox with its own coding harness. They research, build, collaborate, and deliver together.

RLM Pattern · MIT CSAIL
AutoResearch Loop · Karpathy
Two-Tool MCP · search() + execute()
A2A Protocol · Google Agent-to-Agent
$ agentnetes run
$ npm install -g agentnetes
then run on any git repo
$ agentnetes run "add dark mode to this app"
parallel agents · 2 tools per agent (MCP) · thousands of lines explored · one goal to a full team

How It Works

From one goal to a working team.

Type a goal. The system researches, decomposes, builds, and delivers — autonomously.

01 · Goal
$ agentnetes run "Add billing with Stripe"

One prompt. No config, no YAML, no agent definitions.

02 · Decompose
Root Agent · RLM
search → analyze → plan → split
UI Eng
API Eng
DB Eng
Tester

Root agent auto-researches the codebase and invents the right team.

03 · Sandboxes
UI Eng · microVM
API Eng · microVM
DB Eng · microVM
Tester · microVM
Firecracker · Coding harness
04 · Auto-Research
API Engineer · RLM loop 3/5
search···
read···
think···
execute···
verify···

Each agent runs a tight RLM loop — search, read, think, execute, verify — repeating until solved.

05 · A2A Collaborate
Tester
Webhook sig check fails
API Eng
Patched env config — re-run
Tester
12/12 tests passing ✓

Agents coordinate via A2A protocol — no shared memory or context window.

06 · Deliver
Complete · 52s
9 files modified
Stripe webhooks + billing UI
12/12 tests passing
4 agents · 4 sandboxes · 5 RLM loops · 52s

The Problem

Single agents break on real codebases.

Teams have tried everything: single coding agents, prompt-stuffed RAG pipelines, hand-wired MCP tools, elaborate harnesses. They all hit the same wall. Real codebases are too large for one context window, too complex for one agent, and too dynamic for static tool configs. Nobody stopped to ask: what if the agents invented their own team?

Single agent + tools
Cursor, Copilot, custom harnesses
  • Entire codebase stuffed into context
  • Context rot beyond ~50K tokens
  • Sequential: one task at a time
  • One failure halts everything
  • Static tool lists, no emergent roles
Agentnetes
vRLM runtime · RLM pattern
  • Context lives in sandboxes, not prompts
  • Agents explore via grep, find, cat
  • Specialists run in parallel
  • Agents catch and fix each other's errors
  • Roles invented per goal, never hardcoded
Teams using AI tools
Copilot, Cursor, RAG, MCP harnesses
  • Try different coding agent harnesses
  • Hand-wire MCP tools and hope they compose
  • Build RAG pipelines to stuff context
  • Tune prompts and pray context holds
  • Never break the problem down RLM-style

Who Is This For

Built for developers who move fast.

🧑‍💻
Solo developers
Get the leverage of a full engineering team on your personal projects. Ship faster without hiring.
🏗️
Monorepo teams
Point the swarm at your large codebase. Agents explore only what they need · no context limits.
🔐
Security engineers
Spin up a dedicated audit swarm to scan, analyze, and fix vulnerabilities across the entire codebase.
⚙️
Platform engineers
Automate cross-cutting concerns · test coverage, observability, migrations · across all services.

Differentiation

Not another AI coding assistant.

Agentnetes is not autocomplete. It is not a chatbot. It is a swarm that investigates, builds, tests, and delivers.

01
Self-organizing teams
The model invents the roles it needs. A provider task spawns a Scout, Engineer, Tester. A security audit spawns an entirely different team. Nothing is hardcoded.
02
Real code execution
Agents run real shell commands in real sandboxes. Tests actually execute. Build failures get fixed. This is not a simulation of engineering · it is engineering.
03
Any model, same swarm
Swap Gemini for Claude or GPT by changing one env var. The swarm architecture is model-agnostic. Route through Vercel AI Gateway or call Google directly.
04
Two-tool MCP strategy
Each agent has exactly two tools: search() and execute(). ~1,000 token footprint regardless of codebase size. No tool bloat, no context waste.
05
Context stays external
Files never enter the prompt. Agents write code to explore the codebase · grep, find, cat. Proven by the MIT CSAIL RLM paper to outperform context-stuffing 2×.
06
Agents fix each other
When the Tester finds a bug, it routes back to the Engineer automatically. The swarm has a built-in try → test → fix loop that runs until tests pass.

Architecture

One goal. A recursive agent swarm.

The root agent decomposes your goal, invents the right team of specialists, and orchestrates them across isolated sandboxes. Roles are fully emergent. Nothing is hardcoded.

goal: "Add @ai-sdk/deepseek provider"
Root Agent / Tech Lead
Gemini 2.5 Pro via AI Gateway
vRLM orchestrator
🔍
Architecture Scout
Gemini 2.5 Flash
search()
⚙️
Provider Engineer
Gemini 2.5 Flash
execute()
🧪
Test Engineer
Gemini 2.5 Flash
execute()
📦
Package Engineer
Gemini 2.5 Flash
execute()
Firecracker microVM per agent · search() + execute() MCP only · SSE event stream to UI
Emergent team formation
The root agent reads the codebase, understands the task, and invents the right specialist roles. A provider task gets a Scout, Engineer, Tester, and Packager. A security audit gets an entirely different team.
Isolated Firecracker sandboxes
Each agent runs in its own Vercel Sandbox (Firecracker microVM). Pre-warmed from a repo snapshot for near-instant startup. Agents cannot interfere with each other.
Context externalized, not stuffed
Agents do not receive hundreds of files in their prompts. They write code to explore context: grep, find, cat. This is the RLM Pattern from MIT CSAIL, proven 2x more effective.
Agents collaborate at runtime
When the Test Engineer finds a type error, that finding routes back to the Provider Engineer automatically. The vRLM runtime handles inter-agent communication.
Live Preview

Watch the swarm execute

Real output from a simulated run. The actual runtime produces identical event streams.

agentnetes — vRLM runtime
Root orchestrator
Shell execution
Worker agent
Inter-agent collaboration
Completion

Research Foundations

Five ideas. One system.

Agentnetes combines five research-backed patterns that individually improve agent performance and together create something qualitatively different.

MIT CSAIL

RLM Pattern: Context lives in sandboxes

Recursive Language Model runtime. Instead of stuffing files into an agent's context window, context is externalized into a filesystem that agents explore programmatically.

Agents write small shell scripts to grep, find, and read exactly what they need. This keeps token footprints tiny regardless of codebase size and has been shown to outperform naive context-stuffing by 2x on software engineering benchmarks.

agent.execute('grep -r LanguageModelV1 packages/ -l')
Based on: RLM paper, MIT CSAIL — externalized context via code execution
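A minimal sketch of an agent-side RLM loop under this pattern. The `Tools` interface and the step structure are illustrative stand-ins, not the actual runtime API:

```typescript
// Hypothetical tool surface; the real agents expose search() and execute().
interface Tools {
  search: (query: string) => string;   // find a relevant file
  read: (path: string) => string;      // read a candidate file
  execute: (cmd: string) => number;    // run a command, return exit code
}

// One RLM-style loop: search, read, think, execute, verify — capped at maxSteps.
function rlmLoop(goal: string, tools: Tools, maxSteps = 5): boolean {
  for (let step = 0; step < maxSteps; step++) {
    const hit = tools.search(goal);                 // search
    const context = tools.read(hit);                // read
    const plan = `apply fix for: ${goal} given ${context.length} bytes`; // think (stub)
    tools.execute(plan);                            // execute
    if (tools.execute('run verify') === 0) return true; // verify: exit code 0 = solved
  }
  return false; // budget exhausted — escalate to the root agent
}
```

The point of the shape: the codebase never enters the prompt; every iteration pulls only what the current step needs.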
Karpathy

AutoResearch Loop: Try, measure, keep

Agents do not write code and hope for the best. They write code, run tests, measure the result, and either keep the change or discard it and try again.

The Test Engineer runs vitest after every change. Type errors trigger re-implementation. Test failures trigger targeted patches. The loop runs until everything passes or the agent escalates to root.

run tests → check failures → patch → repeat
Based on: AutoResearch pattern — Andrej Karpathy
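The loop can be sketched as a failure-driven fixer. `runTests` and `patch` are hypothetical stand-ins for the real harness (vitest plus the agent's patching step):

```typescript
// Hypothetical harness hooks: runTests returns the names of failing tests,
// patch attempts a targeted fix for one failure.
type RunTests = () => string[];
type Patch = (failure: string) => void;

// Try, measure, keep: re-run the suite after every patch; stop when green,
// or give up after maxAttempts and escalate.
function testFixLoop(runTests: RunTests, patch: Patch, maxAttempts = 10): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const failures = runTests();            // measure
    if (failures.length === 0) return true; // keep — suite is green
    patch(failures[0]);                     // targeted fix for the first failure
  }
  return false; // escalate to the root agent
}
```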
MCP v2

Two-Tool MCP: search() and execute()

Each agent has exactly two tools exposed via MCP: search() for finding things in the codebase, and execute() for running shell commands in its sandbox.

This keeps the agent's tool surface to roughly 1,000 tokens regardless of task complexity. The agent writes arbitrary code against these two primitives. No tool proliferation. Just two powerful primitives that compose into anything.

tools: [search(), execute()] // ~1000 token footprint
MCP protocol — two-tool minimal surface strategy
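An illustrative, in-memory sketch of the two primitives — not the real MCP wiring or the actual tool implementations. A record of file paths stands in for the sandbox filesystem:

```typescript
// Illustrative sandbox contents (hypothetical paths and code).
const files: Record<string, string> = {
  'src/billing.ts': 'export function charge() {}',
  'src/ui/DarkMode.tsx': 'export const DarkMode = () => null;',
};

// search(): find files whose contents match a pattern.
function search(pattern: string): string[] {
  return Object.keys(files).filter((p) => files[p].includes(pattern));
}

// execute(): run a "command" against the sandbox (only `cat` is modeled here).
function execute(cmd: string): string {
  const [bin, arg] = cmd.split(' ');
  if (bin === 'cat' && arg in files) return files[arg];
  return '';
}
```

Everything else — grep, find, builds, tests — composes out of these two calls, which is why the tool surface stays constant as the codebase grows.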
A2A Protocol

A2A-Ready: Every agent is a publishable service

Every agent Agentnetes spawns generates a standard A2A Agent Card describing its capabilities, skills, and endpoints.

Today these cards are internal. Tomorrow any specialist agent can be published as an independent, discoverable service. A Provider Engineer becomes a reusable service any other system can call.

GET /agents/provider-engineer/.well-known/agent.json
A2A Protocol v1.0 — Google Agent-to-Agent standard
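An illustrative card for a hypothetical Provider Engineer. The field names here loosely follow the A2A Agent Card shape; consult the A2A spec for the authoritative schema:

```typescript
// Illustrative only — a sketch of an Agent Card, not a validated A2A instance.
const agentCard = {
  name: 'provider-engineer',
  description: 'Implements AI SDK provider packages',
  url: 'https://example.com/agents/provider-engineer', // hypothetical endpoint
  version: '1.0.0',
  capabilities: { streaming: true },
  skills: [
    {
      id: 'implement-provider',
      name: 'Implement provider',
      description: 'Adds a new @ai-sdk/* provider package to the repo',
    },
  ],
};
```

Served from a well-known path, a card like this is what lets other systems discover and call the agent without prior coordination.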
Kubernetes-inspired

Load Balancing: Work distributed across the swarm

All specialist agents run concurrently via Promise.allSettled. Work is load-balanced across the agent pool automatically. One agent failing never blocks the others.

maxWorkers caps concurrency like Kubernetes resource limits on a node. Promise.allSettled provides fault-tolerant dispatch: a failing Engineer never kills the Scout or Tester. The swarm delivers what it can regardless of individual failures.

Promise.allSettled(workers.map(task => runWorker(task)))
Kubernetes-inspired: parallel execution + fault-tolerant dispatch
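A batch-based sketch of capped, fault-tolerant dispatch with `Promise.allSettled` — an illustration of the stated pattern, not the runtime's actual scheduler:

```typescript
// Dispatch tasks in batches of at most maxWorkers — the cap acts like a
// Kubernetes resource limit. allSettled means a rejected worker never takes
// down its batch-mates; the caller sees every outcome.
async function dispatch<T, R>(
  tasks: T[],
  runWorker: (task: T) => Promise<R>,
  maxWorkers = 6,
): Promise<PromiseSettledResult<R>[]> {
  const settled: PromiseSettledResult<R>[] = [];
  for (let i = 0; i < tasks.length; i += maxWorkers) {
    const batch = tasks.slice(i, i + maxWorkers);
    settled.push(...(await Promise.allSettled(batch.map((t) => runWorker(t)))));
  }
  return settled;
}
```

A true worker pool would start a new task the moment any slot frees up; batching is the simplest form that still bounds concurrency and isolates failures.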
Stack

Built on the bleeding edge

Every layer is the latest available. No legacy versions. No compatibility shims.

AI Runtime
ai (Vercel AI SDK)
v7.0.0-beta.33
ToolLoopAgent
beta agent primitive
@ai-sdk/google
v3.0.52
ToolLoopAgent is the core primitive in AI SDK v7 beta. Each worker runs two MCP tools: search() and execute().
Sandbox
Docker
node:20-alpine · local default
@vercel/sandbox
Firecracker microVMs
Local shell
no-install fallback
Docker is the default sandbox for local runs. Vercel Firecracker available for cloud self-hosting.
Models
Gemini 3.1 Pro
planner · latest
Gemini 2.5 Flash
planner + worker · default
Gemini 3.1 Flash-Lite
worker · fastest
Full Gemini 2.0 · 2.5 · 3.x lineup supported. Separate planner and worker models configurable in the UI.
lib/vrlm/runtime.ts
// Each worker: two tools, one sandbox, emergent role
const agent = new ToolLoopAgent({
  model: google('gemini-2.5-flash'),
  tools: { search, execute },
  stopWhen: stepCountIs(40),
  instructions: buildWorkerPrompt(task),
});
// Start the agent (beta API shape), then drive the tool loop;
// events stream to the UI via SSE
const result = agent.stream({ prompt: task.goal });
for await (const _ of result.fullStream) { ... }
Workflow

One sentence. Dynamic team. Real results.

From goal to dynamic team formation to delivered artifacts in isolated sandboxes.

01

You give a goal

Type what you want built. Not instructions on how. Just what. Agentnetes figures out the rest. The goal can reference a GitHub repo or an uploaded codebase.

02

Root agent explores

The Tech Lead spawns a Firecracker sandbox pre-warmed with the target repo. It uses grep, find, and cat to map the architecture. Context lives in the filesystem, not the prompt.

03

Team self-assembles

Based on what it finds, the root agent invents the team. Roles, goals, and dependencies are all emergent. A provider task gets different specialists than a security audit.

04

Agents work in parallel

Each specialist runs concurrently in its own Firecracker microVM. They explore, write code, run tests, and fix failures. No sequential bottlenecks.

05

Agents collaborate

Test failures and findings route back to the right specialist automatically. The vRLM runtime handles inter-agent communication. No human needed to relay.

06

Synthesis and delivery

The root agent collects all artifacts, verifies completeness, and streams a structured summary to the UI with every generated file. Each agent publishes an A2A card.

Runtime

vRLM: The Orchestration Runtime

Virtual Recursive Language Model Runtime. The engine between your goal and the agents. Inspired by the RLM pattern from MIT CSAIL.

01Plan

Root agent explores the repo with grep and find, then calls the Gemini planner to invent a specialist team. Roles are fully emergent. Nothing is hardcoded.

02Execute

Workers run in parallel. Each gets an isolated Docker container with the repo pre-cloned, and two tools: search() to grep the codebase and execute() to run any shell command.

03Synthesize

When all workers complete, the root agent reads their findings and artifacts, then produces a structured summary. Every generated file is collected and streamed to the UI.

Event Stream · SSE

Every phase emits typed events over Server-Sent Events. The UI subscribes and renders agent activity in real time with no polling.

task-created · new agent spawned
task-updated · status or progress change
task-completed · agent finished with artifacts
task-failed · agent error
finding · agent discovered something
terminal · shell command + output
artifact · file produced by an agent
collaboration · inter-agent finding shared
synthesis · root agent final summary
done · run complete
error · runtime error
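A subset of those event names modeled as a discriminated union; the payload shapes are illustrative guesses, not the runtime's actual types:

```typescript
// Illustrative event payloads — field names beyond `type` are assumptions.
type VrlmEvent =
  | { type: 'task-created'; agent: string }
  | { type: 'task-completed'; agent: string; artifacts: string[] }
  | { type: 'task-failed'; agent: string; error: string }
  | { type: 'finding'; agent: string; text: string }
  | { type: 'done' };

// A UI can fold the stream into a live summary — no polling required.
function summarize(events: VrlmEvent[]): { spawned: number; completed: number; failed: number } {
  const s = { spawned: 0, completed: 0, failed: 0 };
  for (const e of events) {
    if (e.type === 'task-created') s.spawned++;
    else if (e.type === 'task-completed') s.completed++;
    else if (e.type === 'task-failed') s.failed++;
  }
  return s;
}
```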
lib/vrlm/types.ts
interface VrlmConfig {
  maxWorkers: number;        // default 6
  maxStepsPerAgent: number;  // default 20
  plannerModel: string;      // orchestrator
  workerModel: string;       // specialists
  repoUrl: string;           // cloned per agent
  sandboxProvider: 'docker' | 'vercel' | 'e2b' | 'daytona' | 'local';
  googleApiKey?: string;     // UI override
}
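A small helper sketch (hypothetical, not part of the package) that fills a partial config with the defaults noted in the interface comments; the model names are illustrative:

```typescript
type SandboxProvider = 'docker' | 'vercel' | 'e2b' | 'daytona' | 'local';

interface VrlmConfig {
  maxWorkers: number;
  maxStepsPerAgent: number;
  plannerModel: string;
  workerModel: string;
  repoUrl: string;
  sandboxProvider: SandboxProvider;
  googleApiKey?: string;
}

// Apply the documented defaults (maxWorkers 6, maxStepsPerAgent 20,
// Docker as the local sandbox) to a partial config. Only repoUrl is required.
function withDefaults(partial: Partial<VrlmConfig> & { repoUrl: string }): VrlmConfig {
  return {
    maxWorkers: partial.maxWorkers ?? 6,
    maxStepsPerAgent: partial.maxStepsPerAgent ?? 20,
    plannerModel: partial.plannerModel ?? 'gemini-2.5-pro',  // illustrative
    workerModel: partial.workerModel ?? 'gemini-2.5-flash',  // illustrative
    repoUrl: partial.repoUrl,
    sandboxProvider: partial.sandboxProvider ?? 'docker',    // local default
    googleApiKey: partial.googleApiKey,
  };
}
```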
Get Started

Three ways to run

From instant browser demo to full local execution with Docker sandboxes

no setup
Simulation

Watch the full agent lifecycle in your browser. No API key, no Docker, no install. Pre-scripted scenarios replay real event sequences.

Open /demo → toggle Simulation → watch
Try Demo
recommended
Self-host · Local

Clone the repo and run locally. Both simulation and real agent execution available. Configure your Google API key and target repo directly in the UI. No .env file needed.

$ git clone · npm install
$ docker pull node:20-alpine
$ npm run dev → open /demo
Read the docs
CLI
CLI · Any Repo

Install the npm package globally and run against any local git repo. Requires Docker running locally and a Google API key.

$ npm install -g agentnetes
$ export GOOGLE_API_KEY=...
$ agentnetes run "your goal"
npm package
Models

Any model. Same swarm.

Bring your own Google API key and swap models from the UI. No config file needed. Separate planner and worker models let you balance quality vs cost.

Gemini 3.x, 2.5, 2.0 all supported
Separate planner and worker models
BYOK: paste API key directly in the UI
Claude / GPT-4o support · soon
Gemini 3.1 Pro
Planner · latest
new
Gemini 2.5 Flash
Planner + Worker · default
recommended
Gemini 3.1 Flash-Lite
Worker · fastest
fast
Gemini 2.0 Flash
Worker · budget
createGoogleGenerativeAI({
  apiKey: process.env.GOOGLE_API_KEY,
});
Demo

See it in action

One goal in, a swarm of self-discovering agents out. Watch the full pipeline from goal to delivered code.

Agentnetes
Try it now

See the swarm execute

Give it a goal. Watch specialist agents spawn, explore, implement, test, and deliver inside isolated sandboxes. Under 90 seconds.