Give three agents one shared, governed brain — in about 30 minutes, with zero custom code.

Every AI agent you run today is a brilliant amnesiac.

It debugs a gnarly authentication issue at 2 PM, and by 2:05 — new session, new context window — that hard-won knowledge is gone. Worse: if you run several agents, each one re-learns the same lessons in its own silo. Your reviewer agent doesn't know what your dev agent discovered yesterday. Your docs agent contradicts both of them.

The fix isn't a bigger context window. It's shared memory — and once more than one agent can write to it, you immediately need governance: who can read what, who can write what, and an audit trail when something goes wrong.

That's exactly what MemClaw does. It's an open-source (Apache 2.0), MCP-native memory layer for agent fleets: agents write plain text, and MemClaw turns it into enriched, searchable, permissioned memory that improves with use. It's running in production at eToro (NASDAQ: ETOR) with 300+ agents, but the OSS engine spins up on your laptop with one Docker command.

In this tutorial we'll build a small but real three-agent development fleet:

backend-dev — writes code, records decisions and gotchas
code-reviewer — reviews PRs, recalls past decisions before nitpicking
docs-writer — keeps documentation consistent with what the other two actually did

Our agent harness is Claude Code, Anthropic's terminal-based coding agent. I picked it because it's widely used, it speaks MCP natively, and MemClaw ships a one-line skill installer for it. No frameworks, no orchestration code, no SDKs. Just config.

Everything below also works with Cursor, Windsurf, Claude Desktop, or any MCP client — only the config file location changes.

The architecture

┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ Claude Code  │   │ Claude Code  │   │ Claude Code  │
│ backend-dev  │   │ code-reviewer│   │ docs-writer  │
└──────┬───────┘   └──────┬───────┘   └──────┬───────┘
       │       MCP (Streamable HTTP)         │
       └──────────────┬──────────────────────┘
                      ▼
              ┌───────────────────┐
              │     MemClaw       │
              │  (FastAPI + MCP)  │
              │ Postgres+pgvector │
              │      Redis        │
              └───────────────────┘

Each Claude Code instance is an independent agent with its own agent_id. They never talk to each other directly. They communicate through memory — one writes, the others recall. This is the pattern that scales: at eToro, the same loop runs across 300+ agent identifiers with 26,500+ memories and 23 ms p50 search.
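To make the "communicate through memory" pattern concrete, here is a toy Python sketch. It is not how MemClaw works internally: the `SharedMemory` class is hypothetical, and plain keyword overlap stands in for MemClaw's hybrid search. It only shows the shape of the loop, where one agent writes and a different agent recalls.

```python
class SharedMemory:
    """Toy stand-in for a shared memory layer: a list plus keyword-overlap recall."""

    def __init__(self) -> None:
        self._memories: list[tuple[str, str]] = []  # (agent_id, content)

    def write(self, agent_id: str, content: str) -> None:
        """Store one memory, stamped with the agent that authored it."""
        self._memories.append((agent_id, content))

    def recall(self, query: str) -> list[tuple[str, str]]:
        """Return memories sharing at least one lowercase word with the query."""
        wanted = set(query.lower().split())
        return [
            (agent, text)
            for agent, text in self._memories
            if wanted & set(text.lower().split())
        ]


memory = SharedMemory()
# backend-dev records a gotcha...
memory.write("backend-dev", "Redis connection pool maxes at 10")
# ...and any other agent can recall it later without talking to backend-dev.
hits = memory.recall("redis conventions")
print(hits)
```

The agents never reference each other; the only coupling is the store, which is the property the real fleet relies on.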

Step 1 — Spin up MemClaw (5 minutes)

You need Docker, Git, and an OpenAI API key (for embeddings and enrichment — Gemini, Anthropic, and OpenRouter are also supported for the LLM side).

git clone https://github.com/caura-ai/caura-memclaw.git
cd caura-memclaw
cp .env.example .env

Edit .env with the minimal setup:

EMBEDDING_PROVIDER=openai
ENTITY_EXTRACTION_PROVIDER=openai
USE_LLM_FOR_MEMORY_CREATION=true
OPENAI_API_KEY=sk-...
# Embedding model — optional; this is the default. Set it to override.
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

# Single-tenant local mode — simplest auth path
IS_STANDALONE=true

OPENAI_EMBEDDING_MODEL is optional: MemClaw defaults to text-embedding-3-small when it's unset. Pin it to be explicit, or change it to use a different model. The embedding dimension is fixed per deployment, so any replacement model must produce vectors of that same dimension.
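As a rough illustration of the dimension constraint, the Python sketch below compares the default output sizes OpenAI publishes for its embedding models. The helper name `can_swap` is hypothetical and not part of MemClaw, and how MemClaw itself validates a model change is an assumption this tutorial does not cover.

```python
# Default output dimensions published by OpenAI for these models.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def can_swap(current: str, candidate: str) -> bool:
    """True if both models emit vectors of the same default size.

    Hypothetical helper for illustration only; not a MemClaw API.
    """
    return EMBEDDING_DIMS[current] == EMBEDDING_DIMS[candidate]


for candidate in EMBEDDING_DIMS:
    verdict = "compatible" if can_swap("text-embedding-3-small", candidate) else "dimension mismatch"
    print(f"{candidate}: {verdict}")
```

In this sketch, moving from text-embedding-3-small to text-embedding-3-large is flagged as a mismatch, which is the kind of swap the paragraph above warns about.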

Then:

docker compose up -d

This pulls multi-arch images from ghcr.io and brings up Postgres with pgvector, Redis, and the MemClaw API. Verify:

curl http://localhost:8000/api/v1/health
# {"status":"ok","storage":"connected","redis":"connected","event_bus":"ok"}
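If you would rather script this check than eyeball it, a minimal Python sketch might look like the following. It treats the four values in the sample response above as the only healthy state; what MemClaw reports when a component is down is an assumption here, so anything else is simply counted as unhealthy. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

HEALTH_URL = "http://localhost:8000/api/v1/health"

# Values taken from the sample response shown in this step.
EXPECTED = {
    "status": "ok",
    "storage": "connected",
    "redis": "connected",
    "event_bus": "ok",
}


def is_healthy(body: str) -> bool:
    """True only if every expected component reports its healthy value."""
    report = json.loads(body)
    return all(report.get(key) == value for key, value in EXPECTED.items())


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> str:
    """Fetch the raw health payload (needs the stack from this step running)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Once the containers are up, `is_healthy(fetch_health())` should return True.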

Smoke-test the memory pipeline before wiring up any agents:

curl -X POST http://localhost:8000/api/v1/memories \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "agent_id": "quickstart", "content": "Our auth service uses JWT with 15-minute expiry."}'

Every MemClaw write is authored by an agent_id — that identity is what makes the fleet governance in Step 6 possible, so we pass one explicitly. In standalone mode you may omit it (MemClaw falls back to a reserved default identity); on a shared or gateway deployment agent_id is required and an anonymous write is rejected.

Look at the response. You sent one raw sentence; MemClaw returned a memory with an LLM-inferred memory_type, title, summary, tags, status, and weight (importance). That single-pass enrichment — classify, extract entities, scan for PII, detect contradictions, embed — happens on every write. Your agents never have to structure anything.
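The same write can be scripted. The sketch below builds the request from the curl example in Python without sending it, so you can inspect exactly what goes over the wire. The endpoint, header, and the three payload fields come from the smoke test above; the function name `build_write_request` is hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/v1"  # standalone instance from this step


def build_write_request(content: str, agent_id: str,
                        tenant_id: str = "default",
                        api_key: str = "standalone") -> Request:
    """Build (but do not send) the POST /memories request from the smoke test."""
    payload = {"tenant_id": tenant_id, "agent_id": agent_id, "content": content}
    return Request(
        f"{BASE_URL}/memories",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_write_request(
    "Our auth service uses JWT with 15-minute expiry.", agent_id="quickstart"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually perform the write, pass `request` to `urllib.request.urlopen` while the stack is running and decode the JSON body of the response.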

Now search for it with completely different words:

curl -X POST http://localhost:8000/api/v1/search \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"tenant_id": "default", "query": "authentication token lifetime"}'

The memory comes back under an items array, with each hit carrying its similarity, title, agent_id, and visibility:

{ "items": [ {
  "title": "Our auth service uses JWT with 15-minute expiry.",
  "memory_type": "fact", "similarity": 0.47,
  "agent_id": "quickstart", "visibility": "scope_team"
} ] }
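For scripts that consume search results, here is a small Python sketch that reads a response of the shape shown above and prints one line per hit. It uses only the fields visible in the sample; the `min_similarity` cutoff is applied client-side in this sketch, and the function name is hypothetical.

```python
import json

# The sample search response from above, verbatim.
SAMPLE_RESPONSE = """
{ "items": [ {
  "title": "Our auth service uses JWT with 15-minute expiry.",
  "memory_type": "fact", "similarity": 0.47,
  "agent_id": "quickstart", "visibility": "scope_team"
} ] }
"""


def summarize_hits(raw: str, min_similarity: float = 0.0) -> list[str]:
    """Return one line per hit at or above min_similarity, best match first."""
    items = json.loads(raw).get("items", [])
    kept = [hit for hit in items if hit["similarity"] >= min_similarity]
    kept.sort(key=lambda hit: hit["similarity"], reverse=True)
    return [
        f'{hit["similarity"]:.2f}  {hit["agent_id"]}  {hit["title"]}'
        for hit in kept
    ]


for line in summarize_hits(SAMPLE_RESPONSE):
    print(line)
```

Raising `min_similarity` to 0.5 would drop the sample hit, which shows why a paraphrased query with a modest score can still be the right answer.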

Hybrid search blends pgvector semantic similarity, keyword matching, and knowledge-graph expansion. Memory layer: done. (Prefer not to self-host? Sign up at memclaw.net, grab an mc_ API key, and skip this step — everything else is identical.)

Step 2 — Connect Claude Code via MCP (2 minutes)

Add MemClaw as an MCP server. Register it at user scope (-s user) so it's available in every directory — Step 5 runs Claude Code from three different folders, and the default local scope would register it only for the one you're standing in now:

claude mcp add --transport http -s user memclaw http://localhost:8000/mcp \
  --header "X-API-Key: standalone"

Prefer a config file? Commit a .mcp.json to a project root — Claude Code reads that, but not settings.json. (For a multi-directory fleet like ours, the -s user command above is simpler.)

{
  "mcpServers": {
    "memclaw": {
      "type": "http",
      "url": "http://localhost:8000/mcp",
      "headers": { "X-API-Key": "standalone" }
    }
  }
}

Confirm with claude mcp list — you should see memclaw: ✓ Connected. Claude Code now auto-discovers all 12 MemClaw tools: memclaw_write, memclaw_recall, memclaw_list, memclaw_manage, memclaw_doc, memclaw_entity_get, memclaw_tune, memclaw_insights, memclaw_evolve, memclaw_stats, memclaw_keystones, and memclaw_keystones_set.

Step 3 — Install the skill (1 minute)

Tools tell an agent what it can call. A skill tells it when and how — the recall-before-acting habit, the write-after-learning habit, what makes a good memory versus noise. MemClaw ships its usage guide as a Claude Code skill, served by your own instance:

curl -s "http://localhost:8000/api/v1/install-skill" > /tmp/install-memclaw-skill.sh
less /tmp/install-memclaw-skill.sh   # always inspect before running
bash /tmp/install-memclaw-skill.sh

Verify and restart Claude Code (skills load at startup):

ls -la ~/.claude/skills/memclaw/SKILL.md

The skill is loaded on demand, not injected into every turn — it costs nothing until the agent actually reaches for memory. This step is the difference between an agent that can use memory and one that does.

Step 4 — Give each agent an identity

Here's the move that turns "Claude Code with memory" into "a fleet."

Every MemClaw tool accepts an agent_id — the caller's identity. Agents auto-register on first write and get a trust tier. Memory authorship, retrieval tuning, and governance all hang off this identity. So each of our three agents needs to consistently identify itself.

The simplest mechanism in Claude Code is the project-level CLAUDE.md memory file. Create three working directories (or git worktrees), one per agent, and drop an identity block into each.

~/fleet/backend-dev/CLAUDE.md:

# Agent identity
You are agent `backend-dev` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="backend-dev" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE starting any task: memclaw_recall for relevant context
  (prior decisions, known gotchas, conventions).
- AFTER completing any task: memclaw_write what you learned —
  decisions made, bugs found, anything a teammate would want to know.
- Write facts that age well. Skip transient noise.

~/fleet/code-reviewer/CLAUDE.md:

# Agent identity
You are agent `code-reviewer` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="code-reviewer" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE reviewing: memclaw_recall the relevant architectural decisions
  and conventions, so you review against what the team agreed — not
  against your own taste.
- AFTER reviewing: memclaw_write recurring issues you flagged, so the
  fleet stops repeating them.

~/fleet/docs-writer/CLAUDE.md:

# Agent identity
You are agent `docs-writer` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="docs-writer" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE writing docs: memclaw_recall what backend-dev and code-reviewer
  recorded about the feature. Docs describe what was BUILT, not what
  was planned.
- AFTER writing: memclaw_write a pointer to what you documented.

That's the entire orchestration layer. No message bus, no LangGraph, no shared scratchpad files. Identity plus a shared memory substrate.
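If you end up with more than a handful of agents, you may want to stamp out the identity block rather than copy it by hand. The Python sketch below is one way to do that, assuming the `~/fleet/<agent>/CLAUDE.md` layout used above. It only renders the identity section; the memory-discipline rules differ per agent and are still written by hand. Both function names are hypothetical.

```python
from pathlib import Path

IDENTITY_TEMPLATE = """\
# Agent identity
You are agent `{agent_id}` in fleet `{fleet_id}`.
On EVERY MemClaw tool call, pass agent_id="{agent_id}" and fleet_id="{fleet_id}".
"""


def render_identity(agent_id: str, fleet_id: str = "dev-fleet") -> str:
    """Return the identity block for one agent as text."""
    return IDENTITY_TEMPLATE.format(agent_id=agent_id, fleet_id=fleet_id)


def write_identity(root: Path, agent_id: str, fleet_id: str = "dev-fleet") -> Path:
    """Create <root>/<agent_id>/CLAUDE.md if it does not exist yet.

    An existing file is left untouched so hand-written rules survive.
    """
    target = root / agent_id / "CLAUDE.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        target.write_text(render_identity(agent_id, fleet_id), encoding="utf-8")
    return target


print(render_identity("backend-dev"))
```

Calling `write_identity(Path.home() / "fleet", name)` for each of the three agent names would create the directories used in the next step.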

Step 5 — Watch knowledge flow

Open three terminals and run claude in each agent's directory. These are pure memory tasks — no repo or sample code required; the point is to watch one agent's knowledge surface in another.

Terminal 1 (backend-dev):

"We've decided to rate-limit the payments API at 100 req/min per key using a sliding window in Redis. Record that decision and any Redis gotchas worth flagging for the team."

The agent calls memclaw_write with something like "Payments API rate limiting: 100 req/min per API key, sliding-window algorithm in Redis. Gotcha: the Redis connection pool maxes at 10 — raise REDIS_POOL_SIZE before adding more middleware that touches Redis." MemClaw classifies it (a decision and a rule, probably), extracts entities (Redis, payments API), embeds it, and stamps it scope_team — visible to the whole fleet by default.

Terminal 2 (code-reviewer):

"I'm about to add a Redis-backed feature-flag cache. Check our team's memory for Redis conventions and known issues I should follow before I start — and record any convention worth flagging for the team."

The agent calls memclaw_recall("Redis conventions and known issues") — and gets backend-dev's pool-size warning back: "Heads up — the Redis connection pool is capped at 10 and the rate limiter already consumes connections; bump REDIS_POOL_SIZE or a feature-flag cache will starve it under load."

The reviewer never spoke to backend-dev, and that warning lives in no codebase. One agent discovered; another recalled. That's the moment the fleet stops being three isolated amnesiacs and starts compounding. (Per its CLAUDE.md, code-reviewer also writes a short convention note here, which registers it as an agent, something Step 7's per-agent tuning relies on.)

Terminal 3 (docs-writer):

"Using our team's memory, document the current payments API rate-limiting decision."

It recalls both the original decision and the reviewer's note, and produces docs that match what the team actually decided — including the operational caveat that exists nowhere but in the fleet's shared memory.

Step 6 — Governance: the part everyone skips until it bites

Shared memory without permissions is a liability. Three MemClaw mechanisms matter even at three agents:

Visibility scopes. Every memory is stamped at write time: scope_agent (private to the author), scope_team (fleet-wide — the default), or scope_org (cross-fleet). When backend-dev writes a half-baked hypothesis it isn't sure about yet, it can keep it scope_agent until confirmed. Cross-fleet recall is permissioned, not open.
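The three scopes can be pictured with a toy Python model. This is not MemClaw's enforcement code: the `Memory` class and `can_recall` function are hypothetical, and cross-fleet permission is reduced to a single boolean as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    author: str      # agent_id that wrote the memory
    fleet: str       # fleet_id of the author
    visibility: str  # "scope_agent", "scope_team", or "scope_org"


def can_recall(memory: Memory, reader: str, reader_fleet: str,
               cross_fleet_allowed: bool = False) -> bool:
    """Toy model of the three scopes; NOT MemClaw's actual access check."""
    if memory.visibility == "scope_agent":
        return reader == memory.author
    if memory.visibility == "scope_team":
        return reader_fleet == memory.fleet
    if memory.visibility == "scope_org":
        # Same-fleet readers qualify; other fleets need explicit permission
        # (assumption: modelled here as one flag).
        return reader_fleet == memory.fleet or cross_fleet_allowed
    raise ValueError(f"unknown visibility: {memory.visibility}")


draft = Memory("backend-dev", "dev-fleet", "scope_agent")
shared = Memory("backend-dev", "dev-fleet", "scope_team")
print(can_recall(draft, "code-reviewer", "dev-fleet"))   # private draft
print(can_recall(shared, "code-reviewer", "dev-fleet"))  # fleet-wide memory
```

In this model the reviewer cannot see the draft hypothesis but can see the team-scoped memory, which matches the behavior described above.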

Trust tiers. Agents carry a trust level (0–3) controlling cross-fleet reads, writes, and deletes. New agents auto-register at a baseline; you promote them deliberately:

curl -X PATCH http://localhost:8000/api/v1/agents/backend-dev/trust \
  -H "X-API-Key: standalone" \
  -H "Content-Type: application/json" \
  -d '{"trust_level": 2}'

A trust-1 agent can manage its own memories; cross-agent operations require trust 2+. A compromised or buggy low-trust agent simply can't corrupt or delete the fleet's shared knowledge.
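The rule in that last paragraph fits in a few lines of Python. The sketch below is a toy check, not MemClaw's implementation; `may_modify` is a hypothetical name, and treating trust 0 as unable to modify anything is an assumption.

```python
def may_modify(actor: str, actor_trust: int, author: str) -> bool:
    """Toy version of the rule: own memories at trust 1+, others' at trust 2+.

    Assumption: trust 0 cannot modify anything. Not MemClaw's real check.
    """
    if not 0 <= actor_trust <= 3:
        raise ValueError("trust level must be between 0 and 3")
    required = 1 if actor == author else 2
    return actor_trust >= required


# A trust-1 reviewer can edit its own notes but not backend-dev's.
print(may_modify("code-reviewer", 1, "code-reviewer"))
print(may_modify("code-reviewer", 1, "backend-dev"))
```

Promoting an agent with the PATCH call above is what moves it across the trust-2 line in this model.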

The audit log. Every write, read, delete, and status transition is logged with tenant, agent, and scope context:

curl "http://localhost:8000/api/v1/audit-log" -H "X-API-Key: standalone"

When someone asks "why does the docs agent believe X?" you have a provenance chain, not a shrug. PII is auto-detected at write time and flagged in each memory's metadata (contains_pii, pii_types), so you can review or filter sensitive writes.
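As a sketch of how the PII flags could be used in a review script, the Python below filters a list of memories down to the flagged ones. The `contains_pii` and `pii_types` names come from the text above, but the nesting under a `metadata` key, the `"email"` value, and the two sample records are made up for illustration.

```python
def pii_flagged(memories: list[dict]) -> list[dict]:
    """Return the memories whose metadata marks them as containing PII."""
    return [m for m in memories if m.get("metadata", {}).get("contains_pii")]


# Hypothetical records; the real response layout may differ.
SAMPLE_MEMORIES = [
    {"title": "JWT expiry is 15 minutes",
     "metadata": {"contains_pii": False, "pii_types": []}},
    {"title": "On-call contact details",
     "metadata": {"contains_pii": True, "pii_types": ["email"]}},
]

for memory in pii_flagged(SAMPLE_MEMORIES):
    print(memory["title"], memory["metadata"]["pii_types"])
```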

None of this required configuration. It's built in, not bolted on — which is precisely the argument for using a governed memory layer instead of pointing three agents at a shared vector store and hoping.

Step 7 — Close the loop: a fleet that learns

So far the fleet shares. MemClaw's third pillar is that it improves:

Outcome reporting. After acting on recalled memories, an agent reports back via memclaw_evolve — success, failure, or partial, with the memory IDs that influenced the action. Successes reinforce memory weights; failures auto-generate preventive rule-type memories. (The project calls this the Karpathy Loop.) One fleet nuance: memclaw_evolve defaults to scope="agent", which only reinforces the caller's own memories. To reinforce a teammate's recalled memory, the reporting agent must pass scope="fleet" (with fleet_id) and hold trust 2+ (the same promotion from Step 6) — at the default scope="agent" the update is skipped as out-of-scope, and even at scope="fleet" a trust-1 agent can't move another agent's weight. Add one line to each CLAUDE.md: "After acting on recalled memories, report the outcome with memclaw_evolve — pass scope="fleet" when the memory came from another agent."
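The scope rule is easy to get wrong, so here is a Python sketch of the decision an agent has to make before calling memclaw_evolve. The `scope`, `fleet_id`, and `agent_id` names come from the text above; `outcome`, `memory_ids`, and the function itself are assumptions for illustration, not the tool's documented schema.

```python
def evolve_arguments(agent_id: str, fleet_id: str, outcome: str,
                     memory_authors: dict[str, str]) -> dict:
    """Build arguments for a memclaw_evolve call.

    memory_authors maps each recalled memory ID to the agent_id that wrote it.
    Names other than agent_id, fleet_id, and scope are assumptions.
    """
    if outcome not in {"success", "failure", "partial"}:
        raise ValueError(f"unsupported outcome: {outcome}")
    # Any memory written by someone else needs fleet scope to be reinforced.
    foreign = any(author != agent_id for author in memory_authors.values())
    args = {
        "agent_id": agent_id,
        "outcome": outcome,
        "memory_ids": sorted(memory_authors),
        "scope": "fleet" if foreign else "agent",
    }
    if foreign:
        args["fleet_id"] = fleet_id  # passed together with scope="fleet"
    return args


print(evolve_arguments("code-reviewer", "dev-fleet", "success",
                       {"m1": "backend-dev", "m2": "code-reviewer"}))
```

Choosing the right scope is only half of it: as noted above, the reporting agent still needs trust 2+ for the update to a teammate's memory to apply.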

Per-agent retrieval tuning. memclaw_tune lets each agent adjust its own retrieval profile. Your code-reviewer might want precision (higher min_similarity, lower top_k); your docs-writer wants breadth (more graph hops, more results). Search quality compounds per agent, per feedback signal.
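To picture the precision-versus-breadth split, here is a Python sketch of two retrieval profiles. `min_similarity` and `top_k` are named in the text above; the `graph_hops` key, every numeric value, and the `tune_arguments` helper are illustrative assumptions rather than recommended settings or the tool's exact schema.

```python
# Illustrative values only; tune against your own recall quality.
PROFILES = {
    # Precision: fewer, closer matches.
    "code-reviewer": {"min_similarity": 0.6, "top_k": 5, "graph_hops": 1},
    # Breadth: more results and wider graph expansion.
    "docs-writer": {"min_similarity": 0.3, "top_k": 20, "graph_hops": 3},
}


def tune_arguments(agent_id: str) -> dict:
    """Arguments for a memclaw_tune call: the agent's identity plus its profile."""
    return {"agent_id": agent_id, **PROFILES[agent_id]}


for name in PROFILES:
    print(tune_arguments(name))
```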

Hygiene runs automatically. The LLM crystallizer merges near-duplicate memories into canonical atomic facts with provenance. Contradiction detection (RDF triples + LLM analysis) supersedes stale facts when the team changes its mind. An 8-status lifecycle retires what's outdated. Run memclaw_insights with focus contradictions or stale periodically — findings persist as insight memories that future runs build on.

The result is a flywheel: write → recall → act → report → better recall. Every interaction makes the next one smarter.

Where to go from here

You now have a working three-agent fleet with shared, governed, self-improving memory — built entirely from config files. Scaling paths:

More agents is just more CLAUDE.md identities. The pattern is identical at 3 or 300.
Running an OpenClaw fleet? MemClaw installs as a gateway plugin with one command and auto-stamps fleet_id on every write — no per-agent prompting needed.
Skip the infra with the managed platform at memclaw.net (free tier: 10K memories) — same MCP endpoint, governance dashboard included.
Mixed fleets: anything that speaks MCP joins the same brain — Cursor, Windsurf, Claude Desktop, Codex, custom agents via REST.
Production reference: read the eToro "Company Brain" case study — 300+ agents, 26,500+ memories, 1,372 shared skills, 23 ms p50 search.

The repo is at github.com/caura-ai/caura-memclaw (Apache 2.0). Star it, break it, file issues — and come argue about agent memory on Discord.

Your agents are already smart. Stop making them start from zero.

MemClaw is built by Caura.ai. All benchmarks and limits referenced are from the project README and memclaw.net as of June 2026.

Building a Multi-Agent Fleet with MemClaw and Claude Code