Build an agent fleet on MemClaw Cloud

The managed (Cloud) tutorial — a self-contained, end-to-end build. Wire a governed three-agent fleet to memclaw.net: sign up, connect Claude Code, and watch shared memory compound, with just an API key.

MemClaw runs the engine and the Prism dashboard for you; you bring an mc_ key. Just want to connect a single agent fast? That's the Quickstart. Prefer to run MemClaw yourself? The self-hosted OSS series builds the same fleet on your own Docker stack.

Every AI agent you run today is a brilliant amnesiac.

It debugs a gnarly authentication issue at 2 PM, and by 2:05 — new session, new context window — that hard-won knowledge is gone. Worse: if you run several agents, each one re-learns the same lessons in its own silo. Your reviewer agent doesn't know what your dev agent discovered yesterday. Your docs agent contradicts both of them.

The fix is shared memory — and once more than one agent can write to it, you immediately need governance: who can read what, who can write what, and an audit trail when something goes wrong. That's exactly what MemClaw does: an MCP-native memory layer for agent fleets where agents write plain text and MemClaw turns it into enriched, searchable, permissioned memory that improves with use. It's built for fleets running hundreds of agents in production.

MemClaw Cloud is the managed platform: a hosted memory engine with 12 MCP tools, built-in governance, and the Prism dashboard, fully operated for you. In this tutorial we'll wire up a real three-agent development fleet against it, with just an API key:

  • backend-dev — writes code, records decisions and gotchas
  • code-reviewer — reviews PRs, recalls past decisions before nitpicking
  • docs-writer — keeps documentation consistent with what the other two actually did

Our agent harness is Claude Code — Anthropic's terminal-based coding agent. It speaks MCP natively, and MemClaw ships a one-line skill installer for it. No frameworks, no orchestration code, no SDKs. Just a key and some config.


The architecture

┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ Claude Code  │   │ Claude Code  │   │ Claude Code  │
│ backend-dev  │   │ code-reviewer│   │ docs-writer  │
└──────┬───────┘   └──────┬───────┘   └──────┬───────┘
       │                  │                  │
       └──────────────────┼──────────────────┘
                          │  MCP over HTTPS · X-API-Key: mc_…
                          ▼
                ┌───────────────────┐
                │   MemClaw Cloud   │
                │ Managed engine +  │
                │   hosted MCP      │
                │ Prism dashboard   │
                │  api.memclaw.net  │
                └───────────────────┘
             one shared, governed brain

Each Claude Code instance is an independent agent with its own agent_id. They never talk to each other directly. They communicate through memory — one writes, the others recall. This is the pattern that scales: the same loop runs unchanged across hundreds of agent identifiers and tens of thousands of memories, at single-digit-millisecond search latency.


Step 1 — Create a project & key (2 minutes)

Sign up at memclaw.net and create a project — that's your isolated tenant. In the project's API keys screen, mint a key (it starts with mc_) and copy your API base URL from the dashboard. Treat the key like a password; it scopes every call to your project.

Export both so the commands below are copy-paste:

# Both values come straight from your MemClaw Cloud dashboard.
export MEMCLAW_URL=https://api.memclaw.net   # your project's API base
export MEMCLAW_KEY=mc_xxxxxxxxxxxxxxxxxxxx   # your project API key

Smoke-test the memory pipeline before wiring up any agents. On Cloud your key already identifies the tenant, so there's no tenant_id to pass:

curl -X POST "$MEMCLAW_URL/api/v1/memories" \
  -H "X-API-Key: $MEMCLAW_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "quickstart", "content": "Our auth service uses JWT with 15-minute expiry."}'

Note — Every MemClaw write is authored by an agent_id — that identity is what makes the fleet governance in Step 6 possible. On Cloud it's required: a gated deployment rejects an anonymous write with a 422.

Look at the response. You sent one raw sentence; MemClaw returned a memory with an LLM-inferred memory_type, title, summary, tags, status, and weight (importance). That single-pass enrichment — classify, extract entities, scan for PII, detect contradictions, embed — happens on every write. Your agents never have to structure anything.

Now search for it with completely different words:

curl -X POST "$MEMCLAW_URL/api/v1/search" \
  -H "X-API-Key: $MEMCLAW_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication token lifetime", "top_k": 5}'

The match comes back under an items array, with each hit carrying its similarity, title, agent_id, and visibility:

{ "items": [ {
  "title": "Our auth service uses JWT with 15-minute expiry.",
  "memory_type": "fact", "similarity": 0.47,
  "agent_id": "quickstart", "visibility": "scope_team"
} ] }

Hybrid search blends vector semantic similarity, keyword matching, and knowledge-graph expansion, which is why a differently worded query still finds the memory. Note that REST /search caps top_k at 20; the memclaw_recall MCP tool your agents call has no such cap. Memory layer: done.
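The two REST calls in this step can also be driven from a script. The sketch below uses only Python's standard library and only the endpoints, header, and body fields shown in the curl commands above. The function names and the client-side clamp on top_k are illustrative choices, not part of any MemClaw SDK. Requests are built separately from being sent, so you can inspect them first.

```python
import json
import urllib.request

REST_TOP_K_CAP = 20  # REST /search cap on top_k, as stated in this tutorial


def _post(base_url: str, path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (not send) an authenticated JSON POST against the MemClaw REST API."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def write_request(base_url: str, api_key: str, agent_id: str, content: str) -> urllib.request.Request:
    """Mirror of the curl write: every memory needs an authoring agent_id."""
    if not agent_id:
        raise ValueError("agent_id is required; anonymous writes are rejected")
    return _post(base_url, "/api/v1/memories", api_key,
                 {"agent_id": agent_id, "content": content})


def search_request(base_url: str, api_key: str, query: str, top_k: int = 5) -> urllib.request.Request:
    """Mirror of the curl search, clamping top_k to the REST cap client-side."""
    top_k = max(1, min(top_k, REST_TOP_K_CAP))
    return _post(base_url, "/api/v1/search", api_key, {"query": query, "top_k": top_k})


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With MEMCLAW_URL and MEMCLAW_KEY read from the environment, send(search_request(url, key, "authentication token lifetime")) should return the same items array as the curl call.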


Step 2 — Connect Claude Code via MCP (2 minutes)

Add MemClaw Cloud as an MCP server. Register it at user scope (-s user) so it's available in every directory — Step 5 runs Claude Code from three different folders, and the default local scope would register it only for the one you're standing in now:

claude mcp add --transport http -s user memclaw "$MEMCLAW_URL/mcp" \
  --header "X-API-Key: $MEMCLAW_KEY"

Prefer a config file? Commit a .mcp.json to a project root — Claude Code reads that, but not settings.json. Keep your mc_ key out of version control: reference an env var, or use the -s user command above.

{
  "mcpServers": {
    "memclaw": {
      "type": "http",
      "url": "https://api.memclaw.net/mcp",
      "headers": { "X-API-Key": "${MEMCLAW_KEY}" }
    }
  }
}

Confirm with claude mcp list — you should see memclaw: ✓ Connected. Claude Code now auto-discovers all 12 MemClaw tools: memclaw_write, memclaw_recall, memclaw_list, memclaw_manage, memclaw_doc, memclaw_entity_get, memclaw_tune, memclaw_insights, memclaw_evolve, memclaw_stats, memclaw_keystones, and memclaw_keystones_set.


Step 3 — Install the skill (1 minute)

Tools tell an agent what it can call. A skill tells it when and how — the recall-before-acting habit, the write-after-learning habit, what makes a good memory versus noise. MemClaw Cloud serves its usage guide as a Claude Code skill straight from your endpoint:

# -f makes curl fail on an HTTP error instead of saving the error page as a script
curl -fsS "$MEMCLAW_URL/api/v1/install-skill" -H "X-API-Key: $MEMCLAW_KEY" > /tmp/install-memclaw-skill.sh
less /tmp/install-memclaw-skill.sh   # always inspect before running
bash /tmp/install-memclaw-skill.sh

Verify and restart Claude Code (skills load at startup):

ls -la ~/.claude/skills/memclaw/SKILL.md

Claude Code discovers the skill at startup but loads its body on demand rather than injecting it into every turn, so it costs almost nothing until the agent actually reaches for memory. This step is the difference between an agent that can use memory and one that does.


Step 4 — Give each agent an identity

Here's the move that turns "Claude Code with memory" into "a fleet."

Every MemClaw tool accepts an agent_id — the caller's identity. Agents auto-register on first write and get a trust tier. Memory authorship, retrieval tuning, and governance all hang off this identity. So each of our three agents needs to consistently identify itself.

The simplest mechanism in Claude Code is the project-level CLAUDE.md memory file. Create three working directories (or git worktrees), one per agent, and drop an identity block into each.

~/fleet/backend-dev/CLAUDE.md:

# Agent identity
You are agent `backend-dev` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="backend-dev" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE starting any task: memclaw_recall for relevant context
  (prior decisions, known gotchas, conventions).
- AFTER completing any task: memclaw_write what you learned —
  decisions made, bugs found, anything a teammate would want to know.
- Write facts that age well. Skip transient noise.

~/fleet/code-reviewer/CLAUDE.md:

# Agent identity
You are agent `code-reviewer` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="code-reviewer" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE reviewing: memclaw_recall the relevant architectural decisions
  and conventions, so you review against what the team agreed — not
  against your own taste.
- AFTER reviewing: memclaw_write recurring issues you flagged, so the
  fleet stops repeating them.

~/fleet/docs-writer/CLAUDE.md:

# Agent identity
You are agent `docs-writer` in fleet `dev-fleet`.
On EVERY MemClaw tool call, pass agent_id="docs-writer" and fleet_id="dev-fleet".

# Memory discipline
- BEFORE writing docs: memclaw_recall what backend-dev and code-reviewer
  recorded about the feature. Docs describe what was BUILT, not what
  was planned.
- AFTER writing: memclaw_write a pointer to what you documented.

That's the entire orchestration layer. No message bus, no LangGraph, no shared scratchpad files. Identity plus a shared memory substrate.
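Since the three identity files differ only in the agent name and two discipline lines, they can be generated from one template. The following is a minimal sketch using the agent names, fleet name, and directory layout from this step. The helper names (render, write_fleet) and the shortened discipline wording are illustrative, so edit the output by hand to match the fuller blocks above.

```python
from pathlib import Path

IDENTITY_TEMPLATE = """# Agent identity
You are agent `{agent_id}` in fleet `{fleet_id}`.
On EVERY MemClaw tool call, pass agent_id="{agent_id}" and fleet_id="{fleet_id}".

# Memory discipline
- BEFORE {before}
- AFTER {after}
"""

# agent_id -> (what to do before a task, what to do after it)
AGENTS = {
    "backend-dev": (
        "starting any task: memclaw_recall for relevant context.",
        "completing any task: memclaw_write what you learned.",
    ),
    "code-reviewer": (
        "reviewing: memclaw_recall the relevant decisions and conventions.",
        "reviewing: memclaw_write recurring issues you flagged.",
    ),
    "docs-writer": (
        "writing docs: memclaw_recall what the other agents recorded.",
        "writing: memclaw_write a pointer to what you documented.",
    ),
}


def render(agent_id: str, fleet_id: str, before: str, after: str) -> str:
    """Fill the identity template for one agent."""
    return IDENTITY_TEMPLATE.format(
        agent_id=agent_id, fleet_id=fleet_id, before=before, after=after
    )


def write_fleet(root: Path, fleet_id: str = "dev-fleet", overwrite: bool = False) -> list[Path]:
    """Create <root>/<agent_id>/CLAUDE.md for each agent; return the files written."""
    written = []
    for agent_id, (before, after) in AGENTS.items():
        target = root / agent_id / "CLAUDE.md"
        if target.exists() and not overwrite:
            continue  # never clobber an identity file that was edited by hand
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(render(agent_id, fleet_id, before, after), encoding="utf-8")
        written.append(target)
    return written
```

Calling write_fleet(Path.home() / "fleet") creates the three directories used in this tutorial. Adding a fourth agent is one more entry in AGENTS.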


Step 5 — Watch knowledge flow

Open three terminals, run claude in each agent's directory, and try the canonical loop.

Terminal 1 (backend-dev):

"Add rate limiting to the payments API. We decided on 100 req/min per key using a sliding window in Redis — record the decision and any gotchas you hit."

The agent does the work, then calls memclaw_write with something like "Payments API rate limiting: 100 req/min per API key, sliding-window algorithm in Redis. Gotcha: the Redis connection pool maxes at 10 — raise REDIS_POOL_SIZE before adding more middleware that touches Redis." MemClaw classifies it (a decision and a rule, probably), extracts entities (Redis, payments API), embeds it, and stamps it scope_team — visible to the whole fleet by default.

Terminal 2 (code-reviewer), later, reviewing an unrelated PR that touches Redis:

"Review this PR that adds a Redis-backed feature-flag cache."

Before opining, the agent calls memclaw_recall("Redis conventions and known issues") — and gets backend-dev's pool-size warning. The review comes back with: "Heads up — the Redis connection pool is capped at 10 and the rate limiter already consumes connections; bump REDIS_POOL_SIZE or this will starve under load."

The reviewer never saw that code being written. It never spoke to backend-dev. One agent discovered; another recalled. That's the moment the fleet stops being three isolated amnesiacs and starts compounding.

Terminal 3 (docs-writer):

"Document the payments API rate limiting."

It recalls both the original decision and the review note, and produces docs that match reality — including the operational caveat that exists nowhere in the code comments. And because this is Cloud, you can open Prism — your hosted dashboard — right now and watch these three memories appear, with their authors, scopes, and the entity graph drawn between them.


Step 6 — Governance: enforced, not theoretical

Shared memory without permissions is a liability. On Cloud, three MemClaw mechanisms are live from your first write, enforced per credential — your mc_ key only does what its trust tier allows.

Visibility scopes. Every memory is stamped at write time: scope_agent (private to the author), scope_team (fleet-wide — the default), or scope_org (cross-fleet). When backend-dev writes a half-baked hypothesis it isn't sure about yet, it keeps it scope_agent until confirmed. The boundary is enforced in the query layer — a private memory simply isn't returned to a caller who isn't its author.
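To make the scope rule concrete, here is a toy model of query-time filtering. It is not MemClaw's implementation: the Memory class, the function names, and the assumption that scope_org means "every caller in the same project" are all hypothetical. Only the three scope names come from this tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    author: str       # agent_id that wrote the memory
    fleet: str        # fleet_id the author belongs to
    visibility: str   # "scope_agent", "scope_team", or "scope_org"
    content: str


def visible_to(memory: Memory, caller: str, caller_fleet: str) -> bool:
    """Toy visibility rule: private to the author, shared in the fleet, or org-wide."""
    if memory.visibility == "scope_agent":
        return memory.author == caller
    if memory.visibility == "scope_team":
        return memory.fleet == caller_fleet
    if memory.visibility == "scope_org":
        return True  # assumption: every caller here shares one org/project
    raise ValueError(f"unknown visibility: {memory.visibility!r}")


def recall(memories: list[Memory], caller: str, caller_fleet: str) -> list[Memory]:
    """Filter at query time, so hidden memories never reach the caller."""
    return [m for m in memories if visible_to(m, caller, caller_fleet)]
```

In this model, a scope_agent hypothesis written by backend-dev is returned to backend-dev but is simply absent from code-reviewer's results, which is the behavior the paragraph above describes.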

Trust tiers. Agents carry a trust level (0–3) controlling cross-agent reads, writes, and deletes. New agents auto-register at a baseline; you promote them deliberately from the dashboard (Agents → trust), or over the API:

curl -X PATCH "$MEMCLAW_URL/api/v1/agents/backend-dev/trust" \
  -H "X-API-Key: $MEMCLAW_KEY" \
  -H "Content-Type: application/json" \
  -d '{"trust_level": 2}'

A trust-1 agent manages its own memories; cross-agent and fleet-wide operations require trust 2+. A compromised or buggy low-trust agent simply can't reach the fleet's shared knowledge to corrupt or delete it.
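The trust rule can be read as a small decision function. The sketch below encodes only what this step states (levels 0 to 3, own memories at trust 1, cross-agent operations at trust 2+). The function name is hypothetical, and treating trust 0 as unable to modify anything is an assumption, since the tutorial does not say what level 0 permits.

```python
def can_modify(actor: str, actor_trust: int, memory_author: str) -> bool:
    """Toy check: may `actor` update or delete a memory written by `memory_author`?"""
    if not 0 <= actor_trust <= 3:
        raise ValueError("trust level must be between 0 and 3")
    if actor == memory_author:
        # Trust 1 manages its own memories; trust 0 is assumed unable to modify.
        return actor_trust >= 1
    # Cross-agent and fleet-wide operations require trust 2 or higher.
    return actor_trust >= 2
```

Under this model, a buggy trust-1 docs-writer can damage only its own notes: can_modify("docs-writer", 1, "backend-dev") is False.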

Keystones. Scopes and trust govern memories; keystones govern behavior. A keystone is a mandatory policy every agent in scope reads at session start and must obey — even when it conflicts with the agent's own prompt or a user's instruction. Set one in the dashboard, or with the memclaw_keystones_set tool (a fleet-wide rule needs trust 2+). Your agents pull the scope-merged rule set with memclaw_keystones at the start of every session.

The audit log & PII. Every write, recall, transition, delete, trust change, and keystone edit is logged with agent and scope context — browse it in the dashboard when someone asks "why does the docs agent believe X?" And MemClaw scans every write for PII during enrichment, stamping contains_pii / pii_types on the memory so sensitive content is flagged on the way in, not discovered after a leak.

Prism, from day one — Everything above is visible in Prism, the hosted dashboard — browse and search memories, walk the knowledge graph, drive status transitions, promote agent trust, and read the audit trail. It's there the moment you create a project.


Step 7 — Close the loop: a fleet that learns and grooms itself

So far the fleet shares and is governed. MemClaw's third pillar is that it improves — and on Cloud the maintenance side runs on a schedule you don't manage.

Outcome reporting (the Karpathy Loop). After acting on recalled memories, an agent reports back via memclaw_evolve: success, failure, or partial, along with the IDs of the memories that influenced the action. Successes raise those memories' weights; failures lower them and auto-generate a preventive rule so the fleet doesn't repeat the mistake. To reinforce a teammate's memory (not just your own), report with scope="fleet" and hold trust 2+. Add one line to each CLAUDE.md: "After acting on recalled memories, report the outcome with memclaw_evolve."

Per-agent retrieval tuning. memclaw_tune lets each agent shape its own retrieval profile — your code-reviewer wants precision (higher min_similarity, lower top_k); your docs-writer wants breadth (more graph hops, more results). Search quality compounds per agent, per feedback signal.
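The precision-versus-breadth trade-off is easy to see on a list of hits shaped like the items array from Step 1. This is a client-side illustration only: memclaw_tune works server-side and its parameter schema is not shown in this tutorial. The RetrievalProfile class, apply_profile, and the threshold values are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalProfile:
    min_similarity: float  # drop hits scoring below this
    top_k: int             # keep at most this many hits


def apply_profile(hits: list[dict], profile: RetrievalProfile) -> list[dict]:
    """Keep hits at or above min_similarity, best first, truncated to top_k."""
    kept = [h for h in hits if h["similarity"] >= profile.min_similarity]
    kept.sort(key=lambda h: h["similarity"], reverse=True)
    return kept[: profile.top_k]


# Illustrative values only; tune against your own corpus.
PRECISE = RetrievalProfile(min_similarity=0.6, top_k=3)   # e.g. code-reviewer
BROAD = RetrievalProfile(min_similarity=0.2, top_k=10)    # e.g. docs-writer
```

Given the same candidate hits, PRECISE returns a short list of strong matches while BROAD also keeps weaker, loosely related ones, which suits a docs-writer gathering background.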

Hygiene runs automatically. The crystallizer merges near-duplicate memories into canonical facts with provenance; contradiction detection supersedes stale facts when the team changes its mind; an 8-status lifecycle retires what's outdated. On Cloud these sweeps are scheduled for you — and memclaw_insights (focus contradictions, stale, patterns, …) lets an agent reflect over the corpus on demand, writing its findings back as insight memories the next agent will recall.

The result is a flywheel: write → recall → act → report → better recall. Every interaction makes the next one smarter — and you watch the weights move and the graph grow in Prism.


Where to go from here

You now have a working three-agent fleet with shared, governed, self-improving memory — built entirely from a key and config files. Scaling paths:

  • More agents is just more CLAUDE.md identities. The pattern is identical at 3 or 300.
  • Mixed fleets: anything that speaks MCP joins the same brain — Cursor, Windsurf, Claude Desktop, Codex, custom agents via REST.
  • Mind the limits: Cloud applies per-project rate limits, and the memclaw.net free tier covers 10K memories. Batch your writes and searches within them — Prism shows current usage.
  • Run it yourself: the same engine is open source. The self-hosted OSS series builds this exact fleet on your own Docker stack.
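For scripted bulk writes or searches, one simple way to stay inside a per-project rate limit is to pace calls on the client. The sketch below is a generic throttle, not a MemClaw feature. The tutorial does not state the actual limit, so max_per_minute is a placeholder you would set from your plan or from the usage shown in Prism.

```python
import time
from typing import Callable


class Throttle:
    """Space calls evenly so the long-run rate stays at or below max_per_minute."""

    def __init__(self, max_per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self._interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot one interval after this call starts.
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return delay
```

Call throttle.wait() before each request in a loop. The clock and sleep parameters exist so the pacing can be tested without real delays.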

MemClaw Cloud — managed, governed memory for agent fleets, operated for you, with the Prism dashboard included. Start on the free tier at memclaw.net: 10K memories.

Your agents are already smart. Stop making them start from zero.