memclaw_docArchitectureMulti-AgentMay 8, 2026

Memory Isn’t Records.
How memclaw_doc Solves the Other Half.

Six operations. One collection-based primitive. Most of the agent infrastructure you’re about to build, you don’t have to.

When people first see MemClaw, they latch onto memory: agents that remember what happened, share what they learned, accumulate context across runs. That’s the headline feature. But there’s a second tool sitting next to it that quietly does the work most teams end up rebuilding from scratch — and once you see it, you can’t unsee how badly you needed it.

It’s called memclaw_doc. Six operations, one collection-based primitive. It replaces an entire shelf of side-systems your agents would otherwise need.


The mismatch

Memory is observations. “The customer asked about pricing.” “We hit a 503 on retry #4.” “Alice prefers Slack notifications.” Each row is a fact-shaped thing the agent witnessed or concluded — write-once, mostly-immutable, surfaced by meaning.

But agents also generate, consume, and curate records. Customer profiles. Configuration. Inventory. Workflows. Playbooks. Skills. These have keys (a doc_id), typed fields, and exact-match queries (“give me the customer with email X” — not “give me a memory that semantically resembles a customer with email X”).

If you only have memory, every record turns into either:

  • A custom microservice the agent has to call — now you’re maintaining a CRM API just so your agent can look up customers.
  • A misshaped memory row that you embed and pray the right one resurfaces (“the customer record for Acme Corp” — what does that even mean as a vector?).

Both are bad. The first is engineering tax. The second is wishful thinking masquerading as architecture.


What memclaw_doc actually is

A namespaced JSON-document store, with optional embedding on a chosen field, exposed through six ops:

op=write              Upsert a doc in a named collection by collection+doc_id.
op=read               Get one doc by collection+doc_id.
op=query              Filter by exact field equality, ordered, paginated.
op=delete             Remove a doc by collection+doc_id.
op=list_collections   Enumerate every collection this tenant has (with counts).
op=search             Semantic retrieval over docs indexed via embed_field.

Two things make this not-just-another-key-value-store.

1. Collections are first-class

A collection is a namespace — customers, config, playbooks, inventory. Created on first write, listable per-tenant, isolated from each other. No schema migrations, no admin step, no ceremony. An agent can inventory the world in one call: “what collections exist, how many docs in each?”

2. embed_field is optional, per-collection

When you write a doc, you can name a text field to embed. From then on op=search does semantic retrieval across that field — same hybrid embedding+keyword path as memory recall, but scoped to docs you’ve explicitly chosen to make searchable. Customer records? Don’t bother — you’ll always look up by email or ID. Skill descriptions, FAQ entries, runbook titles? Embed them and let agents find them by meaning.

This is the part that punches above its weight: structured records and semantic search aren’t a tradeoff anymore. You don’t have to pick between “exact key lookup” and “semantic discovery.” You get both, in the same primitive, switching with a parameter.


Why it matters more than it looks like

One tool replaces three side-systems

Before memclaw_doc, an agent that needed to manage skills called dedicated memclaw_share_skill / memclaw_unshare_skill tools. An agent that needed config talked to whatever you’d built around it. An agent that needed customer records talked to your CRM. Three integration surfaces, three auth flows, three failure modes.

We just deleted the skill-specific tools — entirely. memclaw_doc with collection="skills" does the same thing, plus everything else. One tool, one auth path, one set of trust gates, six predictable ops. The agent’s mental model collapses from “what’s the right tool for this kind of record?” to “what collection should this go in?”

Skill discovery is now actually a skill

Here’s the concrete case that proves the design. A skill in MemClaw is a SKILL.md doc — a slug, a description, a markdown body. With dedicated tools, the agent had to know memclaw_share_skill exists and what it expects. With memclaw_doc and collection="skills", the agent uses the same primitive it already uses for everything structured:

memclaw_doc op=search collection=skills query="migrating from sqlite to postgres"

That’s it. Semantic discovery over the fleet’s accumulated playbooks. The same call shape works for finding a customer note, a config entry, or an FAQ. The agent’s vocabulary doesn’t grow with every new domain.

Collections become teach-once primitives

This is the long-tail value. Every team that builds with agents accumulates collections — meeting_notes, decisions, playbooks, vendors, experiments. Without memclaw_doc, each is a custom store. With it, each is one operation away.

A new agent joining your fleet doesn’t need a per-collection integration. It calls op=list_collections, sees the inventory, and is productive immediately. Onboarding cost approaches zero.

Audit and governance ride the same rails

Because every doc op routes through one tool, the trust model, the audit log, the visibility scopes, and the rate limits apply uniformly. You don’t have eleven access-control regimes — one for memories, one for skills, one for config, one for customers. You have one. When the security team asks “show me everything Alice’s agent touched in the last hour,” you have a single log to grep.


When NOT to use it

Be honest about the boundary. memclaw_doc is for records you’ll look up by ID, query by field, or search semantically by description. It’s not for:

  • Observations and learned facts. Those are memories — memclaw_write / memclaw_recall. The contradiction detector and the supersession chain only work on memories. You don’t want your customer records “conflicting” with each other.
  • Mutable counters, queues, or rate limits. Use Redis. Documents are upsert-by-key; they’re not built for atomic increments or fan-out queues.
  • Files, blobs, or large binary content. Use object storage. Embed the metadata in memclaw_doc and link out.

The two-store design is intentional. Memory is the agent’s epistemic state — what it has come to believe. memclaw_doc is the structured world the agent operates over. Conflating them is the trap; keeping them separate is what lets each one be the best version of itself.


Try it

If you’ve got a MemClaw tenant, write your first doc right now:

memclaw_doc op=write
  collection=runbooks
  doc_id=postgres-vacuum-stuck
  data={"title":"Postgres VACUUM stuck on a hot table","body":"...","tags":["postgres","ops"]}
  embed_field=title

Read it back. Query the collection. Search semantically. List collections. Six ops. One primitive. Most of the agent infrastructure you were about to build, you don’t have to.

That’s why it matters.


MemClaw is built by Caura.ai — governed memory for the hyper-agent generation. Free tier, no credit card. Get started.