Memory hygiene at scale: contradictions, supersession & the crystallizer
What keeps a fleet's shared memory from rotting at scale — automatic contradiction detection and supersession, an 8-state lifecycle, the crystallizer hygiene scan, and memclaw_insights.
Part 1 gave the fleet shared memory; Part 2 gave us eyes on it; Part 3 governed it; Part 4 made it learn from outcomes.
Every one of those parts adds memories. Decisions get revised. Facts go stale. Two agents write the same thing three different ways. A store that only grows is a store that slowly rots — and recall quality rots with it. MemClaw's answer is three mechanisms that run mostly on their own: contradiction detection (which flips a stale memory's status and links the one that replaced it), an explicit status lifecycle, and a crystallizer that scans the corpus for hygiene problems and merges near-duplicates into canonical facts. Then memclaw_insights lets an agent reflect over the whole store and write down what it notices.
Commands use the default http://localhost:8000 + X-API-Key: standalone from Part 1. Everything below was run against the live fleet from Parts 1–4; the IDs are real.
Step 1 — Contradiction detection & supersession
The fleet already holds a decision from Part 1: "Payments API rate limit set to 100 req/min per API key" (bbd5ddfe, a confirmed decision). Now a later decision lands that conflicts with it:
curl -X POST http://localhost:8000/api/v1/memories \
-H "X-API-Key: standalone" -H "X-Agent-ID: backend-dev" -H "Content-Type: application/json" \
-d '{"tenant_id":"default","agent_id":"backend-dev","fleet_id":"dev-fleet","scope":"fleet",
"content":"Payments API rate limit has been raised to 300 requests per minute per API key, replacing the previous 100 req/min limit, after the burst-tolerant token-bucket rollout."}'
# → {"id":"21959ba7-...","status":"confirmed","supersedes_id":null}

The write returns immediately with supersedes_id: null, which is expected. Contradiction detection runs after the commit, as a background task, so the write path stays fast. A few seconds later, re-fetch both memories:
# OLD bbd5ddfe status=conflicted ← was "confirmed"
# NEW 21959ba7 status=confirmed supersedes_id=bbd5ddfe

Two things happened on their own:
- The older memory was flipped to conflicted: it's no longer presented as current truth.
- The newer memory now points back at it via supersedes_id, forming a supersession chain you can walk.

The fleet keeps the history (the old decision isn't deleted, only demoted), but recall surfaces the live one.
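Walking that chain is a short loop over supersedes_id. A minimal sketch, assuming the records have already been fetched into a dict keyed by ID; the dict, the short IDs and the function name are illustrative and not part of MemClaw's API:

```python
from typing import Optional


def supersession_chain(memories: dict[str, dict], start_id: str) -> list[str]:
    """Follow supersedes_id links from the newest memory back to the oldest.

    `memories` maps memory ID -> record. It is an illustrative stand-in
    for records you have already fetched by whatever client you use.
    """
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = start_id
    while current is not None and current in memories:
        if current in seen:  # guard against an accidental cycle
            break
        seen.add(current)
        chain.append(current)
        current = memories[current].get("supersedes_id")
    return chain


# The two memories from this step, trimmed to the fields that matter here.
memories = {
    "21959ba7": {"status": "confirmed", "supersedes_id": "bbd5ddfe"},
    "bbd5ddfe": {"status": "conflicted", "supersedes_id": None},
}

for memory_id in supersession_chain(memories, "21959ba7"):
    print(memory_id, memories[memory_id]["status"])
```

The first ID in the result is the live fact; everything after it is history.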
MemClaw decides this two ways. For single-value facts with a known subject/predicate (an entity's "rate limit," a service's "owner"), it compares RDF triples structurally — same subject + same single-value predicate + different object → conflict, and the older memory is marked outdated. For everything else it uses the semantic path: it pulls candidates above a cosine-similarity floor (0.70) and asks an LLM judge whether they genuinely conflict; a confirmed semantic conflict marks the older memory conflicted (which is the path our prose-y decision took). Either way the new memory gets the supersedes_id link.
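The two paths can be sketched in a few lines. This is an illustration of the logic described above, not MemClaw's code: the Triple type, the set of single-value predicates, the toy vectors and the judge callable (a stand-in for the LLM call) are all assumptions, and whether the 0.70 floor is inclusive is not stated in the source.

```python
import math
from typing import Callable, NamedTuple

SIMILARITY_FLOOR = 0.70  # the cosine floor quoted above


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical set of predicates that admit only one value per subject.
SINGLE_VALUE_PREDICATES = {"rate_limit", "owner"}


def rdf_conflict(old: Triple, new: Triple) -> bool:
    """Structural path: same subject, same single-value predicate, different object."""
    return (
        old.subject == new.subject
        and old.predicate == new.predicate
        and old.predicate in SINGLE_VALUE_PREDICATES
        and old.obj != new.obj
    )


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_conflicts(
    new_vec: list[float],
    candidates: dict[str, list[float]],
    judge: Callable[[str], bool],
) -> list[str]:
    """Semantic path: keep candidates at or above the floor, then ask the judge."""
    near = [cid for cid, vec in candidates.items() if cosine(new_vec, vec) >= SIMILARITY_FLOOR]
    return [cid for cid in near if judge(cid)]


old = Triple("payments-api", "rate_limit", "100 req/min")
new = Triple("payments-api", "rate_limit", "300 req/min")
print(rdf_conflict(old, new))

# Toy 2-d embeddings; real embeddings have far more dimensions.
candidates = {"close": [0.9, 0.1], "far": [0.0, 1.0]}
print(semantic_conflicts([1.0, 0.0], candidates, judge=lambda cid: True))
```

The cheap similarity filter runs first so the expensive judge only sees plausible candidates.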
Step 2 — The status lifecycle
conflicted and outdated aren't ad-hoc strings — they're two of eight statuses every memory moves through:
| Status | Meaning |
|---|---|
| active | Normal, recallable. |
| pending | Written but not yet confirmed. |
| confirmed | Verified / reinforced — the high-confidence tier. |
| cancelled | Explicitly retracted by an agent. |
| outdated | Superseded via the RDF (single-value) contradiction path. |
| conflicted | Superseded via the semantic contradiction path. |
| archived | Retired by lifecycle automation or the crystallizer. |
| deleted | Soft-deleted (the dedup gate still filters these out). |
Most transitions are automatic — contradiction detection sets outdated/conflicted, lifecycle automation archives stale, low-weight memories. When an agent needs to move one by hand (retract a cancelled task, confirm a pending fact), it's the MCP memclaw_manage tool with op=transition (memory_id + target status). The point is that "stale" isn't a vibe — it's a state, and the state is what recall and the hygiene tools key off.
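Because the status is a closed set, client code can treat it as an enum and refuse anything else before calling the tool. A minimal sketch: the eight values come from the table above, but the argument key names in transition_args are assumptions, since the text only says that op=transition takes a memory ID and a target status.

```python
from enum import Enum


class MemoryStatus(str, Enum):
    ACTIVE = "active"
    PENDING = "pending"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"
    OUTDATED = "outdated"
    CONFLICTED = "conflicted"
    ARCHIVED = "archived"
    DELETED = "deleted"


# The two statuses set by contradiction detection (Step 1).
SUPERSEDED = {MemoryStatus.OUTDATED, MemoryStatus.CONFLICTED}


def transition_args(memory_id: str, target: str) -> dict:
    """Build arguments for a memclaw_manage call with op=transition.

    The key names ("memory_id", "status") are assumptions for illustration.
    """
    status = MemoryStatus(target)  # raises ValueError on an unknown status
    return {"op": "transition", "memory_id": memory_id, "status": status.value}


print(transition_args("bbd5ddfe", "cancelled"))
print(MemoryStatus("conflicted") in SUPERSEDED)
```

Validating the target locally turns a typo like "stale" into an immediate ValueError instead of a rejected tool call.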
Step 3 — The crystallizer: a hygiene scan for the whole corpus
Contradiction detection is per-write. The crystallizer is the periodic sweep that looks at everything at once. Trigger a run for the tenant:
curl -X POST http://localhost:8000/api/v1/crystallize \
-H "X-API-Key: standalone" -H "Content-Type: application/json" \
-d '{"tenant_id":"default"}'
# → {"report_id":"5be5a05a-...","status":"running"}

It runs in the background and writes a report. Fetch the latest:
curl "http://localhost:8000/api/v1/crystallize/latest?tenant_id=default" -H "X-API-Key: standalone"

{
"status": "completed", "duration_ms": 170,
"summary": { "overall_score": 99, "critical": 0, "warning": 0, "info": 1 },
"health": {
"total_memories": 18, "embedding_coverage_pct": 100.0,
"entity_coverage_pct": 94.4, "pii_count": 2, "contradiction_count": 1,
"status_distribution": { "active": 8, "pending": 1, "confirmed": 8, "conflicted": 1 }
},
"hygiene": {
"near_duplicates": { "count": 0 }, "stale_memories": { "count": 0 },
"orphaned_entities": { "count": 0 }, "missing_embeddings": { "count": 0 },
"broken_entity_links": { "count": 0 }, "expired_still_active": { "count": 0 }
},
"issues": [
{ "severity": "info", "code": "CONTRADICTIONS_PRESENT",
"title": "Contradicted or outdated memories present", "count": 1 }
]
}

One report, three lenses: health (coverage, PII count, the live status histogram; note our conflicted: 1 from Step 1), hygiene (seven structural checks: near-duplicates, stale, orphaned entities, missing embeddings, broken links, expired-still-active, short content), and a graded issues list with affected IDs. The contradiction we created surfaces as the single info issue, and the corpus scores 99/100.
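If you want to alert on the report rather than read it, it reduces to a handful of fields. A minimal sketch over the JSON shape shown above (trimmed); the function name and the rule "no critical issues means healthy" are assumptions, not MemClaw behaviour:

```python
def summarize_report(report: dict) -> dict:
    """Reduce a crystallizer report to the few fields worth alerting on."""
    summary = report["summary"]
    # Names of hygiene checks that found at least one problem.
    flagged_checks = sorted(
        name
        for name, check in report.get("hygiene", {}).items()
        if check.get("count", 0) > 0
    )
    return {
        "score": summary["overall_score"],
        # Treating "no critical issues" as healthy is an assumption of this sketch.
        "healthy": summary["critical"] == 0,
        "flagged_checks": flagged_checks,
        "issue_codes": [issue["code"] for issue in report.get("issues", [])],
    }


# Trimmed copy of the report from this step.
report = {
    "summary": {"overall_score": 99, "critical": 0, "warning": 0, "info": 1},
    "hygiene": {
        "near_duplicates": {"count": 0},
        "stale_memories": {"count": 0},
    },
    "issues": [{"severity": "info", "code": "CONTRADICTIONS_PRESENT", "count": 1}],
}
print(summarize_report(report))
```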
Near-duplicate merging. When the scan finds memories that say the same thing, it reports them as pairs above a strict 0.95 cosine threshold and groups them into clusters (≥3 members). Seed three near-identical facts, re-scan, and the report shows it:
"near_duplicates": { "count": 2, "pairs": [
{ "similarity": 0.9925, "id1": "626319b9", "id2": "c4dca263" },
{ "similarity": 0.9565, "id1": "626319b9", "id2": "f6413f5c" }
]},
"crystallization": { "enabled": true, "clusters_found": 1, "memories_crystallized": 0 }

When crystallization is enabled, the engine hands each cluster to an LLM that distills it into one canonical memory (agent_id: "crystallizer", with metadata.crystallized_from listing the sources for provenance) and archives the originals. So three ways of saying "CI caches node_modules" collapse to one fact, and the history is still traceable. The scan is the durable, always-safe part: it tells you exactly what's wrong and never destroys anything. The merge is the enabled cleanup that acts on it.
Incremental by design. Near-duplicate detection marks each memory dedup-checked once it's scanned, so a second /crystallize won't re-flag the same pairs: near_duplicates drops back to 0. Run the scan right after seeding new memories to see them, and read each report as "new findings since the last sweep," not a full re-scan.
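To see how two reported pairs become one cluster of three, here is a minimal sketch that groups pairs with a union-find. Connected-component grouping is an assumption (the source does not say how MemClaw forms clusters), as is treating the 0.95 threshold as inclusive. The pairs are the ones from the report above.

```python
from collections import defaultdict

PAIR_THRESHOLD = 0.95   # cosine threshold quoted above
MIN_CLUSTER_SIZE = 3    # clusters need at least three members


def cluster_pairs(pairs: list[dict]) -> list[set[str]]:
    """Group near-duplicate pairs into connected clusters of >= 3 memories."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for pair in pairs:
        if pair["similarity"] >= PAIR_THRESHOLD:
            root1, root2 = find(pair["id1"]), find(pair["id2"])
            if root1 != root2:
                parent[root2] = root1  # union the two groups

    groups: dict[str, set[str]] = defaultdict(set)
    for member in parent:
        groups[find(member)].add(member)
    return [group for group in groups.values() if len(group) >= MIN_CLUSTER_SIZE]


pairs = [
    {"similarity": 0.9925, "id1": "626319b9", "id2": "c4dca263"},
    {"similarity": 0.9565, "id1": "626319b9", "id2": "f6413f5c"},
]
print(cluster_pairs(pairs))
```

Both pairs share 626319b9, so they collapse into a single three-member cluster, which matches clusters_found: 1 in the report.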
Step 4 — memclaw_insights: let the fleet reflect
The crystallizer finds structural problems. memclaw_insights finds semantic ones — it points an LLM at a slice of memory and asks "what should I notice here?" There are six focus modes:
| Focus | What it surfaces |
|---|---|
| contradictions | Conflicting / superseded memories and RDF divergence. |
| failures | Low-weight memories that were still recalled (things that burned the fleet). |
| stale | Memories never recalled, or decayed and untouched. |
| divergence | One entity described differently by multiple agents (fleet/all scope only). |
| patterns | Themes across recent active memory. |
| discover | Embedding clusters — latent topics you didn't ask about. |
Point it at our fresh conflict:
curl -X POST http://localhost:8000/api/v1/insights/generate \
-H "X-API-Key: standalone" -H "X-Agent-ID: backend-dev" -H "Content-Type: application/json" \
-d '{"tenant_id":"default","agent_id":"backend-dev","focus":"contradictions","scope":"fleet","fleet_id":"dev-fleet"}'

{
"summary": "The two memories directly conflict on the Payments API rate limit per API key: one claims 100 req/min, the other 300 req/min...",
"findings": [{
"type": "contradictions", "confidence": 0.93,
"title": "Payments API rate limit: 100 vs 300 req/min per key",
"recommendation": "Mark Memory 2 as superseded/obsolete... ensure the limiter config reflects 300.",
"related_memory_ids": ["21959ba7", "bbd5ddfe"]
}]
}

It found exactly the right pair, at 0.93 confidence, with a concrete recommendation. And the payoff: the finding is itself written back as an insight-type memory (65f41ab9 · "Resolve Payments API rate-limit conflict (100 vs 300 req/min)"). Reflection compounds: the next agent that recalls "rate limit" gets the conflict and the note that someone already flagged it.
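An agent acting on findings will usually want a confidence cutoff. A minimal sketch over the response shape shown above; the function name and the 0.9 default are illustrative assumptions, not MemClaw defaults:

```python
def actionable_findings(response: dict, min_confidence: float = 0.9) -> list[dict]:
    """Return findings at or above the cutoff, most confident first."""
    findings = [
        finding
        for finding in response.get("findings", [])
        if finding.get("confidence", 0.0) >= min_confidence
    ]
    return sorted(findings, key=lambda finding: finding["confidence"], reverse=True)


# Trimmed copy of the response from this step.
response = {
    "findings": [
        {
            "type": "contradictions",
            "confidence": 0.93,
            "title": "Payments API rate limit: 100 vs 300 req/min per key",
            "related_memory_ids": ["21959ba7", "bbd5ddfe"],
        }
    ]
}

for finding in actionable_findings(response):
    print(finding["title"], finding["related_memory_ids"])
```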
A few notes from running it:
- agent_id goes in the body. The insights route resolves the caller from the request body; without it you'll hit "Agent 'mcp-agent' is not registered."
- Scope gates trust. agent scope needs trust ≥ 1, fleet/all need trust ≥ 2; divergence only runs at fleet/all (it's inherently cross-agent). Standalone's admin key bypasses the gate locally.
- Honest empties. focus=stale on our young corpus returned "No relevant memories found"; it doesn't invent findings to look busy.
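The focus and trust rules above are easy to check client-side before spending an LLM call. A minimal sketch; the function name and message wording are illustrative, and the server remains the authority on what it accepts:

```python
FOCUS_MODES = {"contradictions", "failures", "stale", "divergence", "patterns", "discover"}
MIN_TRUST_BY_SCOPE = {"agent": 1, "fleet": 2, "all": 2}


def insights_request_problems(focus: str, scope: str, trust: int) -> list[str]:
    """Return the reasons a request would be refused; an empty list means it looks fine."""
    problems: list[str] = []
    if focus not in FOCUS_MODES:
        problems.append(f"unknown focus: {focus}")
    if scope not in MIN_TRUST_BY_SCOPE:
        problems.append(f"unknown scope: {scope}")
        return problems  # the remaining checks need a known scope
    if focus == "divergence" and scope == "agent":
        problems.append("divergence only runs at fleet/all scope")
    if trust < MIN_TRUST_BY_SCOPE[scope]:
        problems.append(f"{scope} scope needs trust >= {MIN_TRUST_BY_SCOPE[scope]}")
    return problems


print(insights_request_problems("contradictions", "fleet", 2))
print(insights_request_problems("divergence", "agent", 1))
```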
Step 5 — Make hygiene a background habit
None of this should be a thing a human remembers to do. Two moves close the loop:
- Schedule the crystallizer. A nightly POST /api/v1/crystallize per tenant turns the hygiene report into a trend you can watch in the dashboard from Part 2: overall_score over time is a single number for "is our memory healthy?"
- Give one agent a reflection habit. A line in a CLAUDE.md, the same way Part 4 added outcome reporting:
- Periodically run memclaw_insights (focus=contradictions, then stale) and act on
high-confidence findings; the findings persist as insight memories for the fleet.

Contradiction detection fires on every write, the crystallizer sweeps on a timer, and insights let an agent reflect on demand. Together they're the reason a fleet's memory can grow to 20,000 entries and still answer like it has 200: current facts on top, stale ones demoted but traceable, duplicates collapsed, conflicts flagged and written down. A store that only grows rots. A store that grooms itself compounds.
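For the nightly sweep from Step 5, a scheduled job needs little more than the request and a way to notice a falling score. A minimal sketch using only the standard library: the endpoint, header and body are the ones used in Step 3, while the function names and the 5-point tolerance are assumptions. Nothing is sent until you pass the request to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default from Part 1
API_KEY = "standalone"              # default from Part 1


def build_crystallize_request(tenant_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST that triggers a crystallizer run."""
    body = json.dumps({"tenant_id": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/crystallize",
        data=body,
        headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )


def score_dropped(scores: list[int], tolerance: int = 5) -> bool:
    """True when the latest overall_score fell more than `tolerance` below the previous one."""
    return len(scores) >= 2 and scores[-2] - scores[-1] > tolerance


request = build_crystallize_request("default")
print(request.get_method(), request.full_url)
print(score_dropped([99, 99, 90]))
```

Run from cron or any scheduler, the job would send the request, fetch the latest report once it completes, append overall_score to a history, and alert when score_dropped returns True.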
Next — Part 6: The knowledge graph: entities, relations, and how the graph quietly lifts every recall — paying off the graph view we built back in Part 2.
caura-memclaw · Apache 2.0 — contradiction detection, the status lifecycle, the crystallizer (/crystallize) and memclaw_insights are all in the open-source engine. ⭐ Star on GitHub · Join Discord · memclaw.net
A fleet that learns is an asset. A fleet that keeps itself clean is one you can still trust at scale.