
Memory hygiene at scale: contradictions, supersession & the crystallizer

What keeps a fleet's shared memory from rotting at scale — automatic contradiction detection and supersession, an 8-state lifecycle, the crystallizer hygiene scan, and memclaw_insights.

Part 1 gave the fleet shared memory; Part 2 gave us eyes on it; Part 3 governed it; Part 4 made it learn from outcomes.

Every one of those parts adds memories. Decisions get revised. Facts go stale. Two agents write the same thing three different ways. A store that only grows is a store that slowly rots — and recall quality rots with it. MemClaw's answer is three mechanisms that run mostly on their own: contradiction detection (which flips a stale memory's status and links the one that replaced it), an explicit status lifecycle, and a crystallizer that scans the corpus for hygiene problems and merges near-duplicates into canonical facts. Then memclaw_insights lets an agent reflect over the whole store and write down what it notices.

Commands use the default http://localhost:8000 + X-API-Key: standalone from Part 1. Everything below was run against the live fleet from Parts 1–4; the IDs are real.


Step 1 — Contradiction detection & supersession

The fleet already holds a decision from Part 1: "Payments API rate limit set to 100 req/min per API key" (bbd5ddfe, a confirmed decision). Now a later decision lands that conflicts with it:

curl -X POST http://localhost:8000/api/v1/memories \
  -H "X-API-Key: standalone" -H "X-Agent-ID: backend-dev" -H "Content-Type: application/json" \
  -d '{"tenant_id":"default","agent_id":"backend-dev","fleet_id":"dev-fleet","scope":"fleet",
       "content":"Payments API rate limit has been raised to 300 requests per minute per API key, replacing the previous 100 req/min limit, after the burst-tolerant token-bucket rollout."}'
#   → {"id":"21959ba7-...","status":"confirmed","supersedes_id":null}

The write returns immediately with supersedes_id: null — that's expected. Contradiction detection runs after the commit, as a background task, so the write path stays fast. A few seconds later, re-fetch both memories:

#   OLD  bbd5ddfe  status=conflicted   ← was "confirmed"
#   NEW  21959ba7  status=confirmed     supersedes_id=bbd5ddfe

Two things happened on their own:

  1. The older memory was flipped to conflicted — it's no longer presented as current truth.
  2. The newer memory now points back at it via supersedes_id, forming a supersession chain you can walk. The fleet keeps the history (the old decision isn't deleted — it's demoted), but recall surfaces the live one.
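Because detection runs after the commit, a client that wants to see the link has to wait for it. The sketch below shows one way to poll for supersedes_id and then walk the chain. It is a minimal illustration, not MemClaw's client: `fetch` is a hypothetical callable that returns a memory as a dict (the tutorial does not show the single-memory read endpoint, so none is assumed here), and the retry count and interval are arbitrary.

```python
import time
from typing import Callable, List, Optional, Tuple


def wait_for_supersession(
    fetch: Callable[[str], dict],        # hypothetical: memory id -> memory dict
    new_id: str,
    attempts: int = 10,                   # arbitrary retry budget
    interval_s: float = 0.5,              # arbitrary delay between polls
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the background task has set supersedes_id, or give up."""
    for attempt in range(attempts):
        memory = fetch(new_id)
        if memory.get("supersedes_id"):
            return memory
        if attempt < attempts - 1:
            sleep(interval_s)
    return None


def supersession_chain(fetch: Callable[[str], dict], start_id: str) -> List[Tuple[str, str]]:
    """Follow supersedes_id links from the newest memory back to the oldest."""
    chain, seen = [], set()
    current = start_id
    while current and current not in seen:   # `seen` guards against cycles
        seen.add(current)
        memory = fetch(current)
        chain.append((current, memory["status"]))
        current = memory.get("supersedes_id")
    return chain


# In-memory stand-in for the two memories from this step.
store = {
    "21959ba7": {"status": "confirmed", "supersedes_id": "bbd5ddfe"},
    "bbd5ddfe": {"status": "conflicted", "supersedes_id": None},
}
print(supersession_chain(store.__getitem__, "21959ba7"))
# → [('21959ba7', 'confirmed'), ('bbd5ddfe', 'conflicted')]
```

Passing `fetch` in keeps the sketch independent of how memories are actually read, so the same two functions work against an HTTP client or a test double.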

MemClaw decides this in one of two ways. For single-value facts with a known subject/predicate (an entity's "rate limit," a service's "owner"), it compares RDF triples structurally: same subject + same single-value predicate + different object → conflict, and the older memory is marked outdated. For everything else it uses the semantic path: it pulls candidates above a cosine-similarity floor (0.70) and asks an LLM judge whether they genuinely conflict; a confirmed semantic conflict marks the older memory conflicted (the path our prose-y decision took). Either way the new memory gets the supersedes_id link.
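The two paths can be sketched in a few lines. This is an illustration of the rules as described here, not the engine's code: the `Triple` class, the predicate names in `SINGLE_VALUE_PREDICATES`, and the toy vectors are all assumptions, and the LLM judge that follows the similarity filter is left out.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

SINGLE_VALUE_PREDICATES = {"rate_limit", "owner"}  # hypothetical predicate names
SIMILARITY_FLOOR = 0.70                            # the floor quoted in the text


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


def rdf_conflict(old: Triple, new: Triple) -> bool:
    """Structural path: same subject, same single-value predicate, different object."""
    return (
        old.subject == new.subject
        and old.predicate == new.predicate
        and old.predicate in SINGLE_VALUE_PREDICATES
        and old.obj != new.obj
    )


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_candidates(new_vec: List[float], existing: Dict[str, List[float]]) -> List[str]:
    """Semantic path, first half: keep memories at or above the similarity floor.

    Each candidate would then go to an LLM judge (not modelled here).
    """
    return [mid for mid, vec in existing.items() if cosine(new_vec, vec) >= SIMILARITY_FLOOR]


old = Triple("payments-api", "rate_limit", "100 req/min")
new = Triple("payments-api", "rate_limit", "300 req/min")
print(rdf_conflict(old, new))  # → True
```

The structural check is cheap and deterministic, which is presumably why it is tried for facts that fit the triple shape; the similarity floor only narrows what the judge has to read.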


Step 2 — The status lifecycle

conflicted and outdated aren't ad-hoc strings — they're two of eight statuses every memory moves through:

Status      Meaning
active      Normal, recallable.
pending     Written but not yet confirmed.
confirmed   Verified / reinforced (the high-confidence tier).
cancelled   Explicitly retracted by an agent.
outdated    Superseded via the RDF (single-value) contradiction path.
conflicted  Superseded via the semantic contradiction path.
archived    Retired by lifecycle automation or the crystallizer.
deleted     Soft-deleted (the dedup gate still filters these out).

Most transitions are automatic: contradiction detection sets outdated/conflicted, and lifecycle automation archives stale, low-weight memories. When an agent needs to move one by hand (retract a cancelled task, confirm a pending fact), it calls the MCP memclaw_manage tool with op=transition, passing the memory_id and the target status. The point is that "stale" isn't a vibe; it's a state, and the state is what recall and the hygiene tools key off.
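Treating status as a closed set makes it easy to validate on the client side. The sketch below models the eight statuses as an enum and builds the arguments for a manual transition. The `op` and `memory_id` names come from the text; the `status` key for the target is an assumption, since the exact argument name is not shown.

```python
from enum import Enum


class Status(str, Enum):
    ACTIVE = "active"
    PENDING = "pending"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"
    OUTDATED = "outdated"
    CONFLICTED = "conflicted"
    ARCHIVED = "archived"
    DELETED = "deleted"


# Which contradiction path produced a superseded status, per the table above.
SUPERSEDED_VIA = {Status.OUTDATED: "rdf", Status.CONFLICTED: "semantic"}


def transition_args(memory_id: str, target: str) -> dict:
    """Build arguments for memclaw_manage op=transition.

    Raises ValueError if `target` is not one of the eight statuses.
    The "status" key name is an assumption, not confirmed by the docs.
    """
    return {"op": "transition", "memory_id": memory_id, "status": Status(target).value}


print(transition_args("bbd5ddfe", "cancelled"))
# → {'op': 'transition', 'memory_id': 'bbd5ddfe', 'status': 'cancelled'}
```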


Step 3 — The crystallizer: a hygiene scan for the whole corpus

Contradiction detection is per-write. The crystallizer is the periodic sweep that looks at everything at once. Trigger a run for the tenant:

curl -X POST http://localhost:8000/api/v1/crystallize \
  -H "X-API-Key: standalone" -H "Content-Type: application/json" \
  -d '{"tenant_id":"default"}'
#   → {"report_id":"5be5a05a-...","status":"running"}

It runs in the background and writes a report. Fetch the latest:

curl "http://localhost:8000/api/v1/crystallize/latest?tenant_id=default" -H "X-API-Key: standalone"
{
  "status": "completed", "duration_ms": 170,
  "summary": { "overall_score": 99, "critical": 0, "warning": 0, "info": 1 },
  "health": {
    "total_memories": 18, "embedding_coverage_pct": 100.0,
    "entity_coverage_pct": 94.4, "pii_count": 2, "contradiction_count": 1,
    "status_distribution": { "active": 8, "pending": 1, "confirmed": 8, "conflicted": 1 }
  },
  "hygiene": {
    "near_duplicates": { "count": 0 }, "stale_memories": { "count": 0 },
    "orphaned_entities": { "count": 0 }, "missing_embeddings": { "count": 0 },
    "broken_entity_links": { "count": 0 }, "expired_still_active": { "count": 0 }
  },
  "issues": [
    { "severity": "info", "code": "CONTRADICTIONS_PRESENT",
      "title": "Contradicted or outdated memories present", "count": 1 }
  ]
}

One report, three lenses: health (coverage, PII count, the live status histogram — note our conflicted: 1 from Step 1), hygiene (seven structural checks: near-duplicates, stale, orphaned entities, missing embeddings, broken links, expired-still-active, short content), and a graded issues list with affected IDs. The contradiction we created surfaces as the single info issue, and the corpus scores 99/100.
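For a scheduled job, the report usually needs reducing to the parts worth alerting on. A minimal sketch, assuming only the field names visible in the sample response above (the `REPORT` dict is an abridged copy of it):

```python
def summarize_report(report: dict) -> dict:
    """Reduce a crystallizer report to its score, failing hygiene checks, and issues."""
    hygiene = report.get("hygiene", {})
    failing = {
        name: check["count"]
        for name, check in hygiene.items()
        if check.get("count", 0) > 0
    }
    issues = [
        (issue["severity"], issue["code"], issue.get("count", 0))
        for issue in report.get("issues", [])
    ]
    return {
        "score": report.get("summary", {}).get("overall_score"),
        "failing_checks": failing,
        "issues": issues,
    }


# Abridged from the sample response in this step.
REPORT = {
    "summary": {"overall_score": 99, "critical": 0, "warning": 0, "info": 1},
    "hygiene": {"near_duplicates": {"count": 0}, "stale_memories": {"count": 0}},
    "issues": [{"severity": "info", "code": "CONTRADICTIONS_PRESENT", "count": 1}],
}
print(summarize_report(REPORT))
# → {'score': 99, 'failing_checks': {}, 'issues': [('info', 'CONTRADICTIONS_PRESENT', 1)]}
```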

Near-duplicate merging. When the scan finds memories that say the same thing, it reports them as pairs above a strict 0.95 cosine threshold and groups them into clusters (≥3 members). Seed three near-identical facts, re-scan, and the report shows them:

"near_duplicates": { "count": 2, "pairs": [
  { "similarity": 0.9925, "id1": "626319b9", "id2": "c4dca263" },
  { "similarity": 0.9565, "id1": "626319b9", "id2": "f6413f5c" }
]},
"crystallization": { "enabled": true, "clusters_found": 1, "memories_crystallized": 0 }
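To see how two pairs become one cluster, here is a sketch that groups the reported pairs into connected components with a small union-find and keeps groups of three or more. Connected components is an assumption about the grouping rule; the text only says pairs are grouped into clusters of at least three members.

```python
from typing import Dict, List


def cluster_pairs(pairs: List[dict], min_similarity: float = 0.95, min_members: int = 3) -> List[List[str]]:
    """Group near-duplicate pairs into clusters via union-find."""
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for pair in pairs:
        if pair["similarity"] >= min_similarity:
            root_a, root_b = find(pair["id1"]), find(pair["id2"])
            if root_a != root_b:
                parent[root_a] = root_b

    groups: Dict[str, set] = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return [sorted(members) for members in groups.values() if len(members) >= min_members]


pairs = [
    {"similarity": 0.9925, "id1": "626319b9", "id2": "c4dca263"},
    {"similarity": 0.9565, "id1": "626319b9", "id2": "f6413f5c"},
]
print(cluster_pairs(pairs))
# → [['626319b9', 'c4dca263', 'f6413f5c']]
```

With the two pairs from the report, the shared member 626319b9 joins all three memories into a single group, which matches clusters_found: 1.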

When crystallization is enabled, the engine hands each cluster to an LLM that distills it into one canonical memory (agent_id: "crystallizer", with metadata.crystallized_from listing the sources for provenance) and archives the originals. So three ways of saying "CI caches node_modules" collapse to one fact, and the history is still traceable. The scan is the durable, always-safe part: it tells you exactly what's wrong and never destroys anything. The merge is the cleanup that acts on those findings, and it runs only when crystallization is enabled.
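The shape of that merge can be sketched as a pure function. This is an illustration under assumptions: `distill` is a hypothetical stand-in for the LLM call, and only the fields named in the text (agent_id, metadata.crystallized_from, the archived status) are modelled.

```python
from typing import Callable, List, Tuple


def crystallize_cluster(
    cluster: List[dict],
    distill: Callable[[List[str]], str],   # hypothetical stand-in for the LLM
) -> Tuple[dict, List[dict]]:
    """Return (canonical memory, archived copies of the originals)."""
    canonical = {
        "agent_id": "crystallizer",
        "content": distill([memory["content"] for memory in cluster]),
        "metadata": {"crystallized_from": [memory["id"] for memory in cluster]},
    }
    # Originals are archived, not deleted, so provenance stays traceable.
    archived = [{**memory, "status": "archived"} for memory in cluster]
    return canonical, archived


cluster = [
    {"id": "626319b9", "status": "active", "content": "CI caches node_modules"},
    {"id": "c4dca263", "status": "active", "content": "The CI pipeline caches node_modules between runs"},
    {"id": "f6413f5c", "status": "active", "content": "node_modules is cached in CI"},
]
canonical, archived = crystallize_cluster(cluster, distill=lambda texts: min(texts, key=len))
print(canonical["metadata"]["crystallized_from"])
# → ['626319b9', 'c4dca263', 'f6413f5c']
```

The `min(..., key=len)` distiller only picks the shortest phrasing so the example runs without a model; a real distiller would write a new sentence.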

Incremental by design. Near-duplicate detection marks each memory dedup-checked once it's scanned, so a second /crystallize won't re-flag the same pairs — near_duplicates drops back to 0. Run the scan right after seeding new memories to see them, and read each report as "new findings since the last sweep," not a full re-scan.
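The incremental behaviour is easy to reproduce in miniature. In the sketch below, a pair is only compared if at least one side has not been scanned before, and every memory is marked afterwards. The `dedup_checked` field name and the `similarity` callable are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, List, Tuple


def incremental_near_duplicates(
    memories: List[dict],
    similarity: Callable[[dict, dict], float],  # hypothetical scoring function
    threshold: float = 0.95,
) -> List[Tuple[str, str, float]]:
    """Report near-duplicate pairs involving at least one not-yet-checked memory."""
    pairs = []
    for a, b in combinations(memories, 2):
        if a.get("dedup_checked") and b.get("dedup_checked"):
            continue  # both were covered by an earlier sweep
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a["id"], b["id"], score))
    for memory in memories:
        memory["dedup_checked"] = True  # hypothetical marker field
    return pairs


memories = [{"id": "a", "topic": "ci-cache"}, {"id": "b", "topic": "ci-cache"}, {"id": "c", "topic": "auth"}]
same_topic = lambda x, y: 0.99 if x["topic"] == y["topic"] else 0.10

print(incremental_near_duplicates(memories, same_topic))  # → [('a', 'b', 0.99)]
print(incremental_near_duplicates(memories, same_topic))  # → []
```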


Step 4 — memclaw_insights: let the fleet reflect

The crystallizer finds structural problems. memclaw_insights finds semantic ones — it points an LLM at a slice of memory and asks "what should I notice here?" There are six focus modes:

Focus           What it surfaces
contradictions  Conflicting / superseded memories and RDF divergence.
failures        Low-weight memories that were still recalled (things that burned the fleet).
stale           Memories never recalled, or decayed and untouched.
divergence      One entity described differently by multiple agents (fleet/all scope only).
patterns        Themes across recent active memory.
discover        Embedding clusters: latent topics you didn't ask about.

Point it at our fresh conflict:

curl -X POST http://localhost:8000/api/v1/insights/generate \
  -H "X-API-Key: standalone" -H "X-Agent-ID: backend-dev" -H "Content-Type: application/json" \
  -d '{"tenant_id":"default","agent_id":"backend-dev","focus":"contradictions","scope":"fleet","fleet_id":"dev-fleet"}'
{
  "summary": "The two memories directly conflict on the Payments API rate limit per API key: one claims 100 req/min, the other 300 req/min...",
  "findings": [{
    "type": "contradictions", "confidence": 0.93,
    "title": "Payments API rate limit: 100 vs 300 req/min per key",
    "recommendation": "Mark Memory 2 as superseded/obsolete... ensure the limiter config reflects 300.",
    "related_memory_ids": ["21959ba7", "bbd5ddfe"]
  }]
}

It found exactly the right pair, at 0.93 confidence, with a concrete recommendation. And the payoff: the finding is itself written back as an insight-type memory (65f41ab9 · "Resolve Payments API rate-limit conflict (100 vs 300 req/min)"). Reflection compounds — the next agent that recalls "rate limit" gets the conflict and the note that someone already flagged it.
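An agent that acts on insights will usually want to ignore low-confidence findings. A minimal filter over the response shape shown above; the 0.8 cutoff is an arbitrary assumption, not a MemClaw default.

```python
from typing import List


def actionable_findings(response: dict, min_confidence: float = 0.8) -> List[dict]:
    """Keep findings at or above the confidence cutoff, highest first."""
    findings = [
        finding
        for finding in response.get("findings", [])
        if finding.get("confidence", 0.0) >= min_confidence
    ]
    return sorted(findings, key=lambda finding: finding["confidence"], reverse=True)


# Abridged from the sample response in this step.
response = {
    "findings": [{
        "type": "contradictions",
        "confidence": 0.93,
        "title": "Payments API rate limit: 100 vs 300 req/min per key",
        "related_memory_ids": ["21959ba7", "bbd5ddfe"],
    }]
}
for finding in actionable_findings(response):
    print(finding["title"], finding["related_memory_ids"])
# → Payments API rate limit: 100 vs 300 req/min per key ['21959ba7', 'bbd5ddfe']
```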

A few notes from running it:

  • agent_id goes in the body. The insights route resolves the caller from the request body; without it you'll hit "Agent 'mcp-agent' is not registered."
  • Trust gates scope. agent scope needs trust ≥ 1, fleet/all need trust ≥ 2; divergence only runs at fleet/all (it's inherently cross-agent). Standalone's admin key bypasses the gate locally.
  • Honest empties. focus=stale on our young corpus returned "No relevant memories found" — it doesn't invent findings to look busy.
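The scope and trust rules in the notes above can be checked before a request is sent. This sketch encodes only what the notes state; the function name and error strings are hypothetical, and the server remains the authority on what is allowed.

```python
from typing import Optional

FOCUS_MODES = {"contradictions", "failures", "stale", "divergence", "patterns", "discover"}
MIN_TRUST = {"agent": 1, "fleet": 2, "all": 2}


def insights_precheck(focus: str, scope: str, trust: int) -> Optional[str]:
    """Return None if the request looks allowed, else a reason it would be rejected."""
    if focus not in FOCUS_MODES:
        return f"unknown focus: {focus}"
    if scope not in MIN_TRUST:
        return f"unknown scope: {scope}"
    if focus == "divergence" and scope == "agent":
        return "divergence only runs at fleet/all scope"
    if trust < MIN_TRUST[scope]:
        return f"scope {scope} needs trust >= {MIN_TRUST[scope]}"
    return None


print(insights_precheck("contradictions", "fleet", trust=2))  # → None
print(insights_precheck("divergence", "agent", trust=3))
# → divergence only runs at fleet/all scope
```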

Step 5 — Make hygiene a background habit

None of this should be a thing a human remembers to do. Two moves close the loop:

  1. Schedule the crystallizer. A nightly POST /api/v1/crystallize per tenant turns the hygiene report into a trend you can watch in the dashboard from Part 2 — overall_score over time is a single number for "is our memory healthy?"
  2. Give one agent a reflection habit. A line in a CLAUDE.md, the same way Part 4 added outcome reporting:
- Periodically run memclaw_insights (focus=contradictions, then stale) and act on
  high-confidence findings; the findings persist as insight memories for the fleet.
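The nightly run in item 1 could be driven by a small script that a scheduler such as cron invokes once per tenant. A minimal sketch using only the standard library; the endpoint, header, and body match the curl call in Step 3, while the function names and the 10-second timeout are assumptions.

```python
import json
import urllib.request


def build_crystallize_request(
    tenant_id: str,
    base_url: str = "http://localhost:8000",
    api_key: str = "standalone",
) -> urllib.request.Request:
    """Build the POST /api/v1/crystallize request without sending it."""
    body = json.dumps({"tenant_id": tenant_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/crystallize",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def trigger_crystallize(tenant_id: str, **kwargs) -> dict:
    """Send the request and return the parsed JSON reply (needs a running server)."""
    request = build_crystallize_request(tenant_id, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    for tenant in ["default"]:
        print(trigger_crystallize(tenant))
```

Splitting request construction from sending keeps the part that can be unit-tested separate from the part that needs the live fleet.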

Contradiction detection fires on every write, the crystallizer sweeps on a timer, and insights let an agent reflect on demand. Together they're the reason a fleet's memory can grow to 20,000 entries and still answer like it has 200 — current facts on top, stale ones demoted but traceable, duplicates collapsed, conflicts flagged and written down. A store that only grows rots. A store that grooms itself compounds.

Next — Part 6: The knowledge graph: entities, relations, and how the graph quietly lifts every recall — paying off the graph view we built back in Part 2.


caura-memclaw · Apache 2.0: contradiction detection, the status lifecycle, the crystallizer (/crystallize) and memclaw_insights are all in the open-source engine. · memclaw.net

A fleet that learns is an asset. A fleet that keeps itself clean is one you can still trust at scale.