
Memory Pipeline

What happens between calling memclaw_write and the memory becoming recallable.

Every write moves through four stages. Knowing them helps you debug "why didn't my agent recall this?" and "why does the stored memory look different from what I sent?".

1. Ingest

POST /api/v1/memories (or the memclaw_write MCP tool) validates auth, enforces the caller's trust level and tenant quota, and persists the row. The HTTP response returns as soon as the row is committed; everything else is asynchronous.
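As a rough illustration of the ingest call, the sketch below builds (but does not send) a write request. Only the POST /api/v1/memories path comes from this page; the base URL, the Bearer auth scheme, and the "content" body field are assumptions made for the example, so check the API Reference for the real request shape.

```python
import json
import urllib.request


def build_write_request(base_url: str, token: str, content: str) -> urllib.request.Request:
    """Build, without sending, a POST /api/v1/memories request."""
    # The "content" field name is an assumption, not taken from the API Reference.
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/memories",
        data=body,
        method="POST",
        headers={
            # The Bearer scheme is assumed; use whatever auth your deployment expects.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical local deployment and token, for illustration only.
req = build_write_request("http://localhost:8000", "example-token", "Deploys run on Tuesdays")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. Per the description above, a successful response only means the row is committed, not that the memory is already recallable.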

2. Enrich (async)

After the write returns, core-api publishes embed-requested and enrich-requested events to the event bus. core-worker consumes them and runs:

  • Embedding — vector generation via the configured embedding provider. Default is OpenAI; alternatives include local and fake for development. Set with EMBEDDING_PROVIDER and OPENAI_API_KEY (or the matching keys for other providers).
  • Entity extraction + classification — names, projects, repos, references; memory type, title, summary, tags. Driven by the configured entity-extraction provider (ENTITY_EXTRACTION_PROVIDER: openai / anthropic / gemini / openrouter). Vertex AI is the operator-managed platform-tier provider, used on the managed deployment.

Until enrichment lands, the memory is queryable by id but won't show up in semantic recall. The exact provider matrix and supported combinations live in the OSS .env.example header.
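A misconfigured provider is a common reason enrichment never lands, so a small pre-flight check can help. The sketch below only knows the provider values named on this page and assumes they are written in lowercase; the function name is hypothetical, and the authoritative matrix remains the .env.example header.

```python
import os

# Values named on this page only; .env.example is the authoritative list.
EMBEDDING_PROVIDERS = {"openai", "local", "fake"}
ENTITY_EXTRACTION_PROVIDERS = {"openai", "anthropic", "gemini", "openrouter"}


def check_provider_config(env) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # The page states that OpenAI is the default embedding provider.
    embedding = env.get("EMBEDDING_PROVIDER", "openai").lower()
    if embedding not in EMBEDDING_PROVIDERS:
        problems.append(f"EMBEDDING_PROVIDER={embedding!r} is not one of the values named in the docs")
    if embedding == "openai" and not env.get("OPENAI_API_KEY"):
        problems.append("EMBEDDING_PROVIDER is openai but OPENAI_API_KEY is not set")

    extraction = env.get("ENTITY_EXTRACTION_PROVIDER")
    if extraction is not None and extraction.lower() not in ENTITY_EXTRACTION_PROVIDERS:
        problems.append(f"ENTITY_EXTRACTION_PROVIDER={extraction!r} is not one of the values named in the docs")

    return problems


# Check the current process environment and print anything suspicious.
for problem in check_provider_config(os.environ):
    print("config warning:", problem)
```

The check is advisory: a value it flags may still be valid if the provider matrix has grown beyond what this page lists.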

Contradiction detection also runs after commit, as a fire-and-forget task scheduled via track_task (see detect_contradictions_async in core-api/src/core_api/services/contradiction_detector.py). The write itself does not block on it; contradictions surface through memclaw_insights after the loop runs.
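To make "fire-and-forget" concrete, here is a minimal asyncio sketch of the general pattern: schedule the work, keep a reference to the task so it is not garbage-collected, and return without awaiting it. This is not the actual track_task or detect_contradictions_async code; every name ending in _sketch is hypothetical.

```python
import asyncio

# Strong references to in-flight tasks, so they are not garbage-collected mid-run.
_background_tasks = set()


def track_task_sketch(coro):
    """Schedule a coroutine without awaiting it and remember the task until it finishes."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def detect_contradictions_sketch(memory_id, found):
    await asyncio.sleep(0.01)  # stand-in for the real comparison work
    found.append(memory_id)


async def write_memory_sketch(memory_id, found):
    # The row would be committed here; detection is scheduled but never awaited.
    track_task_sketch(detect_contradictions_sketch(memory_id, found))
    return {"id": memory_id}


async def demo():
    found = []
    response = await write_memory_sketch("mem-1", found)
    finished_at_return = list(found)  # snapshot taken when the write returned
    await asyncio.gather(*_background_tasks)  # only the demo waits for the background work
    return response, finished_at_return, found


response, finished_at_return, found = asyncio.run(demo())
print(response, finished_at_return, found)
```

The snapshot taken at return time is empty, which mirrors the behavior described above: the write response arrives before any contradiction has been recorded.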

3. Govern

Independent of the write path, three governance mechanisms keep the store useful:

  • Trust enforcement — every read and write checks the caller's trust_level. Operations beyond that level return 403 FORBIDDEN. See Trust levels for the table.
  • Karpathy Loop — agents report outcomes after acting on recalled memories via memclaw_evolve (success | failure | partial). The platform reinforces what works and may auto-generate preventive rule-type memories on failure. memclaw_insights surfaces the resulting reflection (contradictions, drift, stale entries).
  • Memory Crystallizer — a separate background process that consolidates many small memories about the same entity into stronger, denser ones. It is triggered per tenant with POST /api/v1/crystallize, or by crystallize/all (admin) on a nightly schedule.
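The first two mechanisms can be pictured with a short sketch. It assumes trust levels are integers where a higher number grants more access, and it invents the Forbidden exception, the require_trust helper, and the "memory_id" and "outcome" field names for illustration. Only the 403 status and the three outcome values come from this page.

```python
VALID_OUTCOMES = {"success", "failure", "partial"}  # the values listed for memclaw_evolve


class Forbidden(Exception):
    """Hypothetical error type that maps to HTTP 403 FORBIDDEN."""

    status_code = 403


def require_trust(caller_level: int, required_level: int) -> None:
    """Raise Forbidden when the caller's trust level is below what the operation needs."""
    # Assumes a higher number means more access; see Trust levels for the real table.
    if caller_level < required_level:
        raise Forbidden(f"trust level {caller_level} is below required level {required_level}")


def build_evolve_report(memory_id: str, outcome: str) -> dict:
    """Build an outcome report for a recalled memory; field names are assumptions."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"outcome must be one of {sorted(VALID_OUTCOMES)}")
    return {"memory_id": memory_id, "outcome": outcome}


report = build_evolve_report("mem-1", "partial")
print(report)
```

Validating the outcome on the client side keeps malformed reports out of the reinforcement step, whatever the server does with them.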

4. Recall

POST /api/v1/recall (or memclaw_recall) does hybrid retrieval: vector similarity + keyword full-text + entity-graph matches, blended into a single ranked list. Trust still applies at read — a level-1 agent cannot recall across fleets it doesn't own.
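This page does not say how the three result sets are blended. One common way to merge ranked lists is reciprocal rank fusion (RRF), shown below purely as an illustration of the idea and not as MemClaw's actual ranking formula. The memory ids and the constant k=60 are example values.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            # Each list contributes more for ids it ranks near the top.
            scores[memory_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the order is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


vector_hits = ["m1", "m2", "m3"]   # vector similarity results
keyword_hits = ["m2", "m4"]        # keyword full-text results
entity_hits = ["m2", "m1"]         # entity-graph matches

blended = reciprocal_rank_fusion([vector_hits, keyword_hits, entity_hits])
print(blended)  # ['m2', 'm1', 'm4', 'm3']
```

Here m2 ranks first because all three retrievers return it, even though vector search alone placed it second.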

For non-semantic flows:

  • memclaw_list — paginated browse by entity / type / time. Use when recall is too narrow.
  • memclaw_manage op=read — read by id when you already know the memory_id.

Where each stage lives

| Stage | Service | Path in the OSS repo |
| --- | --- | --- |
| Ingest + recall + post-commit contradiction check | core-api | core-api/src/core_api/routes/, services/contradiction_detector.py |
| Embedding + entity extraction (async) | core-worker | core-worker/src/core_worker/ |
| Storage (Postgres + pgvector) | core-storage-api | core-storage-api/ |

For the live OpenAPI surface, see the API Reference.