Memory Pipeline
What happens between calling memclaw_write and the memory becoming recallable.
Every write moves through four stages. Knowing them helps you debug "why didn't my agent recall this?" and "why does the stored memory look different from what I sent?".
1. Ingest
POST /api/v1/memories (or the memclaw_write MCP tool) validates auth, enforces the caller's trust level and tenant quota, and persists the row. The HTTP response returns as soon as the row is committed; everything else is asynchronous.
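As a rough illustration of the ingest call, the sketch below builds (but does not send) a write request. Only the method and path come from the text above; the host, the `content` field name, and the bearer-token header are assumptions for illustration, so check the API Reference for the real request shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical host; use your deployment's URL


def build_write_request(content: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/memories request."""
    # "content" is an assumed field name, not confirmed by the docs.
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/memories",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


req = build_write_request("Deploys to staging run from the release branch.", "example-key")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. A successful response only means the row is committed, not that the memory is recallable yet.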
2. Enrich (async)
After the write returns, core-api publishes embed-requested and enrich-requested events to the event bus. core-worker consumes them and runs:
- Embedding: vector generation via the configured embedding provider. Default is OpenAI; alternatives include local and fake for development. Set with EMBEDDING_PROVIDER and OPENAI_API_KEY (or the matching keys for other providers).
- Entity extraction + classification: names, projects, repos, references; memory type, title, summary, tags. Driven by the configured entity-extraction provider (ENTITY_EXTRACTION_PROVIDER: openai / anthropic / gemini / openrouter). Vertex AI is the operator-managed platform-tier provider, used on the managed deployment.
Until enrichment lands, the memory is queryable by id but won't show up in semantic recall. The exact provider matrix and supported combinations live in the OSS .env.example header.
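Because a fresh memory is missing from semantic recall until enrichment finishes, a test or agent that writes and then immediately recalls may need to wait. A minimal polling sketch follows. It assumes you supply `recall_ids`, a hypothetical callable that wraps your recall call and returns the ids of the memories it found; the timeout and interval values are arbitrary.

```python
import time
from typing import Callable, Iterable


def wait_until_recallable(
    recall_ids: Callable[[str], Iterable[str]],  # hypothetical wrapper around recall
    query: str,
    memory_id: str,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll recall until memory_id appears or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if memory_id in set(recall_ids(query)):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Demo with a fake recall that only "sees" the memory on the third call.
calls = {"n": 0}


def fake_recall(query: str) -> list[str]:
    calls["n"] += 1
    return ["mem-123"] if calls["n"] >= 3 else []


found = wait_until_recallable(fake_recall, "staging deploys", "mem-123", sleep=lambda _: None)
print(found, calls["n"])
```

If the wait times out, the write itself may still have succeeded; a read by id is the quicker way to confirm the row exists.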
Contradiction detection also runs after commit (fire-and-forget via track_task — see core-api/src/core_api/services/contradiction_detector.py's detect_contradictions_async). The write itself does not block on it; contradictions surface through memclaw_insights after the loop runs.
3. Govern
Independent of the write path, three governance mechanisms keep the store useful:
- Trust enforcement: every read and write checks the caller's trust_level. Operations beyond that level return 403 FORBIDDEN. See Trust levels for the table.
- Karpathy Loop: agents report outcomes after acting on recalled memories via memclaw_evolve (success | failure | partial). The platform reinforces what works and may auto-generate preventive rule-type memories on failure. memclaw_insights surfaces the resulting reflection (contradictions, drift, stale entries).
- Memory Crystallizer: a separate background process that consolidates many small memories about the same entity into stronger, denser ones. Triggered with POST /api/v1/crystallize per tenant, or crystallize/all (admin) on a nightly schedule.
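To make the trust check concrete, here is a minimal sketch of a gate that maps a caller's trust_level to an allow or 403 decision. The operation names and the minimum level for each are invented for illustration; the real mapping is the one documented under Trust levels.

```python
from dataclasses import dataclass

# Hypothetical minimum levels per operation; not the platform's actual table.
REQUIRED_LEVEL = {"read": 1, "write": 2, "crystallize_all": 3}


@dataclass(frozen=True)
class Decision:
    status: int
    reason: str


def authorize(trust_level: int, operation: str) -> Decision:
    """Allow the operation only if the caller's level meets the minimum."""
    required = REQUIRED_LEVEL[operation]
    if trust_level < required:
        return Decision(403, "FORBIDDEN")
    return Decision(200, "OK")


print(authorize(1, "read"))
print(authorize(1, "write"))
```

The check is the same for every request; only the required level differs per operation, which keeps the rule easy to audit.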
4. Recall
POST /api/v1/recall (or memclaw_recall) does hybrid retrieval: vector similarity + keyword full-text + entity-graph matches, blended into a single ranked list. Trust still applies at read — a level-1 agent cannot recall across fleets it doesn't own.
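The text does not say how the three result sets are blended. One common way to merge ranked lists is reciprocal rank fusion (RRF), sketched below purely as an illustration; the platform may weight or combine signals differently, and the ids and the constant `k = 60` are example values.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda m: (-scores[m], m))


vector_hits = ["m1", "m2", "m3"]   # vector-similarity ranking
keyword_hits = ["m2", "m4"]        # keyword full-text ranking
entity_hits = ["m2", "m1"]         # entity-graph ranking

blended = reciprocal_rank_fusion([vector_hits, keyword_hits, entity_hits])
print(blended)  # ['m2', 'm1', 'm4', 'm3']
```

Here m2 ranks first because it appears in all three lists, even though it is not the top vector hit. That is the practical effect of hybrid retrieval: agreement across signals outweighs a single strong match.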
For non-semantic flows:
- memclaw_list: paginated browse by entity / type / time. Use when recall is too narrow.
- memclaw_manage op=read: read by id when you already know the memory_id.
Where each stage lives
| Stage | Service | Path in the OSS repo |
|---|---|---|
| Ingest + recall + contradiction check (in-process, after commit) | core-api | core-api/src/core_api/routes/, services/contradiction_detector.py |
| Embedding + entity extraction (async) | core-worker | core-worker/src/core_worker/ |
| Storage (Postgres + pgvector) | core-storage-api | core-storage-api/ |
For the live OpenAPI surface, see the API Reference.