Quick Start
Let’s store some memories and search them. This takes about 30 seconds.
Connect and write your first memory
from unforget import MemoryStore
# Connect to your database
store = MemoryStore("postgresql://unforget:unforget@localhost:5433/unforget")
await store.initialize()
# Bind to a scope — this ties memories to a specific org + agent
memory = store.bind(org_id="my-app", agent_id="assistant")
# Store some facts — each write takes ~7ms, no LLM calls
await memory.write("User prefers dark mode", tags=["preference"])
await memory.write("User is a Python developer based in Berlin", importance=0.8)
await memory.write("Last deploy failed due to OOM on worker-3", memory_type="event")That’s it — three memories stored in ~21ms total. No LLM processing, no API calls, just a fast embed + SQL insert.
Search your memories
# 4-channel hybrid search: semantic + BM25 + entity + temporal
results = await memory.recall("user preferences")
for r in results:
print(f"[{r.memory_type}] {r.content} (score: {r.score:.3f})")recall() runs four search strategies in parallel inside a single SQL query, fuses the results with Reciprocal Rank Fusion, and reranks with a cross-encoder. You get the most relevant memories in ~25ms.
Inject memories into an LLM prompt
# Get formatted context ready for a system prompt
context = await memory.auto_recall("help with deployment")
print(context)
# → [Memory Context]
# → - Last deploy failed due to OOM on worker-3
# → - User is a Python developer based in Berlinauto_recall() returns a pre-formatted string you can drop directly into your LLM’s system prompt. It handles token budgeting so you don’t overflow the context window.
Memory types
Unforget has three memory types, each designed for a different purpose:
| Type | When to use | Default importance | Lifespan |
|---|---|---|---|
insight | Distilled facts, preferences, rules | 0.5 | Long-lived (default) |
event | Specific interactions, timestamped happenings | 0.5 | Medium |
raw | Ingested conversation chunks | 0.3 | Auto-expires after 30 days |
Most of the time, use insight (the default) for facts and event for things that happened. The raw type is used by the ingestion pipeline and gets promoted to insights by background consolidation.
Ingest a full conversation
Got a conversation transcript? Ingest it in one call:
await memory.ingest([
{"role": "user", "content": "My printer won't connect to wifi"},
{"role": "assistant", "content": "Let's try resetting the network settings..."},
{"role": "user", "content": "That worked! Thanks."},
], mode="background")Three ingestion modes:
background(default) — stores raw chunks instantly, consolidation promotes later. Zero LLM calls.immediate— LLM extracts key facts upfront. Requires anllmcallable.lightweight— NER + heuristics, stores as events. Zero LLM calls.
Update and version memories
When facts change, supersede the old memory instead of deleting it. This preserves the full history:
m = await memory.write("User lives in Austin, TX")
# Six months later, they move
old, new = await memory.supersede(m.id, "User lives in Denver, CO")
# Query what was true at a specific point in time
january_memories = await memory.timeline(at=datetime(2026, 1, 15))
# → [MemoryItem(content="User lives in Austin, TX")]Clean up
await store.close()Next steps
- Configuration — tune retrieval weights, embedders, and caching
- OpenAI Integration — wrap your OpenAI client with automatic memory
- Anthropic Integration — same for Anthropic
- Local MCP Server — use Unforget with Claude Code or Cursor
- OpenClaw Integration — add memory to your OpenClaw agent