Skip to Content
DocumentationGetting StartedQuick Start

Quick Start

Let’s store some memories and search them. This takes about 30 seconds.

Connect and write your first memory

from unforget import MemoryStore # Connect to your database store = MemoryStore("postgresql://unforget:unforget@localhost:5433/unforget") await store.initialize() # Bind to a scope — this ties memories to a specific org + agent memory = store.bind(org_id="my-app", agent_id="assistant") # Store some facts — each write takes ~7ms, no LLM calls await memory.write("User prefers dark mode", tags=["preference"]) await memory.write("User is a Python developer based in Berlin", importance=0.8) await memory.write("Last deploy failed due to OOM on worker-3", memory_type="event")

That’s it — three memories stored in ~21ms total. No LLM processing, no API calls, just a fast embed + SQL insert.

Search your memories

# 4-channel hybrid search: semantic + BM25 + entity + temporal results = await memory.recall("user preferences") for r in results: print(f"[{r.memory_type}] {r.content} (score: {r.score:.3f})")

recall() runs four search strategies in parallel inside a single SQL query, fuses the results with Reciprocal Rank Fusion, and reranks with a cross-encoder. You get the most relevant memories in ~25ms.

Inject memories into an LLM prompt

# Get formatted context ready for a system prompt context = await memory.auto_recall("help with deployment") print(context) # → [Memory Context] # → - Last deploy failed due to OOM on worker-3 # → - User is a Python developer based in Berlin

auto_recall() returns a pre-formatted string you can drop directly into your LLM’s system prompt. It handles token budgeting so you don’t overflow the context window.

Memory types

Unforget has three memory types, each designed for a different purpose:

TypeWhen to useDefault importanceLifespan
insightDistilled facts, preferences, rules0.5Long-lived (default)
eventSpecific interactions, timestamped happenings0.5Medium
rawIngested conversation chunks0.3Auto-expires after 30 days

Most of the time, use insight (the default) for facts and event for things that happened. The raw type is used by the ingestion pipeline and gets promoted to insights by background consolidation.

Ingest a full conversation

Got a conversation transcript? Ingest it in one call:

await memory.ingest([ {"role": "user", "content": "My printer won't connect to wifi"}, {"role": "assistant", "content": "Let's try resetting the network settings..."}, {"role": "user", "content": "That worked! Thanks."}, ], mode="background")

Three ingestion modes:

  • background (default) — stores raw chunks instantly, consolidation promotes later. Zero LLM calls.
  • immediate — LLM extracts key facts upfront. Requires an llm callable.
  • lightweight — NER + heuristics, stores as events. Zero LLM calls.

Update and version memories

When facts change, supersede the old memory instead of deleting it. This preserves the full history:

m = await memory.write("User lives in Austin, TX") # Six months later, they move old, new = await memory.supersede(m.id, "User lives in Denver, CO") # Query what was true at a specific point in time january_memories = await memory.timeline(at=datetime(2026, 1, 15)) # → [MemoryItem(content="User lives in Austin, TX")]

Clean up

await store.close()

Next steps

Last updated on
Apache 2.0 · Unforget