Configuration
Unforget works well with its defaults, so you don't need to configure anything to get started. When you want to tune performance, swap the embedding model, or adjust retrieval behavior, every option is set through the MemoryStore constructor.
MemoryStore options
from unforget import MemoryStore

store = MemoryStore(
    database_url="postgresql://user:pass@localhost/db",

    # Embedding model: runs locally, no API keys needed
    # Default: all-MiniLM-L6-v2 (384 dims, ~3ms with ONNX)
    embedding_model="all-MiniLM-L6-v2",

    # Cross-encoder reranker: adds ~10ms but significantly improves precision
    # Disable if you need the lowest possible latency
    reranker_enabled=True,
    reranker_model="cross-encoder/ms-marco-MiniLM-L-6-v2",

    # Connection pool: adjust based on your concurrency needs
    pool_min_size=1,
    pool_max_size=10,

    # Retrieval tuning
    ef_search=100,  # HNSW recall parameter (higher = more accurate, slower)
    rrf_k=60,  # Reciprocal Rank Fusion k (lower = more weight on top results)

    # Channel weights: how much each retrieval channel contributes to the final score
    channel_weights={
        "semantic": 1.0,  # Vector similarity (pgvector cosine)
        "bm25": 1.0,  # Full-text keyword search
        "entity": 0.7,  # Named entity overlap (people, places, dates)
        "temporal": 0.3,  # Recently accessed memories
    },

    # Type boosts: applied after fusion to favor certain memory types
    type_boosts={
        "insight": 1.5,  # Facts and preferences rank highest
        "event": 1.0,  # Interactions rank normally
        "raw": 0.5,  # Raw chunks rank lowest
    },

    # Quotas: protect against runaway writes
    max_writes_per_minute=100,  # Per agent
    max_memories_per_agent=10_000,  # Total per agent

    # Recall cache: repeated queries are served from memory
    recall_cache_ttl=60.0,  # Cache lifetime in seconds
    recall_cache_size=1000,  # Max cached queries
)

Custom embedder
The default embedder runs locally, so it is fast and free. If you need higher-quality embeddings or want to use a specific provider, swap it out:
OpenAI embeddings
from unforget import MemoryStore, OpenAIEmbedder

store = MemoryStore(
    "postgresql://...",
    embedder=OpenAIEmbedder(),  # text-embedding-3-small
    # Or pass a model explicitly (3072 dims, higher quality):
    # embedder=OpenAIEmbedder(model="text-embedding-3-large"),
)

Custom provider
Implement the BaseEmbedder interface to use any embedding provider, such as Cohere, Voyage, or a local model:
from unforget import MemoryStore
from unforget.embedder import BaseEmbedder

# cohere_client is assumed to be a Cohere client you have already configured;
# adapt the embed(...) calls to the SDK version you use.

class CohereEmbedder(BaseEmbedder):
    @property
    def dims(self) -> int:
        # Must match the output size of the model you call
        return 1024

    def embed(self, text: str) -> list[float]:
        return cohere_client.embed([text]).embeddings[0]

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return cohere_client.embed(texts).embeddings

store = MemoryStore("postgresql://...", embedder=CohereEmbedder())

The database schema auto-adapts to the embedding dimensionality. Switch embedders and the new dimensions are handled automatically.
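When testing code that depends on a custom embedder, it can help to have one that needs no network access or model download. The sketch below is a deterministic toy embedder exposing the same three members as the interface above (dims, embed, embed_batch). The class name HashingEmbedder and the hashing scheme are illustrative assumptions, not part of Unforget, and the vectors carry no real semantic meaning. It is written as a plain class so that it runs on its own; to plug it into MemoryStore you would presumably subclass BaseEmbedder instead.

```python
import hashlib
import math


class HashingEmbedder:
    """Hypothetical offline embedder: hashes each token into a fixed-size vector."""

    def __init__(self, dims: int = 64) -> None:
        self._dims = dims

    @property
    def dims(self) -> int:
        return self._dims

    def embed(self, text: str) -> list[float]:
        vector = [0.0] * self._dims
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            # The first four bytes pick the bucket, the fifth picks the sign.
            index = int.from_bytes(digest[:4], "big") % self._dims
            sign = 1.0 if digest[4] % 2 == 0 else -1.0
            vector[index] += sign
        # L2-normalize so cosine similarity reduces to a dot product.
        norm = math.sqrt(sum(v * v for v in vector))
        return [v / norm for v in vector] if norm else vector

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        return [self.embed(text) for text in texts]


embedder = HashingEmbedder(dims=32)
vector = embedder.embed("hello")
```

Because the output depends only on the input text, repeated test runs produce identical vectors.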
Background consolidation
Consolidation runs in the background to keep your memory store clean and efficient. It handles four tasks:
- Dedup — merges near-identical memories (cosine similarity > 0.92)
- Decay — gradually reduces importance of memories that haven’t been accessed
- Expire — soft-deletes raw chunks past their 30-day TTL
- Promote — distills raw conversation chunks into clean insights (requires an LLM)
from openai import AsyncOpenAI
from unforget import ConsolidationScheduler

openai_client = AsyncOpenAI()  # Reads OPENAI_API_KEY from the environment

# Define an LLM callable for promotion (optional)
async def my_llm(prompt: str) -> str:
    response = await openai_client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

scheduler = ConsolidationScheduler(
    store,
    interval_seconds=3600,  # Run every hour
    write_threshold=50,  # Or after 50 writes, whichever comes first
    llm=my_llm,  # Optional: enables raw → insight promotion
    similarity_threshold=0.92,  # Dedup cosine threshold
)
store.attach_scheduler(scheduler)
await scheduler.start()  # Call from inside an async function

Without an LLM, consolidation still handles dedup, decay, and expiry, but not promotion.
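To make the dedup step concrete, here is a minimal sketch of threshold-based near-duplicate detection using cosine similarity. It is not Unforget's implementation: the function names, the greedy "keep the earlier memory" policy, and the in-memory pairwise comparison are all assumptions for illustration. A real system would more likely push the comparison into the vector index than loop over every pair.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicates(
    embeddings: dict[str, list[float]],
    threshold: float = 0.92,
) -> list[tuple[str, str]]:
    """Return (kept_id, duplicate_id) pairs, keeping the earlier memory of each pair."""
    kept: list[str] = []
    duplicates: list[tuple[str, str]] = []
    for memory_id, vector in embeddings.items():
        # Compare only against memories already kept (greedy, order-dependent).
        match = next(
            (k for k in kept if cosine_similarity(embeddings[k], vector) > threshold),
            None,
        )
        if match is None:
            kept.append(memory_id)
        else:
            duplicates.append((match, memory_id))
    return duplicates


# Hypothetical two-dimensional embeddings: "a" and "b" point almost the same way.
embeddings = {"a": [1.0, 0.0], "b": [0.99, 0.1], "c": [0.0, 1.0]}
pairs = find_near_duplicates(embeddings, threshold=0.92)
```

Raising similarity_threshold makes dedup more conservative: fewer pairs exceed it, so fewer memories are merged.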
Tuning retrieval
The four retrieval channels each contribute differently depending on your use case:
| Channel | Good for | Weight guidance |
|---|---|---|
| Semantic | Conceptual similarity, paraphrased queries | Keep at 1.0 (anchor) |
| BM25 | Exact keyword matches, names, specific terms | Increase for factual Q&A |
| Entity | Queries about specific people, places, dates | Increase for entity-heavy data |
| Temporal | Recent context, “what did we just discuss” | Increase for chat-like applications |
Example: for a customer support bot where exact ticket numbers and product names matter:
store = MemoryStore(
    "postgresql://...",
    channel_weights={
        "semantic": 1.0,
        "bm25": 1.2,  # Boost keyword matching
        "entity": 0.8,  # Product names, ticket IDs
        "temporal": 0.2,  # Recency less important
    },
)
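
For intuition about how channel_weights, rrf_k, and type_boosts might interact, the sketch below implements a standard weighted Reciprocal Rank Fusion, in which each channel adds weight / (rrf_k + rank) to a memory's score. It then multiplies the fused score by a per-type boost. Both the exact formula and the multiplicative boost are assumptions about how these settings combine, and the function names and sample rankings are hypothetical; Unforget's internals may differ.

```python
from collections import defaultdict


def weighted_rrf(
    rankings: dict[str, list[str]],
    channel_weights: dict[str, float],
    rrf_k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse per-channel rankings (best first) into one scored list."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = channel_weights.get(channel, 0.0)
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += weight / (rrf_k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def apply_type_boosts(
    fused: list[tuple[str, float]],
    memory_types: dict[str, str],
    type_boosts: dict[str, float],
) -> list[tuple[str, float]]:
    """Scale each fused score by its memory type's boost (assumed multiplicative)."""
    boosted = [
        (memory_id, score * type_boosts.get(memory_types[memory_id], 1.0))
        for memory_id, score in fused
    ]
    return sorted(boosted, key=lambda item: item[1], reverse=True)


# Hypothetical per-channel results for one query, best match first.
rankings = {
    "semantic": ["m1", "m2", "m3"],
    "bm25": ["m2", "m3", "m1"],
    "entity": ["m2"],
    "temporal": ["m3", "m1"],
}
channel_weights = {"semantic": 1.0, "bm25": 1.0, "entity": 0.7, "temporal": 0.3}
memory_types = {"m1": "insight", "m2": "raw", "m3": "event"}
type_boosts = {"insight": 1.5, "event": 1.0, "raw": 0.5}

fused = weighted_rrf(rankings, channel_weights, rrf_k=60)
boosted = apply_type_boosts(fused, memory_types, type_boosts)
```

In this example m2 leads after fusion because three channels agree on it, but the raw-chunk boost of 0.5 drops it below the insight m1 once type boosts are applied. With rrf_k at 60, differences between adjacent ranks are small, so channel weights and type boosts have a visible effect on the final order.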