Configuration

Unforget works well with defaults — you don’t need to configure anything to get started. But when you want to tune performance, swap the embedding model, or adjust retrieval behavior, everything is configurable through the MemoryStore constructor.

MemoryStore options

    from unforget import MemoryStore

    store = MemoryStore(
        database_url="postgresql://user:pass@localhost/db",

        # Embedding model: runs locally, no API keys needed
        # Default: all-MiniLM-L6-v2 (384 dims, ~3ms with ONNX)
        embedding_model="all-MiniLM-L6-v2",

        # Cross-encoder reranker: adds ~10ms but significantly improves precision
        # Disable if you need the lowest possible latency
        reranker_enabled=True,
        reranker_model="cross-encoder/ms-marco-MiniLM-L-6-v2",

        # Connection pool: adjust based on your concurrency needs
        pool_min_size=1,
        pool_max_size=10,

        # Retrieval tuning
        ef_search=100,  # HNSW recall parameter (higher = more accurate, slower)
        rrf_k=60,       # Reciprocal Rank Fusion k (lower = more weight on top results)

        # Channel weights: how much each retrieval channel contributes to the final score
        channel_weights={
            "semantic": 1.0,  # Vector similarity (pgvector cosine)
            "bm25": 1.0,      # Full-text keyword search
            "entity": 0.7,    # Named entity overlap (people, places, dates)
            "temporal": 0.3,  # Recently accessed memories
        },

        # Type boosts: applied after fusion to favor certain memory types
        type_boosts={
            "insight": 1.5,  # Facts and preferences rank highest
            "event": 1.0,    # Interactions rank normally
            "raw": 0.5,      # Raw chunks rank lowest
        },

        # Quotas: protect against runaway writes
        max_writes_per_minute=100,      # Per agent
        max_memories_per_agent=10_000,  # Total per agent

        # Recall cache: repeated queries return instantly
        recall_cache_ttl=60.0,   # Cache lifetime in seconds
        recall_cache_size=1000,  # Max cached queries
    )
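The interaction between `rrf_k`, `channel_weights`, and `type_boosts` is easier to reason about with numbers. The sketch below uses the textbook weighted Reciprocal Rank Fusion formula (each channel contributes `weight / (rrf_k + rank)`) followed by a multiplicative type boost. This is an assumption about how the options combine, not Unforget's actual scoring code; the `fuse` function and the sample memory IDs are hypothetical.

```python
from collections import defaultdict


def fuse(rankings, channel_weights, type_boosts, memory_types, rrf_k=60):
    """Weighted RRF: each channel adds weight / (rrf_k + rank) per memory."""
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        weight = channel_weights.get(channel, 0.0)
        for rank, memory_id in enumerate(ranked_ids, start=1):
            scores[memory_id] += weight / (rrf_k + rank)
    # Assumption: type boosts multiply the fused score after fusion.
    boosted = {
        memory_id: score * type_boosts.get(memory_types[memory_id], 1.0)
        for memory_id, score in scores.items()
    }
    return sorted(boosted, key=boosted.get, reverse=True)


# Hypothetical per-channel result lists, best match first.
rankings = {
    "semantic": ["m1", "m2", "m3"],
    "bm25": ["m2", "m1", "m3"],
    "entity": ["m3", "m2"],
    "temporal": ["m3"],
}
memory_types = {"m1": "insight", "m2": "event", "m3": "raw"}
channel_weights = {"semantic": 1.0, "bm25": 1.0, "entity": 0.7, "temporal": 0.3}
type_boosts = {"insight": 1.5, "event": 1.0, "raw": 0.5}

ranked = fuse(rankings, channel_weights, type_boosts, memory_types)
unboosted = fuse(rankings, channel_weights, {}, memory_types)
```

In this toy data the raw chunk `m3` appears in all four channels and wins on fused score alone, but once the type boosts are applied the insight `m1` moves to the top. Under this formula, lowering `rrf_k` widens the gap between rank 1 and rank 2 within each channel.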

Custom embedder

The default embedder runs locally — fast and free. If you need higher quality embeddings or want to use a specific provider, swap it out:

OpenAI embeddings

    from unforget import MemoryStore, OpenAIEmbedder

    store = MemoryStore(
        "postgresql://...",
        embedder=OpenAIEmbedder(),  # text-embedding-3-small
        # or, for 3072 dims and higher quality:
        # embedder=OpenAIEmbedder(model="text-embedding-3-large"),
    )

Custom provider

Implement the BaseEmbedder interface to use any embedding provider, such as Cohere, Voyage, or a local model:

    from unforget import MemoryStore
    from unforget.embedder import BaseEmbedder

    # cohere_client is assumed to be a Cohere client configured elsewhere;
    # the exact embed() arguments depend on your Cohere SDK version.

    class CohereEmbedder(BaseEmbedder):
        @property
        def dims(self) -> int:
            return 1024

        def embed(self, text: str) -> list[float]:
            return cohere_client.embed([text]).embeddings[0]

        def embed_batch(self, texts: list[str]) -> list[list[float]]:
            return cohere_client.embed(texts).embeddings

    store = MemoryStore("postgresql://...", embedder=CohereEmbedder())

The database schema auto-adapts to the embedding dimensionality. Switch embedders and the new dimensions are handled automatically.
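Because the schema follows the dimensionality the embedder reports, it may be worth checking that a custom embedder's vectors really match its declared `dims` before pointing it at a database. The sketch below is standalone and does not import Unforget: `EmbedderLike` is a hypothetical stand-in that mirrors the `dims` / `embed` / `embed_batch` shape shown above, `HashingEmbedder` is a toy embedder for illustration only, and `check_embedder` is a hypothetical helper.

```python
import hashlib
import math
from abc import ABC, abstractmethod


class EmbedderLike(ABC):
    """Hypothetical stand-in mirroring the dims/embed/embed_batch interface."""

    @property
    @abstractmethod
    def dims(self) -> int: ...

    @abstractmethod
    def embed(self, text: str) -> list[float]: ...

    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        # Default batch behavior: embed one text at a time.
        return [self.embed(text) for text in texts]


class HashingEmbedder(EmbedderLike):
    """Toy embedder: hashes each token into one of `dims` buckets."""

    def __init__(self, dims: int = 16) -> None:
        self._dims = dims

    @property
    def dims(self) -> int:
        return self._dims

    def embed(self, text: str) -> list[float]:
        vector = [0.0] * self._dims
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vector[int.from_bytes(digest[:4], "big") % self._dims] += 1.0
        # L2-normalize so cosine similarity reduces to a dot product.
        norm = math.sqrt(sum(v * v for v in vector)) or 1.0
        return [v / norm for v in vector]


def check_embedder(embedder: EmbedderLike, samples: list[str]) -> None:
    """Raise ValueError if any vector length disagrees with embedder.dims."""
    for vector in embedder.embed_batch(samples):
        if len(vector) != embedder.dims:
            raise ValueError(f"expected {embedder.dims} dims, got {len(vector)}")


embedder = HashingEmbedder(dims=16)
check_embedder(embedder, ["hello world", "unforget stores memories"])
```

The same `check_embedder` idea applies to a real provider wrapper: a mismatch between the declared and actual vector length is cheaper to catch here than after memories have been written.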

Background consolidation

Consolidation runs in the background to keep your memory store clean and efficient. It handles four tasks:

  • Dedup: merges near-identical memories (cosine similarity > 0.92)
  • Decay: gradually reduces importance of memories that haven’t been accessed
  • Expire: soft-deletes raw chunks past their 30-day TTL
  • Promote: distills raw conversation chunks into clean insights (requires an LLM)

    from unforget import ConsolidationScheduler

    # Define an LLM callable for promotion (optional).
    # openai_client is assumed to be an async OpenAI client created elsewhere.
    async def my_llm(prompt: str) -> str:
        response = await openai_client.chat.completions.create(
            model="gpt-4.1-nano",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    scheduler = ConsolidationScheduler(
        store,
        interval_seconds=3600,      # Run every hour
        write_threshold=50,         # Or after 50 writes, whichever comes first
        llm=my_llm,                 # Optional: enables raw → insight promotion
        similarity_threshold=0.92,  # Dedup cosine threshold
    )
    store.attach_scheduler(scheduler)

    # Call from inside an async function or a running event loop
    await scheduler.start()

Without an LLM, consolidation still handles dedup, decay, and expiry — just not promotion.
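To make the dedup threshold concrete, here is a minimal sketch of threshold-based duplicate detection using cosine similarity. It is an illustration of the idea only, not Unforget's consolidation code: the `find_duplicates` function, the keep-the-earlier-memory policy, and the three-dimensional sample vectors are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(embeddings, threshold=0.92):
    """Return (kept_id, duplicate_id) pairs; the earlier memory is kept."""
    kept = []
    duplicates = []
    for memory_id, vector in embeddings.items():
        # Compare against memories already kept; first match above threshold wins.
        match = next(
            (
                kept_id
                for kept_id in kept
                if cosine_similarity(embeddings[kept_id], vector) > threshold
            ),
            None,
        )
        if match is None:
            kept.append(memory_id)
        else:
            duplicates.append((match, memory_id))
    return duplicates


# Hypothetical embeddings: "b" is nearly parallel to "a", "c" is orthogonal.
embeddings = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.99, 0.05, 0.0],
    "c": [0.0, 1.0, 0.0],
}
pairs = find_duplicates(embeddings)
```

Here `a` and `b` have a cosine similarity of roughly 0.999, so `b` is flagged as a duplicate of `a`, while `c` is kept. Raising the threshold makes dedup more conservative; lowering it merges more aggressively at the risk of collapsing distinct memories.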

Tuning retrieval

The four retrieval channels each contribute differently depending on your use case:

| Channel | Good for | Weight guidance |
|---|---|---|
| Semantic | Conceptual similarity, paraphrased queries | Keep at 1.0 (anchor) |
| BM25 | Exact keyword matches, names, specific terms | Increase for factual Q&A |
| Entity | Queries about specific people, places, dates | Increase for entity-heavy data |
| Temporal | Recent context, “what did we just discuss” | Increase for chat-like applications |

Example: for a customer support bot where exact ticket numbers and product names matter:

    store = MemoryStore(
        "postgresql://...",
        channel_weights={
            "semantic": 1.0,
            "bm25": 1.2,      # Boost keyword matching
            "entity": 0.8,    # Product names, ticket IDs
            "temporal": 0.2,  # Recency less important
        },
    )
Apache 2.0 · Unforget