Architecture
This page explains the internal design of Kore Memory: how the Ebbinghaus decay engine works, how importance is scored without an LLM, how semantic search operates locally, and how memory compression merges redundant knowledge.
System Overview
+-----------------+
| REST API |
| (FastAPI) |
+--------+--------+
|
+--------------+--------------+
| | |
+--------v---+ +------v------+ +----v--------+
| Decay | | Embedding | | Importance |
| Engine | | Engine | | Scorer |
+--------+---+ +------+------+ +----+--------+
| | |
+--------------+--------------+
|
+--------v--------+
| SQLite |
| (FTS5 + WAL) |
+-----------------+
All components run in a single process. There are no external services, message queues, or background workers. The database is SQLite in WAL mode for concurrent reads during writes.
Ebbinghaus Decay Engine
The decay engine is the core differentiator of Kore Memory. It models human forgetting using Hermann Ebbinghaus's forgetting curve (1885), adapted for computational use.
The Formula
decay = e^(-t * ln(2) / half_life)
Where:
- t = time elapsed since the memory was last accessed (in days)
- half_life = the half-life in days, determined by importance level
- e = Euler's number (~2.71828)
- ln(2) = the natural logarithm of 2 (~0.693)
This means that after exactly one half-life, the decay score drops to 0.5 (50% retention). After two half-lives, it drops to 0.25, and so on.
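The formula can be sketched in a few lines of Python (a minimal illustration of the curve, not Kore's internal code):

```python
import math

def decay(t_days: float, half_life_days: float) -> float:
    """Ebbinghaus decay: retention after t_days for a given half-life."""
    return math.exp(-t_days * math.log(2) / half_life_days)

decay(0, 30)    # 1.0  (fresh memory)
decay(30, 30)   # 0.5  (one half-life)
decay(60, 30)   # 0.25 (two half-lives)
```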
Memory Half-Lives
Each importance level maps to a different half-life:
| Importance | Label | Half-Life | 50% Decay At | 10% Decay At |
|---|---|---|---|---|
| 1 | Low | 7 days | 1 week | ~23 days |
| 2 | Normal | 14 days | 2 weeks | ~46 days |
| 3 | Important | 30 days | 1 month | ~100 days |
| 4 | High | 90 days | 3 months | ~299 days |
| 5 | Critical | 365 days | 1 year | ~3.3 years |
Decay Score Over Time (Importance 3)
Day 0: decay = 1.000 ████████████████████ 100%
Day 7: decay = 0.851 █████████████████ 85%
Day 14: decay = 0.724 ██████████████ 72%
Day 30: decay = 0.500 ██████████ 50%
Day 60: decay = 0.250 █████ 25%
Day 90: decay = 0.125 ██ 13%
Day 120: decay = 0.063 █ 6%
Spaced Repetition Effect
Every time a memory is retrieved (via search, timeline, or direct access), its decay score is reinforced:
decay_score += 0.05
Additionally, each retrieval extends the memory's effective half-life by +15%. This mirrors the spaced repetition effect in human learning: memories that are regularly accessed become more durable over time.
For example, an importance-3 memory (30-day half-life) that is accessed 5 times would have an effective half-life of:
30 * (1.15)^5 = 30 * 2.011 = ~60 days
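The two reinforcement rules can be expressed as small helpers (function names are illustrative; the cap at 1.0 is an assumption, the actual clamp may differ):

```python
def effective_half_life(base_days: float, retrievals: int) -> float:
    # Each retrieval extends the half-life by 15% (multiplicative)
    return base_days * 1.15 ** retrievals

def reinforce(decay_score: float) -> float:
    # Each retrieval bumps the decay score by 0.05
    # (capped at 1.0 here for illustration)
    return min(1.0, decay_score + 0.05)

effective_half_life(30, 5)  # ~60.3 days, matching the example above
```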
When Decay Runs
The decay engine runs when you call POST /decay/run or the memory_decay_run MCP tool. It:
- Iterates over all active memories
- Recalculates each memory's decay_score using the formula above
- Removes memories where decay_score falls below a minimum threshold (effectively forgotten)
- Updates the database in a single transaction
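A decay run can be sketched as a single pass over the memories (field names and the threshold value are illustrative, not Kore's schema):

```python
import math

FORGET_THRESHOLD = 0.05  # assumed cutoff; the actual value may differ

def run_decay(memories: list[dict], now_days: float) -> list[dict]:
    """Recalculate decay for each memory and drop the forgotten ones."""
    survivors = []
    for m in memories:
        t = now_days - m["last_accessed_day"]
        m["decay_score"] = math.exp(-t * math.log(2) / m["half_life_days"])
        if m["decay_score"] >= FORGET_THRESHOLD:
            survivors.append(m)
    return survivors
```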
Schedule decay runs periodically (e.g., daily via cron) for best results. The web dashboard also provides a one-click button.
Auto-Importance Scoring
When importance is set to 1 (or omitted), Kore automatically scores the memory on a 1--5 scale using local heuristics. No LLM is required.
Scoring Signals
The scorer analyzes the content text and assigns importance based on multiple signals:
| Signal | Effect |
|---|---|
| Content length | Longer, more detailed content scores higher |
| Specificity keywords | Terms like "always", "never", "critical", "important" boost score |
| Category weight | decision and preference categories get a boost |
| Numeric data | Presence of numbers, dates, or measurements increases score |
| Named entities | Capitalized words (names, projects) increase specificity |
| Temporal markers | Words like "deadline", "by Friday", "Q3" boost score |
| Negation patterns | "Do not", "never", "avoid" indicate important constraints |
Score Distribution
In practice, auto-scored memories follow a natural distribution:
| Score | Frequency | Example |
|---|---|---|
| 1 (Low) | ~5% | Trivial observations, small talk |
| 2 (Normal) | ~35% | General facts, routine information |
| 3 (Important) | ~40% | Project details, preferences, decisions |
| 4 (High) | ~15% | Critical constraints, key relationships |
| 5 (Critical) | ~5% | Security rules, architecture decisions |
You can always override auto-scoring by setting importance to 2--5 explicitly.
Semantic Search
Semantic search uses local sentence-transformer models to find conceptually similar memories, even when the exact words differ.
How It Works
1. At save time: The memory content is embedded into a 384-dimensional vector using the configured sentence-transformer model (default: paraphrase-multilingual-MiniLM-L12-v2)
2. At search time: The query is embedded using the same model, then compared against all stored embeddings using cosine similarity
3. Ranking: Results are ranked by effective score:
effective_score = cosine_similarity * decay_score * (importance / 5)
This formula ensures three properties:
- Relevance -- Semantically similar memories rank higher
- Recency -- Fresh memories rank higher than stale ones
- Importance -- Critical memories rank higher than trivial ones
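The ranking step can be sketched in pure Python (the tuple layout for memories is illustrative, not Kore's data model):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, memories, k=5):
    """memories: (embedding, decay_score, importance, memory_id) tuples."""
    scored = [
        (cosine(query_vec, vec) * decay * (imp / 5), mem_id)
        for vec, decay, imp, mem_id in memories
    ]
    scored.sort(reverse=True)
    return [mem_id for _, mem_id in scored[:k]]
```

A stale memory with the same similarity and importance ranks below a fresh one, because its decay_score multiplies the effective score down.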
Multilingual Support
The default model (paraphrase-multilingual-MiniLM-L12-v2) supports 50+ languages. A memory saved in Italian can be found with an English query, and vice versa:
# Save in Italian
curl -X POST http://localhost:8765/save \
  -H "Content-Type: application/json" \
  -d '{"content": "L'\''utente preferisce risposte concise"}'
# Search in English
curl "http://localhost:8765/search?q=user+response+preferences&semantic=true"
# Finds the Italian memory
FTS5 Fallback
When the semantic extra is not installed, search falls back to SQLite FTS5 full-text search. FTS5 is keyword-based and works well for exact matches but does not understand synonyms or cross-language queries.
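The keyword-only nature of the fallback is easy to demonstrate with Python's built-in sqlite3 (a stand-alone example table, not Kore's schema; assumes your SQLite build includes FTS5, as standard CPython builds do):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE notes USING fts5(content)")
con.execute("INSERT INTO notes VALUES ('user prefers concise responses')")

# An exact keyword matches...
hit = con.execute("SELECT content FROM notes WHERE notes MATCH 'concise'").fetchall()
# ...but a synonym finds nothing, unlike semantic search
miss = con.execute("SELECT content FROM notes WHERE notes MATCH 'brief'").fetchall()
```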
Memory Compression
The compression engine identifies and merges redundant memories to prevent knowledge bloat.
Algorithm
1. Compute pairwise cosine similarity between all memory embeddings
2. Identify pairs where similarity exceeds the threshold (default: 0.88)
3. For each pair:
   - Keep the memory with higher importance
   - Append unique information from the lower-importance memory
   - Archive or delete the redundant memory
   - Re-embed the merged memory
Example
Before compression:
- Memory A (importance 3): "React 19 supports server components natively"
- Memory B (importance 2): "React version 19 has built-in server component support"
Cosine similarity: 0.94 (above 0.88 threshold)
After compression:
- Memory A (importance 3): "React 19 supports server components natively" (kept)
- Memory B: archived
Tuning Compression
Adjust KORE_SIMILARITY_THRESHOLD to control aggressiveness:
# Conservative: only merge near-duplicates
KORE_SIMILARITY_THRESHOLD=0.95 kore
# Aggressive: merge loosely similar memories
KORE_SIMILARITY_THRESHOLD=0.80 kore
Setting the threshold too low (below 0.80) may merge memories that contain distinct information. The default of 0.88 is a safe balance between deduplication and information preservation.
Data Storage
SQLite with WAL
Kore uses SQLite in Write-Ahead Logging (WAL) mode, which allows:
- Concurrent reads during writes
- Crash recovery without data loss
- Single-file database with no external dependencies
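Enabling WAL is a single pragma, shown here with Python's sqlite3 (the path is illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "kore.db")
con = sqlite3.connect(path)
# The journal mode persists in the database file once set
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
```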
Schema Overview
The database contains these core tables:
| Table | Purpose |
|---|---|
| memories | Core memory storage (content, category, importance, decay_score, timestamps) |
| embeddings | Vector embeddings for semantic search |
| tags | Tag-to-memory mappings |
| relations | Memory-to-memory relations (bidirectional) |
| archive | Soft-deleted memories |
| fts_index | FTS5 full-text search index |
Thread Safety
SQLite connections are managed via a thread-safe connection pool. Each request gets its own connection, preventing data races in concurrent scenarios.
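One common way to give each thread its own connection is threading.local, sketched here as an illustrative pattern (not necessarily Kore's implementation):

```python
import sqlite3
import threading

_local = threading.local()

def get_conn(path: str) -> sqlite3.Connection:
    """Return this thread's connection, creating it on first use."""
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(path)
    return _local.conn
```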
Memory Lifecycle
A memory goes through the following lifecycle:
Created (decay=1.0)
│
├── Searched/accessed → reinforced (decay += 0.05, half-life *= 1.15)
│
├── Decay run → decay recalculated
│ │
│ ├── decay > threshold → memory survives
│ │
│ └── decay < threshold → memory removed
│
├── Compression → merged with similar memory
│
├── Archive → soft-deleted (restorable)
│
├── TTL expired → removed by cleanup
│
└── Delete → permanently removed
Request Flow
A typical save-then-search flow:
1. Client sends POST /save
2. Server validates input (Pydantic v2)
3. Auto-importance scorer assigns importance
4. Sentence-transformer generates embedding
5. SQLite stores memory + embedding in a transaction
6. FTS5 index is updated
7. Response returned with memory ID
8. Client sends GET /search?q=...&semantic=true
9. Query is embedded using the same model
10. Cosine similarity computed against all embeddings
11. Results filtered by agent namespace
12. Effective scores calculated (similarity * decay * importance)
13. Results sorted and paginated
14. Decay scores reinforced for accessed memories
15. Response returned
Performance Characteristics
| Operation | Latency (typical) |
|---|---|
| Save (with embedding) | 10--50 ms |
| Save (FTS5 only) | 1--5 ms |
| Search (semantic, 1000 memories) | 20--100 ms |
| Search (FTS5) | 1--10 ms |
| Decay run (1000 memories) | 50--200 ms |
| Compression (1000 memories) | 200--1000 ms |
| Batch save (100 memories) | 100--500 ms |
All benchmarks on a modern laptop CPU (no GPU). Save, search, and decay runs scale roughly linearly with memory count; compression scales quadratically in the worst case, since it compares all memory pairs.