Architecture

This page explains the internal design of Kore Memory: how the Ebbinghaus decay engine works, how importance is scored without an LLM, how semantic search operates locally, and how memory compression merges redundant knowledge.

System Overview

                +-----------------+
                |    REST API     |
                |    (FastAPI)    |
                +--------+--------+
                         |
         +---------------+---------------+
         |               |               |
  +------v-----+  +------v------+  +-----v-------+
  |   Decay    |  |  Embedding  |  | Importance  |
  |   Engine   |  |   Engine    |  |   Scorer    |
  +------+-----+  +------+------+  +-----+-------+
         |               |               |
         +---------------+---------------+
                         |
                +--------v--------+
                |     SQLite      |
                |  (FTS5 + WAL)   |
                +-----------------+

All components run in a single process. There are no external services, message queues, or background workers. The database is SQLite in WAL mode for concurrent reads during writes.

Ebbinghaus Decay Engine

The decay engine is the core differentiator of Kore Memory. It models human forgetting using Hermann Ebbinghaus's forgetting curve (1885), adapted for computational use.

The Formula

decay = e^(-t * ln(2) / half_life)

Where:

  • t = time elapsed since the memory was last accessed (in days)
  • half_life = the half-life in days, determined by importance level
  • e = Euler's number (~2.71828)
  • ln(2) = natural logarithm of 2 (~0.693)

This means that after exactly one half-life, the decay score drops to 0.5 (50% retention). After two half-lives, it drops to 0.25, and so on.
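The formula can be sketched in a few lines of Python. The function name and signature are illustrative, not Kore's actual internals:

```python
import math

# Sketch of the Ebbinghaus decay formula described above.
def decay_score(days_elapsed: float, half_life_days: float) -> float:
    """Exponential decay: 0.5 after one half-life, 0.25 after two."""
    return math.exp(-days_elapsed * math.log(2) / half_life_days)

# Importance 3 maps to a 30-day half-life:
print(round(decay_score(30, 30), 3))  # 0.5
print(round(decay_score(60, 30), 3))  # 0.25
```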

Memory Half-Lives

Each importance level maps to a different half-life:

| Importance | Label     | Half-Life | 50% Decay At | 10% Decay At |
|------------|-----------|-----------|--------------|--------------|
| 1          | Low       | 7 days    | 1 week       | ~23 days     |
| 2          | Normal    | 14 days   | 2 weeks      | ~46 days     |
| 3          | Important | 30 days   | 1 month      | ~100 days    |
| 4          | High      | 90 days   | 3 months     | ~299 days    |
| 5          | Critical  | 365 days  | 1 year       | ~3.3 years   |

Decay Score Over Time (Importance 3)

Day   0: decay = 1.000  ████████████████████  100%
Day   7: decay = 0.851  █████████████████      85%
Day  14: decay = 0.724  ██████████████         72%
Day  30: decay = 0.500  ██████████             50%
Day  60: decay = 0.250  █████                  25%
Day  90: decay = 0.125  ██                     13%
Day 120: decay = 0.063  █                       6%

Spaced Repetition Effect

Every time a memory is retrieved (via search, timeline, or direct access), its decay score is reinforced:

decay_score += 0.05

Additionally, each retrieval extends the memory's effective half-life by +15%. This mirrors the spaced repetition effect in human learning: memories that are regularly accessed become more durable over time.

For example, an importance-3 memory (30-day half-life) that is accessed 5 times would have an effective half-life of:

30 * (1.15)^5 = 30 * 2.011 = ~60 days
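The reinforcement rule can be sketched as follows; the function name and the cap at 1.0 for the decay score are illustrative assumptions:

```python
# Sketch of the spaced-repetition reinforcement described above:
# each retrieval adds 0.05 to the decay score and extends the
# effective half-life by 15%.
def reinforce(decay: float, half_life: float, accesses: int = 1):
    for _ in range(accesses):
        decay = min(1.0, decay + 0.05)  # assumed cap at 1.0
        half_life *= 1.15
    return decay, half_life

# An importance-3 memory (30-day half-life) accessed 5 times:
_, hl = reinforce(1.0, 30.0, accesses=5)
print(round(hl, 1))  # 60.3
```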

When Decay Runs

The decay engine runs when you call POST /decay/run or the memory_decay_run MCP tool. It:

  1. Iterates over all active memories
  2. Recalculates each memory's decay_score using the formula above
  3. Removes memories where decay_score falls below a minimum threshold (effectively forgotten)
  4. Updates the database in a single transaction
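The steps above can be sketched against a minimal `memories` table. Column names, the threshold value, and the `half_lives` mapping are assumptions for illustration, not Kore's actual schema:

```python
import math
import sqlite3
import time

MIN_DECAY = 0.05  # assumed minimum threshold
DAY = 86400.0

def run_decay(conn: sqlite3.Connection, half_lives: dict[int, float]) -> int:
    """Recalculate every decay score; delete effectively forgotten memories."""
    now = time.time()
    removed = 0
    with conn:  # single transaction, as described above
        rows = conn.execute(
            "SELECT id, importance, last_accessed FROM memories"
        ).fetchall()
        for mem_id, importance, last_accessed in rows:
            t = (now - last_accessed) / DAY
            decay = math.exp(-t * math.log(2) / half_lives[importance])
            if decay < MIN_DECAY:
                conn.execute("DELETE FROM memories WHERE id = ?", (mem_id,))
                removed += 1
            else:
                conn.execute(
                    "UPDATE memories SET decay_score = ? WHERE id = ?",
                    (decay, mem_id),
                )
    return removed
```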
Tip: Schedule decay runs periodically (e.g., daily via cron) for best results. The web dashboard also provides a one-click button.

Auto-Importance Scoring

When importance is set to 1 (or omitted), Kore automatically scores the memory on a 1--5 scale using local heuristics. No LLM is required.

Scoring Signals

The scorer analyzes the content text and assigns importance based on multiple signals:

| Signal               | Effect                                                                 |
|----------------------|------------------------------------------------------------------------|
| Content length       | Longer, more detailed content scores higher                            |
| Specificity keywords | Terms like "always", "never", "critical", "important" boost the score  |
| Category weight      | decision and preference categories get a boost                         |
| Numeric data         | Presence of numbers, dates, or measurements increases the score        |
| Named entities       | Capitalized words (names, projects) increase specificity               |
| Temporal markers     | Words like "deadline", "by Friday", "Q3" boost the score               |
| Negation patterns    | "Do not", "never", "avoid" indicate important constraints              |
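A toy scorer illustrating how such signals combine; the keyword list, weights, and thresholds here are invented for illustration and are not Kore's actual heuristics:

```python
import re

# Invented keyword set covering the specificity/negation signals.
EMPHASIS = {"always", "never", "critical", "important", "deadline", "avoid"}

def auto_importance(content: str, category: str = "") -> int:
    """Combine simple heuristic signals into a 1-5 importance score."""
    score = 2  # baseline: Normal
    words = set(content.lower().split())
    if len(content.split()) > 20:               # content length
        score += 1
    if EMPHASIS & words:                        # specificity / negation keywords
        score += 1
    if category in ("decision", "preference"):  # category weight
        score += 1
    if re.search(r"\d", content):               # numeric data
        score += 1
    return max(1, min(5, score))

print(auto_importance("Always deploy to staging before prod", "decision"))  # 4
```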

Score Distribution

In practice, auto-scored memories follow a natural distribution:

| Score         | Frequency | Example                                 |
|---------------|-----------|-----------------------------------------|
| 1 (Low)       | ~5%       | Trivial observations, small talk        |
| 2 (Normal)    | ~35%      | General facts, routine information      |
| 3 (Important) | ~40%      | Project details, preferences, decisions |
| 4 (High)      | ~15%      | Critical constraints, key relationships |
| 5 (Critical)  | ~5%       | Security rules, architecture decisions  |

You can always override auto-scoring by setting importance to 2--5 explicitly.

Semantic Search

Semantic search uses local sentence-transformer models to find conceptually similar memories, even when the exact words differ.

How It Works

  1. At save time: The memory content is embedded into a 384-dimensional vector using the configured sentence-transformer model (default: paraphrase-multilingual-MiniLM-L12-v2)

  2. At search time: The query is embedded using the same model, then compared against all stored embeddings using cosine similarity

  3. Ranking: Results are ranked by effective score:

effective_score = cosine_similarity * decay_score * (importance / 5)

This formula ensures three properties:

  • Relevance -- Semantically similar memories rank higher
  • Recency -- Fresh memories rank higher than stale ones
  • Importance -- Critical memories rank higher than trivial ones
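The ranking formula can be sketched directly; the cosine similarity here is computed by hand, and the function names are illustrative:

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def effective_score(query_vec, mem_vec, decay_score, importance):
    return cosine_similarity(query_vec, mem_vec) * decay_score * (importance / 5)

# At equal similarity, a fresh critical memory outranks a stale trivial one:
fresh = effective_score([1, 0], [1, 0], decay_score=1.0, importance=5)
stale = effective_score([1, 0], [1, 0], decay_score=0.3, importance=2)
print(fresh > stale)  # True
```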

Multilingual Support

The default model (paraphrase-multilingual-MiniLM-L12-v2) supports 50+ languages. A memory saved in Italian can be found with an English query, and vice versa:

# Save in Italian
curl -X POST http://localhost:8765/save \
  -d '{"content": "L'\''utente preferisce risposte concise"}'

# Search in English
curl "http://localhost:8765/search?q=user+response+preferences&semantic=true"
# Finds the Italian memory

FTS5 Fallback

When the semantic extra is not installed, search falls back to SQLite FTS5 full-text search. FTS5 is keyword-based and works well for exact matches but does not understand synonyms or cross-language queries.
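The keyword-only behavior is easy to demonstrate with a minimal FTS5 table; the table and column names here are illustrative, not Kore's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_index USING fts5(content)")
conn.execute("INSERT INTO fts_index VALUES ('User prefers concise responses')")
conn.commit()

# An exact keyword match succeeds...
hits = conn.execute(
    "SELECT content FROM fts_index WHERE fts_index MATCH 'concise'"
).fetchall()
print(len(hits))  # 1

# ...but a synonym finds nothing: FTS5 has no semantic understanding.
misses = conn.execute(
    "SELECT content FROM fts_index WHERE fts_index MATCH 'brief'"
).fetchall()
print(len(misses))  # 0
```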

Memory Compression

The compression engine identifies and merges redundant memories to prevent knowledge bloat.

Algorithm

  1. Compute pairwise cosine similarity between all memory embeddings
  2. Identify pairs where similarity exceeds the threshold (default: 0.88)
  3. For each pair:
    • Keep the memory with higher importance
    • Append unique information from the lower-importance memory
    • Archive or delete the redundant memory
  4. Re-embed the merged memory
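The keep/archive logic of steps 1-3 can be sketched as follows. The in-memory record shape, the tie-breaking rule (first memory wins on equal importance), and the omission of the append/re-embed steps are simplifications for illustration:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def compress(memories, threshold=0.88):
    """Archive the lower-importance member of each near-duplicate pair."""
    for i in range(len(memories)):
        for j in range(i + 1, len(memories)):
            a, b = memories[i], memories[j]
            if a["archived"] or b["archived"]:
                continue
            if cosine(a["embedding"], b["embedding"]) >= threshold:
                # keep the higher-importance memory; ties keep the first
                loser = a if a["importance"] < b["importance"] else b
                loser["archived"] = True
    return [m for m in memories if not m["archived"]]

mems = [
    {"embedding": [1.0, 0.0], "importance": 3, "archived": False},   # Memory A
    {"embedding": [0.99, 0.1], "importance": 2, "archived": False},  # near-duplicate B
]
print(len(compress(mems)))  # 1  (B is archived, A survives)
```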

Example

Before compression:

  • Memory A (importance 3): "React 19 supports server components natively"
  • Memory B (importance 2): "React version 19 has built-in server component support"

Cosine similarity: 0.94 (above 0.88 threshold)

After compression:

  • Memory A (importance 3): "React 19 supports server components natively" (kept)
  • Memory B: archived

Tuning Compression

Adjust KORE_SIMILARITY_THRESHOLD to control aggressiveness:

# Conservative: only merge near-duplicates
KORE_SIMILARITY_THRESHOLD=0.95 kore

# Aggressive: merge loosely similar memories
KORE_SIMILARITY_THRESHOLD=0.80 kore
Warning: Setting the threshold too low (below 0.80) may merge memories that contain distinct information. The default of 0.88 is a safe balance between deduplication and information preservation.

Data Storage

SQLite with WAL

Kore uses SQLite in Write-Ahead Logging (WAL) mode, which allows:

  • Concurrent reads during writes
  • Crash recovery without data loss
  • Single-file database with no external dependencies
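Enabling WAL mode is a single pragma on the connection; the database path here is illustrative:

```python
import os
import sqlite3
import tempfile

# WAL mode requires a file-backed database (an in-memory DB reports "memory").
path = os.path.join(tempfile.mkdtemp(), "kore.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal
```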

Schema Overview

The database contains these core tables:

| Table      | Purpose                                                                       |
|------------|-------------------------------------------------------------------------------|
| memories   | Core memory storage (content, category, importance, decay_score, timestamps)  |
| embeddings | Vector embeddings for semantic search                                         |
| tags       | Tag-to-memory mappings                                                        |
| relations  | Memory-to-memory relations (bidirectional)                                    |
| archive    | Soft-deleted memories                                                         |
| fts_index  | FTS5 full-text search index                                                   |

Thread Safety

SQLite connections are managed via a thread-safe connection pool. Each request gets its own connection, preventing data races in concurrent scenarios.
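A per-thread connection strategy can be sketched with `threading.local`; Kore's actual pool implementation may differ, and a real deployment would point at a file-backed database:

```python
import sqlite3
import threading

class ConnectionPool:
    """Each thread lazily opens, then reuses, its own SQLite connection."""

    def __init__(self, path: str):
        self._path = path
        self._local = threading.local()

    def get(self) -> sqlite3.Connection:
        if not hasattr(self._local, "conn"):
            self._local.conn = sqlite3.connect(self._path)
        return self._local.conn
```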

Memory Lifecycle

A memory goes through the following lifecycle:

Created (decay=1.0)
│
├── Searched/accessed → reinforced (decay += 0.05, half-life *= 1.15)
│
├── Decay run → decay recalculated
│     │
│     ├── decay > threshold → memory survives
│     │
│     └── decay < threshold → memory removed
│
├── Compression → merged with similar memory
│
├── Archive → soft-deleted (restorable)
│
├── TTL expired → removed by cleanup
│
└── Delete → permanently removed

Request Flow

A typical save-then-search flow:

Save:

1. Client sends POST /save
2. Server validates input (Pydantic v2)
3. Auto-importance scorer assigns importance
4. Sentence-transformer generates embedding
5. SQLite stores memory + embedding in a transaction
6. FTS5 index is updated
7. Response returned with memory ID

Search:

1. Client sends GET /search?q=...&semantic=true
2. Query is embedded using the same model
3. Cosine similarity computed against all embeddings
4. Results filtered by agent namespace
5. Effective scores calculated (similarity * decay * importance)
6. Results sorted and paginated
7. Decay scores reinforced for accessed memories
8. Response returned

Performance Characteristics

| Operation                        | Latency (typical) |
|----------------------------------|-------------------|
| Save (with embedding)            | 10--50 ms         |
| Save (FTS5 only)                 | 1--5 ms           |
| Search (semantic, 1000 memories) | 20--100 ms        |
| Search (FTS5)                    | 1--10 ms          |
| Decay run (1000 memories)        | 50--200 ms        |
| Compression (1000 memories)      | 200--1000 ms      |
| Batch save (100 memories)        | 100--500 ms       |

All benchmarks on a modern laptop CPU (no GPU). Save, search, and decay costs grow roughly linearly with memory count; compression compares memories pairwise, so its cost grows faster as the store gets large.