Architecture
This page explains the internal design of Kore Memory: how the Ebbinghaus decay engine works, how importance is scored without an LLM, how semantic search operates locally, and how memory compression merges redundant knowledge.
System Overview
+-----------------+
| REST API |
| (FastAPI) |
+--------+--------+
|
+--------------+--------------+
| | |
+--------v---+ +------v------+ +----v--------+
| Decay | | Embedding | | Importance |
| Engine | | Engine | | Scorer |
+--------+---+ +------+------+ +----+--------+
| | |
+--------------+--------------+
|
+--------v--------+
| SQLite |
| (FTS5 + WAL) |
+-----------------+
All components run in a single process. There are no external services, message queues, or background workers. The database is SQLite in WAL mode for concurrent reads during writes.
Ebbinghaus Decay Engine
The decay engine is the core differentiator of Kore Memory. It models human forgetting using Hermann Ebbinghaus's forgetting curve (1885), adapted for computational use.
The Formula
decay = e^(-t * ln(2) / half_life)
Where:
- t = time elapsed since the memory was last accessed (in days)
- half_life = the half-life in days, determined by importance level
- e = Euler's number (~2.71828)
- ln(2) = the natural logarithm of 2 (~0.693)
This means that after exactly one half-life, the decay score drops to 0.5 (50% retention). After two half-lives, it drops to 0.25, and so on.
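The formula can be sketched in a few lines of Python (a minimal illustration of the curve, not Kore's internal code):

```python
import math

def decay(t_days: float, half_life_days: float) -> float:
    """Ebbinghaus decay: retention after t_days for a given half-life."""
    return math.exp(-t_days * math.log(2) / half_life_days)

decay(0, 30)    # 1.0  (fresh memory)
decay(30, 30)   # 0.5  (one half-life)
decay(60, 30)   # 0.25 (two half-lives)
```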
Memory Half-Lives
Each importance level maps to a different half-life:
| Importance | Label | Half-Life | 50% Decay At | 10% Decay At |
|---|---|---|---|---|
| 1 | Low | 7 days | 1 week | ~23 days |
| 2 | Normal | 14 days | 2 weeks | ~46 days |
| 3 | Important | 30 days | 1 month | ~100 days |
| 4 | High | 90 days | 3 months | ~299 days |
| 5 | Critical | 365 days | 1 year | ~3.3 years |
Decay Score Over Time (Importance 3)
Day 0: decay = 1.000 ████████████████████ 100%
Day 7: decay = 0.851 █████████████████ 85%
Day 14: decay = 0.724 ██████████████ 72%
Day 30: decay = 0.500 ██████████ 50%
Day 60: decay = 0.250 █████ 25%
Day 90: decay = 0.125 ██ 13%
Day 120: decay = 0.063 █ 6%
Spaced Repetition Effect
Every time a memory is retrieved (via search, timeline, or direct access), its decay score is reinforced:
decay_score += 0.05
Additionally, each retrieval extends the memory's effective half-life by +15%. This mirrors the spaced repetition effect in human learning: memories that are regularly accessed become more durable over time.
For example, an importance-3 memory (30-day half-life) that is accessed 5 times would have an effective half-life of:
30 * (1.15)^5 = 30 * 2.011 = ~60 days
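The two reinforcement rules can be expressed as small helpers (function names are illustrative; the cap at 1.0 is an assumption, the actual clamp may differ):

```python
def effective_half_life(base_days: float, retrievals: int) -> float:
    # Each retrieval extends the half-life by 15% (multiplicative)
    return base_days * 1.15 ** retrievals

def reinforce(decay_score: float) -> float:
    # Each retrieval bumps the decay score by 0.05
    # (capped at 1.0 here for illustration)
    return min(1.0, decay_score + 0.05)

effective_half_life(30, 5)  # ~60.3 days, matching the example above
```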
When Decay Runs
The decay engine runs when you call POST /decay/run or the memory_decay_run MCP tool. It:
- Iterates over all active memories
- Recalculates each memory's decay_score using the formula above
- Removes memories where decay_score falls below a minimum threshold (effectively forgotten)
- Updates the database in a single transaction
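A decay run can be sketched as a single pass over the memories (field names and the threshold value are illustrative, not Kore's schema):

```python
import math

FORGET_THRESHOLD = 0.05  # assumed cutoff; the actual value may differ

def run_decay(memories: list[dict], now_days: float) -> list[dict]:
    """Recalculate decay for each memory and drop the forgotten ones."""
    survivors = []
    for m in memories:
        t = now_days - m["last_accessed_day"]
        m["decay_score"] = math.exp(-t * math.log(2) / m["half_life_days"])
        if m["decay_score"] >= FORGET_THRESHOLD:
            survivors.append(m)
    return survivors
```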
Schedule decay runs periodically (e.g., daily via cron) for best results. The web dashboard also provides a one-click button.
Auto-Importance Scoring
When importance is set to 1 (or omitted), Kore automatically scores the memory on a 1--5 scale using local heuristics. No LLM is required.
Scoring Signals
The scorer analyzes the content text and assigns importance based on multiple signals:
| Signal | Effect |
|---|---|
| Content length | Longer, more detailed content scores higher |
| Specificity keywords | Terms like "always", "never", "critical", "important" boost score |
| Category weight | decision and preference categories get a boost |
| Numeric data | Presence of numbers, dates, or measurements increases score |
| Named entities | Capitalized words (names, projects) increase specificity |
| Temporal markers | Words like "deadline", "by Friday", "Q3" boost score |
| Negation patterns | "Do not", "never", "avoid" indicate important constraints |
Score Distribution
In practice, auto-scored memories follow a natural distribution:
| Score | Frequency | Example |
|---|---|---|
| 1 (Low) | ~5% | Trivial observations, small talk |
| 2 (Normal) | ~35% | General facts, routine information |
| 3 (Important) | ~40% | Project details, preferences, decisions |
| 4 (High) | ~15% | Critical constraints, key relationships |
| 5 (Critical) | ~5% | Security rules, architecture decisions |
You can always override auto-scoring by setting importance to 2--5 explicitly.
Semantic Search
Semantic search uses local sentence-transformer models to find conceptually similar memories, even when the exact words differ.
How It Works
1. At save time: The memory content is embedded into a 384-dimensional vector using the configured sentence-transformer model (default: paraphrase-multilingual-MiniLM-L12-v2)
2. At search time: The query is embedded using the same model, then compared against all stored embeddings using cosine similarity
3. Ranking: Results are ranked by effective score:
effective_score = cosine_similarity * decay_score * (importance / 5)
This formula ensures three properties:
- Relevance -- Semantically similar memories rank higher
- Recency -- Fresh memories rank higher than stale ones
- Importance -- Critical memories rank higher than trivial ones
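The ranking step can be sketched in pure Python (the tuple layout for memories is illustrative, not Kore's data model):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, memories, k=5):
    """memories: (embedding, decay_score, importance, memory_id) tuples."""
    scored = [
        (cosine(query_vec, vec) * decay * (imp / 5), mem_id)
        for vec, decay, imp, mem_id in memories
    ]
    scored.sort(reverse=True)
    return [mem_id for _, mem_id in scored[:k]]
```

A stale memory with the same similarity and importance ranks below a fresh one, because its decay_score multiplies the effective score down.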
Multilingual Support
The default model (paraphrase-multilingual-MiniLM-L12-v2) supports 50+ languages. A memory saved in Italian can be found with an English query, and vice versa:
# Save in Italian
curl -X POST http://localhost:8765/save \
  -H "Content-Type: application/json" \
  -d '{"content": "L'\''utente preferisce risposte concise"}'
# Search in English
curl "http://localhost:8765/search?q=user+response+preferences&semantic=true"
# Finds the Italian memory
FTS5 Fallback
When the semantic extra is not installed, search falls back to SQLite FTS5 full-text search. FTS5 is keyword-based and works well for exact matches but does not understand synonyms or cross-language queries.
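The keyword-only nature of the fallback is easy to demonstrate with Python's built-in sqlite3 (a stand-alone example table, not Kore's schema; assumes your SQLite build includes FTS5, as standard CPython builds do):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE notes USING fts5(content)")
con.execute("INSERT INTO notes VALUES ('user prefers concise responses')")

# An exact keyword matches...
hit = con.execute("SELECT content FROM notes WHERE notes MATCH 'concise'").fetchall()
# ...but a synonym finds nothing, unlike semantic search
miss = con.execute("SELECT content FROM notes WHERE notes MATCH 'brief'").fetchall()
```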
Memory Compression
The compression engine identifies and merges redundant memories to prevent knowledge bloat.
Algorithm
1. Compute pairwise cosine similarity between all memory embeddings
2. Identify pairs where similarity exceeds the threshold (default: 0.88)
3. For each pair:
   - Keep the memory with higher importance
   - Append unique information from the lower-importance memory
   - Archive or delete the redundant memory
   - Re-embed the merged memory
Example
Before compression:
- Memory A (importance 3): "React 19 supports server components natively"
- Memory B (importance 2): "React version 19 has built-in server component support"
Cosine similarity: 0.94 (above 0.88 threshold)
After compression:
- Memory A (importance 3): "React 19 supports server components natively" (kept)
- Memory B: archived
Tuning Compression
Adjust KORE_SIMILARITY_THRESHOLD to control aggressiveness:
# Conservative: only merge near-duplicates
KORE_SIMILARITY_THRESHOLD=0.95 kore
# Aggressive: merge loosely similar memories
KORE_SIMILARITY_THRESHOLD=0.80 kore
Setting the threshold too low (below 0.80) may merge memories that contain distinct information. The default of 0.88 is a safe balance between deduplication and information preservation.
Data Storage
SQLite with WAL
Kore uses SQLite in Write-Ahead Logging (WAL) mode, which allows:
- Concurrent reads during writes
- Crash recovery without data loss
- Single-file database with no external dependencies
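Enabling WAL is a single pragma, shown here with Python's sqlite3 (the path is illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "kore.db")
con = sqlite3.connect(path)
# The journal mode persists in the database file once set
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
```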
Schema Overview
The database contains these core tables:
| Table | Purpose |
|---|---|
| memories | Core memory storage (content, category, importance, decay_score, timestamps) |
| embeddings | Vector embeddings for semantic search |
| tags | Tag-to-memory mappings |
| relations | Memory-to-memory relations (bidirectional) |
| archive | Soft-deleted memories |
| fts_index | FTS5 full-text search index |
Thread Safety
SQLite connections are managed via a thread-safe connection pool. Each request gets its own connection, preventing data races in concurrent scenarios.
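One common way to give each thread its own connection is threading.local, sketched here as an illustrative pattern (not necessarily Kore's implementation):

```python
import sqlite3
import threading

_local = threading.local()

def get_conn(path: str) -> sqlite3.Connection:
    """Return this thread's connection, creating it on first use."""
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(path)
    return _local.conn
```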
Memory Lifecycle
A memory goes through the following lifecycle:
Created (decay=1.0)
│
├── Searched/accessed → reinforced (decay += 0.05, half-life *= 1.15)
│
├── Decay run → decay recalculated
│ │
│ ├── decay > threshold → memory survives
│ │
│ └── decay < threshold → memory removed
│
├── Compression → merged with similar memory
│
├── Archive → soft-deleted (restorable)
│
├── TTL expired → removed by cleanup
│
└── Delete → permanently removed
Request Flow
A typical save-then-search flow:
1. Client sends POST /save
2. Server validates input (Pydantic v2)
3. Auto-importance scorer assigns importance
4. Sentence-transformer generates embedding
5. SQLite stores memory + embedding in a transaction
6. FTS5 index is updated
7. Response returned with memory ID
8. Client sends GET /search?q=...&semantic=true
9. Query is embedded using the same model
10. Cosine similarity computed against all embeddings
11. Results filtered by agent namespace
12. Effective scores calculated (similarity * decay * importance)
13. Results sorted and paginated
14. Decay scores reinforced for accessed memories
15. Response returned
Performance Characteristics
| Operation | Latency (typical) |
|---|---|
| Save (with embedding) | 10--50 ms |
| Save (FTS5 only) | 1--5 ms |
| Search (semantic, 1000 memories) | 20--100 ms |
| Search (FTS5) | 1--10 ms |
| Decay run (1000 memories) | 50--200 ms |
| Compression (1000 memories) | 200--1000 ms |
| Batch save (100 memories) | 100--500 ms |
All benchmarks on a modern laptop CPU (no GPU). Save, search, and decay runs scale roughly linearly with memory count; compression scales quadratically in the worst case, since it compares all memory pairs.