Palace Index Curator
Overview
The web-research hooks auto-capture every WebFetch and WebSearch into
hooks/memory-palace-index.yaml, storing each as a markdown file and an
index entry. Captures land at the defaults routing_type: pending,
maturity: seedling, importance_score: 50, and nothing advances them.
Left alone, the index becomes a write-only graveyard: the majority of
entries are never incorporated, analyzed, or surfaced.
This skill drains that backlog and keeps it drained. It wires the
capture index to the corpus tooling the plugin already ships
(decay_model, keyword_index, marginal_value) through three
commands: a read-only report, a dry-run-first promotion engine, and a
SessionStart surfacing hook.
When to Use
- The capture backlog has grown and most entries are still pending.
- You want a corpus health report (inert ratio, orphans, topic clusters).
- You want stored research surfaced automatically during sessions.
When NOT to Use
- Ingesting a single new resource: use knowledge-intake.
- Searching stored knowledge ad hoc: use knowledge-locator.
- Tending a digital garden file: use digital-garden-cultivator.
Workflow
1. Analyze (read-only)
uv run python scripts/memory_palace_cli.py index report
Reports total entries, the inert ratio, orphaned captures (entries whose backing file is gone), the largest topic clusters by domain, and the top promotion candidates. Writes nothing.
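The headline numbers in the report can be sketched as follows. This is a minimal illustration, not the plugin's implementation: it assumes each index entry is a dict, that "inert" means the entry is still routed as pending, and that a hypothetical `file` key holds the path of the backing markdown file. The function name `summarize_index` is also invented for this sketch.

```python
from collections import Counter
from pathlib import Path


def summarize_index(entries, root="."):
    """Compute headline report numbers without writing anything."""
    total = len(entries)
    # Assumption: an entry counts as inert while it is still pending.
    inert = sum(1 for e in entries if e.get("routing_type") == "pending")
    # Assumption: "file" is the backing file path relative to root.
    # An entry with no path, or a path that is not a file, is an orphan.
    orphans = [e for e in entries
               if not (Path(root) / e.get("file", "")).is_file()]
    # Topic clusters are approximated here by counting entries per domain.
    clusters = Counter(e.get("domain", "unknown") for e in entries)
    return {
        "total": total,
        "inert_ratio": inert / total if total else 0.0,
        "orphans": len(orphans),
        "top_clusters": clusters.most_common(3),
    }
```

The function only reads its inputs, which mirrors the read-only guarantee of the report command.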
2. Incorporate (dry-run, then apply)
# Dry run: prints promote/archive proposals, writes nothing.
uv run python scripts/memory_palace_cli.py index promote
# Apply: backs up the index under data/backups/, then persists.
uv run python scripts/memory_palace_cli.py index promote --apply
Each pending entry is classified into one action:
- promote: recent, authoritative, or clustered. Gets a real importance score, a routing type, and maturity seedling -> growing.
- archive: orphaned or older than the archive horizon and never revisited. Marked archived rather than promoted, following the principle that unused captures should drain, not accumulate.
- hold: everything else stays pending with no change.
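The three-way classification can be sketched as a pure function. Everything concrete here is an assumption made for illustration: the field names (`captured_at`, `file_exists`, `revisits`), the threshold values, the authoritative-domain set, and the choice to test the archive conditions before the promote conditions. The actual rules live in the plugin.

```python
from datetime import date

# All thresholds and field names below are illustrative assumptions.
RECENT_DAYS = 14
ARCHIVE_HORIZON_DAYS = 90
MIN_CLUSTER_SIZE = 3
AUTHORITATIVE_DOMAINS = {"docs.python.org"}


def classify(entry, cluster_sizes, today):
    """Return 'promote', 'archive', or 'hold' for one index entry."""
    if entry.get("routing_type") != "pending":
        return "hold"  # already processed, so a re-run proposes nothing
    age_days = (today - entry["captured_at"]).days
    # Archive: backing file gone, or past the horizon and never revisited.
    if not entry.get("file_exists", True):
        return "archive"
    if age_days > ARCHIVE_HORIZON_DAYS and entry.get("revisits", 0) == 0:
        return "archive"
    # Promote: recent, authoritative, or part of a large enough cluster.
    domain = entry.get("domain", "")
    if (age_days <= RECENT_DAYS
            or domain in AUTHORITATIVE_DOMAINS
            or cluster_sizes.get(domain, 0) >= MIN_CLUSTER_SIZE):
        return "promote"
    return "hold"
```

Because the function depends only on its arguments, the same index and the same date always yield the same proposals.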
Applying is idempotent: promoted and archived entries are no longer
pending, so a second run proposes nothing new. The dry-run diff is
always shown before --apply writes.
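The backup-then-persist behavior of the apply step might look like the sketch below. The helper name `backup_then_write`, the backup filename pattern, and the temp-file-and-rename write are assumptions for illustration; the source only states that a backup is written under data/backups/ before the index is persisted.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_then_write(index_path, new_text, backup_dir="data/backups"):
    """Copy the current index to a timestamped backup, then persist."""
    index_path = Path(index_path)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <stem>-<UTC timestamp><suffix>.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    backup_path = backup_dir / f"{index_path.stem}-{stamp}{index_path.suffix}"
    shutil.copy2(index_path, backup_path)  # back up before touching the index
    # Write to a sibling temp file and rename it over the index, so an
    # interrupted write cannot leave a half-written index behind.
    tmp_path = index_path.with_name(index_path.name + ".tmp")
    tmp_path.write_text(new_text, encoding="utf-8")
    tmp_path.replace(index_path)
    return backup_path
```

Taking the backup first means a bad promotion run can be undone by copying the backup over the index.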
3. Surface (learn)
A SessionStart hook (hooks/index_surfacer.py) names the highest-value
promoted captures at the start of a session. It is disabled by default.
Enable it in memory-palace-config.yaml:
feature_flags:
  context_injection: true
The hook only speaks when promoted entries clear the importance floor, and it exits silently on any error so it can never block a session.
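The "speak only above the floor, never block" behavior can be sketched as follows. This is not the contents of hooks/index_surfacer.py. The floor value, the `title` field, the message wording, and the use of maturity == "growing" to identify promoted entries are all assumptions for illustration.

```python
def surface(entries, floor=70, limit=3):
    """Return a line naming the top promoted entries, or '' if none qualify."""
    # Assumption: promoted entries are those whose maturity is "growing".
    picked = [e for e in entries
              if e.get("maturity") == "growing"
              and e.get("importance_score", 0) >= floor]
    picked.sort(key=lambda e: e.get("importance_score", 0), reverse=True)
    if not picked:
        return ""
    titles = ", ".join(e.get("title", "untitled") for e in picked[:limit])
    return f"Relevant stored research: {titles}"


def main(load_entries):
    """Hook entry point: never raises and always reports success."""
    try:
        message = surface(load_entries())
        if message:
            print(message)
    except Exception:
        pass  # stay silent on any error so the session is never blocked
    return 0
```

Passing the loader in as `load_entries` keeps the sketch testable: a loader that raises on a missing or corrupt index still yields exit status 0 and no output.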
Design Notes
- Promotion uses only structural signals (recency, domain authority, cluster size). The decision logic is deterministic; no model call gates a transition.
- The decay half-lives (14/30/90 days) are tunable priors, not retention constants. Wixted & Ebbesen (1997) and Murre & Dros (2015) show forgetting follows a power law; FSRS (Ye, Su & Cao, 2022) validates exponential decay only with a learned per-item half-life. Calibrate against reopen logs if usage data accrues.
- Retrieval stays keyword-first (cache_lookup/keyword_index); embeddings are not required at the current corpus scale. BM25 is the workhorse up to ~5000 documents; embeddings add value only for vocabulary-mismatch discovery.
- Near-duplicate detection layers SHA-256 exact match (present via content_hash) then MinHash with k-shingling for near-duplicates (Broder, 1997). SimHash is preferable only at tens of thousands of documents.
- Importance formula: relevance = w1 * centrality + w2 * decay(t) + w3 * usage. The plugin ships all three terms (graph_analyzer PageRank, decay_model, usage_tracker).
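The importance formula can be worked through with a small sketch. The weights, the default half-life, and the assumption that centrality and usage are already normalized to [0, 1] are illustrative choices, not values taken from the plugin. The decay term uses a plain exponential with a half-life, matching the half-life framing in the notes above rather than a fitted power law.

```python
def exponential_decay(age_days, half_life_days):
    """Fraction of value remaining after age_days, halving every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def relevance(centrality, age_days, usage, half_life_days=30.0,
              weights=(0.4, 0.4, 0.2)):
    """relevance = w1 * centrality + w2 * decay(t) + w3 * usage.

    Assumes centrality and usage are pre-normalized to [0, 1];
    the weights are hypothetical placeholders.
    """
    w1, w2, w3 = weights
    decay = exponential_decay(age_days, half_life_days)
    return w1 * centrality + w2 * decay + w3 * usage
```

With these placeholder weights, an entry with centrality 0.5, no recorded usage, and an age equal to one half-life scores 0.4 * 0.5 + 0.4 * 0.5 + 0 = 0.4.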
Exit Criteria
- index report runs and prints the inert ratio and orphan count for the live index.
- index promote (no flag) prints proposals and writes nothing (the index file is byte-identical afterward).
- index promote --apply creates a timestamped backup under data/backups/ before persisting, and a re-run proposes nothing.
- With context_injection: true, a SessionStart event surfaces the top promoted captures; with the flag off, it stays silent.
- Failure modes (missing index, corrupt YAML, missing backing files) are handled without raising: report degrades, promote holds, hook exits silently.