Mine
Deep extraction. Turn raw sources into structured context.
Not a quick search (that's find). Not session timeline (that's history). Mine is the heavy operation — reading source material systematically, extracting everything of value, and routing it into the world.
When It Fires
- The human says "mine this", "extract from these", "what's buried in here"
- alive:session-history escalates: "these sessions look rich, want to deep-mine them?"
- The human points at a folder of transcripts, documents, or exports
- A bundle has raw sources that haven't been fully processed
- The human asks "what have I missed?" or "find everything about X in my sources"
What It Does
0. Pre-check: Source Location
Before mining, check whether the source file lives inside a bundle's raw/ folder:
- In a bundle: proceed normally. Tracking (step 4) writes to the discovered: field of that bundle's context.manifest.yaml.
- Loose file (inbox, walnut root, or anywhere without a manifest): enter loose-mine mode. Mining still works, but tracking state writes to $WORLD_ROOT/.alive/_mining-log.json instead of a bundle manifest. This prevents duplicate work if the same file is mined again before being captured into a bundle.
// .alive/_mining-log.json — created on first loose mine
{
  "mined": [
    {
      "path": "03_Inbox/Will x Ben Standup.md",
      "mined_at": "2026-04-16T10:30:00Z",
      "items_extracted": 14,
      "routed_to": "alive-os"
    }
  ]
}
If the file appears in _mining-log.json already, warn: "This file was mined on {date} ({N} items extracted). Mine again or skip?"
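The duplicate check could look roughly like this. It is a minimal Python sketch, assuming the log is plain JSON in the shape shown above (without the leading comment line). The function name `check_already_mined` is hypothetical and not part of ALIVE.

```python
import json
from pathlib import Path
from typing import Optional


def check_already_mined(log_path: Path, source_path: str) -> Optional[str]:
    """Return the warning text if source_path is already logged, else None."""
    if not log_path.exists():
        # No log yet: nothing has been loose-mined in this world.
        return None
    log = json.loads(log_path.read_text(encoding="utf-8"))
    for entry in log.get("mined", []):
        if entry["path"] == source_path:
            mined_on = entry["mined_at"][:10]  # keep only YYYY-MM-DD
            count = entry["items_extracted"]
            return (
                f"This file was mined on {mined_on} "
                f"({count} items extracted). Mine again or skip?"
            )
    return None
```

A `None` result means the file can be mined without prompting; any string is shown to the human before work starts.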
1. Assess Source Material
Scan what's available. Don't read everything — just understand the landscape.
╭─ squirrel scanning sources
│
│ client-calls/raw/ — 12 transcripts (3 unprocessed)
│ research/raw/ — 8 documents (5 unprocessed)
│ .alive/_squirrels/ — 23 sessions (8 with rich stash, 4 unmined transcripts)
│
│ Total: 43 sources, 12 unprocessed, 4 unmined sessions
╰─
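One way to keep the scan cheap is to list file names only and compare them with the paths the manifest already records as processed. The sketch below assumes sources sit directly in a bundle's raw/ folder and that processed paths are stored as raw/<file name>, as in the manifest example in step 4. `scan_bundle` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Iterable


def scan_bundle(raw_dir: Path, processed: Iterable[str]) -> str:
    """Summarise a raw/ folder from file names alone, without reading contents."""
    done = set(processed)
    files = sorted(p.name for p in raw_dir.iterdir() if p.is_file())
    # A source counts as unprocessed when the manifest has no entry for it.
    pending = [name for name in files if f"raw/{name}" not in done]
    bundle = raw_dir.parent.name
    return f"{bundle}/raw/ - {len(files)} sources ({len(pending)} unprocessed)"
```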
2. Build Extraction Plan
Before touching any content, propose a plan. What to look for, in what order, expected yield.
╭─ squirrel extraction plan
│ Source: client-calls/raw/ (12 transcripts)
│
│ Targets:
│ - People mentioned (create/update person walnuts)
│ - Decisions made (route to walnut logs)
│ - Tasks assigned (route to walnut tasks)
│ - Domain knowledge (route to insights)
│ - Recurring themes (flag as potential bundles)
│
│ > Run this plan?
│ 1. Yeah, mine it
│ 2. Adjust targets
│ 3. Mine specific files only
╰─
The plan adapts to what the human needs:
- "Mine the last 3 squirrels" -> session-focused extraction
- "Mine everything about glass-cathedral from February" -> topic-focused, date-filtered
- "Who keeps coming up that I haven't tracked?" -> people discovery mode
- "What patterns do you see?" -> theme extraction
3. Extract Systematically
Process sources one at a time or in batches. For each source:
- Read the full content (transcript, document, export)
- Extract against the plan targets
- Route extracted items:
  - People -> stash for person walnut creation/update
  - Decisions -> stash for log routing at save
  - Tasks -> stash for task routing at save
  - Knowledge -> stash as insight candidates
  - Themes -> flag as potential new bundles
- Update context.manifest.yaml -> mark the source as processed in the bundle's discovered: field
Extraction is bounded by source type:
| Source Type | Extract |
|---|---|
| Transcript | Decisions, action items, people + roles, key quotes, domain knowledge, commitments, deadlines |
| Document | Key claims, data points, relevant sections, author context, links to other work |
| Session transcript | Decisions + rationale, files touched, architectural choices, dead ends, open threads |
| Export (ChatGPT, etc.) | Topics discussed, decisions made, knowledge synthesized, people referenced |
| Email thread | Commitments, deadlines, people + relationships, action items, context updates |
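The table can be read as a lookup that limits what a plan may ask of each source. The sketch below is one interpretation: requested targets that the table does not list for a source type are dropped. The dictionary keys and the `targets_for` helper are hypothetical; only the target names come from the table.

```python
from typing import Iterable, List, Optional

# Allowed extraction targets per source type, copied from the table above.
EXTRACTION_TARGETS = {
    "transcript": ["decisions", "action items", "people + roles", "key quotes",
                   "domain knowledge", "commitments", "deadlines"],
    "document": ["key claims", "data points", "relevant sections",
                 "author context", "links to other work"],
    "session transcript": ["decisions + rationale", "files touched",
                           "architectural choices", "dead ends", "open threads"],
    "export": ["topics discussed", "decisions made", "knowledge synthesized",
               "people referenced"],
    "email thread": ["commitments", "deadlines", "people + relationships",
                     "action items", "context updates"],
}


def targets_for(source_type: str,
                requested: Optional[Iterable[str]] = None) -> List[str]:
    """Return the targets allowed for a source type, narrowed to the plan if given."""
    allowed = EXTRACTION_TARGETS[source_type.lower()]
    if requested is None:
        return list(allowed)
    wanted = {r.lower() for r in requested}
    # Keep table order; anything outside the table row is out of bounds.
    return [t for t in allowed if t in wanted]
```

For example, `targets_for("email thread", ["Deadlines", "Key quotes"])` keeps only `"deadlines"`, because key quotes are listed for transcripts but not for email threads.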
4. Track Progress
The bundle's context.manifest.yaml tracks extraction state:
discovered:
  status: partial # none | partial | complete
  last_mined: 2026-03-10
  processed:
    - path: raw/2026-02-15-call-with-jax.md
      extracted: [3 people, 2 decisions, 1 insight]
    - path: raw/2026-02-20-shielding-review.md
      extracted: [1 person, 4 decisions, 2 tasks]
  unprocessed:
    - raw/2026-03-01-vendor-followup.md
    - raw/2026-03-05-budget-review.md
This means the squirrel can resume mining across sessions — it knows exactly what's been done and what remains.
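Resuming depends on one small state transition: move a path from unprocessed to processed and recompute status. Here is a minimal sketch over the discovered: mapping after it has been loaded into a Python dict (loading and writing the YAML is left out). `mark_processed` is a hypothetical name, and the rule that an empty unprocessed list means complete is an assumption.

```python
from datetime import date
from typing import Any, Dict, List


def mark_processed(discovered: Dict[str, Any], path: str,
                   extracted: List[str], today: date) -> Dict[str, Any]:
    """Return a new discovered: mapping with `path` recorded as processed."""
    unprocessed = [p for p in discovered.get("unprocessed", []) if p != path]
    # Drop any earlier entry for the same path so a re-mine replaces it.
    processed = [e for e in discovered.get("processed", []) if e["path"] != path]
    processed.append({"path": path, "extracted": extracted})
    return {
        # Assumption: nothing left in unprocessed means the bundle is complete.
        "status": "partial" if unprocessed else "complete",
        "last_mined": today.isoformat(),
        "processed": processed,
        "unprocessed": unprocessed,
    }
```

Returning a new mapping instead of mutating the input keeps the old state available if the write to disk fails.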
5. Discover Targets
The most valuable part. While extracting, actively watch for:
- New people — names that appear across multiple sources but have no person walnut. Surface them with context about who they are and how they relate.
- Recurring subjects — topics that keep coming up across sources. These might deserve their own bundle or even walnut.
- Interests and tendencies — patterns in what the human focuses on, returns to, or avoids. Not for judgment — for awareness.
- Cross-walnut connections — references in one walnut's sources to another walnut's domain. These are invisible links the system should know about.
- Contradictions: decisions in one source that conflict with decisions in another. Surface these gently. Detection procedure: for each extracted decision, grep the target walnut's _kernel/insights.md and the last 10 entries of _kernel/log.md for the topic keyword. If a match states the opposite position, flag it as a contradiction with: old-statement source, new-statement source, suggested resolution (supersede / merge / user-decide).
╭─ squirrel discoveries
│
│ People without walnuts:
│ - Dr. Elara Voss (mentioned 7 times across 3 transcripts)
│ - Marcus Chen (mentioned in 2 documents, seems to be a vendor contact)
│
│ Recurring themes:
│ - "Regulatory timeline" comes up in 5 of 12 transcripts
│ -> worth a bundle?
│
│ Cross-walnut:
│ - nova-station sources reference glass-cathedral pricing 3 times
│ -> add link?
│
│ > Act on these / note and move on
╰─
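The keyword-matching half of the contradiction procedure can be sketched as below. Deciding whether a match actually states the opposite position is a judgment call and is left to the squirrel. The sketch assumes the log has already been split into entries ordered oldest first; the real entry format of _kernel/log.md is not specified here. `Candidate` and `contradiction_candidates` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    source: str  # file the earlier statement came from
    text: str    # the matching line or log entry


def contradiction_candidates(keyword: str, insights_text: str,
                             log_entries: List[str]) -> List[Candidate]:
    """Collect earlier statements that mention the decision's topic keyword."""
    needle = keyword.lower()
    hits = [
        Candidate("_kernel/insights.md", line.strip())
        for line in insights_text.splitlines()
        if needle in line.lower()
    ]
    # Only the last 10 log entries are checked (assumes oldest-first order).
    for entry in log_entries[-10:]:
        if needle in entry.lower():
            hits.append(Candidate("_kernel/log.md", entry.strip()))
    return hits
```

Each returned candidate supplies the old-statement source; the new-statement source is the file being mined.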
5b. Entity Name Resolution
Transcripts often contain speech-to-text artefacts that mangle proper nouns. Before finalising routes, resolve every extracted entity name against the world index:
- Load walnut names from .alive/_index.yaml
- For every extracted proper noun (person names, walnut references, venture names), check for an exact match
- If no exact match but a name within edit distance ≤ 2 exists, flag: "No walnut 'nova-stations' — did you mean 'nova-station'?"
- If no match at all, route as "unresolved reference" and surface in discoveries for the user to decide (create walnut / ignore / rename)
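The three outcomes above (exact match, near match, unresolved) can be sketched with a plain Levenshtein distance. This is an illustration only: how names are read out of .alive/_index.yaml is not shown, so the known names are passed in as a list, and `edit_distance` and `resolve_name` are hypothetical helpers.

```python
from typing import List, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def resolve_name(name: str, known: List[str],
                 max_distance: int = 2) -> Tuple[str, Optional[str]]:
    """Classify a name as exact, suggest (near match), or unresolved."""
    if name in known:
        return ("exact", name)
    best = min(known, key=lambda k: edit_distance(name, k), default=None)
    if best is not None and edit_distance(name, best) <= max_distance:
        return ("suggest", best)
    return ("unresolved", None)
```

With `known = ["nova-station", "glass-cathedral"]`, the name `"nova-stations"` is one edit away from `"nova-station"`, so the result is `("suggest", "nova-station")` and the squirrel can ask "did you mean 'nova-station'?".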
6. Mark Completion
When a source or batch is fully mined:
╭─ squirrel mining complete
│
│ Processed: 12 transcripts in client-calls
│ Extracted: 8 people, 14 decisions, 6 tasks, 3 insights, 2 themes
│ Stashed: 33 items ready for routing at save
│ Discoveries: 2 new people suggested, 1 bundle suggested, 1 cross-link
│
│ > Run alive:save to route everything
│ > Mine another source
│ > Done for now
╰─
Mining Sessions (via alive:session-history)
When alive:session-history identifies unmined sessions — long transcripts with extensive decision-making or research — it can hand off to mine with a specific scope:
"Mine the last 3 squirrels" -> read session transcripts, extract decisions/rationale/context that didn't make it into the stash or log. This recovers lost context from sessions that were saved quickly or not saved at all.
"Mine everything about X from February" -> filter squirrel entries by date and topic, then mine matching transcripts for deep context on that specific subject.
The squirrel resolves transcript paths using the same discovery mechanism as history (see alive:session-history Transcript Discovery section).
Output
Everything mine extracts goes through the stash. Nothing is written directly to walnut files during mining — it all routes at save.
- Enriched context.manifest.yaml: extraction tracking updated in the discovered: field
- Stash items: decisions, tasks, insights, notes tagged with destination walnuts
- Suggested new walnuts — people, subjects that deserve their own space
- Suggested new bundles — themes or bodies of work emerging from sources
- Suggested cross-links — connections between walnuts discovered in sources
What Mine Is NOT
- Not alive:capture-context. Capture brings new content IN. Mine extracts value from content already captured.
- Not alive:session-history. History shows the session timeline. Mine goes deep on specific sources.
- Not alive:search-world. Find searches for known things. Mine discovers unknown things.
Capture is the intake. Mine is the refinery. History is the timeline. Find is the retrieval.