Corpus Snapshot

Generate a point-in-time snapshot of the research corpus with computed metrics and analysis. Reads a snapshot template, fills [COMPUTE] sections with data, assists with [ANALYZE] sections, and writes the completed report.

Triggers

"take a corpus snapshot"
"generate corpus report"
"snapshot the research"
"corpus snapshot"
/corpus-snapshot

Parameters

`--compute-only` (optional)

Only compute data sections — skip analysis sections. Faster, fully automated.

`--delta-only` (optional)

Only compute the delta from the previous snapshot. Useful for tracking session progress.

`--template <path>` (optional)

Custom template path. Default: .aiwg/reports/corpus-snapshot-template.md.

`--format` (optional)

Output format: full (default for the report file), summary (terminal), json (programmatic).

Prerequisites

Before generating a snapshot, the following should be current:

Prerequisite	Command	Gates on
Citation edges complete	`/citation-backfill`	Topology metrics
Indices up to date	`/corpus-index-build`	Group counts, hub analysis
Stub rate < 10%	`/research-quality-audit`	Snapshot validity

If prerequisites are stale, the snapshot will include warnings.

Execution Flow

Phase 1: Collect Raw Metrics

Scan the corpus and compute:

Dimensions:

Total papers (node count)
Total citation edges (edge count)
Topics (unique tag count)
Authors (unique author count)
Year range (oldest → newest)
Source types distribution

Topology (from citation-network index):

Graph density: edges / (nodes * (nodes-1))
Average degree (mean edges per node)
Max hub (node with most connections)
Connected components count
Isolated nodes (degree 0)
Diameter estimate (longest shortest path in largest component)

Degree Distribution:

Histogram: how many nodes have degree 0, 1-2, 3-5, 6-10, 11-20, 20+
Power law fit (if applicable)

Quality Distribution:

GRADE breakdown: High / Moderate / Low / Very Low
Doc depth: Full / Adequate / Stub / Skeleton (from quality-audit)
Source availability: PDF present / Full text extracted / Missing

Phase 2: Compute Delta (if previous snapshot exists)

Compare current metrics against the most recent snapshot:

Delta from previous snapshot (2026-04-10):
  Papers:     +12 (360 → 372)
  Edges:      +87 (1,160 → 1,247)
  Density:    +0.001 (0.008 → 0.009)
  New topics:  +2 (gui-agents, code-generation)
  Stubs fixed: 23 (88 → 65)
  New hubs:    REF-364 (entered top 10)

Phase 3: Fill Template Sections

Read the snapshot template and fill sections:

[COMPUTE] sections — fully automated:

Dimensions table
Topology metrics
Degree distribution histogram
GRADE distribution
Delta table

[ANALYZE] sections — agent-assisted:

Cluster narrative: describe the main clusters and their themes
Chain analysis: identify citation chains (A→B→C→D) and their significance
Gap narrative: summarize disconnected areas and bridge opportunities
Trend analysis: what's growing, what's stagnant

Phase 4: Write Report

Write the completed snapshot to:

.aiwg/reports/corpus-snapshot-YYYY-MM-DD.md

With frontmatter:

---
type: corpus-snapshot
date: 2026-04-13
papers: 372
edges: 1247
density: 0.009
components: 9
stub_rate: 0.17
previous: corpus-snapshot-2026-04-10.md
---

Phase 5: Report Summary

Corpus Snapshot Generated
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Papers: 372 (+12)  |  Edges: 1,247 (+87)
Density: 0.009     |  Components: 9
Hub: REF-016 (34)  |  Isolated: 3
GRADE: 33% High, 24% Mod, 26% Low, 16% VLow
Stubs: 65 (17%)    |  Full text: 54%

Delta highlights:
  +12 papers inducted
  +87 citation edges (backfill)
  -23 stubs (expanded)
  +2 new topics

Written to: .aiwg/reports/corpus-snapshot-2026-04-13.md

Template Format

The default template uses markers for computed vs analyzed sections:

# Corpus Snapshot — [DATE]

## Dimensions
[COMPUTE: dimensions-table]

## Topology
[COMPUTE: topology-metrics]

## Degree Distribution
[COMPUTE: degree-histogram]

## Quality Distribution
[COMPUTE: grade-distribution]
[COMPUTE: depth-distribution]

## Delta
[COMPUTE: delta-from-previous]

## Cluster Analysis
[ANALYZE: describe main clusters, their themes, and notable papers]

## Citation Chains
[ANALYZE: identify significant citation chains and their meaning]

## Gaps and Opportunities
[ANALYZE: summarize disconnected areas and bridge opportunities]

## Recommendations
[ANALYZE: what should be inducted next, what needs expansion]

Integration Points

Component	Relationship
`corpus-index-build`	Reads index metrics (topology, hubs, components)
`research-quality-audit`	Reads depth distribution; gates if stub rate > 10%
`citation-backfill`	Must run before snapshot for accurate topology
`research-gap-detect`	Cluster data feeds into gap narrative
`research-status`	Snapshot is the detailed version of the health score

Examples

# Full snapshot with analysis
/corpus-snapshot

# Just data, no analysis sections
/corpus-snapshot --compute-only

# Delta from previous snapshot only
/corpus-snapshot --delta-only

# Custom template
/corpus-snapshot --template .aiwg/reports/custom-template.md

# JSON metrics for dashboards
/corpus-snapshot --format json

References

@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/corpus-index-build/SKILL.md — Index metrics source
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-quality-audit/SKILL.md — Depth distribution source
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/citation-backfill/SKILL.md — Prerequisite for topology
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-gap-detect/SKILL.md — Cluster data for narrative
@$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-status/SKILL.md — Health scoring complement