
context-loading-protocol

Decide which agents and skills to load for a given task. Use at the start of every task to select the minimum viable context load, calculate the token budget, and stay below the 40% utilization ceiling.

Stars
190
Source
bdfinst/agentic-dev-team
Updated
2026-05-30
Slug
bdfinst--agentic-dev-team--context-loading-protocol

// install — copy + paste into any project

mkdir -p .claude/skills && curl -fsSL https://raw.githubusercontent.com/bdfinst/agentic-dev-team/HEAD/plugins/agentic-dev-team/skills/context-loading-protocol/SKILL.md -o .claude/skills/context-loading-protocol.md

Drops the SKILL.md into .claude/skills/context-loading-protocol.md. Works with Claude Code, Cursor, and any agent that loads SKILL.md files from .claude/skills/.

Context Loading Protocol

The token-budget reference (CLAUDE.md baseline, full-load ceiling, per-agent and per-skill costs) lives in the Baseline Budget section of CLAUDE.md. This skill is the runtime procedure; don't duplicate the table here, because it goes stale.

Constraints

  • Never load all agents upfront; load only the primary agent for each phase.
  • Keep total context below 40% of the model's window at all times.
  • Load agents on demand when their phase begins, not speculatively.
  • Use tool-based file reads (Read); do not paste file contents into the prompt.

Loading Decision Procedure

Step 1: Classify the task

| Profile | Description | Example |
| --- | --- | --- |
| Simple/Single | One agent, no skills | "Fix this typo", "Write a unit test" |
| Standard/Single | One agent + 1–2 skills | "Implement this feature using hexagonal architecture" |
| Multi-Agent | 2–3 agents coordinating | "Design and implement a new API endpoint" |
| Complex/Multi | 3+ agents + skills | "Build a new bounded context with full test coverage" |
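
As a rough illustration, the profile table above can be read as a function of how many agents and skills a task needs. This is only a sketch: `classify_task` is a hypothetical helper, not part of the skill, and it has to make assumptions where the table is silent or overlaps (exactly three agents, or one agent with more than two skills).

```python
def classify_task(agent_count: int, skill_count: int) -> str:
    """Map agent and skill counts to one of the four profiles in the table."""
    if agent_count < 1:
        raise ValueError("a task needs at least one agent")
    if agent_count == 1:
        # Assumption: one agent with more than two skills is still Standard/Single.
        return "Simple/Single" if skill_count == 0 else "Standard/Single"
    # Assumption: the table overlaps at exactly three agents, so skills break the tie.
    if agent_count > 3 or (agent_count == 3 and skill_count > 0):
        return "Complex/Multi"
    return "Multi-Agent"


print(classify_task(1, 0))  # Simple/Single
print(classify_task(3, 2))  # Complex/Multi
```

In practice the classification is a judgment about the request itself; the counts are just a convenient proxy once the needed agents and skills are known.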

Step 2: Select agents

Load the minimum set:

  1. Identify the primary agent (owns the deliverable).
  2. Identify supporting agents (input or review).
  3. Do NOT load agents for downstream validation yet — load them when their phase begins.

Order: primary first, then supporting agents one at a time as their phase begins.
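
The ordering rule above (primary first, everything else only when its phase begins) might be sketched as follows. The `AgentRole` type, the role labels, and the integer phase numbers are assumptions made for illustration; the skill itself does not define such a structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRole:
    name: str
    role: str   # hypothetical labels: "primary", "supporting", "validation"
    phase: int  # phase in which the agent is first needed


def agents_to_load(candidates: list[AgentRole], current_phase: int) -> list[AgentRole]:
    """Return only the agents whose phase has begun, primary agent first."""
    due = [a for a in candidates if a.phase <= current_phase]
    # False sorts before True, so the primary agent leads; ties keep input order.
    return sorted(due, key=lambda a: (a.role != "primary", a.phase))


team = [
    AgentRole("QA", "validation", 2),
    AgentRole("Software Engineer", "primary", 1),
    AgentRole("Architect", "supporting", 1),
]
print([a.name for a in agents_to_load(team, 1)])  # ['Software Engineer', 'Architect']
```

QA stays out of the phase 1 result even though it is a known candidate, which is the point of deferring downstream validation.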

Step 3: Select skills

For each loaded agent, check its ## Skills section:

  • Only load skills relevant to the current task — not all skills the agent references.
  • Skills shared by multiple loaded agents only need to be loaded once.
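
A minimal sketch of the two skill rules above: filter by relevance, and load a shared skill only once. `select_skills` and the `api-design` and `tdd` skill names are hypothetical; how relevance is decided is left to the caller.

```python
def select_skills(agent_skills: dict[str, list[str]], relevant: set[str]) -> list[str]:
    """Pick relevant skills across all loaded agents, each at most once."""
    selected: list[str] = []
    seen: set[str] = set()
    for skills in agent_skills.values():
        for skill in skills:
            # Skip skills unrelated to the task and skills already selected.
            if skill in relevant and skill not in seen:
                seen.add(skill)
                selected.append(skill)
    return selected


loaded = {
    "software-engineer": ["hexagonal-architecture", "tdd", "api-design"],
    "architect": ["hexagonal-architecture", "api-design"],
}
print(select_skills(loaded, {"hexagonal-architecture"}))  # ['hexagonal-architecture']
```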

Step 4: Calculate token budget

Total = CLAUDE.md baseline
      + conversation history (estimate)
      + agent files (sum selected)
      + skill files (sum selected)
      + expected output (estimate)

Target: total < 40% of the model's context window. For Claude with a 200K window, that's < 80K tokens. The config files are a small fraction; the real budget concern is conversation history + output accumulation over multi-turn tasks.
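
The budget sum and the 40% check can be worked through in a few lines. This is a sketch under stated assumptions: `estimate_budget` is a hypothetical helper, and every token figure in the example is made up for illustration; real costs come from the Baseline Budget section of CLAUDE.md.

```python
CEILING_PERCENT = 40  # utilization ceiling from this skill


def estimate_budget(baseline: int, history: int, agent_costs: list[int],
                    skill_costs: list[int], expected_output: int,
                    window: int = 200_000) -> tuple[int, float, bool]:
    """Return (total tokens, utilization percent, whether total is under the ceiling)."""
    total = baseline + history + sum(agent_costs) + sum(skill_costs) + expected_output
    utilization = 100 * total / window
    # Integer comparison avoids floating-point edge cases right at the ceiling.
    within_ceiling = total * 100 < CEILING_PERCENT * window
    return total, utilization, within_ceiling


# Illustrative numbers only, not measured costs.
total, utilization, ok = estimate_budget(
    baseline=3_000, history=20_000, agent_costs=[1_500],
    skill_costs=[1_200, 800], expected_output=10_000,
)
print(total, ok)  # 36500 True
```

Here history and expected output account for 30,000 of the 36,500 tokens, which matches the note that configuration files are the smaller share of the budget.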

Step 5: Load via tool-based file reads

Read agents/software-engineer.md
Read skills/hexagonal-architecture/SKILL.md

Do NOT copy file contents into the system prompt or conversation.

Loading Profiles

Pre-computed loading sets for common task types.

Code Implementation

  • Load: Software Engineer + relevant skill(s)
  • Defer: QA (load after implementation), Architect (load only if design questions arise)

Architecture Design

  • Load: Architect + relevant architecture skill(s)
  • Defer: Software Engineer (load at implementation), QA (load at validation)

Bug Fix

  • Load: Software Engineer only
  • Defer: QA (load if regression test needed)

New Feature (full lifecycle)

Three phases, each in a fresh context window, with a human review gate between phases. Each phase's output is a structured progress file in memory/ that onboards the next phase.

| Phase | Load | Purpose | Output |
| --- | --- | --- | --- |
| 1. Research | Orchestrator + sub-agents (exploration) | Understand system, find files, trace data flows | Research progress file |
| 2. Plan | Architect + PM (if needed) + relevant skill(s) | Specify every change: files, snippets, tests | Implementation plan progress file |
| 3. Implement | Software Engineer + QA + skill(s) | Execute the plan; code, tests | Working code + test results |

Key rules:

  • Each phase starts with a fresh context window, loading only the previous phase's progress file.
  • Human reviews and approves the progress file before the next phase begins.
  • Sub-agents primarily provide context isolation — they search, read, and return concise findings.
  • If implementation is large, compact mid-phase: update the plan progress file with completed steps and continue in a fresh context.

Unloading

Since tokens can't be literally removed from context:

  1. Phase transitions: summarize the completed phase's output into memory/ and start a new conversation for the next phase.
  2. Within a conversation: stop referencing the agent or skill and treat it as inactive. Use the Context Summarization skill to compress stale content.
  3. Multi-turn accumulation: when conversation history crosses 30% utilization, trigger summarization before loading additional agents.
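
The 30% rule in item 3 amounts to a simple gate in front of every additional load. A minimal sketch, assuming a 200K window and treating "crosses" as "reaches or exceeds"; `next_action` and its return values are hypothetical names.

```python
def next_action(history_tokens: int, window: int = 200_000,
                summarize_at_percent: int = 30) -> str:
    """Decide whether to summarize before loading another agent."""
    # Assumption: reaching the threshold exactly also triggers summarization.
    if history_tokens * 100 >= summarize_at_percent * window:
        return "summarize"
    return "load"


print(next_action(45_000))  # load
print(next_action(60_000))  # summarize
```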

Anti-patterns

  • Loading all agents upfront: wastes tokens before any work begins. Load only the primary agent.
  • Loading all of an agent's skills: most are irrelevant to the specific request.
  • Never unloading: context grows monotonically, and hallucination risk grows with it. Summarize and phase-transition.
  • Loading agents "just in case": adds cost without value. Load on demand when the phase begins.

Output

Loading plan as one table: selected agents + skills, token costs, estimated total, and utilization percentage against the 40% ceiling. No narration.
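
One possible shape for that table, sketched as a small renderer. The column names, row layout, and `render_loading_plan` itself are assumptions; the skill only specifies which facts the table must contain. Token figures are illustrative.

```python
def render_loading_plan(items: list[tuple[str, str, int]], window: int = 200_000) -> str:
    """Render (kind, name, tokens) rows plus total and utilization as one table."""
    total = sum(tokens for _, _, tokens in items)
    utilization = 100 * total / window
    lines = ["| Type | Name | Tokens |", "| --- | --- | --- |"]
    lines += [f"| {kind} | {name} | {tokens} |" for kind, name, tokens in items]
    lines.append(f"| Total | estimated | {total} |")
    lines.append(f"| Utilization | vs. 40% ceiling | {utilization:.1f}% |")
    return "\n".join(lines)


# Illustrative costs only.
print(render_loading_plan([
    ("agent", "software-engineer", 12_000),
    ("skill", "hexagonal-architecture", 8_000),
]))
```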