Compliance Mapping (pattern-first)
Purpose
Map unified findings to regulatory control citations so executive-audience reports can name the specific regulations at risk. Designed to be deterministic first — a pattern table handles the bulk; LLM judgment is invoked ONLY for cases the table flags as ambiguous.
This mapping is explicitly informational, not audit-grade. No report produced by this pipeline substitutes for a certified auditor's opinion. The disclaimer is mandatory and must be reproduced with its exact wording.
Inputs
- Disposition register (post-fp-reduction unified findings with verdicts)
- knowledge/compliance-patterns.yaml (this plugin): the pattern table
- Optional: target organization's pre-declared control scope (which regulations apply)
Output
A compliance-annotations.json file with the following shape. The annotator field is "pattern-table", or "llm" when the compliance-edge-annotator agent modified the annotation:
    {
      "schema_version": "1.0",
      "generated_at": "2026-04-21T10:00:00Z",
      "disclaimer": "<verbatim text from knowledge/disclaimers.md § Compliance mapping disclaimer>",
      "annotations": [
        {
          "finding_rule_id": "semgrep.python.hardcoded-password",
          "finding_file": "config/prod.py",
          "finding_line": 42,
          "regulations": [
            {"regulation": "PCI-DSS", "control_id": "3.2.1", "citation": "PCI-DSS v4.0 §3.2.1"},
            {"regulation": "GDPR", "control_id": "Art. 32", "citation": "GDPR Article 32 — Security of Processing"}
          ],
          "annotator": "pattern-table"
        }
      ]
    }
Pattern table schema (knowledge/compliance-patterns.yaml)
Each row maps a finding pattern to one or more regulatory citations.
    # plugins/agentic-security-assessment/knowledge/compliance-patterns.yaml
    patterns:
      - id: pan-at-log
        pattern_regex: 'log\.(debug|info).*(pan|card_number|primary_account)'
        field_type: "pii.pan"
        applies_to_rules:
          - "semgrep.*.pii-log"
          - "business-logic.fraud.tokenization-skip-under-flag"
        regulations:
          - regulation: "PCI-DSS"
            control_id: "3.4"
            citation: "PCI-DSS v4.0 §3.4 — PAN must be rendered unreadable in storage and in logs"
          - regulation: "PCI-DSS"
            control_id: "10.2"
            citation: "PCI-DSS v4.0 §10.2 — Audit logs must not contain sensitive authentication data"
        llm_review_trigger: false
      - id: unencrypted-db-transit
        pattern_regex: "(mongodb|postgres|mysql)://[^@]+@[^/]+/[^?]+($|(?!.*ssl|.*tls))"
        field_type: "db.transit"
        applies_to_rules:
          - "semgrep.*.unencrypted-database-connection"
          - "trivy.iac.*"
        regulations:
          - regulation: "PCI-DSS"
            control_id: "4.1"
            citation: "PCI-DSS v4.0 §4.1 — Strong cryptography and security protocols on open public networks"
          - regulation: "GDPR"
            control_id: "Art. 32"
            citation: "GDPR Article 32 — Security of Processing"
        llm_review_trigger: false
      - id: auth-bypass-admin
        pattern_regex: ".*admin.*"
        field_type: "auth.missing"
        applies_to_rules:
          - "semgrep.*.missing-csrf"
          - "semgrep.*.unauthenticated-endpoint"
          - "business-logic.fraud.*"
        regulations:
          - regulation: "PCI-DSS"
            control_id: "8.3"
            citation: "PCI-DSS v4.0 §8.3 — Multi-factor authentication for admin access"
          - regulation: "SOC2"
            control_id: "CC6.1"
            citation: "SOC2 Trust Services Criteria CC6.1 — Logical access controls"
        llm_review_trigger: true  # rules in this class often need case-specific judgment
      - id: insecure-random
        pattern_regex: ".*"
        field_type: "crypto.random"
        applies_to_rules:
          - "semgrep.*.insecure-random"
        regulations:
          - regulation: "PCI-DSS"
            control_id: "3.6.1"
            citation: "PCI-DSS v4.0 §3.6.1 — Cryptographic key generation practices"
        llm_review_trigger: false
Schema fields per row:
| Field | Required | Type | Purpose |
|---|---|---|---|
| id | yes | string | Stable identifier |
| pattern_regex | yes | string | Regex applied to the finding's message / file / context |
| field_type | yes | string | Categorical tag (pii.pan, auth.missing, crypto.random, etc.) for audit traceability |
| applies_to_rules | yes | list[string] | Rule-id glob patterns this row matches |
| regulations | yes | list[object] | One or more citation objects |
| llm_review_trigger | no | bool (default false) | If true, LLM edge annotator is invoked for findings matched by this row |
Procedure
1. Load and validate the pattern table
Parse YAML. Every row MUST have the required fields. An invalid row fails the run with a named error pointing to the row id.
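The validation step can be sketched as follows. This is a minimal illustration, not the plugin's actual loader: it assumes the YAML has already been parsed into a list of dicts, and the names PatternRowError and validate_rows are hypothetical. It checks only the presence and top-level type of the fields in the schema table and applies the documented default for llm_review_trigger.

```python
# Required fields and their expected top-level types, per the schema table.
REQUIRED_FIELDS = {
    "id": str,
    "pattern_regex": str,
    "field_type": str,
    "applies_to_rules": list,
    "regulations": list,
}


class PatternRowError(ValueError):
    """Hypothetical error type; the message names the offending row id."""


def validate_rows(rows: list) -> list:
    """Return rows with defaults applied, or raise on the first invalid row."""
    validated = []
    for index, row in enumerate(rows):
        # Fall back to the row position when the id itself is missing.
        row_id = row.get("id", f"<row {index}>")
        for field, expected_type in REQUIRED_FIELDS.items():
            if field not in row:
                raise PatternRowError(
                    f"row {row_id!r}: missing required field {field!r}"
                )
            if not isinstance(row[field], expected_type):
                raise PatternRowError(
                    f"row {row_id!r}: field {field!r} must be {expected_type.__name__}"
                )
        trigger = row.get("llm_review_trigger", False)
        if not isinstance(trigger, bool):
            raise PatternRowError(
                f"row {row_id!r}: llm_review_trigger must be a bool"
            )
        # llm_review_trigger is optional and defaults to false.
        validated.append({**row, "llm_review_trigger": trigger})
    return validated
```

Raising on the first bad row, rather than collecting and skipping, mirrors the rule that an invalid row fails the run.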
2. For each finding in the disposition register
Apply rows in file-declaration order:
- Check applies_to_rules globs against finding.rule_id. No match → skip row.
- Check pattern_regex against finding.message + finding.file (concatenated). No match → skip row.
- Match: record annotation with the row's regulations. Set annotator: "pattern-table".
- If llm_review_trigger: true, invoke the compliance-edge-annotator agent for this finding only. The agent can refine citations, add or remove regulations, or add a judgment note. Set annotator: "llm" if the agent modified anything.
A finding with zero matching rows gets no annotation and does NOT appear in the output.
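The two matching checks can be sketched as below. This is an illustration under stated assumptions: findings and rows are plain dicts, globs are evaluated with the standard library's fnmatch.fnmatchcase, and the regex is applied with re.search (the text does not say whether the pattern must match from the start of the string). The function name matching_rows is hypothetical, and how regulations from several matching rows are merged is left to the caller because the procedure does not specify it.

```python
import fnmatch
import re


def matching_rows(finding: dict, rows: list) -> list:
    """Return the rows that match a finding, in file-declaration order."""
    # The regex is checked against message and file, concatenated.
    haystack = finding.get("message", "") + finding.get("file", "")
    matched = []
    for row in rows:
        # Step 1: at least one rule-id glob must match finding.rule_id.
        rule_ok = any(
            fnmatch.fnmatchcase(finding["rule_id"], glob)
            for glob in row["applies_to_rules"]
        )
        if not rule_ok:
            continue
        # Step 2: the row's regex must match the concatenated text.
        # Assumption: search semantics, not anchored at the start.
        if re.search(row["pattern_regex"], haystack) is None:
            continue
        matched.append(row)
    return matched
```

An empty result corresponds to the case where the finding gets no annotation and is left out of the output.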
3. LLM call counting (for eval)
The skill exposes an optional LLMCallCounter interface for eval assertions:
    class LLMCallCounter:
        def invoke(self, prompt: str, context: dict) -> str: ...
        def count(self) -> int: ...
        def reset(self) -> None: ...
Production dispatches go through a real LLM. Eval runs inject a mock counter and assert the count matches the expected number of llm_review_trigger: true matches in the fixture.
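A mock satisfying that interface might look like the sketch below. The class MockLLMCallCounter, the canned reply, and the review_flagged dispatch loop are all hypothetical; they only illustrate how an eval could assert one call per flagged finding. The input shape, a list of (finding, matched rows) pairs, is likewise an assumption.

```python
class MockLLMCallCounter:
    """Eval-only stand-in that counts calls instead of reaching a model."""

    def __init__(self, canned_reply: str = "{}") -> None:
        self._reply = canned_reply
        self._calls = 0

    def invoke(self, prompt: str, context: dict) -> str:
        self._calls += 1
        return self._reply

    def count(self) -> int:
        return self._calls

    def reset(self) -> None:
        self._calls = 0


def review_flagged(matches: list, llm) -> None:
    """Hypothetical dispatch: at most one call per finding, no retries."""
    for finding, rows in matches:
        # A finding is reviewed once, even if several flagged rows matched it.
        if any(row.get("llm_review_trigger", False) for row in rows):
            llm.invoke(
                prompt="Review the compliance citations for this finding.",
                context={"finding": finding, "row_ids": [r["id"] for r in rows]},
            )
```

An eval would build the fixture's matches, run review_flagged with the mock, and compare count() against the number of findings that hit a flagged row.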
4. Write output
Write memory/compliance-<slug>.json with the disclaimer at the root. The exec-report-generator includes the disclaimer verbatim in the report header.
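The write step could be sketched like this, assuming the disclaimer text has already been read from knowledge/disclaimers.md and is passed in unchanged. The function name write_annotations and its parameters are hypothetical; the field names follow the output shape shown earlier.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_annotations(memory_dir: Path, slug: str, disclaimer: str,
                      annotations: list) -> Path:
    """Write compliance-<slug>.json with the disclaimer at the root."""
    if not disclaimer.strip():
        # The disclaimer is mandatory, so refuse to write without it.
        raise ValueError("disclaimer text is required")
    document = {
        "schema_version": "1.0",
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "disclaimer": disclaimer,  # stored verbatim, never reworded
        "annotations": annotations,
    }
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"compliance-{slug}.json"
    # ensure_ascii=False keeps characters such as the section sign readable.
    path.write_text(
        json.dumps(document, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return path
```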
Disclaimer
Use the exact wording in knowledge/disclaimers.md § "Compliance mapping disclaimer". Required on every report.
Invariants
- Pattern table is the source of truth. LLM never invents citations; it only refines or annotates table-matched citations.
- LLM call count is bounded: one call per finding with llm_review_trigger: true. No loops, no recursion.
- Schema-invalid pattern rows fail the run. No silent skipping.
- Every annotation carries annotator: "pattern-table" | "llm" for audit.
- The disclaimer appears verbatim in every compliance output. Downstream must not strip it.
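Two of these invariants can be checked mechanically on a finished output document. The sketch below is one possible check, not part of the skill: the function name invariant_violations is hypothetical, and it assumes the output has been loaded into a dict and that the caller supplies the expected disclaimer text.

```python
ALLOWED_ANNOTATORS = {"pattern-table", "llm"}


def invariant_violations(document: dict, expected_disclaimer: str) -> list:
    """Return human-readable violations; an empty list means both checks pass."""
    problems = []
    # The disclaimer must be present and byte-for-byte identical.
    if document.get("disclaimer") != expected_disclaimer:
        problems.append("disclaimer is missing or not verbatim")
    # Every annotation must say which annotator produced it.
    for index, annotation in enumerate(document.get("annotations", [])):
        if annotation.get("annotator") not in ALLOWED_ANNOTATORS:
            problems.append(
                f"annotations[{index}]: annotator must be 'pattern-table' or 'llm'"
            )
    return problems
```

The remaining invariants (bounded LLM calls, failing on invalid rows) concern run-time behavior and are not visible in the output file, so they are not covered here.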
Related
- agents/compliance-edge-annotator.md: the sonnet-tier agent invoked for llm_review_trigger: true matches
- knowledge/compliance-patterns.yaml: the pattern table itself
- plugins/agentic-dev-team/knowledge/security-primitives-contract.md: unified finding envelope