Regression Check Command
Compare current system behavior against a baseline to detect regressions using executable feedback patterns.
Task
I'll perform comprehensive regression detection by:
- Establishing baseline comparison point (branch, tag, commit, or timestamp)
- Identifying scope of changes to test
- Executing relevant test suites with baseline comparison
- Analyzing execution feedback for behavioral changes
- Reporting detected regressions with severity and impact
- Optionally correlating with bug reports
Arguments
--baseline <ref>- Comparison baseline (branch name, git tag, commit hash, or ISO timestamp)- Examples:
--baseline main,--baseline v2026.1.4,--baseline abc123,--baseline 2026-01-20T10:00:00Z - Default:
mainbranch or most recent release tag
- Examples:
--scope <level>- Testing scopefull- Run complete test suite (default)module <name>- Test specific module/componentchanged-files- Test only files changed since baseline- Example:
--scope module auth
--format <type>- Output formatsummary- High-level regression report (default)detailed- Full execution logs with diffsjson- Machine-readable format for CI/CD integration
--bug-report <path>- Correlate with bug report- Verifies bug fix doesn't introduce regressions
- Cross-references test failures with reported issues
- Example:
--bug-report .aiwg/bugs/BUG-042.md
--save-baseline- Save current execution as new baseline- Useful after confirming changes are intentional
- Updates regression detection baseline for future runs
Process
Step 1: Baseline Establishment
# Determine baseline reference
BASELINE_REF="${--baseline:-main}"
# Checkout baseline (in temporary worktree)
git worktree add /tmp/baseline-check "$BASELINE_REF"
# Capture baseline test results
cd /tmp/baseline-check
npm test -- --json > /tmp/baseline-results.json
Step 2: Scope Identification
Full Scope:
- Run entire test suite in both baseline and current
- Compare all test outcomes
Module Scope:
- Identify module-specific tests
- Run subset of test suite
- Focus regression detection on specified component
Changed Files Scope:
# Identify changed files
git diff --name-only "$BASELINE_REF" > /tmp/changed-files.txt
# Map to test files
# Example: src/auth/login.ts → test/unit/auth/login.test.ts
# Run only affected tests
Step 3: Execution Comparison
Execute tests in current state:
# Run tests with same configuration as baseline
npm test -- --json > /tmp/current-results.json
# Capture additional metrics
# - Performance timing
# - Code coverage deltas
# - Error types and frequencies
Step 4: Regression Analysis
Compare execution results:
regression_detection:
new_failures:
- test: "test/unit/auth/login.test.ts::validateCredentials"
baseline_status: PASS
current_status: FAIL
error: "Expected 200, received 401"
severity: CRITICAL
behavior_changes:
- test: "test/integration/api/users.test.ts::createUser"
baseline_output: "User created in 150ms"
current_output: "User created in 450ms"
delta: +200%
severity: MAJOR
fixed_tests:
- test: "test/unit/utils/parser.test.ts::edgeCase"
baseline_status: FAIL
current_status: PASS
note: "Intentional fix"
Step 5: Reporting
Summary Format:
## Regression Check Report
**Baseline**: main (commit abc123)
**Scope**: full
**Date**: 2026-01-25T15:30:00Z
### Summary
- **New Failures**: 3 tests
- **Behavior Changes**: 5 tests (performance degradation)
- **Fixed Tests**: 2 tests
- **Overall**: ⚠️ REGRESSIONS DETECTED
### Critical Regressions
1. **auth/login.test.ts::validateCredentials**
- Status: PASS → FAIL
- Error: "Expected 200, received 401"
- Impact: Breaks user authentication
- Action: BLOCK MERGE
2. **payments/process.test.ts::chargeCard**
- Status: PASS → FAIL
- Error: "Transaction timeout"
- Impact: Payment processing broken
- Action: BLOCK MERGE
### Behavior Changes
1. **api/users.test.ts::createUser**
- Performance: 150ms → 450ms (+200%)
- Severity: MAJOR
- Action: INVESTIGATE
[... detailed report continues ...]
JSON Format:
{
"baseline": {
"ref": "main",
"commit": "abc123",
"timestamp": "2026-01-20T10:00:00Z"
},
"current": {
"commit": "def456",
"timestamp": "2026-01-25T15:30:00Z"
},
"summary": {
"new_failures": 3,
"behavior_changes": 5,
"fixed_tests": 2,
"total_tests": 247
},
"regressions": [
{
"test": "auth/login.test.ts::validateCredentials",
"type": "failure",
"severity": "critical",
"baseline_status": "PASS",
"current_status": "FAIL",
"error": "Expected 200, received 401",
"action": "block"
}
],
"verdict": "FAIL"
}
Bug Report Integration
When --bug-report is provided:
# Load bug report
BUG_REPORT=$(cat .aiwg/bugs/BUG-042.md)
# Extract:
# - Bug description
# - Affected components
# - Expected fix behavior
# Cross-reference regression results:
# 1. Did fix resolve reported bug? (test now passes)
# 2. Did fix introduce new failures? (regressions)
# 3. Did fix change behavior elsewhere? (side effects)
# Generate correlation report:
Example Output:
## Bug Fix Verification: BUG-042
**Bug**: Login fails for users with special characters in email
**Fix Verification**: ✅ PASS
- Test `auth/login.test.ts::specialCharactersInEmail` now passes
- Previously failing with "Email validation error"
**Regression Check**: ⚠️ WARNING
- **New failure detected**: `auth/login.test.ts::standardLogin`
- Error: "Unexpected character escape"
- Likely caused by overly aggressive input sanitization
- **Action**: Refine fix to avoid breaking standard login flow
**Recommendation**: REFINE FIX before merge
Baseline Modes
Branch Baseline
# Compare against branch
/regression-check --baseline develop --scope changed-files
# Use case: Feature branch testing
# Ensures changes don't break existing functionality
Tag Baseline
# Compare against release
/regression-check --baseline v2026.1.4 --format detailed
# Use case: Release regression testing
# Verifies new version doesn't break previous release behavior
Commit Baseline
# Compare against specific commit
/regression-check --baseline abc123def --scope module auth
# Use case: Pinpoint regression introduction
# Identifies exact commit that introduced regression
Timestamp Baseline
# Compare against point in time
/regression-check --baseline 2026-01-20T10:00:00Z
# Use case: Time-based regression detection
# Finds when behavior changed over time
Executable Feedback Pattern
Implements REF-013 MetaGPT executable feedback loop:
executable_feedback_loop:
1_execute_baseline:
- checkout_baseline_ref
- run_test_suite
- capture_execution_results
2_execute_current:
- run_test_suite_current
- capture_execution_results
3_compare_feedback:
- detect_new_failures
- detect_behavior_changes
- detect_performance_regressions
4_analyze_root_cause:
- diff_source_changes
- identify_likely_culprits
- suggest_fixes
5_iterate_or_report:
- if_regressions_critical: block_merge
- if_regressions_minor: warn_and_document
- if_no_regressions: approve
Usage Examples
Example 1: Pre-Merge Regression Check
# Before merging feature branch
/regression-check --baseline main --scope changed-files --format summary
# Expected outcome:
# - Fast execution (only changed files tested)
# - Summary of any regressions introduced
# - Recommendation: merge/block/investigate
Example 2: Bug Fix Verification
# After implementing bug fix
/regression-check --baseline v2026.1.4 --bug-report .aiwg/bugs/BUG-042.md
# Expected outcome:
# - Verifies bug is fixed (test now passes)
# - Checks for side effects (new failures)
# - Provides merge recommendation
Example 3: Release Regression Testing
# Before cutting release
/regression-check --baseline main --scope full --format detailed
# Expected outcome:
# - Comprehensive test suite comparison
# - Detailed logs for all detected regressions
# - Complete report for release decision
Example 4: Module-Specific Testing
# After refactoring auth module
/regression-check --baseline main --scope module auth --format json
# Expected outcome:
# - Focused regression detection
# - JSON output for CI/CD integration
# - Fast feedback on module changes
Example 5: Performance Regression Detection
# Detect performance regressions
/regression-check --baseline v2026.1.3 --format detailed
# Expected outcome:
# - Timing comparisons for all tests
# - Flags tests with >20% performance degradation
# - Detailed analysis of slow tests
Integration with CI/CD
# .github/workflows/regression-check.yml
name: Regression Check
on:
pull_request:
branches: [main]
jobs:
regression:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0 # Full history for baseline comparison
- name: Run regression check
run: |
aiwg regression-check \
--baseline main \
--scope changed-files \
--format json \
> regression-results.json
- name: Post results
if: failure()
uses: actions/github-script@v6
with:
script: |
const results = require('./regression-results.json');
// Post comment to PR with regression report
Output Artifacts
Generated files:
.aiwg/testing/regression-reports/
├── regression-{timestamp}.md # Human-readable report
├── regression-{timestamp}.json # Machine-readable results
├── baseline-execution.log # Baseline test output
├── current-execution.log # Current test output
└── diff-analysis.md # Source code diff analysis
Exit Codes
0 - No regressions detected
1 - Critical regressions (blocking)
2 - Major regressions (warning)
3 - Minor regressions (informational)
4 - Execution error (baseline/current test failure)
References
- @$AIWG_ROOT/docs/references/REF-013-metagpt-multi-agent-framework.md - Executable feedback pattern
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/executable-feedback.md - Feedback loop implementation rules
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/templates/test/regression-test-set-card.md - Regression test template
- @.aiwg/testing/ - Test artifacts location
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/executable-feedback.md - Test-first principles