E2E Test Runner
Run browser E2E tests defined in natural-language JSON files. Each test case gets its own Claude Code session and browser instance for full isolation. The runner uses agent-browser for token-efficient browser interaction with built-in video recording.
Quick Start
# Run tests (dev server already running)
/e2e-test tests/login.test.json
# Auto-start dev server, take screenshots
/e2e-test tests/login.test.json --run "npm run dev" --port 3000 --screenshots
# Visual regression against baseline
/e2e-test tests/login.test.json --baseline ./e2e-results/1234567890
Workflow
- Parse `$ARGUMENTS` to extract the test file path and any flags
- Check that the runner dependencies are installed at `${CLAUDE_PLUGIN_DATA}/node_modules`. If not, run: `cd "${CLAUDE_PLUGIN_DATA}" && cp "${CLAUDE_PLUGIN_ROOT}/scripts/runner/package.json" . && npm install --production 2>&1`
- Verify agent-browser is installed: `command -v agent-browser >/dev/null 2>&1 || { echo "agent-browser not found. Install: npm install -g agent-browser && agent-browser install"; exit 1; }`
- Run the test runner: `"${CLAUDE_PLUGIN_DATA}/node_modules/.bin/tsx" "${CLAUDE_PLUGIN_ROOT}/scripts/runner/src/index.ts" --testsPath <path> --resultsPath ./e2e-results [additional flags from $ARGUMENTS]`
- Read `./e2e-results/test-summary.md` and present the results to the user
- If `./e2e-results/report.html` exists, mention it for detailed interactive viewing
- If any tests failed, point out screenshot and video locations
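The parse and run steps above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `splitArguments` and `buildRunnerInvocation` are hypothetical helpers, and the quoting rule (double quotes only, no escapes) is an assumption.

```typescript
// Hypothetical sketch: turns a raw $ARGUMENTS string into the runner command line.
// Helper names and the quoting rule are assumptions made for illustration.

interface RunnerInvocation {
  command: string;
  args: string[];
}

// Split on whitespace, keeping double-quoted segments together
// so that `--run "npm run dev"` yields the single value `npm run dev`.
function splitArguments(raw: string): string[] {
  const tokens: string[] = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(raw)) !== null) {
    tokens.push(match[1] ?? match[2] ?? "");
  }
  return tokens;
}

// The first token is treated as the test file path; everything else is
// passed through to the runner unchanged.
function buildRunnerInvocation(
  raw: string,
  pluginData: string, // value of ${CLAUDE_PLUGIN_DATA}
  pluginRoot: string, // value of ${CLAUDE_PLUGIN_ROOT}
): RunnerInvocation {
  const [testsPath, ...flags] = splitArguments(raw);
  if (!testsPath || testsPath.startsWith("-")) {
    throw new Error("Expected a test file path as the first argument");
  }
  return {
    command: `${pluginData}/node_modules/.bin/tsx`,
    args: [
      `${pluginRoot}/scripts/runner/src/index.ts`,
      "--testsPath",
      testsPath,
      "--resultsPath",
      "./e2e-results",
      ...flags,
    ],
  };
}
```

Returning the command and its arguments as separate values, rather than one joined string, avoids re-quoting values such as `npm run dev` when the process is spawned.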
Test File Format
For how to write test files, see references/test-schema.md.
CLI Options
| Flag | Description |
|---|---|
| `--testsPath`, `-t` | Path to the JSON test file (required) |
| `--resultsPath`, `-o` | Output directory for results (default: `./e2e-results/<timestamp>`) |
| `--verbose`, `-v` | Include all Claude Code messages in output |
| `--screenshots`, `-s` | Take screenshots at every step (not just failures) |
| `--maxTurns` | Max Claude Code interactions per test (default: 30) |
| `--model`, `-m` | Override the Claude model |
| `--run <command>` | Dev server start command (e.g. `npm run dev`) |
| `--port <port>` | Dev server port (auto-detected from framework, fallback: 3000) |
| `--url <url>` | Override the URL to open (instead of `http://localhost:port`) |
| `--headed` | Show the browser window (default: headless) |
| `--baseline <path>` | Baseline results directory for visual regression diff |
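To make the defaults and aliases in the table concrete, here is a small sketch of how such flags could be parsed. It is not the runner's real parser: `parseOptions` and `resolveTargetUrl` are hypothetical, and the port precedence (explicit `--port`, then a detected port, then 3000) is an assumption based on the table's wording.

```typescript
// Hypothetical sketch of flag parsing; names and precedence rules are assumptions.

interface RunnerOptions {
  testsPath: string;
  resultsPath?: string; // left unset here; the documented default is ./e2e-results/<timestamp>
  verbose: boolean;
  screenshots: boolean;
  maxTurns: number;
  model?: string;
  run?: string;
  port?: number;
  url?: string;
  headed: boolean;
  baseline?: string;
}

const ALIASES: Record<string, string> = {
  "-t": "--testsPath",
  "-o": "--resultsPath",
  "-v": "--verbose",
  "-s": "--screenshots",
  "-m": "--model",
};
const BOOLEAN_FLAGS = new Set(["--verbose", "--screenshots", "--headed"]);
const VALUE_FLAGS = new Set([
  "--testsPath", "--resultsPath", "--maxTurns", "--model",
  "--run", "--port", "--url", "--baseline",
]);

function parseOptions(argv: string[]): RunnerOptions {
  const values = new Map<string, string>();
  const switches = new Set<string>();
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    const flag = ALIASES[token] ?? token; // expand short aliases
    if (BOOLEAN_FLAGS.has(flag)) {
      switches.add(flag);
    } else if (VALUE_FLAGS.has(flag)) {
      const value = argv[i + 1];
      if (value === undefined) throw new Error(`Missing value for ${flag}`);
      values.set(flag, value);
      i++; // skip the value just consumed
    } else {
      throw new Error(`Unknown flag: ${token}`);
    }
  }
  const testsPath = values.get("--testsPath");
  if (!testsPath) throw new Error("--testsPath is required");
  const port = values.get("--port");
  return {
    testsPath,
    resultsPath: values.get("--resultsPath"),
    verbose: switches.has("--verbose"),
    screenshots: switches.has("--screenshots"),
    maxTurns: Number(values.get("--maxTurns") ?? 30), // documented default: 30
    model: values.get("--model"),
    run: values.get("--run"),
    port: port === undefined ? undefined : Number(port),
    url: values.get("--url"),
    headed: switches.has("--headed"), // documented default: headless
    baseline: values.get("--baseline"),
  };
}

// --url wins outright; otherwise build http://localhost:<port>.
function resolveTargetUrl(options: RunnerOptions, detectedPort?: number): string {
  return options.url ?? `http://localhost:${options.port ?? detectedPort ?? 3000}`;
}
```

For example, `parseOptions(["-t", "tests/login.test.json", "-s"])` leaves `headed` false and `maxTurns` at 30, and `resolveTargetUrl` on that result falls back to port 3000 when no port was detected.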
Gotchas
- agent-browser required: Install globally: `npm install -g agent-browser && agent-browser install`. Without it, the runner exits immediately.
- First run installs runner dependencies (~30s). Subsequent runs skip this.
- Each test case spawns a separate Claude Code session via the SDK, so Claude login is required.
- Video recordings are saved as `.webm` files per test case. They play natively in browsers.
- If `--run` is omitted and `--url` is not set, the runner auto-detects the framework from `package.json` and starts the appropriate dev server. Use `--run` to override.
- The dev server is killed when tests complete. If the port is already in use, the existing server is used instead.
- Test results are saved to `./e2e-results/` by default. Each run creates a timestamped subdirectory.
- The interactive HTML report (`report.html`) includes embedded video playback and expandable test details.
- For visual regression (`--baseline`), screenshots are compared pixel-by-pixel. Diff images are saved as `diff-step-*.png`.
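As an illustration of what a pixel-by-pixel comparison involves, the sketch below compares two screenshots that have already been decoded to raw RGBA bytes. It is not the runner's implementation: the `RawImage` shape, the exact-match rule (no tolerance threshold), and the red highlight colour are all assumptions, and PNG decoding and encoding are left out.

```typescript
// Hypothetical sketch of a pixel-by-pixel diff over decoded RGBA buffers.
// The runner's real comparison logic and any threshold it applies are not shown here.

interface RawImage {
  width: number;
  height: number;
  data: Uint8Array; // RGBA, 4 bytes per pixel, row-major
}

interface DiffResult {
  changedPixels: number;
  diff: Uint8Array; // same layout as the inputs; changed pixels painted red
}

function diffImages(baseline: RawImage, current: RawImage): DiffResult {
  if (baseline.width !== current.width || baseline.height !== current.height) {
    throw new Error("Image dimensions differ");
  }
  const pixelCount = baseline.width * baseline.height;
  const diff = new Uint8Array(pixelCount * 4);
  let changedPixels = 0;

  for (let p = 0; p < pixelCount; p++) {
    const o = p * 4; // byte offset of this pixel
    const same =
      baseline.data[o] === current.data[o] &&
      baseline.data[o + 1] === current.data[o + 1] &&
      baseline.data[o + 2] === current.data[o + 2] &&
      baseline.data[o + 3] === current.data[o + 3];

    if (same) {
      // Unchanged: copy the current pixel through as-is.
      diff[o] = current.data[o];
      diff[o + 1] = current.data[o + 1];
      diff[o + 2] = current.data[o + 2];
      diff[o + 3] = current.data[o + 3];
    } else {
      // Changed: mark with opaque red so the difference is easy to spot.
      changedPixels++;
      diff[o] = 255;
      diff[o + 1] = 0;
      diff[o + 2] = 0;
      diff[o + 3] = 255;
    }
  }
  return { changedPixels, diff };
}
```

An exact-match rule like this flags even a one-unit change in a single colour channel, so anti-aliasing or font-rendering differences between machines can produce diffs; capturing baselines in the same environment as the test run helps keep comparisons stable.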