initial commit

2026-03-16 19:54:53 -04:00
commit bfe0e01254
3341 changed files with 483939 additions and 0 deletions
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/SKILL.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/SKILL.md
@@ -0,0 +1,6 @@
+---
+name: bmad-testarch-test-review
+description: 'Review test quality using best practices validation. Use when user says "lets review tests" or "I want to evaluate test quality"'
+---
+
+Follow the instructions in [workflow.md](workflow.md).
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/bmad-skill-manifest.yaml
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/bmad-skill-manifest.yaml
@@ -0,0 +1 @@
+type: skill
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/checklist.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/checklist.md
@@ -0,0 +1,475 @@
+# Test Quality Review - Validation Checklist
+
+Use this checklist to validate that the test quality review workflow completed successfully and all quality criteria were properly evaluated.
+
+---
+
+## Prerequisites
+
+Note: `test-review` is optional and only audits existing tests; it does not generate tests.
+Coverage analysis is out of scope for this workflow. Use `trace` for coverage metrics and coverage gate decisions.
+
+### Test File Discovery
+
+- [ ] Test file(s) identified for review (single/directory/suite scope)
+- [ ] Test files exist and are readable
+- [ ] Test framework detected (Playwright, Jest, Cypress, Vitest, etc.)
+- [ ] Test framework configuration found (playwright.config.ts, jest.config.js, etc.)
+
+### Knowledge Base Loading
+
+- [ ] tea-index.csv loaded successfully
+- [ ] `test-quality.md` loaded (Definition of Done)
+- [ ] `fixture-architecture.md` loaded (Pure function → Fixture patterns)
+- [ ] `network-first.md` loaded (Route intercept before navigate)
+- [ ] `data-factories.md` loaded (Factory patterns)
+- [ ] `test-levels-framework.md` loaded (E2E vs API vs Component vs Unit)
+- [ ] All other enabled fragments loaded successfully
+
+### Context Gathering
+
+- [ ] Story file discovered or explicitly provided (if available)
+- [ ] Test design document discovered or explicitly provided (if available)
+- [ ] Acceptance criteria extracted from story (if available)
+- [ ] Priority context (P0/P1/P2/P3) extracted from test-design (if available)
+
+---
+
+## Process Steps
+
+### Step 1: Context Loading
+
+- [ ] Review scope determined (single/directory/suite)
+- [ ] Test file paths collected
+- [ ] Related artifacts discovered (story, test-design)
+- [ ] Knowledge base fragments loaded successfully
+- [ ] Quality criteria flags read from workflow variables
+
+### Step 2: Test File Parsing
+
+**For Each Test File:**
+
+- [ ] File read successfully
+- [ ] File size measured (lines, KB)
+- [ ] File structure parsed (describe blocks, it blocks)
+- [ ] Test IDs extracted (if present)
+- [ ] Priority markers extracted (if present)
+- [ ] Imports analyzed
+- [ ] Dependencies identified
+
+**Test Structure Analysis:**
+
+- [ ] Describe block count calculated
+- [ ] It/test block count calculated
+- [ ] BDD structure identified (Given-When-Then)
+- [ ] Fixture usage detected
+- [ ] Data factory usage detected
+- [ ] Network interception patterns identified
+- [ ] Assertions counted
+- [ ] Waits and timeouts cataloged
+- [ ] Conditionals (if/else) detected
+- [ ] Try/catch blocks detected
+- [ ] Shared state or globals detected
+
+### Step 3: Quality Criteria Validation
+
+Coverage criteria are intentionally excluded from this checklist.
+
+**For Each Enabled Criterion:**
+
+#### BDD Format (if `check_given_when_then: true`)
+
+- [ ] Given-When-Then structure evaluated
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with line numbers
+- [ ] Examples of good/bad patterns noted
+
+#### Test IDs (if `check_test_ids: true`)
+
+- [ ] Test ID presence validated
+- [ ] Test ID format checked (e.g., 1.3-E2E-001)
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Missing IDs cataloged
+
+#### Priority Markers (if `check_priority_markers: true`)
+
+- [ ] P0/P1/P2/P3 classification validated
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Missing priorities cataloged
+
+#### Hard Waits (if `check_hard_waits: true`)
+
+- [ ] sleep(), waitForTimeout(), hardcoded delays detected
+- [ ] Justification comments checked
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with line numbers and recommended fixes
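Illustrative sketch (not part of the commit) of how the hard-wait check above could be automated. The function name `findHardWaits`, the inclusion of `setTimeout`, and the rule that a `//` comment on the same or preceding line counts as a justification are all assumptions made for this example; the checklist does not define them.

```javascript
// Matches calls to sleep(), waitForTimeout(), or setTimeout() (setTimeout is an assumed addition).
const HARD_WAIT = /\b(?:sleep|waitForTimeout|setTimeout)\s*\(/;

// Returns one finding per line that contains a hard wait.
// `justified` is a rough heuristic: a `//` comment on the same line,
// or a comment-only line directly above.
function findHardWaits(source) {
  const lines = source.split('\n');
  const findings = [];
  lines.forEach((line, i) => {
    if (!HARD_WAIT.test(line)) return;
    const prev = i > 0 ? lines[i - 1] : '';
    findings.push({
      line: i + 1, // 1-based, to match file:line reporting
      text: line.trim(),
      justified: line.includes('//') || prev.trim().startsWith('//'),
    });
  });
  return findings;
}
```

A line-based regex will also flag commented-out code, so a reviewer would still need to confirm each finding.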
+
+#### Determinism (if `check_determinism: true`)
+
+- [ ] Conditionals (if/else/switch) detected
+- [ ] Try/catch abuse detected
+- [ ] Random or time-dependent values (Math.random, Date.now) detected
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+#### Isolation (if `check_isolation: true`)
+
+- [ ] Cleanup hooks (afterEach/afterAll) validated
+- [ ] Shared state detected
+- [ ] Global variable mutations detected
+- [ ] Resource cleanup verified
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+#### Fixture Patterns (if `check_fixture_patterns: true`)
+
+- [ ] Fixtures detected (test.extend)
+- [ ] Pure functions validated
+- [ ] mergeTests usage checked
+- [ ] beforeEach complexity analyzed
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+#### Data Factories (if `check_data_factories: true`)
+
+- [ ] Factory functions detected
+- [ ] Hardcoded data (magic strings/numbers) detected
+- [ ] Faker.js or similar usage validated
+- [ ] API-first setup pattern checked
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+#### Network-First (if `check_network_first: true`)
+
+- [ ] page.route() before page.goto() validated
+- [ ] Race conditions detected (route after navigate)
+- [ ] waitForResponse patterns checked
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
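Illustrative sketch (not part of the commit) of the ordering check behind the network-first criterion: any `page.route()` registered after the first `page.goto()` in a test body is a potential race. The function name and the line-by-line scan are assumptions; a real reviewer would work per test block rather than per string.

```javascript
// Flags page.route() calls that appear after the first page.goto() in `testBody`.
// `testBody` is assumed to be the source text of a single test.
function findRouteAfterNavigate(testBody) {
  let gotoLine = null; // 1-based line of the first navigation, once seen
  const violations = [];
  testBody.split('\n').forEach((line, i) => {
    if (gotoLine === null && /\bpage\.goto\s*\(/.test(line)) {
      gotoLine = i + 1;
    } else if (gotoLine !== null && /\bpage\.route\s*\(/.test(line)) {
      // The intercept is registered too late to catch requests fired by navigation.
      violations.push({ line: i + 1, gotoLine });
    }
  });
  return violations;
}
```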
+
+#### Assertions (if `check_assertions: true`)
+
+- [ ] Explicit assertions counted
+- [ ] Implicit waits without assertions detected
+- [ ] Assertion specificity validated
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+#### Test Length (if `check_test_length: true`)
+
+- [ ] File line count calculated
+- [ ] Threshold comparison (≤300 lines ideal)
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Splitting recommendations generated (if >300 lines)
+
+#### Test Duration (if `check_test_duration: true`)
+
+- [ ] Test complexity analyzed (as proxy for duration if no execution data)
+- [ ] Threshold comparison (≤1.5 min target)
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Optimization recommendations generated
+
+#### Flakiness Patterns (if `check_flakiness_patterns: true`)
+
+- [ ] Tight timeouts detected (e.g., { timeout: 1000 })
+- [ ] Race conditions detected
+- [ ] Timing-dependent assertions detected
+- [ ] Retry logic detected
+- [ ] Environment-dependent assumptions detected
+- [ ] Status assigned (PASS/WARN/FAIL)
+- [ ] Violations recorded with recommended fixes
+
+---
+
+### Step 4: Quality Score Calculation
+
+**Violation Counting:**
+
+- [ ] Critical (P0) violations counted
+- [ ] High (P1) violations counted
+- [ ] Medium (P2) violations counted
+- [ ] Low (P3) violations counted
+- [ ] Violation breakdown by criterion recorded
+
+**Score Calculation:**
+
+- [ ] Starting score: 100
+- [ ] Critical violations deducted (-10 each)
+- [ ] High violations deducted (-5 each)
+- [ ] Medium violations deducted (-2 each)
+- [ ] Low violations deducted (-1 each)
+- [ ] Bonus points added (max +30):
+  - [ ] Excellent BDD structure (+5 if applicable)
+  - [ ] Comprehensive fixtures (+5 if applicable)
+  - [ ] Comprehensive data factories (+5 if applicable)
+  - [ ] Network-first pattern (+5 if applicable)
+  - [ ] Perfect isolation (+5 if applicable)
+  - [ ] All test IDs present (+5 if applicable)
+- [ ] Final score calculated: max(0, min(100, Starting - Violations + Bonus))
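Illustrative sketch (not part of the commit) of the score calculation listed above. The deduction values, the +5 bonuses, the +30 cap, and the clamp come from the checklist; the function name and the shape of the `violations` argument are assumptions.

```javascript
// Deduction per violation, keyed by severity (values from the checklist).
const DEDUCTIONS = { P0: 10, P1: 5, P2: 2, P3: 1 };
const BONUS_PER_ITEM = 5; // each bonus condition is worth +5
const MAX_BONUS = 30; // six bonus conditions at most

// `violations`: counts per severity, e.g. { P0: 1, P2: 3 } (hypothetical shape).
// `bonusCount`: how many of the six bonus conditions apply.
function calculateQualityScore(violations, bonusCount = 0) {
  const deducted = Object.entries(DEDUCTIONS).reduce(
    (sum, [severity, cost]) => sum + cost * (violations[severity] || 0),
    0,
  );
  const bonus = Math.min(MAX_BONUS, bonusCount * BONUS_PER_ITEM);
  // max(0, min(100, Starting - Violations + Bonus)) with Starting = 100
  return Math.max(0, Math.min(100, 100 - deducted + bonus));
}

// 1 critical, 2 high, 3 medium, 1 low, 2 bonuses: 100 - 27 + 10
console.log(calculateQualityScore({ P0: 1, P1: 2, P2: 3, P3: 1 }, 2)); // 83
```

Because the result is capped at 100, bonuses can only offset deductions; they never lift a clean file above 100.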
+
+**Quality Grade:**
+
+- [ ] Grade assigned based on score:
+  - 90-100: A+ (Excellent)
+  - 80-89: A (Good)
+  - 70-79: B (Acceptable)
+  - 60-69: C (Needs Improvement)
+  - <60: F (Critical Issues)
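Illustrative sketch (not part of the commit) of the grade bands above as a lookup. The function name and return shape are assumptions; the thresholds and labels are taken from the list.

```javascript
// Maps a 0-100 score to the grade bands in the checklist.
function gradeForScore(score) {
  if (score >= 90) return { grade: 'A+', label: 'Excellent' };
  if (score >= 80) return { grade: 'A', label: 'Good' };
  if (score >= 70) return { grade: 'B', label: 'Acceptable' };
  if (score >= 60) return { grade: 'C', label: 'Needs Improvement' };
  return { grade: 'F', label: 'Critical Issues' };
}

console.log(gradeForScore(87).grade); // A
```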
+
+---
+
+### Step 5: Review Report Generation
+
+**Report Sections Created:**
+
+- [ ] **Header Section**:
+  - [ ] Test file(s) reviewed listed
+  - [ ] Review date recorded
+  - [ ] Review scope noted (single/directory/suite)
+  - [ ] Quality score and grade displayed
+
+- [ ] **Executive Summary**:
+  - [ ] Overall assessment (Excellent/Good/Acceptable/Needs Improvement/Critical Issues)
+  - [ ] Key strengths listed (3-5 bullet points)
+  - [ ] Key weaknesses listed (3-5 bullet points)
+  - [ ] Recommendation stated (Approve/Approve with comments/Request changes/Block)
+
+- [ ] **Quality Criteria Assessment**:
+  - [ ] Table with all criteria evaluated
+  - [ ] Status for each criterion (PASS/WARN/FAIL)
+  - [ ] Violation count per criterion
+
+- [ ] **Critical Issues (Must Fix)**:
+  - [ ] P0/P1 violations listed
+  - [ ] Code location provided for each (file:line)
+  - [ ] Issue explanation clear
+  - [ ] Recommended fix provided with code example
+  - [ ] Knowledge base reference provided
+
+- [ ] **Recommendations (Should Fix)**:
+  - [ ] P2/P3 violations listed
+  - [ ] Code location provided for each (file:line)
+  - [ ] Issue explanation clear
+  - [ ] Recommended improvement provided with code example
+  - [ ] Knowledge base reference provided
+
+- [ ] **Best Practices Examples** (if good patterns found):
+  - [ ] Good patterns highlighted from tests
+  - [ ] Knowledge base fragments referenced
+  - [ ] Examples provided for others to follow
+
+- [ ] **Knowledge Base References**:
+  - [ ] All fragments consulted listed
+  - [ ] Links to detailed guidance provided
+
+---
+
+### Step 6: Optional Outputs Generation
+
+**Inline Comments** (if `generate_inline_comments: true`):
+
+- [ ] Inline comments generated at violation locations
+- [ ] Comment format: `// TODO (TEA Review): [Issue] - See test-review-{filename}.md`
+- [ ] Comments added to test files (no logic changes)
+- [ ] Test files remain valid and executable
+
+**Quality Badge** (if `generate_quality_badge: true`):
+
+- [ ] Badge created with quality score (e.g., "Test Quality: 87/100 (A)")
+- [ ] Badge format suitable for README or documentation
+- [ ] Badge saved to output folder
+
+**Story Update** (if `append_to_story: true` and story file exists):
+
+- [ ] "Test Quality Review" section created
+- [ ] Quality score included
+- [ ] Critical issues summarized
+- [ ] Link to full review report provided
+- [ ] Story file updated successfully
+
+---
+
+### Step 7: Save and Notify
+
+**Outputs Saved:**
+
+- [ ] Review report saved to `{output_file}`
+- [ ] Inline comments written to test files (if enabled)
+- [ ] Quality badge saved (if enabled)
+- [ ] Story file updated (if enabled)
+- [ ] All outputs are valid and readable
+
+**Summary Message Generated:**
+
+- [ ] Quality score and grade included
+- [ ] Critical issue count stated
+- [ ] Recommendation provided (Approve/Approve with comments/Request changes/Block)
+- [ ] Next steps clarified
+- [ ] Message displayed to user
+
+---
+
+## Output Validation
+
+### Review Report Completeness
+
+- [ ] All required sections present
+- [ ] No placeholder text or TODOs in report
+- [ ] All code locations are accurate (file:line)
+- [ ] All code examples are valid and demonstrate fix
+- [ ] All knowledge base references are correct
+
+### Review Report Accuracy
+
+- [ ] Quality score matches violation breakdown
+- [ ] Grade matches score range
+- [ ] Violations correctly categorized by severity (P0/P1/P2/P3)
+- [ ] Violations correctly attributed to quality criteria
+- [ ] No false positives (violations are legitimate issues)
+- [ ] No false negatives (critical issues not missed)
+
+### Review Report Clarity
+
+- [ ] Executive summary is clear and actionable
+- [ ] Issue explanations are understandable
+- [ ] Recommended fixes are implementable
+- [ ] Code examples are correct and runnable
+- [ ] Recommendation (Approve/Approve with comments/Request changes/Block) is clear
+
+---
+
+## Quality Checks
+
+### Knowledge-Based Validation
+
+- [ ] All feedback grounded in knowledge base fragments
+- [ ] Recommendations follow proven patterns
+- [ ] No arbitrary or opinion-based feedback
+- [ ] Knowledge fragment references accurate and relevant
+
+### Actionable Feedback
+
+- [ ] Every issue includes recommended fix
+- [ ] Every fix includes code example
+- [ ] Code examples demonstrate correct pattern
+- [ ] Fixes reference knowledge base for more detail
+
+### Severity Classification
+
+- [ ] Critical (P0) issues are genuinely critical (hard waits, race conditions, no assertions)
+- [ ] High (P1) issues impact maintainability/reliability (missing IDs, hardcoded data)
+- [ ] Medium (P2) issues are nice-to-have improvements (long files, missing priorities)
+- [ ] Low (P3) issues are minor style/preference (verbose tests)
+
+### Context Awareness
+
+- [ ] Review considers project context (some patterns may be justified)
+- [ ] Violations with justification comments noted as acceptable
+- [ ] Edge cases acknowledged
+- [ ] Recommendations are pragmatic, not dogmatic
+
+---
+
+## Integration Points
+
+### Story File Integration
+
+- [ ] Story file discovered correctly (if available)
+- [ ] Acceptance criteria extracted and used for context
+- [ ] Test quality section appended to story (if enabled)
+- [ ] Link to review report added to story
+
+### Test Design Integration
+
+- [ ] Test design document discovered correctly (if available)
+- [ ] Priority context (P0/P1/P2/P3) extracted and used
+- [ ] Review validates tests align with prioritization
+- [ ] Misalignment flagged (e.g., P0 scenario missing tests)
+
+### Knowledge Base Integration
+
+- [ ] tea-index.csv loaded successfully
+- [ ] All required fragments loaded
+- [ ] Fragments applied correctly to validation
+- [ ] Fragment references in report are accurate
+
+---
+
+## Edge Cases and Special Situations
+
+### Empty or Minimal Tests
+
+- [ ] If test file is empty, report notes "No tests found"
+- [ ] If test file has only boilerplate, report notes "No meaningful tests"
+- [ ] Score reflects lack of content appropriately
+
+### Legacy Tests
+
+- [ ] Legacy tests acknowledged in context
+- [ ] Review provides practical recommendations for improvement
+- [ ] Review recognizes that a complete refactor may not be feasible
+- [ ] Review prioritizes critical issues (flakiness) over style
+
+### Test Framework Variations
+
+- [ ] Review adapts to test framework (Playwright vs Jest vs Cypress)
+- [ ] Framework-specific patterns recognized (e.g., Playwright fixtures)
+- [ ] Framework-specific violations detected (e.g., Cypress anti-patterns)
+- [ ] Knowledge fragments applied appropriately for framework
+
+### Justified Violations
+
+- [ ] Violations with justification comments in code noted as acceptable
+- [ ] Justifications evaluated for legitimacy
+- [ ] Report acknowledges justified patterns
+- [ ] Score not penalized for justified violations
+
+---
+
+## Final Validation
+
+### Review Completeness
+
+- [ ] All enabled quality criteria evaluated
+- [ ] All test files in scope reviewed
+- [ ] All violations cataloged
+- [ ] All recommendations provided
+- [ ] Review report is comprehensive
+
+### Review Accuracy
+
+- [ ] Quality score is accurate
+- [ ] Violations are correct (no false positives)
+- [ ] Critical issues not missed (no false negatives)
+- [ ] Code locations are correct
+- [ ] Knowledge base references are accurate
+
+### Review Usefulness
+
+- [ ] Feedback is actionable
+- [ ] Recommendations are implementable
+- [ ] Code examples are correct
+- [ ] Review helps developer improve tests
+- [ ] Review educates on best practices
+
+### Workflow Complete
+
+- [ ] All checklist items completed
+- [ ] All outputs validated and saved
+- [ ] User notified with summary
+- [ ] Review ready for developer consumption
+- [ ] Follow-up actions identified (if any)
+
+---
+
+## Notes
+
+Record any issues, observations, or important context during workflow execution:
+
+- **Test Framework**: [Playwright, Jest, Cypress, etc.]
+- **Review Scope**: [single file, directory, full suite]
+- **Quality Score**: [0-100 score, letter grade]
+- **Critical Issues**: [Count of P0/P1 violations]
+- **Recommendation**: [Approve / Approve with comments / Request changes / Block]
+- **Special Considerations**: [Legacy code, justified patterns, edge cases]
+- **Follow-up Actions**: [Re-review after fixes, pair programming, etc.]
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/instructions.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/instructions.md
@@ -0,0 +1,45 @@
+# Test Quality Review
+
+**Workflow:** `bmad-testarch-test-review`
+**Version:** 5.0 (Step-File Architecture)
+
+---
+
+## Overview
+
+Review test quality using the TEA knowledge base and produce a 0–100 quality score with actionable findings.
+
+Coverage assessment is intentionally out of scope for this workflow. Use `trace` for requirements coverage and coverage gate decisions.
+
+---
+
+## WORKFLOW ARCHITECTURE
+
+This workflow uses **step-file architecture**:
+
+- **Micro-file Design**: Each step is self-contained
+- **JIT Loading**: Only the current step file is in memory
+- **Sequential Enforcement**: Execute steps in order
+
+---
+
+## INITIALIZATION SEQUENCE
+
+### 1. Configuration Loading
+
+From `workflow.yaml`, resolve:
+
+- `config_source`, `test_artifacts`, `user_name`, `communication_language`, `document_output_language`, `date`
+- `test_dir`, `review_scope`
+
+### 2. First Step
+
+Load, read completely, and execute:
+`./steps-c/step-01-load-context.md`
+
+### 3. Resume Support
+
+If the user selects **Resume** mode, load, read completely, and execute:
+`./steps-c/step-01b-resume.md`
+
+This checks the output document for progress tracking frontmatter and routes to the next incomplete step.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-01-load-context.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-01-load-context.md
@@ -0,0 +1,197 @@
+---
+name: 'step-01-load-context'
+description: 'Load knowledge base, determine scope, and gather context'
+nextStepFile: './step-02-discover-tests.md'
+knowledgeIndex: '{project-root}/_bmad/tea/testarch/tea-index.csv'
+outputFile: '{test_artifacts}/test-review.md'
+---
+
+# Step 1: Load Context & Knowledge Base
+
+## STEP GOAL
+
+Determine review scope, load required knowledge fragments, and gather related artifacts.
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read the entire step file before acting
+- ✅ Speak in `{communication_language}`
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Record outputs before proceeding
+- 📖 Load the next step only when instructed
+
+## CONTEXT BOUNDARIES:
+
+- Available context: config, loaded artifacts, and knowledge fragments
+- Focus: this step's goal only
+- Limits: do not execute future steps
+- Dependencies: prior steps' outputs (if any)
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly. Do not skip, reorder, or improvise.
+
+## 1. Determine Scope and Stack
+
+Use `review_scope`:
+
+- **single**: one file
+- **directory**: all tests in folder
+- **suite**: all tests in repo
+
+If unclear, ask the user.
+
+**Stack Detection** (for context-aware loading):
+
+Read `test_stack_type` from `{config_source}`. If `"auto"` or not configured, infer `{detected_stack}` by scanning `{project-root}`:
+
+- **Frontend indicators**: `playwright.config.*`, `cypress.config.*`, `package.json` with react/vue/angular
+- **Backend indicators**: `pyproject.toml`, `pom.xml`/`build.gradle`, `go.mod`, `*.csproj`, `Gemfile`, `Cargo.toml`
+- **Both present** → `fullstack`; only frontend → `frontend`; only backend → `backend`
+- Explicit `test_stack_type` overrides auto-detection
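Illustrative sketch (not part of the commit) of the stack detection rules above. The indicator lists are copied from the bullets; the function name, its inputs (a list of file names found under the project root and the dependency names from `package.json`), and the `'unknown'` fallback are assumptions, since the step does not say what to do when no indicator is present.

```javascript
// Indicators from the bullets above.
const FRONTEND_CONFIG = /^(?:playwright|cypress)\.config\./;
const FRONTEND_DEPS = ['react', 'vue', 'angular'];
const BACKEND_FILES = ['pyproject.toml', 'pom.xml', 'build.gradle', 'go.mod', 'Gemfile', 'Cargo.toml'];

// `configured`: value of test_stack_type, if any.
// `rootFiles`: file names discovered under the project root (hypothetical input).
// `packageDeps`: dependency names from package.json, if one exists.
function detectStack(configured, rootFiles, packageDeps = []) {
  // An explicit setting other than "auto" overrides detection.
  if (configured && configured !== 'auto') return configured;

  const frontend =
    rootFiles.some((f) => FRONTEND_CONFIG.test(f)) ||
    packageDeps.some((dep) => FRONTEND_DEPS.some((name) => dep.includes(name)));
  const backend = rootFiles.some((f) => BACKEND_FILES.includes(f) || f.endsWith('.csproj'));

  if (frontend && backend) return 'fullstack';
  if (frontend) return 'frontend';
  if (backend) return 'backend';
  return 'unknown'; // assumption: the step leaves this case undefined
}
```

The dependency match is a loose substring test, so `@angular/core` and `react-dom` both count as frontend indicators.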
+
+---
+
+### Tiered Knowledge Loading
+
+Load fragments based on their `tier` classification in `tea-index.csv`:
+
+1. **Core tier** (always load): Foundational fragments required for this workflow
+2. **Extended tier** (load on-demand): Load when deeper analysis is needed or when the user's context requires it
+3. **Specialized tier** (load only when relevant): Load only when the specific use case matches (e.g., contract-testing only for microservices, email-auth only for email flows)
+
+> **Context Efficiency**: Loading only core fragments reduces context usage by 40-50% compared to loading all fragments.
+
+### Playwright Utils Loading Profiles
+
+**If `tea_use_playwright_utils` is enabled**, select the appropriate loading profile:
+
+- **API-only profile** (when `{detected_stack}` is `backend` or no `page.goto`/`page.locator` found in test files):
+  Load: `overview`, `api-request`, `auth-session`, `recurse` (~1,800 lines)
+
+- **Full UI+API profile** (when `{detected_stack}` is `frontend`/`fullstack` or browser tests detected):
+  Load: all Playwright Utils core fragments (~4,500 lines)
+
+**Detection**: Scan `{test_dir}` for files containing `page.goto` or `page.locator`. If none found, use API-only profile.
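Illustrative sketch (not part of the commit) of the profile selection above. The two bullets can disagree when the stack is `backend` but browser calls are present; this sketch assumes the first bullet wins in that case. The function name, the profile labels, and the `testSources` input (test file contents already read from `{test_dir}`) are hypothetical.

```javascript
// Browser usage indicators named in the Detection rule.
const BROWSER_CALL = /page\.(?:goto|locator)/;

// `testSources`: array of test file contents (hypothetical input).
function selectPlaywrightUtilsProfile(detectedStack, testSources) {
  const hasBrowserTests = testSources.some((src) => BROWSER_CALL.test(src));
  // Assumption: a backend stack selects API-only even if browser calls exist.
  if (detectedStack === 'backend' || !hasBrowserTests) return 'api-only';
  return 'full-ui-api';
}
```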
+
+### Pact.js Utils Loading
+
+**If `tea_use_pactjs_utils` is enabled** (and contract tests detected in review scope):
+
+Load: `pactjs-utils-overview.md`, `pactjs-utils-provider-verifier.md`, `pactjs-utils-request-filter.md` (the 3 most relevant for reviewing provider verification tests)
+
+**If `tea_use_pactjs_utils` is disabled** but contract tests are in review scope:
+
+Load: `contract-testing.md`
+
+### Pact MCP Loading
+
+**If `tea_pact_mcp` is `"mcp"`:**
+
+Load: `pact-mcp.md`. This enables the agent to use the SmartBear MCP "Review Pact Tests" tool for automated best-practice feedback during test review.
+
+## 2. Load Knowledge Base
+
+Read `{config_source}` and check `tea_use_playwright_utils`, `tea_use_pactjs_utils`, `tea_pact_mcp`, and `tea_browser_automation` to select the correct fragment set.
+
+Then load the following fragments from `{knowledgeIndex}`:
+
+**Core:**
+
+- `test-quality.md`
+- `data-factories.md`
+- `test-levels-framework.md`
+- `selective-testing.md`
+- `test-healing-patterns.md`
+- `selector-resilience.md`
+- `timing-debugging.md`
+
+**If Playwright Utils enabled:**
+
+- `overview.md`, `api-request.md`, `network-recorder.md`, `auth-session.md`, `intercept-network-call.md`, `recurse.md`, `log.md`, `file-utils.md`, `burn-in.md`, `network-error-monitor.md`, `fixtures-composition.md`
+
+**If Playwright Utils disabled:**
+
+- `fixture-architecture.md`
+- `network-first.md`
+- `playwright-config.md`
+- `component-tdd.md`
+- `ci-burn-in.md`
+
+**Playwright CLI (if `tea_browser_automation` is "cli" or "auto"):**
+
+- `playwright-cli.md`
+
+**MCP Patterns (if `tea_browser_automation` is "mcp" or "auto"):**
+
+- (none yet; load MCP-related fragments here if any are added in the future)
+
+**Pact.js Utils (if enabled and contract tests in review scope):**
+
+- `pactjs-utils-overview.md`, `pactjs-utils-provider-verifier.md`, `pactjs-utils-request-filter.md`
+
+**Contract Testing (if pactjs-utils disabled but contract tests in review scope):**
+
+- `contract-testing.md`
+
+**Pact MCP (if tea_pact_mcp is "mcp"):**
+
+- `pact-mcp.md`
+
+---
+
+## 3. Gather Context Artifacts
+
+If available:
+
+- Story file (acceptance criteria)
+- Test design doc (priorities)
+- Framework config
+
+Summarize what was found.
+
+Coverage mapping and coverage gates are out of scope in `test-review`. Route those concerns to `trace`.
+
+---
+
+## 4. Save Progress
+
+**Save this step's accumulated work to `{outputFile}`.**
+
+- **If `{outputFile}` does not exist** (first save), create it using the workflow template (if available) with YAML frontmatter:
+
+  ```yaml
+  ---
+  stepsCompleted: ['step-01-load-context']
+  lastStep: 'step-01-load-context'
+  lastSaved: '{date}'
+  ---
+  ```
+
+  Then write this step's output below the frontmatter.
+
+- **If `{outputFile}` already exists**, update:
+  - Add `'step-01-load-context'` to `stepsCompleted` array (only if not already present)
+  - Set `lastStep: 'step-01-load-context'`
+  - Set `lastSaved: '{date}'`
+  - Append this step's output to the appropriate section of the document.
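Illustrative sketch (not part of the commit) of the frontmatter update rules above, operating on frontmatter that has already been parsed into a plain object. The function name is an assumption, and YAML parsing and serialization are deliberately left out.

```javascript
// Returns a new frontmatter object with the step recorded as completed.
// `frontmatter` may be empty on the first save.
function markStepCompleted(frontmatter, stepName, date) {
  const completed = frontmatter.stepsCompleted || [];
  return {
    ...frontmatter,
    // Add the step only if it is not already present.
    stepsCompleted: completed.includes(stepName) ? completed : [...completed, stepName],
    lastStep: stepName,
    lastSaved: date,
  };
}
```

Calling it twice with the same step leaves `stepsCompleted` unchanged, which keeps re-runs of a step from duplicating entries.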
+
+**Update `inputDocuments`**: Set `inputDocuments` in the output template frontmatter to the list of artifact paths loaded in this step (e.g., knowledge fragments, test design documents, configuration files).
+
+Load next step: `{nextStepFile}`
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Step completed in full with required outputs
+
+### ❌ SYSTEM FAILURE:
+
+- Skipped sequence steps or missing outputs
+
+**Master Rule:** Skipping steps is FORBIDDEN.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-01b-resume.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-01b-resume.md
@@ -0,0 +1,104 @@
+---
+name: 'step-01b-resume'
+description: 'Resume interrupted workflow from last completed step'
+outputFile: '{test_artifacts}/test-review.md'
+---
+
+# Step 1b: Resume Workflow
+
+## STEP GOAL
+
+Resume an interrupted workflow by loading the existing output document, displaying progress, and routing to the next incomplete step.
+
+## MANDATORY EXECUTION RULES
+
+- Read the entire step file before acting
+- Speak in `{communication_language}`
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- Follow the MANDATORY SEQUENCE exactly
+- Load the next step only when instructed
+
+## CONTEXT BOUNDARIES:
+
+- Available context: Output document with progress frontmatter
+- Focus: Load progress and route to next step
+- Limits: Do not re-execute completed steps
+- Dependencies: Output document must exist from a previous run
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly.
+
+### 1. Load Output Document
+
+Read `{outputFile}` and parse YAML frontmatter for:
+
+- `stepsCompleted` -- array of completed step names
+- `lastStep` -- last completed step name
+- `lastSaved` -- timestamp of last save
+
+**If `{outputFile}` does not exist**, display:
+
+"No previous progress found. There is no output document to resume from. Please use **[C] Create** to start a fresh workflow run."
+
+**THEN:** Halt. Do not proceed.
+
+---
+
+### 2. Display Progress Dashboard
+
+Display progress, marking each step as completed or pending:
+
+```
+Test Quality Review - Resume Progress:
+
+1. Load Context (step-01-load-context)              [completed/pending]
+2. Discover Tests (step-02-discover-tests)           [completed/pending]
+3. Quality Evaluation + Aggregate (step-03f-aggregate-scores) [completed/pending]
+4. Generate Report (step-04-generate-report)         [completed/pending]
+
+Last saved: {lastSaved}
+```
+
+---
+
+### 3. Route to Next Step
+
+Based on `lastStep`, load the next incomplete step:
+
+| lastStep                    | Next Step File                    |
+| --------------------------- | --------------------------------- |
+| `step-01-load-context`      | `./step-02-discover-tests.md`     |
+| `step-02-discover-tests`    | `./step-03-quality-evaluation.md` |
+| `step-03f-aggregate-scores` | `./step-04-generate-report.md`    |
+| `step-04-generate-report`   | **Workflow already complete.**    |
+
+**If `lastStep` is the final step** (`step-04-generate-report`), display: "All steps completed. Use **[C] Create** to start fresh, **[V] Validate** to review outputs, or **[E] Edit** to make revisions." Then halt.
+
+**If `lastStep` does not match any value above**, display: "Unknown progress state (`lastStep`: {lastStep}). Please use **[C] Create** to start fresh." Then halt.
+
+**Otherwise**, load the identified step file, read completely, and execute.
+
+The existing content in `{outputFile}` provides context from previously completed steps.
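Illustrative sketch (not part of the commit) of the routing table and its two halt conditions. The step names and file paths come from the table above; the function name and the shape of the returned object are assumptions.

```javascript
// lastStep -> next step file, as in the routing table.
const NEXT_STEP = new Map([
  ['step-01-load-context', './step-02-discover-tests.md'],
  ['step-02-discover-tests', './step-03-quality-evaluation.md'],
  ['step-03f-aggregate-scores', './step-04-generate-report.md'],
]);
const FINAL_STEP = 'step-04-generate-report';

function routeResume(lastStep) {
  // Final step reached: nothing left to run.
  if (lastStep === FINAL_STEP) return { action: 'halt', reason: 'complete' };
  const file = NEXT_STEP.get(lastStep);
  // Anything not in the table is an unknown progress state.
  if (file === undefined) return { action: 'halt', reason: 'unknown' };
  return { action: 'load', file };
}
```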
+
+---
+
+## SYSTEM SUCCESS/FAILURE METRICS
+
+### SUCCESS:
+
+- Output document loaded and parsed correctly
+- Progress dashboard displayed accurately
+- Routed to correct next step
+
+### FAILURE:
+
+- Not loading output document
+- Incorrect progress display
+- Routing to wrong step
+
+**Master Rule:** Resume MUST route to the exact next incomplete step. Never re-execute completed steps.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-02-discover-tests.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-02-discover-tests.md
@@ -0,0 +1,113 @@
+---
+name: 'step-02-discover-tests'
+description: 'Find and parse test files'
+nextStepFile: './step-03-quality-evaluation.md'
+outputFile: '{test_artifacts}/test-review.md'
+---
+
+# Step 2: Discover & Parse Tests
+
+## STEP GOAL
+
+Collect test files in scope and parse structure/metadata.
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read the entire step file before acting
+- ✅ Speak in `{communication_language}`
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Record outputs before proceeding
+- 📖 Load the next step only when instructed
+
+## CONTEXT BOUNDARIES:
+
+- Available context: config, loaded artifacts, and knowledge fragments
+- Focus: this step's goal only
+- Limits: do not execute future steps
+- Dependencies: prior steps' outputs (if any)
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly. Do not skip, reorder, or improvise.
+
+## 1. Discover Test Files
+
+- **single**: use provided file path
+- **directory**: glob under `{test_dir}` or selected folder
+- **suite**: glob all tests in repo
+
+Halt if no tests are found.
+
+---
+
+## 2. Parse Metadata (per file)
+
+Collect:
+
+- File size and line count
+- Test framework detected
+- Describe/test block counts
+- Test IDs and priority markers
+- Imports, fixtures, factories, network interception
+- Waits/timeouts and control flow (if/try/catch)
+
+---
+
+## 3. Evidence Collection (if `tea_browser_automation` is `cli` or `auto`)
+
+> **Fallback:** If CLI is not installed, fall back to MCP (if available) or skip evidence collection.
+
+**CLI Evidence Collection:**
+All commands use the same named session to target the correct browser:
+
+1. `playwright-cli -s=tea-review open <target_url>`
+2. `playwright-cli -s=tea-review tracing-start`
+3. Execute the flow under review (using `-s=tea-review` on each command)
+4. `playwright-cli -s=tea-review tracing-stop` → saves trace.zip
+5. `playwright-cli -s=tea-review screenshot --filename={test_artifacts}/review-evidence.png`
+6. `playwright-cli -s=tea-review network` → capture network request log
+7. `playwright-cli -s=tea-review close`
+
+> **Session Hygiene:** Always close sessions using `playwright-cli -s=tea-review close`. Do NOT use `close-all` — it kills every session on the machine and breaks parallel execution.
+
+---
+
+## 4. Save Progress
+
+**Save this step's accumulated work to `{outputFile}`.**
+
+- **If `{outputFile}` does not exist** (first save), create it using the workflow template (if available) with YAML frontmatter:
+
+  ```yaml
+  ---
+  stepsCompleted: ['step-02-discover-tests']
+  lastStep: 'step-02-discover-tests'
+  lastSaved: '{date}'
+  ---
+  ```
+
+  Then write this step's output below the frontmatter.
+
+- **If `{outputFile}` already exists**, update:
+  - Add `'step-02-discover-tests'` to `stepsCompleted` array (only if not already present)
+  - Set `lastStep: 'step-02-discover-tests'`
+  - Set `lastSaved: '{date}'`
+  - Append this step's output to the appropriate section of the document.
+
+Load next step: `{nextStepFile}`
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Step completed in full with required outputs
+
+### ❌ SYSTEM FAILURE:
+
+- Skipped sequence steps or missing outputs
+
+**Master Rule:** Skipping steps is FORBIDDEN.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03-quality-evaluation.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03-quality-evaluation.md
@@ -0,0 +1,274 @@
+---
+name: 'step-03-quality-evaluation'
+description: 'Orchestrate adaptive quality dimension checks (agent-team, subagent, or sequential)'
+nextStepFile: './step-03f-aggregate-scores.md'
+---
+
+# Step 3: Orchestrate Adaptive Quality Evaluation
+
+## STEP GOAL
+
+Select execution mode deterministically, then evaluate quality dimensions using agent-team, subagent, or sequential execution while preserving output contracts:
+
+- Determinism
+- Isolation
+- Maintainability
+- Performance
+
+Coverage is intentionally excluded from this workflow and handled by `trace`.
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read the entire step file before acting
+- ✅ Speak in `{communication_language}`
+- ✅ Resolve execution mode from config (`tea_execution_mode`, `tea_capability_probe`)
+- ✅ Apply fallback rules deterministically when requested mode is unsupported
+- ✅ Wait for required worker steps to complete
+- ❌ Do NOT skip capability checks when probing is enabled
+- ❌ Do NOT proceed until required worker steps finish
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Wait for subagent outputs
+- 📖 Load the next step only when instructed
+
+## CONTEXT BOUNDARIES:
+
+- Available context: test files from Step 2, knowledge fragments
+- Focus: orchestration only (mode selection + worker dispatch)
+- Limits: do not evaluate quality directly (delegate to worker steps)
+
+---
+
+## MANDATORY SEQUENCE
+
+### 1. Prepare Execution Context
+
+**Generate unique timestamp:**
+
+```javascript
+const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+```
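The `replace` call swaps every `:` and `.` in the ISO string for `-`, which keeps the value safe to embed in file names. A small illustration with an arbitrary fixed date (the variable names are only for this sketch):

```javascript
// Arbitrary fixed instant so the result is reproducible.
const fixed = new Date(Date.UTC(2026, 2, 16, 19, 54, 53, 123));
const iso = fixed.toISOString();
const safeTimestamp = iso.replace(/[:.]/g, '-');

console.log(iso); // 2026-03-16T19:54:53.123Z
console.log(safeTimestamp); // 2026-03-16T19-54-53-123Z
```

The value has millisecond resolution, so two runs would only collide on file names if they started within the same millisecond.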
+
+**Prepare context for all subagents:**
+
+```javascript
+const parseBooleanFlag = (value, defaultValue = true) => {
+  if (typeof value === 'string') {
+    const normalized = value.trim().toLowerCase();
+    if (['false', '0', 'off', 'no'].includes(normalized)) return false;
+    if (['true', '1', 'on', 'yes'].includes(normalized)) return true;
+  }
+  if (value === undefined || value === null) return defaultValue;
+  return Boolean(value);
+};
+
+const subagentContext = {
+  test_files: testFiles, // discovered in Step 2
+  knowledge_fragments_loaded: ['test-quality'],
+  config: {
+    execution_mode: config.tea_execution_mode || 'auto',  // "auto" | "subagent" | "agent-team" | "sequential"
+    capability_probe: parseBooleanFlag(config.tea_capability_probe, true),  // supports booleans and "false"/"true" strings
+  },
+  timestamp: timestamp
+};
+```
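How `parseBooleanFlag` treats a few representative inputs. The function body is copied from the block above so the snippet runs on its own; the sample inputs are illustrative:

```javascript
const parseBooleanFlag = (value, defaultValue = true) => {
  if (typeof value === 'string') {
    const normalized = value.trim().toLowerCase();
    if (['false', '0', 'off', 'no'].includes(normalized)) return false;
    if (['true', '1', 'on', 'yes'].includes(normalized)) return true;
  }
  if (value === undefined || value === null) return defaultValue;
  return Boolean(value);
};

console.log(parseBooleanFlag('FALSE')); // false
console.log(parseBooleanFlag(' off ')); // false
console.log(parseBooleanFlag(undefined)); // true (default applies)
console.log(parseBooleanFlag('maybe')); // true (unrecognized non-empty string is truthy)
console.log(parseBooleanFlag('')); // false (empty string is falsy)
```

Because unrecognized strings fall through to `Boolean(value)`, a value such as `'disabled'` would enable the probe. Sticking to the listed spellings in config avoids that surprise.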
+
+---
+
+### 2. Resolve Execution Mode with Capability Probe
+
+```javascript
+const normalizeUserExecutionMode = (mode) => {
+  if (typeof mode !== 'string') return null;
+  const normalized = mode.trim().toLowerCase().replace(/[-_]/g, ' ').replace(/\s+/g, ' ');
+
+  if (normalized === 'auto') return 'auto';
+  if (normalized === 'sequential') return 'sequential';
+  if (normalized === 'subagent' || normalized === 'sub agent' || normalized === 'subagents' || normalized === 'sub agents') {
+    return 'subagent';
+  }
+  if (normalized === 'agent team' || normalized === 'agent teams' || normalized === 'agentteam') {
+    return 'agent-team';
+  }
+
+  return null;
+};
+
+const normalizeConfigExecutionMode = (mode) => {
+  // Config values must already be canonical; anything else falls back to the default.
+  if (mode === 'auto' || mode === 'sequential' || mode === 'subagent' || mode === 'agent-team') {
+    return mode;
+  }
+  return null;
+};
+
+// Explicit user instruction in the active run takes priority over config.
+const explicitModeFromUser = normalizeUserExecutionMode(runtime.getExplicitExecutionModeHint?.() || null);
+
+const requestedMode = explicitModeFromUser || normalizeConfigExecutionMode(subagentContext.config.execution_mode) || 'auto';
+const probeEnabled = subagentContext.config.capability_probe;
+
+const supports = {
+  subagent: false,
+  agentTeam: false,
+};
+
+if (probeEnabled) {
+  supports.subagent = runtime.canLaunchSubagents?.() === true;
+  supports.agentTeam = runtime.canLaunchAgentTeams?.() === true;
+}
+
+let resolvedMode = requestedMode;
+
+if (requestedMode === 'auto') {
+  if (supports.agentTeam) resolvedMode = 'agent-team';
+  else if (supports.subagent) resolvedMode = 'subagent';
+  else resolvedMode = 'sequential';
+} else if (probeEnabled && requestedMode === 'agent-team' && !supports.agentTeam) {
+  resolvedMode = supports.subagent ? 'subagent' : 'sequential';
+} else if (probeEnabled && requestedMode === 'subagent' && !supports.subagent) {
+  resolvedMode = 'sequential';
+}
+
+subagentContext.execution = {
+  requestedMode,
+  resolvedMode,
+  probeEnabled,
+  supports,
+};
+```
+
+Resolution precedence:
+
+1. Explicit user request in this run (`agent team` => `agent-team`; `subagent` => `subagent`; `sequential`; `auto`)
+2. `tea_execution_mode` from config
+3. Runtime capability fallback (when probing enabled)
+
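+If probing is disabled, honor the requested mode strictly. If that mode cannot be executed at runtime, fail with an explicit error instead of falling back silently. Note that `auto` with probing disabled resolves to `sequential`, because no capability is ever detected.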
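The fallback rules can be checked in isolation by restating them as a pure function. This is a sketch only: `resolveMode` and its parameters are hypothetical names, and the logic mirrors the resolution block above rather than replacing it.

```javascript
// Sketch only: mirrors the fallback rules as a pure function.
function resolveMode(requestedMode, probeEnabled, detected) {
  // With probing disabled nothing is detected, matching the resolution block.
  const supports = probeEnabled ? detected : { subagent: false, agentTeam: false };

  if (requestedMode === 'auto') {
    if (supports.agentTeam) return 'agent-team';
    return supports.subagent ? 'subagent' : 'sequential';
  }
  if (probeEnabled && requestedMode === 'agent-team' && !supports.agentTeam) {
    return supports.subagent ? 'subagent' : 'sequential';
  }
  if (probeEnabled && requestedMode === 'subagent' && !supports.subagent) {
    return 'sequential';
  }
  return requestedMode;
}

// Illustrative runtime that can launch subagents but not agent teams.
const onlySubagents = { subagent: true, agentTeam: false };
console.log(resolveMode('auto', true, onlySubagents)); // subagent
console.log(resolveMode('agent-team', true, onlySubagents)); // subagent
console.log(resolveMode('agent-team', false, onlySubagents)); // agent-team (honored strictly)
console.log(resolveMode('auto', false, onlySubagents)); // sequential
```

The third call shows why a disabled probe needs an explicit runtime error: the requested mode is passed through even though the runtime may not support it.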
+
+---
+
+### 3. Dispatch 4 Quality Workers
+
+**Subagent A: Determinism**
+
+- File: `./step-03a-subagent-determinism.md`
+- Output: `/tmp/tea-test-review-determinism-${timestamp}.json`
+- Execution:
+  - `agent-team` or `subagent`: launch non-blocking
+  - `sequential`: run blocking and wait
+- Status: Running... ⟳
+
+**Subagent B: Isolation**
+
+- File: `./step-03b-subagent-isolation.md`
+- Output: `/tmp/tea-test-review-isolation-${timestamp}.json`
+- Status: Running... ⟳
+
+**Subagent C: Maintainability**
+
+- File: `./step-03c-subagent-maintainability.md`
+- Output: `/tmp/tea-test-review-maintainability-${timestamp}.json`
+- Status: Running... ⟳
+
+**Subagent D: Performance**
+
+- File: `./step-03e-subagent-performance.md`
+- Output: `/tmp/tea-test-review-performance-${timestamp}.json`
+- Status: Running... ⟳
+
+In `agent-team` and `subagent` modes, runtime decides worker scheduling and concurrency.
+
+---
+
+### 4. Wait for Expected Worker Completion
+
+**If `resolvedMode` is `agent-team` or `subagent`:**
+
+```
+⏳ Waiting for 4 quality subagents to complete...
+✅ All 4 quality subagents completed successfully!
+```
+
+**If `resolvedMode` is `sequential`:**
+
+```
+✅ Sequential mode: each worker already completed during dispatch.
+```
+
+---
+
+### 5. Verify All Outputs Exist
+
+```javascript
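+const fs = require('node:fs');
+
+const outputs = ['determinism', 'isolation', 'maintainability', 'performance'].map(
+  (dim) => `/tmp/tea-test-review-${dim}-${timestamp}.json`,
+);
+
+outputs.forEach((output) => {
+  if (!fs.existsSync(output)) {
+    throw new Error(`Subagent output missing: ${output}`);
+  }
+  // The exit condition also requires valid JSON; JSON.parse throws on malformed content.
+  JSON.parse(fs.readFileSync(output, 'utf8'));
+});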
+```
+
+---
+
+### 6. Execution Report
+
+```
+🚀 Performance Report:
+- Execution Mode: {resolvedMode}
+- Total Elapsed: ~mode-dependent
+- Parallel Gain: ~60-70% faster when mode is subagent/agent-team
+```
+
+---
+
+### 7. Proceed to Aggregation
+
+Pass the same `timestamp` value to Step 3F (do not regenerate it). Step 3F must read the exact temp files written in this step.
+
+Load next step: `{nextStepFile}`
+
+The aggregation step (3F) will:
+
+- Read all 4 subagent outputs
+- Calculate weighted overall score (0-100)
+- Aggregate violations by severity
+- Generate review report with top suggestions
+
+---
+
+## EXIT CONDITION
+
+Proceed to Step 3F when:
+
+- ✅ All 4 subagents completed successfully
+- ✅ All output files exist and are valid JSON
+- ✅ Execution metrics displayed
+
+**Do NOT proceed if any subagent failed.**
+
+---
+
+## 🚨 SYSTEM SUCCESS METRICS
+
+### ✅ SUCCESS:
+
+- All 4 subagents launched and completed
+- All required worker steps completed
+- Output files generated and valid
+- Fallback behavior respected configuration and capability probe rules
+
+### ❌ FAILURE:
+
+- One or more subagents failed
+- Output files missing or invalid
+- Unsupported requested mode with probing disabled
+
+**Master Rule:** Deterministic mode selection + stable output contract. Use the best supported mode, then aggregate normally.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03a-subagent-determinism.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03a-subagent-determinism.md
@@ -0,0 +1,214 @@
+---
+name: 'step-03a-subagent-determinism'
+description: 'Subagent: Check test determinism (no random/time dependencies)'
+subagent: true
+outputFile: '/tmp/tea-test-review-determinism-{{timestamp}}.json'
+---
+
+# Subagent 3A: Determinism Quality Check
+
+## SUBAGENT CONTEXT
+
+This is an **isolated subagent** running in parallel with other quality dimension checks.
+
+**What you have from parent workflow:**
+
+- Test files discovered in Step 2
+- Knowledge fragment: test-quality (determinism criteria)
+- Config: test framework
+
+**Your task:** Analyze test files for DETERMINISM violations only.
+
+---
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read this entire subagent file before acting
+- ✅ Check DETERMINISM only (not other quality dimensions)
+- ✅ Output structured JSON to temp file
+- ❌ Do NOT check isolation, maintainability, or performance (other subagents), or coverage (handled by `trace`)
+- ❌ Do NOT modify test files (read-only analysis)
+- ❌ Do NOT run tests (just analyze code)
+
+---
+
+## SUBAGENT TASK
+
+### 1. Identify Determinism Violations
+
+**Scan test files for non-deterministic patterns:**
+
+**HIGH SEVERITY Violations**:
+
+- `Math.random()` - Random number generation
+- `Date.now()` or `new Date()` without mocking
+- `setTimeout` / `setInterval` without proper waits
+- External API calls without mocking
+- File system operations on random paths
+- Database queries with non-deterministic ordering
+
+**MEDIUM SEVERITY Violations**:
+
+- `page.waitForTimeout(N)` - Hard waits instead of conditions
+- Flaky selectors (CSS classes that may change)
+- Race conditions (missing proper synchronization)
+- Test order dependencies (test A must run before test B)
+
+**LOW SEVERITY Violations**:
+
+- Missing test isolation (shared state between tests)
+- Console timestamps without fixed timezone
+
+### 2. Analyze Each Test File
+
+For each test file from Step 2:
+
+```javascript
+const violations = [];
+
+// Check for Math.random()
+if (testFileContent.includes('Math.random()')) {
+  violations.push({
+    file: testFile,
+    line: findLineNumber('Math.random()'),
+    severity: 'HIGH',
+    category: 'random-generation',
+    description: 'Test uses Math.random() - non-deterministic',
+    suggestion: 'Use faker.seed(12345) for deterministic random data',
+  });
+}
+
+// Check for Date.now()
+if (testFileContent.includes('Date.now()') || testFileContent.includes('new Date()')) {
+  violations.push({
+    file: testFile,
+    line: findLineNumber(testFileContent.includes('Date.now()') ? 'Date.now()' : 'new Date()'),
+    severity: 'HIGH',
+    category: 'time-dependency',
+    description: 'Test uses Date.now() or new Date() without mocking',
+    suggestion: 'Mock system time (jest.useFakeTimers(), vi.useFakeTimers(), or page.clock in Playwright) or use fixed timestamps',
+  });
+}
+
+// Check for hard waits
+if (testFileContent.includes('waitForTimeout')) {
+  violations.push({
+    file: testFile,
+    line: findLineNumber('waitForTimeout'),
+    severity: 'MEDIUM',
+    category: 'hard-wait',
+    description: 'Test uses waitForTimeout - creates flakiness',
+    suggestion: 'Replace with expect(locator).toBeVisible() or waitForResponse',
+  });
+}
+
+// ... check other patterns
+```
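The block above calls `findLineNumber` without defining it. One possible shape for such a lookup is sketched below; the helper name `findLineNumbers`, its signature, and the sample test source are assumptions made for illustration. It returns every matching line, so each occurrence can be reported as its own violation.

```javascript
// Hypothetical helper: returns every 1-based line number on which `pattern` occurs.
function findLineNumbers(content, pattern) {
  return content
    .split('\n')
    .map((line, index) => (line.includes(pattern) ? index + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}

// Illustrative test source, not taken from a real suite.
const sample = [
  "test('creates user', async () => {",
  '  const id = Math.random() * 1000;',
  '  await page.waitForTimeout(5000);',
  '});',
].join('\n');

console.log(findLineNumbers(sample, 'Math.random()')); // [ 2 ]
console.log(findLineNumbers(sample, 'waitForTimeout')); // [ 3 ]
console.log(findLineNumbers(sample, 'Date.now()')); // []
```

Plain substring matching also hits occurrences inside comments and string literals, so findings from a scan like this still need a quick read of the surrounding code.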
+
+### 3. Calculate Determinism Score
+
+**Scoring Logic**:
+
+```javascript
+const totalChecks = testFiles.length * checksPerFile;
+const failedChecks = violations.length;
+const passedChecks = totalChecks - failedChecks;
+
+// Weight violations by severity
+const severityWeights = { HIGH: 10, MEDIUM: 5, LOW: 2 };
+const totalPenalty = violations.reduce((sum, v) => sum + severityWeights[v.severity], 0);
+
+// Score: 100 - (penalty points)
+const score = Math.max(0, 100 - totalPenalty);
+```
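A worked example of the penalty arithmetic, using the same weights. The `scoreViolations` wrapper and the sample violation lists are illustrative only:

```javascript
const severityWeights = { HIGH: 10, MEDIUM: 5, LOW: 2 };

// Illustrative wrapper around the scoring expression.
function scoreViolations(violations) {
  const totalPenalty = violations.reduce((sum, v) => sum + severityWeights[v.severity], 0);
  return Math.max(0, 100 - totalPenalty);
}

const mixed = [{ severity: 'HIGH' }, { severity: 'MEDIUM' }, { severity: 'LOW' }];
console.log(scoreViolations(mixed)); // 83 (100 - 10 - 5 - 2)

const manyHigh = Array.from({ length: 11 }, () => ({ severity: 'HIGH' }));
console.log(scoreViolations(manyHigh)); // 0 (a penalty of 110 is clamped at zero)
```

Two properties follow from the formula: `totalChecks` and `passedChecks` are reported but do not influence the score, and a severity outside `HIGH`/`MEDIUM`/`LOW` produces `NaN`, so severities are worth validating before scoring.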
+
+---
+
+## OUTPUT FORMAT
+
+Write JSON to temp file: `/tmp/tea-test-review-determinism-{{timestamp}}.json`
+
+```json
+{
+  "dimension": "determinism",
+  "score": 83,
+  "max_score": 100,
+  "grade": "B",
+  "violations": [
+    {
+      "file": "tests/api/user.spec.ts",
+      "line": 42,
+      "severity": "HIGH",
+      "category": "random-generation",
+      "description": "Test uses Math.random() - non-deterministic",
+      "suggestion": "Use faker.seed(12345) for deterministic random data",
+      "code_snippet": "const userId = Math.random() * 1000;"
+    },
+    {
+      "file": "tests/e2e/checkout.spec.ts",
+      "line": 78,
+      "severity": "MEDIUM",
+      "category": "hard-wait",
+      "description": "Test uses waitForTimeout - creates flakiness",
+      "suggestion": "Replace with expect(locator).toBeVisible()",
+      "code_snippet": "await page.waitForTimeout(5000);"
+    }
+  ],
+  "passed_checks": 12,
+  "failed_checks": 3,
+  "total_checks": 15,
+  "violation_summary": {
+    "HIGH": 1,
+    "MEDIUM": 1,
+    "LOW": 1
+  },
+  "recommendations": [
+    "Use faker with fixed seed for all random data",
+    "Replace all waitForTimeout with conditional waits",
+    "Mock Date.now() in tests that use current time"
+  ],
+  "summary": "Tests are mostly deterministic with 3 violations (1 HIGH, 1 MEDIUM, 1 LOW)"
+}
+```
+
+**On Error:**
+
+```json
+{
+  "dimension": "determinism",
+  "success": false,
+  "error": "Error message describing what went wrong"
+}
+```
+
+---
+
+## EXIT CONDITION
+
+Subagent completes when:
+
+- ✅ All test files analyzed for determinism violations
+- ✅ Score calculated (0-100)
+- ✅ Violations categorized by severity
+- ✅ Recommendations generated
+- ✅ JSON output written to temp file
+
+**Subagent terminates here.** Parent workflow will read output and aggregate with other quality dimensions.
+
+---
+
+## 🚨 SUBAGENT SUCCESS METRICS
+
+### ✅ SUCCESS:
+
+- All test files scanned for determinism violations
+- Score calculated with proper severity weighting
+- JSON output valid and complete
+- Only determinism checked (not other dimensions)
+
+### ❌ FAILURE:
+
+- Checked quality dimensions other than determinism
+- Invalid or missing JSON output
+- Score calculation incorrect
+- Modified test files (should be read-only)
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03b-subagent-isolation.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03b-subagent-isolation.md
@@ -0,0 +1,125 @@
+---
+name: 'step-03b-subagent-isolation'
+description: 'Subagent: Check test isolation (no shared state/dependencies)'
+subagent: true
+outputFile: '/tmp/tea-test-review-isolation-{{timestamp}}.json'
+---
+
+# Subagent 3B: Isolation Quality Check
+
+## SUBAGENT CONTEXT
+
+This is an **isolated subagent** running in parallel with other quality dimension checks.
+
+**Your task:** Analyze test files for ISOLATION violations only.
+
+---
+
+## MANDATORY EXECUTION RULES
+
+- ✅ Check ISOLATION only (not other quality dimensions)
+- ✅ Output structured JSON to temp file
+- ❌ Do NOT check determinism, maintainability, coverage, or performance
+- ❌ Do NOT modify test files (read-only analysis)
+
+---
+
+## SUBAGENT TASK
+
+### 1. Identify Isolation Violations
+
+**Scan test files for isolation issues:**
+
+**HIGH SEVERITY Violations**:
+
+- Global state mutations (global variables modified)
+- Test order dependencies (test B depends on test A running first)
+- Shared database records without cleanup
+- beforeAll/afterAll with side effects leaking to other tests
+
+**MEDIUM SEVERITY Violations**:
+
+- Missing test cleanup (created data not deleted)
+- Shared fixtures that mutate state
+- Tests that assume specific execution order
+- Environment variables modified without restoration
+
+**LOW SEVERITY Violations**:
+
+- Tests sharing test data (but not mutating)
+- Missing test.describe grouping
+- Tests that could be more isolated
+
+### 2. Calculate Isolation Score
+
+```javascript
+const totalChecks = testFiles.length * checksPerFile;
+const failedChecks = violations.length;
+const severityWeights = { HIGH: 10, MEDIUM: 5, LOW: 2 };
+const totalPenalty = violations.reduce((sum, v) => sum + severityWeights[v.severity], 0);
+const score = Math.max(0, 100 - totalPenalty);
+```
+
+---
+
+## OUTPUT FORMAT
+
+```json
+{
+  "dimension": "isolation",
+  "score": 90,
+  "max_score": 100,
+  "grade": "A",
+  "violations": [
+    {
+      "file": "tests/api/integration.spec.ts",
+      "line": 15,
+      "severity": "HIGH",
+      "category": "test-order-dependency",
+      "description": "Test depends on previous test creating user record",
+      "suggestion": "Each test should create its own test data in beforeEach",
+      "code_snippet": "test('should update user', async () => { /* assumes user exists */ });"
+    }
+  ],
+  "passed_checks": 14,
+  "failed_checks": 1,
+  "total_checks": 15,
+  "violation_summary": {
+    "HIGH": 1,
+    "MEDIUM": 0,
+    "LOW": 0
+  },
+  "recommendations": [
+    "Add beforeEach hooks to create test data",
+    "Add afterEach hooks to cleanup created records",
+    "Use test.describe.configure({ mode: 'parallel' }) to enforce isolation"
+  ],
+  "summary": "Tests are well isolated with 1 HIGH severity violation"
+}
+```
+
+---
+
+## EXIT CONDITION
+
+Subagent completes when:
+
+- ✅ All test files analyzed for isolation violations
+- ✅ Score calculated
+- ✅ JSON output written to temp file
+
+**Subagent terminates here.**
+
+---
+
+## 🚨 SUBAGENT SUCCESS METRICS
+
+### ✅ SUCCESS:
+
+- Only isolation checked (not other dimensions)
+- JSON output valid and complete
+
+### ❌ FAILURE:
+
+- Checked quality dimensions other than isolation
+- Invalid or missing JSON output
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03c-subagent-maintainability.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03c-subagent-maintainability.md
@@ -0,0 +1,102 @@
+---
+name: 'step-03c-subagent-maintainability'
+description: 'Subagent: Check test maintainability (readability, structure, DRY)'
+subagent: true
+outputFile: '/tmp/tea-test-review-maintainability-{{timestamp}}.json'
+---
+
+# Subagent 3C: Maintainability Quality Check
+
+## SUBAGENT CONTEXT
+
+This is an **isolated subagent** running in parallel with other quality dimension checks.
+
+**Your task:** Analyze test files for MAINTAINABILITY violations only.
+
+---
+
+## MANDATORY EXECUTION RULES
+
+- ✅ Check MAINTAINABILITY only (not other quality dimensions)
+- ✅ Output structured JSON to temp file
+- ❌ Do NOT check determinism, isolation, coverage, or performance
+
+---
+
+## SUBAGENT TASK
+
+### 1. Identify Maintainability Violations
+
+**HIGH SEVERITY Violations**:
+
+- Tests >100 lines (too complex)
+- No test.describe grouping
+- Duplicate test logic (copy-paste)
+- Unclear test names (no Given/When/Then structure)
+- Magic numbers/strings without constants
+
+**MEDIUM SEVERITY Violations**:
+
+- Tests missing comments for complex logic
+- Inconsistent naming conventions
+- Excessive nesting (>3 levels)
+- Large setup/teardown blocks
+
+**LOW SEVERITY Violations**:
+
+- Minor code style issues
+- Could benefit from helper functions
+- Inconsistent assertion styles
+
+### 2. Calculate Maintainability Score
+
+```javascript
+const severityWeights = { HIGH: 10, MEDIUM: 5, LOW: 2 };
+const totalPenalty = violations.reduce((sum, v) => sum + severityWeights[v.severity], 0);
+const score = Math.max(0, 100 - totalPenalty);
+```
+
+---
+
+## OUTPUT FORMAT
+
+```json
+{
+  "dimension": "maintainability",
+  "score": 68,
+  "max_score": 100,
+  "grade": "D",
+  "violations": [
+    {
+      "file": "tests/e2e/complex-flow.spec.ts",
+      "line": 1,
+      "severity": "HIGH",
+      "category": "test-too-long",
+      "description": "Test file is 250 lines - too complex to maintain",
+      "suggestion": "Split into multiple smaller test files by feature area",
+      "code_snippet": "test.describe('Complex flow', () => { /* 250 lines */ });"
+    }
+  ],
+  "passed_checks": 10,
+  "failed_checks": 5,
+  "total_checks": 15,
+  "violation_summary": {
+    "HIGH": 2,
+    "MEDIUM": 2,
+    "LOW": 1
+  },
+  "recommendations": [
+    "Split large test files into smaller, focused files (<100 lines each)",
+    "Add test.describe grouping for related tests",
+    "Extract duplicate logic into helper functions"
+  ],
+  "summary": "Tests have maintainability issues - 5 violations (2 HIGH)"
+}
+```
+
+---
+
+## EXIT CONDITION
+
+Subagent completes when JSON output written to temp file.
+
+**Subagent terminates here.**
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03e-subagent-performance.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03e-subagent-performance.md
@@ -0,0 +1,117 @@
+---
+name: 'step-03e-subagent-performance'
+description: 'Subagent: Check test performance (speed, efficiency, parallelization)'
+subagent: true
+outputFile: '/tmp/tea-test-review-performance-{{timestamp}}.json'
+---
+
+# Subagent 3E: Performance Quality Check
+
+## SUBAGENT CONTEXT
+
+This is an **isolated subagent** running in parallel with other quality dimension checks.
+
+**Your task:** Analyze test files for PERFORMANCE violations only.
+
+---
+
+## MANDATORY EXECUTION RULES
+
+- ✅ Check PERFORMANCE only (not other quality dimensions)
+- ✅ Output structured JSON to temp file
+- ❌ Do NOT check determinism, isolation, maintainability, or coverage
+
+---
+
+## SUBAGENT TASK
+
+### 1. Identify Performance Violations
+
+**HIGH SEVERITY Violations**:
+
+- Tests not parallelizable (using test.describe.serial unnecessarily)
+- Slow setup/teardown (creating fresh DB for every test)
+- Excessive navigation (reloading pages unnecessarily)
+- No fixture reuse (repeating expensive operations)
+
+**MEDIUM SEVERITY Violations**:
+
+- Hard waits >2 seconds (waitForTimeout(5000))
+- Inefficient selectors (page.$$ instead of locators)
+- Large data sets in tests without pagination
+- Missing performance optimizations
+
+**LOW SEVERITY Violations**:
+
+- Could use parallelization (test.describe.configure({ mode: 'parallel' }))
+- Minor inefficiencies
+- Excessive logging
+
+### 2. Calculate Performance Score
+
+```javascript
+const severityWeights = { HIGH: 10, MEDIUM: 5, LOW: 2 };
+const totalPenalty = violations.reduce((sum, v) => sum + severityWeights[v.severity], 0);
+const score = Math.max(0, 100 - totalPenalty);
+```
+
+---
+
+## OUTPUT FORMAT
+
+```json
+{
+  "dimension": "performance",
+  "score": 85,
+  "max_score": 100,
+  "grade": "B",
+  "violations": [
+    {
+      "file": "tests/e2e/search.spec.ts",
+      "line": 10,
+      "severity": "HIGH",
+      "category": "not-parallelizable",
+      "description": "Tests use test.describe.serial unnecessarily - reduces parallel execution",
+      "suggestion": "Remove .serial unless tests truly share state",
+      "code_snippet": "test.describe.serial('Search tests', () => { ... });"
+    },
+    {
+      "file": "tests/api/bulk-operations.spec.ts",
+      "line": 35,
+      "severity": "MEDIUM",
+      "category": "slow-setup",
+      "description": "Test creates 1000 records in setup - very slow",
+      "suggestion": "Use smaller data sets or fixture factories",
+      "code_snippet": "beforeEach(async () => { for (let i=0; i<1000; i++) { ... } });"
+    }
+  ],
+  "passed_checks": 13,
+  "failed_checks": 2,
+  "total_checks": 15,
+  "violation_summary": {
+    "HIGH": 1,
+    "MEDIUM": 1,
+    "LOW": 0
+  },
+  "performance_metrics": {
+    "parallelizable_tests": 80,
+    "serial_tests": 20,
+    "avg_test_duration_estimate": "~2 seconds",
+    "slow_tests": ["bulk-operations.spec.ts (>30s)"]
+  },
+  "recommendations": [
+    "Enable parallel mode where possible",
+    "Reduce setup data to minimum needed",
+    "Use fixtures to share expensive setup across tests",
+    "Remove unnecessary .serial constraints"
+  ],
+  "summary": "Good performance with 2 violations - 80% tests can run in parallel"
+}
+```
+
+---
+
+## EXIT CONDITION
+
+Subagent completes when JSON output written to temp file.
+
+**Subagent terminates here.**
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03f-aggregate-scores.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-03f-aggregate-scores.md
@@ -0,0 +1,277 @@
+---
+name: 'step-03f-aggregate-scores'
+description: 'Aggregate quality dimension scores into overall 0-100 score'
+nextStepFile: './step-04-generate-report.md'
+outputFile: '{test_artifacts}/test-review.md'
+---
+
+# Step 3F: Aggregate Quality Scores
+
+## STEP GOAL
+
+Read outputs from 4 quality subagents, calculate weighted overall score (0-100), and aggregate violations for report generation.
+
+---
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read the entire step file before acting
+- ✅ Speak in `{communication_language}`
+- ✅ Read all 4 subagent outputs
+- ✅ Calculate weighted overall score
+- ✅ Aggregate violations by severity
+- ❌ Do NOT re-evaluate quality (use subagent outputs)
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Record outputs before proceeding
+- 📖 Load the next step only when instructed
+
+---
+
+## MANDATORY SEQUENCE
+
+### 1. Read All Subagent Outputs
+
+```javascript
+// Use the SAME timestamp generated in Step 3 (do not regenerate).
+const timestamp = subagentContext?.timestamp;
+if (!timestamp) {
+  throw new Error('Missing timestamp from Step 3 context. Pass Step 3 timestamp into Step 3F.');
+}
+const dimensions = ['determinism', 'isolation', 'maintainability', 'performance'];
+const results = {};
+
+dimensions.forEach((dim) => {
+  const outputPath = `/tmp/tea-test-review-${dim}-${timestamp}.json`;
+  results[dim] = JSON.parse(fs.readFileSync(outputPath, 'utf8'));
+});
+```
+
+**Verify all succeeded:**
+
+```javascript
+const allSucceeded = dimensions.every((dim) => results[dim].score !== undefined);
+if (!allSucceeded) {
+  throw new Error('One or more quality subagents failed!');
+}
+```
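When the check above fails, it helps to say which dimension failed and why. A minimal sketch, assuming workers report failures in the `{ dimension, success: false, error }` shape documented in the determinism worker step; the `findFailedDimensions` helper and the sample data are hypothetical:

```javascript
// Hypothetical helper: lists failed dimensions with the worker's error message.
function findFailedDimensions(results) {
  return Object.entries(results)
    .filter(([, result]) => result.success === false || typeof result.score !== 'number')
    .map(([dim, result]) => `${dim}: ${result.error ?? 'no score in output'}`);
}

// Illustrative parsed outputs: one success, one failure.
const sampleResults = {
  determinism: { dimension: 'determinism', score: 83, violations: [] },
  isolation: { dimension: 'isolation', success: false, error: 'No test files readable' },
};

const failed = findFailedDimensions(sampleResults);
console.log(failed); // one entry: 'isolation: No test files readable'
```

The returned list could be joined into the thrown error message so the user sees the failing worker without opening the temp files.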
+
+---
+
+### 2. Calculate Weighted Overall Score
+
+**Dimension Weights** (based on TEA quality priorities):
+
+```javascript
+const weights = {
+  determinism: 0.3, // 30% - Reliability and flake prevention
+  isolation: 0.3, // 30% - Parallel safety and independence
+  maintainability: 0.25, // 25% - Readability and long-term health
+  performance: 0.15, // 15% - Speed and execution efficiency
+};
+```
+
+**Calculate overall score:**
+
+```javascript
+const overallScore = dimensions.reduce((sum, dim) => {
+  return sum + results[dim].score * weights[dim];
+}, 0);
+
+const roundedScore = Math.round(overallScore);
+```
+
+**Determine grade:**
+
+```javascript
+const getGrade = (score) => {
+  if (score >= 90) return 'A';
+  if (score >= 80) return 'B';
+  if (score >= 70) return 'C';
+  if (score >= 60) return 'D';
+  return 'F';
+};
+
+const overallGrade = getGrade(roundedScore);
+```
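A worked example of the weighting and grading, using hypothetical dimension scores chosen only for illustration. The weights and grade thresholds are the ones defined above:

```javascript
const weights = { determinism: 0.3, isolation: 0.3, maintainability: 0.25, performance: 0.15 };

// Hypothetical dimension scores, not taken from a real review.
const scores = { determinism: 83, isolation: 90, maintainability: 68, performance: 85 };

const overall = Object.keys(weights).reduce((sum, dim) => sum + scores[dim] * weights[dim], 0);
const rounded = Math.round(overall);

const getGrade = (score) => {
  if (score >= 90) return 'A';
  if (score >= 80) return 'B';
  if (score >= 70) return 'C';
  if (score >= 60) return 'D';
  return 'F';
};

// 83*0.3 + 90*0.3 + 68*0.25 + 85*0.15 = 24.9 + 27 + 17 + 12.75 = 81.65
console.log(rounded, getGrade(rounded)); // 82 B
```

Because the weights sum to 1, the overall score stays on the same 0-100 scale as the dimension scores. In this example the low maintainability score (a D on its own) still leaves the overall grade at B.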
+
+---
+
+### 3. Aggregate Violations by Severity
+
+**Collect all violations from all dimensions:**
+
+```javascript
+const allViolations = dimensions.flatMap((dim) =>
+  results[dim].violations.map((v) => ({
+    ...v,
+    dimension: dim,
+  })),
+);
+
+// Group by severity
+const highSeverity = allViolations.filter((v) => v.severity === 'HIGH');
+const mediumSeverity = allViolations.filter((v) => v.severity === 'MEDIUM');
+const lowSeverity = allViolations.filter((v) => v.severity === 'LOW');
+
+const violationSummary = {
+  total: allViolations.length,
+  HIGH: highSeverity.length,
+  MEDIUM: mediumSeverity.length,
+  LOW: lowSeverity.length,
+};
+```
+
+---
+
+### 4. Prioritize Recommendations
+
+**Extract recommendations from all dimensions:**
+
+```javascript
+const allRecommendations = dimensions.flatMap((dim) =>
+  results[dim].recommendations.map((rec) => ({
+    dimension: dim,
+    recommendation: rec,
+    impact: results[dim].score < 70 ? 'HIGH' : 'MEDIUM',
+  })),
+);
+
+// Sort by impact (HIGH first)
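+// Copy before sorting so allRecommendations is not mutated; the comparator compares both sides.
+const prioritizedRecommendations = [...allRecommendations]
+  .sort((a, b) => Number(b.impact === 'HIGH') - Number(a.impact === 'HIGH'))
+  .slice(0, 10); // Top 10 recommendations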
+```
+
+---
+
+### 5. Create Review Summary Object
+
+**Aggregate all results:**
+
+```javascript
+const reviewSummary = {
+  overall_score: roundedScore,
+  overall_grade: overallGrade,
+  quality_assessment: getQualityAssessment(roundedScore), // helper is not defined in this step; assumed to map the score to a short label
+
+  dimension_scores: {
+    determinism: results.determinism.score,
+    isolation: results.isolation.score,
+    maintainability: results.maintainability.score,
+    performance: results.performance.score,
+  },
+
+  dimension_grades: {
+    determinism: results.determinism.grade,
+    isolation: results.isolation.grade,
+    maintainability: results.maintainability.grade,
+    performance: results.performance.grade,
+  },
+
+  violations_summary: violationSummary,
+
+  all_violations: allViolations,
+
+  high_severity_violations: highSeverity,
+
+  top_10_recommendations: prioritizedRecommendations,
+
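+  // Report the mode Step 3 actually resolved rather than assuming parallel execution.
+  subagent_execution: `${subagentContext.execution?.resolvedMode ?? 'unknown'} (4 quality dimensions)`,
+  performance_gain: subagentContext.execution?.resolvedMode === 'sequential' ? 'n/a (sequential run)' : '~60% faster than sequential (estimate)',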
+};
+
+// Save for Step 4 (report generation)
+fs.writeFileSync(`/tmp/tea-test-review-summary-${timestamp}.json`, JSON.stringify(reviewSummary, null, 2), 'utf8');
+```
+
+---
+
+### 6. Display Summary to User
+
+```
+✅ Quality Evaluation Complete (Mode: {resolvedMode})
+
+📊 Overall Quality Score: {roundedScore}/100 (Grade: {overallGrade})
+
+📈 Dimension Scores:
+- Determinism:      {determinism_score}/100 ({determinism_grade})
+- Isolation:        {isolation_score}/100 ({isolation_grade})
+- Maintainability:  {maintainability_score}/100 ({maintainability_grade})
+- Performance:      {performance_score}/100 ({performance_grade})
+
+ℹ️ Coverage is excluded from `test-review` scoring. Use `trace` for coverage analysis and gates.
+
+⚠️ Violations Found:
+- HIGH:   {high_count} violations
+- MEDIUM: {medium_count} violations
+- LOW:    {low_count} violations
+- TOTAL:  {total_count} violations
+
+🚀 Performance: parallel modes (subagent/agent-team) run ~60% faster than sequential (estimate)
+
+✅ Ready for report generation (Step 4)
+```
+
+---
+
+### 7. Save Progress
+
+**Save this step's accumulated work to `{outputFile}`.**
+
+- **If `{outputFile}` does not exist** (first save), create it using the workflow template (if available) with YAML frontmatter:
+
+  ```yaml
+  ---
+  stepsCompleted: ['step-03f-aggregate-scores']
+  lastStep: 'step-03f-aggregate-scores'
+  lastSaved: '{date}'
+  ---
+  ```
+
+  Then write this step's output below the frontmatter.
+
+- **If `{outputFile}` already exists**, update:
+  - Add `'step-03f-aggregate-scores'` to `stepsCompleted` array (only if not already present)
+  - Set `lastStep: 'step-03f-aggregate-scores'`
+  - Set `lastSaved: '{date}'`
+  - Append this step's output to the appropriate section of the document.
+
+---
+
+## EXIT CONDITION
+
+Proceed to Step 4 when:
+
+- ✅ All subagent outputs read successfully
+- ✅ Overall score calculated
+- ✅ Violations aggregated
+- ✅ Recommendations prioritized
+- ✅ Summary saved to temp file
+- ✅ Output displayed to user
+- ✅ Progress saved to output document
+
+Load next step: `{nextStepFile}`
+
+---
+
+## 🚨 SYSTEM SUCCESS METRICS
+
+### ✅ SUCCESS:
+
+- All 4 subagent outputs read and parsed
+- Overall score calculated with proper weights
+- Violations aggregated correctly
+- Summary complete and saved
+
+### ❌ FAILURE:
+
+- Failed to read one or more subagent outputs
+- Score calculation incorrect
+- Summary missing or incomplete
+
+**Master Rule:** Aggregate determinism, isolation, maintainability, and performance only.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-04-generate-report.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-c/step-04-generate-report.md
@@ -0,0 +1,111 @@
+---
+name: 'step-04-generate-report'
+description: 'Create test-review report and validate'
+outputFile: '{test_artifacts}/test-review.md'
+---
+
+# Step 4: Generate Report & Validate
+
+## STEP GOAL
+
+Produce the test-review report and validate against checklist.
+
+## MANDATORY EXECUTION RULES
+
+- 📖 Read the entire step file before acting
+- ✅ Speak in `{communication_language}`
+
+---
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Record outputs before proceeding
+- 📖 Load the next step only when instructed
+
+## CONTEXT BOUNDARIES:
+
+- Available context: config, loaded artifacts, and knowledge fragments
+- Focus: this step's goal only
+- Limits: do not execute future steps
+- Dependencies: prior steps' outputs (if any)
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly. Do not skip, reorder, or improvise.
+
+## 1. Report Generation
+
+Use `test-review-template.md` to produce `{outputFile}` including:
+
+- Score summary
+- Critical findings with fixes
+- Warnings and recommendations
+- Context references (story/test-design if available)
+- Coverage boundary note: `test-review` does not score coverage. Direct coverage findings to `trace`.
+
+---
+
+## 2. Polish Output
+
+Before finalizing, review the complete output document for quality:
+
+1. **Remove duplication**: Progressive-append workflow may have created repeated sections — consolidate
+2. **Verify consistency**: Ensure terminology, quality scores, and references are consistent throughout
+3. **Check completeness**: All template sections should be populated or explicitly marked N/A
+4. **Format cleanup**: Ensure markdown formatting is clean (tables aligned, headers consistent, no orphaned references)
+
+---
+
+## 3. Validation
+
+Validate against `checklist.md` and fix any gaps.
+
+- [ ] CLI sessions cleaned up (no orphaned browsers)
+- [ ] Temp artifacts stored in `{test_artifacts}/`, not in arbitrary locations
+
+---
+
+## 4. Save Progress
+
+**Save this step's accumulated work to `{outputFile}`.**
+
+- **If `{outputFile}` does not exist** (first save), create it using the workflow template (if available) with YAML frontmatter:
+
+  ```yaml
+  ---
+  stepsCompleted: ['step-04-generate-report']
+  lastStep: 'step-04-generate-report'
+  lastSaved: '{date}'
+  ---
+  ```
+
+  Then write this step's output below the frontmatter.
+
+- **If `{outputFile}` already exists**, update:
+  - Add `'step-04-generate-report'` to `stepsCompleted` array (only if not already present)
+  - Set `lastStep: 'step-04-generate-report'`
+  - Set `lastSaved: '{date}'`
+  - Append this step's output to the appropriate section of the document.
+
+---
+
+## 5. Completion Summary
+
+Report:
+
+- Scope reviewed
+- Overall score
+- Critical blockers
+- Next recommended workflow (e.g., `automate` or `trace`)
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Step completed in full with required outputs
+
+### ❌ SYSTEM FAILURE:
+
+- Skipped sequence steps or missing outputs
+
+**Master Rule:** Skipping steps is FORBIDDEN.
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-e/step-01-assess.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-e/step-01-assess.md
@@ -0,0 +1,65 @@
+---
+name: 'step-01-assess'
+description: 'Load an existing output for editing'
+nextStepFile: './step-02-apply-edit.md'
+---
+
+# Step 1: Assess Edit Target
+
+## STEP GOAL:
+
+Identify which output should be edited and load it.
+
+## MANDATORY EXECUTION RULES (READ FIRST):
+
+### Universal Rules:
+
+- 📖 Read the complete step file before taking any action
+- ✅ Speak in `{communication_language}`
+
+### Role Reinforcement:
+
+- ✅ You are the Master Test Architect
+
+### Step-Specific Rules:
+
+- 🎯 Ask the user which output file to edit
+- 🚫 Do not edit until target is confirmed
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+
+## CONTEXT BOUNDARIES:
+
+- Available context: existing outputs
+- Focus: select edit target
+- Limits: no edits yet
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly.
+
+### 1. Identify Target
+
+Ask the user to provide the output file path or select from known outputs.
+
+### 2. Load Target
+
+Read the provided output file in full.
+
+### 3. Confirm
+
+Confirm the target and proceed to edit.
+
+Load next step: `{nextStepFile}`
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Target identified and loaded
+
+### ❌ SYSTEM FAILURE:
+
+- Proceeding without a confirmed target
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-e/step-02-apply-edit.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-e/step-02-apply-edit.md
@@ -0,0 +1,60 @@
+---
+name: 'step-02-apply-edit'
+description: 'Apply edits to the selected output'
+---
+
+# Step 2: Apply Edits
+
+## STEP GOAL:
+
+Apply the requested edits to the selected output and confirm changes.
+
+## MANDATORY EXECUTION RULES (READ FIRST):
+
+### Universal Rules:
+
+- 📖 Read the complete step file before taking any action
+- ✅ Speak in `{communication_language}`
+
+### Role Reinforcement:
+
+- ✅ You are the Master Test Architect
+
+### Step-Specific Rules:
+
+- 🎯 Only apply edits explicitly requested by the user
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+
+## CONTEXT BOUNDARIES:
+
+- Available context: selected output and user changes
+- Focus: apply edits only
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly.
+
+### 1. Confirm Requested Changes
+
+Restate what will be changed and confirm.
+
+### 2. Apply Changes
+
+Update the output file accordingly.
+
+### 3. Report
+
+Summarize the edits applied.
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Changes applied and confirmed
+
+### ❌ SYSTEM FAILURE:
+
+- Unconfirmed edits or missing update
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-v/step-01-validate.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/steps-v/step-01-validate.md
@@ -0,0 +1,67 @@
+---
+name: 'step-01-validate'
+description: 'Validate workflow outputs against checklist'
+outputFile: '{test_artifacts}/test-review-validation-report.md'
+validationChecklist: '../checklist.md'
+---
+
+# Step 1: Validate Outputs
+
+## STEP GOAL:
+
+Validate outputs using the workflow checklist and record findings.
+
+## MANDATORY EXECUTION RULES (READ FIRST):
+
+### Universal Rules:
+
+- 📖 Read the complete step file before taking any action
+- ✅ Speak in `{communication_language}`
+
+### Role Reinforcement:
+
+- ✅ You are the Master Test Architect
+
+### Step-Specific Rules:
+
+- 🎯 Validate against `{validationChecklist}`
+- 🚫 Do not skip checks
+
+## EXECUTION PROTOCOLS:
+
+- 🎯 Follow the MANDATORY SEQUENCE exactly
+- 💾 Write findings to `{outputFile}`
+
+## CONTEXT BOUNDARIES:
+
+- Available context: workflow outputs and checklist
+- Focus: validation only
+- Limits: do not modify outputs in this step
+
+## MANDATORY SEQUENCE
+
+**CRITICAL:** Follow this sequence exactly.
+
+### 1. Load Checklist
+
+Read `{validationChecklist}` and list all criteria.
+
+### 2. Validate Outputs
+
+Evaluate outputs against each checklist item.
+
+### 3. Write Report
+
+Write a validation report to `{outputFile}` with PASS/WARN/FAIL per section.
+
+## 🚨 SYSTEM SUCCESS/FAILURE METRICS:
+
+### ✅ SUCCESS:
+
+- Validation report written
+- All checklist items evaluated
+
+### ❌ SYSTEM FAILURE:
+
+- Skipped checklist items
+- No report produced
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/test-review-template.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/test-review-template.md
@@ -0,0 +1,387 @@
+---
+stepsCompleted: []
+lastStep: ''
+lastSaved: ''
+workflowType: 'testarch-test-review'
+inputDocuments: []
+---
+
+# Test Quality Review: {test_filename}
+
+**Quality Score**: {score}/100 ({grade} - {assessment})
+**Review Date**: {YYYY-MM-DD}
+**Review Scope**: {single | directory | suite}
+**Reviewer**: {user_name or TEA Agent}
+
+---
+
+Note: This review audits existing tests; it does not generate tests.
+Coverage mapping and coverage gates are out of scope here. Use `trace` for coverage decisions.
+
+## Executive Summary
+
+**Overall Assessment**: {Excellent | Good | Acceptable | Needs Improvement | Critical Issues}
+
+**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
+
+### Key Strengths
+
+✅ {strength_1}
+✅ {strength_2}
+✅ {strength_3}
+
+### Key Weaknesses
+
+❌ {weakness_1}
+❌ {weakness_2}
+❌ {weakness_3}
+
+### Summary
+
+{1-2 paragraph summary of overall test quality, highlighting major findings and recommendation rationale}
+
+---
+
+## Quality Criteria Assessment
+
+| Criterion                            | Status                          | Violations | Notes        |
+| ------------------------------------ | ------------------------------- | ---------- | ------------ |
+| BDD Format (Given-When-Then)         | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Test IDs                             | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Priority Markers (P0/P1/P2/P3)       | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Hard Waits (sleep, waitForTimeout)   | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Determinism (no conditionals)        | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Isolation (cleanup, no shared state) | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Fixture Patterns                     | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Data Factories                       | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Network-First Pattern                | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Explicit Assertions                  | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+| Test Length (≤300 lines)             | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {lines}    | {brief_note} |
+| Test Duration (≤1.5 min)             | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {duration} | {brief_note} |
+| Flakiness Patterns                   | {✅ PASS \| ⚠️ WARN \| ❌ FAIL} | {count}    | {brief_note} |
+
+**Total Violations**: {critical_count} Critical, {high_count} High, {medium_count} Medium, {low_count} Low
+
+---
+
+## Quality Score Breakdown
+
+```
+Starting Score:           100
+Critical Violations:      -{critical_count} × 10 = -{critical_deduction}
+High Violations:          -{high_count} × 5 = -{high_deduction}
+Medium Violations:        -{medium_count} × 2 = -{medium_deduction}
+Low Violations:           -{low_count} × 1 = -{low_deduction}
+
+Bonus Points:
+  Excellent BDD:          +{0|5}
+  Comprehensive Fixtures: +{0|5}
+  Data Factories:         +{0|5}
+  Network-First:          +{0|5}
+  Perfect Isolation:      +{0|5}
+  All Test IDs:           +{0|5}
+                          --------
+Total Bonus:              +{bonus_total}
+
+Final Score:              {final_score}/100
+Grade:                    {grade}
+```
+
+---
+
+## Critical Issues (Must Fix)
+
+{If no critical issues: "No critical issues detected. ✅"}
+
+{For each critical issue:}
+
+### {issue_number}. {Issue Title}
+
+**Severity**: P0 (Critical)
+**Location**: `{filename}:{line_number}`
+**Criterion**: {criterion_name}
+**Knowledge Base**: [{fragment_name}]({fragment_path})
+
+**Issue Description**:
+{Detailed explanation of what the problem is and why it's critical}
+
+**Current Code**:
+
+```typescript
+// ❌ Bad (current implementation)
+{
+  code_snippet_showing_problem;
+}
+```
+
+**Recommended Fix**:
+
+```typescript
+// ✅ Good (recommended approach)
+{
+  code_snippet_showing_solution;
+}
+```
+
+**Why This Matters**:
+{Explanation of impact - flakiness risk, maintainability, reliability}
+
+**Related Violations**:
+{If similar issue appears elsewhere, note line numbers}
+
+---
+
+## Recommendations (Should Fix)
+
+{If no recommendations: "No additional recommendations. Test quality is excellent. ✅"}
+
+{For each recommendation:}
+
+### {rec_number}. {Recommendation Title}
+
+**Severity**: {P1 (High) | P2 (Medium) | P3 (Low)}
+**Location**: `{filename}:{line_number}`
+**Criterion**: {criterion_name}
+**Knowledge Base**: [{fragment_name}]({fragment_path})
+
+**Issue Description**:
+{Detailed explanation of what could be improved and why}
+
+**Current Code**:
+
+```typescript
+// ⚠️ Could be improved (current implementation)
+{
+  code_snippet_showing_current_approach;
+}
+```
+
+**Recommended Improvement**:
+
+```typescript
+// ✅ Better approach (recommended)
+{
+  code_snippet_showing_improvement;
+}
+```
+
+**Benefits**:
+{Explanation of benefits - maintainability, readability, reusability}
+
+**Priority**:
+{Why this is P1/P2/P3 - urgency and impact}
+
+---
+
+## Best Practices Found
+
+{If good patterns found, highlight them}
+
+{For each best practice:}
+
+### {practice_number}. {Best Practice Title}
+
+**Location**: `{filename}:{line_number}`
+**Pattern**: {pattern_name}
+**Knowledge Base**: [{fragment_name}]({fragment_path})
+
+**Why This Is Good**:
+{Explanation of why this pattern is excellent}
+
+**Code Example**:
+
+```typescript
+// ✅ Excellent pattern demonstrated in this test
+{
+  code_snippet_showing_best_practice;
+}
+```
+
+**Use as Reference**:
+{Encourage using this pattern in other tests}
+
+---
+
+## Test File Analysis
+
+### File Metadata
+
+- **File Path**: `{relative_path_from_project_root}`
+- **File Size**: {line_count} lines, {kb_size} KB
+- **Test Framework**: {Playwright | Jest | Cypress | Vitest | Other}
+- **Language**: {TypeScript | JavaScript}
+
+### Test Structure
+
+- **Describe Blocks**: {describe_count}
+- **Test Cases (it/test)**: {test_count}
+- **Average Test Length**: {avg_lines_per_test} lines per test
+- **Fixtures Used**: {fixture_count} ({fixture_names})
+- **Data Factories Used**: {factory_count} ({factory_names})
+
+### Test Scope
+
+- **Test IDs**: {test_id_list}
+- **Priority Distribution**:
+  - P0 (Critical): {p0_count} tests
+  - P1 (High): {p1_count} tests
+  - P2 (Medium): {p2_count} tests
+  - P3 (Low): {p3_count} tests
+  - Unknown: {unknown_count} tests
+
+### Assertions Analysis
+
+- **Total Assertions**: {assertion_count}
+- **Assertions per Test**: {avg_assertions_per_test} (avg)
+- **Assertion Types**: {assertion_types_used}
+
+---
+
+## Context and Integration
+
+### Related Artifacts
+
+{If story file found:}
+
+- **Story File**: [{story_filename}]({story_path})
+
+{If test-design found:}
+
+- **Test Design**: [{test_design_filename}]({test_design_path})
+- **Risk Assessment**: {risk_level}
+- **Priority Framework**: P0-P3 applied
+
+---
+
+## Knowledge Base References
+
+This review consulted the following knowledge base fragments:
+
+- **[test-quality.md](../../../testarch/knowledge/test-quality.md)** - Definition of Done for tests (no hard waits, ≤300 lines, ≤1.5 min, self-cleaning)
+- **[fixture-architecture.md](../../../testarch/knowledge/fixture-architecture.md)** - Pure function → Fixture → mergeTests pattern
+- **[network-first.md](../../../testarch/knowledge/network-first.md)** - Route intercept before navigate (race condition prevention)
+- **[data-factories.md](../../../testarch/knowledge/data-factories.md)** - Factory functions with overrides, API-first setup
+- **[test-levels-framework.md](../../../testarch/knowledge/test-levels-framework.md)** - E2E vs API vs Component vs Unit appropriateness
+- **[tdd-cycles.md](../../../testarch/knowledge/tdd-cycles.md)** - Red-Green-Refactor patterns
+- **[selective-testing.md](../../../testarch/knowledge/selective-testing.md)** - Duplicate coverage detection
+- **[ci-burn-in.md](../../../testarch/knowledge/ci-burn-in.md)** - Flakiness detection patterns (10-iteration loop)
+- **[test-priorities.md](../../../testarch/knowledge/test-priorities.md)** - P0/P1/P2/P3 classification framework
+
+For coverage mapping, consult `trace` workflow outputs.
+
+See [tea-index.csv](../../../testarch/tea-index.csv) for complete knowledge base.
+
+---
+
+## Next Steps
+
+### Immediate Actions (Before Merge)
+
+1. **{action_1}** - {description}
+   - Priority: {P0 | P1 | P2}
+   - Owner: {team_or_person}
+   - Estimated Effort: {time_estimate}
+
+2. **{action_2}** - {description}
+   - Priority: {P0 | P1 | P2}
+   - Owner: {team_or_person}
+   - Estimated Effort: {time_estimate}
+
+### Follow-up Actions (Future PRs)
+
+1. **{action_1}** - {description}
+   - Priority: {P2 | P3}
+   - Target: {next_milestone | backlog}
+
+2. **{action_2}** - {description}
+   - Priority: {P2 | P3}
+   - Target: {next_milestone | backlog}
+
+### Re-Review Needed?
+
+{✅ No re-review needed - approve as-is}
+{⚠️ Re-review after critical fixes - request changes, then re-review}
+{❌ Major refactor required - block merge, pair programming recommended}
+
+---
+
+## Decision
+
+**Recommendation**: {Approve | Approve with Comments | Request Changes | Block}
+
+**Rationale**:
+{1-2 paragraph explanation of recommendation based on findings}
+
+**For Approve**:
+
+> Test quality is excellent/good, with a score of {score}/100. {Minor issues noted can be addressed in follow-up PRs.} Tests are production-ready and follow best practices.
+
+**For Approve with Comments**:
+
+> Test quality is acceptable, with a score of {score}/100. {High-priority recommendations should be addressed but don't block merge.} Critical issues are resolved, but further improvements would enhance maintainability.
+
+**For Request Changes**:
+
+> Test quality needs improvement, with a score of {score}/100. {Critical issues must be fixed before merge.} {X} critical violations were detected that pose flakiness/maintainability risks.
+
+**For Block**:
+
+> Test quality is insufficient, with a score of {score}/100. {Multiple critical issues make tests unsuitable for production.} Recommend a pairing session with a QA engineer to apply patterns from the knowledge base.
+
+---
+
+## Appendix
+
+### Violation Summary by Location
+
+{Table of all violations sorted by line number:}
+
+| Line   | Severity      | Criterion   | Issue         | Fix         |
+| ------ | ------------- | ----------- | ------------- | ----------- |
+| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
+| {line} | {P0/P1/P2/P3} | {criterion} | {brief_issue} | {brief_fix} |
+
+### Quality Trends
+
+{If reviewing same file multiple times, show trend:}
+
+| Review Date  | Score         | Grade     | Critical Issues | Trend       |
+| ------------ | ------------- | --------- | --------------- | ----------- |
+| {YYYY-MM-DD} | {score_1}/100 | {grade_1} | {count_1}       | ⬆️ Improved |
+| {YYYY-MM-DD} | {score_2}/100 | {grade_2} | {count_2}       | ⬇️ Declined |
+| {YYYY-MM-DD} | {score_3}/100 | {grade_3} | {count_3}       | ➡️ Stable   |
+
+### Related Reviews
+
+{If reviewing multiple files in directory/suite:}
+
+| File     | Score       | Grade   | Critical | Status             |
+| -------- | ----------- | ------- | -------- | ------------------ |
+| {file_1} | {score}/100 | {grade} | {count}  | {Approved/Blocked} |
+| {file_2} | {score}/100 | {grade} | {count}  | {Approved/Blocked} |
+| {file_3} | {score}/100 | {grade} | {count}  | {Approved/Blocked} |
+
+**Suite Average**: {avg_score}/100 ({avg_grade})
+
+---
+
+## Review Metadata
+
+**Generated By**: BMad TEA Agent (Test Architect)
+**Workflow**: testarch-test-review v4.0
+**Review ID**: test-review-{filename}-{YYYYMMDD}
+**Timestamp**: {YYYY-MM-DD HH:MM:SS}
+**Version**: 1.0
+
+---
+
+## Feedback on This Review
+
+If you have questions or feedback on this review:
+
+1. Review patterns in knowledge base: `testarch/knowledge/`
+2. Consult tea-index.csv for detailed guidance
+3. Request clarification on specific violations
+4. Pair with QA engineer to apply patterns
+
+This review is guidance, not rigid rules. Context matters - if a pattern is justified, document it with a comment.
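
The "Quality Score Breakdown" block in `test-review-template.md` is plain arithmetic: start at 100, subtract 10, 5, 2, and 1 points per critical, high, medium, and low violation, then add 5 points for each bonus that applies. The following is a minimal TypeScript sketch of that calculation. The type and function names are hypothetical, and clamping the result to the 0-100 range is an assumption: the template reports the score "/100" but does not say how out-of-range totals are handled. Grade thresholds are not defined in the template, so no grade is derived here.

```typescript
// Hypothetical types; the field names are illustrative, not part of the template.
interface ViolationCounts {
  critical: number;
  high: number;
  medium: number;
  low: number;
}

// One flag per bonus line in the template's breakdown.
interface BonusFlags {
  excellentBdd: boolean;
  comprehensiveFixtures: boolean;
  dataFactories: boolean;
  networkFirst: boolean;
  perfectIsolation: boolean;
  allTestIds: boolean;
}

const BONUS_POINTS = 5; // each bonus is +0 or +5 in the template

function computeQualityScore(v: ViolationCounts, b: BonusFlags): number {
  // Deduction weights taken from the template: 10 / 5 / 2 / 1.
  const deductions = v.critical * 10 + v.high * 5 + v.medium * 2 + v.low * 1;
  const flags = [
    b.excellentBdd,
    b.comprehensiveFixtures,
    b.dataFactories,
    b.networkFirst,
    b.perfectIsolation,
    b.allTestIds,
  ];
  const bonus = flags.filter((f) => f).length * BONUS_POINTS;
  const raw = 100 - deductions + bonus;
  // Assumption: clamp to 0..100, since the template does not specify this.
  return Math.max(0, Math.min(100, raw));
}
```

For example, one critical, two high, and three low violations with no bonuses give 100 - 10 - 10 - 3 = 77.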
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/validation-report-20260127-095021.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/validation-report-20260127-095021.md
@@ -0,0 +1,72 @@
+---
+validationDate: 2026-01-27
+workflowName: testarch-test-review
+workflowPath: '{project-root}/src/workflows/testarch/bmad-testarch-test-review'
+validationStatus: COMPLETE
+completionDate: 2026-01-27 10:03:10
+---
+
+# Validation Report: testarch-test-review
+
+**Validation Started:** 2026-01-27 09:50:21
+**Validator:** BMAD Workflow Validation System (Codex)
+**Standards Version:** BMAD Workflow Standards
+
+## File Structure & Size
+
+- workflow.md present: YES
+- instructions.md present: YES
+- workflow.yaml present: YES
+- workflow-plan.md present: YES
+- step files found: 7
+
+**Step File Sizes:**
+
+- steps-c/step-01-load-context.md: 91 lines [GOOD]
+- steps-c/step-02-discover-tests.md: 63 lines [GOOD]
+- steps-c/step-03-quality-evaluation.md: 69 lines [GOOD]
+- steps-c/step-04-generate-report.md: 65 lines [GOOD]
+- steps-e/step-01-assess.md: 51 lines [GOOD]
+- steps-e/step-02-apply-edit.md: 46 lines [GOOD]
+- steps-v/step-01-validate.md: 53 lines [GOOD]
+
+## Frontmatter Validation
+
+- No frontmatter violations found
+
+## Critical Path Violations
+
+- No {project-root} hardcoded paths detected in body
+- No dead relative links detected
+
+## Menu Handling Validation
+
+- No menu structures detected (linear step flow) [N/A]
+
+## Step Type Validation
+
+- Last step steps-v/step-01-validate.md has no nextStepFile (final step OK)
+- Step type validation assumes linear sequence (no branching/menu). Workflow-plan.md present for reference. [INFO]
+
+## Output Format Validation
+
+- Templates present: test-review-template.md
+- Steps with outputFile in frontmatter:
+  - steps-c/step-04-generate-report.md
+  - steps-v/step-01-validate.md
+
+## Validation Design Check
+
+- checklist.md present: YES
+- Validation steps folder (steps-v) present: YES
+
+## Instruction Style Check
+
+- All steps include STEP GOAL, MANDATORY EXECUTION RULES, EXECUTION PROTOCOLS, CONTEXT BOUNDARIES, and SUCCESS/FAILURE metrics
+
+## Summary
+
+- Validation completed: 2026-01-27 10:03:10
+- Critical issues: 0
+- Warnings: 0 (informational notes only)
+- Readiness: READY (manual review optional)
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/validation-report-20260127-102401.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/validation-report-20260127-102401.md
@@ -0,0 +1,114 @@
+---
+validationDate: 2026-01-27
+workflowName: testarch-test-review
+workflowPath: '{project-root}/src/workflows/testarch/bmad-testarch-test-review'
+validationStatus: COMPLETE
+completionDate: 2026-01-27 10:24:01
+---
+
+# Validation Report: testarch-test-review
+
+**Validation Started:** 2026-01-27 10:24:01
+**Validator:** BMAD Workflow Validation System (Codex)
+**Standards Version:** BMAD Workflow Standards
+
+## File Structure & Size
+
+- workflow.md present: YES
+- instructions.md present: YES
+- workflow.yaml present: YES
+- workflow-plan.md present: YES
+- step files found: 7
+
+**Step File Sizes:**
+
+- steps-c/step-01-load-context.md: 90 lines [GOOD]
+- steps-c/step-02-discover-tests.md: 62 lines [GOOD]
+- steps-c/step-03-quality-evaluation.md: 68 lines [GOOD]
+- steps-c/step-04-generate-report.md: 64 lines [GOOD]
+- steps-e/step-01-assess.md: 50 lines [GOOD]
+- steps-e/step-02-apply-edit.md: 45 lines [GOOD]
+- steps-v/step-01-validate.md: 52 lines [GOOD]
+
+## Frontmatter Validation
+
+- No frontmatter violations found
+
+## Critical Path Violations
+
+### Config Variables (Exceptions)
+
+Standard BMAD config variables treated as valid exceptions: bmb_creations_output_folder, communication_language, document_output_language, output_folder, planning_artifacts, project-root, project_name, test_artifacts, user_name
+
+- No {project-root} hardcoded paths detected in body
+
+- No dead relative links detected
+
+- No module path assumptions detected
+
+**Status:** ✅ PASS - No critical violations
+
+## Menu Handling Validation
+
+- No menu structures detected (linear step flow) [N/A]
+
+## Step Type Validation
+
+- steps-c/step-01-load-context.md: Init [PASS]
+- steps-c/step-02-discover-tests.md: Middle [PASS]
+- steps-c/step-03-quality-evaluation.md: Middle [PASS]
+- steps-c/step-04-generate-report.md: Final [PASS]
+- Step type validation assumes linear sequence (no branching/menu). Workflow-plan.md present for reference. [INFO]
+
+## Output Format Validation
+
+- Templates present: test-review-template.md
+- Steps with outputFile in frontmatter:
+  - steps-c/step-04-generate-report.md
+  - steps-v/step-01-validate.md
+- checklist.md present: YES
+
+## Validation Design Check
+
+- Validation steps folder (steps-v) present: YES
+- Validation step(s) present: step-01-validate.md
+- Validation steps reference checklist data and auto-proceed
+
+## Instruction Style Check
+
+- Instruction style: Prescriptive (appropriate for TEA quality/compliance workflows)
+- Steps emphasize mandatory sequence, explicit success/failure metrics, and risk-based guidance
+
+## Collaborative Experience Check
+
+- Overall facilitation quality: GOOD
+- Steps use progressive prompts and clear role reinforcement; no laundry-list interrogation detected
+- Flow progression is clear and aligned to workflow goals
+
+## Subagent Optimization Opportunities
+
+- No high-priority subagent optimizations identified; workflow already uses step-file architecture
+- Pattern 1 (grep/regex): N/A for most steps
+- Pattern 2 (per-file analysis): already aligned to validation structure
+- Pattern 3 (data ops): minimal data file loads
+- Pattern 4 (parallel): optional for validation only
+
+## Cohesive Review
+
+- Overall assessment: GOOD
+- Flow is linear, goals are clear, and outputs map to TEA artifacts
+- Voice and tone consistent with Test Architect persona
+- Recommendation: READY (minor refinements optional)
+
+## Plan Quality Validation
+
+- Plan file present: workflow-plan.md
+- Planned steps found: 7 (all implemented)
+- Plan implementation status: Fully Implemented
+
+## Summary
+
+- Validation completed: 2026-01-27 10:24:01
+- Critical issues: 0
+- Warnings: 0 (informational notes only)
+- Readiness: READY (manual review optional)
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow-plan.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow-plan.md
@@ -0,0 +1,21 @@
+# Workflow Plan: testarch-test-review
+
+## Create Mode (steps-c)
+
+- step-01-load-context.md
+- step-02-discover-tests.md
+- step-03-quality-evaluation.md
+- step-04-generate-report.md
+
+## Validate Mode (steps-v)
+
+- step-01-validate.md
+
+## Edit Mode (steps-e)
+
+- step-01-assess.md
+- step-02-apply-edit.md
+
+## Outputs
+
+- {test_artifacts}/test-review.md
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow.md
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow.md
@@ -0,0 +1,41 @@
+---
+name: bmad-testarch-test-review
+description: Review test quality using best practices validation. Use when user says 'lets review tests' or 'I want to evaluate test quality'
+web_bundle: true
+---
+
+# Test Quality Review
+
+**Goal:** Review test quality using a comprehensive knowledge base and best-practices validation.
+
+**Role:** You are the Master Test Architect.
+
+---
+
+## WORKFLOW ARCHITECTURE
+
+This workflow uses **tri-modal step-file architecture**:
+
+- **Create mode (steps-c/)**: primary execution flow
+- **Validate mode (steps-v/)**: validation against checklist
+- **Edit mode (steps-e/)**: revise existing outputs
+
+---
+
+## INITIALIZATION SEQUENCE
+
+### 1. Mode Determination
+
+"Welcome to the workflow. What would you like to do?"
+
+- **[C] Create** — Run the workflow
+- **[R] Resume** — Resume an interrupted workflow
+- **[V] Validate** — Validate existing outputs
+- **[E] Edit** — Edit existing outputs
+
+### 2. Route to First Step
+
+- **If C:** Load `steps-c/step-01-load-context.md`
+- **If R:** Load `steps-c/step-01b-resume.md`
+- **If V:** Load `steps-v/step-01-validate.md`
+- **If E:** Load `steps-e/step-01-assess.md`
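
The mode routing described in `workflow.md` is a fixed lookup from the menu letter to the first step file. A minimal sketch follows. The step paths are the ones listed in the workflow, while the `routeToFirstStep` name, the case-insensitive matching, and the `null` return for unknown input are illustrative assumptions.

```typescript
type Mode = "C" | "R" | "V" | "E";

// Menu letter -> first step file, as listed under "Route to First Step".
const FIRST_STEP: Record<Mode, string> = {
  C: "steps-c/step-01-load-context.md",
  R: "steps-c/step-01b-resume.md",
  V: "steps-v/step-01-validate.md",
  E: "steps-e/step-01-assess.md",
};

// Assumption: input is matched case-insensitively; unknown input yields null
// so the caller can show the menu again.
function routeToFirstStep(choice: string): string | null {
  const key = choice.trim().toUpperCase();
  return Object.prototype.hasOwnProperty.call(FIRST_STEP, key)
    ? FIRST_STEP[key as Mode]
    : null;
}
```

Keeping the table as data rather than a chain of conditionals makes it easy to check that every menu entry has exactly one target file.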
--- a/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow.yaml
+++ b/_bmad/tea/workflows/testarch/bmad-testarch-test-review/workflow.yaml
@@ -0,0 +1,48 @@
+# Test Architect workflow: bmad-testarch-test-review
+name: bmad-testarch-test-review
+# prettier-ignore
+description: 'Review test quality using best practices validation. Use when the user says "lets review tests" or "I want to evaluate test quality"'
+
+# Critical variables from config
+config_source: "{project-root}/_bmad/tea/config.yaml"
+output_folder: "{config_source}:output_folder"
+test_artifacts: "{config_source}:test_artifacts"
+user_name: "{config_source}:user_name"
+communication_language: "{config_source}:communication_language"
+document_output_language: "{config_source}:document_output_language"
+date: system-generated
+
+# Workflow components
+installed_path: "."
+instructions: "./instructions.md"
+validation: "./checklist.md"
+template: "./test-review-template.md"
+
+# Variables and inputs
+variables:
+  test_dir: "{project-root}/tests" # Root test directory
+  review_scope: "single" # single (one file), directory (folder), suite (all tests)
+  test_stack_type: "auto" # auto, frontend, backend, fullstack - from config or auto-detected
+
+# Output configuration
+default_output_file: "{test_artifacts}/test-review.md"
+
+# Required tools
+required_tools:
+  - read_file # Read test files, story, test-design
+  - write_file # Create review report
+  - list_files # Discover test files in directory
+  - search_repo # Find tests by patterns
+  - glob # Find test files matching patterns
+
+tags:
+  - qa
+  - test-architect
+  - code-review
+  - quality
+  - best-practices
+
+execution_hints:
+  interactive: false # Minimize prompts
+  autonomous: true # Proceed without user input unless blocked
+  iterative: true # Can review multiple files
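
Several values in `workflow.yaml` have the form `"{config_source}:key"`, for example `test_artifacts: "{config_source}:test_artifacts"`. The loader that interprets these references is not part of this section, so the sketch below only illustrates one plausible reading: treat the part after the colon as a key to look up in the already-parsed config file. The function name, the error behaviour, and the sample path in the usage note are all hypothetical.

```typescript
// Matches values such as "{config_source}:output_folder" and captures the key.
const CONFIG_REF = /^\{config_source\}:([A-Za-z_][A-Za-z0-9_]*)$/;

// Assumption: `config` holds the parsed key/value pairs of the config file.
// Values that are not references are returned unchanged.
function resolveConfigRef(value: string, config: Record<string, string>): string {
  const match = CONFIG_REF.exec(value);
  if (!match) {
    return value; // plain literal, e.g. "./checklist.md"
  }
  const key = match[1];
  if (!Object.prototype.hasOwnProperty.call(config, key)) {
    throw new Error(`Missing config key: ${key}`);
  }
  return config[key];
}
```

With a config of `{ test_artifacts: "out/test-artifacts" }` (a made-up path), `resolveConfigRef("{config_source}:test_artifacts", config)` returns `"out/test-artifacts"`, while `resolveConfigRef("./checklist.md", config)` returns the literal unchanged.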