# APOPHIS Test Quality Audit Report

**Date**: 2026-04-29

**Scope**: 55 test files, ~20,450 lines

**Auditors**: 3 parallel subworkers (CLI tests, Domain/Core tests, Feature tests)

---

## Executive Summary

| Category | Count | Lines | Verdict |
|----------|-------|-------|---------|
| **CLI Tests** | 18 files | ~9,209 lines | 10 KEEP, 3 MERGE, 4 REFACTOR, 1 DELETE |
| **Domain/Core Tests** | 11 files | ~4,500 lines | 8 KEEP, 1 MERGE, 2 REFACTOR |
| **Feature Tests** | 26 files | ~6,741 lines | 20 KEEP, 2 MERGE, 4 REFACTOR, 3 DELETE |
| **Total** | 55 files | ~20,450 lines | 38 KEEP, 6 MERGE, 10 REFACTOR, 4 DELETE |

**Key Findings**:

- 4 test files test non-production helpers (cascade-validator, hypermedia-validator, etc.)
- 6 files have significant overlap with other tests
- 10 files need refactoring (temp app approach broken, implementation testing, weak assertions)
- 38 files provide unique, valuable coverage

---

## Critical Issues (Fix First)

### 1. Broken Test Approach: `verify-ux.test.ts`

- **Status**: 16 of 20 tests FAIL (80% failure rate)
- **Root cause**: Creates temp app.js files that aren't valid Fastify apps
- **Impact**: Unreliable regression protection
- **Fix**: Switch to fixture apps (`src/cli/__fixtures__/`) or create new fixtures

### 2. Duplicate Tests: `integration.test.ts`

- **Status**: 3 pairs of duplicate/near-duplicate tests (6 tests)
- **Impact**: Wasted CI time, no added coverage
- **Fix**: Remove duplicates

### 3. Non-Production Helpers: `cascade-validator.test.ts`, `hypermedia-validator.test.ts`

- **Status**: Test helpers that were merged into test files, never imported by production code
- **Impact**: Test maintenance burden for dead code
- **Fix**: Delete (production coverage exists in `relationships.test.ts`)

### 4. Inline Copies: `deduplication.test.ts`

- **Status**: Contains stale copies of `deduplicatePetit`/`deduplicateStateful`
- **Impact**: Tests don't exercise actual production code
- **Fix**: Import from `runner-utils.ts` instead

---

## CLI Test Audit (18 files)

### KEEP (10 files)

| File | Tests | Value | Why |
|------|-------|-------|-----|
| `docs-smoke.test.ts` | 4 | **Unique** | Only test verifying documentation accuracy |
| `goldens.test.ts` | 9 | **High** | Guards CLI output against accidental changes |
| `init.test.ts` | 17 | **Unique** | Only deep init coverage |
| `latency.test.ts` | 5 | **Unique** | Performance regression guards |
| `migrate-reliability.test.ts` | 20 | **Unique** | Canonical migrate test, 80% coverage |
| `observe-safety.test.ts` | 20 | **Unique** | Only policy engine + observe integration |
| `packaging.test.ts` | 15 | **Unique** | Only test of built binary |
| `qualify-signal.test.ts` | 16 | **Unique** | Only artifact structure validation |
| `renderers.test.ts` | 18 | **Unique** | Only renderer function tests |
| `replay-integrity.test.ts` | 10 | **Unique** | Only replay loader/schema tests |

### MERGE (3 files)

| File | Target | Reason |
|------|--------|--------|
| `core.test.ts` | `dispatch.test.ts` | Tests same CLI entrypoint, weaker assertions |
| `migrate.test.ts` | `migrate-reliability.test.ts` | Subset coverage, 15 tests vs 20 |
| `observe.test.ts` | `observe-safety.test.ts` | Keep fixture-based tests only |

### REFACTOR (4 files)

| File | Issue | Fix |
|------|-------|-----|
| `acceptance.test.ts` | 8 tests fail due to fixture instability | Use `main()` entrypoint, drop failing tests |
| `config-validation.test.ts` | 271 tests, many permutations | Collapse to ~50 parameterized tests |
| `doctor-consistency.test.ts` | 5 tests fail (temp apps not valid) | Use fixture apps instead |
| `verify-ux.test.ts` | 16 of 20 tests fail | Switch to fixture apps |

### DELETE (after merge)

- `core.test.ts` → merged into dispatch
- `migrate.test.ts` → merged into migrate-reliability
- `observe.test.ts` → merged into observe-safety

---

## Domain/Core Test Audit (11 files)

### KEEP (8 files)

| File | Tests | Value |
|------|-------|-------|
| `domain.test.ts` | 45 | Foundational classification rules |
| `formula.test.ts` | ~85 | Core parser/evaluator, property tests |
| `extension.test.ts` | 36 | Registry/framework, no overlap |
| `infrastructure.test.ts` | 15 | ScopeRegistry, CleanupManager, HookValidator |
| `error-context.test.ts` | 24 | Core contract validation |
| `error-suggestions.test.ts` | 31 | Exhaustive suggestion branches |
| `cross-operation-support.test.ts` | 8 | Only integration tests for `previous()` |
| `protocol-extensions.test.ts` | 22 | Built-in extensions |

### MERGE (1 file)

| File | Target | Reason |
|------|--------|--------|
| `examples.test.ts` | `integration.test.ts` | Redundant smoke tests |

### REFACTOR (2 files)

| File | Issue | Fix |
|------|-------|-----|
| `integration.test.ts` | 6 duplicate/near-duplicate tests | Remove duplicates |
| `success-metrics.test.ts` | Arbitrary thresholds, covered elsewhere | Delete (assertions in error-context + integration) |

---

## Feature Test Audit (26 files)

### KEEP (20 files)

| File | Tests | Value |
|------|-------|-------|
| `cache-hints.test.ts` | 7 | Cache invalidation patterns |
| `counterexample.test.ts` | 17 | Failure analysis + formatting |
| `debug-mode.test.ts` | 2 | Debug logging toggle |
| `incremental.test.ts` | 12 | Hash determinism |
| `incremental/cache.test.ts` | 7 | Cache API round-trip |
| `invariant-registry.test.ts` | 5 | Invariant resolution |
| `outbound-interceptor.test.ts` | 16 | Chaos application |
| `outbound-runtime.test.ts` | 10 | Outbound registry + mocks |
| `outbound-stateful.test.ts` | 7 | Stateful mock CRUD |
| `production-safety.test.ts` | 4 | Production guards |
| `regex-guard.test.ts` | 13 | ReDoS protection |
| `relationships.test.ts` | 9 | Production relationship predicates |
| `resource-inference.test.ts` | 13 | Schema-driven identity |
| `route-matcher.test.ts` | 17 | URL pattern matching |
| `scenario-runner.test.ts` | 6 | Scenario capture/rebind/cookies |
| `schema-to-arbitrary.test.ts` | 33 | Schema-to-fast-check (property tests) |
| `scope-isolation.test.ts` | 4 | Scope filtering |
| `serverless.test.ts` | 3 | Serverless compatibility |
| `stateful-runner.test.ts` | 6 | Stateful test execution |
| `tap-formatter.test.ts` | 15 | TAP output formatting |

### MERGE (2 files)

| File | Target | Reason |
|------|--------|--------|
| `format-diff.test.ts` | `counterexample.test.ts` | Only 4 tests, same module |
| `seeded-rng.test.ts` | `schema-to-arbitrary.test.ts` | 5 tests, RNG core to generation |

### REFACTOR (4 files)

| File | Issue | Fix |
|------|-------|-----|
| `deduplication.test.ts` | Stale copies of production code | Import from `runner-utils.ts` |
| `incremental/cache.test.ts` | Weak "persists to disk" test | Fix or remove |
| `counterexample.test.ts` | Growing file (224 lines) | Split if it exceeds 250 lines |
| `tap-formatter.test.ts` | Same module as counterexample | Consider unified `formatters.test.ts` |

### DELETE (4 files)

| File | Reason | Coverage Moves To |
|------|--------|-------------------|
| `cascade-validator.test.ts` | Tests non-production helpers | `relationships.test.ts` |
| `hypermedia-validator.test.ts` | Tests non-production helpers | `relationships.test.ts` |
| `gap-fixes.test.ts` | Runtime hooks → infrastructure, chaos → outbound-interceptor | `infrastructure.test.ts`, `outbound-interceptor.test.ts` |
| `success-metrics.test.ts` | Arbitrary metrics, covered elsewhere | `error-context.test.ts`, `integration.test.ts` |

---

## Action Plan

### Phase A: Fix Broken Tests (Week 1)

1. **Refactor `verify-ux.test.ts`** - Switch to fixture apps
2. **Refactor `doctor-consistency.test.ts`** - Use fixture apps for failing tests
3. **Refactor `acceptance.test.ts`** - Remove failing tests, use `main()` entrypoint
4. **Remove duplicates from `integration.test.ts`** - 6 tests

### Phase B: Delete Dead Tests (Week 1)

1. **Delete `cascade-validator.test.ts`**
2. **Delete `hypermedia-validator.test.ts`**
3. **Delete `gap-fixes.test.ts`** (after moving valuable tests)
4. **Delete `success-metrics.test.ts`**

### Phase C: Merge Overlapping Tests (Week 2)

1. **Merge `core.test.ts` → `dispatch.test.ts`**
2. **Merge `migrate.test.ts` → `migrate-reliability.test.ts`**
3. **Merge `observe.test.ts` → `observe-safety.test.ts`**
4. **Merge `examples.test.ts` → `integration.test.ts`**
5. **Merge `format-diff.test.ts` → `counterexample.test.ts`**
6. **Merge `seeded-rng.test.ts` → `schema-to-arbitrary.test.ts`**

### Phase D: Refactor Implementation Tests (Week 2)

1. **Refactor `deduplication.test.ts`** - Use real imports
2. **Refactor `config-validation.test.ts`** - Parameterize permutations
3. **Fix `incremental/cache.test.ts`** - Strengthen or remove weak test

---

## Impact Projection

| Metric | Current | After | Change |
|--------|---------|-------|--------|
| Test files | 55 | ~45 | -10 (-18%) |
| Test lines | ~20,450 | ~18,000 | -2,450 (-12%) |
| Failing tests | ~20 | 0 | -20 (-100%) |
| Duplicate tests | ~15 | 0 | -15 (-100%) |
| Non-production tests | 4 files | 0 | -4 (-100%) |

**Coverage target**: Retain or move the useful assertions before deleting overlapping tests.

---

## Test Quality Principles Applied

1. **Behavior over implementation** - Tests should verify observable behavior, not internal structure
2. **Fixtures over temp files** - Use stable fixture apps instead of generating temp app.js files
3. **Parameterized over permutations** - One test with multiple inputs beats 10 identical tests
4. **Production over helpers** - Test production code, not test-only helpers
5. **Independence** - Each test should create its own context, not depend on global state

---

*Report generated from static analysis of all 55 test files. No code changes made.*
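
The "parameterized over permutations" principle, and the suggested collapse of `config-validation.test.ts` from 271 tests to ~50, could look roughly like the table-driven sketch below. It is an illustration only: `validateConfig`, the `Config` shape, the error messages, and `runCases` are hypothetical stand-ins, not the project's actual validator or test runner API.

```typescript
// Hypothetical config shape and validator, used only to illustrate the pattern.
// The real validator and its rules are not shown in this report.
type Config = { runs?: unknown; timeoutMs?: unknown };

function validateConfig(config: Config): string[] {
  const errors: string[] = [];
  for (const key of ["runs", "timeoutMs"] as const) {
    const value = config[key];
    if (value === undefined) continue; // both fields are optional in this sketch
    if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
      errors.push(`${key} must be a positive integer`);
    }
  }
  return errors;
}

// One row per input permutation replaces one hand-written test body each.
interface Case {
  name: string;
  input: Config;
  expected: string[];
}

const cases: Case[] = [
  { name: "empty config is valid", input: {}, expected: [] },
  { name: "positive runs is valid", input: { runs: 10 }, expected: [] },
  { name: "zero runs is rejected", input: { runs: 0 }, expected: ["runs must be a positive integer"] },
  { name: "string timeout is rejected", input: { timeoutMs: "5" }, expected: ["timeoutMs must be a positive integer"] },
  {
    name: "both fields invalid",
    input: { runs: -1, timeoutMs: 1.5 },
    expected: ["runs must be a positive integer", "timeoutMs must be a positive integer"],
  },
];

// Runs every row and returns the names of the rows whose result did not match.
function runCases(table: Case[]): string[] {
  return table
    .filter((c) => JSON.stringify(validateConfig(c.input)) !== JSON.stringify(c.expected))
    .map((c) => c.name);
}
```

In a real suite, each row would presumably be registered as its own named test with whatever runner the project uses, so that a failure reports the row's `name`. The loop here only keeps the sketch framework-independent.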