NEXT_STEPS_427.md — Chaos System Final Cutover (2026-04-27)

Philosophy

We write 1000-5000 LoC/hour. We do NOT do quick hacks or backward compatibility. Every change is a clean cutover. We parallelize via subworkers. We go red-green-refactor with fast feedback loops.

Status: v2.2 Stabilized → v2.3 Chaos Finalization

Test count: 505 passing, 0 failures
Build: Clean
Goal: Remove all dead code, unify APIs, fix naming lies, wire what exists, document honestly, then extend chaos into contract-driven outbound mocking.


P0: Kill Dead Code (Parallel Batch 1)

P0.1: Remove services field from all config types

  • Files: src/types.ts, src/quality/chaos-v2.ts, src/quality/chaos-types.ts
  • Action: Delete services?: Record<string, ServiceChaosConfig> from all types
  • Rationale: Documented fantasy. Zero implementation. Types for unimplemented features are worse than no types.
  • Verification: npm run build passes, tests pass

P0.2: Remove DependencyChaosConfig

  • Files: src/quality/chaos-v2.ts
  • Action: Delete the interface. It is never exported from the package entry point.
  • Rationale: Dead code. Duplicates EnhancedChaosConfig minus routes.

P0.3: Remove makeInvalidJson from corruption.ts

  • Files: src/quality/corruption.ts
  • Action: Delete function. It is defined but never wired into BUILTIN_STRATEGIES.
  • Rationale: Dead code. Also dangerous (swaps body type from object to string silently).

P0.4: Remove unreachable transport event types

  • Files: src/quality/chaos-types.ts, src/quality/chaos-v2.ts
  • Action: Delete transport-partial and transport-corrupt-headers from ChaosInjectionType union
  • Rationale: In the type union but no strategy produces them. No implementation. No tests.
  • Alternative: If we want them, implement them properly in this session. But cut first, add later.

P0.5: Remove reportInDiagnostics flag

  • Files: src/quality/chaos-types.ts, src/quality/chaos-v2.ts
  • Action: Delete field from EnhancedChaosConfig. Never checked in engine code.
  • Rationale: Dead config. Confusing — chaos events are always reported if they occur.

P1: Unify Config Types (Single Source of Truth)

P1.1: Merge all chaos config into one type

  • Files: src/types.ts (primary), src/quality/chaos-v2.ts, src/quality/chaos-types.ts
  • Action:
    1. Extend ChaosConfig in src/types.ts with:
      • outbound?: OutboundChaosConfig[]
      • include?: string[]
      • exclude?: string[]
      • resilience?: { enabled: boolean; maxRetries: number; backoffMs: number }
      • skipResilienceFor?: ('constructor' | 'mutator' | 'observer' | 'destructor' | 'utility')[]
      • routes?: Record<string, Partial<ChaosConfig>> (per-route overrides)
    2. Delete EnhancedChaosConfig from chaos-types.ts and chaos-v2.ts
    3. Update all imports site-wide
  • Rationale: Four config types for one concept is insane. One type, one import, one mental model.
  • Breaking: Yes. Clean cutover. No backward compat.

P1.2: Fix corruption.strategies — either implement or delete

  • Files: src/types.ts, src/quality/corruption.ts, src/quality/chaos-v2.ts
  • Decision: DELETE the field. It is documented three different ways and used zero ways.
  • Rationale: Dead parameter. If we want strategy allow-listing later, we'll design it properly.

P2: Fix Naming Lies (Transport → Body)

P2.1: Rename transport event types to body-*

  • Files: src/quality/chaos-types.ts, src/quality/chaos-v2.ts, src/quality/corruption.ts, all tests
  • Action:
    • transport-truncate → body-truncate
    • transport-malformed → body-malformed
    • Remove transport-partial and transport-corrupt-headers (already killed in P0)
  • Rationale: We manipulate deserialized JS values, not TCP bytes. Stop overpromising.
  • Docs update: docs/chaos-v2.md, docs/getting-started.md

P2.2: Rename injectCorruption to injectBodyCorruption

  • Files: src/quality/chaos-v2.ts
  • Action: Method rename. Internal only.

P3: Fix Strategy Mapping (Structural Descriptors)

P3.1: Replace substring matching with structural descriptors

  • Files: src/quality/corruption.ts, src/quality/chaos-v2.ts
  • Current: mapCorruptionToTransportType does name.includes('truncate') etc.
  • New: Each strategy object carries its own kind:
    interface CorruptionStrategy {
      readonly name: string
      readonly kind: 'body-truncate' | 'body-malformed'
      readonly fn: (data: unknown, rng: () => number) => unknown
    }
    
  • Rationale: Substring matching on human-readable names is fragile. Renaming a strategy silently reroutes event types.
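A minimal sketch of the descriptor approach, assuming two illustrative strategies (the names `drop-array-tail` and `null-out-fields`, the strategy bodies, and the `eventTypeFor` helper are hypothetical, not the real contents of corruption.ts). Neither name contains "truncate" or "malformed", yet the event type stays correct because it is read from `kind`:

```typescript
// Hypothetical sketch: strategy names and bodies are illustrative only.
type BodyChaosKind = 'body-truncate' | 'body-malformed'

interface CorruptionStrategy {
  readonly name: string
  readonly kind: BodyChaosKind
  readonly fn: (data: unknown, rng: () => number) => unknown
}

const BUILTIN_STRATEGIES: readonly CorruptionStrategy[] = [
  {
    name: 'drop-array-tail',
    kind: 'body-truncate',
    // Keep a random-length prefix of arrays; leave other values untouched.
    fn: (data, rng) =>
      Array.isArray(data) ? data.slice(0, Math.floor(rng() * data.length)) : data,
  },
  {
    name: 'null-out-fields',
    kind: 'body-malformed',
    // Replace every top-level field of a plain object with null.
    fn: (data) => {
      if (data === null || typeof data !== 'object' || Array.isArray(data)) return data
      const out: Record<string, null> = {}
      for (const key of Object.keys(data)) out[key] = null
      return out
    },
  },
]

// The event type is read from the descriptor, never inferred from the name.
function eventTypeFor(strategy: CorruptionStrategy): BodyChaosKind {
  return strategy.kind
}
```

Renaming a strategy in this shape changes only `name`; the emitted event type cannot drift.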

P4: Wire Outbound Interceptor (The Big One)

P4.1: Integrate OutboundInterceptor into test runner

  • Files: src/test/petit-runner.ts, src/quality/chaos-v2.ts
  • Problem: getOutboundInterceptor() exists but nothing calls it.
  • Solution:
    1. Add a Fastify decorator or request-scoped container that exposes the interceptor
    2. OR: Patch fetch / http.request at test setup time to route through interceptor
    3. OR: Provide a helper that wraps the user's HTTP client:
      const fetchWithChaos = engine.wrapFetch(globalThis.fetch)
      
  • Decision: Start with option 3 (helper). Fastify-agnostic. Works with any HTTP client.
  • Rationale: We can't intercept inside handlers without cooperation. Give developers the tool.

P4.2: Add wrapFetch / wrapHttp helpers

  • Files: src/quality/chaos-outbound.ts (new exports)
  • API:
    export function wrapFetch(
      fetch: typeof globalThis.fetch,
      interceptor: OutboundInterceptor
    ): typeof globalThis.fetch
    
  • Rationale: Makes outbound chaos usable. Currently it's a class with no plumbing.
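A simplified sketch of the wrapping idea. It deliberately uses a reduced `FetchLike` type instead of `typeof globalThis.fetch`, and the `OutboundInterceptor.intercept` method shown here is an assumption; the real interceptor API may look different:

```typescript
// Simplified stand-ins (assumptions): the real OutboundInterceptor API may differ.
interface MinimalResponse {
  readonly status: number
}
type FetchLike = (url: string, init?: { method?: string }) => Promise<MinimalResponse>

interface OutboundInterceptor {
  // Returns a substitute response to inject chaos, or undefined to pass through.
  intercept(url: string, method: string): MinimalResponse | undefined
}

function wrapFetch(fetchImpl: FetchLike, interceptor: OutboundInterceptor): FetchLike {
  return async (url, init) => {
    const method = (init?.method ?? 'GET').toUpperCase()
    const injected = interceptor.intercept(url, method)
    // Short-circuit when chaos fires; otherwise delegate to the real client.
    return injected ?? fetchImpl(url, init)
  }
}
```

Because the wrapper has the same call signature as the function it wraps, handlers can take it by injection without knowing chaos is involved.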

P4.3: Wire per-route outbound overrides

  • Files: src/quality/chaos-v2.ts, src/quality/chaos-route-resolver.ts
  • Problem: getRouteConfig merges legacy overrides but ignores resolveOutboundForRoute()
  • Fix: Call resolveOutboundForRoute(config, route) in executeWithChaos and pass result to OutboundInterceptor

P5: RNG Forking (Reproducibility)

P5.1: Fork RNG per chaos layer

  • Files: src/quality/chaos-v2.ts
  • Current: Both transport and outbound use the same seed, so they draw from the same RNG stream
  • Fix:
    const transportRng = new SeededRng(hashCombine(seed, 'transport'))
    const outboundRng = new SeededRng(hashCombine(seed, 'outbound'))
    
  • Rationale: Adding outbound config currently shifts transport reproducibility. That's a bug.
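A sketch of how forked streams stay independent. The `stringHash`, `hashCombine`, and `SeededRng` bodies below are stand-ins (FNV-1a, a Boost-style combine, and mulberry32); the project's real implementations are not shown in this document and may differ:

```typescript
// FNV-1a over a string, returned as an unsigned 32-bit integer (stand-in).
function stringHash(s: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i)
    h = Math.imul(h, 0x01000193)
  }
  return h >>> 0
}

// Boost-style hash_combine, kept in unsigned 32-bit range (stand-in).
function hashCombine(seed: number, salt: number | string): number {
  const n = typeof salt === 'string' ? stringHash(salt) : salt
  return (seed ^ (n + 0x9e3779b9 + (seed << 6) + (seed >>> 2))) >>> 0
}

// mulberry32 generator wrapped in a class (stand-in for the real SeededRng).
class SeededRng {
  private state: number
  constructor(seed: number) {
    this.state = seed >>> 0
  }
  next(): number {
    this.state = (this.state + 0x6d2b79f5) | 0
    let t = Math.imul(this.state ^ (this.state >>> 15), 1 | this.state)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

const seed = 42
const transportRng = new SeededRng(hashCombine(seed, 'transport'))
const outboundRng = new SeededRng(hashCombine(seed, 'outbound'))
```

Each layer owns its generator state, so drawing from `outboundRng` never advances `transportRng`.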

P6: Blast Radius Cap (Safety)

P6.1: Add maxInjectionsPerSuite circuit breaker

  • Files: src/quality/chaos-v2.ts, src/types.ts
  • API: Add to ChaosConfig:
    readonly maxInjectionsPerSuite?: number // default: Infinity
    
  • Behavior: Counter in EnhancedChaosEngine. Once the cap is reached, executeWithChaos stops injecting and passes calls through unchanged.
  • Rationale: Prevents probability: 1 from masking every assertion in CI.
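A minimal sketch of the cap, assuming a standalone `InjectionBudget` helper and a simplified `executeWithChaos` signature (both hypothetical; the real counter would live inside EnhancedChaosEngine):

```typescript
// Hypothetical helper; the real counter would live inside EnhancedChaosEngine.
class InjectionBudget {
  private used = 0
  private readonly max: number
  constructor(max: number = Infinity) {
    this.max = max
  }

  // Returns true and consumes one unit if an injection is still allowed.
  tryConsume(): boolean {
    if (this.used >= this.max) return false
    this.used += 1
    return true
  }
}

// Simplified signature (assumption): shouldInject stands in for the probability roll.
function executeWithChaos<T>(
  run: () => T,
  inject: (value: T) => T,
  shouldInject: boolean,
  budget: InjectionBudget,
): T {
  const value = run()
  // Once the budget is exhausted, the call passes through untouched.
  return shouldInject && budget.tryConsume() ? inject(value) : value
}
```

With a budget of 2 and every roll succeeding, the first two calls are corrupted and the third returns the real value.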

P7: Fix truncateJson RNG

  • Files: src/quality/corruption.ts
  • Problem: Declares rng parameter but ignores it. Cut point is always floor(n/2).
  • Fix: Either remove param from signature, or use it for random cut point.
  • Decision: Use it. const cut = Math.floor(rng() * n) for arrays, Math.floor(rng() * str.length) for strings.
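The decision above could look roughly like this. Passing non-array, non-string values through unchanged is an assumption; the document does not say what the real function does with other shapes:

```typescript
// Sketch of truncateJson with an rng-driven cut point.
function truncateJson(data: unknown, rng: () => number): unknown {
  if (Array.isArray(data)) {
    // Cut point in [0, n - 1]: at least one element is always dropped.
    return data.slice(0, Math.floor(rng() * data.length))
  }
  if (typeof data === 'string') {
    return data.slice(0, Math.floor(rng() * data.length))
  }
  // Assumption: other shapes are returned unchanged.
  return data
}
```

Since `rng()` is in [0, 1), the cut point never equals the full length, so a non-empty input is always shortened.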

P8: Fix assertTestEnv Runtime Violation

  • Files: src/quality/chaos-v2.ts, src/infrastructure/env-guard.ts
  • Problem: assertTestEnv called inside executeWithChaos at request time. Its own invariant says "MUST only be called at plugin registration time."
  • Fix: Move the check to plugin registration. Cache result. Pass a boolean testEnv flag into executeWithChaos.

P9: Documentation

P9.1: Document transport/body chaos in getting-started.md

  • Current: Zero mention. Only chaos: { probability, delay } example.
  • Add: Section showing corruption config with body-truncate, body-malformed examples.

P9.2: Update docs/chaos-v2.md

  • Fix: Remove references to strategies array. Update type names. Remove services examples.
  • Add: wrapFetch example for outbound chaos.

P9.3: Update docs/extensions/QUICK-REFERENCE.md

  • Add: Chaos section with quick examples.

P10: Remaining from 426 (Deferred Items)

P10.1: Arbiter Bug #3 — Configurable Invariants

  • Status: Complete
  • Files: src/types.ts, src/domain/invariant-registry.ts, src/test/petit-runner.ts, src/test/stateful-runner.ts
  • Implemented: TestConfig.invariants?: string[] | false with resolveInvariants() routing in both runners

P10.2: CI/CD Examples

  • Status: Still pending
  • Files: docs/ci-cd.md (new)
  • Need: GitHub Actions, GitLab CI, CircleCI workflows
  • Defer to: v2.4 or integrate if time permits

P10.3: Mutation Testing Cleanup

  • Status: src/quality/mutation.ts exists but is unused
  • Decision: Keep file. It's not breaking anything. Integrate properly in v2.4.

P11: Contract-Driven Outbound Mocks (Next Major Cut)

P11.1: Register shared outbound dependency contracts

  • Status: Complete
  • Files: src/types.ts, src/plugin/index.ts, new src/domain/outbound-contracts.ts
  • Implemented: ApophisOptions.outboundContracts, OutboundContractRegistry, registerOutboundContracts() decoration

P11.2: Add x-outbound route annotation

  • Status: Complete
  • Files: src/domain/contract.ts, src/types.ts
  • Implemented: RouteContract.outbound, parsed from schema['x-outbound']. Supports string refs, ref-with-overrides, and inline contracts

P11.3: Add automatic test-env outbound mock runtime

  • Status: Complete
  • Files: src/plugin/index.ts, new src/infrastructure/outbound-mock-runtime.ts, src/test/petit-runner.ts, src/test/stateful-runner.ts
  • Implemented: OutboundMockRuntime patches globalThis.fetch, returns generated/overridden responses, records calls, restores cleanly. Imperative API via enableOutboundMocks(), disableOutboundMocks(), getOutboundCalls()

P11.4: Reuse existing outbound chaos as a mock overlay

  • Status: Complete (architectural: chaos-v2 still owns chaos, the mock runtime owns dependency mocking; the two work alongside each other via fetch wrapping)
  • Files: src/quality/chaos-v2.ts, src/quality/chaos-outbound.ts
  • Migrated: stateful-runner.ts now uses EnhancedChaosEngine (single chaos stack across runners)

P11.5: Expose outbound call facts to APOSTL and E2E tests

  • Status: Complete
  • Files: new src/extensions/outbound.ts, src/types.ts
  • Implemented: Built-in extension exposing outbound_calls(this) and outbound_last(this) predicates. Imperative getOutboundCalls() API for E2E tests.

P11.6: Property-test both sides of the integration boundary

  • Status: Phase 1 complete (mode: 'example' works deterministically). Phase 2 (mode: 'property') deferred — types and runtime allow additive change without rewrite.
  • Files: src/domain/schema-to-arbitrary.ts, src/test/petit-runner.ts, src/test/stateful-runner.ts
  • Implemented: convertSchema(responseSchema, { context: 'response' }) reused for dependency response generation. Deterministic sub-seeds derived from test seed via hashCombine(seed, stringHash(routePath)).

P11.7: Tests

  • Status: Complete
  • File: src/test/outbound-runtime.test.ts
  • Coverage: Registry resolution (string refs, refs with overrides, inline, missing refs), runtime install/restore, generated responses, overrides, unmatched error/passthrough, call recording, double-install protection. 10/10 tests passing.

P11.8: Async-to-Sync Conversion

  • Status: Complete
  • Files: src/extensions/serializers/transformer.ts, src/extensions/sse/transformer.ts, src/extensions/websocket/runner.ts, src/plugin/index.ts
  • Converted: transformRequest, transformResponse, transformSSEResponse, runWebSocketTests, enableOutboundMocks, disableOutboundMocks
  • Rationale: Removed unnecessary async/await overhead on functions that perform no async work. Reduces microtask queue pressure.

P12: Production-Safety Hardening (Reviewer-Driven)

Context: Engineering review by simulated personas (Hanson/Halliday/Dahl) identified production-safety concerns. We are NOT stripping APOPHIS down — the framework's scope is correct for the end goal. Instead, we harden every dangerous edge so APOPHIS becomes safe to ship in any environment, while preserving every feature.

Outcome: APOPHIS that is fully featured AND impossible to misuse in production.

P12.1: Replace globalThis.fetch Patching with undici MockAgent + AsyncLocalStorage

  • Status: Pending
  • Files: src/infrastructure/outbound-mock-runtime.ts (rewrite), src/test/petit-runner.ts, src/test/stateful-runner.ts, src/plugin/index.ts
  • Problem: Current globalThis.fetch patching is process-global, not concurrency-safe, bypassed by code that captures fetch at module load (Stripe SDK, undici Pool), and uses naive url.includes(target) substring matching which is exploitable.
  • Solution:
    1. Replace fetch monkey-patching with undici's MockAgent + setGlobalDispatcher
    2. Wrap mock state in AsyncLocalStorage<MockContext> so concurrent test suites don't collide
    3. Use URL parsing for target matching (hostname + path prefix), not substring
    4. Restore previous dispatcher (not just globalThis.fetch) on teardown
  • API:
    import { MockAgent, setGlobalDispatcher, getGlobalDispatcher } from 'undici'
    import { AsyncLocalStorage } from 'node:async_hooks'
    
    const mockContext = new AsyncLocalStorage<MockContext>()
    
    export function createOutboundMockRuntime(opts: OutboundMockOptions): OutboundMockRuntime {
      const agent = new MockAgent({ connections: 1 })
      agent.disableNetConnect()
      const previousDispatcher = getGlobalDispatcher()
      // ... interceptors set up via agent.get(origin).intercept({path, method}).reply(...)
      return {
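        // NOTE: mockContext.run() scopes the store to its callback only; the suite body must
        // execute inside run() for getStore() to see this context (see P12.3).
        install: () => mockContext.run({ agent }, () => setGlobalDispatcher(agent)),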
        restore: () => setGlobalDispatcher(previousDispatcher),
        // ...
      }
    }
    
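  • Migration path: Node 18+ bundles undici internally to implement fetch, but MockAgent and setGlobalDispatcher can only be imported from the undici npm package. Confirm whether undici is already in the dependency tree; if it is not, add it as a devDependency and verify that the installed version shares its global dispatcher with Node's built-in fetch.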
  • Rationale: Both Hanson and Dahl identified this as the single biggest production risk. undici MockAgent is the standard, AsyncLocalStorage solves concurrency.

P12.2: Hard-Fail at Plugin Registration if NODE_ENV=production and Unsafe Options Set

  • Status: Pending
  • Files: src/plugin/index.ts, src/infrastructure/env-guard.ts
  • Problem: Currently enableOutboundMocks and chaos can be enabled at runtime in production with no guardrail. assertTestEnv only fires when chaos engine is constructed, not at plugin boot.
  • Solution:
    1. Move all environment checks to plugin onReady hook
    2. Refuse to start the Fastify instance if any unsafe option is set in production:
      • runtime: 'error' | 'warn' (any non-'off' value)
      • chaos config present
      • outboundContracts registered (even via apophis.registerOutboundContracts)
    3. Throw with explicit error message including the offending option and the env var to override
    4. Add escape hatch: APOPHIS_FORCE_PRODUCTION_DANGEROUS=1 env var for users who genuinely need it
  • Code shape:
    fastify.addHook('onReady', async () => {
      // Only the documented value '1' unlocks the escape hatch ('0' or '' must not).
      const forced = process.env.APOPHIS_FORCE_PRODUCTION_DANGEROUS === '1'
      if (process.env.NODE_ENV === 'production' && !forced) {
        const violations: string[] = []
        if (opts.runtime && opts.runtime !== 'off') violations.push('runtime hooks')
        if (opts.chaos) violations.push('chaos engine')
        if (Object.keys(opts.outboundContracts ?? {}).length > 0) violations.push('outbound mocks')
        if (violations.length > 0) {
          throw new Error(
            `APOPHIS refuses to start in production with: ${violations.join(', ')}. ` +
            `Set APOPHIS_FORCE_PRODUCTION_DANGEROUS=1 to override (not recommended).`
          )
        }
      }
    })
    
  • Rationale: onReady is the right layer — it's after registration, before serving. Hanson explicitly called this out.

P12.3: AsyncLocalStorage-Scoped Mock Context (Concurrent Test Safety)

  • Status: Pending (depends on P12.1)
  • Files: src/infrastructure/outbound-mock-runtime.ts, src/test/petit-runner.ts, src/test/stateful-runner.ts
  • Problem: Two test suites running in parallel (Promise.all([suiteA(), suiteB()])) silently share globalThis.fetch patches.
  • Solution:
    1. All mock state (resources, calls, injected responses) lives in AsyncLocalStorage<MockContext>
    2. Each runPetitTests invocation creates a fresh context via mockContext.run(...)
    3. The undici dispatcher reads the current ALS context to find the right mock
  • Verification: Add test that runs two concurrent test suites with different mocks and asserts isolation.
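A sketch of the isolation property, using Node's `AsyncLocalStorage`. The `MockContext` shape, `recordOutboundCall`, and `runSuite` are hypothetical stand-ins for the real runtime; the point is only that two interleaved suites each see their own store:

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical context shape; the real MockContext holds resources and injections too.
interface MockContext {
  readonly suite: string
  readonly calls: string[]
}

const mockContext = new AsyncLocalStorage<MockContext>()

// Called from the shared dispatcher: records against whichever suite is active.
function recordOutboundCall(url: string): void {
  const ctx = mockContext.getStore()
  if (!ctx) throw new Error('outbound call made outside a mock context')
  ctx.calls.push(url)
}

// Each suite runs inside its own context; awaits inside run() keep the same store.
async function runSuite(suite: string, urls: string[]): Promise<MockContext> {
  const ctx: MockContext = { suite, calls: [] }
  await mockContext.run(ctx, async () => {
    for (const url of urls) {
      await Promise.resolve() // yield so concurrent suites interleave
      recordOutboundCall(url)
    }
  })
  return ctx
}
```

Running two suites under `Promise.all` exercises the interleaving that the verification bullet asks for.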

P12.4: Try/Finally Wrap All Mock Lifecycle (Cleanup-on-Throw)

  • Status: Pending
  • Files: src/test/petit-runner.ts, src/test/stateful-runner.ts
  • Problem: Current code does suiteMockRuntime.install() then later suiteMockRuntime.restore(). If any exception fires between them, fetch is leaked.
  • Solution:
    1. Wrap entire test execution in try { ... } finally { suiteMockRuntime.restore() }
    2. Register restore callback in CleanupManager so SIGINT/SIGTERM also restores
    3. Add idempotent restore() (safe to call twice)
  • Verification: Test that throws mid-suite and asserts globalThis.fetch === originalFetch after.
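The lifecycle above can be sketched as follows. `createRuntime` swaps a value in a holder object rather than patching `globalThis.fetch`, purely to keep the example self-contained; all names here are hypothetical:

```typescript
interface MockRuntime {
  install(): void
  restore(): void
}

// Hypothetical runtime that swaps holder.current, standing in for a fetch patch.
function createRuntime<T>(holder: { current: T }, mock: T): MockRuntime {
  let original: T | undefined
  let installed = false
  return {
    install() {
      if (installed) return // double-install protection
      original = holder.current
      holder.current = mock
      installed = true
    },
    restore() {
      if (!installed) return // idempotent: a second call is a no-op
      holder.current = original as T
      installed = false
    },
  }
}

// restore() runs whether body() returns or throws.
function runSuiteWithMocks<T>(runtime: MockRuntime, body: () => T): T {
  runtime.install()
  try {
    return body()
  } finally {
    runtime.restore()
  }
}
```

The same `restore` function can be handed to a cleanup manager for signal handling, since calling it twice is harmless.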

P12.5: URL-Aware Target Matching (Replace Substring)

  • Status: Pending (depends on P12.1)
  • Files: src/infrastructure/outbound-mock-runtime.ts, new src/domain/url-matcher.ts
  • Problem: url.includes(target) accepts api.stripe.com.evil.example as a match for target: 'api.stripe.com'.
  • Solution:
    1. Parse target with new URL(). Match on hostname exactly + pathname prefix.
    2. Support glob patterns at path-segment boundaries: /v1/customers/* matches /v1/customers/cus_123 but not /v1/customers_evil/x
    3. Escape regex metacharacters in user-supplied targets
  • Code shape:
    export interface UrlMatcher {
      readonly hostname: string
      readonly pathPattern: RegExp
      readonly method: string
    }
    export function compileTargetPattern(target: string): UrlMatcher
    export function matchesUrl(url: string, matcher: UrlMatcher, method: string): boolean
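
One possible implementation of the shape above, offered as a sketch. It assumes targets are absolute URLs (or bare hostnames, which get an `https://` prefix), that `*` is only meaningful as a whole path segment, and that the method is passed as a second argument to `compileTargetPattern`, which the signature above does not show:

```typescript
interface UrlMatcher {
  readonly hostname: string
  readonly pathPattern: RegExp
  readonly method: string
}

// Escape characters that have special meaning inside a RegExp.
function escapeRegex(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Assumption: method is supplied separately; '*' means any method.
function compileTargetPattern(target: string, method: string = '*'): UrlMatcher {
  const parsed = new URL(target.includes('://') ? target : `https://${target}`)
  const trimmed = parsed.pathname.replace(/\/+$/, '')
  // A '*' segment matches exactly one path segment; everything else is literal.
  const body = trimmed
    .split('/')
    .map((segment) => (segment === '*' ? '[^/]+' : escapeRegex(segment)))
    .join('/')
  return {
    hostname: parsed.hostname,
    // Prefix match that may only continue at a '/' boundary.
    pathPattern: new RegExp(`^${body}(?:/.*)?$`),
    method: method.toUpperCase(),
  }
}

function matchesUrl(url: string, matcher: UrlMatcher, method: string): boolean {
  let parsed: URL
  try {
    parsed = new URL(url)
  } catch (_err) {
    return false // unparseable URLs never match
  }
  const methodOk = matcher.method === '*' || matcher.method === method.toUpperCase()
  return methodOk && parsed.hostname === matcher.hostname && matcher.pathPattern.test(parsed.pathname)
}
```

Hostnames are compared for equality after URL parsing, so a lookalike suffix such as `.evil.example` cannot satisfy the match.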
    

P12.6: Schema-Validate Mock Responses Against Contract

  • Status: Pending
  • Files: src/infrastructure/outbound-mock-runtime.ts
  • Problem: After applyEnsuresToResponse mutates the body, nothing re-validates against the response schema. A user-written ensures formula could produce a response that violates the contract it claims to uphold.
  • Solution:
    1. After applying ensures, run Ajv validation against contract.response[statusCode]
    2. If validation fails, throw a clear error pointing at the offending formula and the schema violation
    3. Cache compiled validators per contract for performance
  • Rationale: Trust but verify. The mock runtime should be self-consistent.

P12.7: Fix RNG Determinism (Eliminate Math.random() Fallbacks)

  • Status: Pending
  • Files: src/plugin/index.ts:128, src/test/petit-runner.ts:539, src/infrastructure/outbound-mock-runtime.ts:91
  • Problem: Math.floor(Math.random() * 0xFFFFFFFF) as a fallback when no seed is provided breaks reproducibility silently.
  • Solution:
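    1. When no seed is provided, derive a deterministic seed from a source that is stable across runs (e.g., stringHash(suiteName)) or accept default seed 0. Do not mix in process.pid: it changes on every run and would reintroduce the nondeterminism this item removes.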
    2. Replace seed + N patterns with hashCombine(seed, N) everywhere (consistency with petit-runner.ts:48)
    3. Document that seeds must be provided for reproducibility OR accept the default seed
  • Rationale: For a framework whose selling point is reproducibility, Math.random() anywhere in the seed chain is a bug.

P12.8: Discriminated Union for OutboundBinding (Tagged, Not Structural)

  • Status: Pending
  • Files: src/types.ts:339-360, src/test/petit-runner.ts, src/domain/contract.ts, src/domain/outbound-contracts.ts
  • Problem: Three call sites do typeof binding === 'string' ? binding : 'ref' in binding ? binding.ref : binding.name — structural narrowing that's fragile.
  • Solution:
    1. Introduce explicit tag:
      export type OutboundBinding =
        | { kind: 'ref'; name: string; chaos?: OutboundChaosConfig }
        | { kind: 'inline'; name: string; target: string; method: string; request?: ...; response: ...; chaos?: ... }
      
    2. Backward-compat: extractContract normalizes string shorthand to { kind: 'ref', name } at parse time
    3. Add helper getBindingName(binding: OutboundBinding): string — single source of truth
  • Rationale: TypeScript discriminated unions with explicit tags are refactor-safe; structural ones aren't.
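A reduced sketch of the tagged union and its helpers. The binding shapes omit the `chaos`, `request`, and `response` fields, and `normalizeBinding` and `describeBinding` are hypothetical names used only for illustration:

```typescript
// Simplified shapes (assumption): chaos/request/response fields omitted.
type OutboundBinding =
  | { readonly kind: 'ref'; readonly name: string }
  | { readonly kind: 'inline'; readonly name: string; readonly target: string; readonly method: string }

type RawBinding = string | OutboundBinding

// String shorthand is normalized once, at parse time.
function normalizeBinding(raw: RawBinding): OutboundBinding {
  return typeof raw === 'string' ? { kind: 'ref', name: raw } : raw
}

// Single source of truth for the binding name.
function getBindingName(binding: OutboundBinding): string {
  return binding.name
}

// Exhaustive switch: adding a new kind becomes a compile error here.
function describeBinding(binding: OutboundBinding): string {
  switch (binding.kind) {
    case 'ref':
      return `ref:${binding.name}`
    case 'inline':
      return `inline:${binding.method} ${binding.target}`
    default: {
      const unreachable: never = binding
      return unreachable
    }
  }
}
```

The `never` assignment in the default branch is what makes the union refactor-safe: a third variant fails to compile until every switch handles it.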

P12.9: Eliminate as unknown as Mutation of Readonly Types

  • Status: Pending
  • Files: src/test/petit-runner.ts:735-749, audit all other as unknown as casts
  • Problem: Mutating readonly TestResult.diagnostics via double-cast lies to the type system.
  • Solution:
    1. Introduce MutableTestResult for in-construction state, freeze to TestResult on push
    2. OR: use a builder pattern — TestResultBuilder accumulates diagnostics, calls .build() at the end
    3. Run grep for all as unknown as and audit each one
  • Verification: New ESLint rule: forbid as unknown as Record<string, unknown> patterns (custom rule).
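The builder option could look like this. The `TestResult` fields shown are a hypothetical subset; the real type in the codebase has more fields:

```typescript
// Hypothetical subset of the real TestResult type.
interface TestResult {
  readonly name: string
  readonly ok: boolean
  readonly diagnostics: ReadonlyArray<string>
}

class TestResultBuilder {
  private readonly name: string
  private readonly diagnostics: string[] = []

  constructor(name: string) {
    this.name = name
  }

  // Mutation happens only on the builder's private array.
  addDiagnostic(message: string): this {
    this.diagnostics.push(message)
    return this
  }

  // Produces a frozen snapshot; no cast is needed to satisfy the readonly type.
  build(ok: boolean): TestResult {
    return Object.freeze({
      name: this.name,
      ok,
      diagnostics: Object.freeze([...this.diagnostics]),
    })
  }
}
```

The built result is frozen at runtime as well as readonly at the type level, so later mutation attempts fail instead of silently succeeding.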

P12.10: Hoist Imports in petit-runner.ts

  • Status: Pending
  • Files: src/test/petit-runner.ts:264-268
  • Problem: Mid-file imports from dual-boundary-testing.js are a tell that they were tacked on later.
  • Solution: Move all imports to top of file. Pure cleanup.

P12.11: Cache Mock Response Arbitraries (Performance)

  • Status: Pending
  • Files: src/infrastructure/outbound-mock-runtime.ts
  • Problem: fc.sample(arb, ...) called inside the patched fetch on every outbound call. Builds full schema-to-arbitrary pipeline per sample.
  • Solution:
    1. Pre-compile arbitraries per contract at runtime install time
    2. Cache them on the runtime instance: Map<string, { [statusCode: number]: Arbitrary<unknown> }>
    3. Sample from cache, not rebuild
  • Verification: Benchmark 1000 outbound calls before and after. Target: 5-10x faster (to be confirmed by the benchmark).

P12.12: Property-Test Cache Invalidation on Schema Change

  • Status: Pending
  • Files: src/incremental/cache.ts, src/test/petit-runner.ts:151-196
  • Problem: generateCommands caches commands per route. After first run, the property-based aspect is gone unless the schema hash changes — fast-check can't shrink against cached examples.
  • Solution:
    1. Cache should store the seed and depth, not the resolved samples
    2. Re-sample on every run with cached seed for deterministic re-exploration
    3. Only cache the arbitrary reference (compiled), not the samples
  • Rationale: This restores property-based testing semantics. The framework's name says "property-based" — make it true.

P12.13: Strict OperationResolver Production Guard

  • Status: Pending
  • Files: src/formula/runtime.ts, src/plugin/index.ts
  • Problem: The previous(GET /users/{id}) operation resolver makes real fastify.inject() calls. In runtime: 'error' mode in production, this means every request triggers extra inject calls.
  • Solution:
    1. Disable operation resolution entirely when runtime !== 'off' and NODE_ENV === 'production'
    2. Throw at plugin boot with clear error if combination is detected
    3. Document: APOSTL previous() is for test-time only

P12.14: Documentation — Production Safety Section

  • Status: Pending
  • Files: docs/PRODUCTION_SAFETY.md (new), docs/getting-started.md
  • Content:
    1. Threat model: what runs in test, what runs in production
    2. Required env guards
    3. How to disable runtime hooks safely
    4. How to verify mocks are not active in production (health check)
    5. The APOPHIS_FORCE_PRODUCTION_DANGEROUS escape hatch and its risks

P12.15: Add Test for Production-Mode Refusal

  • Status: Pending
  • Files: src/test/production-guard.test.ts (new)
  • Coverage:
    • Plugin throws at ready() if NODE_ENV=production + chaos
    • Plugin throws at ready() if NODE_ENV=production + outbound contracts
    • Plugin throws at ready() if NODE_ENV=production + runtime: 'error'
    • Plugin allows boot with APOPHIS_FORCE_PRODUCTION_DANGEROUS=1
    • Concurrent test suites with different mocks don't cross-contaminate (P12.3)
    • Mock leak after thrown exception is impossible (P12.4)

P13: Polish from Reviews (Lower Priority, Same Sprint)

P13.1: ValidatedFormula Real Brand

  • Status: Pending
  • Files: src/types.ts:14
  • Problem: type ValidatedFormula = string is a lying type alias.
  • Solution:
    declare const ValidatedFormulaBrand: unique symbol
    export type ValidatedFormula = string & { readonly [ValidatedFormulaBrand]: true }
    export function validateFormula(s: string): ValidatedFormula { /* parse-check */ return s as ValidatedFormula }
    
  • Migration: All formula strings flow through validateFormula(). Clear error if invalid.

P13.2: Re-export ApophisExtension Type at Public Boundary

  • Status: Pending
  • Files: src/types.ts:631, src/index.ts
  • Problem: extensions?: ReadonlyArray<unknown> is unknown at the public API. The real type lives in extension/types.
  • Solution: Re-export ApophisExtension from the public index.ts and update the option type.

P13.3: Header Typing Honesty

  • Status: Pending
  • Files: src/extension/hook-validator.ts:60,75
  • Problem: request.headers as Record<string, string> loses multi-value headers.
  • Solution: Use Record<string, string | string[] | undefined> and have formula evaluator handle the union.

P13.4: O(n) Deduplication

  • Status: Pending
  • Files: src/test/petit-runner.ts:813-852
  • Problem: O(n²) duplicate count.
  • Solution: Single-pass Map<key, count>, then construct results once.
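A sketch of the single-pass count, assuming a caller-supplied key function (`dedupeWithCounts` and `keyOf` are hypothetical names; the real key in petit-runner.ts is not shown here):

```typescript
interface Deduped<T> {
  readonly item: T
  readonly count: number
}

// One pass over the input; Map preserves first-seen order of keys.
function dedupeWithCounts<T>(items: readonly T[], keyOf: (item: T) => string): Deduped<T>[] {
  const groups = new Map<string, { item: T; count: number }>()
  for (const item of items) {
    const key = keyOf(item)
    const group = groups.get(key)
    if (group) group.count += 1
    else groups.set(key, { item, count: 1 })
  }
  return Array.from(groups.values())
}
```

Each item costs one map lookup, so the total work is linear in the number of results.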

P13.5: Single Source for Field-Mapping Regex

  • Status: Pending
  • Files: src/domain/dual-boundary-testing.ts:84, src/infrastructure/outbound-mock-runtime.ts:100
  • Problem: Same request_body.X == response_body.Y regex in two places, slightly different.
  • Solution: Extract to src/domain/ensures-templates.ts. Single regex, both files import.

P13.6: Multi-Injection Queue for injectResponse

  • Status: Pending
  • Files: src/infrastructure/outbound-mock-runtime.ts
  • Problem: injectResponse is one-shot per contract. Two calls to the same dependency in one test only honor the first injection.
  • Solution: Change Map<string, InjectedResponse> to Map<string, InjectedResponse[]> (FIFO queue). Document semantics clearly.
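The FIFO semantics could be sketched like this. `InjectionQueue` and its method names are hypothetical; in the real runtime the map would be a field of the mock runtime:

```typescript
interface InjectedResponse {
  readonly statusCode: number
  readonly body: unknown
}

class InjectionQueue {
  private readonly queues = new Map<string, InjectedResponse[]>()

  // Appends to the back of the contract's queue.
  inject(contractName: string, statusCode: number, body: unknown): void {
    const queue = this.queues.get(contractName) ?? []
    queue.push({ statusCode, body })
    this.queues.set(contractName, queue)
  }

  // Removes and returns the oldest pending injection, or undefined when the
  // queue is empty (the caller then falls back to a generated response).
  take(contractName: string): InjectedResponse | undefined {
    return this.queues.get(contractName)?.shift()
  }
}
```

Two injections for the same contract are consumed in the order they were registered, one per outbound call.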

P14: API Surface Simplification — 5 Methods Only

Context: Current ApophisDecorations has 14 methods (including 3 deprecated). Reviews identified this as cognitive overload. We can achieve the same expressiveness with 5 core methods by moving configuration to options and test-only helpers to a separate namespace.

Principle: Jobs to be Done drive the API. Everything else moves to options or test utilities.

P14.1: Define the 5 Core Methods

| Method | Job to be Done | Current Equivalent |
|---|---|---|
| contract(opts?) | Test my routes with generated inputs | contract() |
| stateful(opts?) | Test stateful workflows across multiple operations | stateful() |
| check(method, path) | Validate a single route immediately | check() |
| cleanup() | Clean up resources created during tests | cleanup() |
| spec() | Export contracts as OpenAPI spec | spec() |

Removed from decorations:

  • scope — internal registry, not user-facing
  • registerPluginContracts — move to ApophisOptions.extensions
  • registerOutboundContracts — move to ApophisOptions.outboundContracts
  • enableOutboundMocks, disableOutboundMocks, getOutboundCalls — move to fastify.apophis.test.* namespace
  • capture, extend, use — already deprecated, remove entirely

P14.2: Move Configuration to Options

Before:

await fastify.register(apophis, { /* minimal */ })
fastify.apophis.registerOutboundContracts({ stripe: {...} })
fastify.apophis.registerPluginContracts('auth', {...})

After:

await fastify.register(apophis, {
  outboundContracts: { stripe: {...} },
  extensions: [authExtension],
})

Files: src/types.ts, src/plugin/index.ts, src/index.ts

P14.3: Create Test-Only Namespace

Move imperative mock controls to fastify.apophis.test.* — clearly indicating these are for test environments only:

// Only available when NODE_ENV !== 'production' OR when explicitly enabled
interface ApophisTestNamespace {
  // --- Mock lifecycle ---
  /** Enable outbound mocking. Idempotent — safe to call multiple times. */
  enableOutboundMocks(opts?: TestConfig['outboundMocks']): void
  
  /** Disable outbound mocking. Idempotent. */
  disableOutboundMocks(): void
  
  /** Reset all mock state (calls, resources, injections) without disabling. Use between tests. */
  resetMocks(): void

  // --- Mock inspection ---
  /** Get recorded outbound calls. Filter by contract name if provided. */
  getOutboundCalls(name?: string): ReadonlyArray<OutboundCallRecord>
  
  /** Get the most recent outbound call to a contract, or undefined if none. */
  getLastOutboundCall(name: string): OutboundCallRecord | undefined
  
  /** Get a stored mock resource by contract name and ID. Used to verify CRUD lifecycle. */
  getMockResource(contractName: string, id: string): unknown | undefined

  // --- Mock control ---
  /** Inject a specific response for the next call to a contract. FIFO queue if called multiple times. */
  injectResponse(contractName: string, statusCode: number, body: unknown): void
  
  /** Force a specific status code for ALL calls to a contract until cleared. */
  forceStatus(contractName: string, statusCode: number): void
  
  /** Clear forced status for a contract. */
  clearForceStatus(contractName: string): void

  // --- Reproducibility ---
  /** Get the seed used by the last test run. Use to reproduce failures. */
  getLastSeed(): number | undefined
}

Final E2E test pattern:

import assert from 'node:assert/strict'
import { test, beforeEach, afterEach } from 'node:test'

beforeEach(() => {
  fastify.apophis.test.enableOutboundMocks()
})

afterEach(() => {
  fastify.apophis.test.resetMocks()
  fastify.apophis.test.disableOutboundMocks()
})

test('handles Stripe 500 gracefully', async () => {
  fastify.apophis.test.injectResponse('stripe', 500, { error: 'temporary' })
  
  const res = await fastify.inject({ method: 'POST', url: '/charge', payload: {...} })
  
  assert.equal(res.statusCode, 503) // Our handler converts upstream 500 to 503
  
  const calls = fastify.apophis.test.getOutboundCalls('stripe')
  assert.equal(calls.length, 1)
  assert.equal(calls[0].responseStatus, 500)
})

test('CRUD lifecycle works', async () => {
  await fastify.inject({ method: 'POST', url: '/users', payload: { name: 'a' } })
  
  const lastCall = fastify.apophis.test.getLastOutboundCall('user-db')
  assert.ok(lastCall)
  
  const stored = fastify.apophis.test.getMockResource('user-db', lastCall.responseBody.id)
  assert.equal(stored.name, 'a')
})

test('reproduces failure from CI seed 12345', async () => {
  await fastify.apophis.contract({ seed: 12345 })
  // If failure happens, getLastSeed() returns 12345 for next run
})

Rationale:

  • Clear separation: core API (5 methods) vs test utilities (10 methods in test.*)
  • test.* namespace signals "not for production" without needing runtime checks
  • Can be tree-shaken in production builds
  • Each method maps 1:1 to a real E2E job

Files: src/types.ts, src/plugin/index.ts

P14.4: Update ApophisOptions Interface

Consolidate all configuration into ApophisOptions:

export interface ApophisOptions {
  // Existing
  scope?: ScopeConfig
  extensions?: ReadonlyArray<ApophisExtension>
  
  // New — moved from imperative decorations
  outboundContracts?: Record<string, OutboundContractSpec>
  
  // Existing
  invariants?: readonly string[] | false
}

Breaking: Yes. Clean cutover. Migration guide: move all register*() calls to options.

Files: src/types.ts

P14.5: Remove Deprecated Decorations

Delete from ApophisDecorations:

  • capture (v1 deprecated)
  • extend (v1 deprecated)
  • use (v1 deprecated)

Files: src/types.ts

P14.6: Remove scope from Decorations

ScopeRegistry is an internal concern. Users don't need direct access. If they need scope headers, they pass scope to contract() or stateful().

Files: src/types.ts, src/plugin/index.ts

P14.7: Update Plugin Registration to Accept All Config

Modify apophisPlugin to:

  1. Accept outboundContracts in options
  2. Register them at boot time (not via decoration)
  3. Accept extensions array and register all at boot time

Files: src/plugin/index.ts
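
As a rough illustration of boot-time registration, the sketch below resolves everything from options in one pass. It is written under assumptions: buildRegistries, BootOptions, and BootRegistries are hypothetical names, the baseUrl and name fields are placeholders for whatever OutboundContractSpec and ApophisExtension actually contain in src/types.ts, and the duplicate-extension check is an assumed policy rather than something this plan specifies.

```typescript
// Stand-in types; the real OutboundContractSpec and ApophisExtension live in src/types.ts.
interface OutboundContractSpec { baseUrl: string } // baseUrl is an assumed field
interface ApophisExtension { name: string } // name is an assumed field

interface BootOptions {
  outboundContracts?: Record<string, OutboundContractSpec>
  extensions?: ReadonlyArray<ApophisExtension>
}

interface BootRegistries {
  contracts: Map<string, OutboundContractSpec>
  extensions: ApophisExtension[]
}

// Hypothetical helper: resolve all configuration once, at plugin boot,
// so nothing needs to be registered later through a decoration.
function buildRegistries(options: BootOptions): BootRegistries {
  const contracts = new Map<string, OutboundContractSpec>()
  const declared = options.outboundContracts ?? {}
  for (const name of Object.keys(declared)) {
    const spec = declared[name]
    if (spec !== undefined) contracts.set(name, spec)
  }

  const extensions: ApophisExtension[] = []
  const seen = new Set<string>()
  for (const ext of options.extensions ?? []) {
    // Assumed policy: fail fast at boot instead of silently overriding an earlier extension.
    if (seen.has(ext.name)) throw new Error(`Duplicate extension: ${ext.name}`)
    seen.add(ext.name)
    extensions.push(ext)
  }

  return { contracts, extensions }
}
```

Failing at boot keeps misconfiguration out of test runs, which is the main benefit of moving registration out of imperative decorations.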

P14.8: Update Documentation

  • Update docs/getting-started.md with new 5-method API
  • Migration guide: "Moving from v2.4 to v2.5"
  • Update all examples to use options-based configuration

Files: docs/getting-started.md, docs/MIGRATION_v2.5.md (new)

P14.9: Add Type Tests for API Surface

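Ensure TypeScript enforces the API surface (5 core methods plus the test namespace):

// src/types/api-surface.test.ts (type tests only)
// ApophisDecorations is the interface defined in src/types.ts
type ExpectedKeys = 'contract' | 'stateful' | 'check' | 'cleanup' | 'spec' | 'test'
type ActualKeys = keyof ApophisDecorations
// Check both directions so that extra keys and missing keys both fail the build
type Assert = [ActualKeys] extends [ExpectedKeys]
  ? ([ExpectedKeys] extends [ActualKeys] ? true : false)
  : false
const _assert: Assert = true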

Files: src/types/api-surface.test.ts

P14.10: Deprecation Warnings for v2.4 API

An earlier draft kept the old methods in v2.5.0 with deprecation warnings and removed them in v3.0. That option is rejected: per the philosophy, this is a clean cutover, so the v2.4 methods are removed in v2.5 with no deprecation period.


Updated Execution Order

Batch 7 (Production Safety — HIGHEST PRIORITY)

  • P12.1: undici MockAgent
  • P12.2: Production refusal at onReady
  • P12.3: AsyncLocalStorage scoping
  • P12.4: try/finally cleanup
  • P12.5: URL-aware matching

Batch 8 (Production Safety — Continuation)

  • P12.6: Schema-validate mock responses
  • P12.7: RNG determinism fixes
  • P12.13: Operation resolver production guard
  • P12.14: Production safety docs
  • P12.15: Production guard tests

Batch 9 (API Simplification — PARALLEL with Batch 8)

  • P14.1: Define 5 core methods
  • P14.2: Move config to options
  • P14.3: Create test namespace
  • P14.4: Update ApophisOptions
  • P14.5: Remove deprecated decorations
  • P14.6: Remove scope decoration
  • P14.7: Update plugin registration
  • P14.8: Update documentation
  • P14.9: Add type tests

Batch 10 (Polish — Parallel)

  • P13.*: All review polish items
  • P12.8-P12.12: Remaining hardening items

Final API (v2.5 Target)

// Registration — all config up front
await fastify.register(apophis, {
  outboundContracts: { stripe: {...} },
  extensions: [authExtension],
})

// Core API — 5 methods
const contractSuite = await fastify.apophis.contract({ depth: 'standard' })
const statefulSuite = await fastify.apophis.stateful({ depth: 'deep' })
const result = await fastify.apophis.check('POST', '/users')
const cleaned = await fastify.apophis.cleanup()
const spec = fastify.apophis.spec()

// Test utilities — separate namespace (10 methods for E2E)
fastify.apophis.test.enableOutboundMocks()
fastify.apophis.test.resetMocks()
fastify.apophis.test.disableOutboundMocks()

const calls = fastify.apophis.test.getOutboundCalls('stripe')
const last = fastify.apophis.test.getLastOutboundCall('stripe')
const resource = fastify.apophis.test.getMockResource('user-db', '123')

fastify.apophis.test.injectResponse('stripe', 500, { error: 'down' })
fastify.apophis.test.forceStatus('stripe', 503)
fastify.apophis.test.clearForceStatus('stripe')

const seed = fastify.apophis.test.getLastSeed()

Total surface: 5 core + 10 test = 15 methods (up from 14, but now split into a small core API and a separate test namespace).

Cognitive load: Low. Core API is 5 methods. Test namespace is comprehensive for E2E. Each maps 1:1 to a Job to be Done.


P15: Triple-Boundary Property Testing (Chaos in Arbitraries)

Context: Currently, chaos events are applied as side-effects via chaosEngine.executeWithChaos() inside the property test. This means fast-check shrinks the request and dependency responses, but chaos events themselves are not part of the shrinking process. If a failure only happens with a specific chaos pattern (e.g., "outbound corruption truncates response after 'id' field"), fast-check cannot find the minimal chaos pattern.

Solution: Move chaos generation INTO fast-check arbitraries. Generate request + dependency responses + chaos events together as a single tuple. fast-check then shrinks all three dimensions simultaneously.

Outcome: True triple-boundary property testing — when a test fails, the counterexample is minimal across all three boundaries.
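
To see why chaos must be data rather than a side effect, consider a toy shrinker. This is a hand-rolled greedy loop, not fast-check's shrinking algorithm, and the Command and ChaosEvent shapes are simplified, hypothetical stand-ins for the types in src/domain/triple-boundary-testing.ts. The point is only that once chaos events are part of the generated value, a shrinker can remove them one by one and keep just those the failure depends on.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ChaosEvent {
  target: string
  kind: 'truncate' | 'malformed' | 'field-corrupt'
}

interface Command {
  request: { method: string; url: string }
  chaos: ChaosEvent[]
}

// Greedy shrink: drop one chaos event at a time while the failure still reproduces.
// A toy stand-in for what a property-testing library does during shrinking.
function shrinkChaos(failing: Command, stillFails: (c: Command) => boolean): Command {
  let current = failing
  let progressed = true
  while (progressed) {
    progressed = false
    for (let i = 0; i < current.chaos.length; i++) {
      const candidate: Command = {
        ...current,
        chaos: current.chaos.filter((_, idx) => idx !== i),
      }
      if (stillFails(candidate)) {
        // The removed event was not needed to reproduce the failure; restart from the smaller command.
        current = candidate
        progressed = true
        break
      }
    }
  }
  return current
}
```

If the failure does not depend on chaos at all, this loop ends with an empty chaos list, which is the "shrinks toward no chaos" behaviour the triple-boundary design is after.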

P15.1: Implement Triple-Boundary Arbitrary

  • Status: Complete (file created)
  • File: src/domain/triple-boundary-testing.ts
  • Implemented:
    • ChaosEventSample type (chaos events as data, not side effects)
    • TripleBoundaryCommand (request + deps + chaos)
    • createTripleBoundaryArbitrary(route, contracts, chaosConfig) — generates all three together
    • createChaosEventArbitrary — generates chaos events conditioned on route + contracts
    • applyChaosToDependencyResponse — applies generated chaos to mock responses (truncate, malformed, field-corrupt)
    • applyChaosToAllResponses — applies chaos to all dependency responses
    • formatTripleBoundaryCounterexample — diagnostic output

P15.2: Add Outbound Response Body Corruption

  • Status: Complete (in P15.1)
  • Strategies:
    • truncate — Remove last field from response body (simulates partial response)
    • malformed — Replace body with invalid JSON (simulates network/serialization failure)
    • field-corrupt — Set a specific field to null (simulates bad data from upstream)
  • Rationale: These are real failure modes from production: partial responses from CDN failures, malformed JSON from broken proxies, null fields from deprecated upstream APIs.
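
The three strategies can be sketched as a pure function over a response body. This is an illustration under assumptions: corruptBody and CorruptionStrategy are hypothetical names, not the signature of applyChaosToDependencyResponse, and "malformed" is modelled here by cutting the serialized JSON one character short.

```typescript
// Hypothetical strategy type mirroring the three strategies listed above.
type CorruptionStrategy =
  | { kind: 'truncate' }
  | { kind: 'malformed' }
  | { kind: 'field-corrupt'; field: string }

type JsonObject = Record<string, unknown>

// Returns a new body and never mutates the input.
// The return type is a union because "malformed" yields a raw string, not an object.
function corruptBody(body: JsonObject, strategy: CorruptionStrategy): JsonObject | string {
  switch (strategy.kind) {
    case 'truncate': {
      // Drop the last field (in insertion order) to simulate a partial response.
      const truncated: JsonObject = {}
      for (const key of Object.keys(body).slice(0, -1)) {
        truncated[key] = body[key]
      }
      return truncated
    }
    case 'malformed':
      // Remove the closing brace so that JSON.parse throws on the result.
      return JSON.stringify(body).slice(0, -1)
    case 'field-corrupt':
      // Null out one field to simulate bad data from upstream.
      return { ...body, [strategy.field]: null }
  }
}
```

Making the string case visible in the return type avoids the problem noted for makeInvalidJson in P0.3, where the body type changed from object to string silently.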

P15.3: Wire Triple-Boundary into Petit Runner

  • Status: Pending
  • Files: src/test/petit-runner.ts
  • Changes:
    1. Replace runDualBoundaryPropertyTest with runTripleBoundaryPropertyTest
    2. Pass chaosConfig into the new function
    3. Inside fc.asyncProperty:
      • Apply chaos events to dependency responses BEFORE injecting into mock runtime
      • Apply inbound chaos events via chaosEngine.applyChaosEvents(fn, chaosEvents, ...)
    4. Switch the runner from chaosEngine.executeWithChaos to chaosEngine.applyChaosEvents (added in P15.4), so chaos events are passed in as data instead of being generated by the engine
  • API change:
    // OLD: chaos generated internally
    chaosEngine.executeWithChaos(fn, route, request, extensionRegistry)
    
    // NEW: chaos events passed as data
    chaosEngine.applyChaosEvents(fn, chaosEvents, route, request, extensionRegistry)
    

P15.4: Refactor Chaos Engine to Accept Pre-Generated Events

  • Status: Pending
  • Files: src/quality/chaos-v2.ts
  • Problem: EnhancedChaosEngine.executeWithChaos() currently rolls its own dice with Math.random(). For triple-boundary testing, chaos must be deterministic and shrinkable.
  • Solution:
    1. Add applyChaosEvents(fn, events, ...) method that takes pre-generated events
    2. Keep executeWithChaos(fn, ...) as the example-mode entry point (single-boundary mode)
    3. Internal logic: executeWithChaos becomes applyChaosEvents(fn, generateChaosEvents(rng), ...)
  • Rationale: Same engine, two entry points. Property mode uses pre-generated events; example mode rolls dice internally.
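
The two entry points might look roughly like the sketch below. It is written under assumptions: ChaosEngineSketch, the ChaosEvent and HandlerResult shapes, and the probability parameter are hypothetical, the methods are synchronous for brevity where the real engine is presumably async, and mulberry32 is a small public-domain seeded PRNG used as a stand-in for the engine's RNG.

```typescript
// Hypothetical, simplified shapes for illustration only.
type ChaosEvent =
  | { kind: 'force-status'; statusCode: number }
  | { kind: 'drop-body' }
interface HandlerResult { statusCode: number; body: unknown }
type Rng = () => number // returns a float in [0, 1)

// mulberry32: a small seeded PRNG standing in for the engine's seeded RNG.
function mulberry32(seed: number): Rng {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Dice rolling is isolated here and draws only from the injected RNG, never Math.random().
function generateChaosEvents(rng: Rng, probability: number): ChaosEvent[] {
  const events: ChaosEvent[] = []
  if (rng() < probability) events.push({ kind: 'force-status', statusCode: 503 })
  if (rng() < probability) events.push({ kind: 'drop-body' })
  return events
}

class ChaosEngineSketch {
  private readonly rng: Rng
  private readonly probability: number

  constructor(rng: Rng, probability: number) {
    this.rng = rng
    this.probability = probability
  }

  // Property mode: events arrive as data, so the caller's shrinker controls them.
  applyChaosEvents(fn: () => HandlerResult, events: readonly ChaosEvent[]): HandlerResult {
    let result = fn()
    for (const event of events) {
      result =
        event.kind === 'force-status'
          ? { ...result, statusCode: event.statusCode }
          : { ...result, body: null }
    }
    return result
  }

  // Example mode: generate events from the seeded RNG, then delegate.
  executeWithChaos(fn: () => HandlerResult): HandlerResult {
    return this.applyChaosEvents(fn, generateChaosEvents(this.rng, this.probability))
  }
}
```

Because executeWithChaos only generates events and delegates, both modes share one code path for applying chaos, and the same seed yields the same sequence of events.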

P15.5: Update Mock Runtime to Apply Outbound Corruption

  • Status: Pending
  • Files: src/infrastructure/outbound-mock-runtime.ts
  • Changes:
    1. Add injectCorruptedResponse(contractName, statusCode, body, corruption) method
    2. When triple-boundary test runs, it calls applyChaosToDependencyResponse then injectResponse with the corrupted body
    3. The mock returns the corrupted body to the route handler

P15.6: Add Tests for Triple-Boundary Shrinking

  • Status: Pending
  • File: src/test/triple-boundary.test.ts (new)
  • Coverage:
    • Triple-boundary arbitrary generates valid commands
    • Chaos events shrink toward 'no chaos' when not the cause
    • Outbound corruption strategies work (truncate/malformed/field-corrupt)
    • Multi-dependency chaos isolates to specific contract
    • Counterexample format includes all three boundaries
    • Failure boundary detection (request vs dependency vs chaos)

P15.7: Update Diagnostics

  • Status: Pending
  • Files: src/test/petit-runner.ts, src/domain/triple-boundary-testing.ts
  • Changes:
    • Failure result includes failureBoundary: 'request' | 'dependency' | 'chaos' | 'combination'
    • Counterexample output shows minimal request, minimal dep responses, minimal chaos events
    • Stack trace + APOSTL formula context preserved
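
One possible way to compute failureBoundary is to neutralize one boundary at a time and check whether the failure disappears. The sketch below is a hypothetical heuristic, not the planned implementation: detectFailureBoundary, the Triple shape, and the idea of a known-good baseline for each boundary are all assumptions made for illustration.

```typescript
type FailureBoundary = 'request' | 'dependency' | 'chaos' | 'combination'

// Hypothetical container for the three generated dimensions of one test case.
interface Triple<R, D, C> {
  request: R
  deps: D
  chaos: C
}

// Swap each boundary for its known-good baseline in turn and re-run the check.
// If exactly one swap makes the failure go away, that boundary is reported;
// zero or several such swaps are reported as 'combination'.
function detectFailureBoundary<R, D, C>(
  failing: Triple<R, D, C>,
  baseline: Triple<R, D, C>,
  fails: (t: Triple<R, D, C>) => boolean
): FailureBoundary {
  const fixedBy: FailureBoundary[] = []
  if (!fails({ ...failing, request: baseline.request })) fixedBy.push('request')
  if (!fails({ ...failing, deps: baseline.deps })) fixedBy.push('dependency')
  if (!fails({ ...failing, chaos: baseline.chaos })) fixedBy.push('chaos')
  const [only] = fixedBy
  return fixedBy.length === 1 && only !== undefined ? only : 'combination'
}
```

This costs three extra executions per failure, which is usually acceptable because it only runs once, on the already-shrunk counterexample.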

P15.8: Documentation

  • Status: Pending
  • Files: docs/TRIPLE_BOUNDARY_TESTING.md (new), docs/getting-started.md
  • Content:
    • Why triple-boundary > dual-boundary
    • Real-world examples: corruption from CDN, partial responses, malformed JSON
    • How to read a triple-boundary counterexample
    • When to use property mode vs example mode

Updated Execution Order

Batch 7 (Production Safety — HIGHEST PRIORITY)

  • P12.1: undici MockAgent
  • P12.2: Production refusal at onReady
  • P12.3: AsyncLocalStorage scoping
  • P12.4: try/finally cleanup
  • P12.5: URL-aware matching

Batch 8 (Production Safety — Continuation)

  • P12.6: Schema-validate mock responses
  • P12.7: RNG determinism fixes
  • P12.13: Operation resolver production guard
  • P12.14: Production safety docs
  • P12.15: Production guard tests

Batch 9 (Polish — Parallel with Batch 8)

  • P12.8: Discriminated union for OutboundBinding
  • P12.9: Remove as unknown as casts
  • P12.10: Hoist imports
  • P12.11: Cache mock arbitraries
  • P12.12: Cache invalidation for property tests
  • P13.*: All review polish items

Updated Metrics

| Metric | v2.4 | v2.5 Target |
| --- | --- | --- |
| Tests passing | 522 | 540+ |
| globalThis.* mutations | 1 | 0 |
| Production-unsafe boot paths | 3 | 0 |
| Concurrent suite safety | No | Yes |
| Mock leak on throw | Possible | Impossible |
| Math.random() in seeded paths | 3 | 0 |
| Schema-validated mock responses | No | Yes |
| Structural type narrowing sites | 3+ | 0 |
| undici-based outbound mocking | No | Yes |
| Production safety docs | None | Complete |

Execution Order (Parallel Batches)

Batch 1 (Independent, Parallel)

  • P0: Kill dead code
  • P2: Rename transport → body
  • P7: Fix truncateJson RNG
  • P8: Fix assertTestEnv

Batch 2 (Depends on Batch 1)

  • P1: Unify config types
  • P3: Fix strategy mapping

Batch 3 (Depends on Batch 2)

  • P4: Wire outbound interceptor
  • P5: RNG forking
  • P6: Blast radius cap

Batch 4 (Documentation, always parallel)

  • P9: All docs updates

Batch 5 (Deferred)

  • P10: Bug #3, CI/CD, mutation testing

Batch 6 (Next Major Cut)

  • P11: Contract-driven outbound mocks and dual-boundary property testing

Metrics

| Metric | v2.2 | v2.3 Target |
| --- | --- | --- |
| Tests passing | 505 | 505+ |
| Config types | 4 | 1 |
| Dead code files | 3+ | 0 |
| Unreachable event types | 2 | 0 |
| Outbound chaos wired | No | Yes |
| Transport naming honest | No | Yes |
| Docs cover chaos | Partial | Complete |

Reference

  • Previous Steps: NEXT_STEPS_426.md
  • Arbiter Feedback: FEEDBACK_FROM_ARBITER.md
  • Chaos Spec: docs/chaos-v2.md
  • Outbound Mocking Spec: docs/OUTBOUND_CONTRACT_MOCKING_SPEC.md
  • Plugin Contracts: docs/PLUGIN_CONTRACTS_SPEC.md