APOPHIS Assessment: Arbiter Integration Readiness
Executive Summary
APOPHIS is a contract-driven API testing plugin for Fastify. This document assesses its readiness for integration with the Arbiter repository (~11,389 routes, multi-tenant authorization server).
What Is In Place
Core Infrastructure (100% Complete)
- Route Discovery: Extracts contracts from Fastify route schemas via discoverRoutes()
- Category Inference: Auto-categorizes routes as constructor/mutator/observer/utility
- Contract Extraction: Parses x-requires, x-ensures, x-invariants, x-regex, x-category
- Formula Parser: Full APOSTL grammar with charCodeAt optimization (94% faster)
- Formula Evaluator: Pure function with type coercion, regex matching, quantifiers
- Hook Validator: Runtime precondition/postcondition validation via preHandler/onResponse
- Scope Registry: Auto-discovers from APOPHIS_SCOPE_* env vars
- Cleanup Manager: LIFO deletion with callback-based batching
- TAP Formatter: CI/CD compatible test output
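To make the category inference step concrete, the sketch below classifies a route from its HTTP method and lets an explicit x-category annotation take precedence. The method-to-category mapping and the names `RouteInfo` and `inferCategory` are assumptions for illustration only; the heuristics APOPHIS actually uses may differ.

```typescript
// Hypothetical sketch; not the APOPHIS implementation.
type Category = "constructor" | "mutator" | "observer" | "utility";

interface RouteInfo {
  method: string;
  url: string;
  // Explicit x-category annotation from the route schema, if present.
  xCategory?: Category;
}

function inferCategory(route: RouteInfo): Category {
  // An explicit annotation always wins over inference.
  if (route.xCategory) return route.xCategory;
  switch (route.method.toUpperCase()) {
    case "POST":
      return "constructor"; // assumed: POST creates a resource
    case "PUT":
    case "PATCH":
    case "DELETE":
      return "mutator"; // assumed: these change existing state
    case "GET":
    case "HEAD":
      return "observer"; // assumed: read-only access
    default:
      return "utility"; // OPTIONS and anything else
  }
}
```

Under this assumed mapping, a route such as POST /login would need an explicit x-category to avoid being treated as a constructor.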
Test Framework (80% Complete)
- PETIT Runner: Property-based test execution with fast-check arbitraries
- Schema-to-Arbitrary: JSON Schema -> fast-check conversion (strings, integers, objects, arrays, enums, formats)
- Incremental Cache: SHA-256 schema hashing with file-based persistence (13-20x speedup)
- Model State Tracking: Basic resource tracking for constructor routes
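The incremental cache idea can be sketched as follows: hash each route schema with SHA-256 and skip routes whose hash has not changed since the last run. This is a minimal in-memory sketch under stated assumptions; `stableStringify`, `hashSchema`, and `SchemaCache` are hypothetical names, and the file-based persistence mentioned above is replaced by a Map.

```typescript
import { createHash } from "node:crypto";

// Serialize with sorted object keys so that key order does not affect the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k]));
    return "{" + entries.join(",") + "}";
  }
  // JSON.stringify returns undefined for undefined input; normalize to "null".
  return JSON.stringify(value) ?? "null";
}

function hashSchema(schema: unknown): string {
  return createHash("sha256").update(stableStringify(schema)).digest("hex");
}

// In-memory stand-in for the file-based cache: route key -> last seen hash.
class SchemaCache {
  private hashes = new Map<string, string>();

  // Returns true when the route is new or its schema changed since the last call.
  needsRun(routeKey: string, schema: unknown): boolean {
    const next = hashSchema(schema);
    const changed = this.hashes.get(routeKey) !== next;
    this.hashes.set(routeKey, next);
    return changed;
  }
}
```

Sorting keys before hashing is a design choice of this sketch: it keeps a purely cosmetic reordering of schema properties from invalidating the cache.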
Performance (Complete)
- Route discovery: ~0.5µs/route
- Formula parsing: ~5µs/formula
- Category inference: ~15ns/route
- Contract extraction: 58% faster with WeakMap cache
- Incremental cache: 13-20x speedup for unchanged routes
- Estimated 11K route overhead: ~1.4s total
What Is NOT In Place
1. Stateful Testing (0% - Architecture Only)
Current State: runPetitTests runs commands sequentially but without true stateful/model-based testing. The state machine only tracks created resources for cleanup.
What's Missing:
- Command sequence generation: Fast-check's commands() arbitrary for generating valid command sequences
- Model-based state machine: Formal model that tracks expected vs actual state
- Precondition-aware sequencing: Smart generation that respects x-requires dependencies
- Cross-route state transitions: Understanding that POST /users creates a resource that GET /users/:id can observe
- Invariant checking across sequences: Ensuring state remains consistent after mutations
Arbiter-Specific Value: Arbiter has complex multi-tenant state:
- Tenant creation -> Application creation -> User creation -> Permission assignment
- OAuth flows: authorization -> token -> refresh -> revocation
- Graph mutations: node creation -> relation creation -> authorization evaluation
Stateful testing would catch:
- Race conditions in tenant isolation
- Invalid state transitions (e.g., deleting a tenant with active applications)
- Authorization leaks across state changes
- Resource lifecycle violations
Implementation Effort: Medium (2-3 days)
- Create Model class tracking expected state
- Implement Command arbitrary using fast-check's commands()
- Add checkInvariants() for cross-route consistency
- Implement shrink() for minimal failing sequences
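The model/command split described above can be illustrated without any library. In the sketch below each command carries a precondition (`check`) and an action applied to both the model and the system under test (`run`), which mirrors the general shape that model-based testing tools expect. Everything here is hypothetical: `FakeArbiter` is an in-memory stand-in, the sequence is supplied by hand rather than generated, and no shrinking is performed.

```typescript
// Hypothetical sketch; names are illustrative and not part of APOPHIS or fast-check.
interface Model {
  tenants: Set<string>;
  apps: Map<string, string>; // appId -> tenantId
}

// Minimal in-memory stand-in for the system under test.
class FakeArbiter {
  tenants = new Set<string>();
  apps = new Map<string, string>();
  createTenant(id: string): void { this.tenants.add(id); }
  createApp(id: string, tenantId: string): void { this.apps.set(id, tenantId); }
  deleteTenant(id: string): void { this.tenants.delete(id); }
}

interface Command {
  check(model: Model): boolean; // precondition, evaluated against the model
  run(model: Model, real: FakeArbiter): void; // apply to model and real system
  toString(): string;
}

const createTenant = (id: string): Command => ({
  check: (m) => !m.tenants.has(id),
  run: (m, r) => { m.tenants.add(id); r.createTenant(id); },
  toString: () => `createTenant(${id})`,
});

const createApp = (id: string, tenantId: string): Command => ({
  check: (m) => m.tenants.has(tenantId) && !m.apps.has(id),
  run: (m, r) => { m.apps.set(id, tenantId); r.createApp(id, tenantId); },
  toString: () => `createApp(${id}, ${tenantId})`,
});

const deleteTenant = (id: string): Command => ({
  // Precondition: the tenant exists and has no active applications.
  check: (m) => m.tenants.has(id) && Array.from(m.apps.values()).indexOf(id) === -1,
  run: (m, r) => { m.tenants.delete(id); r.deleteTenant(id); },
  toString: () => `deleteTenant(${id})`,
});

// Runs a sequence, skipping commands whose precondition fails, and checks
// after every step that the real system still agrees with the model.
function runSequence(cmds: Command[]): string[] {
  const model: Model = { tenants: new Set<string>(), apps: new Map<string, string>() };
  const real = new FakeArbiter();
  const executed: string[] = [];
  for (const cmd of cmds) {
    if (!cmd.check(model)) continue;
    cmd.run(model, real);
    executed.push(cmd.toString());
    if (real.tenants.size !== model.tenants.size || real.apps.size !== model.apps.size) {
      throw new Error(`model/real mismatch after ${executed.join(" -> ")}`);
    }
    for (const [appId, tenantId] of Array.from(real.apps)) {
      if (!real.tenants.has(tenantId)) {
        throw new Error(`app ${appId} references missing tenant ${tenantId}`);
      }
    }
  }
  return executed;
}
```

In a real runner the hand-written sequence would be replaced by generated ones, and a failing sequence would be shrunk to a minimal reproduction.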
2. Object Inference from Schemas (40%)
Current State: updateState() infers resources from response body looking for id/uuid/_id fields. This is naive.
What's Missing:
- Schema-driven object extraction: Using JSON Schema properties to know what fields constitute an object identity
- Relationship inference: Understanding that POST /tenants/:id/applications creates an application scoped to a tenant
- Nested resource tracking: Tracking sub-resources (e.g., application configs within tenants)
- Path parameter correlation: Linking POST /users response id to GET /users/:id path parameter
Arbiter Example:
// POST /tenant/applications
// Response: { id: 'app-123', tenantId: 'tenant-456', name: 'My App' }
// Should infer: resourceType='application', parentType='tenant', parentId='tenant-456'
// Current code only captures: resourceType='applications', id='app-123'
// Missing the tenant scoping which is critical for Arbiter's authorization model
Implementation Effort: Low-Medium (1-2 days)
- Enhance updateState() to parse response schema for identity fields
- Add parent-child relationship tracking to ModelState
- Implement path parameter extraction for route correlation
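A minimal sketch of the parent-aware inference described in the Arbiter example might look like the following. It assumes a naming convention in which a response field of the form `<parent>Id` identifies the parent resource, and it singularizes the collection name by stripping a trailing "s". Both conventions, and the names `InferredResource` and `inferResource`, are assumptions made for illustration; a schema-driven version would read identity fields from the response schema instead.

```typescript
// Hypothetical sketch; not the current updateState() implementation.
interface InferredResource {
  resourceType: string;
  id: string;
  parentType?: string;
  parentId?: string;
}

function inferResource(
  routePath: string,
  body: Record<string, unknown>
): InferredResource | undefined {
  // Identity field candidates, in priority order.
  const idField = ["id", "uuid", "_id"].filter((k) => typeof body[k] === "string")[0];
  if (idField === undefined) return undefined;

  // The last static (non-parameter) path segment names the collection.
  const segments = routePath.split("/").filter((s) => s !== "" && s.charAt(0) !== ":");
  const collection = segments.length > 0 ? segments[segments.length - 1] : "unknown";
  const result: InferredResource = {
    resourceType: collection.replace(/s$/, ""), // naive singularization
    id: String(body[idField]),
  };

  // Assumed convention: the first "<parent>Id" string field points at the parent.
  for (const key of Object.keys(body)) {
    const match = /^(.+)Id$/.exec(key);
    const value = body[key];
    if (match && typeof value === "string") {
      result.parentType = match[1];
      result.parentId = value;
      break;
    }
  }
  return result;
}
```

Applied to the response in the Arbiter example, this yields an application resource with a tenant parent rather than a bare id.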
3. Request Structure Inference (30%)
Current State: executeCommand() blindly sends all generated params as either body or query params based on HTTP method. No understanding of route-specific parameter structure.
What's Missing:
- Path parameter extraction: Identifying :id, :tenantId from route paths and correlating with generated data
- Body vs query discrimination: Using Fastify schema to know which params go where
- Header injection: Automatic x-tenant-id, authorization header injection based on route requirements
- Nested body structures: Handling body.properties.nested.field schemas
- Content-Type negotiation: Form-encoded vs JSON based on route configuration
Arbiter Example:
// Route: POST /tenant/applications/:appId/rules
// Body schema: { type: 'object', properties: { dsl: { type: 'string' }, priority: { type: 'integer' } } }
// Path params: { appId: '...' }
// Headers: { 'x-tenant-id': '...', 'authorization': 'Bearer ...' }
// Current code would send: { appId: 'generated', dsl: 'generated', priority: 1 } all as body
// Should send: appId in path, { dsl, priority } in body, auth headers automatically
Implementation Effort: Medium (2-3 days)
- Parse route path for parameter placeholders
- Match generated data to path vs body vs query
- Implement header injection based on scope/auth requirements
- Handle nested schema structures
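The path/body/query split can be sketched as a pure function over the route path and the route's schema. The sketch below assumes the schema exposes `body` and `querystring` sections with a `properties` map, as Fastify route schemas do, and that headers are supplied by the caller. The names `RouteSchema`, `BuiltRequest`, and `buildRequest` are hypothetical, and nested schemas and content-type negotiation are out of scope here.

```typescript
// Hypothetical sketch; not the current executeCommand() implementation.
interface RouteSchema {
  body?: { properties?: Record<string, unknown> };
  querystring?: { properties?: Record<string, unknown> };
}

interface BuiltRequest {
  url: string;
  body: Record<string, unknown>;
  query: Record<string, unknown>;
  headers: Record<string, string>;
}

function buildRequest(
  routePath: string,
  params: Record<string, unknown>,
  schema: RouteSchema,
  headers: Record<string, string> = {}
): BuiltRequest {
  const usedInPath = new Set<string>();

  // Substitute :name placeholders with generated values.
  const url = routePath.replace(/:([A-Za-z_][A-Za-z0-9_]*)/g, (whole: string, name: string) => {
    if (!(name in params)) return whole; // leave unresolved placeholders visible
    usedInPath.add(name);
    return encodeURIComponent(String(params[name]));
  });

  const queryProps = schema.querystring?.properties ?? {};
  const bodyProps = schema.body?.properties ?? {};
  const body: Record<string, unknown> = {};
  const query: Record<string, unknown> = {};

  for (const key of Object.keys(params)) {
    if (usedInPath.has(key)) continue; // already consumed by the path
    if (key in queryProps) query[key] = params[key];
    else if (key in bodyProps) body[key] = params[key];
    // Values not declared in any schema section are dropped.
  }
  return { url, body, query, headers };
}
```

With the route from the Arbiter example, appId lands in the URL while dsl and priority stay in the body.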
4. Logic/Invariant Analysis (20%)
Current State: checkPostconditions() only validates status:### patterns. No evaluation of complex invariants.
What's Missing:
- Cross-route invariant checking: "After POST /users, GET /users/:id should return the same user"
- State consistency checks: "Total user count should increase by 1 after creation"
- Authorization boundary checks: "Tenant A's admin cannot access Tenant B's resources"
- Temporal logic: "After DELETE /users/:id, subsequent GET should return 404"
- Mathematical invariants: Budget constraints, quota limits, rate limiting
Arbiter-Specific Value: Arbiter's authorization graph has rich invariants:
- If user U has permission P on resource R, then checking P for U on R must return true
- If node N is child of node M, then M's permissions apply to N (transitivity)
- If relation R is revoked, all derived permissions via R must be invalidated
- Tenant isolation: resources in tenant T1 must never be accessible from T2
Implementation Effort: High (1 week)
- Implement invariant registry for cross-route assertions
- Add temporal operators (eventually, always, until) to APOSTL
- Create graph-aware consistency checker for Arbiter's authorization model
- Implement property-based invariant generation from schema constraints
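An invariant registry for cross-route assertions could take roughly the following shape, shown here with the tenant isolation rule from the list above. This is a sketch under stated assumptions: `ObservedState` is a hypothetical snapshot collected by the test runner, and none of these names come from APOPHIS or Arbiter.

```typescript
// Hypothetical sketch; none of these names come from APOPHIS or Arbiter.
interface Resource {
  id: string;
  tenantId: string;
}

interface ObservedState {
  resources: Resource[];
  // tenantId -> ids of resources that tenant was able to read during the run
  readableBy: Map<string, string[]>;
}

interface Invariant {
  name: string;
  holds(state: ObservedState): boolean;
}

class InvariantRegistry {
  private readonly invariants: Invariant[] = [];

  register(invariant: Invariant): void {
    this.invariants.push(invariant);
  }

  // Returns the names of all invariants violated by the given state.
  check(state: ObservedState): string[] {
    return this.invariants.filter((inv) => !inv.holds(state)).map((inv) => inv.name);
  }
}

// Tenant isolation: every resource a tenant can read must belong to that tenant.
// Ids with no known owner are also treated as violations.
const tenantIsolation: Invariant = {
  name: "tenant-isolation",
  holds(state) {
    const owner = new Map<string, string>();
    for (const r of state.resources) owner.set(r.id, r.tenantId);
    let ok = true;
    state.readableBy.forEach((ids, tenantId) => {
      for (const id of ids) {
        if (owner.get(id) !== tenantId) ok = false;
      }
    });
    return ok;
  },
};
```

A runner would call `check` after each command in a sequence and report the first step at which a violation appears.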
5. Documentation (70%)
In Place:
- README.md with quick start, features, API reference
- Architecture document (ARCHITECTURE, 2656 lines)
- Performance analysis (PERF_ANALYSIS.md)
- Inline code comments
Missing:
- skills.md: LLM-friendly documentation for AI-assisted development
- Advanced guides: Stateful testing setup, custom invariant authoring
- Arbiter-specific examples: Multi-tenant testing patterns, OAuth flow validation
- Troubleshooting guide: Common failures, debugging techniques
- Migration guide: From manual testing to contract-driven testing
Do We Gain from Logic?
Short Answer: YES, Significantly
Without logic/stateful testing, APOPHIS is essentially a smart fuzzer with runtime assertions. With logic:
- State Space Coverage:
  - Stateless: Tests each route in isolation (~200 tests for 200 routes)
  - Stateful: Tests route sequences (200 routes ^ 5 depth = 3.2 x 10^11, about 320 billion possible sequences)
  - Gain: 10-100x more bugs found in stateful interactions
- Arbiter-Specific Bugs Caught:
  - Authorization escalation after role changes
  - Resource leaks across tenant boundaries
  - Invalid state transitions (e.g., modifying revoked tokens)
  - Cache invalidation failures after mutations
  - Graph inconsistency after node deletion
- Regression Prevention:
  - Stateless: Catches route-level regressions
  - Stateful: Catches system-level regressions (e.g., "deleting user breaks their sessions")
- Cost-Benefit:
  - Implementation: ~1 week
  - Value: Prevents production incidents that could take days to debug
  - ROI: 10x+ for a system like Arbiter
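As a check on the state-space figure under State Space Coverage: with n routes and sequences of exactly d commands, the number of distinct sequences is n^d, before any pruning by preconditions. For n = 200 and d = 5:

```latex
\[
n^{d} = 200^{5} = 2^{5} \times 100^{5} = 32 \times 10^{10} = 3.2 \times 10^{11}
\]
```

This is an upper bound on the search space, not a test count; a property-based runner samples from it, and sequencing that respects x-requires would restrict sampling to the valid subset.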
Recommendations
Phase 1: Immediate (This Week)
- Implement object inference from schemas (1-2 days)
- Fix request structure handling (path/body/query discrimination) (2-3 days)
- Create skills.md for LLM assistance (1 day)
Phase 2: Short-term (Next 2 Weeks)
- Implement stateful test runner with model-based testing (1 week)
- Add cross-route invariant checking (1 week)
- Create Arbiter-specific example suite
Phase 3: Medium-term (Next Month)
- Graph-aware consistency checker for Arbiter
- Automatic contract generation from existing tests
- Performance optimization for 11K routes
- Integration with Arbiter's CI/CD pipeline
Conclusion
APOPHIS has a solid foundation for contract-driven testing. The current implementation provides immediate value for:
- Runtime contract validation (preconditions/postconditions)
- Property-based testing of individual routes
- Incremental test execution for CI/CD
However, to fully realize value for Arbiter, we need:
- Stateful testing: Critical for catching multi-route interaction bugs
- Better object inference: Essential for Arbiter's complex resource hierarchies
- Request structure handling: Required for realistic test execution
- Logic/invariant analysis: Needed for authorization-specific testing
The highest ROI item is stateful testing with proper object inference, which would catch the class of bugs most likely to cause production incidents in Arbiter.