Forge Review: Architecture Module-by-Module Analysis
Deep-dive into each module's design, strengths, weaknesses, and alignment with SYSTEM-DESIGN.md.
1. Type System (src/types/index.ts — 775 lines)
Strengths
- Comprehensive union types for all domain concepts (phases, severities, agent types)
- Zod schemas paired with TypeScript types for runtime validation
- Clear separation between input/output types per phase
Weaknesses
- Duplicate error classes: ForgeError + ForgeBaseError in types collide with the richer hierarchy in src/core/errors.ts
- Two LLMClient.chat overloads with incompatible signatures create ambiguity for implementors
- 775 lines in a single file; should be split by domain (agent types, event types, tool types, config types)
Alignment Score: 7/10
Covers all concepts from SYSTEM-DESIGN.md but the duplication indicates organic growth without pruning.
2. Core Module (src/core/)
Components
- bus.ts: InMemoryEventBus (EventEmitter-based)
- config.ts: Config loader with file detection
- errors.ts: Rich error hierarchy
- types.ts: Re-export barrel
Strengths
- Error hierarchy is well-designed with details: ErrorDetails and type-safe getters
- Config loader supports multiple file formats (.ts, .js, .mjs)
- Wildcard event subscription (bus.on('*', ...)) is useful for debugging
Weaknesses
- Unbounded event array in bus.ts: this.events.push(event) never prunes, so it will leak memory in long-running processes
- Config merge bug: the breakers field always uses defaults (line 90)
- Two EventBus implementations: core/bus.ts (in-memory) and events/bus.ts (SQLite), with no adapter or factory to choose between them
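One way to address the unbounded event array is to cap the buffer and drop the oldest entries first. The sketch below is illustrative only: `SketchEvent`, `BoundedEventLog`, and the capacity of 3 are hypothetical names and values, not taken from bus.ts.

```typescript
// Hypothetical event shape; the real Forge event type is defined elsewhere.
interface SketchEvent {
  type: string;
  timestamp: number;
}

// Keeps at most `capacity` recent events, dropping the oldest first.
class BoundedEventLog<E> {
  private readonly items: E[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  push(event: E): void {
    this.items.push(event);
    // Drop from the front once the cap is exceeded so memory stays bounded.
    if (this.items.length > this.capacity) {
      this.items.shift();
    }
  }

  // Returns a copy so callers cannot mutate the internal buffer.
  snapshot(): E[] {
    return this.items.slice();
  }

  get size(): number {
    return this.items.length;
  }
}

const log = new BoundedEventLog<SketchEvent>(3);
for (let i = 1; i <= 5; i++) {
  log.push({ type: `event-${i}`, timestamp: i });
}
```

After five pushes only the three most recent events remain. A long-running process that needs full history would read it from the persistent bus rather than from this buffer.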
Alignment Score: 6/10
SYSTEM-DESIGN.md specifies SQLite-backed event bus with checkpointing. The in-memory version exists as a simpler fallback but they're not unified.
3. Database Layer (src/db/schema.ts)
Strengths
- Drizzle ORM schema covers all entities: events, agents, checkpoints, memories, patterns, runs, findings, executions
- ULID generation for IDs (sortable, distributed-safe)
- Proper relations defined between tables
- JSON columns for flexible payload storage
Weaknesses
- No migration files checked in; only drizzle.config.ts exists
- Schema defines relations but some are aspirational (e.g., the executions table is referenced but no code writes to it)
- Uses better-sqlite3 instead of bun:sqlite per CLAUDE.md preference
Alignment Score: 8/10
Good coverage of the data model from SYSTEM-DESIGN.md Section 4.
4. Agent Framework (src/agents/)
Components
- base.ts: BaseAgent with perceive/reason/act/learn loop
- planner.ts, implementer.ts, reviewer.ts, tester.ts, deployer.ts: 5 specialized agents
- index.ts: Factory + metadata
- pi-adapter.ts, pi-model-bridge.ts, pi-event-bridge.ts, pi-tool-converter.ts: pi-agent-core integration
Strengths
- BaseAgent loop faithfully implements SYSTEM-DESIGN.md Section 6 (perceive → reason → act → learn)
- Reviewer implements the designed 3-layer review (static → security → AI)
- Tester has smart risk-based test selection (low/medium/high/critical → different scopes)
- pi-agent-core adapter is well-structured with proper event bridging, safety integration, and cost tracking
- Tool definitions use Zod schemas for validation
Weaknesses
- Command injection in planner (glob via find), implementer (git branch/commit): tools shell out unsafely
- Inline tools: each agent defines its own tools instead of using the ToolRegistry. This means 30+ tools scattered across 5 files with no central inventory
- Deployer's emit() is a no-op: the entire agent runs blind to observability
- BaseAgent reflection calls the LLM on every act() success, even for trivial operations, which is expensive and wasteful
- Duplicate system prompts: PI_AGENT_PROMPTS in index.ts duplicates prompts already defined in each agent class
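The command-injection findings usually come down to building a shell string from untrusted input. A common mitigation is to build an argument array and validate anything that could be mistaken for an option. The helpers below are a sketch under that assumption; `buildFindArgs`, `isSafeBranchName`, and `buildBranchArgs` are hypothetical names, and the allow-list is deliberately stricter than what git itself accepts.

```typescript
// Builds an argv array for `find` instead of interpolating into a shell string.
function buildFindArgs(root: string, namePattern: string): string[] {
  // A root starting with "-" could be parsed by find as an expression.
  if (root.startsWith("-")) {
    throw new Error("search root must not start with '-'");
  }
  // The pattern is a single argv element, so shell metacharacters stay literal.
  return [root, "-type", "f", "-name", namePattern];
}

// Conservative allow-list for branch names (an assumption, not git's full rules).
function isSafeBranchName(name: string): boolean {
  return /^[A-Za-z0-9][A-Za-z0-9._\/-]*$/.test(name) && !name.includes("..");
}

// Builds argv for `git checkout -b <branch>` after validation.
function buildBranchArgs(branch: string): string[] {
  if (!isSafeBranchName(branch)) {
    throw new Error(`unsafe branch name: ${branch}`);
  }
  return ["checkout", "-b", branch];
}

const findArgs = buildFindArgs(".", "*.ts; rm -rf /");
const branchArgs = buildBranchArgs("feature/login-form");
```

These arrays are meant for a spawn API that takes arguments as a list and does not invoke a shell, such as Node's `execFile(file, args)` or `Bun.spawn([cmd, ...args])`. The malicious-looking pattern above is then handed to `find` as one literal `-name` argument.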
Alignment Score: 7/10
Core loop matches the design. Tool integration and safety are the main gaps.
5. Orchestrator (src/orchestrator/)
Components
- pipeline.ts: State machine with bounce-back loops
- context.ts: PipelineContext factory with defaults
- checkpoint.ts: Checkpoint persistence (InMemory + SQLite)
- beads-pipeline.ts: Alternative beads-driven pipeline
- index.ts: Module exports
Strengths
- Pipeline state machine correctly implements the bounce-back pattern from SYSTEM-DESIGN.md Section 10
- Configurable max bounces for review (3) and test (2) with clear failure on exceeded limits
- Phase input wiring properly passes outputs between phases (plan → impl → review → test → deploy)
- Checkpoint support enables pipeline resumption after failures
- Beads pipeline provides an alternative work-discovery mode that integrates with external issue tracking
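The bounce-back pattern described above can be sketched as a pure transition function. This is not the code in pipeline.ts: the phase names, the assumption that a failed review or test returns to the implement phase, and the reading of "max bounces" as "bounces allowed before failing" are all assumptions made for illustration. Only the limits (3 for review, 2 for test) come from the review.

```typescript
type Phase = "plan" | "implement" | "review" | "test" | "deploy" | "done" | "failed";

interface PipelineState {
  phase: Phase;
  reviewBounces: number;
  testBounces: number;
}

// Limits quoted in the review; how the real pipeline counts them is assumed.
const MAX_REVIEW_BOUNCES = 3;
const MAX_TEST_BOUNCES = 2;

// Returns the next state given whether the current phase passed.
function step(state: PipelineState, passed: boolean): PipelineState {
  switch (state.phase) {
    case "plan":
      return { ...state, phase: passed ? "implement" : "failed" };
    case "implement":
      return { ...state, phase: passed ? "review" : "failed" };
    case "review":
      if (passed) return { ...state, phase: "test" };
      // Assumption: a rejected review bounces back to implement.
      return state.reviewBounces >= MAX_REVIEW_BOUNCES
        ? { ...state, phase: "failed" }
        : { ...state, phase: "implement", reviewBounces: state.reviewBounces + 1 };
    case "test":
      if (passed) return { ...state, phase: "deploy" };
      // Assumption: a failed test run also bounces back to implement.
      return state.testBounces >= MAX_TEST_BOUNCES
        ? { ...state, phase: "failed" }
        : { ...state, phase: "implement", testBounces: state.testBounces + 1 };
    case "deploy":
      return { ...state, phase: passed ? "done" : "failed" };
    case "done":
    case "failed":
      return state; // Terminal states do not transition.
  }
}
```

Keeping the transition pure makes the bounce limits easy to unit-test without running any agent.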
Weaknesses
- SQL injection in SQLiteCheckpointStorage.save(): string interpolation instead of parameterized queries
- Default context is non-functional: DefaultLLMClient returns dummy strings, DefaultSafetyContext auto-approves everything
- No retry logic: if a phase throws, the entire pipeline fails with no retry
- Beads pipeline has hardcoded label-matching heuristics for phase determination
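For the SQL injection finding, the usual fix is to bind values as parameters instead of interpolating them into the statement text. The sketch below assumes a database object with the `prepare(sql).run(...params)` shape that better-sqlite3 exposes; the table name, column names, and `CheckpointRow` type are hypothetical and may not match the real schema.

```typescript
// Minimal structural types for the prepare/run subset used here.
interface StatementLike {
  run(...params: unknown[]): unknown;
}
interface DatabaseLike {
  prepare(sql: string): StatementLike;
}

// Hypothetical checkpoint shape for illustration.
interface CheckpointRow {
  id: string;
  runId: string;
  phase: string;
  state: unknown;
}

// Hypothetical table and columns; values are bound via "?" placeholders.
const INSERT_CHECKPOINT =
  "INSERT OR REPLACE INTO checkpoints (id, run_id, phase, state) VALUES (?, ?, ?, ?)";

function saveCheckpoint(db: DatabaseLike, cp: CheckpointRow): void {
  // The SQL text is constant; user-influenced values never become part of it.
  db.prepare(INSERT_CHECKPOINT).run(cp.id, cp.runId, cp.phase, JSON.stringify(cp.state));
}
```

Because the statement text never changes, a phase name containing quotes or SQL keywords is stored as data rather than executed.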
Alignment Score: 8/10
Closest module to the design spec. The state machine, bounce-backs, and checkpointing all match.
6. Tools Module (src/tools/)
Components
- index.ts: Registry, sandbox, and tool category exports
- beads.ts: 9 beads CLI wrapper tools
- beads-availability.ts: Availability check for bd CLI
Strengths
- Beads tools are well-structured: Zod schemas, proper error handling, JSON output parsing
- Tool collections (beadsPlannerTools, beadsOrchestratorTools) provide role-appropriate subsets
- Registry supports categories and metadata
Weaknesses
- Registry is populated but never consumed — agents define their own tools inline
- Async registration of beads tools creates race conditions
- Sandbox is declared but tools run unsandboxed: Bun.spawn() with no resource limits
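To illustrate what "consuming" the registry could look like, the sketch below has an agent declare the tool names it needs and resolve them from a central registry. `SketchTool`, `SketchToolRegistry`, and the tool names are hypothetical; the real ToolRegistry API in src/tools may differ.

```typescript
type ToolCategory = "fs" | "git" | "beads";

// Hypothetical tool shape for illustration.
interface SketchTool {
  name: string;
  category: ToolCategory;
  run(input: string): string;
}

class SketchToolRegistry {
  private readonly tools = new Map<string, SketchTool>();

  register(tool: SketchTool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Resolves tools by name and fails loudly on unknown names.
  resolve(names: readonly string[]): SketchTool[] {
    return names.map((name) => {
      const tool = this.tools.get(name);
      if (!tool) throw new Error(`unknown tool: ${name}`);
      return tool;
    });
  }

  byCategory(category: ToolCategory): SketchTool[] {
    return Array.from(this.tools.values()).filter((t) => t.category === category);
  }
}

const registry = new SketchToolRegistry();
registry.register({ name: "read_file", category: "fs", run: (p) => `read ${p}` });
registry.register({ name: "git_status", category: "git", run: () => "clean" });

// An agent lists the tools it needs instead of defining them inline.
const plannerTools = registry.resolve(["read_file"]);
```

With this shape the registry becomes the single inventory of tools, and a typo in an agent's tool list fails at startup instead of at call time.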
Alignment Score: 5/10
SYSTEM-DESIGN.md Section 5 specifies a tool registry with sandboxing, permissions, and timeout enforcement. Only the registry shell exists.
7. Safety Module (src/safety/)
Components
- breakers.ts: 4 circuit breakers (iteration, cost, time, error-rate)
- gates.ts: Human approval gate manager
- budget.ts: Budget tracking and enforcement
- index.ts: Module exports + SafetyManager
Strengths
- All 4 circuit breaker types from SYSTEM-DESIGN.md are implemented
- Budget tracking with per-run and per-day limits
- Human gates with configurable automation levels
- 5 TypeScript errors are minor (noUncheckedIndexedAccess issues)
Weaknesses
- 5 TS errors prevent clean compilation
- Circuit breakers are defined but only integrated through the pi-adapter path, not the BaseAgent path
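Errors from `noUncheckedIndexedAccess` typically arise because an indexed read such as `values[i]` is typed `T | undefined` under that flag. The functions below show three common ways to satisfy the checker; they are hypothetical examples, not the five failing sites in src/safety.

```typescript
// Narrow with an explicit undefined check inside a loop.
function sumCosts(costs: readonly number[]): number {
  let total = 0;
  for (let i = 0; i < costs.length; i++) {
    const cost = costs[i]; // number | undefined under noUncheckedIndexedAccess
    if (cost === undefined) continue;
    total += cost;
  }
  return total;
}

// Supply a default with ?? when a missing key has a sensible fallback.
function costFor(byModel: Record<string, number>, model: string): number {
  return byModel[model] ?? 0;
}

// Surface the possibility of "no value" in the return type.
function lastFailureAt(timestamps: readonly number[]): number | undefined {
  return timestamps[timestamps.length - 1];
}
```

Which form fits depends on whether a missing element is a bug, a normal case with a default, or something the caller should handle.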
Alignment Score: 8/10
Good coverage of SYSTEM-DESIGN.md Section 8.
8. Memory Module (src/memory/)
Components
- store.ts: MemoryStore with SQLite backend
- index.ts: MemoryManager coordinating episodic, semantic, and procedural memory
Strengths
- 3 memory types match SYSTEM-DESIGN.md (episodic, semantic/pattern, procedural)
- Confidence decay over time
- Memory consolidation pipeline
- Pattern extraction from episodic memories
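The review does not say which decay function the memory module uses. As one plausible shape, the sketch below applies exponential decay with a half-life; the formula, the function name `decayedConfidence`, and the 30-day half-life in the example are assumptions, not Forge's actual behavior.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: confidence halves every `halfLifeDays` days.
function decayedConfidence(initial: number, ageMs: number, halfLifeDays: number): number {
  if (halfLifeDays <= 0) {
    throw new RangeError("halfLifeDays must be positive");
  }
  // Negative ages (clock skew) are treated as zero so confidence never grows.
  const ageDays = Math.max(0, ageMs) / MS_PER_DAY;
  return initial * Math.pow(0.5, ageDays / halfLifeDays);
}

// A memory stored with confidence 0.8, read back 30 days later.
const afterOneHalfLife = decayedConfidence(0.8, 30 * MS_PER_DAY, 30);
```

With a 30-day half-life, a memory that started at 0.8 drops to about 0.4 after 30 days and to about 0.2 after 60.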
Weaknesses
- Similarity search is keyword-based only (no embeddings)
- No integration tests for the consolidation pipeline
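If embeddings were added to close the retrieval gap, ranking could use cosine similarity between a query vector and each stored memory. The sketch below assumes memories carry a precomputed `embedding` array; how those vectors are produced is out of scope, and `rankBySimilarity` is a hypothetical helper.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns a copy of `memories` sorted from most to least similar to `query`.
function rankBySimilarity<T extends { embedding: readonly number[] }>(
  query: readonly number[],
  memories: readonly T[],
): T[] {
  return memories
    .slice()
    .sort((m1, m2) => cosineSimilarity(query, m2.embedding) - cosineSimilarity(query, m1.embedding));
}
```

A linear scan like this is adequate for small stores; larger ones would typically need an index, which this sketch does not attempt.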
Alignment Score: 7/10
Structure matches design; quality of retrieval is the gap.
9. Events Module (src/events/bus.ts)
Strengths
- SQLite persistence with proper table creation
- Checkpoint snapshotting with phase tracking
- Event replay capability
Weaknesses
- Sort direction bug in getLatestCheckpoint(): returns oldest instead of latest
- Raw SQL instead of Drizzle ORM (the rest of the app uses Drizzle)
- Uses better-sqlite3 instead of bun:sqlite
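A "returns oldest instead of latest" bug is typically an ascending sort where a descending one was intended. The sketch below shows the intended ordering both as a query string and as an equivalent in-memory function; the table name, column names, and `CheckpointRecord` type are hypothetical.

```typescript
// Hypothetical record shape; createdAt is a millisecond timestamp.
interface CheckpointRecord {
  id: string;
  createdAt: number;
}

// DESC puts the newest row first, so LIMIT 1 returns the latest checkpoint.
const LATEST_CHECKPOINT_SQL =
  "SELECT * FROM checkpoints WHERE run_id = ? ORDER BY created_at DESC LIMIT 1";

// In-memory equivalent: sort a copy newest-first and take the first element.
function latestCheckpoint(rows: readonly CheckpointRecord[]): CheckpointRecord | undefined {
  return rows.slice().sort((a, b) => b.createdAt - a.createdAt)[0];
}

const latest = latestCheckpoint([
  { id: "cp-old", createdAt: 1 },
  { id: "cp-new", createdAt: 3 },
  { id: "cp-mid", createdAt: 2 },
]);
```

A regression test that inserts checkpoints out of order, as in the example, would catch the sort direction flipping again.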
Alignment Score: 7/10
10. CLI (src/cli/index.ts)
Strengths
- Clean commander.js setup with 4 commands
- Human gate integration via readline prompts
- Beads mode correctly delegates to real pipeline
Weaknesses
- forge run uses simulatePhase() (setTimeout) instead of the actual Pipeline class, so the main command is non-functional
- Review and test commands also use simulation
- No structured output option (everything is console.log)
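A structured output mode can be as small as one rendering function that the commands call instead of console.log. The sketch below assumes a `PhaseResult` shape and a text/JSON switch; both are hypothetical and not part of the current CLI.

```typescript
// Hypothetical per-phase result for illustration.
interface PhaseResult {
  phase: string;
  ok: boolean;
  durationMs: number;
}

type OutputFormat = "text" | "json";

// Renders results either for humans or as machine-readable JSON.
function renderResults(results: readonly PhaseResult[], format: OutputFormat): string {
  if (format === "json") {
    return JSON.stringify({ results }, null, 2);
  }
  return results
    .map((r) => `${r.ok ? "PASS" : "FAIL"} ${r.phase} (${r.durationMs} ms)`)
    .join("\n");
}

const sample: PhaseResult[] = [
  { phase: "plan", ok: true, durationMs: 12 },
  { phase: "review", ok: false, durationMs: 340 },
];
const asText = renderResults(sample, "text");
const asJson = renderResults(sample, "json");
```

In a commander-based CLI, a boolean flag (for example a hypothetical `--json` option) could select the format, keeping the command handlers free of formatting logic.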
Alignment Score: 4/10
CLI skeleton exists but doesn't wire to real pipeline execution (except beads mode).
Overall Architecture Assessment
| Dimension | Rating | Notes |
|---|---|---|
| Design Fidelity | 7/10 | Most SYSTEM-DESIGN.md concepts are represented in code |
| Code Quality | 5/10 | Command injection, SQL injection, TS errors, any types |
| Completeness | 6/10 | All modules exist but many have TODO/simulated paths |
| Test Coverage | 7/10 | 104 tests pass, but no integration tests for the full pipeline |
| Security | 3/10 | Multiple injection vectors in tools that shell out |
| Observability | 5/10 | Event bus exists but deployer is silent, no structured logging |
| Production Readiness | 3/10 | Not ready — simulated CLI, injection vulns, no real LLM integration |