
February 8, 2026

Forge Review: Architecture Module-by-Module Analysis


Deep-dive into each module's design, strengths, weaknesses, and alignment with SYSTEM-DESIGN.md.

1. Type System (`src/types/index.ts` — 775 lines)

Strengths

Comprehensive union types for all domain concepts (phases, severities, agent types)
Zod schemas paired with TypeScript types for runtime validation
Clear separation between input/output types per phase

Weaknesses

Duplicate error classes: ForgeError and ForgeBaseError in src/types/index.ts collide with the richer hierarchy in src/core/errors.ts
Two LLMClient.chat overloads with incompatible signatures create ambiguity for implementors
775 lines in a single file — should be split by domain (agent types, event types, tool types, config types)

Alignment Score: 7/10

Covers all concepts from SYSTEM-DESIGN.md but the duplication indicates organic growth without pruning.
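
One way to remove the overload ambiguity is to collapse both chat signatures into a single method that takes a request object. The sketch below is illustrative only: ChatMessage, ChatRequest, ChatResponse, and EchoClient are hypothetical names, and the actual fields in src/types/index.ts are not shown in this review and may differ.

```typescript
// Hypothetical shapes; the real Forge types may look different.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  messages: ChatMessage[];
  // Optional knobs live in one place instead of in a second overload.
  maxTokens?: number;
  temperature?: number;
}

interface ChatResponse {
  content: string;
  tokensUsed: number;
}

// A single signature: implementors have exactly one method shape to satisfy.
interface LLMClient {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Minimal stand-in implementation, used only to show the call shape.
class EchoClient implements LLMClient {
  async chat(request: ChatRequest): Promise<ChatResponse> {
    const last = request.messages[request.messages.length - 1];
    const content = last ? last.content : "";
    return { content, tokensUsed: content.length };
  }
}
```

With one request type, adding an option later is a non-breaking field addition rather than a third overload.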

2. Core Module (`src/core/`)

Components

bus.ts — InMemoryEventBus (EventEmitter-based)
config.ts — Config loader with file detection
errors.ts — Rich error hierarchy
types.ts — Re-export barrel

Strengths

Error hierarchy is well-designed with details: ErrorDetails and type-safe getters
Config loader supports multiple file formats (.ts, .js, .mjs)
Wildcard event subscription (bus.on('*', ...)) is useful for debugging

Weaknesses

Unbounded event array in bus.ts: this.events.push(event) is never pruned, so memory grows without bound in long-running processes
Config merge bug — breakers field always uses defaults (line 90)
Two EventBus implementations — core/bus.ts (in-memory) and events/bus.ts (SQLite) with no adapter or factory to choose between them
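
The unbounded array could be replaced with a capped history. The following is a minimal sketch under stated assumptions: BusEvent and BoundedEventLog are hypothetical names, and the real bus may need to keep events for replay, in which case pruning belongs in the persistent store instead.

```typescript
// Hypothetical event shape; the real bus event type is defined elsewhere.
interface BusEvent {
  type: string;
  payload: unknown;
}

// Keeps at most `capacity` recent events; older ones are dropped.
class BoundedEventLog {
  private buffer: BusEvent[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  push(event: BusEvent): void {
    this.buffer.push(event);
    // Drop the oldest entries once the cap is exceeded.
    if (this.buffer.length > this.capacity) {
      this.buffer.splice(0, this.buffer.length - this.capacity);
    }
  }

  // Returns a copy so callers cannot mutate the internal buffer.
  snapshot(): readonly BusEvent[] {
    return [...this.buffer];
  }
}

const eventLog = new BoundedEventLog(3);
for (let i = 1; i <= 5; i++) {
  eventLog.push({ type: "tick", payload: i });
}
// Only the three most recent payloads (3, 4, 5) remain.
console.log(eventLog.snapshot().map((e) => e.payload));
```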

Alignment Score: 6/10

SYSTEM-DESIGN.md specifies a SQLite-backed event bus with checkpointing. The in-memory version exists as a simpler fallback, but the two are not unified behind a common interface.
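
A shared interface plus a small factory is one way to unify the two buses. This is a sketch, not the project's API: EventBus, BusConfig, InMemoryBus, and createEventBus are hypothetical names, and the SQLite-backed bus is injected as a constructor function so the example stays self-contained.

```typescript
interface BusEvent {
  type: string;
  payload: unknown;
}

type BusHandler = (event: BusEvent) => void;

// The contract both implementations would share (hypothetical).
interface EventBus {
  emit(event: BusEvent): void;
  on(type: string, handler: BusHandler): void;
}

class InMemoryBus implements EventBus {
  private handlers = new Map<string, BusHandler[]>();

  on(type: string, handler: BusHandler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  emit(event: BusEvent): void {
    // Deliver to exact-type subscribers, then to wildcard subscribers.
    const targets = [
      ...(this.handlers.get(event.type) ?? []),
      ...(this.handlers.get("*") ?? []),
    ];
    for (const handler of targets) handler(event);
  }
}

type BusConfig = { kind: "memory" } | { kind: "sqlite"; path: string };

// Callers choose a backend through config instead of importing a concrete class.
function createEventBus(
  config: BusConfig,
  makeSqliteBus: (path: string) => EventBus,
): EventBus {
  switch (config.kind) {
    case "memory":
      return new InMemoryBus();
    case "sqlite":
      return makeSqliteBus(config.path);
  }
}
```

Agents and the orchestrator would then depend only on EventBus, so swapping the in-memory fallback for the persistent bus becomes a configuration change.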

3. Database Layer (`src/db/schema.ts`)

Strengths

Drizzle ORM schema covers all entities: events, agents, checkpoints, memories, patterns, runs, findings, executions
ULID generation for IDs (sortable, distributed-safe)
Proper relations defined between tables
JSON columns for flexible payload storage

Weaknesses

No migration files checked in — only drizzle.config.ts exists
The schema defines relations, but some are aspirational (e.g., the executions table is referenced, but no code writes to it)
Uses better-sqlite3 instead of bun:sqlite, which CLAUDE.md prefers

Alignment Score: 8/10

Good coverage of the data model from SYSTEM-DESIGN.md Section 4.
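
The sortability of ULIDs follows from their layout: a 48-bit millisecond timestamp encoded as 10 Crockford base32 characters, followed by 80 bits of randomness. The sketch below is not the generator used by the schema (the review does not show it). sketchUlid and its helpers are hypothetical, and Math.random stands in for a cryptographically secure source purely for brevity.

```typescript
// Crockford base32 alphabet (no I, L, O, U); characters are in ascending ASCII order.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a millisecond timestamp as 10 fixed-width base32 characters.
function encodeTime(ms: number): string {
  let remaining = ms;
  let out = "";
  for (let i = 0; i < 10; i++) {
    const digit = remaining % 32;
    out = CROCKFORD.charAt(digit) + out;
    remaining = (remaining - digit) / 32;
  }
  return out;
}

// 16 random characters (80 bits). Math.random is NOT cryptographically
// secure; a real generator should use a CSPRNG.
function encodeRandom(): string {
  let out = "";
  for (let i = 0; i < 16; i++) {
    out += CROCKFORD.charAt(Math.floor(Math.random() * 32));
  }
  return out;
}

function sketchUlid(ms: number = Date.now()): string {
  return encodeTime(ms) + encodeRandom();
}

const earlierId = sketchUlid(1_700_000_000_000);
const laterId = sketchUlid(1_700_000_000_001);
// The fixed-width time prefix dominates string comparison.
console.log(earlierId < laterId); // true
```

Because the prefix sorts chronologically, an index on the ID column also orders rows by creation time across different timestamps.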

4. Agent Framework (`src/agents/`)

Components

base.ts — BaseAgent with perceive/reason/act/learn loop
planner.ts, implementer.ts, reviewer.ts, tester.ts, deployer.ts — 5 specialized agents
index.ts — Factory + metadata
pi-adapter.ts, pi-model-bridge.ts, pi-event-bridge.ts, pi-tool-converter.ts — pi-agent-core integration

Strengths

BaseAgent loop faithfully implements SYSTEM-DESIGN.md Section 6 (perceive → reason → act → learn)
Reviewer implements the designed 3-layer review (static → security → AI)
Tester has smart risk-based test selection (low/medium/high/critical → different scopes)
pi-agent-core adapter is well-structured with proper event bridging, safety integration, and cost tracking
Tool definitions use Zod schemas for validation

Weaknesses

Command injection in the planner (glob via find) and the implementer (git branch/commit): both tools shell out unsafely
Inline tools — each agent defines its own tools instead of using the ToolRegistry. This means 30+ tools scattered across 5 files with no central inventory
Deployer's emit() is a no-op — entire agent runs blind to observability
BaseAgent reflection calls LLM on every act() success, even for trivial operations — expensive and wasteful
Duplicate system prompts — PI_AGENT_PROMPTS in index.ts duplicates prompts already defined in each agent class
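
The command injection risk typically comes from concatenating caller-influenced strings into a shell command. A common mitigation is to validate the input against an allow-list and pass arguments as an array, so that no shell ever parses them. The sketch below assumes a hypothetical helper, buildBranchArgs, and a deliberately conservative pattern; git itself accepts more branch names than this.

```typescript
// Conservative allow-list (an assumption, stricter than git's own rules).
// The first character must be alphanumeric, which also blocks names that
// start with "-" and would be read as command-line options.
const SAFE_BRANCH = /^[A-Za-z0-9][A-Za-z0-9._\/-]{0,99}$/;

// Returns an argv array instead of a shell string.
function buildBranchArgs(branch: string): string[] {
  if (!SAFE_BRANCH.test(branch) || branch.includes("..")) {
    throw new Error(`Unsafe branch name: ${JSON.stringify(branch)}`);
  }
  return ["git", "checkout", "-b", branch];
}

// The array form can be handed to an argv-style spawn API (for example
// Bun.spawn(argv) or Node's execFile(argv[0], argv.slice(1))), so shell
// metacharacters such as ";" or "$(...)" are never interpreted.
console.log(buildBranchArgs("feature/login-fix"));
```

The same pattern applies to the planner's glob tool: validate the pattern, then pass it as a single argument rather than interpolating it into a command line.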

Alignment Score: 7/10

Core loop matches the design. Tool integration and safety are the main gaps.
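
To make the loop and the reflection cost concrete, here is a minimal sketch of a perceive/reason/act/learn cycle that reflects only on failures or non-trivial successes. It is synchronous for brevity and everything in it is hypothetical: SketchAgent, ActionResult, the trivial flag, and the counter that stands in for an LLM reflection call are not taken from base.ts.

```typescript
interface ActionResult {
  success: boolean;
  trivial: boolean; // hypothetical flag set by the tool or the agent
}

abstract class SketchAgent<Obs, Plan> {
  reflections = 0;

  protected abstract perceive(): Obs;
  protected abstract reason(obs: Obs): Plan | null; // null means "done"
  protected abstract act(plan: Plan): ActionResult;

  // Stand-in for an expensive LLM reflection call.
  protected learn(_result: ActionResult): void {
    this.reflections++;
  }

  run(maxIterations: number): number {
    let iterations = 0;
    while (iterations < maxIterations) {
      const plan = this.reason(this.perceive());
      if (plan === null) break;
      const result = this.act(plan);
      iterations++;
      // Skip reflection for trivial successes to save LLM calls.
      if (!result.success || !result.trivial) this.learn(result);
    }
    return iterations;
  }
}

// Demo agent that replays a fixed queue of outcomes.
class DemoAgent extends SketchAgent<number, ActionResult> {
  private queue: ActionResult[] = [
    { success: true, trivial: true },
    { success: true, trivial: false },
    { success: false, trivial: true },
  ];
  protected perceive(): number {
    return this.queue.length;
  }
  protected reason(remaining: number): ActionResult | null {
    return remaining > 0 ? (this.queue[0] ?? null) : null;
  }
  protected act(plan: ActionResult): ActionResult {
    this.queue.shift();
    return plan;
  }
}

const demoAgent = new DemoAgent();
console.log(demoAgent.run(10), demoAgent.reflections); // 3 iterations, 2 reflections
```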

5. Orchestrator (`src/orchestrator/`)

Components

pipeline.ts — State machine with bounce-back loops
context.ts — PipelineContext factory with defaults
checkpoint.ts — Checkpoint persistence (InMemory + SQLite)
beads-pipeline.ts — Alternative beads-driven pipeline
index.ts — Module exports

Strengths

Pipeline state machine correctly implements the bounce-back pattern from SYSTEM-DESIGN.md Section 10
Configurable max bounces for review (3) and test (2) with clear failure on exceeded limits
Phase input wiring properly passes outputs between phases (plan → impl → review → test → deploy)
Checkpoint support enables pipeline resumption after failures
Beads pipeline provides an alternative work-discovery mode that integrates with external issue tracking

Weaknesses

SQL injection in SQLiteCheckpointStorage.save() — string interpolation instead of parameterized queries
Default context is non-functional — DefaultLLMClient returns dummy strings, DefaultSafetyContext auto-approves everything
No retry logic: if a phase throws, the entire pipeline fails
Beads pipeline has hardcoded label-matching heuristics for phase determination
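
For the SQL injection in SQLiteCheckpointStorage.save(), the usual fix is to bind values as parameters so the SQL text never contains caller data. The sketch below is an assumption-heavy illustration: SqlDriver is a hypothetical minimal interface (prepared-statement APIs in SQLite libraries offer an equivalent), and the table and column names are invented for the example.

```typescript
// Hypothetical minimal driver surface for the sketch.
interface SqlDriver {
  run(sql: string, params: readonly unknown[]): void;
}

interface CheckpointRecord {
  id: string;
  runId: string;
  phase: string;
  state: unknown;
}

function saveCheckpoint(db: SqlDriver, cp: CheckpointRecord): void {
  // Values travel as bound parameters; the SQL string is a constant.
  db.run(
    "INSERT OR REPLACE INTO checkpoints (id, run_id, phase, state) VALUES (?, ?, ?, ?)",
    [cp.id, cp.runId, cp.phase, JSON.stringify(cp.state)],
  );
}

// Test double that records calls instead of touching a database.
class RecordingDriver implements SqlDriver {
  calls: { sql: string; params: readonly unknown[] }[] = [];
  run(sql: string, params: readonly unknown[]): void {
    this.calls.push({ sql, params });
  }
}

const recorder = new RecordingDriver();
saveCheckpoint(recorder, {
  id: "cp-1",
  runId: "run-1",
  phase: "review'); DROP TABLE checkpoints;--",
  state: { step: 2 },
});
// The hostile phase string appears only in params, never in the SQL text.
console.log(recorder.calls[0]?.sql.includes("DROP")); // false
```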

Alignment Score: 8/10

Closest module to the design spec. The state machine, bounce-backs, and checkpointing all match.
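
The bounce-back behavior can be sketched as a small loop over the phase order. The limits (3 for review, 2 for test) come from the review above; the rest is assumption: runPipeline is a hypothetical function, failed review and test phases are assumed to bounce back to implementation, and a limit of N is assumed to mean N bounces are allowed before the next failure aborts.

```typescript
type Phase = "plan" | "implement" | "review" | "test" | "deploy";
type Verdict = "pass" | "fail";

const ORDER: Phase[] = ["plan", "implement", "review", "test", "deploy"];
const MAX_BOUNCES = { review: 3, test: 2 } as const;

// Runs phases in order and returns the sequence of phases visited.
function runPipeline(execute: (phase: Phase) => Verdict): Phase[] {
  const visited: Phase[] = [];
  const bounces = { review: 0, test: 0 };
  let index = 0;

  while (index < ORDER.length) {
    const phase = ORDER[index];
    if (phase === undefined) break;
    visited.push(phase);

    if (execute(phase) === "pass") {
      index++;
      continue;
    }
    // Only review and test may bounce; other failures abort immediately.
    if (phase !== "review" && phase !== "test") {
      throw new Error(`Phase ${phase} failed`);
    }
    bounces[phase]++;
    if (bounces[phase] > MAX_BOUNCES[phase]) {
      throw new Error(`${phase} exceeded ${MAX_BOUNCES[phase]} bounces`);
    }
    index = ORDER.indexOf("implement"); // bounce back to implementation
  }
  return visited;
}

let reviewAttempts = 0;
const visitedPhases = runPipeline((phase) => {
  if (phase !== "review") return "pass";
  reviewAttempts++;
  return reviewAttempts === 1 ? "fail" : "pass";
});
// plan, implement, review, implement, review, test, deploy
console.log(visitedPhases.join(" > "));
```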

6. Tools Module (`src/tools/`)

Components

index.ts — Registry, sandbox, and tool category exports
beads.ts — 9 beads CLI wrapper tools
beads-availability.ts — Availability check for bd CLI

Strengths

Beads tools are well-structured: Zod schemas, proper error handling, JSON output parsing
Tool collections (beadsPlannerTools, beadsOrchestratorTools) provide role-appropriate subsets
Registry supports categories and metadata

Weaknesses

Registry is populated but never consumed — agents define their own tools inline
Async registration of beads tools creates race conditions
Sandbox is declared but tools run unsandboxed — Bun.spawn() with no resource limits

Alignment Score: 5/10

SYSTEM-DESIGN.md Section 5 specifies a tool registry with sandboxing, permissions, and timeout enforcement. Only the registry shell exists.
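
A registry becomes useful once agents resolve tools through it rather than defining them inline. The sketch below shows one possible shape for a central inventory with role-based permissions. All names (SketchToolRegistry, ToolSpec, AgentRole, the example tool names) are hypothetical, and sandboxing and timeout enforcement are deliberately left out.

```typescript
type AgentRole = "planner" | "implementer" | "reviewer" | "tester" | "deployer";

interface ToolSpec {
  toolName: string;
  category: string;
  allowedRoles: readonly AgentRole[];
  run(input: string): string;
}

class SketchToolRegistry {
  private tools = new Map<string, ToolSpec>();

  register(tool: ToolSpec): void {
    if (this.tools.has(tool.toolName)) {
      throw new Error(`Tool already registered: ${tool.toolName}`);
    }
    this.tools.set(tool.toolName, tool);
  }

  // Look up a tool and enforce the role permission in one place.
  resolve(toolName: string, role: AgentRole): ToolSpec {
    const tool = this.tools.get(toolName);
    if (!tool) throw new Error(`Unknown tool: ${toolName}`);
    if (!tool.allowedRoles.includes(role)) {
      throw new Error(`${role} may not use ${toolName}`);
    }
    return tool;
  }

  // Central inventory: which tools a given role can see.
  listFor(role: AgentRole): string[] {
    return Array.from(this.tools.values())
      .filter((t) => t.allowedRoles.includes(role))
      .map((t) => t.toolName);
  }
}

const registry = new SketchToolRegistry();
registry.register({
  toolName: "read_file",
  category: "fs",
  allowedRoles: ["planner", "implementer", "reviewer"],
  run: (input) => `contents of ${input}`,
});
registry.register({
  toolName: "git_commit",
  category: "vcs",
  allowedRoles: ["implementer"],
  run: (input) => `committed: ${input}`,
});
console.log(registry.listFor("planner")); // only read_file
```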

7. Safety Module (`src/safety/`)

Components

breakers.ts — 4 circuit breakers (iteration, cost, time, error-rate)
gates.ts — Human approval gate manager
budget.ts — Budget tracking and enforcement
index.ts — Module exports + SafetyManager

Strengths

All 4 circuit breaker types from SYSTEM-DESIGN.md are implemented
Budget tracking with per-run and per-day limits
Human gates with configurable automation levels
5 TypeScript errors are minor (noUncheckedIndexedAccess issues)

Weaknesses

5 TS errors prevent clean compilation
Circuit breakers are defined but only integrated through the pi-adapter path, not the BaseAgent path

Alignment Score: 8/10

Good coverage of SYSTEM-DESIGN.md Section 8.
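
Closing the integration gap mostly means having both agent paths call the same check before each action. A minimal sketch of that idea, covering two of the four breaker types, follows. Breaker, IterationBreaker, CostBreaker, and assertNotTripped are hypothetical names, not the classes in breakers.ts.

```typescript
interface Breaker {
  readonly label: string;
  tripped(): boolean;
}

class IterationBreaker implements Breaker {
  readonly label = "iteration";
  private count = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }
  record(): void {
    this.count++;
  }
  tripped(): boolean {
    return this.count > this.max;
  }
}

class CostBreaker implements Breaker {
  readonly label = "cost";
  private spentUsd = 0;
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }
  add(usd: number): void {
    this.spentUsd += usd;
  }
  tripped(): boolean {
    return this.spentUsd > this.limitUsd;
  }
}

// Single entry point that any agent loop can call before acting.
function assertNotTripped(breakers: readonly Breaker[]): void {
  for (const breaker of breakers) {
    if (breaker.tripped()) {
      throw new Error(`Circuit breaker tripped: ${breaker.label}`);
    }
  }
}

const iterationGuard = new IterationBreaker(2);
const costGuard = new CostBreaker(1.0);
assertNotTripped([iterationGuard, costGuard]); // nothing recorded yet, so no throw
```

If the BaseAgent loop called a function like this at the top of each iteration, it would get the same protection as the pi-adapter path.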

8. Memory Module (`src/memory/`)

Components

store.ts — MemoryStore with SQLite backend
index.ts — MemoryManager coordinating episodic, semantic, and procedural memory

Strengths

3 memory types match SYSTEM-DESIGN.md (episodic, semantic/pattern, procedural)
Confidence decay over time
Memory consolidation pipeline
Pattern extraction from episodic memories

Weaknesses

Similarity search is keyword-based only (no embeddings)
No integration tests for the consolidation pipeline

Alignment Score: 7/10

Structure matches design; quality of retrieval is the gap.
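
The retrieval gap is easy to see with a token-overlap score. The sketch below assumes a simple Jaccard similarity over lowercase word tokens, which may not match what store.ts actually computes. keywordSimilarity and tokenize are hypothetical helpers.

```typescript
// Split text into a set of lowercase alphanumeric tokens.
function tokenize(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
  return new Set(tokens);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range [0, 1].
function keywordSimilarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  if (ta.size === 0 && tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// Same words in a different order: perfect match.
console.log(keywordSimilarity("fix login bug", "login bug fix")); // 1
// Same meaning, different words: no match at all.
console.log(keywordSimilarity("fix login bug", "repair authentication defect")); // 0
```

The second call is the case embeddings would handle: the two memories describe the same kind of work but share no tokens, so a keyword-only search never retrieves one from the other.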

9. Events Module (`src/events/bus.ts`)

Strengths

SQLite persistence with proper table creation
Checkpoint snapshotting with phase tracking
Event replay capability

Weaknesses

Sort direction bug in getLatestCheckpoint() — returns oldest instead of latest
Raw SQL instead of Drizzle ORM (the rest of the app uses Drizzle)
Uses better-sqlite3 instead of bun:sqlite

Alignment Score: 7/10
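
The sort direction bug in getLatestCheckpoint() comes down to ascending versus descending order. The sketch below reproduces the intended behavior over an in-memory array. StoredCheckpoint and its createdAt field are hypothetical, and the SQL in the comment uses an assumed column name.

```typescript
interface StoredCheckpoint {
  id: string;
  phase: string;
  createdAt: number; // epoch milliseconds
}

function getLatestCheckpoint(
  rows: readonly StoredCheckpoint[],
): StoredCheckpoint | undefined {
  // Descending order puts the newest row first. The SQL equivalent would be
  // "ORDER BY created_at DESC LIMIT 1"; using ASC here returns the oldest row.
  return [...rows].sort((a, b) => b.createdAt - a.createdAt)[0];
}

const checkpointRows: StoredCheckpoint[] = [
  { id: "cp-1", phase: "plan", createdAt: 1000 },
  { id: "cp-3", phase: "review", createdAt: 3000 },
  { id: "cp-2", phase: "implement", createdAt: 2000 },
];
console.log(getLatestCheckpoint(checkpointRows)?.id); // cp-3
```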

10. CLI (`src/cli/index.ts`)

Strengths

Clean commander.js setup with 4 commands
Human gate integration via readline prompts
Beads mode correctly delegates to real pipeline

Weaknesses

forge run uses simulatePhase() (setTimeout) instead of the actual Pipeline class — the main command is non-functional
Review and test commands also use simulation
No structured output option (everything is console.log)

Alignment Score: 4/10

The CLI skeleton exists but is not wired to real pipeline execution (except in beads mode).
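
For the structured-output gap, one low-cost approach is to route all command output through a single renderer that can switch between human-readable text and JSON. The following sketch is hypothetical throughout: RunSummary, renderSummary, and the idea of a JSON flag are illustrations, not existing Forge options.

```typescript
interface RunSummary {
  runId: string;
  phase: string;
  ok: boolean;
  durationMs: number;
}

// One renderer for both modes keeps console.log calls out of command handlers.
function renderSummary(summary: RunSummary, asJson: boolean): string {
  if (asJson) return JSON.stringify(summary);
  const outcome = summary.ok ? "ok" : "failed";
  return `[${summary.runId}] ${summary.phase}: ${outcome} (${summary.durationMs} ms)`;
}

const sampleSummary: RunSummary = {
  runId: "run-42",
  phase: "review",
  ok: true,
  durationMs: 1250,
};
console.log(renderSummary(sampleSummary, false)); // [run-42] review: ok (1250 ms)
console.log(renderSummary(sampleSummary, true));
```

A command handler would call the real pipeline, build a summary from its result, and print the rendered string, which makes the output scriptable without changing the default experience.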

Overall Architecture Assessment

| Dimension | Rating | Notes |
| --- | --- | --- |
| Design Fidelity | 7/10 | Most SYSTEM-DESIGN.md concepts are represented in code |
| Code Quality | 5/10 | Command injection, SQL injection, TS errors, `any` types |
| Completeness | 6/10 | All modules exist but many have TODO/simulated paths |
| Test Coverage | 7/10 | 104 tests pass, but no integration tests for the full pipeline |
| Security | 3/10 | Multiple injection vectors in tools that shell out |
| Observability | 5/10 | Event bus exists but deployer is silent, no structured logging |
| Production Readiness | 3/10 | Not ready: simulated CLI, injection vulns, no real LLM integration |
