February 8, 2026

Architecture Implementation Plan

Document: 02-architecture-implementation.md
Date: 2026-02-07
Status: Implementation Blueprint
Scope: Detailed implementation plan for Forge's layered architecture


Executive Summary

This document provides a concrete implementation roadmap for Forge's architecture as defined in SYSTEM-DESIGN.md Section 2. It specifies the exact order of implementation, the TypeScript interfaces that define each layer boundary, the dependency wiring strategy, and the bootstrap sequence that brings the system to life.

Key Architectural Decisions:

  • Dependency Injection via Constructor Parameters — Explicit, testable, no magic
  • Event Bus as the Nervous System — All cross-layer communication flows through events
  • Context Object Threading — Shared context propagates down through layers, never sideways
  • Layer-by-Layer Testing — Each layer can be tested in isolation with well-defined mocks
  • SQLite as Single Source of Truth — Events table is append-only, everything else is derived

1. Layer Implementation Order

Build from the foundation up. Each layer depends only on layers below it.

Week 1-2:  Foundation (Core + Memory + Tools)
  ├─ Core Layer (types, event bus, errors, config)
  ├─ Memory Layer (SQLite schema, store, recall)
  └─ Tool Layer (registry, LLM client, git, runner)

Week 3-4:  Safety + Single Agent
  ├─ Safety Layer (breakers, gates, budget tracking)
  └─ Base Agent (agent loop with one working agent)

Week 5-6:  Agent Pool
  ├─ Reviewer Agent (first vertical slice)
  ├─ Tester Agent
  ├─ Planner Agent
  └─ Implementer Agent

Week 7-8:  Orchestrator + CLI
  ├─ Orchestrator (state machine, phase transitions)
  ├─ Checkpoint system
  └─ CLI (entry point, commands, UI)

Week 9+:   Polish + Deployer
  ├─ Deployer Agent
  ├─ Consolidation (memory pruning, pattern extraction)
  └─ Production hardening

Why this order?

  1. Core First — Event bus, types, and errors are used everywhere. Build once, use forever.
  2. Memory + Tools Early — Agents need these immediately. No agent is useful without memory and tools.
  3. Safety Alongside Agents — Circuit breakers must exist before agents can run in a loop.
  4. Single Agent Proves the Loop — Build one agent end-to-end before adding more.
  5. Orchestrator Last — It only makes sense once multiple agents exist to coordinate.
  6. CLI Last — It's just a thin wrapper over the orchestrator.
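The rule that each layer depends only on layers below it can be checked mechanically. The following is a minimal sketch, not part of Forge: `LAYER_ORDER`, `isAllowedImport`, and `findViolations` are hypothetical names, the layer names come from the schedule above, and the relative ranking of peers (memory below tools, safety below agents) is an assumption.

```typescript
// Hypothetical layer ranking, lowest first. The ordering of peers is an assumption.
const LAYER_ORDER = ['core', 'memory', 'tools', 'safety', 'agents', 'orchestrator', 'cli'] as const;
type Layer = (typeof LAYER_ORDER)[number];

// A layer may import from itself or from any layer ranked below it.
function isAllowedImport(from: Layer, to: Layer): boolean {
  return LAYER_ORDER.indexOf(to) <= LAYER_ORDER.indexOf(from);
}

// Collect every import edge that points upward in the stack.
function findViolations(edges: Array<[Layer, Layer]>): Array<[Layer, Layer]> {
  return edges.filter(([from, to]) => !isAllowedImport(from, to));
}

const violations = findViolations([
  ['agents', 'tools'],        // allowed: agents sit above tools
  ['cli', 'orchestrator'],    // allowed: cli sits above orchestrator
  ['memory', 'orchestrator'], // violation: memory must not reach upward
]);
console.log(violations);
```

A check like this could run in CI over the import graph, so that an upward dependency fails the build instead of surfacing later as a circular import.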

2. CLI/API Layer

The CLI is the entry point for humans. It parses commands, validates input, and dispatches to the orchestrator.

2.1 Command Structure

typescript
// src/cli/commands.ts
export type Command =
  | { type: 'run'; task: string; options: RunOptions }
  | { type: 'review'; target: string; options: ReviewOptions }
  | { type: 'test'; filter?: string; options: TestOptions }
  | { type: 'status'; runId?: string }
  | { type: 'resume'; checkpointId: string }
  | { type: 'history'; limit?: number };

export interface RunOptions {
  autoApprove?: boolean;  // Skip human gates for low-risk changes
  dryRun?: boolean;       // Plan only, don't execute
  costBudget?: number;    // USD max
  phases?: PhaseName[];   // Run only these phases
}

2.2 Argument Parsing

Library choice: commander.js. It is simple, TypeScript-friendly, and widely used.

typescript
// src/cli/index.ts
import { Command } from 'commander';
import { runCommand, reviewCommand, testCommand, statusCommand } from './commands';

// Split a comma-separated option value into a trimmed list
const parseList = (value: string): string[] =>
  value.split(',').map((item) => item.trim()).filter(Boolean);

const program = new Command();

program
  .name('forge')
  .description('Agentic SDLC Orchestrator')
  .version('0.1.0');

program
  .command('run <task>')
  .description('Execute a task through the full pipeline')
  .option('--auto-approve', 'Skip human approval for low-risk changes')
  .option('--dry-run', 'Plan only, do not execute')
  .option('--budget <usd>', 'Maximum cost in USD', parseFloat)
  .option('--phases <list>', 'Comma-separated list of phases to run', parseList)
  .action(async (task: string, options) => {
    // commander exposes --budget as options.budget; map it onto RunOptions.costBudget
    const { budget, ...rest } = options;
    await runCommand({ type: 'run', task, options: { ...rest, costBudget: budget } });
  });

program
  .command('review <target>')
  .description('Run code review on a PR or commit')
  .option('--depth <level>', 'Review depth: shallow, normal, deep', 'normal')
  .action(async (target: string, options) => {
    await reviewCommand({ type: 'review', target, options });
  });

program
  .command('test [filter]')
  .description('Run tests with optional filter')
  .option('--watch', 'Run in watch mode')
  .action(async (filter: string | undefined, options) => {
    await testCommand({ type: 'test', filter, options });
  });

program
  .command('status [runId]')
  .description('Show status of current or specific run')
  .action(async (runId?: string) => {
    await statusCommand({ type: 'status', runId });
  });

// Action handlers are async, so use parseAsync and surface rejections
program.parseAsync(process.argv).catch((error) => {
  console.error(error);
  process.exit(1);
});

2.3 Input Validation

typescript
// src/cli/validation.ts
import { z } from 'zod';
import type { Command } from './commands';

export const RunCommandSchema = z.object({
  type: z.literal('run'),
  task: z.string().min(10, 'Task description must be at least 10 characters'),
  options: z.object({
    autoApprove: z.boolean().optional(),
    dryRun: z.boolean().optional(),
    costBudget: z.number().positive().optional(),
    phases: z
      .array(z.enum(['planning', 'implementation', 'review', 'testing', 'deployment']))
      .optional(),
  }),
});

// Only the run schema is defined so far. Once the other commands have schemas,
// combine them here, for example with z.discriminatedUnion('type', [...]).
const CommandSchema = RunCommandSchema;

export function validateCommand(command: unknown): Command {
  // Throws a ZodError listing every failed field if validation fails
  return CommandSchema.parse(command);
}

2.4 Dispatch to Orchestrator

typescript
// src/cli/commands/run.ts
import { ulid } from 'ulid';
import { ForgeOrchestrator } from '../../orchestrator';
import { loadConfig } from '../../core/config';
import { createContext } from '../../core/context';
import type { Command } from '../commands';

// The 'run' member of the Command union
type RunCommand = Extract<Command, { type: 'run' }>;

// displayResults and handleError are CLI output helpers assumed to be defined elsewhere
export async function runCommand(cmd: RunCommand): Promise<void> {
  // Load configuration
  const config = await loadConfig();

  // Create runtime context
  const ctx = await createContext(config, {
    traceId: ulid(),
    startedAt: new Date(),
    costBudget: cmd.options.costBudget ?? config.safety.costPerRun,
  });

  // Initialize orchestrator
  const orchestrator = new ForgeOrchestrator(ctx);

  // Execute pipeline
  try {
    const result = await orchestrator.run({
      task: cmd.task,
      phases: cmd.options.phases,
      dryRun: cmd.options.dryRun,
      autoApprove: cmd.options.autoApprove,
    });

    // Display results
    displayResults(result, ctx);
  } catch (error) {
    handleError(error, ctx);
    process.exit(1);
  }
}

3. Orchestrator Layer

The orchestrator is the state machine that drives the pipeline. It manages phase transitions, enforces safety controls, and coordinates agents.

3.1 State Machine Implementation

typescript
// src/orchestrator/state-machine.ts
import type { EventBus } from '../core/bus';
import { InvalidStateTransitionError } from '../core/errors';

export type PipelineState =
  | 'idle'
  | 'planning'
  | 'implementing'
  | 'reviewing'
  | 'testing'
  | 'deploying'
  | 'completed'
  | 'failed'
  | 'paused';

// Typed helper so each target set is a Set<PipelineState> rather than a Set<string>
const to = (...states: PipelineState[]) => new Set<PipelineState>(states);

export class PipelineStateMachine {
  private state: PipelineState = 'idle';
  private transitions: Map<PipelineState, Set<PipelineState>>;

  constructor(private bus: EventBus) {
    // Define valid transitions
    this.transitions = new Map<PipelineState, Set<PipelineState>>([
      ['idle', to('planning')],
      ['planning', to('implementing', 'failed', 'paused')],
      // 'testing' is reachable directly so the test-fix loop in pipeline.ts can re-run tests
      ['implementing', to('reviewing', 'testing', 'failed', 'paused')],
      ['reviewing', to('implementing', 'testing', 'failed', 'paused')], // Can bounce back
      ['testing', to('implementing', 'deploying', 'failed', 'paused')], // Can bounce back
      ['deploying', to('completed', 'failed', 'paused')],
      ['paused', to('planning', 'implementing', 'reviewing', 'testing', 'deploying')],
      ['completed', to('idle')],
      ['failed', to('idle')],
    ]);
  }

  async transition(to: PipelineState, reason: string): Promise<void> {
    const allowed = this.transitions.get(this.state);
    if (!allowed?.has(to)) {
      throw new InvalidStateTransitionError(
        `Cannot transition from ${this.state} to ${to}`
      );
    }

    const from = this.state;
    this.state = to;

    await this.bus.emit({
      type: 'pipeline.state_changed',
      source: 'orchestrator',
      traceId: this.bus.currentTraceId,
      payload: { from, to, reason },
    });
  }

  getState(): PipelineState {
    return this.state;
  }
}
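To make the transition rules concrete, here is a small dependency-free sketch that walks a path of states and reports the first illegal hop. It uses an illustrative subset of the table (pause and failure edges are left out), and `firstInvalidStep` is a hypothetical helper, not part of the state machine itself.

```typescript
type State =
  | 'idle' | 'planning' | 'implementing' | 'reviewing'
  | 'testing' | 'deploying' | 'completed';

// Illustrative subset of the transition table (no pause or failure edges).
const TRANSITIONS: Record<State, readonly State[]> = {
  idle: ['planning'],
  planning: ['implementing'],
  implementing: ['reviewing'],
  reviewing: ['implementing', 'testing'],
  testing: ['implementing', 'deploying'],
  deploying: ['completed'],
  completed: ['idle'],
};

// Returns the index of the first illegal hop in a path, or -1 if every hop is legal.
function firstInvalidStep(path: State[]): number {
  for (let i = 0; i + 1 < path.length; i++) {
    if (!TRANSITIONS[path[i]].includes(path[i + 1])) return i;
  }
  return -1;
}

// A run with one review bounce back to implementation.
const withBounce: State[] = [
  'idle', 'planning', 'implementing', 'reviewing',
  'implementing', 'reviewing', 'testing', 'deploying', 'completed',
];
// A run that tries to deploy straight from implementation.
const skipsReview: State[] = ['idle', 'planning', 'implementing', 'deploying'];

console.log(firstInvalidStep(withBounce));  // -1
console.log(firstInvalidStep(skipsReview)); // 2
```

The same idea is useful in tests: replaying the `pipeline.state_changed` events of a run through a checker like this confirms that the orchestrator never took an edge the table forbids.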

3.2 Phase Execution

typescript
// src/orchestrator/pipeline.ts
import { PipelineStateMachine } from './state-machine';
import { initializePhases } from './phases';
import { CircuitBreakerError, GuardFailedError } from '../core/errors';

// Phase output type names (PlanOutput, CodeOutput, ReviewOutput, TestOutput) and the
// 'failed' / 'paused' result statuses are assumed; this document does not define them.
export class ForgeOrchestrator {
  private stateMachine: PipelineStateMachine;
  private phases: Map<PhaseName, Phase>;

  constructor(private ctx: OrchestratorContext) {
    this.stateMachine = new PipelineStateMachine(ctx.bus);
    this.phases = initializePhases(ctx);
  }

  async run(input: PipelineInput): Promise<PipelineResult> {
    const { task, phases: requestedPhases, dryRun, autoApprove } = input;

    // Start run
    await this.ctx.bus.emit({
      type: 'run.started',
      source: 'orchestrator',
      traceId: this.ctx.traceId,
      payload: { task, requestedPhases, dryRun },
    });

    // Planning phase
    await this.stateMachine.transition('planning', 'Starting pipeline');
    const plan = await this.executePhase<PlanOutput>('planning', { task });
    await this.checkpoint('planning', plan);

    if (dryRun) {
      return { status: 'dry_run_complete', plan };
    }

    // Implementation phase
    await this.stateMachine.transition('implementing', 'Planning complete');
    let code = await this.executePhase<CodeOutput>('implementation', plan);
    await this.checkpoint('implementation', code);

    // Review phase with bounce-back loop.
    // `review` is declared outside the loop because the deployment phase needs it.
    const maxReviewBounces = 3;
    let reviewBounces = 0;
    let review: ReviewOutput;
    while (true) {
      await this.stateMachine.transition(
        'reviewing',
        reviewBounces > 0 ? `Review bounce ${reviewBounces}` : 'Implementation complete'
      );
      review = await this.executePhase<ReviewOutput>('review', code);

      if (review.decision === 'approve') break;
      if (review.decision === 'require_human') {
        const humanDecision = await this.requestHumanReview(review);
        if (humanDecision.approved) break;
        // Human requested changes, bounce back
      }

      // Out of bounces: fail rather than test code that was never approved
      if (reviewBounces >= maxReviewBounces) {
        await this.stateMachine.transition('failed', 'Review bounce limit reached');
        return { status: 'failed', plan, code, review };
      }

      // Bounce back to implementation
      reviewBounces++;
      await this.ctx.bus.emit({
        type: 'loop.phase_bounce',
        source: 'orchestrator',
        traceId: this.ctx.traceId,
        payload: { from: 'review', to: 'implementation', bounce: reviewBounces, findings: review.findings },
      });
      await this.stateMachine.transition('implementing', 'Fixing review findings');
      code = await this.executePhase<CodeOutput>('implementation', {
        ...plan,
        existingCode: code,
        fixFindings: review.findings,
      });
    }

    // Testing phase with bounce-back loop.
    // Re-entering 'testing' from 'implementing' requires that edge in the transition table.
    const maxTestBounces = 2;
    let testBounces = 0;
    let testResults: TestOutput;
    while (true) {
      await this.stateMachine.transition(
        'testing',
        testBounces > 0 ? `Test bounce ${testBounces}` : 'Review complete'
      );
      testResults = await this.executePhase<TestOutput>('testing', code);

      if (testResults.summary.failed === 0) break;

      // Auto-fix if possible
      const fixable = testResults.failures.filter(f => f.suggestedFix && f.confidence > 0.7);
      if (fixable.length === 0) {
        // Nothing can be fixed automatically: hand off to a human and pause
        // instead of deploying code with failing tests
        await this.requestHumanHelp(testResults.failures);
        await this.stateMachine.transition('paused', 'Awaiting human help with test failures');
        return { status: 'paused', plan, code, review, testResults };
      }

      if (testBounces >= maxTestBounces) {
        await this.stateMachine.transition('failed', 'Test bounce limit reached');
        return { status: 'failed', plan, code, review, testResults };
      }

      testBounces++;
      await this.ctx.bus.emit({
        type: 'loop.phase_bounce',
        source: 'orchestrator',
        traceId: this.ctx.traceId,
        payload: { from: 'testing', to: 'implementation', bounce: testBounces, failures: fixable },
      });
      await this.stateMachine.transition('implementing', 'Fixing test failures');
      code = await this.executePhase<CodeOutput>('implementation', {
        ...plan,
        existingCode: code,
        fixFailures: fixable,
      });
    }

    // Deployment phase
    await this.stateMachine.transition('deploying', 'Tests passed');
    const deployment = await this.executePhase('deployment', { code, review, testResults });

    // Complete
    await this.stateMachine.transition('completed', 'Deployment successful');
    return { status: 'completed', plan, code, review, testResults, deployment };
  }

  private async executePhase<T>(phase: PhaseName, input: PhaseInput): Promise<T> {
    const phaseConfig = this.phases.get(phase)!;

    // Check guards
    for (const guard of phaseConfig.guards) {
      if (!await guard.check(this.ctx)) {
        throw new GuardFailedError(`Guard ${guard.name} failed for phase ${phase}`);
      }
    }

    // Check circuit breakers
    for (const breaker of phaseConfig.breakers) {
      const result = breaker.check(this.ctx);
      if (result.shouldBreak) {
        throw new CircuitBreakerError(result);
      }
    }

    // Check human gates
    for (const gate of phaseConfig.gates) {
      if (gate.condition(this.ctx)) {
        await this.requestGateApproval(gate);
      }
    }

    // Execute agent
    await this.ctx.bus.emit({
      type: `phase.${phase}.started`,
      source: 'orchestrator',
      traceId: this.ctx.traceId,
      payload: input,
    });

    const startTime = Date.now();
    const output = await phaseConfig.agent.execute(input, this.ctx);
    const duration = Date.now() - startTime;

    await this.ctx.bus.emit({
      type: `phase.${phase}.completed`,
      source: 'orchestrator',
      traceId: this.ctx.traceId,
      payload: { output, duration },
    });

    return output as T;
  }

  private async checkpoint(phase: PhaseName, state: unknown): Promise<void> {
    await this.ctx.checkpointSystem.save({
      traceId: this.ctx.traceId,
      phase,
      state,
      timestamp: new Date(),
    });
  }
}
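The review and testing loops share one shape: check the work, and while the check fails and bounces remain, send it back for a fix. The sketch below isolates that shape as a generic helper. `bounceLoop` is a hypothetical name, the limit is interpreted as the number of fix attempts, and the helper is kept synchronous for brevity even though the real phases are async.

```typescript
interface BounceOutcome<T> {
  value: T;        // the last version of the work product
  bounces: number; // how many fix attempts were made
  passed: boolean; // whether the final check succeeded
}

// Check `initial`; while the check fails and bounces remain, apply `fix` and re-check.
function bounceLoop<T>(
  initial: T,
  check: (value: T) => boolean,
  fix: (value: T, bounce: number) => T,
  maxBounces: number,
): BounceOutcome<T> {
  let value = initial;
  let bounces = 0;
  while (true) {
    if (check(value)) return { value, bounces, passed: true };
    if (bounces >= maxBounces) return { value, bounces, passed: false };
    bounces++;
    value = fix(value, bounces);
  }
}

// Toy example: the "code" is a number and each fix improves it by one.
const converges = bounceLoop(0, (v) => v >= 2, (v) => v + 1, 3);
const neverPasses = bounceLoop(0, () => false, (v) => v + 1, 3);

console.log(converges);   // { value: 2, bounces: 2, passed: true }
console.log(neverPasses); // { value: 3, bounces: 3, passed: false }
```

Returning `passed: false` instead of silently falling through forces the caller to decide what happens when the limit is reached, which is the case where a pipeline should stop rather than continue to the next phase.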

3.3 Phase Configuration

typescript
// src/orchestrator/phases.ts
interface Phase {
  name: PhaseName;
  agent: Agent;
  guards: Guard[];
  gates: HumanGate[];
  breakers: CircuitBreaker[];
  next: PhaseName | null;
}

export function initializePhases(ctx: OrchestratorContext): Map<PhaseName, Phase> {
  return new Map<PhaseName, Phase>([
    ['planning', {
      name: 'planning',
      agent: new PlannerAgent(ctx),
      guards: [],
      gates: [],
      breakers: [
        new IterationBreaker({ max: 20, phase: 'planning' }),
        new CostBreaker({ max: 5, phase: 'planning' }),
        new TimeBreaker({ max: 30 * 60_000, phase: 'planning' }),
      ],
      next: 'implementation',
    }],
    ['implementation', {
      name: 'implementation',
      agent: new ImplementerAgent(ctx),
      guards: [
        { name: 'has_plan', check: (ctx) => ctx.state.plan !== undefined },
      ],
      gates: [
        {
          // Gates run before a phase's agent and receive the context, so the
          // architecture approval sits here, where the finished plan is available
          id: 'architecture_approval',
          condition: (ctx) => {
            const level = ctx.state.plan?.risk.level;
            return level === 'high' || level === 'critical';
          },
          prompt: 'Review proposed architecture before implementation begins.',
          timeout: 24 * 60 * 60_000, // 24 hours in ms
        },
      ],
      breakers: [
        new IterationBreaker({ max: 50, phase: 'implementation' }),
        new CostBreaker({ max: 10, phase: 'implementation' }),
        new TimeBreaker({ max: 60 * 60_000, phase: 'implementation' }),
        new StagnationBreaker({ threshold: 3 }),
      ],
      next: 'review',
    }],
    // ... other phases
  ]);
}
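The phase configuration uses `IterationBreaker`, `CostBreaker`, and `TimeBreaker` without showing them. A minimal sketch of how such breakers might look follows. The `BreakerInput` and `BreakerResult` shapes and the `checkAll` helper are assumptions for illustration only; the real classes also take a `phase` option that is omitted here.

```typescript
// Hypothetical shapes: this document uses breakers but does not define them.
interface BreakerInput { iteration: number; costUsd: number; elapsedMs: number }
interface BreakerResult { shouldBreak: boolean; reason?: string }
interface Breaker { check(input: BreakerInput): BreakerResult }

class IterationBreaker implements Breaker {
  private readonly max: number;
  constructor(max: number) { this.max = max; }
  check(input: BreakerInput): BreakerResult {
    return input.iteration > this.max
      ? { shouldBreak: true, reason: `iteration ${input.iteration} exceeds limit ${this.max}` }
      : { shouldBreak: false };
  }
}

class CostBreaker implements Breaker {
  private readonly maxUsd: number;
  constructor(maxUsd: number) { this.maxUsd = maxUsd; }
  check(input: BreakerInput): BreakerResult {
    return input.costUsd > this.maxUsd
      ? { shouldBreak: true, reason: `cost $${input.costUsd} exceeds budget $${this.maxUsd}` }
      : { shouldBreak: false };
  }
}

class TimeBreaker implements Breaker {
  private readonly maxMs: number;
  constructor(maxMs: number) { this.maxMs = maxMs; }
  check(input: BreakerInput): BreakerResult {
    return input.elapsedMs > this.maxMs
      ? { shouldBreak: true, reason: `elapsed ${input.elapsedMs}ms exceeds limit ${this.maxMs}ms` }
      : { shouldBreak: false };
  }
}

// Return the first tripped result, or a non-breaking result if nothing trips.
function checkAll(breakers: Breaker[], input: BreakerInput): BreakerResult {
  for (const breaker of breakers) {
    const result = breaker.check(input);
    if (result.shouldBreak) return result;
  }
  return { shouldBreak: false };
}

// Limits taken from the planning phase configuration above.
const planningBreakers = [new IterationBreaker(20), new CostBreaker(5), new TimeBreaker(30 * 60_000)];
const healthy = checkAll(planningBreakers, { iteration: 3, costUsd: 1.2, elapsedMs: 60_000 });
const overBudget = checkAll(planningBreakers, { iteration: 3, costUsd: 5.5, elapsedMs: 60_000 });
console.log(healthy, overBudget);
```

Keeping each breaker a pure function of its input makes the limits easy to unit test without running an agent.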

4. Event Bus Implementation

The event bus is the nervous system of Forge. All significant actions emit events, and components subscribe to events they care about.

4.1 Core Event Bus

typescript
// src/core/bus.ts
import { eq } from 'drizzle-orm';
import { ulid } from 'ulid';
import { events } from '../memory/schema';
import type { ForgeEvent, EventSnapshot } from './types';

type EventHandler = (event: ForgeEvent) => void | Promise<void>;

// `Database` is the drizzle handle type returned by getDb() in src/memory/db.ts
export class EventBus {
  private handlers = new Map<string, Set<EventHandler>>();
  public currentTraceId: string = '';

  constructor(private db: Database) {}

  /**
   * Emit an event. Persists to SQLite and notifies subscribers.
   */
  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void> {
    const full: ForgeEvent = {
      ...event,
      id: ulid(),
      timestamp: new Date(),
      traceId: event.traceId || this.currentTraceId,
    };

    // Persist to SQLite (append-only)
    await this.db.insert(events).values({
      id: full.id,
      traceId: full.traceId,
      timestamp: full.timestamp,
      source: full.source,
      type: full.type,
      phase: full.phase,
      payload: full.payload,
      tokensUsed: full.cost?.tokens,
      costUsd: full.cost?.usd,
      durationMs: full.durationMs,
    });

    // Notify subscribers; a failing handler is logged and does not stop the others
    const typeHandlers = this.handlers.get(event.type) ?? new Set();
    const wildcardHandlers = this.handlers.get('*') ?? new Set();

    for (const handler of [...typeHandlers, ...wildcardHandlers]) {
      try {
        await handler(full);
      } catch (error) {
        console.error(`Error in event handler for ${event.type}:`, error);
      }
    }
  }

  /**
   * Subscribe to events of a specific type (or '*' for all events).
   */
  on(type: string, handler: EventHandler): () => void {
    if (!this.handlers.has(type)) {
      this.handlers.set(type, new Set());
    }
    this.handlers.get(type)!.add(handler);

    // Return unsubscribe function
    return () => {
      this.handlers.get(type)?.delete(handler);
    };
  }

  /**
   * Replay all events for a specific trace ID.
   */
  async replay(traceId: string): Promise<ForgeEvent[]> {
    const rows = await this.db
      .select()
      .from(events)
      .where(eq(events.traceId, traceId))
      .orderBy(events.timestamp);

    return rows.map(row => ({
      id: row.id,
      traceId: row.traceId,
      timestamp: row.timestamp,
      source: row.source,
      type: row.type,
      phase: row.phase,
      payload: row.payload,
      // Compare against null so a recorded cost of 0 is not dropped
      cost: row.costUsd != null
        ? { tokens: row.tokensUsed ?? 0, usd: row.costUsd }
        : undefined,
      durationMs: row.durationMs,
    }));
  }

  /**
   * Create a snapshot of current state for checkpointing.
   */
  async snapshot(traceId: string): Promise<EventSnapshot> {
    const allEvents = await this.replay(traceId);
    return {
      traceId,
      events: allEvents,
      timestamp: new Date(),
      eventCount: allEvents.length,
      totalCost: allEvents.reduce((sum, e) => sum + (e.cost?.usd ?? 0), 0),
    };
  }
}

4.2 Event Types

typescript
// src/core/types.ts
export interface ForgeEvent {
  id: string;
  traceId: string;
  timestamp: Date;
  source: string;      // Which agent/component emitted this
  type: string;        // Dot-namespaced: "review.finding", "test.failed"
  phase?: PhaseName;   // Which phase this event belongs to
  payload: unknown;
  cost?: { tokens: number; usd: number };
  durationMs?: number;
}

// Standard event types
export type EventType =
  | 'run.started'
  | 'run.completed'
  | 'phase.started'
  | 'phase.completed'
  | 'phase.failed'
  | 'agent.iteration'
  | 'tool.executed'
  | 'finding.detected'
  | 'test.failed'
  | 'gate.requested'
  | 'gate.approved'
  | 'breaker.tripped'
  | 'memory.stored'
  | 'loop.phase_bounce';
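The orchestrator emits per-phase event names such as `phase.planning.started`, which the `EventType` union above does not list. One way to type that family is a template literal type. The sketch below is an illustration under assumptions: `PhaseStage`, `PhaseEventType`, `phaseEvent`, and `isPhaseEventType` are hypothetical names, and the phase names are the ones used in the CLI validation schema.

```typescript
type PhaseName = 'planning' | 'implementation' | 'review' | 'testing' | 'deployment';
type PhaseStage = 'started' | 'completed' | 'failed';

// Template literal type: expands to every 'phase.<name>.<stage>' combination.
type PhaseEventType = `phase.${PhaseName}.${PhaseStage}`;

const PHASES: readonly PhaseName[] = ['planning', 'implementation', 'review', 'testing', 'deployment'];
const STAGES: readonly PhaseStage[] = ['started', 'completed', 'failed'];

// Build an event type from its parts; the result is typed as a member of the union.
function phaseEvent(phase: PhaseName, stage: PhaseStage): PhaseEventType {
  return `phase.${phase}.${stage}`;
}

// Runtime guard for strings read back from storage, where the type is just `string`.
function isPhaseEventType(value: string): value is PhaseEventType {
  const [prefix, phase, stage, ...rest] = value.split('.');
  return (
    prefix === 'phase' &&
    rest.length === 0 &&
    PHASES.includes(phase as PhaseName) &&
    STAGES.includes(stage as PhaseStage)
  );
}

console.log(phaseEvent('review', 'started'));            // phase.review.started
console.log(isPhaseEventType('phase.testing.completed')); // true
console.log(isPhaseEventType('phase.started'));           // false
```

With a type like this, a misspelled phase name in an `emit` call becomes a compile-time error instead of an event no subscriber ever matches.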

5. Agent Pool Layer

Agents are registered in a pool and invoked by the orchestrator. Each agent implements the same Agent interface.

5.1 Agent Registry

typescript
// src/agents/registry.ts
export class AgentRegistry {
  private agents = new Map<AgentType, Agent>();

  register(agent: Agent): void {
    this.agents.set(agent.type, agent);
  }

  get(type: AgentType): Agent {
    const agent = this.agents.get(type);
    if (!agent) {
      throw new Error(`Agent ${type} not registered`);
    }
    return agent;
  }

  list(): Agent[] {
    return Array.from(this.agents.values());
  }
}

// Initialize all agents
export function createAgentPool(ctx: AgentContext): AgentRegistry {
  const registry = new AgentRegistry();

  registry.register(new PlannerAgent(ctx));
  registry.register(new ImplementerAgent(ctx));
  registry.register(new ReviewerAgent(ctx));
  registry.register(new TesterAgent(ctx));
  registry.register(new DeployerAgent(ctx));

  return registry;
}

5.2 Agent Lifecycle

typescript
// src/agents/base.ts
import { CircuitBreakerError, ToolExecutionError } from '../core/errors';

export abstract class BaseAgent implements Agent {
  abstract type: AgentType;
  abstract tools: Tool[];
  abstract systemPrompt: string;

  constructor(protected ctx: AgentContext) {}

  async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
    let iteration = 0;
    let workingMemory = await this.perceive(input, ctx);

    while (true) {
      iteration++;

      // Safety check (elapsed is a function on AgentContext, so call it)
      const breakerResult = ctx.safety.check({ iteration, cost: ctx.cost, elapsed: ctx.elapsed() });
      if (breakerResult.shouldBreak) {
        // Await the emit so the event is persisted before the error propagates
        await ctx.bus.emit({
          type: `${this.type}.breaker_tripped`,
          source: this.type,
          traceId: ctx.traceId,
          payload: breakerResult,
        });
        throw new CircuitBreakerError(breakerResult);
      }

      // Emit iteration event
      await ctx.bus.emit({
        type: 'agent.iteration',
        source: this.type,
        traceId: ctx.traceId,
        payload: { iteration, memorySize: workingMemory.messages.length },
      });

      // Reason: ask LLM what to do
      const decision = await ctx.llm.chat({
        system: this.systemPrompt,
        messages: workingMemory.messages,
        tools: this.tools.map(t => t.schema),
      });

      // Done?
      if (decision.done) {
        const output = decision.result as PhaseOutput;
        await ctx.bus.emit({
          type: `${this.type}.completed`,
          source: this.type,
          traceId: ctx.traceId,
          payload: output,
        });
        await this.reflect(ctx, 'success');
        return output;
      }

      // Act: execute the chosen tool, rejecting names the agent does not own
      const tool = this.tools.find(t => t.name === decision.toolCall.name);
      if (!tool) {
        throw new ToolExecutionError(`Unknown tool requested: ${decision.toolCall.name}`);
      }
      const result = await this.executeTool(tool, decision.toolCall.input, ctx);

      // Learn: update context
      workingMemory = this.updateWorkingMemory(workingMemory, decision, result);

      if (result.error) {
        await this.reflect(ctx, 'error', result.error);
      }
    }
  }

  private async executeTool(
    tool: Tool,
    input: unknown,
    ctx: AgentContext
  ): Promise<ToolResult> {
    const startTime = Date.now();

    try {
      // Validate input
      const validated = tool.schema.input.parse(input);

      // Execute in sandbox
      const result = await ctx.toolSandbox.execute(tool, validated);

      // Emit event
      await ctx.bus.emit({
        type: 'tool.executed',
        source: this.type,
        traceId: ctx.traceId,
        payload: { tool: tool.name, input: validated, result },
        durationMs: Date.now() - startTime,
      });

      return result;
    } catch (error) {
      // Caught values are `unknown`; normalize to an Error before reading .message
      const err = error instanceof Error ? error : new Error(String(error));
      await ctx.bus.emit({
        type: 'tool.failed',
        source: this.type,
        traceId: ctx.traceId,
        payload: { tool: tool.name, error: err.message },
        durationMs: Date.now() - startTime,
      });

      return { success: false, error: err };
    }
  }

  protected abstract perceive(input: PhaseInput, ctx: AgentContext): Promise<WorkingMemory>;
  protected abstract updateWorkingMemory(memory: WorkingMemory, decision: Decision, result: ToolResult): WorkingMemory;
  protected abstract reflect(ctx: AgentContext, outcome: string, error?: Error): Promise<void>;
}

6. Tool Layer

Tools are the hands of agents. Each tool implements a standard interface and is registered in the tool registry.

6.1 Tool Registry

typescript
// src/tools/registry.ts
export class ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  list(): Tool[] {
    return Array.from(this.tools.values());
  }

  find(query: string): Tool[] {
    // Simple case-insensitive search by name or description
    const needle = query.toLowerCase();
    return this.list().filter(t =>
      t.name.toLowerCase().includes(needle) ||
      t.description.toLowerCase().includes(needle)
    );
  }
}

// Initialize standard tools
export function createToolRegistry(): ToolRegistry {
  const registry = new ToolRegistry();

  // Core tools
  registry.register(readFileTool);
  registry.register(writeFileTool);
  registry.register(runCommandTool);
  registry.register(searchFilesTool);

  // Git tools
  registry.register(gitDiffTool);
  registry.register(gitCommitTool);

  // LLM tool
  registry.register(llmAnalyzeTool);

  return registry;
}

6.2 Tool Sandbox

typescript
// src/tools/sandbox.ts
export class ToolSandbox {
  constructor(private config: SandboxConfig) {}

  async execute<TInput, TOutput>(
    tool: Tool<TInput, TOutput>,
    input: TInput
  ): Promise<ToolResult<TOutput>> {
    // Validate input
    const validated = tool.schema.input.parse(input);

    // Check permissions (throws PermissionDeniedError instead of returning a result)
    await this.checkPermissions(tool, validated);

    // Apply resource limits
    const limiter = this.createLimiter();

    try {
      const result = await limiter.run(() => tool.execute(validated, this.createToolContext()));

      // Validate output
      const validatedOutput = tool.schema.output.parse(result);

      return { success: true, data: validatedOutput };
    } catch (error) {
      return { success: false, error };
    }
  }

  private async checkPermissions(tool: Tool, input: unknown): Promise<void> {
    // Check if tool requires elevated permissions
    if (tool.requiresApproval && !this.hasApproval(tool)) {
      throw new PermissionDeniedError(`Tool ${tool.name} requires approval`);
    }

    // Check resource access (file paths, network hosts, etc.)
    if (tool.name.includes('write') || tool.name.includes('delete')) {
      await this.checkWritePermissions(input);
    }
  }

  private createLimiter(): ResourceLimiter {
    return new ResourceLimiter({
      maxMemory: this.config.maxMemory,
      maxCpu: this.config.maxCpu,
      maxTime: this.config.maxTime,
    });
  }
}
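The sandbox calls `checkWritePermissions` without showing what it checks. One check it might perform is confirming that a write target stays inside the working directory. The following sketch shows that single check; `isInsideRoot` is a hypothetical helper, and it does not resolve symlinks, which a real sandbox would also need to handle.

```typescript
import * as path from 'node:path';

// True when `target`, resolved against `root`, is `root` itself or lies beneath it.
// Purely lexical: symlinks are not followed (an assumption of this sketch).
function isInsideRoot(root: string, target: string): boolean {
  const absoluteRoot = path.resolve(root);
  const absoluteTarget = path.resolve(absoluteRoot, target);
  const relative = path.relative(absoluteRoot, absoluteTarget);

  if (relative === '') return true;            // target is the root itself
  if (path.isAbsolute(relative)) return false; // e.g. a different drive on Windows
  // Escaping the root shows up as a leading '..' path segment
  return relative.split(path.sep)[0] !== '..';
}

console.log(isInsideRoot('/work/repo', 'src/index.ts'));   // true
console.log(isInsideRoot('/work/repo', '../secrets.env')); // false
console.log(isInsideRoot('/work/repo', '/etc/passwd'));    // false
```

Comparing the first path segment rather than using `startsWith('..')` avoids rejecting legitimate names such as `..cache` inside the root.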

6.3 Tool Audit Logger

typescript
// src/tools/audit.ts
export class ToolAuditLogger {
  constructor(private bus: EventBus) {}

  async logExecution(
    tool: string,
    input: unknown,
    result: ToolResult,
    context: ToolContext
  ): Promise<void> {
    await this.bus.emit({
      type: 'tool.executed',
      source: 'tool_layer',
      traceId: context.traceId,
      payload: {
        tool,
        input: this.sanitize(input),
        success: result.success,
        error: result.error?.message,
      },
      durationMs: result.durationMs,
    });
  }

  private sanitize(data: unknown): unknown {
    // Remove sensitive data (passwords, tokens, etc.)
    // This is a placeholder - real implementation would be more sophisticated
    return data;
  }
}
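The `sanitize` method above is a placeholder. As one possible direction, the sketch below redacts values whose key looks sensitive, recursing through nested objects and arrays. The key pattern and the `[REDACTED]` marker are assumptions, and the function only handles plain JSON-like data: class instances such as `Date` would be flattened to plain objects.

```typescript
// Key names treated as sensitive; this list is an assumption, not a Forge setting.
const SENSITIVE_KEY = /pass(word)?|secret|token|api[-_]?key|authorization/i;

// Return a deep copy of `data` with values under sensitive keys replaced by a marker.
function sanitize(data: unknown): unknown {
  if (Array.isArray(data)) {
    return data.map((item) => sanitize(item));
  }
  if (data !== null && typeof data === 'object') {
    const cleaned: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(data)) {
      cleaned[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : sanitize(value);
    }
    return cleaned;
  }
  return data; // primitives pass through unchanged
}

const original = {
  user: 'alice',
  password: 'hunter2',
  nested: { apiKey: 'sk-123', list: [{ token: 'abc' }, 'plain'] },
};
const cleaned = sanitize(original);
console.log(JSON.stringify(cleaned));
```

Because the function builds a copy, the tool still receives the original input; only the payload written to the audit event is redacted.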

7. Memory Layer

The memory layer interfaces with SQLite and provides memory storage, recall, and consolidation.

7.1 Database Connection

typescript
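// src/memory/db.ts
import { drizzle } from 'drizzle-orm/better-sqlite3';
import Database from 'better-sqlite3';
import * as schema from './schema';

// Keep the raw connection alongside the drizzle wrapper so it can be closed explicitly
let sqliteInstance: Database.Database | null = null;
let dbInstance: ReturnType<typeof drizzle> | null = null;

// Note: the handle is cached, so `path` only takes effect on the first call
// (or on the first call after closeDb()).
export function getDb(path: string = '.forge/memory.db') {
  if (!dbInstance) {
    sqliteInstance = new Database(path);
    sqliteInstance.pragma('journal_mode = WAL'); // Better concurrency
    dbInstance = drizzle(sqliteInstance, { schema });
  }
  return dbInstance;
}

export function closeDb() {
  // The drizzle wrapper has no close method; close the underlying
  // better-sqlite3 connection so the file handle is released
  sqliteInstance?.close();
  sqliteInstance = null;
  dbInstance = null;
}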

7.2 Memory Store

typescript
// src/memory/store.ts
import { eq, and, gte } from 'drizzle-orm';
import { ulid } from 'ulid';
import { memories } from './schema';

// `Database` is the drizzle handle type returned by getDb() in ./db
export class MemoryStore {
  constructor(private db: Database) {}

  async store(memory: Omit<Memory, 'id' | 'createdAt' | 'lastAccessed' | 'accessCount'>): Promise<string> {
    const id = ulid();

    await this.db.insert(memories).values({
      id,
      type: memory.type,
      content: memory.content,
      context: memory.context,
      embedding: memory.embedding,
      confidence: memory.confidence,
      source: memory.source,
      tags: memory.tags,
      createdAt: new Date(),
      lastAccessed: new Date(),
      accessCount: 0,
    });

    return id;
  }

  async recall(query: RecallQuery): Promise<Memory[]> {
    // MVP recall filters by memory type and minimum confidence only
    // Will upgrade to vector similarity search later
    const results = await this.db
      .select()
      .from(memories)
      .where(
        and(
          eq(memories.type, query.type),
          gte(memories.confidence, query.minConfidence ?? 0.5)
        )
      )
      .limit(query.limit ?? 10);

    // Update access count and timestamp (one UPDATE per row; fine at MVP sizes)
    for (const result of results) {
      await this.db
        .update(memories)
        .set({
          lastAccessed: new Date(),
          accessCount: result.accessCount + 1,
        })
        .where(eq(memories.id, result.id));
    }

    return results;
  }

  async update(id: string, updates: Partial<Memory>): Promise<void> {
    await this.db
      .update(memories)
      .set(updates)
      .where(eq(memories.id, id));
  }

  async delete(id: string): Promise<void> {
    await this.db
      .delete(memories)
      .where(eq(memories.id, id));
  }
}

7.3 Migration Strategy

typescript
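// drizzle.config.ts
import type { Config } from 'drizzle-kit';

// The `driver` key and the `:sqlite` command suffixes below match older
// drizzle-kit releases; newer releases may use a different config shape,
// so check the installed drizzle-kit version.
export default {
  schema: './src/memory/schema.ts',
  out: './drizzle',
  driver: 'better-sqlite',
  dbCredentials: {
    url: '.forge/memory.db',
  },
} satisfies Config;

// Run migrations:
//   bun run drizzle-kit generate:sqlite
//   bun run drizzle-kit push:sqlite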

8. Cross-Layer Communication

8.1 Dependency Injection

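No service locator and no global singletons: every dependency is passed explicitly through constructors. The one exception in this plan is the cached database handle in src/memory/db.ts.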

typescript
// src/core/context.ts
export interface ForgeContext {
  // Identity
  traceId: string;
  startedAt: Date;

  // Core infrastructure
  bus: EventBus;
  db: Database;
  config: ForgeConfig;

  // Memory
  memory: MemoryStore;

  // Tools
  toolRegistry: ToolRegistry;
  toolSandbox: ToolSandbox;

  // Safety
  safety: SafetySystem;
  costTracker: CostTracker;

  // LLM
  llm: LLMProvider;

  // Human interface
  humanGateway: HumanGateway;
}

export async function createContext(
  config: ForgeConfig,
  options: ContextOptions
): Promise<ForgeContext> {
  // Initialize database
  const db = getDb(config.memory.dbPath);

  // Initialize event bus
  const bus = new EventBus(db);
  bus.currentTraceId = options.traceId;

  // Initialize memory
  const memory = new MemoryStore(db);

  // Initialize tools
  const toolRegistry = createToolRegistry();
  const toolSandbox = new ToolSandbox(config.tools.sandbox);

  // Initialize safety
  const safety = new SafetySystem(config.safety);
  const costTracker = new CostTracker(config.safety.costPerRun);

  // Initialize LLM
  const llm = createLLMProvider(config.llm);

  // Initialize human gateway
  const humanGateway = new HumanGateway(bus);

  return {
    traceId: options.traceId,
    startedAt: options.startedAt,
    bus,
    db,
    config,
    memory,
    toolRegistry,
    toolSandbox,
    safety,
    costTracker,
    llm,
    humanGateway,
  };
}

8.2 Context Propagation

The ForgeContext threads through every layer. Agents receive AgentContext (a view of ForgeContext), tools receive ToolContext, etc.

typescript
// src/core/types.ts
export interface AgentContext {
  traceId: string;
  phase: PhaseName;

  // What agents need
  bus: EventBus;
  memory: MemoryStore;
  llm: LLMProvider;
  tools: ToolRegistry;
  toolSandbox: ToolSandbox;
  safety: SafetySystem;
  cost: CostTracker;
  elapsed: () => number;
}

export interface ToolContext {
  traceId: string;
  agent: AgentType;

  // What tools need
  workingDirectory: string;
  environment: Record<string, string>;
  secrets: SecretProvider;
}

// Derive specialized contexts from main context
export function deriveAgentContext(
  ctx: ForgeContext,
  phase: PhaseName
): AgentContext {
  return {
    traceId: ctx.traceId,
    phase,
    bus: ctx.bus,
    memory: ctx.memory,
    llm: ctx.llm,
    tools: ctx.toolRegistry,
    toolSandbox: ctx.toolSandbox,
    safety: ctx.safety,
    cost: ctx.costTracker,
    // Milliseconds since the run started
    elapsed: () => Date.now() - ctx.startedAt.getTime(),
  };
}

9. Error Propagation

Errors flow upward through layers. Each layer is responsible for handling errors it can recover from, and re-throwing those it cannot.

9.1 Error Taxonomy

typescript
// src/core/errors.ts
export abstract class ForgeError extends Error {
  abstract readonly code: string;
  abstract readonly severity: 'info' | 'warning' | 'error' | 'critical';
  abstract readonly recoverable: boolean;

  constructor(message: string, public readonly context?: unknown) {
    super(message);
    this.name = this.constructor.name;
  }
}

// Layer-specific errors

export class ToolExecutionError extends ForgeError {
  code = 'TOOL_EXECUTION_ERROR';
  severity = 'error' as const;
  recoverable = true;
}

export class CircuitBreakerError extends ForgeError {
  code = 'CIRCUIT_BREAKER_TRIPPED';
  severity = 'critical' as const;
  recoverable = false;
}

export class InvalidStateTransitionError extends ForgeError {
  code = 'INVALID_STATE_TRANSITION';
  severity = 'critical' as const;
  recoverable = false;
}

export class GuardFailedError extends ForgeError {
  code = 'GUARD_FAILED';
  severity = 'error' as const;
  recoverable = true;
}

export class HumanGateTimeoutError extends ForgeError {
  code = 'HUMAN_GATE_TIMEOUT';
  severity = 'error' as const;
  recoverable = true;
}

9.2 Error Handling Strategy

typescript
// src/orchestrator/error-handler.ts
export class ErrorHandler {
  constructor(private ctx: ForgeContext) {}

  async handle(error: Error, phase: PhaseName): Promise<RecoveryAction> {
    // Emit error event
    await this.ctx.bus.emit({
      type: 'error.occurred',
      source: 'orchestrator',
      traceId: this.ctx.traceId,
      payload: {
        error: error.message,
        phase,
        stack: error.stack,
      },
    });

    // Classify error
    if (error instanceof CircuitBreakerError) {
      // Non-recoverable, halt pipeline
      return { action: 'fail', message: error.message };
    }

    if (error instanceof ToolExecutionError) {
      // Retry with backoff
      return { action: 'retry', delay: this.calculateBackoff(phase), maxRetries: 3 };
    }

    if (error instanceof GuardFailedError) {
      // Skip phase or escalate
      return { action: 'skip_phase', message: error.message };
    }

    if (error instanceof HumanGateTimeoutError) {
      // Escalate to human
      return { action: 'escalate', message: 'Human approval timed out' };
    }

    // Unknown error, fail safe
    return { action: 'fail', message: `Unknown error: ${error.message}` };
  }

  private calculateBackoff(phase: PhaseName): number {
    // Exponential backoff based on attempt count: 1s, 2s, 4s, ... capped at 30s
    const attempt = this.ctx.costTracker.getAttemptCount(phase);
    return Math.min(1000 * Math.pow(2, attempt), 30000);
  }
}
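To see what the backoff formula in `calculateBackoff` produces, the sketch below evaluates it for the first few attempts. `backoffMs` is a hypothetical standalone version of the same arithmetic, with the 1000 ms base and 30000 ms cap taken from the code above.

```typescript
// Same arithmetic as calculateBackoff: base * 2^attempt, capped at capMs.
function backoffMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Delays for attempts 0 through 6.
const schedule = [0, 1, 2, 3, 4, 5, 6].map((attempt) => backoffMs(attempt));
console.log(schedule); // [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

The cap is reached at attempt 5, where the uncapped value would be 32000 ms. With `maxRetries: 3`, a tool error therefore waits at most a few seconds per retry, so the cap only matters if the retry limit is raised.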

10. Testability

Each layer has well-defined boundaries and can be tested in isolation.

10.1 Mock Strategies

typescript
// tests/mocks/bus.mock.ts
import { ulid } from 'ulid';

// EventBus is a class with private members, so another class cannot `implements` it
// directly. Pick<EventBus, keyof EventBus> keeps only its public surface.
export class MockEventBus implements Pick<EventBus, keyof EventBus> {
  public emittedEvents: ForgeEvent[] = [];
  private handlers = new Map<string, Set<EventHandler>>();
  public currentTraceId = 'test-trace';

  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void> {
    const full: ForgeEvent = {
      ...event,
      id: ulid(),
      timestamp: new Date(),
      // Match the real bus: prefer the event's own trace ID
      traceId: event.traceId || this.currentTraceId,
    };

    this.emittedEvents.push(full);

    // Still notify handlers
    const typeHandlers = this.handlers.get(event.type) ?? new Set();
    const wildcardHandlers = this.handlers.get('*') ?? new Set();

    for (const handler of [...typeHandlers, ...wildcardHandlers]) {
      await handler(full);
    }
  }

  on(type: string, handler: EventHandler): () => void {
    if (!this.handlers.has(type)) {
      this.handlers.set(type, new Set());
    }
    this.handlers.get(type)!.add(handler);
    return () => {
      this.handlers.get(type)?.delete(handler);
    };
  }

  async replay(traceId: string): Promise<ForgeEvent[]> {
    return this.emittedEvents.filter(e => e.traceId === traceId);
  }

  async snapshot(traceId: string): Promise<EventSnapshot> {
    const events = await this.replay(traceId);
    return {
      traceId,
      events,
      timestamp: new Date(),
      eventCount: events.length,
      totalCost: 0,
    };
  }

  // Test helper
  getEventsOfType(type: string): ForgeEvent[] {
    return this.emittedEvents.filter(e => e.type === type);
  }

  clear(): void {
    this.emittedEvents = [];
  }
}
typescript
// tests/mocks/llm.mock.ts
export class MockLLMProvider implements LLMProvider {
  public responses: ChatResponse[] = [];
  private responseIndex = 0;

  constructor(responses: ChatResponse[] = []) {
    this.responses = responses;
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    if (this.responseIndex >= this.responses.length) {
      throw new Error('MockLLMProvider: No more mocked responses');
    }
    return this.responses[this.responseIndex++];
  }

  async embed(text: string): Promise<Float32Array> {
    // Return dummy embedding
    return new Float32Array(1536).fill(0.5);
  }

  // Test helper
  addResponse(response: ChatResponse): void {
    this.responses.push(response);
  }

  reset(): void {
    this.responseIndex = 0;
    this.responses = [];
  }
}

10.2 Layer Testing Examples

Test the Event Bus in Isolation:

typescript
// tests/core/bus.test.ts
import { describe, it, expect, beforeEach } from 'bun:test';
import { EventBus } from '../../src/core/bus';
import { getDb } from '../../src/memory/db';
// Type location assumed from the file structure (src/core/types.ts)
import type { ForgeEvent } from '../../src/core/types';

describe('EventBus', () => {
  let bus: EventBus;
  // Derive the handle type from getDb so no separate Database import is needed
  let db: ReturnType<typeof getDb>;

  beforeEach(() => {
    db = getDb(':memory:'); // In-memory SQLite for tests
    bus = new EventBus(db);
    bus.currentTraceId = 'test-trace';
  });

  it('should emit and persist events', async () => {
    await bus.emit({
      type: 'test.event',
      source: 'test',
      traceId: 'test-trace',
      payload: { foo: 'bar' },
    });

    const events = await bus.replay('test-trace');
    expect(events).toHaveLength(1);
    expect(events[0].type).toBe('test.event');
    expect(events[0].payload).toEqual({ foo: 'bar' });
  });

  it('should notify subscribers', async () => {
    const received: ForgeEvent[] = [];
    bus.on('test.event', (event) => {
      received.push(event);
    });

    await bus.emit({
      type: 'test.event',
      source: 'test',
      traceId: 'test-trace',
      payload: { foo: 'bar' },
    });

    expect(received).toHaveLength(1);
    expect(received[0].type).toBe('test.event');
  });

  it('should support wildcard subscriptions', async () => {
    const received: ForgeEvent[] = [];
    bus.on('*', (event) => {
      received.push(event);
    });

    await bus.emit({
      type: 'test.event1',
      source: 'test',
      traceId: 'test-trace',
      payload: {},
    });
    await bus.emit({
      type: 'test.event2',
      source: 'test',
      traceId: 'test-trace',
      payload: {},
    });

    expect(received).toHaveLength(2);
  });
});
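The bus tests above cover emit, subscribe, and wildcard delivery, but not the unsubscribe function that on() returns. The following is a minimal sketch of that behavior using a dependency-free stand-in; TinyBus, BusEvent, and demo are hypothetical names, and persistence is left out entirely.

```typescript
// Hypothetical in-memory stand-in for the EventBus subscribe/emit contract.
type BusEvent = { type: string };
type Handler = (event: BusEvent) => void | Promise<void>;

class TinyBus {
  private handlers = new Map<string, Set<Handler>>();

  // Registers a handler and returns a function that removes it again.
  on(type: string, handler: Handler): () => void {
    let set = this.handlers.get(type);
    if (!set) {
      set = new Set<Handler>();
      this.handlers.set(type, set);
    }
    set.add(handler);
    const registered = set;
    return () => { registered.delete(handler); };
  }

  // Delivers to exact-type handlers first, then to '*' handlers.
  async emit(event: BusEvent): Promise<void> {
    const exact = Array.from(this.handlers.get(event.type) ?? new Set<Handler>());
    const wildcard = Array.from(this.handlers.get('*') ?? new Set<Handler>());
    for (const handler of exact.concat(wildcard)) {
      await handler(event);
    }
  }
}

async function demo(): Promise<number> {
  const bus = new TinyBus();
  const seen: string[] = [];
  const off = bus.on('test.event', (event) => { seen.push(event.type); });

  await bus.emit({ type: 'test.event' }); // delivered
  off();                                   // unsubscribe
  await bus.emit({ type: 'test.event' }); // no longer delivered
  return seen.length;
}

demo().then((count) => console.log(count)); // 1
```

A test of the same shape against the real EventBus would guard against handlers leaking between pipeline runs.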

Test an Agent in Isolation:

typescript
// tests/agents/reviewer.test.ts
import { describe, it, expect, beforeEach } from 'bun:test';
import { ReviewerAgent } from '../../src/agents/reviewer';
import { MockEventBus } from '../mocks/bus.mock';
import { MockLLMProvider } from '../mocks/llm.mock';
import { createTestContext } from '../helpers/test-context';
// Type location assumed from the file structure (src/core/types.ts)
import type { AgentContext } from '../../src/core/types';

// mockCodebase is a test fixture whose definition is not shown in this plan

describe('ReviewerAgent', () => {
  let agent: ReviewerAgent;
  let bus: MockEventBus;
  let llm: MockLLMProvider;
  let ctx: AgentContext;

  beforeEach(() => {
    bus = new MockEventBus();
    llm = new MockLLMProvider();
    ctx = createTestContext({ bus, llm });
    agent = new ReviewerAgent(ctx);
  });

  it('should auto-approve low-risk changes', async () => {
    // Mock LLM response
    llm.addResponse({
      done: true,
      result: {
        decision: 'approve',
        riskScore: { total: 0.2, level: 'low' },
        findings: [],
      },
      usage: { promptTokens: 100, completionTokens: 50 },
      cost: 0.01,
    });

    const result = await agent.execute(
      { codebase: mockCodebase },
      ctx
    );

    expect(result.decision).toBe('approve');
    expect(result.riskScore.level).toBe('low');

    // Check events emitted
    const completedEvents = bus.getEventsOfType('reviewer.completed');
    expect(completedEvents).toHaveLength(1);
  });

  it('should escalate high-risk changes', async () => {
    llm.addResponse({
      done: true,
      result: {
        decision: 'require_human',
        riskScore: { total: 0.8, level: 'high' },
        findings: [
          { severity: 'critical', message: 'Potential SQL injection' },
        ],
      },
      usage: { promptTokens: 100, completionTokens: 50 },
      cost: 0.01,
    });

    const result = await agent.execute(
      { codebase: mockCodebase },
      ctx
    );

    expect(result.decision).toBe('require_human');
    expect(result.riskScore.level).toBe('high');
  });
});
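The agent test relies on a createTestContext helper that this plan names but does not define. One possible shape is sketched below: safe defaults plus per-test overrides. The TestBus, TestLLM, and TestContext types are deliberately tiny hypothetical stand-ins; the real AgentContext would carry more fields.

```typescript
// Hypothetical, trimmed-down context shapes for illustration only.
interface TestBus { emittedEvents: unknown[] }
interface TestLLM { responses: unknown[] }
interface TestContext {
  traceId: string;
  bus: TestBus;
  llm: TestLLM;
}

// Builds a context with fresh defaults; callers override only what a test cares about.
function createTestContext(overrides: Partial<TestContext> = {}): TestContext {
  return {
    traceId: 'test-trace',
    bus: { emittedEvents: [] },
    llm: { responses: [] },
    ...overrides,
  };
}

const sharedBus: TestBus = { emittedEvents: [] };
const testCtx = createTestContext({ bus: sharedBus });

console.log(testCtx.bus === sharedBus); // true: the test keeps a handle on the injected mock
console.log(testCtx.traceId);           // "test-trace"
```

Because the test keeps its own reference to the injected mock, it can assert on emitted events without reaching through the context.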

Test the Orchestrator with Mocked Agents:

typescript
// tests/orchestrator/pipeline.test.ts
import { describe, it, expect, beforeEach } from 'bun:test';
import { ForgeOrchestrator } from '../../src/orchestrator/pipeline';
import { MockEventBus } from '../mocks/bus.mock';
import { MockAgent } from '../mocks/agent.mock';
import { createTestContext } from '../helpers/test-context';

describe('ForgeOrchestrator', () => {
  let orchestrator: ForgeOrchestrator;
  let bus: MockEventBus;
  let ctx: OrchestratorContext;

  beforeEach(() => {
    // Keep a handle on the mock bus so tests can inspect emitted events
    bus = new MockEventBus();
    ctx = createTestContext({ bus });
    orchestrator = new ForgeOrchestrator(ctx);
  });

  it('should execute full pipeline', async () => {
    const result = await orchestrator.run({
      task: 'Add user authentication',
      phases: undefined,
      dryRun: false,
      autoApprove: true,
    });

    expect(result.status).toBe('completed');
    expect(result.plan).toBeDefined();
    expect(result.code).toBeDefined();
    expect(result.review).toBeDefined();
    expect(result.testResults).toBeDefined();
  });

  it('should handle review bounce-back', async () => {
    // Mock reviewer to reject first time, approve second time
    // Implementation details...

    const result = await orchestrator.run({
      task: 'Add user authentication',
      phases: undefined,
      dryRun: false,
      autoApprove: false,
    });

    // Check that bounce-back happened
    const bounceEvents = bus.getEventsOfType('loop.phase_bounce');
    expect(bounceEvents).toHaveLength(1);
    expect(bounceEvents[0].payload.from).toBe('review');
    expect(bounceEvents[0].payload.to).toBe('implementation');
  });
});
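The bounce-back test leaves the "reject first, approve second" reviewer as an implementation detail. A scripted mock is one way to fill that in, sketched here under assumptions: ScriptedAgent, its next() helper, and the ReviewOutput decision values are hypothetical and are not the contents of tests/mocks/agent.mock.ts.

```typescript
// Hypothetical stand-in for the reviewer's output type.
type ReviewOutput = { decision: 'reject' | 'approve' };

class ScriptedAgent<TOutput> {
  id: string;
  calls: unknown[] = [];
  private script: TOutput[];

  constructor(id: string, script: TOutput[]) {
    if (script.length === 0) throw new Error(`${id}: script must not be empty`);
    this.id = id;
    this.script = script;
  }

  // Records the input and returns the next scripted output;
  // the last output repeats once the script is exhausted.
  next(input: unknown): TOutput {
    const index = Math.min(this.calls.length, this.script.length - 1);
    this.calls.push(input);
    return this.script[index];
  }

  // Async wrapper matching the shape of Agent.execute.
  async execute(input: unknown): Promise<TOutput> {
    return this.next(input);
  }
}

const scriptedReviewer = new ScriptedAgent<ReviewOutput>('mock-reviewer', [
  { decision: 'reject' },
  { decision: 'approve' },
]);

async function runReviewTwice(): Promise<string[]> {
  const first = await scriptedReviewer.execute({ attempt: 1 });
  const second = await scriptedReviewer.execute({ attempt: 2 });
  return [first.decision, second.decision];
}

runReviewTwice().then((decisions) => console.log(decisions)); // ['reject', 'approve']
```

Recording every input in calls also lets a test assert how many times the orchestrator invoked the reviewer.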

11. Bootstrap Sequence

When forge run is invoked, this is the exact sequence of initialization:

typescript
// src/index.ts
import { loadConfig } from './core/config';
import { createContext } from './core/context';
import { ForgeOrchestrator } from './orchestrator/pipeline';
import { ulid } from 'ulid';
// runMigrations, showStatus, parseCommandLineArgs, and the Command type are
// assumed to be imported from the memory and CLI modules; this plan does not
// specify their exact locations.

export async function bootstrap(command: Command): Promise<void> {
  // 1. Load configuration
  const config = await loadConfig();

  // 2. Run migrations if needed
  await runMigrations(config.memory.dbPath);

  // 3. Create context (this initializes all layers)
  const ctx = await createContext(config, {
    traceId: ulid(),
    startedAt: new Date(),
  });

  try {
    // 4. Register event listeners for observability
    ctx.bus.on('*', (event) => {
      // Log to console in development
      if (config.environment === 'development') {
        console.log(`[${event.source}] ${event.type}`, event.payload);
      }
    });

    // 5. Initialize orchestrator
    const orchestrator = new ForgeOrchestrator(ctx);

    // 6. Execute command
    switch (command.type) {
      case 'run':
        await orchestrator.run({
          task: command.task,
          phases: command.options.phases,
          dryRun: command.options.dryRun ?? false,
          autoApprove: command.options.autoApprove ?? false,
        });
        break;

      case 'status':
        await showStatus(ctx, command.runId);
        break;

      case 'resume':
        await orchestrator.resume(command.checkpointId);
        break;

      // ... other commands
    }
  } finally {
    // 7. Clean up, even if the command throws
    await ctx.db.close();
  }
}

// Entry point
if (import.meta.main) {
  const command = await parseCommandLineArgs();
  await bootstrap(command);
}
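bootstrap switches on command.type, but the Command type itself is not defined in this plan. A discriminated union is one natural fit; the sketch below infers its fields from what bootstrap reads. The phases element type and the describeCommand helper are assumptions made for illustration.

```typescript
// Hypothetical Command shape inferred from the fields bootstrap() reads.
type Command =
  | { type: 'run'; task: string; options: { phases?: string[]; dryRun?: boolean; autoApprove?: boolean } }
  | { type: 'status'; runId: string }
  | { type: 'resume'; checkpointId: string };

function describeCommand(command: Command): string {
  switch (command.type) {
    case 'run':
      return `run "${command.task}" (dryRun=${command.options.dryRun ?? false})`;
    case 'status':
      return `status of ${command.runId}`;
    case 'resume':
      return `resume from ${command.checkpointId}`;
    default: {
      // Exhaustiveness check: adding a variant without a case makes this assignment fail to compile.
      const unreachable: never = command;
      throw new Error(`Unhandled command: ${JSON.stringify(unreachable)}`);
    }
  }
}

console.log(describeCommand({ type: 'run', task: 'Add user authentication', options: {} }));
// run "Add user authentication" (dryRun=false)
```

With this shape, TypeScript narrows command inside each case, so command.task is only accessible in the 'run' branch.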

Initialization Order

1. Load config from forge.config.ts
   ├─ Merge with defaults
   └─ Validate

2. Run database migrations
   ├─ Check schema version
   └─ Apply pending migrations

3. Initialize Database
   ├─ Open SQLite connection
   └─ Set pragmas (WAL mode, etc.)

4. Initialize Event Bus
   └─ Pass database connection

5. Initialize Memory Store
   └─ Pass database connection

6. Initialize Tool Layer
   ├─ Create tool registry
   ├─ Register standard tools
   └─ Initialize sandbox

7. Initialize Safety Layer
   ├─ Create circuit breakers
   ├─ Initialize cost tracker
   └─ Set up human gateway

8. Initialize LLM Provider
   ├─ Load API keys from secrets
   └─ Validate credentials

9. Initialize Agent Pool
   ├─ Create agent registry
   └─ Instantiate all agents (pass context)

10. Initialize Orchestrator
    ├─ Create state machine
    ├─ Load phase configurations
    └─ Initialize checkpoint system

11. Execute Command
    └─ Orchestrator runs the pipeline

12. Cleanup
    ├─ Close database connections
    └─ Flush pending events
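
Steps 3 through 9 of this order can be sketched as plain constructor wiring, where each layer receives only what was built before it. All classes below are hypothetical stubs that record when they were constructed; they stand in for the real layers and are not the contents of createContext.

```typescript
// Records the order in which the stub layers are constructed.
const initOrder: string[] = [];

class Db {
  path: string;
  constructor(path: string) { this.path = path; initOrder.push('db'); }
}
class Bus {
  db: Db;
  constructor(db: Db) { this.db = db; initOrder.push('bus'); }
}
class MemoryStore {
  db: Db;
  constructor(db: Db) { this.db = db; initOrder.push('memory'); }
}
class ToolRegistry {
  bus: Bus;
  constructor(bus: Bus) { this.bus = bus; initOrder.push('tools'); }
}
class SafetySystem {
  bus: Bus;
  constructor(bus: Bus) { this.bus = bus; initOrder.push('safety'); }
}
class LLMProvider {
  apiKey: string;
  constructor(apiKey: string) { this.apiKey = apiKey; initOrder.push('llm'); }
}
interface AgentDeps {
  bus: Bus; memory: MemoryStore; tools: ToolRegistry; safety: SafetySystem; llm: LLMProvider;
}
class AgentPool {
  deps: AgentDeps;
  constructor(deps: AgentDeps) { this.deps = deps; initOrder.push('agents'); }
}

// Each layer is built from layers created earlier, so a missing dependency is a compile error.
function wireLayers(dbPath: string, apiKey: string) {
  const db = new Db(dbPath);
  const bus = new Bus(db);
  const memory = new MemoryStore(db);
  const tools = new ToolRegistry(bus);
  const safety = new SafetySystem(bus);
  const llm = new LLMProvider(apiKey);
  const agents = new AgentPool({ bus, memory, tools, safety, llm });
  return { db, bus, memory, tools, safety, llm, agents };
}

wireLayers(':memory:', 'test-key');
console.log(initOrder.join(' -> '));
// db -> bus -> memory -> tools -> safety -> llm -> agents
```

Because dependencies are passed as constructor arguments, reordering the steps incorrectly fails at compile time instead of at runtime.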

12. File Structure

This is the exact file structure to implement:

forge/
├── src/
│   ├── core/
│   │   ├── types.ts              # Core interfaces (Agent, Tool, Event, Memory, etc.)
│   │   ├── bus.ts                # EventBus implementation
│   │   ├── config.ts             # Configuration loading and validation
│   │   ├── errors.ts             # Error taxonomy
│   │   └── context.ts            # Context creation and propagation
│   │
│   ├── memory/
│   │   ├── schema.ts             # Drizzle SQLite schema
│   │   ├── db.ts                 # Database connection management
│   │   ├── store.ts              # MemoryStore implementation
│   │   ├── episodes.ts           # Episodic memory helpers
│   │   ├── patterns.ts           # Semantic memory / pattern extraction
│   │   └── consolidate.ts        # Memory consolidation and pruning
│   │
│   ├── tools/
│   │   ├── registry.ts           # ToolRegistry implementation
│   │   ├── sandbox.ts            # ToolSandbox implementation
│   │   ├── audit.ts              # ToolAuditLogger
│   │   ├── llm.ts                # LLM provider abstraction
│   │   ├── git.ts                # Git operations tool
│   │   ├── github.ts             # GitHub API tool
│   │   ├── runner.ts             # Shell command execution tool
│   │   ├── linter.ts             # Linter integration tool
│   │   └── test-runner.ts        # Test execution tool
│   │
│   ├── safety/
│   │   ├── breakers.ts           # Circuit breakers (iteration, cost, time, error-rate)
│   │   ├── gates.ts              # Human approval gates
│   │   ├── budget.ts             # Cost tracking and limits
│   │   └── system.ts             # SafetySystem coordinator
│   │
│   ├── agents/
│   │   ├── registry.ts           # AgentRegistry
│   │   ├── base.ts               # BaseAgent with agent loop
│   │   ├── planner.ts            # PlannerAgent
│   │   ├── implementer.ts        # ImplementerAgent
│   │   ├── reviewer.ts           # ReviewerAgent
│   │   ├── tester.ts             # TesterAgent
│   │   └── deployer.ts           # DeployerAgent
│   │
│   ├── orchestrator/
│   │   ├── pipeline.ts           # ForgeOrchestrator
│   │   ├── state-machine.ts      # PipelineStateMachine
│   │   ├── phases.ts             # Phase configurations
│   │   ├── checkpoint.ts         # CheckpointSystem
│   │   └── error-handler.ts      # ErrorHandler
│   │
│   ├── cli/
│   │   ├── index.ts              # CLI entry point (commander setup)
│   │   ├── commands/
│   │   │   ├── run.ts
│   │   │   ├── review.ts
│   │   │   ├── test.ts
│   │   │   ├── status.ts
│   │   │   └── history.ts
│   │   ├── validation.ts         # Input validation (Zod schemas)
│   │   └── ui.ts                 # Terminal output formatting
│   │
│   └── index.ts                  # Main bootstrap sequence
│
├── tests/
│   ├── mocks/
│   │   ├── bus.mock.ts
│   │   ├── llm.mock.ts
│   │   ├── agent.mock.ts
│   │   └── tool.mock.ts
│   ├── helpers/
│   │   └── test-context.ts       # createTestContext helper
│   ├── core/
│   │   ├── bus.test.ts
│   │   └── config.test.ts
│   ├── agents/
│   │   ├── base.test.ts
│   │   ├── reviewer.test.ts
│   │   └── tester.test.ts
│   ├── orchestrator/
│   │   └── pipeline.test.ts
│   └── integration/
│       └── full-pipeline.test.ts
│
├── drizzle/                      # Generated migrations
│
├── forge.config.ts               # Default configuration
├── drizzle.config.ts             # Drizzle migration config
├── package.json
├── tsconfig.json
└── README.md

13. Key Interfaces Summary

Here are the critical interfaces that define layer boundaries:

typescript
// ═══════════════════════════════════════════════════════════════════════
// AGENT INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface Agent {
  id: string;
  type: AgentType;
  execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput>;
}

// ═══════════════════════════════════════════════════════════════════════
// TOOL INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface Tool<TInput = unknown, TOutput = unknown> {
  name: string;
  description: string;
  schema: {
    input: ZodSchema<TInput>;
    output: ZodSchema<TOutput>;
  };
  requiresApproval?: boolean;
  execute(input: TInput, ctx: ToolContext): Promise<TOutput>;
}

// ═══════════════════════════════════════════════════════════════════════
// EVENT BUS INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface EventBus {
  emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void>;
  on(type: string, handler: EventHandler): () => void;
  replay(traceId: string): Promise<ForgeEvent[]>;
  snapshot(traceId: string): Promise<EventSnapshot>;
}

// ═══════════════════════════════════════════════════════════════════════
// MEMORY INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface MemoryStore {
  store(memory: Omit<Memory, 'id' | 'createdAt' | 'lastAccessed' | 'accessCount'>): Promise<string>;
  recall(query: RecallQuery): Promise<Memory[]>;
  update(id: string, updates: Partial<Memory>): Promise<void>;
  delete(id: string): Promise<void>;
}

// ═══════════════════════════════════════════════════════════════════════
// ORCHESTRATOR INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface Orchestrator {
  run(input: PipelineInput): Promise<PipelineResult>;
  resume(checkpointId: string): Promise<PipelineResult>;
  pause(): Promise<void>;
  status(): Promise<PipelineStatus>;
}

// ═══════════════════════════════════════════════════════════════════════
// CHECKPOINT INTERFACE
// ═══════════════════════════════════════════════════════════════════════
interface CheckpointSystem {
  save(checkpoint: Omit<Checkpoint, 'id'>): Promise<string>;
  load(id: string): Promise<Checkpoint>;
  loadLatest(traceId: string): Promise<Checkpoint>;
  list(traceId: string): Promise<Checkpoint[]>;
}
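
Several of these interfaces accept an Omit<...> of their record type: callers supply the domain fields, and the implementation fills in the generated ones. The sketch below shows that pattern for events. The trimmed ForgeEvent, the NewEvent alias, the stamp helper, and the counter-based id (used here in place of ulid to keep the example dependency-free) are all assumptions.

```typescript
// Hypothetical, trimmed-down ForgeEvent for illustration.
interface ForgeEvent {
  id: string;
  timestamp: Date;
  type: string;
  source: string;
  traceId: string;
  payload: Record<string, unknown>;
}

// What callers pass to emit(): everything except the generated fields.
type NewEvent = Omit<ForgeEvent, 'id' | 'timestamp'>;

let eventCounter = 0;

// Fills in the generated fields; `now` is injectable so tests stay deterministic.
function stamp(event: NewEvent, now: Date = new Date()): ForgeEvent {
  eventCounter += 1;
  return { ...event, id: `evt-${eventCounter}`, timestamp: now };
}

const stamped = stamp(
  { type: 'test.event', source: 'test', traceId: 'test-trace', payload: { foo: 'bar' } },
  new Date(0),
);

console.log(stamped.id);                      // "evt-1"
console.log(stamped.timestamp.toISOString()); // "1970-01-01T00:00:00.000Z"
```

Keeping id and timestamp out of the caller-facing type means callers cannot forge either value for a persisted event.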

14. Next Steps for Developers

To implement this architecture:

  1. Start with src/core/types.ts — Define all the core interfaces first. This is the contract.

  2. Implement src/core/bus.ts — Get events working. This is the foundation for everything else.

  3. Set up src/memory/schema.ts — Define the SQLite schema using Drizzle.

  4. Implement src/memory/store.ts — Basic memory storage and recall (no vector search in MVP).

  5. Build one tool — Start with src/tools/runner.ts (shell command execution). Prove the tool interface works.

  6. Build src/agents/base.ts — Implement the agent loop. This is the most critical piece.

  7. Build one complete agent — Start with ReviewerAgent. Get one agent working end-to-end before adding more.

  8. Add safety controls — Implement circuit breakers and gates. Test them with the working agent.

  9. Build the orchestrator — Once one agent works, build the state machine to coordinate multiple agents.

  10. Add the CLI — Wrap the orchestrator in a command-line interface.

  11. Test everything — Write integration tests that exercise the full pipeline.

  12. Iterate — Add more agents, refine the feedback loops, improve error handling.


15. Critical Architectural Decisions Recap

  1. Event Bus as Nervous System — All significant actions emit events. Events are persisted to SQLite. Components subscribe to events they care about. This enables observability, audit trails, and replay.

  2. Dependency Injection — No service locator, no global singletons. All dependencies are passed explicitly via constructor parameters. This makes testing trivial and dependencies visible.

  3. Context Threading — A single ForgeContext is created at bootstrap and threaded through every layer. Specialized views (AgentContext, ToolContext) are derived from it.

  4. SQLite as Source of Truth — The events table is append-only. Everything else (memories, checkpoints, runs) is either derived or auxiliary. You can always reconstruct state by replaying events.

  5. Layers Depend Only Downward — CLI depends on Orchestrator. Orchestrator depends on Agents. Agents depend on Tools and Memory. Tools depend only on Core. This makes the system testable and maintainable.

  6. Safety is Baked In — Circuit breakers, cost tracking, and human gates are not afterthoughts. They are first-class concerns wired into the agent loop and orchestrator.

  7. Feedback Loops via Events — All feedback (review findings, test failures, human decisions) flows through the event bus. Agents listen for relevant events and react accordingly.

  8. Build Foundation First — Core, Memory, and Tools are Week 1-2. Don't start on agents until the foundation is solid.
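
As an illustration of decision 4, derived state can be rebuilt by folding over the event log. The event type names (phase.started, phase.completed), the payload fields, and the RunState shape below are hypothetical; the real schema is whatever the events table stores.

```typescript
// Hypothetical event and state shapes for illustration.
interface PipelineEvent {
  type: string;
  payload: { phase?: string; cost?: number };
}
interface RunState {
  currentPhase: string | null;
  completedPhases: string[];
  totalCost: number;
}

const initialState: RunState = { currentPhase: null, completedPhases: [], totalCost: 0 };

// Pure reducer: the same events in the same order always yield the same state.
function applyEvent(state: RunState, event: PipelineEvent): RunState {
  switch (event.type) {
    case 'phase.started':
      return { ...state, currentPhase: event.payload.phase ?? null };
    case 'phase.completed':
      return {
        currentPhase: null,
        completedPhases: [...state.completedPhases, event.payload.phase ?? 'unknown'],
        totalCost: state.totalCost + (event.payload.cost ?? 0),
      };
    default:
      return state; // Unrecognized events leave the state untouched
  }
}

function replayState(events: PipelineEvent[]): RunState {
  return events.reduce(applyEvent, initialState);
}

const rebuilt = replayState([
  { type: 'phase.started', payload: { phase: 'planning' } },
  { type: 'phase.completed', payload: { phase: 'planning', cost: 0.25 } },
  { type: 'phase.started', payload: { phase: 'implementation' } },
  { type: 'phase.completed', payload: { phase: 'implementation', cost: 0.5 } },
  { type: 'phase.started', payload: { phase: 'review' } },
]);

console.log(rebuilt);
// { currentPhase: 'review', completedPhases: ['planning', 'implementation'], totalCost: 0.75 }
```

Because the reducer never mutates its input, replaying a prefix of the log yields the state at that point in time, which is what resuming from a checkpoint needs.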


This plan is now ready for implementation. The architecture is concrete, the interfaces are defined, and the build order is clear. Time to write code.