Module Map: Forge Agentic SDLC Implementation Plan
- Document: 04-module-map.md
- Date: 2026-02-07
- Status: Implementation Specification
- Purpose: Complete module-level breakdown for Forge system implementation
Table of Contents
- Complete File Tree
- Module Dependency Graph
- src/core/ Module
- src/safety/ Module
- src/memory/ Module
- src/tools/ Module
- src/agents/ Module
- src/orchestrator/ Module
- src/cli/ Module
- Module Interfaces
- Initialization Order
- Testing Structure
1. Complete File Tree
```
forge/
│
├── package.json                  # Project manifest, Bun scripts, dependencies
├── tsconfig.json                 # TypeScript config with path aliases
├── forge.config.ts               # Default configuration template
├── .gitignore                    # Standard ignores + .forge/ local data
├── README.md                     # Project documentation
│
├── drizzle/                      # Database migrations
│   ├── meta/                     # Drizzle metadata
│   └── 0000_initial_schema.sql   # Initial schema migration
│
├── src/
│   │
│   ├── core/                     # Foundation (Week 1)
│   │   ├── types.ts              # Core abstractions: Agent, Tool, Phase, Memory, Event, Checkpoint
│   │   ├── bus.ts                # In-memory event bus with SQLite persistence
│   │   ├── config.ts             # Runtime configuration + validation
│   │   ├── errors.ts             # Error taxonomy + custom error classes
│   │   └── index.ts              # Re-exports core types (explicit, no barrels)
│   │
│   ├── safety/                   # Guardrails (Week 1)
│   │   ├── breakers.ts           # Circuit breaker implementations (iteration, cost, time, error-rate)
│   │   ├── gates.ts              # Human approval gates + gate manager
│   │   ├── budget.ts             # Cost tracking + budget enforcement
│   │   └── index.ts              # Re-exports safety controls
│   │
│   ├── memory/                   # Learning system (Week 2)
│   │   ├── schema.ts             # Drizzle ORM schema (events, memories, patterns, checkpoints, runs, findings)
│   │   ├── store.ts              # Base memory CRUD operations + similarity search
│   │   ├── episodes.ts           # Episodic memory manager (what happened)
│   │   ├── patterns.ts           # Semantic memory manager (pattern extraction)
│   │   ├── procedures.ts         # Procedural memory manager (strategies)
│   │   ├── consolidate.ts        # Knowledge consolidation + pruning job
│   │   └── index.ts              # Re-exports memory interfaces
│   │
│   ├── tools/                    # Tool layer (Week 2)
│   │   ├── registry.ts           # Tool registry + discovery
│   │   ├── sandbox.ts            # Execution sandboxing + validation
│   │   ├── llm.ts                # LLM provider abstraction (Anthropic, OpenAI, Ollama)
│   │   ├── git.ts                # Git operations (diff, commit, branch, log)
│   │   ├── github.ts             # GitHub API (PR, issues, comments, checks)
│   │   ├── runner.ts             # Shell command execution + parsing
│   │   ├── linter.ts             # ESLint/Biome integration
│   │   ├── test-runner.ts        # Jest/Vitest execution + result parsing
│   │   └── index.ts              # Re-exports tool interfaces
│   │
│   ├── agents/                   # Agent implementations (Week 3-6)
│   │   ├── base.ts               # BaseAgent with perceive-reason-act-learn loop
│   │   ├── planner.ts            # PlannerAgent: requirements → plan → tasks
│   │   ├── implementer.ts        # ImplementerAgent: tasks → code
│   │   ├── reviewer.ts           # ReviewerAgent: 3-layer review (static, security, AI)
│   │   ├── tester.ts             # TesterAgent: test selection, execution, analysis
│   │   ├── deployer.ts           # DeployerAgent: build → deploy → verify
│   │   └── index.ts              # Re-exports agent classes
│   │
│   ├── orchestrator/             # Pipeline coordination (Week 7)
│   │   ├── pipeline.ts           # Pipeline state machine + phase execution
│   │   ├── checkpoint.ts         # State persistence + restore
│   │   ├── context.ts            # Shared context across agents
│   │   └── index.ts              # Re-exports orchestrator interfaces
│   │
│   └── cli/                      # User interface (Week 8)
│       ├── index.ts              # CLI entry point + command registration
│       ├── ui.ts                 # Terminal output formatting (spinners, tables)
│       ├── commands/
│       │   ├── run.ts            # forge run <task>
│       │   ├── review.ts         # forge review <pr>
│       │   ├── test.ts           # forge test <files>
│       │   ├── status.ts         # forge status
│       │   ├── history.ts        # forge history
│       │   └── index.ts          # Command exports
│       └── prompts.ts            # Interactive prompts (approvals, inputs)
│
└── tests/                        # Test suite
    ├── unit/                     # Unit tests (mirrors src structure)
    │   ├── core/
    │   ├── safety/
    │   ├── memory/
    │   ├── tools/
    │   └── agents/
    ├── integration/              # Integration tests
    │   ├── pipeline.test.ts      # Full pipeline flow
    │   ├── review-flow.test.ts   # Review → fix → re-review
    │   └── memory.test.ts        # Memory persistence + recall
    ├── fixtures/                 # Test data
    │   ├── sample-repos/         # Sample code for testing
    │   ├── mock-responses/       # LLM response mocks
    │   └── test-configs/         # Test configurations
    └── helpers/                  # Test utilities
        ├── mock-llm.ts           # LLM mock implementation
        ├── temp-db.ts            # Temporary DB setup
        └── fixtures.ts           # Fixture loader
```
2. Module Dependency Graph
┌──────────────────────────────────────────────────────────────────────────────┐
│ MODULE DEPENDENCY GRAPH │
│ │
│ Legend: │
│ ──▶ Direct dependency (imports from) │
│ ╌╌▶ Optional/runtime dependency │
│ [X] External package │
└──────────────────────────────────────────────────────────────────────────────┘
LAYER 0: External Dependencies
═══════════════════════════════
[Bun] [Drizzle] [Zod] [Anthropic SDK] [Octokit]
│
▼
LAYER 1: Core Foundation (No internal dependencies)
═══════════════════════════════════════════════════
┌────────────┐ ┌────────────┐ ┌────────────┐
│ core/types │ │core/errors │ │core/config │
└────────────┘ └────────────┘ └────────────┘
│ │ │
└─────────────────┴──────────────────┘
│
▼
┌────────────┐
│ core/bus │ (depends on: types, errors, memory/schema for the events table)
└────────────┘
│
▼
LAYER 2: Support Systems (Depend on core only)
═══════════════════════════════════════════════
┌────────────────┐ ┌────────────────┐
│ safety/* │◀───────│ memory/schema │
│ (all files) │ │ │
└────────────────┘ └────────────────┘
│ │
│ ┌─────────────────────────┘
│ │
▼ ▼
┌────────────────┐
│ memory/store │ (depends on: schema, core/types, core/bus)
└────────────────┘
│
├───▶ memory/episodes.ts
├───▶ memory/patterns.ts
├───▶ memory/procedures.ts
└───▶ memory/consolidate.ts
│
▼
LAYER 3: Tool Layer (Depends on core + safety)
═══════════════════════════════════════════════
┌────────────────┐
│ tools/registry │ (depends on: core/types, safety/breakers)
└────────────────┘
│
│ (All tools implement Tool interface from core/types)
│
├───▶ tools/llm.ts (depends on: registry, [Anthropic SDK])
├───▶ tools/git.ts (depends on: registry, tools/runner)
├───▶ tools/github.ts (depends on: registry, [Octokit])
├───▶ tools/runner.ts (depends on: registry, tools/sandbox)
├───▶ tools/linter.ts (depends on: registry, tools/runner)
└───▶ tools/test-runner.ts (depends on: registry, tools/runner)
│
▼
LAYER 4: Agent Layer (Depends on core + tools + memory)
═════════════════════════════════════════════════════════
┌────────────────┐
│ agents/base │ (depends on: core/types, tools/registry, memory/*, safety/*)
└────────────────┘
│
│ (All agents extend BaseAgent)
│
├───▶ agents/planner.ts (depends on: base, tools/llm, tools/git)
├───▶ agents/implementer.ts (depends on: base, tools/llm, tools/git, tools/runner)
├───▶ agents/reviewer.ts (depends on: base, tools/llm, tools/linter, tools/git)
├───▶ agents/tester.ts (depends on: base, tools/test-runner, tools/llm)
└───▶ agents/deployer.ts (depends on: base, tools/runner, tools/github)
│
▼
LAYER 5: Orchestration (Depends on agents + core + memory)
═══════════════════════════════════════════════════════════
┌────────────────────┐
│orchestrator/context│ (depends on: core/bus, core/types)
└────────────────────┘
│
▼
┌─────────────────────────┐
│orchestrator/checkpoint │ (depends on: memory/schema, core/types)
└─────────────────────────┘
│
▼
┌─────────────────────────┐
│orchestrator/pipeline │ (depends on: agents/*, context, checkpoint, safety/*)
└─────────────────────────┘
│
▼
LAYER 6: CLI (Depends on everything)
═════════════════════════════════════
┌────────────────┐
│ cli/ui.ts │ (terminal formatting only)
│ cli/prompts.ts│
└────────────────┘
│
▼
┌────────────────┐
│ cli/commands/* │ (depends on: orchestrator/*, agents/*, core/*, ui, prompts)
└────────────────┘
│
▼
┌────────────────┐
│ cli/index.ts │ (CLI entry point, command registration)
└────────────────┘
Circular Dependency Risks
| Potential Cycle | Risk | Prevention |
|---|---|---|
| agents ↔ tools | Low | Tools are stateless, agents consume via registry |
| memory ↔ bus | Medium | core/bus imports only the events table from memory/schema; schema imports nothing internal |
| orchestrator ↔ agents | Medium | Orchestrator calls agents, agents emit events to bus (not to orchestrator) |
| tools ↔ safety | Low | Tools import safety (circuit breakers); safety never imports tools |
Prevention Strategy:
- All circular risks flow through the event bus (one-way communication)
- Agents never import orchestrator
- Tools never import agents
- All cross-module communication via interfaces from core/types
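The rules above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `findViolations` helper and a hand-written import map (neither is part of the plan); a real check would derive the map from the source files.

```typescript
// Hypothetical helper: flags imports that break the layering rules listed above.
type ModuleName = 'core' | 'safety' | 'memory' | 'tools' | 'agents' | 'orchestrator' | 'cli';

// Forbidden edges taken from the prevention rules:
// agents never import orchestrator, tools never import agents,
// and safety never imports tools.
const FORBIDDEN: ReadonlyArray<[ModuleName, ModuleName]> = [
  ['agents', 'orchestrator'],
  ['tools', 'agents'],
  ['safety', 'tools'],
];

function findViolations(imports: Record<string, ModuleName[]>): string[] {
  const violations: string[] = [];
  for (const [from, targets] of Object.entries(imports)) {
    for (const [src, dst] of FORBIDDEN) {
      if (from === src && targets.includes(dst)) {
        violations.push(`${src} must not import ${dst}`);
      }
    }
  }
  return violations;
}

// Illustrative import map; the second entry breaks a rule on purpose.
const sampleImports: Record<string, ModuleName[]> = {
  agents: ['core', 'tools', 'memory', 'safety'],
  tools: ['core', 'safety', 'agents'],
};

console.log(findViolations(sampleImports)); // reports the tools → agents edge
```

A check like this could run in CI so that a forbidden import fails the build instead of surfacing later as a runtime cycle.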
3. src/core/ Module
File: core/types.ts
Purpose: Central type definitions for the entire system.
Exports:
```typescript
import type { ZodSchema } from 'zod';

// EventBus, MemoryStore, LLMProvider, ToolRegistry, SafetyControls, CostTracker,
// HumanGate and CircuitBreaker are declared in their own modules and are
// referenced here as types only.

// ─── Agent Types ───────────────────────────────────────────
export type AgentType = 'planner' | 'implementer' | 'reviewer' | 'tester' | 'deployer';

export interface Agent {
  id: string;
  type: AgentType;
  execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput>;
}

export interface AgentContext {
  traceId: string;
  bus: EventBus;
  memory: MemoryStore;
  llm: LLMProvider;
  tools: ToolRegistry;
  safety: SafetyControls;
  config: ForgeConfig;
  cost: CostTracker;
  elapsed: number; // milliseconds since phase started
}

export interface PhaseInput {
  task: string;
  previousPhaseOutput?: unknown;
  context: Record<string, unknown>;
}

export interface PhaseOutput {
  success: boolean;
  result: unknown;
  cost: number;
  duration: number;
  iterations: number;
  learnings?: Learning[];
}

// ─── Tool Types ────────────────────────────────────────────
export interface Tool<TInput = unknown, TOutput = unknown> {
  name: string;
  description: string;
  schema: {
    input: ZodSchema<TInput>;
    output: ZodSchema<TOutput>;
  };
  execute(input: TInput, ctx: ToolContext): Promise<TOutput>;
}

export interface ToolContext {
  traceId: string;
  agentId: string;
  workingDir: string;
  timeout?: number;
}

// ─── Event Types ───────────────────────────────────────────
export interface ForgeEvent {
  id: string;
  traceId: string;
  timestamp: Date;
  source: string; // agent or component id
  type: string; // dot-namespaced: "review.finding"
  payload: unknown;
  cost?: { tokens: number; usd: number };
}

// ─── Phase Types ───────────────────────────────────────────
export type PhaseName = 'planning' | 'implementation' | 'review' | 'testing' | 'deployment';

export interface Phase {
  name: PhaseName;
  agent: Agent;
  guards: Guard[]; // Pre-conditions
  gates: HumanGate[]; // Human approval points
  breakers: CircuitBreaker[];
  next: PhaseName | null;
}

export interface Guard {
  name: string;
  check(context: unknown): Promise<boolean>;
  failureMessage: string;
}

// ─── Memory Types ──────────────────────────────────────────
export type MemoryType = 'episodic' | 'semantic' | 'procedural';

export interface Memory {
  id: string;
  type: MemoryType;
  content: string;
  embedding?: Float32Array;
  confidence: number;
  context: string;
  createdAt: Date;
  lastAccessed: Date;
  accessCount: number;
}

export interface Learning {
  type: MemoryType;
  content: string;
  context: string;
  confidence: number;
  source?: string;
}

// ─── Checkpoint Types ──────────────────────────────────────
export interface Checkpoint {
  id: string;
  traceId: string;
  phase: PhaseName;
  state: Record<string, unknown>;
  timestamp: Date;
}

// ─── Config Types ──────────────────────────────────────────
export interface ForgeConfig {
  name: string;
  language: string;
  llm: LLMConfig;
  tools: ToolsConfig;
  safety: SafetyConfig;
  github?: GitHubConfig;
  memory: MemoryConfig;
}

// ... (detailed config types)
```
Dependencies:
- zod (for schema validation)
- No internal dependencies
File: core/errors.ts
Purpose: Error taxonomy and custom error classes.
Exports:
```typescript
// ─── Error Categories ──────────────────────────────────────
export enum ErrorCategory {
  CONFIG = 'config',
  TOOL = 'tool',
  AGENT = 'agent',
  SAFETY = 'safety',
  MEMORY = 'memory',
  NETWORK = 'network',
  LLM = 'llm',
}

export enum ErrorSeverity {
  INFO = 'info',
  WARNING = 'warning',
  ERROR = 'error',
  CRITICAL = 'critical',
}

// ─── Base Error Class ──────────────────────────────────────
export class ForgeError extends Error {
  category: ErrorCategory;
  severity: ErrorSeverity;
  recoverable: boolean;
  context?: Record<string, unknown>;

  constructor(
    message: string,
    category: ErrorCategory,
    severity: ErrorSeverity,
    recoverable: boolean = false,
    context?: Record<string, unknown>
  ) {
    super(message);
    this.name = 'ForgeError';
    this.category = category;
    this.severity = severity;
    this.recoverable = recoverable;
    this.context = context;
  }
}

// ─── Specific Error Types ──────────────────────────────────
export class CircuitBreakerError extends ForgeError {
  constructor(message: string, context?: Record<string, unknown>) {
    super(message, ErrorCategory.SAFETY, ErrorSeverity.CRITICAL, false, context);
    this.name = 'CircuitBreakerError';
  }
}

export class ToolExecutionError extends ForgeError {
  constructor(message: string, recoverable = true, context?: Record<string, unknown>) {
    super(message, ErrorCategory.TOOL, ErrorSeverity.ERROR, recoverable, context);
    this.name = 'ToolExecutionError';
  }
}

export class LLMError extends ForgeError {
  constructor(message: string, recoverable = true, context?: Record<string, unknown>) {
    super(message, ErrorCategory.LLM, ErrorSeverity.ERROR, recoverable, context);
    this.name = 'LLMError';
  }
}

// ... (more error types)

// ─── Error Utilities ───────────────────────────────────────
export function isRecoverable(error: Error): boolean {
  return error instanceof ForgeError && error.recoverable;
}

export function formatError(error: Error): string {
  // Human-readable error formatting (placeholder format; adjust as needed)
  if (error instanceof ForgeError) {
    return `[${error.severity.toUpperCase()}] ${error.category}: ${error.message}`;
  }
  return `${error.name}: ${error.message}`;
}
```
Dependencies:
- None
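The `recoverable` flag is what lets callers decide between retrying and failing fast. The sketch below illustrates one way to use it; `withRetry` is a hypothetical helper that does not appear in the plan, and the `ForgeError` class here is a cut-down stand-in so the example runs on its own.

```typescript
// Minimal stand-in for the ForgeError class defined in core/errors.ts.
class ForgeError extends Error {
  recoverable: boolean;
  constructor(message: string, recoverable = false) {
    super(message);
    this.name = 'ForgeError';
    this.recoverable = recoverable;
  }
}

function isRecoverable(error: unknown): boolean {
  return error instanceof ForgeError && error.recoverable;
}

// Hypothetical helper: retries only while the failure is marked recoverable.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (!isRecoverable(error)) throw error; // non-recoverable: fail fast
    }
  }
  throw lastError; // recoverable, but attempts are exhausted
}

// Usage: fails twice with a recoverable error, then succeeds on the third call.
let calls = 0;
const flaky = async (): Promise<string> => {
  calls++;
  if (calls < 3) throw new ForgeError('transient tool failure', true);
  return 'ok';
};

const retryResult = withRetry(flaky);
retryResult.then((value) => console.log(value, calls));
```

A `CircuitBreakerError` is always constructed with `recoverable = false`, so a loop like this would never retry past a tripped breaker.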
File: core/config.ts
Purpose: Runtime configuration loading, validation, and defaults.
Exports:
```typescript
import { z } from 'zod';
import type { ForgeConfig } from './types';

// ─── Schema Validation ─────────────────────────────────────
const ForgeConfigSchema = z.object({
  name: z.string(),
  language: z.string().default('typescript'),
  llm: z.object({
    provider: z.enum(['anthropic', 'openai', 'ollama']),
    model: z.string(),
    fastModel: z.string().optional(),
  }),
  tools: z.object({
    testCommand: z.string().default('bun test'),
    lintCommand: z.string().default('bun run lint'),
    buildCommand: z.string().default('bun run build'),
    typecheckCommand: z.string().default('bun run typecheck'),
  }),
  safety: z.object({
    costPerRun: z.number().default(50),
    costPerDay: z.number().default(200),
    automationLevel: z.number().min(0).max(4).default(1),
  }),
  // ... more config
});

// ─── Default Configuration ─────────────────────────────────
export const DEFAULT_CONFIG: ForgeConfig = {
  name: 'forge-project',
  language: 'typescript',
  llm: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-5-20250929',
    fastModel: 'claude-haiku-4-5-20251001',
  },
  // ... defaults for all fields
};

// ─── Config Loader ─────────────────────────────────────────
export async function loadConfig(path?: string): Promise<ForgeConfig> {
  // 1. Load from forge.config.ts if exists
  // 2. Merge with defaults
  // 3. Validate with Zod
  // 4. Return validated config
}

export function validateConfig(config: unknown): ForgeConfig {
  return ForgeConfigSchema.parse(config);
}
```
Dependencies:
- zod
- ./types
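Step 2 of `loadConfig` ("merge with defaults") is left open in the plan. One possible approach is a recursive merge in which user values override defaults section by section. The sketch below assumes a hypothetical `mergeWithDefaults` helper and uses placeholder values rather than the real defaults.

```typescript
// Hypothetical deep-merge helper for the "merge with defaults" step.
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function mergeWithDefaults(defaults: Plain, overrides: Plain): Plain {
  const merged: Plain = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    if (value === undefined) continue; // keep the default
    const base = merged[key];
    merged[key] =
      isPlainObject(base) && isPlainObject(value)
        ? mergeWithDefaults(base, value) // recurse into nested sections
        : value; // scalars and arrays replace the default outright
  }
  return merged;
}

// Illustrative values only; 'default-model' is a placeholder, not a real default.
const defaultValues: Plain = {
  language: 'typescript',
  llm: { provider: 'anthropic', model: 'default-model' },
  safety: { costPerRun: 50, costPerDay: 200, automationLevel: 1 },
};
const userConfig: Plain = { name: 'my-app', safety: { costPerRun: 10 } };

const mergedConfig = mergeWithDefaults(defaultValues, userConfig);
console.log(JSON.stringify(mergedConfig, null, 2));
```

The merged object would then be passed to `validateConfig`, so a user value of the wrong type is still rejected by the Zod schema.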
File: core/bus.ts
Purpose: In-memory event bus with SQLite persistence.
Exports:
```typescript
import { eq } from 'drizzle-orm';
import type { BunSQLiteDatabase } from 'drizzle-orm/bun-sqlite';
import { ulid } from 'ulid';
import { events } from '../memory/schema';
import type { ForgeEvent } from './types';

export type EventHandler = (event: ForgeEvent) => void | Promise<void>;

export interface EventBus {
  emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void>;
  on(type: string, handler: EventHandler): () => void; // returns unsubscribe
  once(type: string, handler: EventHandler): () => void;
  replay(traceId: string): Promise<ForgeEvent[]>;
  clear(): void; // for testing
}

export class InMemoryEventBus implements EventBus {
  private handlers = new Map<string, Set<EventHandler>>();
  // memory/schema exports tables, not a Database type, so the Drizzle
  // instance type is used here (the same one memory/store.ts uses).
  private db: BunSQLiteDatabase;

  constructor(db: BunSQLiteDatabase) {
    this.db = db;
  }

  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void> {
    const full: ForgeEvent = {
      ...event,
      id: ulid(),
      timestamp: new Date(),
    };

    // Persist to SQLite; the nested cost object maps onto two flat columns
    const { cost, ...row } = full;
    await this.db.insert(events).values({
      ...row,
      tokensUsed: cost?.tokens,
      costUsd: cost?.usd,
    });

    // Notify in-memory subscribers (exact type match, plus '*' wildcard)
    const typeHandlers = this.handlers.get(event.type) ?? new Set();
    const wildcardHandlers = this.handlers.get('*') ?? new Set();
    for (const handler of [...typeHandlers, ...wildcardHandlers]) {
      await handler(full);
    }
  }

  on(type: string, handler: EventHandler): () => void {
    if (!this.handlers.has(type)) {
      this.handlers.set(type, new Set());
    }
    this.handlers.get(type)!.add(handler);

    // Return unsubscribe function
    return () => this.handlers.get(type)?.delete(handler);
  }

  once(type: string, handler: EventHandler): () => void {
    const wrapper = async (event: ForgeEvent) => {
      // Unsubscribe first so the handler runs at most once, even if it throws
      this.handlers.get(type)?.delete(wrapper);
      await handler(event);
    };
    return this.on(type, wrapper);
  }

  async replay(traceId: string): Promise<ForgeEvent[]> {
    return this.db
      .select()
      .from(events)
      .where(eq(events.traceId, traceId))
      .orderBy(events.timestamp);
  }

  clear(): void {
    this.handlers.clear();
  }
}
```
Dependencies:
- ./types
- ../memory/schema (for events table)
- ulid (for ID generation)
- drizzle-orm
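The subscription semantics are easiest to see without the database. The following sketch uses `TestBus`, a hypothetical stand-in that keeps the handler bookkeeping of `InMemoryEventBus` but drops persistence and async handlers. It shows that `on` returns an unsubscribe function, that `'*'` receives every event, and that matching is on the exact type string rather than on a prefix.

```typescript
// Stand-in for InMemoryEventBus without the SQLite write; names are illustrative.
type BusEvent = { type: string; payload: unknown };
type Handler = (event: BusEvent) => void;

class TestBus {
  private handlers = new Map<string, Set<Handler>>();

  on(type: string, handler: Handler): () => void {
    if (!this.handlers.has(type)) this.handlers.set(type, new Set());
    this.handlers.get(type)!.add(handler);
    return () => {
      this.handlers.get(type)?.delete(handler);
    };
  }

  emit(event: BusEvent): void {
    const exact = this.handlers.get(event.type) ?? new Set<Handler>();
    const wildcard = this.handlers.get('*') ?? new Set<Handler>();
    for (const handler of [...exact, ...wildcard]) handler(event);
  }
}

const bus = new TestBus();
const seen: string[] = [];

const unsubscribe = bus.on('review.finding', (e) => {
  seen.push(`exact:${e.type}`);
});
bus.on('*', (e) => {
  seen.push(`any:${e.type}`);
});

bus.emit({ type: 'review.finding', payload: {} }); // both handlers fire
unsubscribe();
bus.emit({ type: 'review.finding', payload: {} }); // only the wildcard fires
bus.emit({ type: 'review', payload: {} }); // 'review' is not a prefix subscription

console.log(seen);
```

Because there is no prefix matching, a component that wants every `review.*` event would subscribe to `'*'` and filter on `event.type` itself.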
File: core/index.ts
Purpose: Public API of core module.
Strategy: Explicit named re-exports only; no wildcard (export *) barrels.
```typescript
// Explicit re-exports for tree-shaking
export type {
  Agent,
  AgentType,
  AgentContext,
  PhaseInput,
  PhaseOutput,
  Tool,
  ToolContext,
  ForgeEvent,
  Phase,
  PhaseName,
  Guard,
  Memory,
  MemoryType,
  Learning,
  Checkpoint,
  ForgeConfig,
} from './types';

export {
  ErrorCategory,
  ErrorSeverity,
  ForgeError,
  CircuitBreakerError,
  ToolExecutionError,
  LLMError,
  isRecoverable,
  formatError,
} from './errors';

export {
  DEFAULT_CONFIG,
  loadConfig,
  validateConfig,
} from './config';

export type { EventBus, EventHandler } from './bus';
export { InMemoryEventBus } from './bus';
```
4. src/safety/ Module
File: safety/breakers.ts
Purpose: Circuit breaker implementations.
Exports:
```typescript
import type { AgentContext } from '../core/types';
import { CircuitBreakerError } from '../core/errors';

export interface CircuitBreaker {
  name: string;
  check(ctx: BreakerContext): BreakerResult;
}

export interface BreakerContext {
  iteration: number;
  cost: number;
  elapsed: number;
  errorCount: number;
  phase: string;
}

export interface BreakerResult {
  shouldBreak: boolean;
  reason?: string;
  metric?: string;
  value?: number;
  limit?: number;
}

// ─── Specific Breakers ─────────────────────────────────────
export class IterationBreaker implements CircuitBreaker {
  name = 'iteration';

  constructor(private limits: Record<string, number>) {}

  check(ctx: BreakerContext): BreakerResult {
    const limit = this.limits[ctx.phase] ?? 10;
    if (ctx.iteration > limit) {
      return {
        shouldBreak: true,
        reason: `Iteration limit exceeded for ${ctx.phase}`,
        metric: 'iterations',
        value: ctx.iteration,
        limit,
      };
    }
    return { shouldBreak: false };
  }
}

export class CostBreaker implements CircuitBreaker {
  name = 'cost';

  constructor(private limits: Record<string, number>) {}

  check(ctx: BreakerContext): BreakerResult {
    const limit = this.limits[ctx.phase] ?? 5.0;
    if (ctx.cost > limit) {
      return {
        shouldBreak: true,
        reason: `Cost limit exceeded for ${ctx.phase}`,
        metric: 'cost_usd',
        value: ctx.cost,
        limit,
      };
    }
    return { shouldBreak: false };
  }
}

export class TimeBreaker implements CircuitBreaker {
  name = 'time';

  constructor(private limits: Record<string, number>) {}

  check(ctx: BreakerContext): BreakerResult {
    const limit = this.limits[ctx.phase] ?? 30 * 60_000; // 30 minutes
    if (ctx.elapsed > limit) {
      return {
        shouldBreak: true,
        reason: `Time limit exceeded for ${ctx.phase}`,
        metric: 'duration_ms',
        value: ctx.elapsed,
        limit,
      };
    }
    return { shouldBreak: false };
  }
}

export class ErrorRateBreaker implements CircuitBreaker {
  name = 'error_rate';

  constructor(private threshold: number = 0.25) {}

  check(ctx: BreakerContext): BreakerResult {
    if (ctx.iteration === 0) return { shouldBreak: false };
    const rate = ctx.errorCount / ctx.iteration;
    if (rate > this.threshold) {
      return {
        shouldBreak: true,
        reason: `Error rate ${(rate * 100).toFixed(1)}% exceeds threshold`,
        metric: 'error_rate',
        value: rate,
        limit: this.threshold,
      };
    }
    return { shouldBreak: false };
  }
}

// ─── Breaker Manager ───────────────────────────────────────
export class BreakerManager {
  private breakers: CircuitBreaker[];

  constructor(breakers: CircuitBreaker[]) {
    this.breakers = breakers;
  }

  check(ctx: BreakerContext): BreakerResult {
    for (const breaker of this.breakers) {
      const result = breaker.check(ctx);
      if (result.shouldBreak) {
        return result;
      }
    }
    return { shouldBreak: false };
  }
}
```
Dependencies:
- ../core/types
- ../core/errors
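To show how the breakers interact inside an agent loop, here is a compact sketch. The check functions are simplified stand-ins for `IterationBreaker` and `ErrorRateBreaker`, the limit of 5 iterations is illustrative, and `runLoop` is a hypothetical loop in which every second step fails.

```typescript
// Compact stand-ins for the breaker types above; the limits are illustrative.
interface BreakerContext {
  iteration: number;
  cost: number;
  elapsed: number;
  errorCount: number;
  phase: string;
}
interface BreakerResult {
  shouldBreak: boolean;
  reason?: string;
}
type Check = (ctx: BreakerContext) => BreakerResult;

const iterationCheck: Check = (ctx) =>
  ctx.iteration > 5 ? { shouldBreak: true, reason: 'iterations' } : { shouldBreak: false };

const errorRateCheck: Check = (ctx) =>
  ctx.iteration > 0 && ctx.errorCount / ctx.iteration > 0.25
    ? { shouldBreak: true, reason: 'error_rate' }
    : { shouldBreak: false };

// Same first-match semantics as BreakerManager.check.
function checkAll(checks: Check[], ctx: BreakerContext): BreakerResult {
  for (const check of checks) {
    const result = check(ctx);
    if (result.shouldBreak) return result;
  }
  return { shouldBreak: false };
}

// Hypothetical agent loop: every second step records an error.
function runLoop(): { stoppedAt: number; reason?: string } {
  const ctx: BreakerContext = {
    iteration: 0,
    cost: 0,
    elapsed: 0,
    errorCount: 0,
    phase: 'implementation',
  };
  while (true) {
    ctx.iteration++;
    if (ctx.iteration % 2 === 0) ctx.errorCount++;
    const result = checkAll([iterationCheck, errorRateCheck], ctx);
    if (result.shouldBreak) return { stoppedAt: ctx.iteration, reason: result.reason };
  }
}

const outcome = runLoop();
console.log(outcome); // stops at iteration 2 with reason 'error_rate'
```

The run stops on the first error: one failure in two iterations is a 50% rate, well above the 0.25 default. If early trips like this are not intended, a minimum sample size before the error-rate check applies may be worth considering.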
File: safety/gates.ts
Purpose: Human approval gate system.
Exports:
```typescript
import type { ForgeEvent } from '../core/types';
import type { EventBus } from '../core/bus';

export interface HumanGate {
  id: string;
  phase: string;
  condition: (context: unknown) => boolean;
  prompt: string;
  timeout: number; // milliseconds
}

export interface GateRequest {
  gateId: string;
  phase: string;
  context: unknown;
  prompt: string;
  requestedAt: Date;
  deadline: Date;
}

export interface GateDecision {
  approved: boolean;
  feedback?: string;
  decidedAt: Date;
}

export class GateManager {
  private pendingGates = new Map<string, GateRequest>();

  constructor(
    private gates: HumanGate[],
    private bus: EventBus,
    private requestHandler: (request: GateRequest) => Promise<GateDecision>
  ) {}

  async checkGates(phase: string, context: unknown): Promise<void> {
    const applicableGates = this.gates.filter(
      g => g.phase === phase && g.condition(context)
    );

    for (const gate of applicableGates) {
      await this.requestApproval(gate, context);
    }
  }

  private async requestApproval(gate: HumanGate, context: unknown): Promise<void> {
    const request: GateRequest = {
      gateId: gate.id,
      phase: gate.phase,
      context,
      prompt: gate.prompt,
      requestedAt: new Date(),
      deadline: new Date(Date.now() + gate.timeout),
    };

    this.pendingGates.set(gate.id, request);

    await this.bus.emit({
      type: 'gate.requested',
      source: 'gate_manager',
      traceId: (context as any).traceId,
      payload: request,
    });

    // Note: the deadline is recorded on the request but not enforced here
    const decision = await this.requestHandler(request);
    this.pendingGates.delete(gate.id);

    await this.bus.emit({
      type: decision.approved ? 'gate.approved' : 'gate.rejected',
      source: 'gate_manager',
      traceId: (context as any).traceId,
      payload: { request, decision },
    });

    if (!decision.approved) {
      throw new Error(
        `Human gate rejected: ${gate.id}. Feedback: ${decision.feedback ?? 'none'}`
      );
    }
  }
}

// ─── Default Gates ─────────────────────────────────────────
export const DEFAULT_GATES: HumanGate[] = [
  {
    id: 'production_deploy',
    phase: 'deployment',
    condition: (ctx: any) => ctx.environment === 'production',
    prompt: 'Approve production deployment?',
    timeout: 60 * 60_000, // 1 hour
  },
  {
    id: 'high_risk_change',
    phase: 'review',
    condition: (ctx: any) =>
      ctx.riskScore?.level === 'high' || ctx.riskScore?.level === 'critical',
    prompt: 'High-risk change detected. Review required.',
    timeout: 24 * 60 * 60_000, // 24 hours
  },
  {
    id: 'security_finding',
    phase: 'review',
    condition: (ctx: any) =>
      ctx.findings?.some((f: any) =>
        f.severity === 'critical' && f.category === 'security'
      ),
    prompt: 'Critical security finding. Human review required.',
    timeout: 12 * 60 * 60_000, // 12 hours
  },
];
```
Dependencies:
- ../core/types
- ../core/bus
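Each gate carries a `timeout`, and `requestApproval` computes a `deadline`, but the code shown awaits the request handler without a time limit. One way to enforce the deadline is to race the decision against a timer. The sketch below assumes a hypothetical `withDeadline` wrapper and treats a timeout as a rejection, which is a policy choice the plan does not specify.

```typescript
// Hypothetical wrapper: rejects a gate automatically when no decision arrives in time.
interface GateDecision {
  approved: boolean;
  feedback?: string;
  decidedAt: Date;
}

function withDeadline(
  decision: Promise<GateDecision>,
  timeoutMs: number
): Promise<GateDecision> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOutDecision = new Promise<GateDecision>((resolve) => {
    timer = setTimeout(
      () =>
        resolve({
          approved: false,
          feedback: 'Timed out waiting for approval',
          decidedAt: new Date(),
        }),
      timeoutMs
    );
  });
  // Whichever settles first wins; the timer is cleared in either case.
  return Promise.race([decision, timedOutDecision]).finally(() => clearTimeout(timer));
}

// Usage: the human never answers, so the gate is rejected after 20 ms.
const neverAnswers = new Promise<GateDecision>(() => {});
const timedOut = withDeadline(neverAnswers, 20);
timedOut.then((d) => console.log(d.approved, d.feedback));
```

Inside `GateManager`, the call would look like `withDeadline(this.requestHandler(request), gate.timeout)`, so the existing rejection path handles timeouts without further changes.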
File: safety/budget.ts
Purpose: Cost tracking and budget enforcement.
Exports:
```typescript
export interface CostTracker {
  addCost(amount: number, category: string): void;
  getTotal(): number;
  getByCategory(category: string): number;
  getRemainingBudget(): number;
  isOverBudget(): boolean;
}

export class SimpleCostTracker implements CostTracker {
  private costs: { amount: number; category: string; timestamp: Date }[] = [];

  constructor(private budget: number) {}

  addCost(amount: number, category: string): void {
    this.costs.push({ amount, category, timestamp: new Date() });
  }

  getTotal(): number {
    return this.costs.reduce((sum, c) => sum + c.amount, 0);
  }

  getByCategory(category: string): number {
    return this.costs
      .filter(c => c.category === category)
      .reduce((sum, c) => sum + c.amount, 0);
  }

  getRemainingBudget(): number {
    return this.budget - this.getTotal();
  }

  isOverBudget(): boolean {
    return this.getTotal() > this.budget;
  }

  getBreakdown(): Record<string, number> {
    const breakdown: Record<string, number> = {};
    for (const cost of this.costs) {
      breakdown[cost.category] = (breakdown[cost.category] ?? 0) + cost.amount;
    }
    return breakdown;
  }
}
```
Dependencies:
- None
File: safety/index.ts
```typescript
export type { CircuitBreaker, BreakerContext, BreakerResult } from './breakers';
export {
  IterationBreaker,
  CostBreaker,
  TimeBreaker,
  ErrorRateBreaker,
  BreakerManager,
} from './breakers';

export type { HumanGate, GateRequest, GateDecision } from './gates';
export { GateManager, DEFAULT_GATES } from './gates';

export type { CostTracker } from './budget';
export { SimpleCostTracker } from './budget';
```
5. src/memory/ Module
File: memory/schema.ts
Purpose: Drizzle ORM schema definition.
Exports:
```typescript
import { sqliteTable, text, integer, real, blob } from 'drizzle-orm/sqlite-core';

// ─── Events Table ──────────────────────────────────────────
export const events = sqliteTable('events', {
  id: text('id').primaryKey(),
  traceId: text('trace_id').notNull(),
  timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(),
  source: text('source').notNull(),
  type: text('type').notNull(),
  phase: text('phase'),
  payload: text('payload', { mode: 'json' }),
  tokensUsed: integer('tokens_used'),
  costUsd: real('cost_usd'),
  durationMs: integer('duration_ms'),
});

// ─── Memories Table ────────────────────────────────────────
export const memories = sqliteTable('memories', {
  id: text('id').primaryKey(),
  type: text('type').notNull(), // episodic | semantic | procedural
  content: text('content').notNull(),
  context: text('context').notNull(),
  embedding: blob('embedding'),
  confidence: real('confidence').notNull(),
  source: text('source'),
  tags: text('tags', { mode: 'json' }),
  createdAt: integer('created_at', { mode: 'timestamp_ms' }).notNull(),
  lastAccessed: integer('last_accessed', { mode: 'timestamp_ms' }).notNull(),
  accessCount: integer('access_count').notNull().default(0),
});

// ─── Patterns Table ────────────────────────────────────────
export const patterns = sqliteTable('patterns', {
  id: text('id').primaryKey(),
  type: text('type').notNull(), // success | failure | approach
  trigger: text('trigger').notNull(),
  pattern: text('pattern').notNull(),
  resolution: text('resolution'),
  frequency: integer('frequency').notNull().default(1),
  successRate: real('success_rate'),
  confidence: real('confidence').notNull(),
  lastSeen: integer('last_seen', { mode: 'timestamp_ms' }).notNull(),
});

// ─── Checkpoints Table ─────────────────────────────────────
export const checkpoints = sqliteTable('checkpoints', {
  id: text('id').primaryKey(),
  traceId: text('trace_id').notNull(),
  phase: text('phase').notNull(),
  state: text('state', { mode: 'json' }).notNull(),
  timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(),
});

// ─── Runs Table ────────────────────────────────────────────
export const runs = sqliteTable('runs', {
  id: text('id').primaryKey(), // = traceId
  task: text('task').notNull(),
  status: text('status').notNull(), // pending | running | completed | failed | cancelled
  currentPhase: text('current_phase'),
  config: text('config', { mode: 'json' }),
  startedAt: integer('started_at', { mode: 'timestamp_ms' }).notNull(),
  completedAt: integer('completed_at', { mode: 'timestamp_ms' }),
  totalCostUsd: real('total_cost_usd').default(0),
  totalTokens: integer('total_tokens').default(0),
  error: text('error'),
});

// ─── Findings Table ────────────────────────────────────────
export const findings = sqliteTable('findings', {
  id: text('id').primaryKey(),
  runId: text('run_id').notNull(),
  phase: text('phase').notNull(),
  severity: text('severity').notNull(),
  category: text('category').notNull(),
  message: text('message').notNull(),
  file: text('file'),
  line: integer('line'),
  confidence: real('confidence'),
  fixable: integer('fixable', { mode: 'boolean' }),
  fix: text('fix'),
  dismissed: integer('dismissed', { mode: 'boolean' }).default(false),
  dismissedBy: text('dismissed_by'),
});

// ─── Type Exports ──────────────────────────────────────────
export type Event = typeof events.$inferSelect;
export type NewEvent = typeof events.$inferInsert;
export type Memory = typeof memories.$inferSelect;
export type NewMemory = typeof memories.$inferInsert;
export type Pattern = typeof patterns.$inferSelect;
export type NewPattern = typeof patterns.$inferInsert;
export type Checkpoint = typeof checkpoints.$inferSelect;
export type NewCheckpoint = typeof checkpoints.$inferInsert;
export type Run = typeof runs.$inferSelect;
export type NewRun = typeof runs.$inferInsert;
export type Finding = typeof findings.$inferSelect;
export type NewFinding = typeof findings.$inferInsert;
```
Dependencies:
- drizzle-orm/sqlite-core
File: memory/store.ts
Purpose: Base memory CRUD + similarity search.
Exports:
```typescript
import { Database } from 'bun:sqlite';
import { and, desc, eq, gte, lt, type SQL } from 'drizzle-orm';
import { drizzle, type BunSQLiteDatabase } from 'drizzle-orm/bun-sqlite';
import { ulid } from 'ulid';
import type { MemoryType } from '../core/types';
import { memories, type Memory, type NewMemory } from './schema';

export interface MemoryQuery {
  context?: string;
  type?: MemoryType;
  tags?: string[];
  minConfidence?: number;
  limit?: number;
}

// The store fills in id, timestamps and access count itself, so callers
// (episodes.ts, procedures.ts) pass everything except those fields.
export type MemoryInput = Omit<NewMemory, 'id' | 'createdAt' | 'lastAccessed' | 'accessCount'>;

export interface MemoryStore {
  store(memory: MemoryInput): Promise<Memory>;
  recall(query: MemoryQuery): Promise<Memory[]>;
  update(id: string, updates: Partial<Memory>): Promise<void>;
  delete(id: string): Promise<void>;
  consolidate(): Promise<void>;
}

export class SQLiteMemoryStore implements MemoryStore {
  private db: BunSQLiteDatabase;

  constructor(dbPath: string) {
    const sqlite = new Database(dbPath);
    this.db = drizzle(sqlite);
  }

  async store(memory: MemoryInput): Promise<Memory> {
    const [inserted] = await this.db
      .insert(memories)
      .values({
        ...memory,
        id: ulid(),
        createdAt: new Date(),
        lastAccessed: new Date(),
        accessCount: 0,
      })
      .returning();
    return inserted;
  }

  async recall(query: MemoryQuery): Promise<Memory[]> {
    const conditions: SQL[] = [];

    if (query.type) {
      conditions.push(eq(memories.type, query.type));
    }
    if (query.minConfidence !== undefined) {
      conditions.push(gte(memories.confidence, query.minConfidence));
    }

    const results = await this.db
      .select()
      .from(memories)
      .where(conditions.length > 0 ? and(...conditions) : undefined)
      .orderBy(desc(memories.confidence))
      .limit(query.limit ?? 10);

    // Update access tracking
    for (const result of results) {
      await this.db
        .update(memories)
        .set({
          lastAccessed: new Date(),
          accessCount: result.accessCount + 1,
        })
        .where(eq(memories.id, result.id));
    }

    // TODO: If query.context provided, do similarity search on embeddings
    // For MVP, just return by confidence
    return results;
  }

  async update(id: string, updates: Partial<Memory>): Promise<void> {
    await this.db
      .update(memories)
      .set(updates)
      .where(eq(memories.id, id));
  }

  async delete(id: string): Promise<void> {
    await this.db
      .delete(memories)
      .where(eq(memories.id, id));
  }

  async consolidate(): Promise<void> {
    // Consolidation logic: merge similar memories, prune low-confidence
    // MVP: just delete memories with confidence < 0.2
    await this.db
      .delete(memories)
      .where(lt(memories.confidence, 0.2));
  }
}
```
Dependencies:
- ./schema
- ../core/types
- drizzle-orm
- bun:sqlite
- ulid
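`recall` leaves similarity search as a TODO. A brute-force cosine ranking over already-decoded embeddings is one simple way to fill that gap at MVP scale. In the sketch below, `cosineSimilarity` and `rankBySimilarity` are hypothetical names, and how the `embedding` blob is decoded into a `Float32Array` is assumed to be handled elsewhere.

```typescript
// Sketch of the similarity step left as a TODO in recall(); brute-force cosine ranking.
interface EmbeddedMemory {
  id: string;
  embedding: Float32Array;
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors match nothing
}

function rankBySimilarity(
  query: Float32Array,
  candidates: EmbeddedMemory[],
  limit: number
): EmbeddedMemory[] {
  return candidates
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, limit)
    .map((entry) => entry.memory);
}

// Toy 2-D vectors for illustration; real embeddings have hundreds of dimensions.
const storedMemories: EmbeddedMemory[] = [
  { id: 'a', embedding: new Float32Array([1, 0]) },
  { id: 'b', embedding: new Float32Array([0, 1]) },
  { id: 'c', embedding: new Float32Array([1, 1]) },
];

const topMatches = rankBySimilarity(new Float32Array([1, 0.1]), storedMemories, 2);
console.log(topMatches.map((m) => m.id)); // 'a' ranks first, then 'c'
```

A linear scan like this reads every candidate on each query, so it only suits small stores; a larger one would likely need an index or a vector extension.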
File: memory/episodes.ts
Purpose: Episodic memory manager.
Exports:
```typescript
import type { MemoryStore } from './store';
import type { ForgeEvent } from '../core/types';

export class EpisodicMemoryManager {
  constructor(private store: MemoryStore) {}

  async storeEpisode(event: ForgeEvent): Promise<void> {
    // Store significant events as episodic memories
    if (this.isSignificant(event)) {
      await this.store.store({
        type: 'episodic',
        content: `Event: ${event.type}. ${JSON.stringify(event.payload)}`,
        context: event.source,
        confidence: 0.7,
        source: event.id,
        tags: [event.type.split('.')[0]], // e.g., "review" from "review.finding"
      });
    }
  }

  private isSignificant(event: ForgeEvent): boolean {
    // Determine if event is worth storing
    const significantTypes = [
      'phase.completed',
      'phase.failed',
      'gate.approved',
      'gate.rejected',
      'breaker.tripped',
      'finding.detected',
      'test.failed',
    ];
    return significantTypes.some(type => event.type.startsWith(type));
  }

  async recallSimilarEpisodes(context: string, limit = 5): Promise<any[]> {
    return this.store.recall({
      type: 'episodic',
      context,
      limit,
    });
  }
}
```
Dependencies:
./store../core/types
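To make the significance filter concrete, the sketch below reproduces the prefix match and the tag derivation as standalone functions. `SketchEvent`, `isSignificant`, and `tagFor` are illustrative stand-ins for the private logic in `EpisodicMemoryManager`; the real `ForgeEvent` type lives in `core/types`.

```typescript
// Hypothetical minimal event shape; only the `type` field matters here.
interface SketchEvent {
  type: string;
}

const SIGNIFICANT_PREFIXES = [
  'phase.completed',
  'phase.failed',
  'gate.approved',
  'gate.rejected',
  'breaker.tripped',
  'finding.detected',
  'test.failed',
];

// An event is significant when its type starts with any listed prefix.
function isSignificant(event: SketchEvent): boolean {
  return SIGNIFICANT_PREFIXES.some(prefix => event.type.startsWith(prefix));
}

// The tag is the first dot-separated segment of the event type.
function tagFor(event: SketchEvent): string {
  return event.type.split('.')[0];
}

const stored = isSignificant({ type: 'phase.completed' });
const ignored = isSignificant({ type: 'phase.entered' });
const agentBreaker = isSignificant({ type: 'planner.breaker_tripped' });
const tag = tagFor({ type: 'review.finding' });
```

Because the match is on prefixes, an agent-scoped event named `${this.type}.breaker_tripped` (as emitted in `agents/base.ts`) would not match `breaker.tripped` and would be skipped unless the list or the event name is aligned.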
File: memory/patterns.ts
Purpose: Semantic memory (pattern extraction).
Exports:
```typescript
import { eq, gte } from 'drizzle-orm';
import type { BunSQLiteDatabase } from 'drizzle-orm/bun-sqlite';
import { ulid } from 'ulid';
import { patterns, type Pattern } from './schema';
import type { MemoryStore } from './store';

export class PatternManager {
  constructor(
    private memoryStore: MemoryStore,
    private db: BunSQLiteDatabase // Drizzle DB instance
  ) {}

  async extractPattern(
    type: 'success' | 'failure' | 'approach',
    trigger: string,
    patternDescription: string,
    resolution?: string
  ): Promise<Pattern> {
    const [inserted] = await this.db
      .insert(patterns)
      .values({
        id: ulid(),
        type,
        trigger,
        pattern: patternDescription,
        resolution,
        frequency: 1,
        confidence: 0.5, // Low initial confidence
        lastSeen: new Date(),
      })
      .returning();

    return inserted;
  }

  async incrementPattern(patternId: string, success: boolean): Promise<void> {
    const pattern = await this.db
      .select()
      .from(patterns)
      .where(eq(patterns.id, patternId))
      .get();

    if (!pattern) return;

    // Running average: treat the previous rate as `frequency` observations
    const newFrequency = pattern.frequency + 1;
    const previousSuccesses = (pattern.successRate ?? 0.5) * pattern.frequency;
    const successRate = (previousSuccesses + (success ? 1 : 0)) / newFrequency;

    await this.db
      .update(patterns)
      .set({
        frequency: newFrequency,
        successRate,
        confidence: Math.min(0.95, pattern.confidence + 0.05), // Gradual confidence increase
        lastSeen: new Date(),
      })
      .where(eq(patterns.id, patternId));
  }

  async findMatchingPattern(trigger: string): Promise<Pattern | null> {
    // MVP: simple substring match
    // Future: embedding similarity
    const allPatterns = await this.db
      .select()
      .from(patterns)
      .where(gte(patterns.confidence, 0.6));

    return allPatterns.find(p =>
      trigger.toLowerCase().includes(p.trigger.toLowerCase())
    ) ?? null;
  }
}
```
Dependencies:
- ./store
- ./schema
- drizzle-orm
- ulid
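The success-rate update in `incrementPattern` is a running average. A worked example, using a hypothetical standalone helper `nextSuccessRate` that mirrors the arithmetic above, shows how the 0.5 prior is weighted by the existing frequency:

```typescript
// Mirrors the incrementPattern arithmetic: the previous rate counts as
// `frequency` observations, and a missing rate falls back to a 0.5 prior.
function nextSuccessRate(
  previousRate: number | null,
  frequency: number,
  success: boolean
): number {
  const prior = previousRate ?? 0.5;
  const successes = prior * frequency + (success ? 1 : 0);
  return successes / (frequency + 1);
}

// New pattern (frequency 1, no rate yet) followed by one success:
// (0.5 * 1 + 1) / 2
const afterFirstSuccess = nextSuccessRate(null, 1, true);

// Then one failure at frequency 2: (0.75 * 2 + 0) / 3
const afterFailure = nextSuccessRate(afterFirstSuccess, 2, false);
```

Note that a pattern needs at least two increments (0.5 to 0.6 in steps of 0.05) before `findMatchingPattern`, which filters on confidence of at least 0.6, can return it.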
File: memory/procedures.ts
Purpose: Procedural memory (strategies that work).
Exports:
```typescript
import type { Memory } from './schema';
import type { MemoryStore } from './store';

export interface Procedure {
  id: string;
  name: string;
  context: string;
  steps: string[];
  successRate: number;
  timesUsed: number;
}

export class ProceduralMemoryManager {
  constructor(private store: MemoryStore) {}

  async storeProcedure(
    name: string,
    context: string,
    steps: string[]
  ): Promise<void> {
    await this.store.store({
      type: 'procedural',
      content: `Procedure: ${name}. Steps: ${steps.join(' → ')}`,
      context,
      confidence: 0.6,
      tags: ['procedure', name],
    });
  }

  async recallProcedure(context: string): Promise<Memory[]> {
    return this.store.recall({
      type: 'procedural',
      context,
      limit: 3,
      minConfidence: 0.6,
    });
  }

  async reinforceProcedure(procedureId: string): Promise<void> {
    // Increase confidence when procedure succeeds
    const procedure = await this.getProcedure(procedureId);
    if (!procedure) return; // Unknown id: nothing to reinforce

    await this.store.update(procedureId, {
      confidence: Math.min(0.95, procedure.confidence + 0.1),
    });
  }

  private async getProcedure(id: string): Promise<Memory | null> {
    // MVP: MemoryStore has no get-by-id, so scan procedural memories.
    // Note that recall() also bumps access tracking on every row it returns.
    const results = await this.store.recall({ type: 'procedural', limit: 1000 });
    return results.find(m => m.id === id) ?? null;
  }
}
```
Dependencies:
- ./store
- ./schema
File: memory/consolidate.ts
Purpose: Knowledge consolidation and pruning.
Exports:
```typescript
import type { PatternManager } from './patterns';
import type { MemoryStore } from './store';

export interface ConsolidationReport {
  memoriesPruned: number;
  patternsPromoted: number;
  duplicatesMerged: number;
  timestamp: Date;
}

export class ConsolidationJob {
  constructor(
    private memoryStore: MemoryStore,
    private patternManager: PatternManager
  ) {}

  async run(): Promise<ConsolidationReport> {
    // MVP: counters stay at 0 because consolidate() does not report deletions
    const report: ConsolidationReport = {
      memoriesPruned: 0,
      patternsPromoted: 0,
      duplicatesMerged: 0,
      timestamp: new Date(),
    };

    // 1. Prune low-confidence memories
    await this.memoryStore.consolidate();
    // (consolidate() already deletes confidence < 0.2)

    // 2. Decay unused memories
    await this.decayUnusedMemories();

    // 3. TODO: Merge duplicate patterns
    // 4. TODO: Promote frequent episodes to patterns

    return report;
  }

  private async decayUnusedMemories(): Promise<void> {
    // Decrease confidence for memories not accessed in 30 days
    const thirtyDaysAgo = Date.now() - 30 * 24 * 60 * 60 * 1000;

    // Caveat: SQLiteMemoryStore.recall() refreshes lastAccessed on every row it
    // returns, so this scan resets the staleness clock for all memories. A
    // read path without access tracking is needed for decay to repeat.
    const allMemories = await this.memoryStore.recall({ limit: 10000 });

    for (const memory of allMemories) {
      if (memory.lastAccessed.getTime() < thirtyDaysAgo) {
        await this.memoryStore.update(memory.id, {
          confidence: Math.max(0, memory.confidence - 0.05),
        });
      }
    }
  }
}
```
Dependencies:
- ./store
- ./patterns
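The decay rule can be isolated as a pure function, which makes the 30-day threshold and the floor at 0 easy to check. `decayedConfidence` is a hypothetical helper that restates the loop body above; the fixed dates are only for illustration.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Returns the confidence after one consolidation run: memories untouched for
// more than 30 days lose 0.05, never dropping below 0.
function decayedConfidence(
  confidence: number,
  lastAccessed: Date,
  now: Date
): number {
  const stale = now.getTime() - lastAccessed.getTime() > THIRTY_DAYS_MS;
  return stale ? Math.max(0, confidence - 0.05) : confidence;
}

const now = new Date('2026-02-07T00:00:00Z');
const recentlyUsed = new Date('2026-02-01T00:00:00Z');
const longUnused = new Date('2025-12-01T00:00:00Z');

const untouched = decayedConfidence(0.5, recentlyUsed, now); // stays 0.5
const decayed = decayedConfidence(0.5, longUnused, now);     // drops by 0.05
const floored = decayedConfidence(0.03, longUnused, now);    // clamps to 0
```

Since `consolidate()` prunes below 0.2, a stale memory stored at 0.7 would take several decay runs to become eligible for deletion.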
File: memory/index.ts
```typescript
export type { MemoryStore, MemoryQuery } from './store';
export { SQLiteMemoryStore } from './store';
export { EpisodicMemoryManager } from './episodes';
export { PatternManager } from './patterns';
export { ProceduralMemoryManager } from './procedures';
export { ConsolidationJob } from './consolidate';
export * from './schema';
```
6. src/tools/ Module
File: tools/registry.ts
Purpose: Tool registry and discovery.
Exports:
```typescript
import { ToolExecutionError } from '../core/errors';
import type { Tool, ToolContext } from '../core/types';

export interface ToolRegistry {
  register(tool: Tool): void;
  unregister(name: string): void;
  get(name: string): Tool | undefined;
  list(): Tool[];
  execute<TInput, TOutput>(
    name: string,
    input: TInput,
    ctx: ToolContext
  ): Promise<TOutput>;
}

export class SimpleToolRegistry implements ToolRegistry {
  private tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool ${tool.name} already registered`);
    }
    this.tools.set(tool.name, tool);
  }

  unregister(name: string): void {
    this.tools.delete(name);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  list(): Tool[] {
    return Array.from(this.tools.values());
  }

  async execute<TInput, TOutput>(
    name: string,
    input: TInput,
    ctx: ToolContext
  ): Promise<TOutput> {
    const tool = this.tools.get(name);
    if (!tool) {
      throw new ToolExecutionError(`Tool ${name} not found`, false);
    }

    try {
      // Validate input; pass the parsed value on so schema defaults
      // (e.g. git_log's `limit`) are applied
      const parsedInput = tool.schema.input.parse(input);

      // Execute
      const result = await tool.execute(parsedInput, ctx);

      // Validate output
      tool.schema.output.parse(result);

      return result as TOutput;
    } catch (error) {
      throw new ToolExecutionError(
        `Tool ${name} execution failed: ${(error as Error).message}`,
        true,
        { tool: name, input }
      );
    }
  }
}
```
Dependencies:
- ../core/types
- ../core/errors
File: tools/sandbox.ts
Purpose: Execution sandboxing (placeholder for MVP).
Exports:
```typescript
import type { ToolContext } from '../core/types';

export interface Sandbox {
  execute<T>(fn: () => Promise<T>, ctx: ToolContext): Promise<T>;
}

export class NoopSandbox implements Sandbox {
  async execute<T>(fn: () => Promise<T>, ctx: ToolContext): Promise<T> {
    // MVP: no actual sandboxing, just timeout
    let timer: ReturnType<typeof setTimeout> | undefined;

    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(
        () => reject(new Error('Execution timeout')),
        ctx.timeout ?? 60000
      );
    });

    try {
      return await Promise.race([fn(), timeout]);
    } finally {
      // Clear the timer so a finished call does not keep the process alive
      clearTimeout(timer);
    }
  }
}

// Future: Docker-based sandbox, resource limits, network isolation
```
Dependencies:
- ../core/types
File: tools/llm.ts
Purpose: LLM provider abstraction.
Exports:
```typescript
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export interface LLMProvider {
  chat(request: ChatRequest): Promise<ChatResponse>;
  embed(text: string): Promise<Float32Array>;
}

// Tool description passed to the LLM. `schema` must be a JSON Schema object
// by the time it reaches the provider; Zod schemas need converting first.
export interface ToolSchema {
  name: string;
  description: string;
  schema: unknown;
}

export interface ChatRequest {
  system: string;
  messages: Message[];
  tools?: ToolSchema[];
  temperature?: number;
  maxTokens?: number;
}

export interface Message {
  role: 'user' | 'assistant';
  content: string;
}

export interface ChatResponse {
  content: string;
  toolCalls?: ToolCall[];
  done: boolean;
  result?: unknown;
  usage: { promptTokens: number; completionTokens: number };
  cost: number;
}

export interface ToolCall {
  name: string;
  input: unknown;
}

// ─── Anthropic Provider ────────────────────────────────────

export class AnthropicProvider implements LLMProvider {
  private client: Anthropic;

  constructor(apiKey: string) {
    this.client = new Anthropic({ apiKey });
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    const response = await this.client.messages.create({
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: request.maxTokens ?? 4096,
      temperature: request.temperature,
      system: request.system,
      messages: request.messages.map(m => ({
        role: m.role,
        content: m.content,
      })),
      tools: request.tools?.map(t => ({
        name: t.name,
        description: t.description,
        input_schema: t.schema as Anthropic.Tool.InputSchema,
      })),
    });

    // Parse response: it may mix text and tool_use blocks, so inspect them all
    let text = '';
    const toolCalls: ToolCall[] = [];

    for (const block of response.content) {
      if (block.type === 'text') {
        text += block.text;
      } else if (block.type === 'tool_use') {
        toolCalls.push({ name: block.name, input: block.input });
      }
    }

    const done = response.stop_reason === 'end_turn';
    const cost = this.calculateCost(response.usage);

    return {
      content: text,
      toolCalls: toolCalls.length > 0 ? toolCalls : undefined,
      done,
      // Assumption: BaseAgent only finishes when `result` is set, so the final
      // text is returned as a placeholder until structured parsing exists
      result: done ? text : undefined,
      usage: {
        promptTokens: response.usage.input_tokens,
        completionTokens: response.usage.output_tokens,
      },
      cost,
    };
  }

  async embed(text: string): Promise<Float32Array> {
    // MVP: not implemented
    // Future: use Voyage AI or similar
    throw new Error('Embedding not implemented');
  }

  private calculateCost(usage: { input_tokens: number; output_tokens: number }): number {
    // Claude Sonnet 4.5 pricing (as of 2025), USD per million tokens
    const INPUT_COST_PER_1M = 3.0;
    const OUTPUT_COST_PER_1M = 15.0;

    return (
      (usage.input_tokens / 1_000_000) * INPUT_COST_PER_1M +
      (usage.output_tokens / 1_000_000) * OUTPUT_COST_PER_1M
    );
  }
}

// ─── LLM Tool Wrapper ──────────────────────────────────────

export function createLLMTool(provider: LLMProvider): Tool {
  return {
    name: 'llm_chat',
    description: 'Send a message to the LLM and get a response',
    schema: {
      input: z.object({
        system: z.string(),
        messages: z.array(z.object({
          role: z.enum(['user', 'assistant']),
          content: z.string(),
        })),
      }),
      output: z.object({
        content: z.string(),
        toolCalls: z.array(z.object({
          name: z.string(),
          input: z.unknown(),
        })).optional(),
      }),
    },
    async execute(input, ctx) {
      return provider.chat(input);
    },
  };
}
```
Dependencies:
- ../core/types
- @anthropic-ai/sdk
- zod
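As a worked example of the cost formula, the standalone sketch below repeats `calculateCost` with the same per-million rates quoted in the provider (3.0 for input, 15.0 for output) and applies it to a few token counts. The `Usage` interface is a local stand-in for the SDK's usage object.

```typescript
// Local stand-in for the usage fields read by AnthropicProvider.calculateCost.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// Rates copied from the provider above, in USD per million tokens.
const INPUT_COST_PER_1M = 3.0;
const OUTPUT_COST_PER_1M = 15.0;

function calculateCost(usage: Usage): number {
  return (
    (usage.input_tokens / 1_000_000) * INPUT_COST_PER_1M +
    (usage.output_tokens / 1_000_000) * OUTPUT_COST_PER_1M
  );
}

// 1M input tokens + 0.5M output tokens: 3.0 + 7.5
const largeRun = calculateCost({ input_tokens: 1_000_000, output_tokens: 500_000 });

// A typical single call: 12,000 in, 800 out: 0.036 + 0.012
const singleCall = calculateCost({ input_tokens: 12_000, output_tokens: 800 });
```

Output tokens cost five times as much as input tokens at these rates, so a cost breaker is more sensitive to long generations than to long prompts.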
File: tools/git.ts
Purpose: Git operations.
Exports:
```typescript
import { $ } from 'bun';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export interface GitDiff {
  files: GitFileDiff[];
  insertions: number;
  deletions: number;
}

export interface GitFileDiff {
  path: string;
  status: 'added' | 'modified' | 'deleted';
  diff: string;
}

export interface GitCommit {
  hash: string;
  message: string;
  author: string;
  date: string;
}

export const gitDiffTool: Tool<{ ref?: string }, GitDiff> = {
  name: 'git_diff',
  description: 'Get git diff for current working directory',
  schema: {
    input: z.object({
      ref: z.string().optional(), // e.g., "HEAD", "main"
    }),
    output: z.object({
      files: z.array(z.object({
        path: z.string(),
        status: z.enum(['added', 'modified', 'deleted']),
        diff: z.string(),
      })),
      insertions: z.number(),
      deletions: z.number(),
    }),
  },
  async execute(input, ctx) {
    const ref = input.ref ?? 'HEAD';

    // Get diff
    const diffOutput = await $`git diff ${ref}`.cwd(ctx.workingDir).text();

    // Parse diff
    const files = parseDiff(diffOutput);

    // Count changed lines, skipping the "+++ b/..." and "--- a/..." file headers
    const insertions = files.reduce((sum, f) =>
      sum + (f.diff.match(/^\+(?!\+\+ )/gm)?.length ?? 0), 0
    );
    const deletions = files.reduce((sum, f) =>
      sum + (f.diff.match(/^-(?!-- )/gm)?.length ?? 0), 0
    );

    return { files, insertions, deletions };
  },
};

export const gitLogTool: Tool<{ limit?: number }, GitCommit[]> = {
  name: 'git_log',
  description: 'Get recent git commit history',
  schema: {
    input: z.object({
      limit: z.number().optional().default(10),
    }),
    output: z.array(z.object({
      hash: z.string(),
      message: z.string(),
      author: z.string(),
      date: z.string(),
    })),
  },
  async execute(input, ctx) {
    const limit = input.limit ?? 10;
    const output = await $`git log --pretty=format:"%H|%an|%ad|%s" -n ${limit}`
      .cwd(ctx.workingDir)
      .text();

    return output
      .split('\n')
      .filter(line => line.length > 0)
      .map(line => {
        // The subject may itself contain "|", so rejoin the remainder
        const [hash, author, date, ...subject] = line.split('|');
        return { hash, author, date, message: subject.join('|') };
      });
  },
};

function parseDiff(diffOutput: string): GitFileDiff[] {
  // Simple diff parser
  // MVP: just extract file paths and full diff
  // Future: parse hunks properly
  const files: GitFileDiff[] = [];
  const fileSections = diffOutput.split('diff --git');

  for (const section of fileSections.slice(1)) {
    const pathMatch = section.match(/a\/(.+?) b\/(.+)/);
    if (!pathMatch) continue;

    const path = pathMatch[2];
    const status = section.includes('new file') ? 'added'
      : section.includes('deleted file') ? 'deleted'
      : 'modified';

    files.push({ path, status, diff: section });
  }

  return files;
}
```
Dependencies:
- ../core/types
- zod
- bun (for shell execution)
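Counting insertions and deletions from raw diff text is easy to get wrong because the per-file `+++` and `---` header lines also start with `+` and `-`. A minimal line-by-line sketch, using a hypothetical `countChanges` helper and a hand-written sample diff, shows the headers being skipped:

```typescript
// Counts added and removed lines in unified diff text, ignoring the
// "+++ " / "--- " file header lines.
function countChanges(diff: string): { insertions: number; deletions: number } {
  let insertions = 0;
  let deletions = 0;

  for (const line of diff.split('\n')) {
    if (line.startsWith('+++ ') || line.startsWith('--- ')) continue;
    if (line.startsWith('+')) insertions++;
    else if (line.startsWith('-')) deletions++;
  }

  return { insertions, deletions };
}

// Hand-written sample in unified diff format.
const sampleDiff = [
  'diff --git a/src/a.ts b/src/a.ts',
  'index 1111111..2222222 100644',
  '--- a/src/a.ts',
  '+++ b/src/a.ts',
  '@@ -1,2 +1,3 @@',
  ' const a = 1;',
  '-const b = 2;',
  '+const b = 3;',
  '+const c = 4;',
].join('\n');

const counts = countChanges(sampleDiff);
```

This heuristic still miscounts a removed line whose own content begins with `-- ` (for example a deleted SQL comment), which is one reason the parser notes proper hunk parsing as future work.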
File: tools/github.ts
Purpose: GitHub API integration.
Exports:
```typescript
import { Octokit } from '@octokit/rest';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export class GitHubClient {
  private octokit: Octokit;

  constructor(token: string) {
    this.octokit = new Octokit({ auth: token });
  }

  async createPR(params: {
    owner: string;
    repo: string;
    title: string;
    body: string;
    head: string;
    base: string;
  }) {
    return this.octokit.pulls.create(params);
  }

  async postReviewComment(params: {
    owner: string;
    repo: string;
    pull_number: number;
    body: string;
    path?: string;
    line?: number;
  }) {
    const { owner, repo, pull_number, body, path, line } = params;

    if (path && line) {
      // Inline comments must reference a commit; use the PR's current head
      const { data: pr } = await this.octokit.pulls.get({ owner, repo, pull_number });

      return this.octokit.pulls.createReviewComment({
        owner,
        repo,
        pull_number,
        body,
        path,
        line,
        commit_id: pr.head.sha,
        side: 'RIGHT',
      });
    }

    return this.octokit.pulls.createReview({
      owner,
      repo,
      pull_number,
      body,
      event: 'COMMENT',
    });
  }

  async updateCheckStatus(params: {
    owner: string;
    repo: string;
    sha: string;
    name: string;
    status: 'queued' | 'in_progress' | 'completed';
    conclusion?: 'success' | 'failure' | 'neutral' | 'cancelled';
    summary: string;
  }) {
    return this.octokit.checks.create({
      owner: params.owner,
      repo: params.repo,
      head_sha: params.sha,
      name: params.name,
      status: params.status,
      conclusion: params.conclusion,
      output: {
        title: params.name,
        summary: params.summary,
      },
    });
  }
}

export function createGitHubTools(client: GitHubClient): Tool[] {
  return [
    {
      name: 'github_post_comment',
      description: 'Post a comment on a GitHub PR',
      schema: {
        input: z.object({
          owner: z.string(),
          repo: z.string(),
          pull_number: z.number(),
          body: z.string(),
        }),
        output: z.object({ success: z.boolean() }),
      },
      async execute(input) {
        await client.postReviewComment(input);
        return { success: true };
      },
    },
    // ... more tools
  ];
}
```
Dependencies:
- ../core/types
- @octokit/rest
- zod
File: tools/runner.ts
Purpose: Shell command execution.
Exports:
```typescript
import { $ } from 'bun';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export const shellTool: Tool<
  { command: string; args?: string[] },
  { stdout: string; stderr: string; exitCode: number }
> = {
  name: 'shell',
  description: 'Execute a shell command',
  schema: {
    input: z.object({
      command: z.string(),
      args: z.array(z.string()).optional(),
    }),
    output: z.object({
      stdout: z.string(),
      stderr: z.string(),
      exitCode: z.number(),
    }),
  },
  async execute(input, ctx) {
    const args = input.args ?? [];

    // Interpolated values are escaped by Bun's shell, so `command` is a single
    // executable name and each entry of `args` becomes one argument.
    // quiet() captures output instead of echoing it; nothrow() keeps non-zero
    // exit codes from throwing.
    const result = await $`${input.command} ${args}`
      .cwd(ctx.workingDir)
      .quiet()
      .nothrow();

    return {
      stdout: result.stdout.toString(),
      stderr: result.stderr.toString(),
      exitCode: result.exitCode,
    };
  },
};
```
Dependencies:
- ../core/types
- zod
- bun
File: tools/linter.ts
Purpose: ESLint/Biome integration.
Exports:
```typescript
import { $ } from 'bun';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export interface LintIssue {
  file: string;
  line: number;
  column: number;
  severity: 'warning' | 'error';
  message: string;
  rule: string;
  fixable: boolean;
}

export const lintTool: Tool<{ files?: string[] }, { issues: LintIssue[] }> = {
  name: 'lint',
  description: 'Run linter on files',
  schema: {
    input: z.object({
      files: z.array(z.string()).optional(),
    }),
    output: z.object({
      issues: z.array(z.object({
        file: z.string(),
        line: z.number(),
        column: z.number(),
        severity: z.enum(['warning', 'error']),
        message: z.string(),
        rule: z.string(),
        fixable: z.boolean(),
      })),
    }),
  },
  async execute(input, ctx) {
    const files = input.files ?? ['.'];

    // Try eslint first
    try {
      const result = await $`eslint --format json ${files}`
        .cwd(ctx.workingDir)
        .quiet()
        .nothrow();
      const eslintOutput = JSON.parse(result.stdout.toString());

      return { issues: parseEslintOutput(eslintOutput) };
    } catch {
      // Fallback to biome or return empty
      return { issues: [] };
    }
  },
};

function parseEslintOutput(output: any[]): LintIssue[] {
  const issues: LintIssue[] = [];

  for (const file of output) {
    for (const message of file.messages) {
      issues.push({
        file: file.filePath,
        line: message.line,
        column: message.column,
        severity: message.severity === 2 ? 'error' : 'warning',
        message: message.message,
        rule: message.ruleId ?? 'unknown', // ruleId is null for parse errors
        fixable: message.fix !== undefined,
      });
    }
  }

  return issues;
}
```
Dependencies:
- ../core/types
- zod
- bun
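To illustrate the mapping from ESLint's JSON formatter output to `LintIssue`, the sketch below runs a typed copy of the parser over a small hand-written sample. The `EslintMessage` and `EslintFileResult` interfaces are local assumptions covering only the fields the parser reads; the sample data is invented for illustration.

```typescript
interface LintIssue {
  file: string;
  line: number;
  column: number;
  severity: 'warning' | 'error';
  message: string;
  rule: string;
  fixable: boolean;
}

// Only the fields of ESLint's JSON output that the parser reads.
interface EslintMessage {
  ruleId: string | null;
  severity: 1 | 2; // 1 = warning, 2 = error
  message: string;
  line: number;
  column: number;
  fix?: unknown;
}

interface EslintFileResult {
  filePath: string;
  messages: EslintMessage[];
}

function parseEslintOutput(output: EslintFileResult[]): LintIssue[] {
  const issues: LintIssue[] = [];
  for (const file of output) {
    for (const message of file.messages) {
      issues.push({
        file: file.filePath,
        line: message.line,
        column: message.column,
        severity: message.severity === 2 ? 'error' : 'warning',
        message: message.message,
        rule: message.ruleId ?? 'unknown',
        fixable: message.fix !== undefined,
      });
    }
  }
  return issues;
}

// Invented sample: one error without a fix, one auto-fixable warning.
const sample: EslintFileResult[] = [
  {
    filePath: '/repo/src/a.ts',
    messages: [
      { ruleId: 'no-undef', severity: 2, message: "'x' is not defined.", line: 3, column: 5 },
      { ruleId: 'semi', severity: 1, message: 'Missing semicolon.', line: 7, column: 20, fix: { range: [90, 90], text: ';' } },
    ],
  },
];

const issues = parseEslintOutput(sample);
```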
File: tools/test-runner.ts
Purpose: Jest/Vitest execution and parsing.
Exports:
```typescript
import { $ } from 'bun';
import { z } from 'zod';
import type { Tool, ToolContext } from '../core/types';

export interface TestResult {
  name: string;
  status: 'passed' | 'failed' | 'skipped';
  duration: number;
  error?: string;
}

export interface TestSummary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  duration: number;
  results: TestResult[];
}

export const testTool: Tool<{ files?: string[] }, TestSummary> = {
  name: 'run_tests',
  description: 'Run test suite',
  schema: {
    input: z.object({
      files: z.array(z.string()).optional(),
    }),
    output: z.object({
      total: z.number(),
      passed: z.number(),
      failed: z.number(),
      skipped: z.number(),
      duration: z.number(),
      results: z.array(z.object({
        name: z.string(),
        status: z.enum(['passed', 'failed', 'skipped']),
        duration: z.number(),
        error: z.string().optional(),
      })),
    }),
  },
  async execute(input, ctx) {
    const files = input.files ?? [];

    // Try bun test first (since we're using Bun).
    // Assumes the installed Bun version provides a JSON reporter; if it does
    // not, the output will not parse and the empty summary below is returned.
    const result = await $`bun test --reporter json ${files}`
      .cwd(ctx.workingDir)
      .quiet()
      .nothrow();
    const output = result.stdout.toString();

    // Parse JSON output
    try {
      const parsed = JSON.parse(output);
      return parseTestOutput(parsed);
    } catch {
      // Fallback: parse text output (MVP: return an empty summary)
      return {
        total: 0,
        passed: 0,
        failed: 0,
        skipped: 0,
        duration: 0,
        results: [],
      };
    }
  },
};

function parseTestOutput(output: any): TestSummary {
  // MVP: simple parsing
  // Future: handle Jest, Vitest, Bun test formats
  const results: TestResult[] = [];
  let passed = 0;
  let failed = 0;
  let skipped = 0;

  // Parse based on structure
  // (implementation depends on test runner output format)

  return {
    total: results.length,
    passed,
    failed,
    skipped,
    duration: 0,
    results,
  };
}
```
Dependencies:
- ../core/types
- zod
- bun
File: tools/index.ts
```typescript
export type { ToolRegistry } from './registry';
export { SimpleToolRegistry } from './registry';
export type { LLMProvider, ChatRequest, ChatResponse } from './llm';
export { AnthropicProvider, createLLMTool } from './llm';
export { gitDiffTool, gitLogTool } from './git';
export { GitHubClient, createGitHubTools } from './github';
export { shellTool } from './runner';
export { lintTool } from './linter';
export { testTool } from './test-runner';
```
7. src/agents/ Module
File: agents/base.ts
Purpose: BaseAgent with perceive-reason-act-learn loop.
Exports:
```typescript
import { CircuitBreakerError } from '../core/errors';
import type {
  Agent,
  AgentType,
  AgentContext,
  PhaseInput,
  PhaseOutput,
  Tool,
  Memory,
} from '../core/types';

export interface WorkingMemory {
  messages: { role: 'user' | 'assistant'; content: string }[];
  relevantMemories: Memory[];
}

export abstract class BaseAgent implements Agent {
  abstract id: string;
  abstract type: AgentType;
  abstract tools: Tool[];
  abstract systemPrompt: string;

  async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
    const startTime = Date.now();
    let iteration = 0;
    let totalCost = 0;

    const workingMemory = await this.perceive(input, ctx);

    while (true) {
      iteration++;

      // ── Safety check ──
      const breakerResult = ctx.safety.check({
        iteration,
        cost: totalCost,
        elapsed: Date.now() - startTime,
        errorCount: 0, // TODO: track errors
        phase: ctx.config.name,
      });

      if (breakerResult.shouldBreak) {
        await ctx.bus.emit({
          type: `${this.type}.breaker_tripped`,
          source: this.id,
          traceId: ctx.traceId,
          payload: breakerResult,
        });
        throw new CircuitBreakerError(breakerResult.reason ?? 'Circuit breaker tripped');
      }

      // ── Reason: ask LLM what to do ──
      const decision = await ctx.llm.chat({
        system: this.systemPrompt,
        messages: workingMemory.messages,
        tools: this.tools.map(t => ({
          name: t.name,
          description: t.description,
          schema: t.schema.input,
        })),
      });

      totalCost += decision.cost;
      ctx.cost.addCost(decision.cost, 'llm');

      // ── Done? ──
      if (decision.done && decision.result) {
        await ctx.bus.emit({
          type: `${this.type}.completed`,
          source: this.id,
          traceId: ctx.traceId,
          payload: decision.result,
        });

        await this.reflect(ctx, 'success');

        return {
          success: true,
          result: decision.result,
          cost: totalCost,
          duration: Date.now() - startTime,
          iterations: iteration,
        };
      }

      // ── Act: execute tool (MVP: only the first requested call) ──
      if (decision.toolCalls && decision.toolCalls.length > 0) {
        const toolCall = decision.toolCalls[0];
        const result = await this.executeTool(toolCall.name, toolCall.input, ctx);

        // ── Learn: update working memory ──
        // Record the call as the assistant turn and feed the result back as a
        // user turn, so the next request does not end on an assistant message
        workingMemory.messages.push(
          {
            role: 'assistant',
            content: decision.content || `Calling tool: ${toolCall.name}`,
          },
          {
            role: 'user',
            content: `Tool: ${toolCall.name}, Result: ${JSON.stringify(result)}`,
          }
        );
      }
      // If the LLM returns neither a result nor a tool call, the loop repeats
      // until a circuit breaker trips.
    }
  }

  private async perceive(input: PhaseInput, ctx: AgentContext): Promise<WorkingMemory> {
    // No `type` filter: agent types (e.g. 'planner') are not memory types, and
    // reflect() below stores its learnings as 'procedural'
    const relevantMemories = await ctx.memory.recall({
      context: input.task,
      limit: 10,
      minConfidence: 0.6,
    });

    return {
      messages: [
        {
          role: 'user',
          content: this.buildPrompt(input, relevantMemories),
        },
      ],
      relevantMemories,
    };
  }

  protected abstract buildPrompt(input: PhaseInput, memories: Memory[]): string;

  private async executeTool(name: string, input: unknown, ctx: AgentContext): Promise<unknown> {
    try {
      return await ctx.tools.execute(name, input, {
        traceId: ctx.traceId,
        agentId: this.id,
        workingDir: process.cwd(),
      });
    } catch (error) {
      await this.reflect(ctx, 'error', error as Error);
      throw error;
    }
  }

  private async reflect(ctx: AgentContext, outcome: string, error?: Error): Promise<void> {
    const reflectionPrompt = `Outcome: ${outcome}. ${error ? `Error: ${error.message}` : ''} What should we remember for next time?`;

    const reflection = await ctx.llm.chat({
      system: 'You are a learning system. Extract key learnings from this execution.',
      messages: [{ role: 'user', content: reflectionPrompt }],
    });

    // Store learnings
    if (reflection.content) {
      await ctx.memory.store({
        type: 'procedural',
        content: reflection.content,
        context: `${this.type} agent reflection`,
        confidence: 0.5,
      });
    }
  }
}
```
Dependencies:
- ../core/types
- ../core/errors
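The agent loop relies on `ctx.safety.check(...)` returning `{ shouldBreak, reason }`. The sketch below shows one way such a check could combine iteration, cost, time, and error limits. It is an assumption for illustration only: `BreakerLimits`, `checkBreakers`, and the limit values are hypothetical, and the actual implementation belongs to `safety/breakers.ts`.

```typescript
// Input shape matches the object BaseAgent passes to ctx.safety.check().
interface BreakerInput {
  iteration: number;
  cost: number;
  elapsed: number; // milliseconds
  errorCount: number;
  phase: string;
}

interface BreakerResult {
  shouldBreak: boolean;
  reason?: string;
}

// Hypothetical limit configuration.
interface BreakerLimits {
  maxIterations: number;
  maxCostUsd: number;
  maxElapsedMs: number;
  maxErrors: number;
}

// Returns the first limit that is exceeded, or shouldBreak: false.
function checkBreakers(input: BreakerInput, limits: BreakerLimits): BreakerResult {
  if (input.iteration > limits.maxIterations) {
    return { shouldBreak: true, reason: `iteration limit exceeded in ${input.phase}` };
  }
  if (input.cost > limits.maxCostUsd) {
    return { shouldBreak: true, reason: `cost limit exceeded in ${input.phase}` };
  }
  if (input.elapsed > limits.maxElapsedMs) {
    return { shouldBreak: true, reason: `time limit exceeded in ${input.phase}` };
  }
  if (input.errorCount > limits.maxErrors) {
    return { shouldBreak: true, reason: `error limit exceeded in ${input.phase}` };
  }
  return { shouldBreak: false };
}

const limits: BreakerLimits = {
  maxIterations: 10,
  maxCostUsd: 2,
  maxElapsedMs: 60_000,
  maxErrors: 3,
};

const healthy = checkBreakers(
  { iteration: 1, cost: 0.05, elapsed: 1_200, errorCount: 0, phase: 'planning' },
  limits
);
const runaway = checkBreakers(
  { iteration: 11, cost: 0.4, elapsed: 9_000, errorCount: 0, phase: 'planning' },
  limits
);
```

Because the loop in `execute()` has no other exit when the LLM returns neither a result nor a tool call, the iteration limit is what bounds that case.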
File: agents/planner.ts
Purpose: PlannerAgent implementation.
Exports:
```typescript
import type { AgentType, Tool, Memory, PhaseInput } from '../core/types';
import { gitLogTool } from '../tools/git';
import { BaseAgent } from './base';

export class PlannerAgent extends BaseAgent {
  id = 'planner';
  type: AgentType = 'planner';
  tools: Tool[] = [gitLogTool];

  systemPrompt = `You are a software architecture planner.

Your job: Given a natural language task, create an implementation plan.

Output a JSON object with:
- architecture: { components, interfaces, decisions }
- tasks: ordered list of tasks with dependencies
- risk: { level: 'low'|'medium'|'high'|'critical', reasons: [] }
- estimates: { complexity: number, effort: string }

Consider:
- Existing codebase patterns (from git history)
- Past learnings (from memory)
- Dependencies and critical paths
- Risk factors (breaking changes, new tech, complexity)
`;

  protected buildPrompt(input: PhaseInput, memories: Memory[]): string {
    let prompt = `Task: ${input.task}\n\n`;

    if (memories.length > 0) {
      prompt += `Relevant learnings from past:\n`;
      for (const mem of memories) {
        prompt += `- ${mem.content}\n`;
      }
      prompt += `\n`;
    }

    prompt += `Create an implementation plan.`;

    return prompt;
  }
}
```
Dependencies:
- ./base
- ../core/types
- ../tools/git
File: agents/implementer.ts
Purpose: ImplementerAgent implementation.
Exports:
```typescript
import type { AgentType, Tool, Memory, PhaseInput } from '../core/types';
import { gitDiffTool } from '../tools/git';
import { shellTool } from '../tools/runner';
import { BaseAgent } from './base';

export class ImplementerAgent extends BaseAgent {
  id = 'implementer';
  type: AgentType = 'implementer';
  tools: Tool[] = [gitDiffTool, shellTool];

  systemPrompt = `You are a code implementation agent.

Your job: Given an implementation plan, write code to accomplish the tasks.

You have tools to:
- Read files (use shell tool with cat/grep)
- Write files (use shell tool)
- Run typecheck/lint
- Run tests

For each task:
1. Read existing code to understand patterns
2. Implement the change
3. Self-validate (typecheck, tests)
4. Fix any issues found
5. When all tasks done, output JSON: { success: true, files: [...] }
`;

  protected buildPrompt(input: PhaseInput, memories: Memory[]): string {
    return `Implementation Plan:\n${JSON.stringify(input.previousPhaseOutput, null, 2)}\n\nBegin implementation.`;
  }
}
```
Dependencies:
- ./base
- ../core/types
- ../tools/*
File: agents/reviewer.ts
Purpose: ReviewerAgent with 3-layer review.
Exports:
```typescript
import type { AgentType, Tool, Memory, PhaseInput } from '../core/types';
import { gitDiffTool } from '../tools/git';
import { lintTool } from '../tools/linter';
import { BaseAgent } from './base';

export class ReviewerAgent extends BaseAgent {
  id = 'reviewer';
  type: AgentType = 'reviewer';
  tools: Tool[] = [gitDiffTool, lintTool];

  systemPrompt = `You are a code review agent.

Your job: Review code changes for quality, correctness, and security.

Process:
1. Run static analysis (lint tool)
2. Check for security issues (patterns)
3. AI review for logic, edge cases, architecture fit

Output JSON:
{
  findings: [{ type, severity, message, file, line, fixable }],
  riskScore: { total: number, level: 'low'|'medium'|'high'|'critical' },
  decision: 'approve' | 'request_changes' | 'require_human'
}
`;

  protected buildPrompt(input: PhaseInput, memories: Memory[]): string {
    return `Review this code:\n${JSON.stringify(input.previousPhaseOutput)}\n\nGenerate review.`;
  }
}
```
Dependencies:
- ./base
- ../core/types
- ../tools/*
File: agents/tester.ts
Purpose: TesterAgent implementation.
Exports:
```typescript
import type { AgentType, Tool, Memory, PhaseInput } from '../core/types';
import { testTool } from '../tools/test-runner';
import { BaseAgent } from './base';

export class TesterAgent extends BaseAgent {
  id = 'tester';
  type: AgentType = 'tester';
  tools: Tool[] = [testTool];

  systemPrompt = `You are a testing agent.

Your job: Select and run tests, analyze failures.

Process:
1. Identify tests covering changed files
2. Run selected tests
3. If failures, analyze root cause
4. Suggest fixes if possible

Output JSON:
{
  summary: { total, passed, failed, skipped },
  failures: [{ test, error, rootCause, suggestedFix }],
  coverage: { line, branch }
}
`;

  protected buildPrompt(input: PhaseInput, memories: Memory[]): string {
    return `Code changes: ${JSON.stringify(input.previousPhaseOutput)}\n\nRun tests and analyze.`;
  }
}
```
Dependencies:
- ./base
- ../core/types
- ../tools/test-runner
File: agents/deployer.ts
Purpose: DeployerAgent implementation.
Exports:
```typescript
import type { AgentType, Tool, Memory, PhaseInput } from '../core/types';
import { shellTool } from '../tools/runner';
import { BaseAgent } from './base';

export class DeployerAgent extends BaseAgent {
  id = 'deployer';
  type: AgentType = 'deployer';
  tools: Tool[] = [shellTool];

  systemPrompt = `You are a deployment agent.

Your job: Build and deploy validated code.

Process:
1. Run build command
2. Request human approval (for production)
3. Deploy using configured strategy
4. Verify deployment health

Output JSON:
{
  status: 'deployed' | 'failed' | 'rolled_back',
  url: string,
  metrics: { errorRate, latency }
}
`;

  protected buildPrompt(input: PhaseInput, memories: Memory[]): string {
    return `Deploy this code: ${JSON.stringify(input.previousPhaseOutput)}\n\nBegin deployment.`;
  }
}
```
Dependencies:
- ./base
- ../core/types
- ../tools/runner
File: agents/index.ts
```typescript
export { BaseAgent } from './base';
export { PlannerAgent } from './planner';
export { ImplementerAgent } from './implementer';
export { ReviewerAgent } from './reviewer';
export { TesterAgent } from './tester';
export { DeployerAgent } from './deployer';
```
8. src/orchestrator/ Module
File: orchestrator/context.ts
Purpose: Shared context across agents.
Exports:
```typescript
export interface SharedContext {
  get<T>(key: string): T | undefined;
  set<T>(key: string, value: T): void;
  has(key: string): boolean;
  delete(key: string): void;
  clear(): void;
  snapshot(): Record<string, unknown>;
}

export class InMemoryContext implements SharedContext {
  private data = new Map<string, unknown>();

  get<T>(key: string): T | undefined {
    return this.data.get(key) as T | undefined;
  }

  set<T>(key: string, value: T): void {
    this.data.set(key, value);
  }

  has(key: string): boolean {
    return this.data.has(key);
  }

  delete(key: string): void {
    this.data.delete(key);
  }

  clear(): void {
    this.data.clear();
  }

  snapshot(): Record<string, unknown> {
    // Shallow copy: nested values are shared with the live context
    return Object.fromEntries(this.data.entries());
  }
}
```
Dependencies:
- None
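Since `snapshot()` feeds checkpoints, it is worth seeing what it does and does not copy. The sketch below uses a trimmed, standalone copy of the context (named `SketchContext` to mark it as illustrative) and adds a hypothetical `deepSnapshot()` based on a JSON round-trip, which is only valid when every stored value is JSON-serialisable.

```typescript
// Trimmed illustrative copy of InMemoryContext.
class SketchContext {
  private data = new Map<string, unknown>();

  get<T>(key: string): T | undefined {
    return this.data.get(key) as T | undefined;
  }

  set<T>(key: string, value: T): void {
    this.data.set(key, value);
  }

  // Shallow copy: top-level keys are frozen in time, nested objects are shared.
  snapshot(): Record<string, unknown> {
    const copy: Record<string, unknown> = {};
    this.data.forEach((value, key) => {
      copy[key] = value;
    });
    return copy;
  }

  // Hypothetical deep variant; assumes all values survive a JSON round-trip.
  deepSnapshot(): Record<string, unknown> {
    return JSON.parse(JSON.stringify(this.snapshot()));
  }
}

const ctx = new SketchContext();
ctx.set('plan', { tasks: ['write schema'] });

const shallow = ctx.snapshot();
const deep = ctx.deepSnapshot();

// Later mutations: one new key, one in-place change to a nested value.
ctx.set('review', 'approved');
ctx.get<{ tasks: string[] }>('plan')!.tasks.push('write store');

const shallowTasks = (shallow.plan as { tasks: string[] }).tasks.length;
const deepTasks = (deep.plan as { tasks: string[] }).tasks.length;
```

The shallow snapshot does not see the new `review` key but does see the in-place change to `plan`; the deep copy sees neither. If checkpoint state is serialised when it is written, the persisted copy is effectively deep, but any in-memory holder of a shallow snapshot is not isolated.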
File: orchestrator/checkpoint.ts
Purpose: State persistence and restore.
Exports:
```typescript
import { eq } from 'drizzle-orm';
import type { BunSQLiteDatabase } from 'drizzle-orm/bun-sqlite';
import { ulid } from 'ulid';
import type { Checkpoint as CheckpointType, PhaseName } from '../core/types';
import { checkpoints } from '../memory/schema';

export interface CheckpointManager {
  save(traceId: string, phase: PhaseName, state: Record<string, unknown>): Promise<CheckpointType>;
  restore(checkpointId: string): Promise<CheckpointType>;
  list(traceId: string): Promise<CheckpointType[]>;
}

export class SQLiteCheckpointManager implements CheckpointManager {
  constructor(private db: BunSQLiteDatabase) {}

  async save(traceId: string, phase: PhaseName, state: Record<string, unknown>): Promise<CheckpointType> {
    const [checkpoint] = await this.db
      .insert(checkpoints)
      .values({
        id: ulid(),
        traceId,
        phase,
        state,
        timestamp: new Date(),
      })
      .returning();

    return checkpoint;
  }

  async restore(checkpointId: string): Promise<CheckpointType> {
    const checkpoint = await this.db
      .select()
      .from(checkpoints)
      .where(eq(checkpoints.id, checkpointId))
      .get();

    if (!checkpoint) {
      throw new Error(`Checkpoint ${checkpointId} not found`);
    }

    return checkpoint;
  }

  async list(traceId: string): Promise<CheckpointType[]> {
    // Oldest first
    return this.db
      .select()
      .from(checkpoints)
      .where(eq(checkpoints.traceId, traceId))
      .orderBy(checkpoints.timestamp);
  }
}
```
Dependencies:
- ../core/types
- ../memory/schema
- drizzle-orm
- ulid
File: orchestrator/pipeline.ts
Purpose: Pipeline state machine and phase execution.
Exports:
typescriptimport type { Phase, PhaseName, PhaseInput, PhaseOutput, AgentContext, } from '../core/types'; import type { EventBus } from '../core/bus'; import type { SharedContext } from './context'; import type { CheckpointManager } from './checkpoint'; import { ulid } from 'ulid'; export interface PipelineConfig { phases: Phase[]; maxBounces: number; } export class Pipeline { private traceId: string = ulid(); constructor( private config: PipelineConfig, private bus: EventBus, private context: SharedContext, private checkpointManager: CheckpointManager ) {} async execute(task: string, agentContext: AgentContext): Promise<void> { this.traceId = ulid(); await this.bus.emit({ type: 'run.started', source: 'pipeline', traceId: this.traceId, payload: { task }, }); let currentPhase: PhaseName | null = 'planning'; let phaseInput: PhaseInput = { task, context: {} }; while (currentPhase) { const phase = this.config.phases.find(p => p.name === currentPhase); if (!phase) throw new Error(`Phase ${currentPhase} not found`); await this.bus.emit({ type: 'phase.entered', source: 'pipeline', traceId: this.traceId, payload: { phase: currentPhase }, }); // Execute phase const output = await phase.agent.execute(phaseInput, { ...agentContext, traceId: this.traceId, }); // Save checkpoint await this.checkpointManager.save(this.traceId, currentPhase, { phaseOutput: output, context: this.context.snapshot(), }); // Prepare next phase phaseInput = { task, previousPhaseOutput: output.result, context: this.context.snapshot(), }; currentPhase = phase.next; } await this.bus.emit({ type: 'run.completed', source: 'pipeline', traceId: this.traceId, payload: { success: true }, }); } }
Dependencies:
- ../core/types
- ../core/bus
- ./context
- ./checkpoint
- ulid
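Because `execute` follows each phase's `next` pointer until it reaches `null`, a misconfigured chain (a `next` that names a missing phase, or a cycle) would only surface at run time. A minimal sketch of an up-front check is shown below. `validatePhaseChain` and the reduced `PhaseLink` shape are hypothetical helpers, not part of the specified API, and the check assumes a strictly linear chain.

```typescript
// Hypothetical reduced view of a Phase: only the fields the walk needs.
interface PhaseLink {
  name: string;
  next: string | null;
}

// Walks the chain from `start` and returns the visit order,
// throwing on a missing phase or a cycle.
function validatePhaseChain(phases: PhaseLink[], start: string): string[] {
  const byName = new Map<string, PhaseLink>();
  for (const p of phases) byName.set(p.name, p);

  const visited: string[] = [];
  let current: string | null = start;

  while (current !== null) {
    const phase = byName.get(current);
    if (!phase) throw new Error(`Phase ${current} not found`);
    if (visited.includes(current)) throw new Error(`Cycle detected at phase ${current}`);
    visited.push(current);
    current = phase.next;
  }
  return visited;
}

const order = validatePhaseChain(
  [
    { name: 'planning', next: 'implementation' },
    { name: 'implementation', next: 'review' },
    { name: 'review', next: null },
  ],
  'planning'
);
console.log(order.join(' -> ')); // planning -> implementation -> review
```

Running a check like this in the `Pipeline` constructor would turn a configuration mistake into an immediate error instead of a failure partway through a run.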
File: orchestrator/index.ts
```typescript
export type { SharedContext } from './context';
export { InMemoryContext } from './context';
export type { CheckpointManager } from './checkpoint';
export { SQLiteCheckpointManager } from './checkpoint';
export type { PipelineConfig } from './pipeline';
export { Pipeline } from './pipeline';
```
9. src/cli/ Module
File: cli/index.ts
Purpose: CLI entry point.
Exports:
```typescript
#!/usr/bin/env bun
import { Command } from 'commander';
import { runCommand } from './commands/run';
import { reviewCommand } from './commands/review';
import { statusCommand } from './commands/status';

const program = new Command();

program
  .name('forge')
  .description('Agentic SDLC Orchestrator')
  .version('0.1.0');

program
  .command('run')
  .description('Run a task through the full SDLC pipeline')
  .argument('<task>', 'Task description')
  .action(runCommand);

program
  .command('review')
  .description('Review a PR or code changes')
  .argument('<pr>', 'PR number or path')
  .action(reviewCommand);

program
  .command('status')
  .description('Show current run status')
  .action(statusCommand);

// Action handlers are async, so parseAsync() is used to await them
await program.parseAsync();
```
Dependencies:
- commander
- ./commands/*
File: cli/ui.ts
Purpose: Terminal output formatting.
Exports:
```typescript
import chalk from 'chalk';

export function log(message: string): void {
  console.log(message);
}

export function success(message: string): void {
  console.log(chalk.green('✓ ' + message));
}

// Errors go to stderr so they stay visible when stdout is piped
export function error(message: string): void {
  console.error(chalk.red('✗ ' + message));
}

export function info(message: string): void {
  console.log(chalk.blue('ℹ ' + message));
}

// Prints the message, then a dot every 500 ms; call the returned function to stop
export function spinner(message: string): () => void {
  process.stdout.write(message + '... ');
  const interval = setInterval(() => {
    process.stdout.write('.');
  }, 500);

  return () => {
    clearInterval(interval);
    process.stdout.write(' done\n');
  };
}

export function table(data: Record<string, string>[]): void {
  // Simple table rendering
  console.table(data);
}
```
Dependencies:
chalk
File: cli/commands/run.ts
Purpose: forge run command.
Exports:
```typescript
import { Pipeline, InMemoryContext, SQLiteCheckpointManager } from '../../orchestrator';
import { InMemoryEventBus } from '../../core/bus';
import { SQLiteMemoryStore } from '../../memory';
import { PlannerAgent, ImplementerAgent, ReviewerAgent, TesterAgent, DeployerAgent } from '../../agents';
import { success, error, spinner } from '../ui';
import { loadConfig } from '../../core/config';

export async function runCommand(task: string): Promise<void> {
  const stop = spinner(`Starting pipeline for: ${task}`);

  try {
    // Initialize infrastructure
    const config = await loadConfig();
    const db = /* initialize DB */;
    const bus = new InMemoryEventBus(db);
    const context = new InMemoryContext();
    const checkpointManager = new SQLiteCheckpointManager(db);
    const memoryStore = new SQLiteMemoryStore(config.memory.dbPath);

    // Create agents
    const planner = new PlannerAgent();
    const implementer = new ImplementerAgent();
    const reviewer = new ReviewerAgent();
    const tester = new TesterAgent();
    const deployer = new DeployerAgent();

    // Build pipeline
    const pipeline = new Pipeline(
      {
        phases: [
          { name: 'planning', agent: planner, guards: [], gates: [], breakers: [], next: 'implementation' },
          { name: 'implementation', agent: implementer, guards: [], gates: [], breakers: [], next: 'review' },
          { name: 'review', agent: reviewer, guards: [], gates: [], breakers: [], next: 'testing' },
          { name: 'testing', agent: tester, guards: [], gates: [], breakers: [], next: 'deployment' },
          { name: 'deployment', agent: deployer, guards: [], gates: [], breakers: [], next: null },
        ],
        maxBounces: 3,
      },
      bus,
      context,
      checkpointManager
    );

    // Execute
    await pipeline.execute(task, /* agent context */);

    stop();
    success('Pipeline completed successfully');
  } catch (err) {
    stop();
    error(`Pipeline failed: ${(err as Error).message}`);
    process.exit(1);
  }
}
```
Dependencies:
- ../../orchestrator
- ../../core/*
- ../../memory
- ../../agents
- ../ui
File: cli/commands/review.ts, status.ts, etc.
Similar structure to run.ts, implementing specific commands.
10. Module Interfaces
Import Strategy
No barrel files by default. Import explicitly from each module:
```typescript
// ✓ GOOD - Explicit imports
import type { Agent, Tool } from './core/types';
import { InMemoryEventBus } from './core/bus';
import { SQLiteMemoryStore } from './memory/store';

// ✗ BAD - Barrel imports
import { Agent, Tool } from './core'; // Would require index.ts barrel
```
Exception: each module has an index.ts that re-exports its public API for convenience; using it is optional.
Public API per Module
core:
- Types: Agent, Tool, ForgeEvent, Phase, Memory, Checkpoint, ForgeConfig
- Classes: InMemoryEventBus, ForgeError, CircuitBreakerError
- Functions: loadConfig, validateConfig

safety:
- Classes: IterationBreaker, CostBreaker, TimeBreaker, ErrorRateBreaker, BreakerManager, GateManager, SimpleCostTracker
- Constants: DEFAULT_GATES

memory:
- Classes: SQLiteMemoryStore, EpisodicMemoryManager, PatternManager, ProceduralMemoryManager, ConsolidationJob
- Schema: events, memories, patterns, checkpoints, runs, findings tables

tools:
- Classes: SimpleToolRegistry, AnthropicProvider, GitHubClient
- Tools: gitDiffTool, gitLogTool, shellTool, lintTool, testTool
- Functions: createLLMTool, createGitHubTools

agents:
- Classes: BaseAgent, PlannerAgent, ImplementerAgent, ReviewerAgent, TesterAgent, DeployerAgent

orchestrator:
- Classes: Pipeline, InMemoryContext, SQLiteCheckpointManager

cli:
- Entry point: cli/index.ts (executable)
- Functions: runCommand, reviewCommand, statusCommand
11. Initialization Order
Application Startup Sequence
1. Load configuration
└─ loadConfig('./forge.config.ts')
└─ Validate with Zod
└─ Merge with defaults
2. Initialize database
└─ Open SQLite connection
└─ Run migrations (Drizzle)
└─ Create tables if not exist
3. Initialize infrastructure
├─ Event bus (InMemoryEventBus)
├─ Memory store (SQLiteMemoryStore)
├─ Checkpoint manager (SQLiteCheckpointManager)
└─ Shared context (InMemoryContext)
4. Initialize safety controls
├─ Create circuit breakers (IterationBreaker, CostBreaker, TimeBreaker, ErrorRateBreaker)
├─ Create breaker manager
├─ Create gate manager
└─ Create cost tracker
5. Initialize tools
├─ Create tool registry
├─ Register LLM tool
├─ Register Git tools
├─ Register GitHub tools
├─ Register runner tools
├─ Register linter tool
└─ Register test runner tool
6. Initialize agents
├─ Create PlannerAgent
├─ Create ImplementerAgent
├─ Create ReviewerAgent
├─ Create TesterAgent
└─ Create DeployerAgent
7. Initialize orchestrator
└─ Create Pipeline with phases
8. Register event handlers
└─ Subscribe to significant events for logging
9. Parse CLI arguments
└─ Commander.js parse
10. Execute command
└─ Dispatch to appropriate command handler
Dependency Injection Pattern
```typescript
// Main initialization function
async function initializeForge(): Promise<ForgeApp> {
  const config = await loadConfig();

  // Layer 1: Database
  const db = await initializeDatabase(config.memory.dbPath);

  // Layer 2: Infrastructure
  const bus = new InMemoryEventBus(db);
  const memoryStore = new SQLiteMemoryStore(db);
  const checkpointManager = new SQLiteCheckpointManager(db);
  const context = new InMemoryContext();

  // Layer 3: Safety
  const breakers = [
    new IterationBreaker(config.safety.iterations),
    new CostBreaker(config.safety.cost),
    new TimeBreaker(config.safety.time),
    new ErrorRateBreaker(0.25),
  ];
  const breakerManager = new BreakerManager(breakers);
  const gateManager = new GateManager(DEFAULT_GATES, bus, /* request handler */);
  const costTracker = new SimpleCostTracker(config.safety.costPerRun);

  // Layer 4: Tools
  const toolRegistry = new SimpleToolRegistry();
  const llmProvider = new AnthropicProvider(process.env.ANTHROPIC_API_KEY!);
  toolRegistry.register(createLLMTool(llmProvider));
  toolRegistry.register(gitDiffTool);
  toolRegistry.register(gitLogTool);
  toolRegistry.register(shellTool);
  toolRegistry.register(lintTool);
  toolRegistry.register(testTool);

  // Layer 5: Agents
  const planner = new PlannerAgent();
  const implementer = new ImplementerAgent();
  const reviewer = new ReviewerAgent();
  const tester = new TesterAgent();
  const deployer = new DeployerAgent();

  // Layer 6: Orchestrator
  const pipeline = new Pipeline(
    {
      phases: [
        { name: 'planning', agent: planner, /* ... */ },
        { name: 'implementation', agent: implementer, /* ... */ },
        { name: 'review', agent: reviewer, /* ... */ },
        { name: 'testing', agent: tester, /* ... */ },
        { name: 'deployment', agent: deployer, /* ... */ },
      ],
      maxBounces: 3,
    },
    bus,
    context,
    checkpointManager
  );

  return {
    config,
    bus,
    memoryStore,
    toolRegistry,
    pipeline,
    // ... everything needed
  };
}
```
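Step 8 of the startup sequence (registering event handlers for logging) could look roughly like the sketch below. It assumes the bus exposes `on(type, handler)` and `emit(event)` as used in the bus unit test. `TinyBus` is a hypothetical stand-in without persistence, and both `registerLoggingHandlers` and the choice of which events count as "significant" are assumptions based on the events `Pipeline` emits.

```typescript
// Event shape as emitted by Pipeline (type, source, traceId, payload).
interface ForgeEventLike {
  type: string;
  source: string;
  traceId: string;
  payload: Record<string, unknown>;
}

type Handler = (event: ForgeEventLike) => void;

// Hypothetical stand-in for InMemoryEventBus: only on() and emit(), no persistence.
class TinyBus {
  private handlers = new Map<string, Handler[]>();

  on(type: string, handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  async emit(event: ForgeEventLike): Promise<void> {
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

// Event types the Pipeline emits; treated here as the "significant" ones.
const LOGGED_EVENTS = ['run.started', 'phase.entered', 'run.completed'];

// Subscribes one logging handler per event type; `sink` defaults to console.log.
function registerLoggingHandlers(
  bus: TinyBus,
  sink: (line: string) => void = console.log
): void {
  for (const type of LOGGED_EVENTS) {
    bus.on(type, (event) => {
      sink(`[${event.traceId}] ${event.type} ${JSON.stringify(event.payload)}`);
    });
  }
}
```

Passing the sink as a parameter keeps the handler testable: a test can collect lines in an array instead of writing to the console.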
12. Testing Structure
Test Organization
Unit tests mirror the source structure under tests/unit/; integration tests live in a separate tests/integration/ folder.
tests/
├── unit/ # Unit tests (mirror src/)
│ ├── core/
│ │ ├── bus.test.ts # Tests for core/bus.ts
│ │ ├── config.test.ts # Tests for core/config.ts
│ │ └── errors.test.ts
│ ├── safety/
│ │ ├── breakers.test.ts
│ │ ├── gates.test.ts
│ │ └── budget.test.ts
│ ├── memory/
│ │ ├── store.test.ts
│ │ ├── episodes.test.ts
│ │ └── patterns.test.ts
│ ├── tools/
│ │ ├── registry.test.ts
│ │ ├── llm.test.ts
│ │ └── git.test.ts
│ └── agents/
│ ├── base.test.ts
│ ├── planner.test.ts
│ └── reviewer.test.ts
│
├── integration/ # Integration tests
│ ├── pipeline.test.ts # Full pipeline execution
│ ├── review-flow.test.ts # Review → fix → re-review
│ ├── test-flow.test.ts # Test → analyze → fix
│ └── memory-recall.test.ts # Memory persistence and recall
│
├── fixtures/ # Test data
│ ├── sample-repos/
│ │ └── simple-ts-project/ # Sample TS project for testing
│ ├── mock-responses/
│ │ ├── llm-planning.json # Mock LLM responses
│ │ └── github-pr.json
│ └── test-configs/
│ └── test-forge.config.ts # Test configuration
│
└── helpers/ # Test utilities
├── mock-llm.ts # LLM mock for testing
├── temp-db.ts # Temporary SQLite DB setup
└── fixtures.ts # Fixture loading utilities
Test Utilities
tests/helpers/temp-db.ts
```typescript
import { Database } from 'bun:sqlite';
import { drizzle } from 'drizzle-orm/bun-sqlite';
import { migrate } from 'drizzle-orm/bun-sqlite/migrator';

export async function createTempDB() {
  const sqlite = new Database(':memory:');
  const db = drizzle(sqlite);

  // Run migrations
  await migrate(db, { migrationsFolder: './drizzle' });

  return { db, sqlite, cleanup: () => sqlite.close() };
}
```
tests/helpers/mock-llm.ts
```typescript
import type { LLMProvider, ChatRequest, ChatResponse } from '../../src/tools/llm';

export class MockLLMProvider implements LLMProvider {
  private responses: ChatResponse[] = [];
  private callCount = 0;

  mockResponse(response: ChatResponse) {
    this.responses.push(response);
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    const response = this.responses[this.callCount] ?? {
      content: 'Mock response',
      done: true,
      result: {},
      usage: { promptTokens: 100, completionTokens: 100 },
      cost: 0.01,
    };
    this.callCount++;
    return response;
  }

  async embed(text: string): Promise<Float32Array> {
    return new Float32Array(1536); // Mock embedding
  }

  reset() {
    this.responses = [];
    this.callCount = 0;
  }
}
```
tests/helpers/fixtures.ts
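```typescript
import { readFileSync } from 'node:fs';
import { join } from 'node:path';

// Read synchronously so the declared string return type holds
// (Bun.file(...).text() returns a Promise<string>)
export function loadFixture(path: string): string {
  return readFileSync(join(__dirname, '../fixtures', path), 'utf8');
}

export function loadJSONFixture<T>(path: string): T {
  return JSON.parse(loadFixture(path)) as T;
}
```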
Example Unit Test
tests/unit/core/bus.test.ts
```typescript
import { describe, it, expect, beforeEach, afterEach } from 'bun:test';
import { InMemoryEventBus } from '../../../src/core/bus';
import { createTempDB } from '../../helpers/temp-db';

describe('InMemoryEventBus', () => {
  let bus: InMemoryEventBus;
  let cleanup: () => void;

  beforeEach(async () => {
    const { db, cleanup: clean } = await createTempDB();
    cleanup = clean;
    bus = new InMemoryEventBus(db);
  });

  afterEach(() => {
    cleanup();
  });

  it('should emit and receive events', async () => {
    let received: any = null;
    bus.on('test.event', (event) => {
      received = event;
    });

    await bus.emit({
      type: 'test.event',
      source: 'test',
      traceId: 'trace-123',
      payload: { foo: 'bar' },
    });

    // toBeDefined() would pass for the initial null, so check for null explicitly
    expect(received).not.toBeNull();
    expect(received.type).toBe('test.event');
    expect(received.payload).toEqual({ foo: 'bar' });
  });

  it('should replay events by traceId', async () => {
    await bus.emit({ type: 'test.1', source: 'test', traceId: 'trace-1', payload: {} });
    await bus.emit({ type: 'test.2', source: 'test', traceId: 'trace-1', payload: {} });
    await bus.emit({ type: 'test.3', source: 'test', traceId: 'trace-2', payload: {} });

    const events = await bus.replay('trace-1');

    expect(events).toHaveLength(2);
    expect(events[0].type).toBe('test.1');
    expect(events[1].type).toBe('test.2');
  });
});
```
Example Integration Test
tests/integration/pipeline.test.ts
```typescript
import { describe, it, expect } from 'bun:test';
import { Pipeline, InMemoryContext, SQLiteCheckpointManager } from '../../src/orchestrator';
import { InMemoryEventBus } from '../../src/core/bus';
import { PlannerAgent } from '../../src/agents';
import { MockLLMProvider } from '../helpers/mock-llm';
import { createTempDB } from '../helpers/temp-db';

describe('Pipeline Integration', () => {
  it('should execute a single planning phase and save a checkpoint', async () => {
    const { db, cleanup } = await createTempDB();
    const bus = new InMemoryEventBus(db);
    const context = new InMemoryContext();
    const checkpointManager = new SQLiteCheckpointManager(db);
    const mockLLM = new MockLLMProvider();

    // Pipeline.execute() generates its own trace ID, so capture it from the event
    let runTraceId = '';
    bus.on('run.started', (event) => {
      runTraceId = event.traceId;
    });

    // Mock LLM responses
    mockLLM.mockResponse({
      content: '',
      done: true,
      result: {
        architecture: { components: ['UserService'] },
        tasks: [{ name: 'Create UserService', priority: 1 }],
        risk: { level: 'low', reasons: [] },
      },
      usage: { promptTokens: 100, completionTokens: 200 },
      cost: 0.05,
    });

    const pipeline = new Pipeline(
      {
        phases: [
          { name: 'planning', agent: new PlannerAgent(), guards: [], gates: [], breakers: [], next: null },
        ],
        maxBounces: 3,
      },
      bus,
      context,
      checkpointManager
    );

    try {
      await pipeline.execute('Add user authentication', {
        traceId: 'test-trace',
        bus,
        memory: /* mock */,
        llm: mockLLM,
        tools: /* mock */,
        safety: /* mock */,
        config: /* mock */,
        cost: /* mock */,
        elapsed: 0,
      });

      // Verify checkpoint was saved under the run's trace ID
      const checkpoints = await checkpointManager.list(runTraceId);
      expect(checkpoints).toHaveLength(1);
      expect(checkpoints[0].phase).toBe('planning');
    } finally {
      cleanup();
    }
  });
});
```
Test Coverage Goals
| Module | Target Coverage | Critical Paths |
|---|---|---|
| core/* | 90% | Event bus, config loading |
| safety/* | 95% | Breaker logic, gate conditions |
| memory/* | 85% | CRUD operations, recall |
| tools/* | 70% | Tool execution (lots of external deps) |
| agents/* | 80% | Agent loop, reflection |
| orchestrator/* | 85% | Phase transitions, checkpointing |
| cli/* | 60% | Command dispatch (hard to test UI) |
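One way to enforce these targets, for example in CI, is to compare measured per-module coverage against the table. The sketch below copies the targets from the table; `findCoverageShortfalls` and the sample measurements are hypothetical, and how the per-module percentages are obtained from the coverage tool is left open.

```typescript
// Targets copied from the coverage table, in percent.
const COVERAGE_TARGETS: Record<string, number> = {
  core: 90,
  safety: 95,
  memory: 85,
  tools: 70,
  agents: 80,
  orchestrator: 85,
  cli: 60,
};

interface Shortfall {
  moduleName: string;
  target: number;
  actual: number;
}

// Returns every module whose measured coverage is below its target.
// Modules missing from `actual` count as 0% covered.
function findCoverageShortfalls(actual: Record<string, number>): Shortfall[] {
  return Object.entries(COVERAGE_TARGETS)
    .map(([moduleName, target]) => ({ moduleName, target, actual: actual[moduleName] ?? 0 }))
    .filter((row) => row.actual < row.target);
}

// Hypothetical measurements, for illustration only.
const shortfalls = findCoverageShortfalls({
  core: 92,
  safety: 91,
  memory: 85,
  tools: 74,
  agents: 80,
  orchestrator: 88,
  cli: 40,
});

for (const s of shortfalls) {
  console.log(`${s.moduleName}: ${s.actual}% < ${s.target}%`);
}
// prints "safety: 91% < 95%" then "cli: 40% < 60%"
```

A CI step could fail the build whenever the returned array is non-empty.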
Summary
This module map provides:
- Complete file tree - Every file that will exist
- Dependency graph - Visual representation of module relationships
- Detailed module breakdowns - Each module's files with purpose, exports, and dependencies
- Module interfaces - Public API of each module
- Initialization order - How the system boots up
- Testing structure - Where tests live and how they're organized
You can now generate skeleton files for any module with imports, exports, and function signatures based on this specification.
The design follows these key principles:
- No circular dependencies (cross-module communication flows through the event bus)
- Explicit imports (no barrel files unless for convenience)
- Layer-based architecture (each layer only depends on layers below)
- Testability (dependency injection, mocks, in-memory implementations)
- Incremental buildability (can build and test each module independently)
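The layering rule can be checked mechanically. The sketch below is a minimal illustration under one assumption: the bottom-to-top layer order is taken from the order in which this document lists the modules (core, safety, memory, tools, agents, orchestrator, cli). `LAYERS`, `isAllowed`, and `findViolations` are hypothetical names, and extracting import edges from the source files is left out.

```typescript
// Assumed bottom-to-top layer order, following the module ordering in this document.
const LAYERS = ['core', 'safety', 'memory', 'tools', 'agents', 'orchestrator', 'cli'] as const;
type Layer = (typeof LAYERS)[number];

interface ImportEdge {
  from: Layer; // module containing the import statement
  to: Layer;   // module being imported
}

// An edge is allowed when it stays inside a module or points to a lower layer.
function isAllowed(edge: ImportEdge): boolean {
  return LAYERS.indexOf(edge.to) <= LAYERS.indexOf(edge.from);
}

// Returns the edges that import from a higher layer.
function findViolations(edges: ImportEdge[]): ImportEdge[] {
  return edges.filter((edge) => !isAllowed(edge));
}

const violations = findViolations([
  { from: 'orchestrator', to: 'memory' }, // checkpoint.ts imports ../memory/schema
  { from: 'cli', to: 'orchestrator' },    // run.ts imports ../../orchestrator
  { from: 'core', to: 'agents' },         // hypothetical upward import
]);

for (const v of violations) {
  console.log(`Layer violation: ${v.from} imports ${v.to}`);
}
// only the core -> agents edge is reported
```

Since every allowed edge points strictly downward or stays within one module, a graph with no violations cannot contain a cycle between modules.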