February 8, 2026

Safety System Implementation Plan

Overview

This plan specifies the complete implementation of Forge's safety system: circuit breakers, human gates, the automation ladder, cost tracking, and audit trails. The safety system is the nervous system that prevents runaway execution, enforces human oversight at critical points, and enables progressive autonomy.

Build Priority: P0 - Must be built alongside core loop infrastructure (Week 1-2)

Dependencies:

  • Core types and event bus (from core/)
  • Memory system (for learning from safety events)
  • Tool execution context (for cost tracking)

1. Circuit Breaker Framework

1.1 Base Circuit Breaker Interface

File: src/safety/breakers.ts

typescript
// ─── Core Abstractions ─────────────────────────────────────
// PhaseName is expected to come from the core types listed under Dependencies.

export enum BreakerState {
  CLOSED = 'closed',       // Normal operation
  HALF_OPEN = 'half_open', // Testing after failure
  OPEN = 'open'            // Halted due to threshold breach
}

export interface BreakerResult {
  shouldBreak: boolean;
  state: BreakerState;
  reason?: string;
  currentValue: number;
  threshold: number;
  suggestion?: string;
}

export interface BreakerConfig {
  enabled: boolean;
  threshold: number;
  warning: number;      // Fraction of threshold (0-1)
  resetAfter?: number;  // milliseconds
}

// ─── Abstract Base Class ───────────────────────────────────

export abstract class CircuitBreaker {
  protected state: BreakerState = BreakerState.CLOSED;
  protected lastFailureTime?: Date;
  protected failureCount: number = 0;

  constructor(
    public readonly name: string,
    // Stored as `_config` so that subclasses can expose a narrowed
    // `config` getter (they read `this._config`) without a name clash.
    protected readonly _config: BreakerConfig
  ) {}

  /**
   * Check if the breaker should trip.
   * Called before each iteration or significant operation.
   */
  abstract check(context: BreakerContext): Promise<BreakerResult>;

  /**
   * Reset the breaker to closed state.
   * Called after successful operations or timeout.
   */
  reset(): void {
    this.state = BreakerState.CLOSED;
    this.failureCount = 0;
    this.lastFailureTime = undefined;
  }

  /**
   * Check if enough time has passed to attempt reset.
   */
  protected shouldAttemptReset(): boolean {
    if (!this.lastFailureTime || !this._config.resetAfter) return false;
    const elapsed = Date.now() - this.lastFailureTime.getTime();
    return elapsed >= this._config.resetAfter;
  }

  /**
   * Record a failure and open the circuit.
   */
  protected recordFailure(reason: string): void {
    this.failureCount++;
    this.lastFailureTime = new Date();
    this.state = BreakerState.OPEN;
  }

  /**
   * Check if current value is in warning range.
   */
  protected isWarning(current: number, threshold: number): boolean {
    return current >= threshold * this._config.warning;
  }
}

// ─── Context passed to breakers ────────────────────────────

export interface BreakerContext {
  traceId: string;
  phase: PhaseName;
  iteration: number;
  cost: CostAccumulator;
  elapsed: number;
  errorWindow: ErrorEvent[];
}

export interface ErrorEvent {
  timestamp: Date;
  severity: 'error' | 'warning';
  source: string;
  message: string;
}

export interface CostAccumulator {
  phase: number;
  run: number;
  day: number;
}

1.2 State Machine Implementation

State Transitions:

  CLOSED ─────── threshold exceeded ──────▶ OPEN
  OPEN ───────── resetAfter timeout ──────▶ HALF_OPEN
  HALF_OPEN ──── success ─────────────────▶ CLOSED
  HALF_OPEN ──── failure continues ───────▶ OPEN

Implementation:

typescript
// In CircuitBreaker base class
protected transitionState(
  shouldTrip: boolean,
  context: BreakerContext
): BreakerState {
  switch (this.state) {
    case BreakerState.CLOSED:
      if (shouldTrip) {
        this.recordFailure('threshold exceeded');
        return BreakerState.OPEN;
      }
      return BreakerState.CLOSED;

    case BreakerState.OPEN:
      if (this.shouldAttemptReset()) {
        // Persist the probe state so the next call takes the HALF_OPEN branch.
        this.state = BreakerState.HALF_OPEN;
        return BreakerState.HALF_OPEN;
      }
      return BreakerState.OPEN;

    case BreakerState.HALF_OPEN:
      if (shouldTrip) {
        this.recordFailure('failed during recovery');
        return BreakerState.OPEN;
      } else {
        this.reset();
        return BreakerState.CLOSED;
      }
  }
}
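The transition rules can also be read as a pure function, which makes them easy to unit-test in isolation. The following is a minimal sketch rather than the plan's implementation: `State`, `TransitionInput`, and `nextState` are hypothetical names, and elapsed time is passed in explicitly instead of being read from `Date.now()`.

```typescript
// Hypothetical standalone version of the transition table (illustrative names).
type State = 'closed' | 'open' | 'half_open';

interface TransitionInput {
  state: State;
  shouldTrip: boolean;     // Is the threshold currently exceeded?
  msSinceFailure: number;  // Time since the last recorded failure
  resetAfterMs: number;    // Cool-down before a recovery probe is allowed
}

function nextState(input: TransitionInput): State {
  const { state, shouldTrip, msSinceFailure, resetAfterMs } = input;
  switch (state) {
    case 'closed':
      // Trip only when the threshold is exceeded.
      return shouldTrip ? 'open' : 'closed';
    case 'open':
      // Stay open until the cool-down has elapsed, then probe.
      return msSinceFailure >= resetAfterMs ? 'half_open' : 'open';
    case 'half_open':
      // A failed probe reopens the circuit; a clean probe closes it.
      return shouldTrip ? 'open' : 'closed';
  }
}
```

Passing the elapsed time as an argument keeps the function deterministic, so each row of the transition list can be checked with a single call.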

2. Iteration Breaker Implementation

2.1 Iteration Limits

File: src/safety/breakers.ts

typescript
// ─── Iteration Breaker Configuration ───────────────────────

export interface IterationBreakerConfig extends BreakerConfig {
  limits: {
    default: number;
    planning: number;
    implementation: number;
    review: number;
    testing: number;
    deployment: number;
  };
  stagnation: {
    threshold: number; // Consecutive iterations without progress
    definition: StagnationDetector;
  };
}

export const DEFAULT_ITERATION_CONFIG: IterationBreakerConfig = {
  enabled: true,
  threshold: 10,           // Overridden per phase
  warning: 0.8,            // Warn at 80%
  resetAfter: 5 * 60_000,  // 5 minutes
  limits: {
    default: 10,
    planning: 20,        // Architecture can be complex
    implementation: 50,  // Code generation with retries
    review: 10,          // Review iterations
    testing: 5,          // Fix/retry cycles
    deployment: 3,       // Deployment attempts
  },
  stagnation: {
    threshold: 3,             // 3 consecutive no-progress iterations
    definition: null as any,  // Injected
  }
};

// ─── Iteration Breaker Implementation ──────────────────────

export class IterationBreaker extends CircuitBreaker {
  private iterationCounts = new Map<string, number>();
  private lastProgress = new Map<string, Date>();
  private stagnationCounts = new Map<string, number>();

  constructor(config: IterationBreakerConfig) {
    super('iteration', config);
  }

  private get config(): IterationBreakerConfig {
    return this._config as IterationBreakerConfig;
  }

  async check(context: BreakerContext): Promise<BreakerResult> {
    const key = `${context.traceId}:${context.phase}`;
    const currentCount = this.iterationCounts.get(key) ?? 0;
    const newCount = currentCount + 1;

    // Update count
    this.iterationCounts.set(key, newCount);

    // Get phase-specific limit
    const limit = this.config.limits[context.phase] ?? this.config.limits.default;

    // Check stagnation
    const stagnated = await this.checkStagnation(key, context);
    if (stagnated) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'STAGNATION_DETECTED',
        currentValue: this.stagnationCounts.get(key) ?? 0,
        threshold: this.config.stagnation.threshold,
        suggestion: 'Agent is not making progress. Consider simplifying approach or requesting human input.'
      };
    }

    // Check hard limit
    if (newCount > limit) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'MAX_ITERATIONS_EXCEEDED',
        currentValue: newCount,
        threshold: limit,
        suggestion: `Phase ${context.phase} exceeded ${limit} iterations. Task may be too complex or approach may be incorrect.`
      };
    }

    // Warning check
    if (this.isWarning(newCount, limit)) {
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'APPROACHING_ITERATION_LIMIT',
        currentValue: newCount,
        threshold: limit,
        suggestion: `${limit - newCount} iterations remaining in ${context.phase} phase.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: newCount,
      threshold: limit
    };
  }

  private async checkStagnation(
    key: string,
    context: BreakerContext
  ): Promise<boolean> {
    const detector = this.config.stagnation.definition;
    // The default config ships with a null detector until one is injected;
    // without a detector, stagnation cannot be judged, so do not trip.
    if (!detector) return false;

    const hasProgress = await detector.hasProgress(context);
    if (hasProgress) {
      // Reset stagnation counter
      this.lastProgress.set(key, new Date());
      this.stagnationCounts.set(key, 0);
      return false;
    }

    // No progress detected
    const stagnationCount = (this.stagnationCounts.get(key) ?? 0) + 1;
    this.stagnationCounts.set(key, stagnationCount);
    return stagnationCount >= this.config.stagnation.threshold;
  }

  resetPhase(traceId: string, phase: PhaseName): void {
    const key = `${traceId}:${phase}`;
    this.iterationCounts.delete(key);
    this.lastProgress.delete(key);
    this.stagnationCounts.delete(key);
  }
}

// ─── Stagnation Detection ──────────────────────────────────

export interface StagnationDetector {
  hasProgress(context: BreakerContext): Promise<boolean>;
}

export class DefaultStagnationDetector implements StagnationDetector {
  async hasProgress(context: BreakerContext): Promise<boolean> {
    // Phase-specific progress definitions
    switch (context.phase) {
      case 'planning':
        return this.planningProgress(context);
      case 'implementation':
        return this.implementationProgress(context);
      case 'review':
        return this.reviewProgress(context);
      case 'testing':
        return this.testingProgress(context);
      case 'deployment':
        return this.deploymentProgress(context);
      default:
        return false;
    }
  }

  private planningProgress(context: BreakerContext): boolean {
    // Progress = new decisions, architecture elements, or tasks defined.
    // Check if output has grown/changed from previous iteration.
    // This requires access to phase state, injected via context.
    const state = (context as any).phaseState;
    return state?.tasksCount > (state?.previousTasksCount ?? 0);
  }

  private implementationProgress(context: BreakerContext): boolean {
    // Progress = new code written, files modified, or tests pass
    const state = (context as any).phaseState;
    return (
      state?.filesModified > 0 ||
      state?.linesAdded > (state?.previousLinesAdded ?? 0) ||
      state?.testsPassedDelta > 0
    );
  }

  private reviewProgress(context: BreakerContext): boolean {
    // Progress = findings addressed or new findings detected
    const state = (context as any).phaseState;
    return (
      state?.findingsResolved > 0 ||
      state?.newFindings > 0
    );
  }

  private testingProgress(context: BreakerContext): boolean {
    // Progress = tests pass, failures reduce, or new tests added
    const state = (context as any).phaseState;
    return (
      state?.testsPassed > (state?.previousTestsPassed ?? 0) ||
      state?.failuresReduced > 0
    );
  }

  private deploymentProgress(context: BreakerContext): boolean {
    // Progress = deployment step completed or health improved
    const state = (context as any).phaseState;
    return (
      state?.deploymentStage !== state?.previousDeploymentStage ||
      state?.healthScore > (state?.previousHealthScore ?? 0)
    );
  }
}

3. Cost Breaker Implementation

3.1 Cost Tracking and Budgets

File: src/safety/budget.ts

typescript
// ─── Cost Tracking Infrastructure ──────────────────────────

export interface CostEvent {
  timestamp: Date;
  phase: PhaseName;
  model: string;
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
  operation: string;
}

export interface ModelPricing {
  input: number;   // USD per 1K tokens
  output: number;  // USD per 1K tokens
}

// Rates as recorded in this plan; confirm against current provider
// price sheets before relying on them for budget enforcement.
export const MODEL_PRICING: Record<string, ModelPricing> = {
  'claude-sonnet-4-5-20250929': { input: 0.003, output: 0.015 },
  'claude-haiku-4-5-20251001': { input: 0.00025, output: 0.00125 },
  'claude-opus-4-6': { input: 0.015, output: 0.075 },
  'gpt-4o': { input: 0.0025, output: 0.01 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
};

export class CostTracker {
  private costs: CostEvent[] = [];

  /**
   * Record a cost event and return its computed cost in USD.
   */
  record(event: Omit<CostEvent, 'costUsd'>): number {
    const pricing = MODEL_PRICING[event.model];
    if (!pricing) {
      throw new Error(`Unknown model pricing: ${event.model}`);
    }

    const costUsd =
      (event.promptTokens / 1000) * pricing.input +
      (event.completionTokens / 1000) * pricing.output;

    const fullEvent: CostEvent = { ...event, costUsd };
    this.costs.push(fullEvent);
    return costUsd;
  }

  /**
   * Get accumulated cost for a scope.
   */
  accumulate(scope: CostScope): number {
    const filtered = this.filter(scope);
    return filtered.reduce((sum, e) => sum + e.costUsd, 0);
  }

  private filter(scope: CostScope): CostEvent[] {
    const now = Date.now();
    return this.costs.filter(event => {
      // Time window
      if (scope.since) {
        const age = now - event.timestamp.getTime();
        if (age > scope.since) return false;
      }
      // Phase filter
      if (scope.phase && event.phase !== scope.phase) {
        return false;
      }
      return true;
    });
  }
}

export interface CostScope {
  since?: number;  // milliseconds ago
  phase?: PhaseName;
}
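As a worked example of the per-call arithmetic in `record`, the sketch below prices a single request using the `gpt-4o` rates from the table above. The helper name `costUsd`, the `Pricing` type, and the token counts are illustrative assumptions, not part of the plan.

```typescript
// Illustrative helper mirroring the formula used in CostTracker.record.
interface Pricing {
  input: number;   // USD per 1K prompt tokens
  output: number;  // USD per 1K completion tokens
}

function costUsd(promptTokens: number, completionTokens: number, pricing: Pricing): number {
  return (promptTokens / 1000) * pricing.input +
         (completionTokens / 1000) * pricing.output;
}

// Rates copied from the 'gpt-4o' row of MODEL_PRICING.
const gpt4o: Pricing = { input: 0.0025, output: 0.01 };

// Hypothetical call: 12,000 prompt tokens and 3,000 completion tokens.
// 12 * 0.0025 + 3 * 0.01 = 0.03 + 0.03, i.e. about $0.06.
const exampleCost = costUsd(12_000, 3_000, gpt4o);

// Share of a $10.00 phase budget consumed by this one call (about 0.6%).
const shareOfBudget = exampleCost / 10.0;
```

Because the result is a floating-point sum, comparisons against budgets are safer with `>=` on accumulated totals, as the breaker does, than with exact equality.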

3.2 Cost Breaker

File: src/safety/breakers.ts

typescript
// ─── Cost Breaker Configuration ────────────────────────────

export interface CostBreakerConfig extends BreakerConfig {
  budgets: {
    perPhase: Record<PhaseName, number>;  // USD
    perRun: number;
    perDay: number;
  };
}

export const DEFAULT_COST_CONFIG: CostBreakerConfig = {
  enabled: true,
  threshold: 1.0,  // 100% of budget
  warning: 0.8,    // Warn at 80%
  budgets: {
    perPhase: {
      planning: 5.0,
      implementation: 10.0,
      review: 2.0,
      testing: 3.0,
      deployment: 2.0,
    },
    perRun: 50.0,
    perDay: 200.0,
  }
};

// ─── Cost Breaker Implementation ───────────────────────────

export class CostBreaker extends CircuitBreaker {
  constructor(
    private tracker: CostTracker,
    config: CostBreakerConfig
  ) {
    super('cost', config);
  }

  private get config(): CostBreakerConfig {
    return this._config as CostBreakerConfig;
  }

  async check(context: BreakerContext): Promise<BreakerResult> {
    const checks = await Promise.all([
      this.checkPhase(context),
      this.checkRun(context),
      this.checkDay(context),
    ]);

    // Return the most severe result
    const failed = checks.find(c => c.shouldBreak);
    if (failed) return failed;

    const warning = checks.find(c => c.reason?.includes('WARNING'));
    if (warning) return warning;

    return checks[0];  // Default to phase check
  }

  private async checkPhase(context: BreakerContext): Promise<BreakerResult> {
    const spent = this.tracker.accumulate({ phase: context.phase });
    const budget = this.config.budgets.perPhase[context.phase] ?? 5.0;

    if (spent >= budget) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'PHASE_BUDGET_EXCEEDED',
        currentValue: spent,
        threshold: budget,
        suggestion: `Phase ${context.phase} exceeded $${budget.toFixed(2)} budget. Consider simplifying approach.`
      };
    }

    if (this.isWarning(spent, budget)) {
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'PHASE_BUDGET_WARNING',
        currentValue: spent,
        threshold: budget,
        suggestion: `Phase ${context.phase} at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: spent,
      threshold: budget
    };
  }

  private async checkRun(context: BreakerContext): Promise<BreakerResult> {
    const spent = context.cost.run;
    const budget = this.config.budgets.perRun;

    if (spent >= budget) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'RUN_BUDGET_EXCEEDED',
        currentValue: spent,
        threshold: budget,
        suggestion: `Run exceeded $${budget.toFixed(2)} total budget.`
      };
    }

    if (this.isWarning(spent, budget)) {
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'RUN_BUDGET_WARNING',
        currentValue: spent,
        threshold: budget,
        suggestion: `Run at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: spent,
      threshold: budget
    };
  }

  private async checkDay(context: BreakerContext): Promise<BreakerResult> {
    const spent = this.tracker.accumulate({ since: 24 * 60 * 60_000 });
    const budget = this.config.budgets.perDay;

    if (spent >= budget) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'DAILY_BUDGET_EXCEEDED',
        currentValue: spent,
        threshold: budget,
        suggestion: `Daily budget of $${budget.toFixed(2)} exceeded. Operations halted until tomorrow.`
      };
    }

    if (this.isWarning(spent, budget)) {
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'DAILY_BUDGET_WARNING',
        currentValue: spent,
        threshold: budget,
        suggestion: `Daily spending at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: spent,
      threshold: budget
    };
  }
}

4. Time Breaker Implementation

4.1 Time Tracking

File: src/safety/breakers.ts

typescript
// ─── Time Breaker Configuration ────────────────────────────

export interface TimeBreakerConfig extends BreakerConfig {
  timeouts: Record<PhaseName | 'total', number>;  // milliseconds
}

export const DEFAULT_TIME_CONFIG: TimeBreakerConfig = {
  enabled: true,
  threshold: 1.0,  // 100% of timeout
  warning: 0.9,    // Warn at 90%
  timeouts: {
    planning: 30 * 60_000,        // 30 minutes
    implementation: 60 * 60_000,  // 1 hour
    review: 30 * 60_000,          // 30 minutes
    testing: 20 * 60_000,         // 20 minutes
    deployment: 15 * 60_000,      // 15 minutes
    total: 120 * 60_000,          // 2 hours
  }
};

// ─── Time Breaker Implementation ───────────────────────────

export class TimeBreaker extends CircuitBreaker {
  private startTimes = new Map<string, Date>();

  constructor(config: TimeBreakerConfig) {
    super('time', config);
  }

  private get config(): TimeBreakerConfig {
    return this._config as TimeBreakerConfig;
  }

  startPhase(traceId: string, phase: PhaseName): void {
    const key = `${traceId}:${phase}`;
    this.startTimes.set(key, new Date());
  }

  async check(context: BreakerContext): Promise<BreakerResult> {
    const phaseCheck = await this.checkPhase(context);
    if (phaseCheck.shouldBreak) return phaseCheck;

    const totalCheck = await this.checkTotal(context);
    if (totalCheck.shouldBreak) return totalCheck;

    // Return warning if any
    return phaseCheck.reason ? phaseCheck : totalCheck;
  }

  private async checkPhase(context: BreakerContext): Promise<BreakerResult> {
    const key = `${context.traceId}:${context.phase}`;
    const startTime = this.startTimes.get(key);

    if (!startTime) {
      // Not started yet, no violation
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        currentValue: 0,
        threshold: this.config.timeouts[context.phase]
      };
    }

    const elapsed = Date.now() - startTime.getTime();
    const limit = this.config.timeouts[context.phase];

    if (elapsed >= limit) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'PHASE_TIMEOUT_EXCEEDED',
        currentValue: elapsed,
        threshold: limit,
        suggestion: `Phase ${context.phase} exceeded ${limit / 60_000} minute timeout.`
      };
    }

    if (this.isWarning(elapsed, limit)) {
      const remaining = limit - elapsed;
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'PHASE_TIMEOUT_WARNING',
        currentValue: elapsed,
        threshold: limit,
        suggestion: `Phase ${context.phase} has ${Math.ceil(remaining / 60_000)} minutes remaining.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: elapsed,
      threshold: limit
    };
  }

  private async checkTotal(context: BreakerContext): Promise<BreakerResult> {
    const elapsed = context.elapsed;
    const limit = this.config.timeouts.total;

    if (elapsed >= limit) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'TOTAL_TIMEOUT_EXCEEDED',
        currentValue: elapsed,
        threshold: limit,
        suggestion: `Total pipeline exceeded ${limit / 60_000} minute timeout.`
      };
    }

    if (this.isWarning(elapsed, limit)) {
      const remaining = limit - elapsed;
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'TOTAL_TIMEOUT_WARNING',
        currentValue: elapsed,
        threshold: limit,
        suggestion: `Pipeline has ${Math.ceil(remaining / 60_000)} minutes remaining.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: elapsed,
      threshold: limit
    };
  }
}

5. Error Rate Breaker Implementation

5.1 Sliding Window Error Tracking

File: src/safety/breakers.ts

typescript
// ─── Error Rate Breaker Configuration ──────────────────────

export interface ErrorRateBreakerConfig extends BreakerConfig {
  windowSize: number;  // milliseconds
  thresholds: {
    warning: number;   // Fraction of events that are errors
    critical: number;
  };
}

export const DEFAULT_ERROR_RATE_CONFIG: ErrorRateBreakerConfig = {
  enabled: true,
  threshold: 0.25,          // 25% error rate
  warning: 0.1,             // 10% error rate
  windowSize: 5 * 60_000,   // 5 minutes
  thresholds: {
    warning: 0.10,
    critical: 0.25,
  }
};

// ─── Circular Buffer for Event Window ──────────────────────

export class CircularBuffer<T extends { timestamp: Date }> {
  private buffer: T[] = [];
  private index = 0;  // Slot of the oldest item once the buffer is full

  constructor(private maxSize: number) {}

  add(item: T): void {
    if (this.buffer.length < this.maxSize) {
      this.buffer.push(item);
    } else {
      this.buffer[this.index] = item;
      this.index = (this.index + 1) % this.maxSize;
    }
  }

  filter(predicate: (item: T) => boolean): T[] {
    return this.buffer.filter(predicate);
  }

  removeOlderThan(timestamp: number): void {
    // Restore chronological order before filtering, then reset the write
    // index. Filtering in place would leave `index` pointing at an
    // arbitrary slot, so later overwrites could evict newer items.
    const ordered = [
      ...this.buffer.slice(this.index),
      ...this.buffer.slice(0, this.index)
    ];
    this.buffer = ordered.filter(item => item.timestamp.getTime() >= timestamp);
    this.index = 0;
  }

  get length(): number {
    return this.buffer.length;
  }

  get items(): T[] {
    return [...this.buffer];
  }
}

// ─── Error Rate Breaker Implementation ─────────────────────

export class ErrorRateBreaker extends CircuitBreaker {
  private errorWindow: CircularBuffer<ErrorEvent>;

  constructor(config: ErrorRateBreakerConfig) {
    super('error_rate', config);
    this.errorWindow = new CircularBuffer<ErrorEvent>(1000);  // Max 1000 events tracked
  }

  private get config(): ErrorRateBreakerConfig {
    return this._config as ErrorRateBreakerConfig;
  }

  /**
   * Record an error event.
   */
  recordError(event: ErrorEvent): void {
    this.errorWindow.add(event);
  }

  /**
   * Record a success event (for rate calculation).
   */
  recordSuccess(operation: string): void {
    this.errorWindow.add({
      timestamp: new Date(),
      severity: 'warning',  // Not an error; counts only toward the denominator
      source: operation,
      message: 'success'
    });
  }

  async check(context: BreakerContext): Promise<BreakerResult> {
    const now = Date.now();
    const windowStart = now - this.config.windowSize;

    // Clean old events
    this.errorWindow.removeOlderThan(windowStart);

    // Calculate error rate
    const allEvents = this.errorWindow.items;
    const errorEvents = allEvents.filter(e => e.severity === 'error');

    if (allEvents.length === 0) {
      // No events yet
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        currentValue: 0,
        threshold: this.config.thresholds.critical
      };
    }

    const errorRate = errorEvents.length / allEvents.length;

    // Check critical threshold
    if (errorRate >= this.config.thresholds.critical) {
      return {
        shouldBreak: true,
        state: BreakerState.OPEN,
        reason: 'ERROR_RATE_CRITICAL',
        currentValue: errorRate,
        threshold: this.config.thresholds.critical,
        suggestion: `Error rate ${(errorRate * 100).toFixed(1)}% exceeds critical threshold. Halting operations.`
      };
    }

    // Check warning threshold
    if (errorRate >= this.config.thresholds.warning) {
      return {
        shouldBreak: false,
        state: BreakerState.CLOSED,
        reason: 'ERROR_RATE_WARNING',
        currentValue: errorRate,
        threshold: this.config.thresholds.warning,
        suggestion: `Error rate ${(errorRate * 100).toFixed(1)}% above warning threshold. Monitoring closely.`
      };
    }

    return {
      shouldBreak: false,
      state: BreakerState.CLOSED,
      currentValue: errorRate,
      threshold: this.config.thresholds.critical
    };
  }
}
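The sliding-window calculation is easiest to follow with concrete numbers. The sketch below is a simplified, hypothetical restatement (plain arrays and numeric timestamps instead of `CircularBuffer` and `Date`), using the 5-minute window and the 10% / 25% thresholds from the default configuration above.

```typescript
// Simplified model of the sliding-window rate (illustrative names only).
interface Sample {
  timestampMs: number;
  isError: boolean;
}

function errorRate(samples: Sample[], nowMs: number, windowMs: number): number {
  // Keep only samples that fall inside the window ending at `nowMs`.
  const recent = samples.filter(s => s.timestampMs >= nowMs - windowMs);
  if (recent.length === 0) return 0;
  return recent.filter(s => s.isError).length / recent.length;
}

function classify(rate: number, warning = 0.10, critical = 0.25): 'ok' | 'warning' | 'critical' {
  if (rate >= critical) return 'critical';
  if (rate >= warning) return 'warning';
  return 'ok';
}

const nowMs = 600_000;
const windowMs = 5 * 60_000;
const samples: Sample[] = [
  { timestampMs: 100_000, isError: true },   // Outside the window, ignored
  { timestampMs: 400_000, isError: false },
  { timestampMs: 450_000, isError: true },
  { timestampMs: 500_000, isError: false },
  { timestampMs: 550_000, isError: false },
];

// One error among the four in-window samples gives a rate of 0.25,
// which meets the critical threshold.
const windowRate = errorRate(samples, nowMs, windowMs);
const verdict = classify(windowRate);
```

Note how the old error at `100_000` no longer counts: with small sample sizes a single failure can move the rate a long way, which is worth keeping in mind when choosing thresholds.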

6. Human Gates System

6.1 Gate Definitions

File: src/safety/gates.ts

typescript
// ─── Human Gate Types ──────────────────────────────────────

export interface HumanGate {
  id: string;
  phase: PhaseName | '*';  // '*' = any phase
  condition: GateCondition;
  prompt: string;
  timeout: number;         // milliseconds
  escalation?: string;     // Who to escalate to on timeout
}

export type GateCondition = (context: GateContext) => boolean | Promise<boolean>;

export interface GateContext {
  traceId: string;
  phase: PhaseName;
  plan?: ImplementationPlan;
  review?: ReviewResult;
  cost: CostAccumulator;
  environment?: string;
}

// ─── Predefined Gates ──────────────────────────────────────

export const STANDARD_GATES: HumanGate[] = [
  {
    id: 'architecture_approval',
    phase: 'planning',
    condition: (ctx) => {
      return ctx.plan?.risk.level === 'high' || ctx.plan?.risk.level === 'critical';
    },
    prompt: 'Review proposed architecture before implementation begins.',
    timeout: 24 * 60 * 60_000,  // 24 hours
    escalation: 'architect'
  },
  {
    id: 'production_deploy',
    phase: 'deployment',
    condition: (ctx) => {
      return ctx.environment === 'production';
    },
    prompt: 'Approve production deployment.',
    timeout: 60 * 60_000,  // 1 hour
    escalation: 'deployment_lead'
  },
  {
    id: 'security_findings',
    phase: 'review',
    condition: (ctx) => {
      return ctx.review?.findings.some(f =>
        f.severity === 'critical' && f.category === 'security'
      ) ?? false;
    },
    prompt: 'Critical security finding requires human review.',
    timeout: 12 * 60 * 60_000,  // 12 hours
    escalation: 'security_team'
  },
  {
    id: 'cost_overrun',
    phase: '*',
    condition: (ctx) => {
      return ctx.cost.run > 40;  // $40 = 80% of default $50 budget
    },
    prompt: 'Approaching cost budget. Continue?',
    timeout: 2 * 60 * 60_000,  // 2 hours
    escalation: 'budget_owner'
  }
];
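To illustrate how the phase wildcard and the conditions combine, here is a small worked example with two of the gates reduced to synchronous predicates. The `Ctx` and `Gate` types and the `triggeredGateIds` helper are simplified stand-ins for this illustration, not the plan's types.

```typescript
// Reduced, synchronous model of gate matching (hypothetical types).
type Phase = 'planning' | 'implementation' | 'review' | 'testing' | 'deployment';

interface Ctx {
  phase: Phase;
  runCostUsd: number;
  environment?: string;
}

interface Gate {
  id: string;
  phase: Phase | '*';  // '*' matches any phase
  condition: (ctx: Ctx) => boolean;
}

const sampleGates: Gate[] = [
  { id: 'production_deploy', phase: 'deployment', condition: ctx => ctx.environment === 'production' },
  { id: 'cost_overrun', phase: '*', condition: ctx => ctx.runCostUsd > 40 },
];

function triggeredGateIds(ctx: Ctx, gates: Gate[]): string[] {
  return gates
    .filter(g => g.phase === '*' || g.phase === ctx.phase)  // Phase match first
    .filter(g => g.condition(ctx))                          // Then the condition
    .map(g => g.id);
}

// A production deploy at $45 of run spend trips both gates.
const deployGates = triggeredGateIds(
  { phase: 'deployment', runCostUsd: 45, environment: 'production' },
  sampleGates
);

// A cheap review iteration trips neither.
const reviewGates = triggeredGateIds({ phase: 'review', runCostUsd: 10 }, sampleGates);
```

Several gates can fire for one context, so a caller has to decide whether to request approvals one at a time or batch them into a single prompt.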

6.2 Gate Manager

File: src/safety/gates.ts

typescript
import { ulid } from 'ulid';

// ─── Gate Request and Response ─────────────────────────────

export interface GateRequest {
  id: string;
  gateId: string;
  timestamp: Date;
  context: GateContext;
  summary: string;
  details: unknown;
  timeout: number;
}

export interface GateResponse {
  requestId: string;
  approved: boolean;
  approver: string;
  timestamp: Date;
  reason?: string;
  conditions?: string[];  // Conditional approval requirements
}

// ─── Gate Manager Implementation ───────────────────────────

export class GateManager {
  private gates: Map<string, HumanGate> = new Map();
  private pendingRequests = new Map<string, GateRequest>();
  private responses = new Map<string, GateResponse>();

  constructor(
    private bus: EventBus,
    gates: HumanGate[] = STANDARD_GATES
  ) {
    gates.forEach(gate => this.gates.set(gate.id, gate));
  }

  /**
   * Register a custom gate.
   */
  registerGate(gate: HumanGate): void {
    this.gates.set(gate.id, gate);
  }

  /**
   * List requests still waiting for a human decision.
   * Used by the CLI handler to know what to prompt for.
   */
  getPendingRequests(): GateRequest[] {
    return [...this.pendingRequests.values()];
  }

  /**
   * Check if any gates should be triggered for the current context.
   */
  async checkGates(context: GateContext): Promise<HumanGate[]> {
    const triggered: HumanGate[] = [];

    for (const gate of this.gates.values()) {
      // Check phase match
      if (gate.phase !== '*' && gate.phase !== context.phase) {
        continue;
      }

      // Check condition
      const shouldTrigger = await gate.condition(context);
      if (shouldTrigger) {
        triggered.push(gate);
      }
    }

    return triggered;
  }

  /**
   * Request human approval at a gate.
   * This pauses execution until a response is received or timeout occurs.
   */
  async requestApproval(
    gate: HumanGate,
    context: GateContext
  ): Promise<GateResponse> {
    const requestId = ulid();
    const request: GateRequest = {
      id: requestId,
      gateId: gate.id,
      timestamp: new Date(),
      context,
      summary: this.generateSummary(gate, context),
      details: this.generateDetails(context),
      timeout: gate.timeout
    };

    // Store pending request
    this.pendingRequests.set(requestId, request);

    // Emit gate request event
    await this.bus.emit({
      traceId: context.traceId,
      source: 'safety',
      type: 'gate.requested',
      payload: {
        requestId,
        gateId: gate.id,
        prompt: gate.prompt,
        summary: request.summary,
        timeout: gate.timeout
      }
    });

    // Send notification to human(s)
    await this.notifyHuman(request, gate);

    // Wait for response with timeout
    const response = await this.waitForResponse(requestId, gate.timeout);

    if (!response) {
      // Timeout occurred: drop the stale request so it is not prompted again
      this.pendingRequests.delete(requestId);
      await this.escalate(request, gate);
      return {
        requestId,
        approved: false,
        approver: 'system',
        timestamp: new Date(),
        reason: 'TIMEOUT'
      };
    }

    // Response consumed; do not keep it around indefinitely
    this.responses.delete(requestId);

    // Emit gate response event
    await this.bus.emit({
      traceId: context.traceId,
      source: 'safety',
      type: response.approved ? 'gate.approved' : 'gate.rejected',
      payload: response
    });

    return response;
  }

  /**
   * Submit a response to a pending gate request.
   * Called by CLI or API when human makes decision.
   */
  submitResponse(response: GateResponse): void {
    this.responses.set(response.requestId, response);
    this.pendingRequests.delete(response.requestId);
  }

  /**
   * Wait for a response to a gate request.
   */
  private async waitForResponse(
    requestId: string,
    timeout: number
  ): Promise<GateResponse | null> {
    const startTime = Date.now();

    while (Date.now() - startTime < timeout) {
      const response = this.responses.get(requestId);
      if (response) {
        return response;
      }
      // Poll every second
      await new Promise(resolve => setTimeout(resolve, 1000));
    }

    return null;  // Timeout
  }

  /**
   * Generate human-readable summary for gate request.
   */
  private generateSummary(gate: HumanGate, context: GateContext): string {
    switch (gate.id) {
      case 'architecture_approval':
        return `High-risk architecture proposal for ${context.plan?.task}`;
      case 'production_deploy':
        return `Production deployment requested`;
      case 'security_findings': {
        const count = context.review?.findings.filter(f =>
          f.severity === 'critical' && f.category === 'security'
        ).length ?? 0;
        return `${count} critical security finding(s) detected`;
      }
      case 'cost_overrun':
        return `Cost approaching budget: $${context.cost.run.toFixed(2)}`;
      default:
        return gate.prompt;
    }
  }

  /**
   * Generate detailed information for gate request.
   */
  private generateDetails(context: GateContext): unknown {
    return {
      phase: context.phase,
      cost: context.cost,
      plan: context.plan,
      review: context.review,
      environment: context.environment
    };
  }

  /**
   * Send notification to human(s).
   * Implementation depends on notification system (CLI, Slack, etc.)
   */
  private async notifyHuman(request: GateRequest, gate: HumanGate): Promise<void> {
    // This will be implemented based on the notification strategy.
    // For now, just log to console (CLI will poll for pending requests).
    console.log(`\n🚧 HUMAN APPROVAL REQUIRED: ${gate.prompt}`);
    console.log(`Request ID: ${request.id}`);
    console.log(`Summary: ${request.summary}`);
    console.log(`Timeout: ${gate.timeout / 60_000} minutes\n`);
  }

  /**
   * Escalate to designated person/team on timeout.
   */
  private async escalate(request: GateRequest, gate: HumanGate): Promise<void> {
    await this.bus.emit({
      traceId: request.context.traceId,
      source: 'safety',
      type: 'gate.escalated',
      payload: {
        requestId: request.id,
        gateId: gate.id,
        escalateTo: gate.escalation,
        reason: 'TIMEOUT'
      }
    });

    console.log(`\n⚠️ GATE ESCALATED: ${gate.prompt}`);
    console.log(`Escalating to: ${gate.escalation}`);
    console.log(`Request ID: ${request.id}\n`);
  }
}

6.3 CLI Integration for Gates

File: src/cli/gates.ts

typescript
import inquirer from 'inquirer';
import type { GateManager, GateRequest } from '../safety/gates';

// ─── CLI Gate Handler ──────────────────────────────────────

export class CLIGateHandler {
  constructor(private gateManager: GateManager) {}

  /**
   * Poll for pending gate requests and prompt user.
   * Relies on GateManager exposing getPendingRequests().
   */
  async pollAndPrompt(): Promise<void> {
    const pending = this.gateManager.getPendingRequests();
    if (pending.length === 0) return;

    for (const request of pending) {
      await this.promptForDecision(request);
    }
  }

  /**
   * Prompt user for approval decision.
   */
  private async promptForDecision(request: GateRequest): Promise<void> {
    console.log('\n' + '='.repeat(60));
    console.log(`GATE: ${request.gateId}`);
    console.log('='.repeat(60));
    console.log(`\nSummary: ${request.summary}`);
    console.log(`\nDetails:`);
    console.log(JSON.stringify(request.details, null, 2));
    console.log('\n');

    const { decision } = await inquirer.prompt([
      {
        type: 'list',
        name: 'decision',
        message: 'Your decision:',
        choices: [
          { name: 'Approve', value: 'approve' },
          { name: 'Reject', value: 'reject' },
          { name: 'Approve with conditions', value: 'conditional' }
        ]
      }
    ]);

    let conditions: string[] | undefined;
    let reason: string | undefined;

    if (decision === 'conditional') {
      const { cond } = await inquirer.prompt([
        {
          type: 'input',
          name: 'cond',
          message: 'Enter conditions (comma-separated):'
        }
      ]);
      conditions = cond.split(',').map((s: string) => s.trim());
    }

    if (decision === 'reject') {
      const { r } = await inquirer.prompt([
        {
          type: 'input',
          name: 'r',
          message: 'Reason for rejection:'
        }
      ]);
      reason = r;
    }

    // Submit response
    this.gateManager.submitResponse({
      requestId: request.id,
      approved: decision === 'approve' || decision === 'conditional',
      approver: process.env.USER ?? 'unknown',
      timestamp: new Date(),
      reason,
      conditions
    });

    console.log(`\n✅ Decision recorded: ${decision}\n`);
  }
}

7. Automation Ladder Implementation

7.1 Ladder Levels

File: src/safety/automation.ts

typescript
// ─── Automation Ladder Configuration ───────────────────────

export enum AutomationLevel {
  LEVEL_0 = 0, // Human does everything
  LEVEL_1 = 1, // AI suggests, human decides
  LEVEL_2 = 2, // AI acts, human reviews
  LEVEL_3 = 3, // AI acts, human notified
  LEVEL_4 = 4  // Full autonomy (low-risk only)
}

export interface LevelDefinition {
  level: AutomationLevel;
  name: string;
  description: string;
  capabilities: string[];
  requirements: LevelRequirements;
}

export interface LevelRequirements {
  minRuns: number;
  maxFalsePositiveRate: number;
  maxMissedCriticalBugs: number;
  minApprovalRate?: number;
}

export const AUTOMATION_LEVELS: LevelDefinition[] = [
  {
    level: AutomationLevel.LEVEL_0,
    name: 'Manual',
    description: 'Human does everything (current state)',
    capabilities: [
      'System provides suggestions only',
      'All decisions require human approval',
      'Full manual control'
    ],
    requirements: { minRuns: 0, maxFalsePositiveRate: 1.0, maxMissedCriticalBugs: Infinity }
  },
  {
    level: AutomationLevel.LEVEL_1,
    name: 'AI Suggests',
    description: 'AI suggests, human decides',
    capabilities: [
      'Review comments are suggestions only',
      'Test failures analyzed but human fixes',
      'Deploy requires explicit approval'
    ],
    requirements: { minRuns: 10, maxFalsePositiveRate: 0.5, maxMissedCriticalBugs: 5 }
  },
  {
    level: AutomationLevel.LEVEL_2,
    name: 'AI Acts',
    description: 'AI acts, human reviews',
    capabilities: [
      'Auto-fix formatting and simple lint issues',
      'Auto-approve low-risk reviews',
      'Still requires human for medium+ risk'
    ],
    requirements: { minRuns: 50, maxFalsePositiveRate: 0.2, maxMissedCriticalBugs: 1, minApprovalRate: 0.8 }
  },
  {
    level: AutomationLevel.LEVEL_3,
    name: 'AI Autonomous',
    description: 'AI acts, human notified',
    capabilities: [
      'Auto-merge low-risk PRs',
      'Auto-deploy to staging',
      'Human notified, can override within window'
    ],
    requirements: { minRuns: 200, maxFalsePositiveRate: 0.05, maxMissedCriticalBugs: 0, minApprovalRate: 0.9 }
  },
  {
    level: AutomationLevel.LEVEL_4,
    name: 'Full Autonomy',
    description: 'Full autonomy (low-risk only)',
    capabilities: [
      'Fully autonomous for low-risk changes',
      'Human gates remain for medium+ risk',
      'Human gates ALWAYS remain for production deploys'
    ],
    requirements: { minRuns: 500, maxFalsePositiveRate: 0.02, maxMissedCriticalBugs: 0, minApprovalRate: 0.95 }
  }
];
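The requirements table can also be read as a pure function from observed metrics to the highest level those metrics qualify for. The sketch below is illustrative only: `Requirements`, `Metrics`, `meets`, and `highestEligibleLevel` are hypothetical names that do not appear in the plan, and the thresholds are copied from `AUTOMATION_LEVELS`.

```typescript
// Hypothetical, trimmed-down shapes for illustration; thresholds mirror AUTOMATION_LEVELS.
interface Requirements {
  minRuns: number;
  maxFalsePositiveRate: number;
  maxMissedCriticalBugs: number;
  minApprovalRate?: number;
}

interface Metrics {
  totalRuns: number;
  falsePositiveRate: number;
  missedCriticalBugs: number;
  approvalRate: number;
}

const REQUIREMENTS: Requirements[] = [
  { minRuns: 0, maxFalsePositiveRate: 1.0, maxMissedCriticalBugs: Infinity },
  { minRuns: 10, maxFalsePositiveRate: 0.5, maxMissedCriticalBugs: 5 },
  { minRuns: 50, maxFalsePositiveRate: 0.2, maxMissedCriticalBugs: 1, minApprovalRate: 0.8 },
  { minRuns: 200, maxFalsePositiveRate: 0.05, maxMissedCriticalBugs: 0, minApprovalRate: 0.9 },
  { minRuns: 500, maxFalsePositiveRate: 0.02, maxMissedCriticalBugs: 0, minApprovalRate: 0.95 },
];

// True when every threshold of one level is satisfied.
function meets(m: Metrics, r: Requirements): boolean {
  return (
    m.totalRuns >= r.minRuns &&
    m.falsePositiveRate <= r.maxFalsePositiveRate &&
    m.missedCriticalBugs <= r.maxMissedCriticalBugs &&
    (r.minApprovalRate === undefined || m.approvalRate >= r.minApprovalRate)
  );
}

// Walk up the ladder one rung at a time and stop at the first unmet requirement.
function highestEligibleLevel(m: Metrics): number {
  let level = 0;
  while (level + 1 < REQUIREMENTS.length && meets(m, REQUIREMENTS[level + 1])) {
    level++;
  }
  return level;
}
```

Under these thresholds a single missed critical bug caps the ladder at Level 2, no matter how many runs have accumulated.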

7.2 Ladder Manager

File: src/safety/automation.ts

typescript
// ─── Automation Ladder Manager ─────────────────────────────

export interface LevelMetrics {
  totalRuns: number;
  successfulRuns: number;
  falsePositives: number;
  missedCriticalBugs: number;
  approvalRate: number;
  lastUpdated: Date;
}

// Raw row shape of the automation_metrics table (snake_case columns)
interface AutomationMetricsRow {
  total_runs: number;
  successful_runs: number;
  false_positives: number;
  missed_critical_bugs: number;
  approval_rate: number;
  last_updated: number; // epoch milliseconds
}

export class AutomationLadder {
  private currentLevel: AutomationLevel;
  private metrics: LevelMetrics;

  constructor(
    private db: Database,
    initialLevel: AutomationLevel = AutomationLevel.LEVEL_1
  ) {
    this.currentLevel = initialLevel;
    this.metrics = this.loadMetrics();
  }

  /**
   * Get current automation level.
   */
  getLevel(): AutomationLevel {
    return this.currentLevel;
  }

  /**
   * Check if an action is allowed at current level.
   */
  isAllowed(action: string, riskLevel: RiskLevel): boolean {
    switch (this.currentLevel) {
      case AutomationLevel.LEVEL_0: // Nothing automated
        return false;
      case AutomationLevel.LEVEL_1: // Only suggestions, no actions
        return false;
      case AutomationLevel.LEVEL_2: // Auto-fix low-risk issues only
        return action === 'auto_fix' && riskLevel === 'low';
      case AutomationLevel.LEVEL_3: // Auto-merge and auto-deploy staging for low-risk
        return ['auto_fix', 'auto_merge', 'auto_deploy_staging'].includes(action) &&
          riskLevel === 'low';
      case AutomationLevel.LEVEL_4: // Full autonomy for low-risk, but never production
        return riskLevel === 'low' && action !== 'auto_deploy_production';
      default: // Unknown level: fail closed
        return false;
    }
  }

  /**
   * Record metrics from a run.
   */
  async recordRun(outcome: RunOutcome): Promise<void> {
    this.metrics.totalRuns++;
    if (outcome.success) {
      this.metrics.successfulRuns++;
    }
    this.metrics.falsePositives += outcome.falsePositives ?? 0;
    this.metrics.missedCriticalBugs += outcome.missedCriticalBugs ?? 0;

    if (outcome.humanApproved !== undefined) {
      // Update approval rate (exponential moving average)
      const alpha = 0.1;
      this.metrics.approvalRate =
        alpha * (outcome.humanApproved ? 1 : 0) + (1 - alpha) * this.metrics.approvalRate;
    }
    this.metrics.lastUpdated = new Date();

    // Persist metrics
    await this.saveMetrics();

    // Check if we can level up
    await this.checkLevelTransition();
  }

  /**
   * Check if we meet requirements for next level.
   */
  private async checkLevelTransition(): Promise<void> {
    const currentDef = AUTOMATION_LEVELS[this.currentLevel];
    const nextLevel = this.currentLevel + 1;
    if (nextLevel >= AUTOMATION_LEVELS.length) {
      // Already at max level
      return;
    }
    const nextDef = AUTOMATION_LEVELS[nextLevel];

    // Check requirements
    const meetsRequirements =
      this.metrics.totalRuns >= nextDef.requirements.minRuns &&
      this.getFalsePositiveRate() <= nextDef.requirements.maxFalsePositiveRate &&
      this.metrics.missedCriticalBugs <= nextDef.requirements.maxMissedCriticalBugs &&
      (nextDef.requirements.minApprovalRate === undefined ||
        this.metrics.approvalRate >= nextDef.requirements.minApprovalRate);

    if (meetsRequirements) {
      console.log(`\n🎉 AUTOMATION LEVEL UP: ${currentDef.name} → ${nextDef.name}`);
      console.log(`${nextDef.description}`);
      console.log(`\nNew capabilities:`);
      nextDef.capabilities.forEach(cap => console.log(` - ${cap}`));
      console.log('');
      this.currentLevel = nextLevel as AutomationLevel;
      await this.saveLevel();
    }
  }

  /**
   * Calculate false positive rate.
   */
  private getFalsePositiveRate(): number {
    if (this.metrics.totalRuns === 0) return 0;
    return this.metrics.falsePositives / this.metrics.totalRuns;
  }

  /**
   * Load metrics from database.
   */
  private loadMetrics(): LevelMetrics {
    const row = this.db.query(`
      SELECT * FROM automation_metrics ORDER BY last_updated DESC LIMIT 1
    `).get() as AutomationMetricsRow | null;

    if (!row) {
      return {
        totalRuns: 0,
        successfulRuns: 0,
        falsePositives: 0,
        missedCriticalBugs: 0,
        approvalRate: 0,
        lastUpdated: new Date()
      };
    }

    // Columns are snake_case, so map them explicitly instead of casting the row
    return {
      totalRuns: row.total_runs,
      successfulRuns: row.successful_runs,
      falsePositives: row.false_positives,
      missedCriticalBugs: row.missed_critical_bugs,
      approvalRate: row.approval_rate,
      lastUpdated: new Date(row.last_updated)
    };
  }

  /**
   * Save metrics to database.
   */
  private async saveMetrics(): Promise<void> {
    // The schema declares `id` as the primary key and `last_updated` as epoch
    // milliseconds, so supply both explicitly.
    await this.db.query(`
      INSERT INTO automation_metrics
        (id, total_runs, successful_runs, false_positives, missed_critical_bugs, approval_rate, last_updated)
      VALUES (?, ?, ?, ?, ?, ?, ?)
    `).run(
      crypto.randomUUID(),
      this.metrics.totalRuns,
      this.metrics.successfulRuns,
      this.metrics.falsePositives,
      this.metrics.missedCriticalBugs,
      this.metrics.approvalRate,
      this.metrics.lastUpdated.getTime()
    );
  }

  /**
   * Save current level to database.
   */
  private async saveLevel(): Promise<void> {
    await this.db.query(`
      UPDATE config SET automation_level = ? WHERE id = 1
    `).run(this.currentLevel);
  }
}

export interface RunOutcome {
  success: boolean;
  falsePositives?: number;
  missedCriticalBugs?: number;
  humanApproved?: boolean;
}
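Because `approvalRate` starts at 0 and moves by an exponential moving average with `alpha = 0.1`, the `minApprovalRate` thresholds also imply a minimum number of human approvals before each promotion. The sketch below works that arithmetic through for the best case of an unbroken run of approvals; `updateApprovalRate` and `approvalsNeeded` are hypothetical helper names, not part of the plan.

```typescript
// Same update rule as recordRun: new = alpha * sample + (1 - alpha) * old.
function updateApprovalRate(previous: number, approved: boolean, alpha = 0.1): number {
  return alpha * (approved ? 1 : 0) + (1 - alpha) * previous;
}

// Count the consecutive approvals needed to lift the rate from `start` to `target`.
function approvalsNeeded(target: number, start = 0, alpha = 0.1): number {
  if (target >= 1 || alpha <= 0 || alpha > 1) {
    throw new RangeError('target must be below 1 and alpha must be in (0, 1]');
  }
  let rate = start;
  let count = 0;
  while (rate < target) {
    rate = updateApprovalRate(rate, true, alpha);
    count++;
  }
  return count;
}

console.log(approvalsNeeded(0.8));  // 16 (Level 2 threshold)
console.log(approvalsNeeded(0.9));  // 22 (Level 3 threshold)
console.log(approvalsNeeded(0.95)); // 29 (Level 4 threshold)
```

After n straight approvals from zero the rate is 1 - 0.9^n, so any rejection along the way pushes these counts higher.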

8. Safety Configuration

8.1 Configuration Type

File: src/core/config.ts

typescript
// ─── Safety Configuration ────────────────────────────────── export interface SafetyConfig { enabled: boolean; breakers: { iteration: IterationBreakerConfig; cost: CostBreakerConfig; time: TimeBreakerConfig; errorRate: ErrorRateBreakerConfig; }; gates: { enabled: boolean; custom: HumanGate[]; }; automation: { level: AutomationLevel; allowLevelUp: boolean; }; audit: { enabled: boolean; retention: number; // days }; } export const DEFAULT_SAFETY_CONFIG: SafetyConfig = { enabled: true, breakers: { iteration: DEFAULT_ITERATION_CONFIG, cost: DEFAULT_COST_CONFIG, time: DEFAULT_TIME_CONFIG, errorRate: DEFAULT_ERROR_RATE_CONFIG, }, gates: { enabled: true, custom: [] }, automation: { level: AutomationLevel.LEVEL_1, allowLevelUp: true }, audit: { enabled: true, retention: 90 // 90 days } };

8.2 Configuration Loading

File: forge.config.ts (example)

typescript
import { defineConfig } from 'forge';

export default defineConfig({
  name: 'my-app',
  language: 'typescript',
  safety: {
    // Override default iteration limits
    breakers: {
      iteration: {
        limits: {
          implementation: 100 // Allow more iterations for complex implementation
        }
      },
      // Override cost budgets
      cost: {
        budgets: {
          perDay: 500.0 // Higher daily budget
        }
      }
    },
    // Add custom gate
    gates: {
      custom: [
        {
          id: 'api_change',
          phase: 'review',
          // Coerce to a boolean so a missing review does not yield undefined
          condition: (ctx) =>
            ctx.review?.findings.some(f => f.category === 'api_change') ?? false,
          prompt: 'API changes detected. Review breaking changes.',
          timeout: 4 * 60 * 60_000 // 4 hours
        }
      ]
    },
    // Start at Level 2 (already proven system)
    automation: {
      level: 2
    }
  }
});
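The example sets only a few nested keys, which implies that user overrides are merged deeply into `DEFAULT_SAFETY_CONFIG` rather than replacing whole sub-objects. How `defineConfig` does this is not specified here; the sketch below shows one plausible approach. `mergeDeep` and `isPlainObject` are hypothetical names, and the default values are sample data.

```typescript
// Hypothetical helper: only plain objects are merged recursively.
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns a new object; neither argument is mutated.
function mergeDeep(
  base: Record<string, unknown>,
  patch: Record<string, unknown>
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    if (value === undefined) continue; // an absent override keeps the default
    const current = result[key];
    result[key] = isPlainObject(current) && isPlainObject(value)
      ? mergeDeep(current, value)
      : value; // scalars, arrays, and functions replace the default wholesale
  }
  return result;
}

// Sample defaults for illustration only.
const sampleDefaults = {
  breakers: { cost: { budgets: { perRun: 50.0, perDay: 200.0 } } },
  gates: { enabled: true, custom: [] as unknown[] }
};

const merged = mergeDeep(sampleDefaults, {
  breakers: { cost: { budgets: { perDay: 500.0 } } }
});
// merged keeps perRun: 50 and gates.enabled: true, and raises perDay to 500
```

Replacing arrays wholesale is a design choice: it keeps `gates.custom` predictable, at the cost of requiring callers to restate any default entries they want to keep.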

9. Safety Integration with Agent Loop

9.1 SafetyManager Coordinator

File: src/safety/manager.ts

typescript
// ─── Safety Manager ──────────────────────────────────────── export class SafetyManager { private breakers: Map<string, CircuitBreaker>; private gateManager: GateManager; private automationLadder: AutomationLadder; private costTracker: CostTracker; private errorRateBreaker: ErrorRateBreaker; constructor( private config: SafetyConfig, private bus: EventBus, private db: Database ) { // Initialize cost tracker this.costTracker = new CostTracker(); // Initialize breakers this.breakers = new Map([ ['iteration', new IterationBreaker(config.breakers.iteration)], ['cost', new CostBreaker(this.costTracker, config.breakers.cost)], ['time', new TimeBreaker(config.breakers.time)], ['error_rate', new ErrorRateBreaker(config.breakers.errorRate)] ]); this.errorRateBreaker = this.breakers.get('error_rate') as ErrorRateBreaker; // Initialize gates this.gateManager = new GateManager(bus, [ ...STANDARD_GATES, ...config.gates.custom ]); // Initialize automation ladder this.automationLadder = new AutomationLadder(db, config.automation.level); } /** * Check all circuit breakers before proceeding with operation. * This is called by agents at the start of each iteration. 
*/ async check(context: BreakerContext): Promise<SafetyCheckResult> { if (!this.config.enabled) { return { safe: true, breakers: [], gates: [] }; } // Run all breakers in parallel const breakerResults = await Promise.all( Array.from(this.breakers.values()).map(b => b.check(context)) ); // Find any that should break const tripped = breakerResults.filter(r => r.shouldBreak); if (tripped.length > 0) { // Emit breaker events for (const result of tripped) { await this.bus.emit({ traceId: context.traceId, source: 'safety', type: 'breaker.tripped', payload: result }); } return { safe: false, breakers: tripped, gates: [] }; } // Check for warnings const warnings = breakerResults.filter(r => r.reason?.includes('WARNING') || r.reason?.includes('APPROACHING') ); if (warnings.length > 0) { for (const warning of warnings) { await this.bus.emit({ traceId: context.traceId, source: 'safety', type: 'breaker.warning', payload: warning }); } } return { safe: true, breakers: warnings, gates: [] }; } /** * Check and trigger human gates if conditions are met. */ async checkGates(context: GateContext): Promise<GateResponse[]> { if (!this.config.gates.enabled) { return []; } const triggered = await this.gateManager.checkGates(context); if (triggered.length === 0) { return []; } // Request approval for each triggered gate const responses = await Promise.all( triggered.map(gate => this.gateManager.requestApproval(gate, context)) ); return responses; } /** * Record cost for an LLM operation. */ recordCost(event: Omit<CostEvent, 'costUsd'>): number { return this.costTracker.record(event); } /** * Record an error event for rate tracking. */ recordError(event: ErrorEvent): void { this.errorRateBreaker.recordError(event); } /** * Record a success event for rate tracking. */ recordSuccess(operation: string): void { this.errorRateBreaker.recordSuccess(operation); } /** * Get current automation level. 
*/ getAutomationLevel(): AutomationLevel { return this.automationLadder.getLevel(); } /** * Check if an action is allowed at current automation level. */ isActionAllowed(action: string, riskLevel: RiskLevel): boolean { return this.automationLadder.isAllowed(action, riskLevel); } /** * Record run outcome for automation ladder progression. */ async recordRunOutcome(outcome: RunOutcome): Promise<void> { if (this.config.automation.allowLevelUp) { await this.automationLadder.recordRun(outcome); } } /** * Reset breakers for a phase. */ resetPhase(traceId: string, phase: PhaseName): void { const iterationBreaker = this.breakers.get('iteration') as IterationBreaker; iterationBreaker.resetPhase(traceId, phase); const timeBreaker = this.breakers.get('time') as TimeBreaker; timeBreaker.startPhase(traceId, phase); } } export interface SafetyCheckResult { safe: boolean; breakers: BreakerResult[]; gates: GateResponse[]; }
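The decision logic inside `check` separates cleanly from the event bus, which makes it easy to unit test on plain data. The sketch below restates that logic as a pure function; `summarize` and `Summary` are hypothetical names, and the `BreakerResult` shape is trimmed to the fields the logic reads.

```typescript
// Trimmed copy of BreakerResult, limited to the fields used here.
interface BreakerResult {
  shouldBreak: boolean;
  reason?: string;
  currentValue: number;
  threshold: number;
}

interface Summary {
  safe: boolean;
  tripped: BreakerResult[];
  warnings: BreakerResult[];
}

// Mirrors SafetyManager.check: any tripped breaker makes the result unsafe,
// and warnings are reported only when nothing has tripped.
function summarize(results: BreakerResult[]): Summary {
  const tripped = results.filter(r => r.shouldBreak);
  if (tripped.length > 0) {
    return { safe: false, tripped, warnings: [] };
  }
  const warnings = results.filter(r => /WARNING|APPROACHING/.test(r.reason ?? ''));
  return { safe: true, tripped: [], warnings };
}
```

Matching on the text of `reason` follows the plan as written; an explicit severity field on `BreakerResult` would be less brittle, if the interface can be extended.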

9.2 Integration into Agent Loop

File: src/agents/base.ts (updated)

typescript
abstract class BaseAgent implements Agent {
  async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
    const phase = this.type as PhaseName;
    let iteration = 0;
    let workingMemory = await this.perceive(input, ctx);

    while (true) {
      iteration++;
      const cost = { phase: ctx.phaseCost, run: ctx.runCost, day: ctx.dayCost };

      // ── SAFETY CHECK ──
      const safetyCheck = await ctx.safety.check({
        traceId: ctx.traceId,
        phase,
        iteration,
        cost,
        elapsed: Date.now() - ctx.startTime,
        errorWindow: ctx.errorWindow
      });

      if (!safetyCheck.safe) {
        // Breaker tripped - halt execution
        const errors = safetyCheck.breakers.map(b => b.reason).join(', ');
        throw new CircuitBreakerError(
          `Circuit breakers tripped: ${errors}`,
          safetyCheck.breakers
        );
      }

      // ── GATE CHECK (at start of phase only) ──
      if (iteration === 1) {
        const gateResponses = await ctx.safety.checkGates({
          traceId: ctx.traceId,
          phase,
          plan: (input as any).plan,
          review: (input as any).review,
          cost,
          environment: ctx.environment
        });

        // Check if any gate rejected
        const rejected = gateResponses.find(r => !r.approved);
        if (rejected) {
          throw new GateRejectedError(
            `Human gate rejected: ${rejected.reason}`,
            rejected
          );
        }
      }

      // ── REASON: ask LLM what to do ──
      const decision = await ctx.llm.chat({
        system: this.systemPrompt,
        messages: workingMemory.messages,
        tools: this.tools.map(t => t.schema),
      });

      // Record cost
      ctx.safety.recordCost({
        timestamp: new Date(),
        phase,
        model: ctx.llm.model,
        promptTokens: decision.usage.promptTokens,
        completionTokens: decision.usage.completionTokens,
        operation: `${this.type}.reason`
      });

      // ── DONE? ──
      if (decision.done) {
        const output = decision.result as PhaseOutput;
        // Await the emit and use the same event shape as SafetyManager
        await ctx.bus.emit({
          traceId: ctx.traceId,
          source: this.type,
          type: `${this.type}.completed`,
          payload: output
        });

        // Record success
        ctx.safety.recordSuccess(this.type);
        await this.reflect(ctx, 'success');
        return output;
      }

      // ── ACT: execute the chosen tool ──
      try {
        const tool = this.tools.find(t => t.name === decision.toolCall.name);
        if (!tool) {
          // The LLM asked for a tool this agent does not expose
          throw new Error(`Unknown tool: ${decision.toolCall.name}`);
        }
        const result = await this.executeTool(tool, decision.toolCall.input, ctx);

        // Record success
        ctx.safety.recordSuccess(`${this.type}.${tool.name}`);

        // ── LEARN: update context ──
        workingMemory = this.updateWorkingMemory(workingMemory, decision, result);
      } catch (error) {
        // Record error for rate tracking
        ctx.safety.recordError({
          timestamp: new Date(),
          severity: 'error',
          source: `${this.type}.${decision.toolCall.name}`,
          message: error instanceof Error ? error.message : String(error)
        });
        await this.reflect(ctx, 'error', error);

        // Re-throw to fail iteration
        throw error;
      }
    }
  }
}
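Whatever invokes `execute` has to tell a breaker halt apart from a gate rejection and from an ordinary failure, since each calls for a different follow-up. The sketch below shows one way a caller might classify them. The two error classes are minimal stand-ins whose constructor arguments mirror the `throw` sites in the loop; their real definitions are not shown in this plan, and `PhaseFailure` and `classifyFailure` are hypothetical.

```typescript
// Minimal stand-ins for the error types thrown by the agent loop (assumed shapes).
class CircuitBreakerError extends Error {
  constructor(message: string, public readonly breakers: { reason?: string }[]) {
    super(message);
    this.name = 'CircuitBreakerError';
  }
}

class GateRejectedError extends Error {
  constructor(
    message: string,
    public readonly response: { approved: boolean; reason?: string }
  ) {
    super(message);
    this.name = 'GateRejectedError';
  }
}

type PhaseFailure =
  | { kind: 'breaker'; reasons: string[] }
  | { kind: 'gate'; reason: string }
  | { kind: 'unexpected'; message: string };

// Turn a thrown value into a tagged result the orchestrator can switch on.
function classifyFailure(error: unknown): PhaseFailure {
  if (error instanceof CircuitBreakerError) {
    return { kind: 'breaker', reasons: error.breakers.map(b => b.reason ?? 'unknown') };
  }
  if (error instanceof GateRejectedError) {
    return { kind: 'gate', reason: error.response.reason ?? 'no reason given' };
  }
  return {
    kind: 'unexpected',
    message: error instanceof Error ? error.message : String(error)
  };
}
```

A caller would wrap `agent.execute(...)` in `try`/`catch` and pass the caught value to `classifyFailure`, for example pausing the run on `breaker` and surfacing the reviewer's reason on `gate`.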

10. Audit Trail Implementation

10.1 Database Schema

File: src/memory/schema.ts (additions)

typescript
// ─── Safety Audit Tables ─────────────────────────────────── export const safetyEvents = sqliteTable('safety_events', { id: text('id').primaryKey(), traceId: text('trace_id').notNull(), timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(), type: text('type').notNull(), // 'breaker.tripped', 'gate.requested', etc. breakerName: text('breaker_name'), reason: text('reason'), currentValue: real('current_value'), threshold: real('threshold'), payload: text('payload', { mode: 'json' }), }); export const gateEvents = sqliteTable('gate_events', { id: text('id').primaryKey(), traceId: text('trace_id').notNull(), timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(), gateId: text('gate_id').notNull(), requestId: text('request_id').notNull(), type: text('type').notNull(), // 'requested', 'approved', 'rejected', 'escalated' approver: text('approver'), reason: text('reason'), conditions: text('conditions', { mode: 'json' }), }); export const costEvents = sqliteTable('cost_events', { id: text('id').primaryKey(), traceId: text('trace_id').notNull(), timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(), phase: text('phase').notNull(), model: text('model').notNull(), promptTokens: integer('prompt_tokens').notNull(), completionTokens: integer('completion_tokens').notNull(), costUsd: real('cost_usd').notNull(), operation: text('operation').notNull(), }); export const automationMetrics = sqliteTable('automation_metrics', { id: text('id').primaryKey(), totalRuns: integer('total_runs').notNull(), successfulRuns: integer('successful_runs').notNull(), falsePositives: integer('false_positives').notNull(), missedCriticalBugs: integer('missed_critical_bugs').notNull(), approvalRate: real('approval_rate').notNull(), lastUpdated: integer('last_updated', { mode: 'timestamp_ms' }).notNull(), });

10.2 Audit Query Interface

File: src/safety/audit.ts

typescript
// ─── Safety Audit Interface ────────────────────────────────

import { and, desc, eq, gte, inArray, lt } from 'drizzle-orm';
import { safetyEvents, gateEvents, costEvents } from '../memory/schema';

export class SafetyAudit {
  constructor(private db: Database) {}

  /**
   * Get all safety events for a trace.
   */
  async getTraceEvents(traceId: string): Promise<SafetyEvent[]> {
    return this.db.select()
      .from(safetyEvents)
      .where(eq(safetyEvents.traceId, traceId))
      .orderBy(safetyEvents.timestamp);
  }

  /**
   * Get all gate events for a trace.
   */
  async getGateEvents(traceId: string): Promise<GateEvent[]> {
    return this.db.select()
      .from(gateEvents)
      .where(eq(gateEvents.traceId, traceId))
      .orderBy(gateEvents.timestamp);
  }

  /**
   * Get cost breakdown for a trace.
   */
  async getCostBreakdown(traceId: string): Promise<CostBreakdown> {
    const events = await this.db.select()
      .from(costEvents)
      .where(eq(costEvents.traceId, traceId));

    const byPhase: Record<string, number> = {};
    const byModel: Record<string, number> = {};
    let total = 0;

    for (const event of events) {
      byPhase[event.phase] = (byPhase[event.phase] ?? 0) + event.costUsd;
      byModel[event.model] = (byModel[event.model] ?? 0) + event.costUsd;
      total += event.costUsd;
    }

    return { total, byPhase, byModel, events };
  }

  /**
   * Get breaker trip history.
   */
  async getBreakerHistory(
    breakerName?: string,
    since?: Date
  ): Promise<SafetyEvent[]> {
    // Collect every filter first: a query takes a single where() clause,
    // so the conditions are combined with and().
    const conditions = [eq(safetyEvents.type, 'breaker.tripped')];
    if (breakerName) {
      conditions.push(eq(safetyEvents.breakerName, breakerName));
    }
    if (since) {
      // timestamp_ms columns are compared against Date values
      conditions.push(gte(safetyEvents.timestamp, since));
    }

    return this.db.select()
      .from(safetyEvents)
      .where(and(...conditions))
      .orderBy(desc(safetyEvents.timestamp));
  }

  /**
   * Get gate approval statistics.
   */
  async getGateStats(gateId?: string): Promise<GateStats> {
    // Only decided requests count toward the approval rate
    const decided = inArray(gateEvents.type, ['approved', 'rejected']);
    const events = await this.db.select()
      .from(gateEvents)
      .where(gateId ? and(decided, eq(gateEvents.gateId, gateId)) : decided);

    const total = events.length;
    const approved = events.filter(e => e.type === 'approved').length;
    const rejected = events.filter(e => e.type === 'rejected').length;

    return {
      total,
      approved,
      rejected,
      approvalRate: total > 0 ? approved / total : 0
    };
  }

  /**
   * Clean up old audit records.
   */
  async cleanup(retentionDays: number): Promise<void> {
    const cutoff = new Date(Date.now() - retentionDays * 24 * 60 * 60_000);
    await this.db.delete(safetyEvents).where(lt(safetyEvents.timestamp, cutoff));
    await this.db.delete(gateEvents).where(lt(gateEvents.timestamp, cutoff));
    await this.db.delete(costEvents).where(lt(costEvents.timestamp, cutoff));
  }
}

export interface CostBreakdown {
  total: number;
  byPhase: Record<string, number>;
  byModel: Record<string, number>;
  events: CostEvent[];
}

export interface GateStats {
  total: number;
  approved: number;
  rejected: number;
  approvalRate: number;
}
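The aggregation inside `getCostBreakdown` does not depend on the database, so it can be pulled out and tested on plain rows. The sketch below shows that separation; `CostRow` and `summarizeCosts` are hypothetical names, and the rows are made-up sample data.

```typescript
// Only the columns the aggregation reads.
interface CostRow {
  phase: string;
  model: string;
  costUsd: number;
}

interface CostSummary {
  total: number;
  byPhase: Record<string, number>;
  byModel: Record<string, number>;
}

// Sum cost per phase and per model in a single pass over the rows.
function summarizeCosts(rows: CostRow[]): CostSummary {
  const byPhase: Record<string, number> = {};
  const byModel: Record<string, number> = {};
  let total = 0;
  for (const row of rows) {
    byPhase[row.phase] = (byPhase[row.phase] ?? 0) + row.costUsd;
    byModel[row.model] = (byModel[row.model] ?? 0) + row.costUsd;
    total += row.costUsd;
  }
  return { total, byPhase, byModel };
}

// Made-up rows for illustration.
const summary = summarizeCosts([
  { phase: 'implementation', model: 'model-a', costUsd: 1.5 },
  { phase: 'review', model: 'model-b', costUsd: 2.25 },
  { phase: 'implementation', model: 'model-b', costUsd: 0.25 }
]);
// summary.total is 4, byPhase.implementation is 1.75, byModel['model-b'] is 2.5
```

With this split, the database method reduces to a query followed by a call to the pure function.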

11. Testing Safety System

11.1 Unit Tests for Breakers

File: src/safety/__tests__/breakers.test.ts

typescript
import { describe, it, expect, beforeEach } from 'bun:test'; import { IterationBreaker, CostBreaker, TimeBreaker, ErrorRateBreaker, CostTracker } from '../breakers'; describe('IterationBreaker', () => { let breaker: IterationBreaker; beforeEach(() => { breaker = new IterationBreaker({ enabled: true, threshold: 10, warning: 0.8, limits: { default: 10, planning: 20, implementation: 50, review: 10, testing: 5, deployment: 3 }, stagnation: { threshold: 3, definition: { hasProgress: async () => true // Mock always has progress } } }); }); it('should not trip under limit', async () => { const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 10, cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(false); }); it('should warn at 80% of limit', async () => { const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 40, // 80% of 50 cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(false); expect(result.reason).toContain('APPROACHING'); }); it('should trip over limit', async () => { const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 51, // Over limit of 50 cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(true); expect(result.reason).toBe('MAX_ITERATIONS_EXCEEDED'); }); }); describe('CostBreaker', () => { let tracker: CostTracker; let breaker: CostBreaker; beforeEach(() => { tracker = new CostTracker(); breaker = new CostBreaker(tracker, { enabled: true, threshold: 1.0, warning: 0.8, budgets: { perPhase: { planning: 5.0, implementation: 10.0, review: 2.0, testing: 3.0, deployment: 2.0 }, perRun: 50.0, perDay: 200.0 } }); }); it('should not trip under budget', async () => { tracker.record({ timestamp: new Date(), phase: 'implementation', model: 'claude-sonnet-4-5-20250929', promptTokens: 1000, completionTokens: 500, operation: 
'test' }); const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 1, cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(false); }); it('should trip when phase budget exceeded', async () => { // Add enough events to exceed $10 implementation budget for (let i = 0; i < 100; i++) { tracker.record({ timestamp: new Date(), phase: 'implementation', model: 'claude-opus-4-6', // Expensive model promptTokens: 10000, completionTokens: 5000, operation: 'test' }); } const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 1, cost: { phase: 100, run: 100, day: 100 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(true); }); }); describe('ErrorRateBreaker', () => { let breaker: ErrorRateBreaker; beforeEach(() => { breaker = new ErrorRateBreaker({ enabled: true, threshold: 0.25, warning: 0.1, windowSize: 5 * 60_000, thresholds: { warning: 0.10, critical: 0.25 } }); }); it('should not trip with low error rate', async () => { // Record 90 successes, 10 errors = 10% error rate for (let i = 0; i < 90; i++) { breaker.recordSuccess('test'); } for (let i = 0; i < 10; i++) { breaker.recordError({ timestamp: new Date(), severity: 'error', source: 'test', message: 'test error' }); } const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 1, cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); expect(result.shouldBreak).toBe(false); expect(result.reason).toContain('WARNING'); }); it('should trip with high error rate', async () => { // Record 50 successes, 50 errors = 50% error rate for (let i = 0; i < 50; i++) { breaker.recordSuccess('test'); breaker.recordError({ timestamp: new Date(), severity: 'error', source: 'test', message: 'test error' }); } const result = await breaker.check({ traceId: 'test', phase: 'implementation', iteration: 1, cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: [] }); 
expect(result.shouldBreak).toBe(true); expect(result.reason).toBe('ERROR_RATE_CRITICAL'); }); });
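Every test above repeats the same `BreakerContext` literal and changes one or two fields. A small factory would keep each test focused on the field it exercises. This is a sketch only: `makeContext` is a hypothetical helper, and the `BreakerContext` shape is inferred from the test calls rather than taken from its definition.

```typescript
// Shape inferred from the check() calls in the tests (an assumption).
interface BreakerContext {
  traceId: string;
  phase: string;
  iteration: number;
  cost: { phase: number; run: number; day: number };
  elapsed: number;
  errorWindow: unknown[];
}

// Neutral defaults; each test overrides only what it is exercising.
function makeContext(overrides: Partial<BreakerContext> = {}): BreakerContext {
  return {
    traceId: 'test',
    phase: 'implementation',
    iteration: 1,
    cost: { phase: 0, run: 0, day: 0 },
    elapsed: 0,
    errorWindow: [],
    ...overrides
  };
}

// Example: the "should trip over limit" case reduces to one field.
const overLimit = makeContext({ iteration: 51 });
```

A test body then reads `await breaker.check(makeContext({ iteration: 51 }))`, which makes the value under test obvious at a glance.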

11.2 Integration Test for Gate Flow

File: src/safety/__tests__/gates.integration.test.ts

typescript
import { describe, it, expect, beforeEach } from 'bun:test'; import { GateManager } from '../gates'; import { EventBus } from '../../core/bus'; describe('GateManager Integration', () => { let gateManager: GateManager; let bus: EventBus; beforeEach(() => { bus = new EventBus(); gateManager = new GateManager(bus); }); it('should trigger gate on condition', async () => { const context: GateContext = { traceId: 'test', phase: 'deployment', environment: 'production', cost: { phase: 0, run: 0, day: 0 } }; const triggered = await gateManager.checkGates(context); // Should trigger production_deploy gate expect(triggered).toHaveLength(1); expect(triggered[0].id).toBe('production_deploy'); }); it('should wait for approval response', async () => { const context: GateContext = { traceId: 'test', phase: 'deployment', environment: 'production', cost: { phase: 0, run: 0, day: 0 } }; const triggered = await gateManager.checkGates(context); const gate = triggered[0]; // Start approval request (async) const approvalPromise = gateManager.requestApproval(gate, context); // Simulate human response after 1 second setTimeout(() => { gateManager.submitResponse({ requestId: gate.id, approved: true, approver: 'test-user', timestamp: new Date() }); }, 1000); const response = await approvalPromise; expect(response.approved).toBe(true); expect(response.approver).toBe('test-user'); }); it('should timeout if no response', async () => { const context: GateContext = { traceId: 'test', phase: 'deployment', environment: 'production', cost: { phase: 0, run: 0, day: 0 } }; const triggered = await gateManager.checkGates(context); const gate = { ...triggered[0], timeout: 100 }; // 100ms timeout const response = await gateManager.requestApproval(gate, context); expect(response.approved).toBe(false); expect(response.reason).toBe('TIMEOUT'); }, 10000); });

12. Implementation Checklist

Phase 1: Core Infrastructure (Week 1)

  • Implement CircuitBreaker base class with state machine
  • Implement IterationBreaker with stagnation detection
  • Implement CostTracker and CostBreaker
  • Implement TimeBreaker
  • Implement ErrorRateBreaker with circular buffer
  • Create safety configuration types
  • Add safety event tables to database schema

Phase 2: Human Gates (Week 1-2)

  • Implement HumanGate type and standard gates
  • Implement GateManager with request/response flow
  • Add gate event tracking
  • Create CLI gate handler with inquirer prompts
  • Implement gate timeout and escalation

Phase 3: Integration (Week 2)

  • Implement SafetyManager coordinator
  • Integrate safety checks into BaseAgent loop
  • Add cost recording to LLM calls
  • Add error recording to tool executions
  • Test breaker trip propagation

Phase 4: Automation Ladder (Week 2)

  • Implement AutomationLadder level manager
  • Add automation metrics tracking
  • Implement level transition logic
  • Add level checks to decision points
  • Test level progression

Phase 5: Audit Trail (Week 2)

  • Implement SafetyAudit query interface
  • Add audit event emission
  • Create cost breakdown queries
  • Implement cleanup for old records

Phase 6: Testing (Week 2)

  • Unit tests for all breakers
  • Integration tests for gate flow
  • Cost calculation accuracy tests
  • Stagnation detection tests
  • End-to-end safety system test

13. Configuration Examples

Example 1: Conservative (High Safety)

typescript
import { defineConfig } from 'forge';

export default defineConfig({
  safety: {
    breakers: {
      iteration: {
        limits: { planning: 10, implementation: 25, testing: 3 }
      },
      cost: {
        budgets: { perRun: 25.0, perDay: 100.0 }
      }
    },
    automation: {
      level: 1, // AI suggests only
      allowLevelUp: false
    }
  }
});

Example 2: Aggressive (Fast Iteration)

typescript
import { defineConfig } from 'forge';

export default defineConfig({
  safety: {
    breakers: {
      iteration: {
        limits: { implementation: 100, testing: 10 }
      },
      cost: {
        budgets: { perRun: 100.0, perDay: 500.0 }
      }
    },
    automation: {
      level: 3, // High autonomy
      allowLevelUp: true
    }
  }
});

Example 3: Production (Maximum Safety)

typescript
import { defineConfig } from 'forge';

export default defineConfig({
  safety: {
    gates: {
      custom: [
        // Always require approval for any deployment
        {
          id: 'any_deploy',
          phase: 'deployment',
          condition: () => true,
          prompt: 'Approve deployment',
          timeout: 30 * 60_000 // 30 minutes
        },
        // Require approval for large code changes
        {
          id: 'large_change',
          phase: 'review',
          // Treat a missing review as zero lines changed
          condition: (ctx) => (ctx.review?.linesChanged ?? 0) > 500,
          prompt: 'Large code change requires review',
          timeout: 4 * 60 * 60_000 // 4 hours
        }
      ]
    },
    automation: {
      level: 2,
      allowLevelUp: false // Never auto-advance in production
    }
  }
});

14. Summary

The safety system is the nervous system of Forge. It prevents runaway execution through circuit breakers, enforces human oversight at critical points through gates, and enables progressive autonomy through the automation ladder.

Key Components:

  1. Circuit Breakers - Iteration, Cost, Time, Error Rate
  2. Human Gates - Conditional approval checkpoints
  3. Automation Ladder - Progressive autonomy earning
  4. Audit Trail - Complete safety decision logging

Implementation Priority:

  • P0 (Week 1): Core breakers + basic gates
  • P0 (Week 2): Integration + automation ladder
  • P1 (Week 3+): Advanced features + optimization

Success Metrics:

  • Zero runaway executions
  • < 1 minute median gate response time
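  • Automation level progression to Level 3 once 200 runs are recorded and the Level 3 false-positive, missed-bug, and approval-rate thresholds are met (Level 4 requires 500 runs)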
  • 100% audit trail coverage of safety decisions
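The median gate response time can be derived from `gate_events` by pairing each `requested` event with the first `approved` or `rejected` event that shares its `requestId`. The sketch below assumes exactly that pairing rule and uses a hypothetical `GateEventRow` shape with timestamps in epoch milliseconds; `medianGateResponseMs` is not part of the plan.

```typescript
// Hypothetical row shape: the gate_events columns this calculation needs.
interface GateEventRow {
  requestId: string;
  type: 'requested' | 'approved' | 'rejected' | 'escalated';
  timestamp: number; // epoch milliseconds
}

// Median time from request to first decision; undefined when nothing was decided.
function medianGateResponseMs(events: GateEventRow[]): number | undefined {
  const requestedAt = new Map<string, number>();
  const durations: number[] = [];
  const ordered = [...events].sort((a, b) => a.timestamp - b.timestamp);

  for (const event of ordered) {
    if (event.type === 'requested') {
      requestedAt.set(event.requestId, event.timestamp);
    } else if (event.type === 'approved' || event.type === 'rejected') {
      const start = requestedAt.get(event.requestId);
      if (start !== undefined) {
        durations.push(event.timestamp - start);
        requestedAt.delete(event.requestId); // count only the first decision
      }
    }
  }

  if (durations.length === 0) return undefined;
  durations.sort((a, b) => a - b);
  const mid = Math.floor(durations.length / 2);
  return durations.length % 2 === 1
    ? durations[mid]
    : (durations[mid - 1] + durations[mid]) / 2;
}
```

Requests that time out without a decision are excluded here; counting them at their timeout value instead would give a stricter reading of the one-minute target.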

This safety system balances autonomy with control, enabling agents to operate efficiently while maintaining human oversight where it matters most.