Safety System Implementation Plan
Overview
This plan specifies the complete implementation of Forge's safety system, including circuit breakers, human gates, automation ladder, cost tracking, and audit trails. The safety system is the nervous system that prevents runaway execution, enforces human oversight at critical points, and enables progressive autonomy.
Build Priority: P0 - Must be built alongside core loop infrastructure (Week 1-2)
Dependencies:
- Core types and event bus (from core/)
- Memory system (for learning from safety events)
- Tool execution context (for cost tracking)
1. Circuit Breaker Framework
1.1 Base Circuit Breaker Interface
File: src/safety/breakers.ts
```typescript
// ─── Core Abstractions ─────────────────────────────────────
// PhaseName is assumed to be exported by the core types module (see Dependencies).

export enum BreakerState {
  CLOSED = 'closed',       // Normal operation
  HALF_OPEN = 'half_open', // Testing after failure
  OPEN = 'open'            // Halted due to threshold breach
}

export interface BreakerResult {
  shouldBreak: boolean;
  state: BreakerState;
  reason?: string;
  currentValue: number;
  threshold: number;
  suggestion?: string;
}

export interface BreakerConfig {
  enabled: boolean;
  threshold: number;
  warning: number;      // Fraction of threshold (0-1)
  resetAfter?: number;  // milliseconds
}

// ─── Abstract Base Class ───────────────────────────────────

export abstract class CircuitBreaker {
  protected state: BreakerState = BreakerState.CLOSED;
  protected lastFailureTime?: Date;
  protected failureCount: number = 0;

  constructor(
    public readonly name: string,
    // Stored as `_config` so each subclass can expose a narrowed `config` getter
    // (the concrete breakers read `this._config`).
    protected readonly _config: BreakerConfig
  ) {}

  /**
   * Check if the breaker should trip.
   * Called before each iteration or significant operation.
   */
  abstract check(context: BreakerContext): Promise<BreakerResult>;

  /**
   * Reset the breaker to closed state.
   * Called after successful operations or timeout.
   */
  reset(): void {
    this.state = BreakerState.CLOSED;
    this.failureCount = 0;
    this.lastFailureTime = undefined;
  }

  /**
   * Check if enough time has passed to attempt reset.
   */
  protected shouldAttemptReset(): boolean {
    if (!this.lastFailureTime || !this._config.resetAfter) return false;
    const elapsed = Date.now() - this.lastFailureTime.getTime();
    return elapsed >= this._config.resetAfter;
  }

  /**
   * Record a failure and open the circuit.
   */
  protected recordFailure(reason: string): void {
    this.failureCount++;
    this.lastFailureTime = new Date();
    this.state = BreakerState.OPEN;
  }

  /**
   * Check if current value is in warning range.
   */
  protected isWarning(current: number, threshold: number): boolean {
    return current >= threshold * this._config.warning;
  }
}

// ─── Context passed to breakers ────────────────────────────

export interface BreakerContext {
  traceId: string;
  phase: PhaseName;
  iteration: number;
  cost: CostAccumulator;
  elapsed: number;
  errorWindow: ErrorEvent[];
}

export interface ErrorEvent {
  timestamp: Date;
  severity: 'error' | 'warning';
  source: string;
  message: string;
}

export interface CostAccumulator {
  phase: number;
  run: number;
  day: number;
}
```
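Because `warning` is a fraction of `threshold`, every check has three possible outcomes. The sketch below shows that classification in isolation. It assumes a breaker trips at `current >= threshold` (the concrete breakers choose their own comparison), and `classifyReading` and `Reading` are hypothetical names, not part of the plan.

```typescript
// Hypothetical helper: classify a metric against a threshold and a warning fraction.
type Reading = 'ok' | 'warning' | 'break';

function classifyReading(
  current: number,
  threshold: number,
  warningFraction: number // same meaning as BreakerConfig.warning (0-1)
): Reading {
  if (current >= threshold) return 'break'; // assumed trip rule
  if (current >= threshold * warningFraction) return 'warning';
  return 'ok';
}

// With threshold 10 and warning 0.8, the warning band starts at 8.
console.log(classifyReading(5, 10, 0.8));
console.log(classifyReading(9, 10, 0.8));
console.log(classifyReading(10, 10, 0.8));
```

Keeping the warning level relative to the threshold means a per-phase limit can change without retuning the warning value.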
1.2 State Machine Implementation
State Transitions:
CLOSED ── threshold exceeded ──▶ OPEN
OPEN ── resetAfter timeout ──▶ HALF_OPEN
HALF_OPEN ── failure continues ──▶ OPEN
HALF_OPEN ── success ──▶ CLOSED
Implementation:
```typescript
// In CircuitBreaker base class
protected transitionState(
  shouldTrip: boolean,
  context: BreakerContext
): BreakerState {
  switch (this.state) {
    case BreakerState.CLOSED:
      if (shouldTrip) {
        this.recordFailure('threshold exceeded'); // sets state to OPEN
        return BreakerState.OPEN;
      }
      return BreakerState.CLOSED;

    case BreakerState.OPEN:
      if (this.shouldAttemptReset()) {
        // Persist the transition; otherwise the breaker stays OPEN forever
        // and the HALF_OPEN branch below is never reached.
        this.state = BreakerState.HALF_OPEN;
        return BreakerState.HALF_OPEN;
      }
      return BreakerState.OPEN;

    case BreakerState.HALF_OPEN:
      if (shouldTrip) {
        this.recordFailure('failed during recovery'); // back to OPEN
        return BreakerState.OPEN;
      } else {
        this.reset(); // back to CLOSED
        return BreakerState.CLOSED;
      }
  }
}
```
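The transitions can also be written as a pure function, which makes them easy to test without a clock. This is a minimal sketch of that idea, not the plan's implementation. `nextState` and `TransitionInput` are hypothetical names, and the elapsed time is passed in rather than read from `Date.now()`.

```typescript
type State = 'closed' | 'open' | 'half_open';

interface TransitionInput {
  state: State;
  shouldTrip: boolean;     // did the current check breach the threshold?
  msSinceFailure: number;  // time since the last recorded failure
  resetAfterMs: number;    // cool-down before a recovery probe is allowed
}

function nextState(input: TransitionInput): State {
  const { state, shouldTrip, msSinceFailure, resetAfterMs } = input;
  switch (state) {
    case 'closed':
      return shouldTrip ? 'open' : 'closed';
    case 'open':
      // Stay open until the cool-down has elapsed, then allow one probe.
      return msSinceFailure >= resetAfterMs ? 'half_open' : 'open';
    case 'half_open':
      // One probe decides: failure reopens the circuit, success closes it.
      return shouldTrip ? 'open' : 'closed';
  }
}

// Walk one full cycle with a 5-minute cool-down.
const coolDown = 5 * 60_000;
let s: State = 'closed';
s = nextState({ state: s, shouldTrip: true, msSinceFailure: 0, resetAfterMs: coolDown });
s = nextState({ state: s, shouldTrip: false, msSinceFailure: coolDown, resetAfterMs: coolDown });
s = nextState({ state: s, shouldTrip: false, msSinceFailure: coolDown, resetAfterMs: coolDown });
console.log(s);
```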
2. Iteration Breaker Implementation
2.1 Iteration Limits
File: src/safety/breakers.ts
typescript// ─── Iteration Breaker Configuration ─────────────────────── export interface IterationBreakerConfig extends BreakerConfig { limits: { default: number; planning: number; implementation: number; review: number; testing: number; deployment: number; }; stagnation: { threshold: number; // Consecutive iterations without progress definition: StagnationDetector; }; } export const DEFAULT_ITERATION_CONFIG: IterationBreakerConfig = { enabled: true, threshold: 10, // Overridden per phase warning: 0.8, // Warn at 80% resetAfter: 5 * 60_000, // 5 minutes limits: { default: 10, planning: 20, // Architecture can be complex implementation: 50, // Code generation with retries review: 10, // Review iterations testing: 5, // Fix/retry cycles deployment: 3, // Deployment attempts }, stagnation: { threshold: 3, // 3 consecutive no-progress iterations definition: null as any, // Injected } }; // ─── Iteration Breaker Implementation ────────────────────── export class IterationBreaker extends CircuitBreaker { private iterationCounts = new Map<string, number>(); private lastProgress = new Map<string, Date>(); private stagnationCounts = new Map<string, number>(); constructor(config: IterationBreakerConfig) { super('iteration', config); } private get config(): IterationBreakerConfig { return this._config as IterationBreakerConfig; } async check(context: BreakerContext): Promise<BreakerResult> { const key = `${context.traceId}:${context.phase}`; const currentCount = this.iterationCounts.get(key) ?? 0; const newCount = currentCount + 1; // Update count this.iterationCounts.set(key, newCount); // Get phase-specific limit const limit = this.config.limits[context.phase] ?? this.config.limits.default; // Check stagnation const stagnated = await this.checkStagnation(key, context); if (stagnated) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'STAGNATION_DETECTED', currentValue: this.stagnationCounts.get(key) ?? 
0, threshold: this.config.stagnation.threshold, suggestion: 'Agent is not making progress. Consider simplifying approach or requesting human input.' }; } // Check hard limit if (newCount > limit) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'MAX_ITERATIONS_EXCEEDED', currentValue: newCount, threshold: limit, suggestion: `Phase ${context.phase} exceeded ${limit} iterations. Task may be too complex or approach may be incorrect.` }; } // Warning check if (this.isWarning(newCount, limit)) { return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'APPROACHING_ITERATION_LIMIT', currentValue: newCount, threshold: limit, suggestion: `${limit - newCount} iterations remaining in ${context.phase} phase.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: newCount, threshold: limit }; } private async checkStagnation( key: string, context: BreakerContext ): Promise<boolean> { const detector = this.config.stagnation.definition; const hasProgress = await detector.hasProgress(context); if (hasProgress) { // Reset stagnation counter this.lastProgress.set(key, new Date()); this.stagnationCounts.set(key, 0); return false; } // No progress detected const stagnationCount = (this.stagnationCounts.get(key) ?? 
0) + 1; this.stagnationCounts.set(key, stagnationCount); return stagnationCount >= this.config.stagnation.threshold; } resetPhase(traceId: string, phase: PhaseName): void { const key = `${traceId}:${phase}`; this.iterationCounts.delete(key); this.lastProgress.delete(key); this.stagnationCounts.delete(key); } } // ─── Stagnation Detection ────────────────────────────────── export interface StagnationDetector { hasProgress(context: BreakerContext): Promise<boolean>; } export class DefaultStagnationDetector implements StagnationDetector { async hasProgress(context: BreakerContext): Promise<boolean> { // Phase-specific progress definitions switch (context.phase) { case 'planning': return this.planningProgress(context); case 'implementation': return this.implementationProgress(context); case 'review': return this.reviewProgress(context); case 'testing': return this.testingProgress(context); case 'deployment': return this.deploymentProgress(context); default: return false; } } private planningProgress(context: BreakerContext): boolean { // Progress = new decisions, architecture elements, or tasks defined // Check if output has grown/changed from previous iteration // This requires access to phase state, injected via context const state = (context as any).phaseState; return state?.tasksCount > (state?.previousTasksCount ?? 0); } private implementationProgress(context: BreakerContext): boolean { // Progress = new code written, files modified, or tests pass const state = (context as any).phaseState; return ( state?.filesModified > 0 || state?.linesAdded > (state?.previousLinesAdded ?? 
0) || state?.testsPassedDelta > 0 ); } private reviewProgress(context: BreakerContext): boolean { // Progress = findings addressed or new findings detected const state = (context as any).phaseState; return ( state?.findingsResolved > 0 || state?.newFindings > 0 ); } private testingProgress(context: BreakerContext): boolean { // Progress = tests pass, failures reduce, or new tests added const state = (context as any).phaseState; return ( state?.testsPassed > (state?.previousTestsPassed ?? 0) || state?.failuresReduced > 0 ); } private deploymentProgress(context: BreakerContext): boolean { // Progress = deployment step completed or health improved const state = (context as any).phaseState; return ( state?.deploymentStage !== state?.previousDeploymentStage || state?.healthScore > (state?.previousHealthScore ?? 0) ); } }
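The stagnation rule in `checkStagnation` reduces to a counter of consecutive no-progress iterations per `traceId:phase` key. The sketch below isolates that counter. `StagnationCounter` is a hypothetical name, and the progress signal is passed in as a boolean instead of coming from a `StagnationDetector`.

```typescript
class StagnationCounter {
  private readonly threshold: number;
  private readonly counts = new Map<string, number>();

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  /** Returns true once `threshold` consecutive iterations made no progress. */
  record(key: string, madeProgress: boolean): boolean {
    if (madeProgress) {
      this.counts.set(key, 0); // any progress resets the streak
      return false;
    }
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next >= this.threshold;
  }
}

// Threshold 3, as in DEFAULT_ITERATION_CONFIG.stagnation.threshold.
const counter = new StagnationCounter(3);
const key = 'trace-1:testing';
console.log(counter.record(key, false)); // first iteration without progress
console.log(counter.record(key, false)); // second
console.log(counter.record(key, false)); // third: stagnation is reported
```

Because the streak resets on any progress, a phase that alternates between progress and none is stopped only by the hard iteration limit.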
3. Cost Breaker Implementation
3.1 Cost Tracking and Budgets
File: src/safety/budget.ts
typescript// ─── Cost Tracking Infrastructure ────────────────────────── export interface CostEvent { timestamp: Date; phase: PhaseName; model: string; promptTokens: number; completionTokens: number; costUsd: number; operation: string; } export interface ModelPricing { input: number; // Per 1K tokens output: number; // Per 1K tokens } export const MODEL_PRICING: Record<string, ModelPricing> = { 'claude-sonnet-4-5-20250929': { input: 0.003, output: 0.015 }, 'claude-haiku-4-5-20251001': { input: 0.00025, output: 0.00125 }, 'claude-opus-4-6': { input: 0.015, output: 0.075 }, 'gpt-4o': { input: 0.0025, output: 0.01 }, 'gpt-4o-mini': { input: 0.00015, output: 0.0006 }, }; export class CostTracker { private costs: CostEvent[] = []; /** * Record a cost event. */ record(event: Omit<CostEvent, 'costUsd'>): number { const pricing = MODEL_PRICING[event.model]; if (!pricing) { throw new Error(`Unknown model pricing: ${event.model}`); } const costUsd = (event.promptTokens / 1000) * pricing.input + (event.completionTokens / 1000) * pricing.output; const fullEvent: CostEvent = { ...event, costUsd }; this.costs.push(fullEvent); return costUsd; } /** * Get accumulated cost for a scope. */ accumulate(scope: CostScope): number { const filtered = this.filter(scope); return filtered.reduce((sum, e) => sum + e.costUsd, 0); } private filter(scope: CostScope): CostEvent[] { const now = Date.now(); return this.costs.filter(event => { // Time window if (scope.since) { const age = now - event.timestamp.getTime(); if (age > scope.since) return false; } // Phase filter if (scope.phase && event.phase !== scope.phase) { return false; } return true; }); } } export interface CostScope { since?: number; // milliseconds ago phase?: PhaseName; }
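As a worked example of the arithmetic in `CostTracker.record` and the time-window filter in `accumulate`, the sketch below prices one call and sums a window. The rates are copied from the `claude-sonnet-4-5-20250929` entry in `MODEL_PRICING`. The names `costUsd`, `spentSince`, and `TimedCost` are hypothetical, and the current time is passed in explicitly so the result does not depend on the clock.

```typescript
interface Pricing {
  input: number;  // USD per 1K prompt tokens
  output: number; // USD per 1K completion tokens
}

function costUsd(promptTokens: number, completionTokens: number, p: Pricing): number {
  return (promptTokens / 1000) * p.input + (completionTokens / 1000) * p.output;
}

interface TimedCost {
  timestampMs: number;
  costUsd: number;
}

/** Sum the cost of events that are at most `windowMs` old at time `nowMs`. */
function spentSince(events: TimedCost[], windowMs: number, nowMs: number): number {
  return events
    .filter(e => nowMs - e.timestampMs <= windowMs)
    .reduce((sum, e) => sum + e.costUsd, 0);
}

const sonnet: Pricing = { input: 0.003, output: 0.015 };

// 12K prompt tokens and 2K completion tokens:
// 12 * 0.003 + 2 * 0.015 = 0.036 + 0.030 = 0.066 USD
const cost = costUsd(12_000, 2_000, sonnet);
console.log(cost.toFixed(3)); // "0.066"
```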
3.2 Cost Breaker
File: src/safety/breakers.ts
typescript// ─── Cost Breaker Configuration ──────────────────────────── export interface CostBreakerConfig extends BreakerConfig { budgets: { perPhase: Record<PhaseName, number>; // USD perRun: number; perDay: number; }; } export const DEFAULT_COST_CONFIG: CostBreakerConfig = { enabled: true, threshold: 1.0, // 100% of budget warning: 0.8, // Warn at 80% budgets: { perPhase: { planning: 5.0, implementation: 10.0, review: 2.0, testing: 3.0, deployment: 2.0, }, perRun: 50.0, perDay: 200.0, } }; // ─── Cost Breaker Implementation ─────────────────────────── export class CostBreaker extends CircuitBreaker { constructor( private tracker: CostTracker, config: CostBreakerConfig ) { super('cost', config); } private get config(): CostBreakerConfig { return this._config as CostBreakerConfig; } async check(context: BreakerContext): Promise<BreakerResult> { const checks = await Promise.all([ this.checkPhase(context), this.checkRun(context), this.checkDay(context), ]); // Return the most severe result const failed = checks.find(c => c.shouldBreak); if (failed) return failed; const warning = checks.find(c => c.reason?.includes('WARNING')); if (warning) return warning; return checks[0]; // Default to phase check } private async checkPhase(context: BreakerContext): Promise<BreakerResult> { const spent = this.tracker.accumulate({ phase: context.phase }); const budget = this.config.budgets.perPhase[context.phase] ?? 5.0; if (spent >= budget) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'PHASE_BUDGET_EXCEEDED', currentValue: spent, threshold: budget, suggestion: `Phase ${context.phase} exceeded $${budget.toFixed(2)} budget. 
Consider simplifying approach.` }; } if (this.isWarning(spent, budget)) { return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'PHASE_BUDGET_WARNING', currentValue: spent, threshold: budget, suggestion: `Phase ${context.phase} at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: spent, threshold: budget }; } private async checkRun(context: BreakerContext): Promise<BreakerResult> { const spent = context.cost.run; const budget = this.config.budgets.perRun; if (spent >= budget) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'RUN_BUDGET_EXCEEDED', currentValue: spent, threshold: budget, suggestion: `Run exceeded $${budget.toFixed(2)} total budget.` }; } if (this.isWarning(spent, budget)) { return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'RUN_BUDGET_WARNING', currentValue: spent, threshold: budget, suggestion: `Run at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: spent, threshold: budget }; } private async checkDay(context: BreakerContext): Promise<BreakerResult> { const spent = this.tracker.accumulate({ since: 24 * 60 * 60_000 }); const budget = this.config.budgets.perDay; if (spent >= budget) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'DAILY_BUDGET_EXCEEDED', currentValue: spent, threshold: budget, suggestion: `Daily budget of $${budget.toFixed(2)} exceeded. Operations halted until tomorrow.` }; } if (this.isWarning(spent, budget)) { return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'DAILY_BUDGET_WARNING', currentValue: spent, threshold: budget, suggestion: `Daily spending at $${spent.toFixed(2)} of $${budget.toFixed(2)} budget.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: spent, threshold: budget }; } }
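`CostBreaker.check` returns the most severe of its three results and finds warnings by matching the substring `'WARNING'` in `reason`. One alternative is to rank results explicitly, sketched below. `severity`, `mostSevere`, and the trimmed `CheckResult` shape are hypothetical. The sketch assumes a non-breaking result carries a `reason` only when it is a warning, which holds for the three checks above.

```typescript
interface CheckResult {
  shouldBreak: boolean;
  reason?: string;
  currentValue: number;
  threshold: number;
}

function severity(r: CheckResult): number {
  if (r.shouldBreak) return 2;
  if (r.reason !== undefined) return 1; // assumed: non-breaking result with a reason is a warning
  return 0;
}

/** Returns the first result with the highest severity. */
function mostSevere(results: CheckResult[]): CheckResult {
  if (results.length === 0) throw new Error('no results to compare');
  return results.reduce((worst, r) => (severity(r) > severity(worst) ? r : worst));
}

// Phase is fine, the run is at $42 of $50 (past the 80% warning), the day is fine.
const verdict = mostSevere([
  { shouldBreak: false, currentValue: 2, threshold: 10 },
  { shouldBreak: false, reason: 'RUN_BUDGET_WARNING', currentValue: 42, threshold: 50 },
  { shouldBreak: false, currentValue: 42, threshold: 200 },
]);
console.log(verdict.reason); // "RUN_BUDGET_WARNING"
```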
4. Time Breaker Implementation
4.1 Time Tracking
File: src/safety/breakers.ts
typescript// ─── Time Breaker Configuration ──────────────────────────── export interface TimeBreakerConfig extends BreakerConfig { timeouts: Record<PhaseName | 'total', number>; // milliseconds } export const DEFAULT_TIME_CONFIG: TimeBreakerConfig = { enabled: true, threshold: 1.0, // 100% of timeout warning: 0.9, // Warn at 90% timeouts: { planning: 30 * 60_000, // 30 minutes implementation: 60 * 60_000, // 1 hour review: 30 * 60_000, // 30 minutes testing: 20 * 60_000, // 20 minutes deployment: 15 * 60_000, // 15 minutes total: 120 * 60_000, // 2 hours } }; // ─── Time Breaker Implementation ─────────────────────────── export class TimeBreaker extends CircuitBreaker { private startTimes = new Map<string, Date>(); constructor(config: TimeBreakerConfig) { super('time', config); } private get config(): TimeBreakerConfig { return this._config as TimeBreakerConfig; } startPhase(traceId: string, phase: PhaseName): void { const key = `${traceId}:${phase}`; this.startTimes.set(key, new Date()); } async check(context: BreakerContext): Promise<BreakerResult> { const phaseCheck = await this.checkPhase(context); if (phaseCheck.shouldBreak) return phaseCheck; const totalCheck = await this.checkTotal(context); if (totalCheck.shouldBreak) return totalCheck; // Return warning if any return phaseCheck.reason ? 
phaseCheck : totalCheck; } private async checkPhase(context: BreakerContext): Promise<BreakerResult> { const key = `${context.traceId}:${context.phase}`; const startTime = this.startTimes.get(key); if (!startTime) { // Not started yet, no violation return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: 0, threshold: this.config.timeouts[context.phase] }; } const elapsed = Date.now() - startTime.getTime(); const limit = this.config.timeouts[context.phase]; if (elapsed >= limit) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'PHASE_TIMEOUT_EXCEEDED', currentValue: elapsed, threshold: limit, suggestion: `Phase ${context.phase} exceeded ${limit / 60_000} minute timeout.` }; } if (this.isWarning(elapsed, limit)) { const remaining = limit - elapsed; return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'PHASE_TIMEOUT_WARNING', currentValue: elapsed, threshold: limit, suggestion: `Phase ${context.phase} has ${Math.ceil(remaining / 60_000)} minutes remaining.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: elapsed, threshold: limit }; } private async checkTotal(context: BreakerContext): Promise<BreakerResult> { const elapsed = context.elapsed; const limit = this.config.timeouts.total; if (elapsed >= limit) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'TOTAL_TIMEOUT_EXCEEDED', currentValue: elapsed, threshold: limit, suggestion: `Total pipeline exceeded ${limit / 60_000} minute timeout.` }; } if (this.isWarning(elapsed, limit)) { const remaining = limit - elapsed; return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'TOTAL_TIMEOUT_WARNING', currentValue: elapsed, threshold: limit, suggestion: `Pipeline has ${Math.ceil(remaining / 60_000)} minutes remaining.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: elapsed, threshold: limit }; } }
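The phase timeout check is simple arithmetic on elapsed time, illustrated below with the 30-minute planning timeout and the 90% warning level from `DEFAULT_TIME_CONFIG`. `phaseTimeStatus` and `TimeStatus` are hypothetical names. Timestamps are passed in as numbers so the example is deterministic.

```typescript
interface TimeStatus {
  status: 'ok' | 'warning' | 'timeout';
  remainingMinutes: number;
}

function phaseTimeStatus(
  startMs: number,
  nowMs: number,
  limitMs: number,
  warningFraction: number
): TimeStatus {
  const elapsed = nowMs - startMs;
  if (elapsed >= limitMs) {
    return { status: 'timeout', remainingMinutes: 0 };
  }
  // Round up so "1 minute remaining" is shown until the limit is actually hit.
  const remainingMinutes = Math.ceil((limitMs - elapsed) / 60_000);
  if (elapsed >= limitMs * warningFraction) {
    return { status: 'warning', remainingMinutes };
  }
  return { status: 'ok', remainingMinutes };
}

const planningLimit = 30 * 60_000;
console.log(phaseTimeStatus(0, 10 * 60_000, planningLimit, 0.9)); // ok, 20 minutes left
console.log(phaseTimeStatus(0, 28 * 60_000, planningLimit, 0.9)); // warning, 2 minutes left
console.log(phaseTimeStatus(0, 30 * 60_000, planningLimit, 0.9)); // timeout
```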
5. Error Rate Breaker Implementation
5.1 Sliding Window Error Tracking
File: src/safety/breakers.ts
typescript// ─── Error Rate Breaker Configuration ────────────────────── export interface ErrorRateBreakerConfig extends BreakerConfig { windowSize: number; // milliseconds thresholds: { warning: number; // % of events that are errors critical: number; }; } export const DEFAULT_ERROR_RATE_CONFIG: ErrorRateBreakerConfig = { enabled: true, threshold: 0.25, // 25% error rate warning: 0.1, // 10% error rate windowSize: 5 * 60_000, // 5 minutes thresholds: { warning: 0.10, critical: 0.25, } }; // ─── Circular Buffer for Event Window ────────────────────── export class CircularBuffer<T> { private buffer: T[] = []; private index = 0; constructor(private maxSize: number) {} add(item: T): void { if (this.buffer.length < this.maxSize) { this.buffer.push(item); } else { this.buffer[this.index] = item; this.index = (this.index + 1) % this.maxSize; } } filter(predicate: (item: T) => boolean): T[] { return this.buffer.filter(predicate); } removeOlderThan(timestamp: number): void { this.buffer = this.buffer.filter(item => { const itemTime = (item as any).timestamp.getTime(); return itemTime >= timestamp; }); } get length(): number { return this.buffer.length; } get items(): T[] { return [...this.buffer]; } } // ─── Error Rate Breaker Implementation ───────────────────── export class ErrorRateBreaker extends CircuitBreaker { private errorWindow: CircularBuffer<ErrorEvent>; constructor(config: ErrorRateBreakerConfig) { super('error_rate', config); this.errorWindow = new CircularBuffer(1000); // Max 1000 events tracked } private get config(): ErrorRateBreakerConfig { return this._config as ErrorRateBreakerConfig; } /** * Record an error event. */ recordError(event: ErrorEvent): void { this.errorWindow.add(event); } /** * Record a success event (for rate calculation). 
*/ recordSuccess(operation: string): void { this.errorWindow.add({ timestamp: new Date(), severity: 'warning', // Not an error source: operation, message: 'success' }); } async check(context: BreakerContext): Promise<BreakerResult> { const now = Date.now(); const windowStart = now - this.config.windowSize; // Clean old events this.errorWindow.removeOlderThan(windowStart); // Calculate error rate const allEvents = this.errorWindow.items; const errorEvents = allEvents.filter(e => e.severity === 'error'); if (allEvents.length === 0) { // No events yet return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: 0, threshold: this.config.thresholds.critical }; } const errorRate = errorEvents.length / allEvents.length; // Check critical threshold if (errorRate >= this.config.thresholds.critical) { return { shouldBreak: true, state: BreakerState.OPEN, reason: 'ERROR_RATE_CRITICAL', currentValue: errorRate, threshold: this.config.thresholds.critical, suggestion: `Error rate ${(errorRate * 100).toFixed(1)}% exceeds critical threshold. Halting operations.` }; } // Check warning threshold if (errorRate >= this.config.thresholds.warning) { return { shouldBreak: false, state: BreakerState.CLOSED, reason: 'ERROR_RATE_WARNING', currentValue: errorRate, threshold: this.config.thresholds.warning, suggestion: `Error rate ${(errorRate * 100).toFixed(1)}% above warning threshold. Monitoring closely.` }; } return { shouldBreak: false, state: BreakerState.CLOSED, currentValue: errorRate, threshold: this.config.thresholds.critical }; } }
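`recordSuccess` stores successes as `severity: 'warning'` entries, so genuine warnings and successes are indistinguishable when the rate is computed. The sketch below is one possible alternative that records an explicit outcome flag per event. `SlidingErrorWindow` and `Outcome` are hypothetical names, and timestamps are passed in so the window behaviour can be tested without waiting.

```typescript
interface Outcome {
  timestampMs: number;
  ok: boolean;
}

class SlidingErrorWindow {
  private readonly windowMs: number;
  private events: Outcome[] = [];

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  record(timestampMs: number, ok: boolean): void {
    this.events.push({ timestampMs, ok });
  }

  /** Error rate over events no older than `windowMs`; 0 when the window is empty. */
  errorRate(nowMs: number): number {
    const cutoff = nowMs - this.windowMs;
    this.events = this.events.filter(e => e.timestampMs >= cutoff); // drop expired events
    if (this.events.length === 0) return 0;
    const errors = this.events.filter(e => !e.ok).length;
    return errors / this.events.length;
  }
}

// Five-minute window: 3 failures out of 10 operations gives a 30% error rate,
// above the 25% critical threshold in DEFAULT_ERROR_RATE_CONFIG.
const errors = new SlidingErrorWindow(5 * 60_000);
for (let i = 0; i < 10; i++) {
  errors.record(i * 1000, i >= 3); // the first three operations fail
}
console.log(errors.errorRate(10_000));
```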
6. Human Gates System
6.1 Gate Definitions
File: src/safety/gates.ts
typescript// ─── Human Gate Types ────────────────────────────────────── export interface HumanGate { id: string; phase: PhaseName | '*'; // '*' = any phase condition: GateCondition; prompt: string; timeout: number; // milliseconds escalation?: string; // Who to escalate to on timeout } export type GateCondition = (context: GateContext) => boolean | Promise<boolean>; export interface GateContext { traceId: string; phase: PhaseName; plan?: ImplementationPlan; review?: ReviewResult; cost: CostAccumulator; environment?: string; } // ─── Predefined Gates ────────────────────────────────────── export const STANDARD_GATES: HumanGate[] = [ { id: 'architecture_approval', phase: 'planning', condition: (ctx) => { return ctx.plan?.risk.level === 'high' || ctx.plan?.risk.level === 'critical'; }, prompt: 'Review proposed architecture before implementation begins.', timeout: 24 * 60 * 60_000, // 24 hours escalation: 'architect' }, { id: 'production_deploy', phase: 'deployment', condition: (ctx) => { return ctx.environment === 'production'; }, prompt: 'Approve production deployment.', timeout: 60 * 60_000, // 1 hour escalation: 'deployment_lead' }, { id: 'security_findings', phase: 'review', condition: (ctx) => { return ctx.review?.findings.some(f => f.severity === 'critical' && f.category === 'security' ) ?? false; }, prompt: 'Critical security finding requires human review.', timeout: 12 * 60 * 60_000, // 12 hours escalation: 'security_team' }, { id: 'cost_overrun', phase: '*', condition: (ctx) => { return ctx.cost.run > 40; // $40 = 80% of default $50 budget }, prompt: 'Approaching cost budget. Continue?', timeout: 2 * 60 * 60_000, // 2 hours escalation: 'budget_owner' } ];
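A custom gate follows the same shape as the standard ones: a phase (or `'*'`), a condition, and a prompt. The sketch below defines one and shows the phase and condition matching that `GateManager.checkGates` performs. It uses simplified local types with synchronous conditions. The `schema_migration` gate and the `filesTouched` field are hypothetical and do not exist in `GateContext`.

```typescript
type Phase = 'planning' | 'implementation' | 'review' | 'testing' | 'deployment';

interface Ctx {
  phase: Phase;
  runCostUsd: number;
  filesTouched: string[]; // hypothetical field, for illustration only
}

interface Gate {
  id: string;
  phase: Phase | '*';
  condition: (ctx: Ctx) => boolean;
  prompt: string;
}

// Hypothetical custom gate: pause when a migration file is modified.
const migrationGate: Gate = {
  id: 'schema_migration',
  phase: 'implementation',
  condition: ctx => ctx.filesTouched.some(f => f.startsWith('migrations/')),
  prompt: 'Schema migration detected. Review before continuing.',
};

// Same rule as the standard cost_overrun gate: any phase, run cost above $40.
const costGate: Gate = {
  id: 'cost_overrun',
  phase: '*',
  condition: ctx => ctx.runCostUsd > 40,
  prompt: 'Approaching cost budget. Continue?',
};

function triggeredGates(gates: Gate[], ctx: Ctx): string[] {
  return gates
    .filter(g => g.phase === '*' || g.phase === ctx.phase) // phase match first
    .filter(g => g.condition(ctx))                         // then the condition
    .map(g => g.id);
}

console.log(
  triggeredGates([migrationGate, costGate], {
    phase: 'implementation',
    runCostUsd: 41.5,
    filesTouched: ['migrations/001_add_users.sql'],
  })
);
```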
6.2 Gate Manager
File: src/safety/gates.ts
```typescript
import { ulid } from 'ulid';
// EventBus, ImplementationPlan and ReviewResult are assumed to come from core/.

// ─── Gate Request and Response ─────────────────────────────

export interface GateRequest {
  id: string;
  gateId: string;
  timestamp: Date;
  context: GateContext;
  summary: string;
  details: unknown;
  timeout: number;
}

export interface GateResponse {
  requestId: string;
  approved: boolean;
  approver: string;
  timestamp: Date;
  reason?: string;
  conditions?: string[]; // Conditional approval requirements
}

// ─── Gate Manager Implementation ───────────────────────────

export class GateManager {
  private gates: Map<string, HumanGate> = new Map();
  private pendingRequests = new Map<string, GateRequest>();
  private responses = new Map<string, GateResponse>();

  constructor(
    private bus: EventBus,
    gates: HumanGate[] = STANDARD_GATES
  ) {
    gates.forEach(gate => this.gates.set(gate.id, gate));
  }

  /**
   * Register a custom gate.
   */
  registerGate(gate: HumanGate): void {
    this.gates.set(gate.id, gate);
  }

  /**
   * List requests that are still waiting for a decision.
   * Used by CLIGateHandler.pollAndPrompt().
   */
  getPendingRequests(): GateRequest[] {
    return [...this.pendingRequests.values()];
  }

  /**
   * Check if any gates should be triggered for the current context.
   */
  async checkGates(context: GateContext): Promise<HumanGate[]> {
    const triggered: HumanGate[] = [];
    for (const gate of this.gates.values()) {
      // Check phase match
      if (gate.phase !== '*' && gate.phase !== context.phase) {
        continue;
      }
      // Check condition
      const shouldTrigger = await gate.condition(context);
      if (shouldTrigger) {
        triggered.push(gate);
      }
    }
    return triggered;
  }

  /**
   * Request human approval at a gate.
   * This pauses execution until a response is received or timeout occurs.
   */
  async requestApproval(
    gate: HumanGate,
    context: GateContext
  ): Promise<GateResponse> {
    const requestId = ulid();
    const request: GateRequest = {
      id: requestId,
      gateId: gate.id,
      timestamp: new Date(),
      context,
      summary: this.generateSummary(gate, context),
      details: this.generateDetails(context),
      timeout: gate.timeout
    };

    // Store pending request
    this.pendingRequests.set(requestId, request);

    // Emit gate request event
    await this.bus.emit({
      traceId: context.traceId,
      source: 'safety',
      type: 'gate.requested',
      payload: {
        requestId,
        gateId: gate.id,
        prompt: gate.prompt,
        summary: request.summary,
        timeout: gate.timeout
      }
    });

    // Send notification to human(s)
    await this.notifyHuman(request, gate);

    // Wait for response with timeout
    const response = await this.waitForResponse(requestId, gate.timeout);

    if (!response) {
      // Timeout occurred: the request is no longer answerable
      this.pendingRequests.delete(requestId);
      await this.escalate(request, gate);
      return {
        requestId,
        approved: false,
        approver: 'system',
        timestamp: new Date(),
        reason: 'TIMEOUT'
      };
    }

    // Emit gate response event
    await this.bus.emit({
      traceId: context.traceId,
      source: 'safety',
      type: response.approved ? 'gate.approved' : 'gate.rejected',
      payload: response
    });

    return response;
  }

  /**
   * Submit a response to a pending gate request.
   * Called by CLI or API when human makes decision.
   */
  submitResponse(response: GateResponse): void {
    this.responses.set(response.requestId, response);
    this.pendingRequests.delete(response.requestId);
  }

  /**
   * Wait for a response to a gate request.
   */
  private async waitForResponse(
    requestId: string,
    timeout: number
  ): Promise<GateResponse | null> {
    const startTime = Date.now();
    while (Date.now() - startTime < timeout) {
      const response = this.responses.get(requestId);
      if (response) {
        this.responses.delete(requestId); // consumed; avoid unbounded growth
        return response;
      }
      // Poll every second
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
    return null; // Timeout
  }

  /**
   * Generate human-readable summary for gate request.
   */
  private generateSummary(gate: HumanGate, context: GateContext): string {
    switch (gate.id) {
      case 'architecture_approval':
        return `High-risk architecture proposal for ${context.plan?.task}`;
      case 'production_deploy':
        return `Production deployment requested`;
      case 'security_findings': {
        // Braces scope `count` to this case
        const count = context.review?.findings.filter(f =>
          f.severity === 'critical' && f.category === 'security'
        ).length ?? 0;
        return `${count} critical security finding(s) detected`;
      }
      case 'cost_overrun':
        return `Cost approaching budget: $${context.cost.run.toFixed(2)}`;
      default:
        return gate.prompt;
    }
  }

  /**
   * Generate detailed information for gate request.
   */
  private generateDetails(context: GateContext): unknown {
    return {
      phase: context.phase,
      cost: context.cost,
      plan: context.plan,
      review: context.review,
      environment: context.environment
    };
  }

  /**
   * Send notification to human(s).
   * Implementation depends on notification system (CLI, Slack, etc.)
   */
  private async notifyHuman(request: GateRequest, gate: HumanGate): Promise<void> {
    // This will be implemented based on the notification strategy
    // For now, just log to console (CLI will poll for pending requests)
    console.log(`\n🚧 HUMAN APPROVAL REQUIRED: ${gate.prompt}`);
    console.log(`Request ID: ${request.id}`);
    console.log(`Summary: ${request.summary}`);
    console.log(`Timeout: ${gate.timeout / 60_000} minutes\n`);
  }

  /**
   * Escalate to designated person/team on timeout.
   */
  private async escalate(request: GateRequest, gate: HumanGate): Promise<void> {
    await this.bus.emit({
      traceId: request.context.traceId,
      source: 'safety',
      type: 'gate.escalated',
      payload: {
        requestId: request.id,
        gateId: gate.id,
        escalateTo: gate.escalation,
        reason: 'TIMEOUT'
      }
    });
    console.log(`\n⚠️ GATE ESCALATED: ${gate.prompt}`);
    console.log(`Escalating to: ${gate.escalation}`);
    console.log(`Request ID: ${request.id}\n`);
  }
}
```
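`waitForResponse` polls once per second until a response arrives or the timeout passes. A possible alternative, sketched here under the assumption that responses are submitted in the same process, is to keep one promise per request and settle it directly on submission. `PendingApprovals` and `Decision` are hypothetical names, not part of the plan.

```typescript
interface Decision {
  approved: boolean;
  approver: string;
  reason?: string;
}

interface Waiter {
  resolve: (d: Decision) => void;
  timer: ReturnType<typeof setTimeout>;
}

class PendingApprovals {
  private readonly waiters = new Map<string, Waiter>();

  /** Settles on submit() or after `timeoutMs`, whichever comes first. */
  wait(requestId: string, timeoutMs: number): Promise<Decision> {
    return new Promise<Decision>(resolve => {
      const timer = setTimeout(() => {
        this.waiters.delete(requestId);
        resolve({ approved: false, approver: 'system', reason: 'TIMEOUT' });
      }, timeoutMs);
      this.waiters.set(requestId, { resolve, timer });
    });
  }

  /** Returns false if no request with this id is pending. */
  submit(requestId: string, decision: Decision): boolean {
    const waiter = this.waiters.get(requestId);
    if (!waiter) return false;
    clearTimeout(waiter.timer); // cancel the timeout so it cannot fire later
    this.waiters.delete(requestId);
    waiter.resolve(decision);
    return true;
  }

  pendingIds(): string[] {
    return Array.from(this.waiters.keys());
  }
}

const approvalsDemo = new PendingApprovals();
approvalsDemo.wait('req-demo', 60_000).then(d => console.log(d.approver));
approvalsDemo.submit('req-demo', { approved: true, approver: 'alice' });
```

A timed-out request is treated as rejected, which matches the `TIMEOUT` response returned by `requestApproval`.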
6.3 CLI Integration for Gates
File: src/cli/gates.ts
typescriptimport inquirer from 'inquirer'; // ─── CLI Gate Handler ────────────────────────────────────── export class CLIGateHandler { constructor(private gateManager: GateManager) {} /** * Poll for pending gate requests and prompt user. */ async pollAndPrompt(): Promise<void> { const pending = this.gateManager.getPendingRequests(); if (pending.length === 0) return; for (const request of pending) { await this.promptForDecision(request); } } /** * Prompt user for approval decision. */ private async promptForDecision(request: GateRequest): Promise<void> { console.log('\n' + '='.repeat(60)); console.log(`GATE: ${request.gateId}`); console.log('='.repeat(60)); console.log(`\nSummary: ${request.summary}`); console.log(`\nDetails:`); console.log(JSON.stringify(request.details, null, 2)); console.log('\n'); const { decision } = await inquirer.prompt([ { type: 'list', name: 'decision', message: 'Your decision:', choices: [ { name: 'Approve', value: 'approve' }, { name: 'Reject', value: 'reject' }, { name: 'Approve with conditions', value: 'conditional' } ] } ]); let conditions: string[] | undefined; let reason: string | undefined; if (decision === 'conditional') { const { cond } = await inquirer.prompt([ { type: 'input', name: 'cond', message: 'Enter conditions (comma-separated):' } ]); conditions = cond.split(',').map((s: string) => s.trim()); } if (decision === 'reject') { const { r } = await inquirer.prompt([ { type: 'input', name: 'r', message: 'Reason for rejection:' } ]); reason = r; } // Submit response this.gateManager.submitResponse({ requestId: request.id, approved: decision === 'approve' || decision === 'conditional', approver: process.env.USER ?? 'unknown', timestamp: new Date(), reason, conditions }); console.log(`\n✅ Decision recorded: ${decision}\n`); } }
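The mapping from a menu choice to response fields can be pulled out of the prompt flow and tested on its own. In the handler above, `cond.split(',')` turns blank input into a single empty condition. The sketch below filters those out. `parseConditions`, `toResponseFields`, and treating a conditional approval with no conditions as a plain approval are all assumptions for illustration.

```typescript
type Choice = 'approve' | 'reject' | 'conditional';

interface ResponseFields {
  approved: boolean;
  reason?: string;
  conditions?: string[];
}

/** Split comma-separated input, trimming whitespace and dropping empty entries. */
function parseConditions(raw: string): string[] {
  return raw
    .split(',')
    .map(s => s.trim())
    .filter(s => s.length > 0);
}

function toResponseFields(choice: Choice, freeText: string): ResponseFields {
  switch (choice) {
    case 'approve':
      return { approved: true };
    case 'reject': {
      const reason = freeText.trim();
      return reason ? { approved: false, reason } : { approved: false };
    }
    case 'conditional': {
      const conditions = parseConditions(freeText);
      // Assumption: no usable conditions means a plain approval.
      return conditions.length > 0 ? { approved: true, conditions } : { approved: true };
    }
  }
}

console.log(toResponseFields('conditional', ' add tests, , update docs '));
```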
7. Automation Ladder Implementation
7.1 Ladder Levels
File: src/safety/automation.ts
```typescript
// ─── Automation Ladder Configuration ───────────────────────

export enum AutomationLevel {
  LEVEL_0 = 0, // Human does everything
  LEVEL_1 = 1, // AI suggests, human decides
  LEVEL_2 = 2, // AI acts, human reviews
  LEVEL_3 = 3, // AI acts, human notified
  LEVEL_4 = 4  // Full autonomy (low-risk only)
}

export interface LevelDefinition {
  level: AutomationLevel;
  name: string;
  description: string;
  capabilities: string[];
  requirements: LevelRequirements;
}

export interface LevelRequirements {
  minRuns: number;
  maxFalsePositiveRate: number;
  maxMissedCriticalBugs: number;
  minApprovalRate?: number;
}

export const AUTOMATION_LEVELS: LevelDefinition[] = [
  {
    level: AutomationLevel.LEVEL_0,
    name: 'Manual',
    description: 'Human does everything (current state)',
    capabilities: [
      'System provides suggestions only',
      'All decisions require human approval',
      'Full manual control'
    ],
    requirements: {
      minRuns: 0,
      maxFalsePositiveRate: 1.0,
      maxMissedCriticalBugs: Infinity
    }
  },
  {
    level: AutomationLevel.LEVEL_1,
    name: 'AI Suggests',
    description: 'AI suggests, human decides',
    capabilities: [
      'Review comments are suggestions only',
      'Test failures analyzed but human fixes',
      'Deploy requires explicit approval'
    ],
    requirements: {
      minRuns: 10,
      maxFalsePositiveRate: 0.5,
      maxMissedCriticalBugs: 5
    }
  },
  {
    level: AutomationLevel.LEVEL_2,
    name: 'AI Acts',
    description: 'AI acts, human reviews',
    capabilities: [
      'Auto-fix formatting and simple lint issues',
      'Auto-approve low-risk reviews',
      'Still requires human for medium+ risk'
    ],
    requirements: {
      minRuns: 50,
      maxFalsePositiveRate: 0.2,
      maxMissedCriticalBugs: 1,
      minApprovalRate: 0.8
    }
  },
  {
    level: AutomationLevel.LEVEL_3,
    name: 'AI Autonomous',
    description: 'AI acts, human notified',
    capabilities: [
      'Auto-merge low-risk PRs',
      'Auto-deploy to staging',
      'Human notified, can override within window'
    ],
    requirements: {
      minRuns: 200,
      maxFalsePositiveRate: 0.05,
      maxMissedCriticalBugs: 0,
      minApprovalRate: 0.9
    }
  },
  {
    level: AutomationLevel.LEVEL_4,
    name: 'Full Autonomy',
    description: 'Full autonomy (low-risk only)',
    capabilities: [
      'Fully autonomous for low-risk changes',
      'Human gates remain for medium+ risk',
      'Human gates ALWAYS remain for production deploys'
    ],
    requirements: {
      minRuns: 500,
      maxFalsePositiveRate: 0.02,
      maxMissedCriticalBugs: 0,
      minApprovalRate: 0.95
    }
  }
];
```
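To make the requirement table concrete, the sketch below climbs the ladder one level at a time and reports the highest level a given set of metrics would reach. The `Requirements` and `Metrics` shapes and the `highestEligibleLevel` helper are illustrative names, not part of the plan; the thresholds are copied from `AUTOMATION_LEVELS`, with the array index standing in for the level.

```typescript
// Hypothetical helper for illustration; thresholds copied from AUTOMATION_LEVELS.
interface Requirements {
  minRuns: number;
  maxFalsePositiveRate: number;
  maxMissedCriticalBugs: number;
  minApprovalRate?: number;
}

interface Metrics {
  totalRuns: number;
  falsePositives: number;
  missedCriticalBugs: number;
  approvalRate: number;
}

const REQUIREMENTS: Requirements[] = [
  { minRuns: 0, maxFalsePositiveRate: 1.0, maxMissedCriticalBugs: Infinity },
  { minRuns: 10, maxFalsePositiveRate: 0.5, maxMissedCriticalBugs: 5 },
  { minRuns: 50, maxFalsePositiveRate: 0.2, maxMissedCriticalBugs: 1, minApprovalRate: 0.8 },
  { minRuns: 200, maxFalsePositiveRate: 0.05, maxMissedCriticalBugs: 0, minApprovalRate: 0.9 },
  { minRuns: 500, maxFalsePositiveRate: 0.02, maxMissedCriticalBugs: 0, minApprovalRate: 0.95 },
];

// True when every threshold of one level is satisfied.
function meets(m: Metrics, r: Requirements): boolean {
  const fpRate = m.totalRuns === 0 ? 0 : m.falsePositives / m.totalRuns;
  return (
    m.totalRuns >= r.minRuns &&
    fpRate <= r.maxFalsePositiveRate &&
    m.missedCriticalBugs <= r.maxMissedCriticalBugs &&
    (r.minApprovalRate === undefined || m.approvalRate >= r.minApprovalRate)
  );
}

// Climb one level at a time and stop at the first unmet requirement.
function highestEligibleLevel(m: Metrics, start = 0): number {
  let level = start;
  while (level + 1 < REQUIREMENTS.length && meets(m, REQUIREMENTS[level + 1])) {
    level++;
  }
  return level;
}

// 60 runs, 10% false positives, 85% approval: Level 2, blocked from Level 3 by minRuns.
console.log(highestEligibleLevel({ totalRuns: 60, falsePositives: 6, missedCriticalBugs: 0, approvalRate: 0.85 }));
// 250 runs, 4% false positives, 92% approval: Level 3.
console.log(highestEligibleLevel({ totalRuns: 250, falsePositives: 10, missedCriticalBugs: 0, approvalRate: 0.92 }));
```

One consequence of the table as written: Levels 3 and 4 require `maxMissedCriticalBugs: 0`, so if the missed-bug count is cumulative, a single miss keeps the system at Level 2 or below until the metric is reset.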
7.2 Ladder Manager
File: src/safety/automation.ts
```typescript
// ─── Automation Ladder Manager ─────────────────────────────

export interface LevelMetrics {
  totalRuns: number;
  successfulRuns: number;
  falsePositives: number;
  missedCriticalBugs: number;
  approvalRate: number;
  lastUpdated: Date;
}

export class AutomationLadder {
  private currentLevel: AutomationLevel;
  private metrics: LevelMetrics;

  constructor(
    private db: Database,
    initialLevel: AutomationLevel = AutomationLevel.LEVEL_1
  ) {
    this.currentLevel = initialLevel;
    this.metrics = this.loadMetrics();
  }

  /** Get current automation level. */
  getLevel(): AutomationLevel {
    return this.currentLevel;
  }

  /** Check if an action is allowed at current level. */
  isAllowed(action: string, riskLevel: RiskLevel): boolean {
    switch (this.currentLevel) {
      case AutomationLevel.LEVEL_0:
        // Nothing automated
        return false;
      case AutomationLevel.LEVEL_1:
        // Only suggestions, no actions
        return false;
      case AutomationLevel.LEVEL_2:
        // Auto-fix low-risk issues only
        return action === 'auto_fix' && riskLevel === 'low';
      case AutomationLevel.LEVEL_3:
        // Auto-merge and auto-deploy staging for low-risk
        return ['auto_fix', 'auto_merge', 'auto_deploy_staging'].includes(action)
          && riskLevel === 'low';
      case AutomationLevel.LEVEL_4:
        // Full autonomy for low-risk, but never production
        return riskLevel === 'low' && action !== 'auto_deploy_production';
      default:
        // Unknown level: fail closed
        return false;
    }
  }

  /** Record metrics from a run. */
  async recordRun(outcome: RunOutcome): Promise<void> {
    this.metrics.totalRuns++;
    if (outcome.success) {
      this.metrics.successfulRuns++;
    }
    this.metrics.falsePositives += outcome.falsePositives ?? 0;
    this.metrics.missedCriticalBugs += outcome.missedCriticalBugs ?? 0;

    if (outcome.humanApproved !== undefined) {
      // Update approval rate (exponential moving average)
      const alpha = 0.1;
      this.metrics.approvalRate =
        alpha * (outcome.humanApproved ? 1 : 0) +
        (1 - alpha) * this.metrics.approvalRate;
    }
    this.metrics.lastUpdated = new Date();

    // Persist metrics
    await this.saveMetrics();

    // Check if we can level up
    await this.checkLevelTransition();
  }

  /** Check if we meet requirements for next level. */
  private async checkLevelTransition(): Promise<void> {
    const currentDef = AUTOMATION_LEVELS[this.currentLevel];
    const nextLevel = this.currentLevel + 1;
    if (nextLevel >= AUTOMATION_LEVELS.length) {
      // Already at max level
      return;
    }
    const nextDef = AUTOMATION_LEVELS[nextLevel];

    // Check requirements
    const meetsRequirements =
      this.metrics.totalRuns >= nextDef.requirements.minRuns &&
      this.getFalsePositiveRate() <= nextDef.requirements.maxFalsePositiveRate &&
      this.metrics.missedCriticalBugs <= nextDef.requirements.maxMissedCriticalBugs &&
      (nextDef.requirements.minApprovalRate === undefined ||
        this.metrics.approvalRate >= nextDef.requirements.minApprovalRate);

    if (meetsRequirements) {
      console.log(`\n🎉 AUTOMATION LEVEL UP: ${currentDef.name} → ${nextDef.name}`);
      console.log(`${nextDef.description}`);
      console.log(`\nNew capabilities:`);
      nextDef.capabilities.forEach(cap => console.log(` - ${cap}`));
      console.log('');
      this.currentLevel = nextLevel;
      await this.saveLevel();
    }
  }

  /** Calculate false positive rate. */
  private getFalsePositiveRate(): number {
    if (this.metrics.totalRuns === 0) return 0;
    return this.metrics.falsePositives / this.metrics.totalRuns;
  }

  /** Load metrics from database. */
  private loadMetrics(): LevelMetrics {
    const row = this.db.query(`
      SELECT * FROM automation_metrics
      ORDER BY last_updated DESC
      LIMIT 1
    `).get() as Record<string, number> | null;

    if (!row) {
      return {
        totalRuns: 0,
        successfulRuns: 0,
        falsePositives: 0,
        missedCriticalBugs: 0,
        approvalRate: 0,
        lastUpdated: new Date()
      };
    }

    // Columns are snake_case and last_updated holds epoch milliseconds,
    // so map the row explicitly instead of casting it to LevelMetrics.
    return {
      totalRuns: row.total_runs,
      successfulRuns: row.successful_runs,
      falsePositives: row.false_positives,
      missedCriticalBugs: row.missed_critical_bugs,
      approvalRate: row.approval_rate,
      lastUpdated: new Date(row.last_updated)
    };
  }

  /** Save metrics to database (one snapshot row per run). */
  private async saveMetrics(): Promise<void> {
    await this.db.query(`
      INSERT INTO automation_metrics
        (id, total_runs, successful_runs, false_positives,
         missed_critical_bugs, approval_rate, last_updated)
      VALUES (?, ?, ?, ?, ?, ?, ?)
    `).run(
      crypto.randomUUID(), // automation_metrics.id is a required primary key
      this.metrics.totalRuns,
      this.metrics.successfulRuns,
      this.metrics.falsePositives,
      this.metrics.missedCriticalBugs,
      this.metrics.approvalRate,
      this.metrics.lastUpdated.getTime() // stored as epoch milliseconds
    );
  }

  /** Save current level to database. */
  private async saveLevel(): Promise<void> {
    await this.db.query(`
      UPDATE config SET automation_level = ? WHERE id = 1
    `).run(this.currentLevel);
  }
}

export interface RunOutcome {
  success: boolean;
  falsePositives?: number;
  missedCriticalBugs?: number;
  humanApproved?: boolean;
}
```
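Because `approvalRate` is an exponential moving average with `alpha = 0.1` that starts at 0, the `minApprovalRate` thresholds translate into a minimum streak of approvals. The sketch below works that out; `updateApprovalRate` mirrors the update in `recordRun`, while `approvalsToReach` is a hypothetical helper added only for this calculation.

```typescript
// Same update rule as recordRun: EMA with smoothing factor alpha.
function updateApprovalRate(rate: number, approved: boolean, alpha = 0.1): number {
  return alpha * (approved ? 1 : 0) + (1 - alpha) * rate;
}

// Hypothetical helper: consecutive approvals needed to lift the rate from
// `start` to at least `target`.
function approvalsToReach(target: number, start = 0, alpha = 0.1): number {
  if (!(target < 1) || !(alpha > 0 && alpha <= 1)) {
    throw new RangeError('target must be below 1 and alpha must be in (0, 1]');
  }
  let rate = start;
  let count = 0;
  while (rate < target) {
    rate = updateApprovalRate(rate, true, alpha);
    count++;
  }
  return count;
}

console.log(approvalsToReach(0.8));  // Level 2 threshold
console.log(approvalsToReach(0.9));  // Level 3 threshold
console.log(approvalsToReach(0.95)); // Level 4 threshold

// A single rejection at 0.95 drops the rate to about 0.855.
console.log(updateApprovalRate(0.95, false));
```

From a cold start the rate after n straight approvals is 1 - 0.9^n, so the three thresholds need 16, 22, and 29 consecutive approvals respectively. A single rejection at 0.95 pulls the rate back below the Level 3 threshold, which is worth keeping in mind when choosing `alpha`.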
8. Safety Configuration
8.1 Configuration Type
File: src/core/config.ts
```typescript
// ─── Safety Configuration ──────────────────────────────────

export interface SafetyConfig {
  enabled: boolean;
  breakers: {
    iteration: IterationBreakerConfig;
    cost: CostBreakerConfig;
    time: TimeBreakerConfig;
    errorRate: ErrorRateBreakerConfig;
  };
  gates: {
    enabled: boolean;
    custom: HumanGate[];
  };
  automation: {
    level: AutomationLevel;
    allowLevelUp: boolean;
  };
  audit: {
    enabled: boolean;
    retention: number; // days
  };
}

export const DEFAULT_SAFETY_CONFIG: SafetyConfig = {
  enabled: true,
  breakers: {
    iteration: DEFAULT_ITERATION_CONFIG,
    cost: DEFAULT_COST_CONFIG,
    time: DEFAULT_TIME_CONFIG,
    errorRate: DEFAULT_ERROR_RATE_CONFIG,
  },
  gates: {
    enabled: true,
    custom: []
  },
  automation: {
    level: AutomationLevel.LEVEL_1,
    allowLevelUp: true
  },
  audit: {
    enabled: true,
    retention: 90 // 90 days
  }
};
```
8.2 Configuration Loading
File: forge.config.ts (example)
```typescript
import { defineConfig } from 'forge';

export default defineConfig({
  name: 'my-app',
  language: 'typescript',
  safety: {
    // Override default iteration limits
    breakers: {
      iteration: {
        limits: {
          implementation: 100 // Allow more iterations for complex implementation
        }
      },
      // Override cost budgets
      cost: {
        budgets: {
          perDay: 500.0 // Higher daily budget
        }
      }
    },
    // Add custom gate
    gates: {
      custom: [
        {
          id: 'api_change',
          phase: 'review',
          // `?? false` keeps the result a boolean when no review is attached
          condition: (ctx) =>
            ctx.review?.findings.some(f => f.category === 'api_change') ?? false,
          prompt: 'API changes detected. Review breaking changes.',
          timeout: 4 * 60 * 60_000 // 4 hours
        }
      ]
    },
    // Start at Level 2 (already proven system)
    automation: {
      level: 2
    }
  }
});
```
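The example overrides only a few nested keys, which suggests that `defineConfig` merges the user's partial config over `DEFAULT_SAFETY_CONFIG`. The plan does not show that step, so the following is a minimal sketch under that assumption: `deepMerge`, `isPlainObject`, and the default values are illustrative, not Forge's actual implementation or defaults.

```typescript
// Sketch only: helper names and default values are assumptions.
type Plain = Record<string, unknown>;

function isPlainObject(value: unknown): value is Plain {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Recursively overlay `override` on `base`. Nested objects merge key by key;
// arrays, functions, and primitives replace the default wholesale.
function deepMerge(base: Plain, override: Plain): Plain {
  const result: Plain = { ...base };
  for (const key of Object.keys(override)) {
    const next = override[key];
    if (next === undefined) continue; // an absent override keeps the default
    const current = result[key];
    result[key] =
      isPlainObject(current) && isPlainObject(next) ? deepMerge(current, next) : next;
  }
  return result;
}

const defaults: Plain = {
  breakers: {
    iteration: { limits: { planning: 20, implementation: 50 } },
    cost: { budgets: { perRun: 50, perDay: 200 } },
  },
  automation: { level: 1, allowLevelUp: true },
};

const userConfig: Plain = {
  breakers: {
    iteration: { limits: { implementation: 100 } },
    cost: { budgets: { perDay: 500 } },
  },
  automation: { level: 2 },
};

const merged = deepMerge(defaults, userConfig);
console.log(JSON.stringify(merged, null, 2));
```

Replacing arrays rather than concatenating them is a deliberate choice in this sketch: a user-supplied `gates.custom` list then fully determines the custom gates, while the standard gates are added separately by the safety manager.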
9. Safety Integration with Agent Loop
9.1 SafetyManager Coordinator
File: src/safety/manager.ts
```typescript
// ─── Safety Manager ────────────────────────────────────────

export class SafetyManager {
  private breakers: Map<string, CircuitBreaker>;
  private gateManager: GateManager;
  private automationLadder: AutomationLadder;
  private costTracker: CostTracker;
  private errorRateBreaker: ErrorRateBreaker;

  constructor(
    private config: SafetyConfig,
    private bus: EventBus,
    private db: Database
  ) {
    // Initialize cost tracker
    this.costTracker = new CostTracker();

    // Initialize breakers
    this.breakers = new Map([
      ['iteration', new IterationBreaker(config.breakers.iteration)],
      ['cost', new CostBreaker(this.costTracker, config.breakers.cost)],
      ['time', new TimeBreaker(config.breakers.time)],
      ['error_rate', new ErrorRateBreaker(config.breakers.errorRate)]
    ]);
    this.errorRateBreaker = this.breakers.get('error_rate') as ErrorRateBreaker;

    // Initialize gates
    this.gateManager = new GateManager(bus, [
      ...STANDARD_GATES,
      ...config.gates.custom
    ]);

    // Initialize automation ladder
    this.automationLadder = new AutomationLadder(db, config.automation.level);
  }

  /**
   * Check all circuit breakers before proceeding with operation.
   * This is called by agents at the start of each iteration.
   */
  async check(context: BreakerContext): Promise<SafetyCheckResult> {
    if (!this.config.enabled) {
      return { safe: true, breakers: [], gates: [] };
    }

    // Run all breakers in parallel
    const breakerResults = await Promise.all(
      Array.from(this.breakers.values()).map(b => b.check(context))
    );

    // Find any that should break
    const tripped = breakerResults.filter(r => r.shouldBreak);
    if (tripped.length > 0) {
      // Emit breaker events
      for (const result of tripped) {
        await this.bus.emit({
          traceId: context.traceId,
          source: 'safety',
          type: 'breaker.tripped',
          payload: result
        });
      }
      return { safe: false, breakers: tripped, gates: [] };
    }

    // Check for warnings
    const warnings = breakerResults.filter(r =>
      r.reason?.includes('WARNING') || r.reason?.includes('APPROACHING')
    );
    if (warnings.length > 0) {
      for (const warning of warnings) {
        await this.bus.emit({
          traceId: context.traceId,
          source: 'safety',
          type: 'breaker.warning',
          payload: warning
        });
      }
    }

    return { safe: true, breakers: warnings, gates: [] };
  }

  /** Check and trigger human gates if conditions are met. */
  async checkGates(context: GateContext): Promise<GateResponse[]> {
    if (!this.config.gates.enabled) {
      return [];
    }

    const triggered = await this.gateManager.checkGates(context);
    if (triggered.length === 0) {
      return [];
    }

    // Request approval for each triggered gate
    const responses = await Promise.all(
      triggered.map(gate => this.gateManager.requestApproval(gate, context))
    );
    return responses;
  }

  /** Record cost for an LLM operation. */
  recordCost(event: Omit<CostEvent, 'costUsd'>): number {
    return this.costTracker.record(event);
  }

  /** Record an error event for rate tracking. */
  recordError(event: ErrorEvent): void {
    this.errorRateBreaker.recordError(event);
  }

  /** Record a success event for rate tracking. */
  recordSuccess(operation: string): void {
    this.errorRateBreaker.recordSuccess(operation);
  }

  /** Get current automation level. */
  getAutomationLevel(): AutomationLevel {
    return this.automationLadder.getLevel();
  }

  /** Check if an action is allowed at current automation level. */
  isActionAllowed(action: string, riskLevel: RiskLevel): boolean {
    return this.automationLadder.isAllowed(action, riskLevel);
  }

  /** Record run outcome for automation ladder progression. */
  async recordRunOutcome(outcome: RunOutcome): Promise<void> {
    if (this.config.automation.allowLevelUp) {
      await this.automationLadder.recordRun(outcome);
    }
  }

  /** Reset breakers for a phase. */
  resetPhase(traceId: string, phase: PhaseName): void {
    const iterationBreaker = this.breakers.get('iteration') as IterationBreaker;
    iterationBreaker.resetPhase(traceId, phase);

    const timeBreaker = this.breakers.get('time') as TimeBreaker;
    timeBreaker.startPhase(traceId, phase);
  }
}

export interface SafetyCheckResult {
  safe: boolean;
  breakers: BreakerResult[];
  gates: GateResponse[];
}
```
9.2 Integration into Agent Loop
File: src/agents/base.ts (updated)
```typescript
abstract class BaseAgent implements Agent {
  async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
    let iteration = 0;
    let workingMemory = await this.perceive(input, ctx);

    while (true) {
      iteration++;

      // ── SAFETY CHECK ──
      const safetyCheck = await ctx.safety.check({
        traceId: ctx.traceId,
        phase: this.type as PhaseName,
        iteration,
        cost: { phase: ctx.phaseCost, run: ctx.runCost, day: ctx.dayCost },
        elapsed: Date.now() - ctx.startTime,
        errorWindow: ctx.errorWindow
      });

      if (!safetyCheck.safe) {
        // Breaker tripped - halt execution
        const errors = safetyCheck.breakers.map(b => b.reason).join(', ');
        throw new CircuitBreakerError(
          `Circuit breakers tripped: ${errors}`,
          safetyCheck.breakers
        );
      }

      // ── GATE CHECK (at start of phase only) ──
      if (iteration === 1) {
        const gateResponses = await ctx.safety.checkGates({
          traceId: ctx.traceId,
          phase: this.type as PhaseName,
          plan: (input as any).plan,
          review: (input as any).review,
          cost: { phase: ctx.phaseCost, run: ctx.runCost, day: ctx.dayCost },
          environment: ctx.environment
        });

        // Check if any gate rejected
        const rejected = gateResponses.find(r => !r.approved);
        if (rejected) {
          throw new GateRejectedError(
            `Human gate rejected: ${rejected.reason}`,
            rejected
          );
        }
      }

      // ── REASON: ask LLM what to do ──
      const decision = await ctx.llm.chat({
        system: this.systemPrompt,
        messages: workingMemory.messages,
        tools: this.tools.map(t => t.schema),
      });

      // Record cost
      ctx.safety.recordCost({
        timestamp: new Date(),
        phase: this.type as PhaseName,
        model: ctx.llm.model,
        promptTokens: decision.usage.promptTokens,
        completionTokens: decision.usage.completionTokens,
        operation: `${this.type}.reason`
      });

      // ── DONE? ──
      if (decision.done) {
        const output = decision.result as PhaseOutput;
        // Await the emit so the completion event is delivered before returning
        await ctx.bus.emit({ type: `${this.type}.completed`, payload: output });

        // Record success
        ctx.safety.recordSuccess(this.type);
        await this.reflect(ctx, 'success');
        return output;
      }

      // ── ACT: execute the chosen tool ──
      try {
        const tool = this.tools.find(t => t.name === decision.toolCall.name);
        if (!tool) {
          // The LLM asked for a tool this agent does not expose
          throw new Error(`Unknown tool: ${decision.toolCall.name}`);
        }
        const result = await this.executeTool(tool, decision.toolCall.input, ctx);

        // Record success
        ctx.safety.recordSuccess(`${this.type}.${tool.name}`);

        // ── LEARN: update context ──
        workingMemory = this.updateWorkingMemory(workingMemory, decision, result);
      } catch (error) {
        // Record error for rate tracking; `error` is `unknown`, so narrow it first
        ctx.safety.recordError({
          timestamp: new Date(),
          severity: 'error',
          source: `${this.type}.${decision.toolCall.name}`,
          message: error instanceof Error ? error.message : String(error)
        });
        await this.reflect(ctx, 'error', error);

        // Re-throw to fail iteration
        throw error;
      }
    }
  }
}
```
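The loop throws `CircuitBreakerError` and `GateRejectedError`, but neither class is defined in this plan. A minimal sketch of what they might look like follows, together with a hypothetical `describeHalt` helper showing how a caller could tell the two apart. The field names (`breakers`, `response`) and the trimmed `TrippedBreaker` and `RejectedGate` shapes are assumptions.

```typescript
// Assumed shapes: only the fields this sketch reads.
interface TrippedBreaker {
  reason?: string;
  currentValue: number;
  threshold: number;
}

interface RejectedGate {
  approved: boolean;
  reason?: string;
  approver?: string;
}

class CircuitBreakerError extends Error {
  readonly breakers: TrippedBreaker[];

  constructor(message: string, breakers: TrippedBreaker[]) {
    super(message);
    this.name = 'CircuitBreakerError';
    this.breakers = breakers;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class GateRejectedError extends Error {
  readonly response: RejectedGate;

  constructor(message: string, response: RejectedGate) {
    super(message);
    this.name = 'GateRejectedError';
    this.response = response;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical caller-side helper: classify why a phase stopped.
function describeHalt(error: unknown): string {
  if (error instanceof CircuitBreakerError) {
    const reasons = error.breakers.map(b => b.reason ?? 'unknown').join(', ');
    return `halted by breaker: ${reasons}`;
  }
  if (error instanceof GateRejectedError) {
    return `halted by gate: ${error.response.reason ?? 'no reason given'}`;
  }
  return `failed: ${error instanceof Error ? error.message : String(error)}`;
}

const tripped = new CircuitBreakerError('Circuit breakers tripped', [
  { reason: 'MAX_ITERATIONS_EXCEEDED', currentValue: 51, threshold: 50 },
]);
console.log(describeHalt(tripped));
```

Carrying the structured results on the error lets an orchestrator decide between retrying, escalating, and stopping without parsing the message string.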
10. Audit Trail Implementation
10.1 Database Schema
File: src/memory/schema.ts (additions)
```typescript
// ─── Safety Audit Tables ───────────────────────────────────

export const safetyEvents = sqliteTable('safety_events', {
  id: text('id').primaryKey(),
  traceId: text('trace_id').notNull(),
  timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(),
  type: text('type').notNull(), // 'breaker.tripped', 'gate.requested', etc.
  breakerName: text('breaker_name'),
  reason: text('reason'),
  currentValue: real('current_value'),
  threshold: real('threshold'),
  payload: text('payload', { mode: 'json' }),
});

export const gateEvents = sqliteTable('gate_events', {
  id: text('id').primaryKey(),
  traceId: text('trace_id').notNull(),
  timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(),
  gateId: text('gate_id').notNull(),
  requestId: text('request_id').notNull(),
  type: text('type').notNull(), // 'requested', 'approved', 'rejected', 'escalated'
  approver: text('approver'),
  reason: text('reason'),
  conditions: text('conditions', { mode: 'json' }),
});

export const costEvents = sqliteTable('cost_events', {
  id: text('id').primaryKey(),
  traceId: text('trace_id').notNull(),
  timestamp: integer('timestamp', { mode: 'timestamp_ms' }).notNull(),
  phase: text('phase').notNull(),
  model: text('model').notNull(),
  promptTokens: integer('prompt_tokens').notNull(),
  completionTokens: integer('completion_tokens').notNull(),
  costUsd: real('cost_usd').notNull(),
  operation: text('operation').notNull(),
});

export const automationMetrics = sqliteTable('automation_metrics', {
  id: text('id').primaryKey(),
  totalRuns: integer('total_runs').notNull(),
  successfulRuns: integer('successful_runs').notNull(),
  falsePositives: integer('false_positives').notNull(),
  missedCriticalBugs: integer('missed_critical_bugs').notNull(),
  approvalRate: real('approval_rate').notNull(),
  lastUpdated: integer('last_updated', { mode: 'timestamp_ms' }).notNull(),
});
```
10.2 Audit Query Interface
File: src/safety/audit.ts
```typescript
// ─── Safety Audit Interface ────────────────────────────────
import { and, desc, eq, gte, inArray, lt } from 'drizzle-orm';
import { costEvents, gateEvents, safetyEvents } from '../memory/schema';

export class SafetyAudit {
  constructor(private db: Database) {}

  /** Get all safety events for a trace. */
  async getTraceEvents(traceId: string): Promise<SafetyEvent[]> {
    return this.db.select()
      .from(safetyEvents)
      .where(eq(safetyEvents.traceId, traceId))
      .orderBy(safetyEvents.timestamp);
  }

  /** Get all gate events for a trace. */
  async getGateEvents(traceId: string): Promise<GateEvent[]> {
    return this.db.select()
      .from(gateEvents)
      .where(eq(gateEvents.traceId, traceId))
      .orderBy(gateEvents.timestamp);
  }

  /** Get cost breakdown for a trace. */
  async getCostBreakdown(traceId: string): Promise<CostBreakdown> {
    const events = await this.db.select()
      .from(costEvents)
      .where(eq(costEvents.traceId, traceId));

    const byPhase: Record<string, number> = {};
    const byModel: Record<string, number> = {};
    let total = 0;

    for (const event of events) {
      byPhase[event.phase] = (byPhase[event.phase] ?? 0) + event.costUsd;
      byModel[event.model] = (byModel[event.model] ?? 0) + event.costUsd;
      total += event.costUsd;
    }

    return { total, byPhase, byModel, events };
  }

  /** Get breaker trip history. */
  async getBreakerHistory(
    breakerName?: string,
    since?: Date
  ): Promise<SafetyEvent[]> {
    // Collect the filters and combine them with and(); chaining several
    // .where() calls does not AND them together.
    const conditions = [eq(safetyEvents.type, 'breaker.tripped')];
    if (breakerName) {
      conditions.push(eq(safetyEvents.breakerName, breakerName));
    }
    if (since) {
      // timestamp_ms columns are compared against Date values
      conditions.push(gte(safetyEvents.timestamp, since));
    }

    return this.db.select()
      .from(safetyEvents)
      .where(and(...conditions))
      .orderBy(desc(safetyEvents.timestamp));
  }

  /** Get gate approval statistics. */
  async getGateStats(gateId?: string): Promise<GateStats> {
    // Only decided requests count toward the approval rate
    const decided = inArray(gateEvents.type, ['approved', 'rejected']);
    const events = await this.db.select()
      .from(gateEvents)
      .where(gateId ? and(decided, eq(gateEvents.gateId, gateId)) : decided);

    const total = events.length;
    const approved = events.filter(e => e.type === 'approved').length;
    const rejected = events.filter(e => e.type === 'rejected').length;

    return {
      total,
      approved,
      rejected,
      approvalRate: total > 0 ? approved / total : 0
    };
  }

  /** Clean up old audit records. */
  async cleanup(retentionDays: number): Promise<void> {
    const cutoff = new Date(Date.now() - retentionDays * 24 * 60 * 60_000);

    await this.db.delete(safetyEvents)
      .where(lt(safetyEvents.timestamp, cutoff));
    await this.db.delete(gateEvents)
      .where(lt(gateEvents.timestamp, cutoff));
    await this.db.delete(costEvents)
      .where(lt(costEvents.timestamp, cutoff));
  }
}

export interface CostBreakdown {
  total: number;
  byPhase: Record<string, number>;
  byModel: Record<string, number>;
  events: CostEvent[];
}

export interface GateStats {
  total: number;
  approved: number;
  rejected: number;
  approvalRate: number;
}
```
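The retention arithmetic in `cleanup` is easy to check in isolation. The sketch below pulls it into two hypothetical helpers, `retentionCutoff` and `isExpired`, and works through the default 90-day window. It assumes timestamps are handled as `Date` values, which is how Drizzle typically surfaces `timestamp_ms` columns.

```typescript
const MS_PER_DAY = 24 * 60 * 60_000; // 86,400,000 ms

// Hypothetical helper: the instant before which audit rows are deleted.
function retentionCutoff(retentionDays: number, now: Date = new Date()): Date {
  if (!Number.isFinite(retentionDays) || retentionDays < 0) {
    throw new RangeError('retentionDays must be a non-negative number');
  }
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// Hypothetical helper: strict "older than cutoff", matching lt() in cleanup.
function isExpired(timestamp: Date, cutoff: Date): boolean {
  return timestamp.getTime() < cutoff.getTime();
}

// 90 days before 1 April 2025 (UTC) is 1 January 2025 (UTC).
const now = new Date('2025-04-01T00:00:00.000Z');
const cutoff = retentionCutoff(90, now);
console.log(cutoff.toISOString());

console.log(isExpired(new Date('2024-12-31T23:59:59.999Z'), cutoff)); // older than the cutoff
console.log(isExpired(new Date('2025-01-01T00:00:00.000Z'), cutoff)); // exactly at the cutoff, kept
```

Passing `now` as a parameter keeps the calculation deterministic, which makes the cleanup boundary straightforward to cover in a unit test.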
11. Testing Safety System
11.1 Unit Tests for Breakers
File: src/safety/__tests__/breakers.test.ts
```typescript
import { describe, it, expect, beforeEach } from 'bun:test';
import {
  IterationBreaker,
  CostBreaker,
  TimeBreaker,
  ErrorRateBreaker,
  CostTracker
} from '../breakers';

describe('IterationBreaker', () => {
  let breaker: IterationBreaker;

  beforeEach(() => {
    breaker = new IterationBreaker({
      enabled: true,
      threshold: 10,
      warning: 0.8,
      limits: {
        default: 10,
        planning: 20,
        implementation: 50,
        review: 10,
        testing: 5,
        deployment: 3
      },
      stagnation: {
        threshold: 3,
        definition: {
          hasProgress: async () => true // Mock always has progress
        }
      }
    });
  });

  it('should not trip under limit', async () => {
    const result = await breaker.check({
      traceId: 'test', phase: 'implementation', iteration: 10,
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(false);
  });

  it('should warn at 80% of limit', async () => {
    const result = await breaker.check({
      traceId: 'test', phase: 'implementation',
      iteration: 40, // 80% of 50
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(false);
    expect(result.reason).toContain('APPROACHING');
  });

  it('should trip over limit', async () => {
    const result = await breaker.check({
      traceId: 'test', phase: 'implementation',
      iteration: 51, // Over limit of 50
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(true);
    expect(result.reason).toBe('MAX_ITERATIONS_EXCEEDED');
  });
});

describe('CostBreaker', () => {
  let tracker: CostTracker;
  let breaker: CostBreaker;

  beforeEach(() => {
    tracker = new CostTracker();
    breaker = new CostBreaker(tracker, {
      enabled: true,
      threshold: 1.0,
      warning: 0.8,
      budgets: {
        perPhase: {
          planning: 5.0,
          implementation: 10.0,
          review: 2.0,
          testing: 3.0,
          deployment: 2.0
        },
        perRun: 50.0,
        perDay: 200.0
      }
    });
  });

  it('should not trip under budget', async () => {
    tracker.record({
      timestamp: new Date(),
      phase: 'implementation',
      model: 'claude-sonnet-4-5-20250929',
      promptTokens: 1000,
      completionTokens: 500,
      operation: 'test'
    });

    const result = await breaker.check({
      traceId: 'test', phase: 'implementation', iteration: 1,
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(false);
  });

  it('should trip when phase budget exceeded', async () => {
    // Add enough events to exceed $10 implementation budget
    for (let i = 0; i < 100; i++) {
      tracker.record({
        timestamp: new Date(),
        phase: 'implementation',
        model: 'claude-opus-4-6', // Expensive model
        promptTokens: 10000,
        completionTokens: 5000,
        operation: 'test'
      });
    }

    const result = await breaker.check({
      traceId: 'test', phase: 'implementation', iteration: 1,
      cost: { phase: 100, run: 100, day: 100 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(true);
  });
});

describe('ErrorRateBreaker', () => {
  let breaker: ErrorRateBreaker;

  beforeEach(() => {
    breaker = new ErrorRateBreaker({
      enabled: true,
      threshold: 0.25,
      warning: 0.1,
      windowSize: 5 * 60_000,
      thresholds: { warning: 0.10, critical: 0.25 }
    });
  });

  it('should not trip with low error rate', async () => {
    // Record 90 successes, 10 errors = 10% error rate
    for (let i = 0; i < 90; i++) {
      breaker.recordSuccess('test');
    }
    for (let i = 0; i < 10; i++) {
      breaker.recordError({
        timestamp: new Date(),
        severity: 'error',
        source: 'test',
        message: 'test error'
      });
    }

    const result = await breaker.check({
      traceId: 'test', phase: 'implementation', iteration: 1,
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(false);
    expect(result.reason).toContain('WARNING');
  });

  it('should trip with high error rate', async () => {
    // Record 50 successes, 50 errors = 50% error rate
    for (let i = 0; i < 50; i++) {
      breaker.recordSuccess('test');
      breaker.recordError({
        timestamp: new Date(),
        severity: 'error',
        source: 'test',
        message: 'test error'
      });
    }

    const result = await breaker.check({
      traceId: 'test', phase: 'implementation', iteration: 1,
      cost: { phase: 0, run: 0, day: 0 }, elapsed: 0, errorWindow: []
    });
    expect(result.shouldBreak).toBe(true);
    expect(result.reason).toBe('ERROR_RATE_CRITICAL');
  });
});
```
11.2 Integration Test for Gate Flow
File: src/safety/__tests__/gates.integration.test.ts
```typescript
import { describe, it, expect, beforeEach } from 'bun:test';
import { GateManager } from '../gates';
import { EventBus } from '../../core/bus';

describe('GateManager Integration', () => {
  let gateManager: GateManager;
  let bus: EventBus;

  beforeEach(() => {
    bus = new EventBus();
    gateManager = new GateManager(bus);
  });

  it('should trigger gate on condition', async () => {
    const context: GateContext = {
      traceId: 'test',
      phase: 'deployment',
      environment: 'production',
      cost: { phase: 0, run: 0, day: 0 }
    };

    const triggered = await gateManager.checkGates(context);

    // Should trigger production_deploy gate
    expect(triggered).toHaveLength(1);
    expect(triggered[0].id).toBe('production_deploy');
  });

  it('should wait for approval response', async () => {
    const context: GateContext = {
      traceId: 'test',
      phase: 'deployment',
      environment: 'production',
      cost: { phase: 0, run: 0, day: 0 }
    };

    const triggered = await gateManager.checkGates(context);
    const gate = triggered[0];

    // Start approval request (async)
    const approvalPromise = gateManager.requestApproval(gate, context);

    // Simulate human response after 1 second
    setTimeout(() => {
      gateManager.submitResponse({
        requestId: gate.id,
        approved: true,
        approver: 'test-user',
        timestamp: new Date()
      });
    }, 1000);

    const response = await approvalPromise;
    expect(response.approved).toBe(true);
    expect(response.approver).toBe('test-user');
  });

  it('should timeout if no response', async () => {
    const context: GateContext = {
      traceId: 'test',
      phase: 'deployment',
      environment: 'production',
      cost: { phase: 0, run: 0, day: 0 }
    };

    const triggered = await gateManager.checkGates(context);
    const gate = { ...triggered[0], timeout: 100 }; // 100ms timeout

    const response = await gateManager.requestApproval(gate, context);
    expect(response.approved).toBe(false);
    expect(response.reason).toBe('TIMEOUT');
  }, 10000);
});
```
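The timeout test expects `requestApproval` to resolve with `approved: false` and `reason: 'TIMEOUT'` rather than reject. One way to get that behaviour is to race the pending human response against a timer, as sketched below. `withTimeout` and the trimmed `ApprovalResponse` shape are illustrative; the plan does not specify how `GateManager` implements this.

```typescript
// Assumed shape: the fields of a gate response that the tests inspect.
interface ApprovalResponse {
  approved: boolean;
  reason?: string;
  approver?: string;
}

// Resolve with the human response if it arrives in time, otherwise with a
// synthetic TIMEOUT rejection. The timer is cleared once the race settles.
function withTimeout(
  pending: Promise<ApprovalResponse>,
  timeoutMs: number
): Promise<ApprovalResponse> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<ApprovalResponse>(resolve => {
    timer = setTimeout(
      () => resolve({ approved: false, reason: 'TIMEOUT' }),
      timeoutMs
    );
  });

  return Promise.race([pending, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage: a response that never arrives falls back to TIMEOUT after 100 ms.
const neverAnswered = new Promise<ApprovalResponse>(() => {});
withTimeout(neverAnswered, 100).then(response => {
  console.log(response.approved, response.reason);
});
```

Resolving instead of rejecting on timeout means the agent loop can treat a timed-out gate like any other rejection and surface it through the same rejection path.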
12. Implementation Checklist
Phase 1: Core Infrastructure (Week 1)
- Implement CircuitBreaker base class with state machine
- Implement IterationBreaker with stagnation detection
- Implement CostTracker and CostBreaker
- Implement TimeBreaker
- Implement ErrorRateBreaker with circular buffer
- Create safety configuration types
- Add safety event tables to database schema
Phase 2: Human Gates (Week 1-2)
- Implement HumanGate type and standard gates
- Implement GateManager with request/response flow
- Add gate event tracking
- Create CLI gate handler with inquirer prompts
- Implement gate timeout and escalation
Phase 3: Integration (Week 2)
- Implement SafetyManager coordinator
- Integrate safety checks into BaseAgent loop
- Add cost recording to LLM calls
- Add error recording to tool executions
- Test breaker trip propagation
Phase 4: Automation Ladder (Week 2)
- Implement AutomationLadder level manager
- Add automation metrics tracking
- Implement level transition logic
- Add level checks to decision points
- Test level progression
Phase 5: Audit Trail (Week 2)
- Implement SafetyAudit query interface
- Add audit event emission
- Create cost breakdown queries
- Implement cleanup for old records
Phase 6: Testing (Week 2)
- Unit tests for all breakers
- Integration tests for gate flow
- Cost calculation accuracy tests
- Stagnation detection tests
- End-to-end safety system test
13. Configuration Examples
Example 1: Conservative (High Safety)
```typescript
export default defineConfig({
  safety: {
    breakers: {
      iteration: {
        limits: {
          planning: 10,
          implementation: 25,
          testing: 3
        }
      },
      cost: {
        budgets: {
          perRun: 25.0,
          perDay: 100.0
        }
      }
    },
    automation: {
      level: 1, // AI suggests only
      allowLevelUp: false
    }
  }
});
```
Example 2: Aggressive (Fast Iteration)
```typescript
export default defineConfig({
  safety: {
    breakers: {
      iteration: {
        limits: {
          implementation: 100,
          testing: 10
        }
      },
      cost: {
        budgets: {
          perRun: 100.0,
          perDay: 500.0
        }
      }
    },
    automation: {
      level: 3, // High autonomy
      allowLevelUp: true
    }
  }
});
```
Example 3: Production (Maximum Safety)
```typescript
export default defineConfig({
  safety: {
    gates: {
      custom: [
        // Always require approval for any deployment
        {
          id: 'any_deploy',
          phase: 'deployment',
          condition: () => true,
          prompt: 'Approve deployment',
          timeout: 30 * 60_000
        },
        // Require approval for large code changes
        {
          id: 'large_change',
          phase: 'review',
          // Treat a missing review as zero changed lines
          condition: (ctx) => (ctx.review?.linesChanged ?? 0) > 500,
          prompt: 'Large code change requires review',
          timeout: 4 * 60 * 60_000
        }
      ]
    },
    automation: {
      level: 2,
      allowLevelUp: false // Never auto-advance in production
    }
  }
});
```
14. Summary
The safety system is the nervous system of Forge. It prevents runaway execution through circuit breakers, enforces human oversight at critical points through gates, and enables progressive autonomy through the automation ladder.
Key Components:
- Circuit Breakers - Iteration, Cost, Time, Error Rate
- Human Gates - Conditional approval checkpoints
- Automation Ladder - Progressive autonomy earning
- Audit Trail - Complete safety decision logging
Implementation Priority:
- P0 (Week 1): Core breakers + basic gates
- P0 (Week 2): Integration + automation ladder
- P1 (Week 3+): Advanced features + optimization
Success Metrics:
- Zero runaway executions
- < 1 minute median gate response time
- Automation level progression to Level 3+ once its requirements are met (at least 200 runs, false positive rate ≤ 5%, zero missed critical bugs, approval rate ≥ 90%)
- 100% audit trail coverage of safety decisions
This safety system balances autonomy with control, enabling agents to operate efficiently while maintaining human oversight where it matters most.