security

February 8, 2026

Forge Review: Critical & High Severity Findings

Critical & High Severity Findings

CRITICAL-1: Command Injection in Agent Tools

Files: src/agents/implementer.ts:110, src/agents/implementer.ts:129, src/agents/planner.ts:45

Severity: Critical | Category: Security

Multiple agent tools interpolate user-controlled input directly into shell commands:

typescript
// implementer.ts:110 — git_create_branch tool
await execAsync(`git checkout -b ${input.branchName}`);

// implementer.ts:129 — git_commit tool  
await execAsync(`git commit -m "${input.message}"`);

// planner.ts:45 — glob tool
await execAsync(`find . -name "${input.pattern}" -type f`);

An LLM producing a branch name like foo; rm -rf / would execute arbitrary commands. This is the most serious issue in the codebase.

Fix: Use Bun.spawn() with argument arrays (no shell interpolation), or at minimum escape/validate all shell arguments against [a-zA-Z0-9._/-] patterns.

CRITICAL-2: SQL Injection in SQLiteCheckpointStorage

File: src/orchestrator/checkpoint.ts:95-104

Severity: Critical | Category: Security

The save() method uses string interpolation to build SQL:

typescript
await this.db.exec(`
  INSERT OR REPLACE INTO checkpoints (id, trace_id, phase, state, timestamp)
  VALUES (
    '${checkpoint.id}',
    '${checkpoint.traceId}',
    '${checkpoint.phase}',
    '${JSON.stringify(checkpoint.state)}',
    ${checkpoint.timestamp.getTime()}
  )
`);

If any checkpoint field contains a single quote, this breaks. More critically, crafted input could inject SQL. The load() method at line 110 correctly uses parameterized queries — inconsistency within the same class.

Fix: Use parameterized queries via this.db.query() with ? placeholders for all values.

CRITICAL-3: CLI Pipeline Runs Simulated, Not Real

File: src/cli/index.ts:169-194

The forge run command uses simulatePhase() (setTimeout delays) instead of actual agent execution. The Pipeline class exists and is functional, but the CLI never calls it.

typescript
console.log('\n📋 Phase 1: Planning...');
// TODO: planner.execute()
await simulatePhase('planning', 1000);  // Just a sleep!

Meanwhile, the beads mode path at line 114 correctly imports and runs the actual BeadsPipeline. The standard pipeline path should import and run Pipeline from ./orchestrator/pipeline.ts.

Fix: Replace simulatePhase() calls with actual Pipeline.run() invocation using the loaded config.

HIGH-1: Duplicate Type Definitions

Files: src/types/index.ts (lines 662-757) vs src/core/errors.ts

ForgeBaseError, AgentError, ToolError, ErrorSource, and ErrorSeverity are defined in both files with different implementations:

Location	`ForgeBaseError` constructor	`ErrorSource` values
`src/types/index.ts:711`	Takes `ForgeBaseErrorOptions`	4 values: AGENT, TOOL, ORCHESTRATOR, SYSTEM
`src/core/errors.ts:32`	Takes `Omit<ErrorDetails, 'stack'>`	6 values: +SAFETY, +CONFIG, +EXTERNAL

The core/errors.ts version is richer (includes details property, getters, and more error subclasses including OrchestratorError, SafetyError, ConfigError, ExternalError). The types/index.ts version also has a simpler ForgeError class (line 662) that is separate from ForgeBaseError.

Impact: Depending on which import path is used, errors may have different shapes. The safety module imports from core/errors.ts while agents import from types/index.ts.

Fix: Remove the error definitions from types/index.ts and re-export from core/errors.ts. Keep one canonical error hierarchy.

HIGH-2: Two Competing EventBus Implementations

Files: src/core/bus.ts vs src/events/bus.ts

Feature	`core/bus.ts` (InMemoryEventBus)	`events/bus.ts` (EventBus)
Storage	In-memory array only	SQLite-persisted via Drizzle ORM
ID Gen	`Date.now().toString(36)-random`	`ulid()`
Wildcards	Via Node.js EventEmitter listeners	Dedicated `wildcardHandlers` Set
Checkpoints	No	Built-in `snapshot()` / `getCheckpoint()`
Cleanup	`clear()` method	`close()` (SQLite connection)

The CLI and Pipeline use core/bus.ts (InMemoryEventBus), while the events/bus.ts is the full-featured version with SQLite persistence matching the system design. The core/bus.ts import path is '../core/bus.js' and the CLI references '../core/bus.js'.

Impact: Events are lost on process restart. The design specifies SQLite-persisted events for audit trails and replay.

Fix: Consolidate to events/bus.ts as the canonical implementation. The InMemoryEventBus can be kept as a test fixture.

HIGH-3: `better-sqlite3` Instead of `bun:sqlite`

Files: package.json, src/events/bus.ts, src/memory/store.ts

The project uses better-sqlite3 (external C++ addon) despite the CLAUDE.md rule: "bun:sqlite for SQLite. Don't use better-sqlite3."

better-sqlite3 adds native compilation overhead, is slower on Bun, and requires @types/better-sqlite3 as a dev dependency.

Fix: Migrate to import { Database } from 'bun:sqlite' with the Drizzle bun-sqlite driver.

HIGH-4: TypeScript Errors in Safety Module

File: src/safety/breakers.ts — 5 errors

All related to noUncheckedIndexedAccess strictness:

Line 207: Object possibly undefined when accessing array element
Lines 336-388: string | undefined not assignable to string

These are real type safety gaps — array access without bounds checking in the error rate sliding window.

HIGH-5: `any` Type in Pi-Agent Bridges

Files: src/agents/pi-model-bridge.ts:31,45, src/agents/pi-adapter.ts:44

typescript
// pi-model-bridge.ts:31
export function resolvePiModel(config: LLMConfig): Model<any> {

// pi-adapter.ts:44  
private model: Model<any>;

Per CLAUDE.md rules: "Don't use any — use unknown and narrow." The Model<any> suppresses all type checking on the model's response shape.

HIGH-6: Unbounded Memory Growth in InMemoryEventBus

File: src/core/bus.ts:35

typescript
this.events.push(fullEvent);  // Never pruned!

The in-memory event array grows forever. For a long-running pipeline with many iterations, this will eventually consume all available memory.

Fix: Add a max events config with LRU eviction, or switch to the SQLite-backed EventBus.