4 min
security
February 8, 2026

Forge Review: Critical & High Severity Findings

Critical & High Severity Findings

CRITICAL-1: Command Injection in Agent Tools

Files: src/agents/implementer.ts:110, src/agents/implementer.ts:129, src/agents/planner.ts:45

Severity: Critical | Category: Security

Multiple agent tools interpolate user-controlled input directly into shell commands:

typescript
// implementer.ts:110 — git_create_branch tool await execAsync(`git checkout -b ${input.branchName}`); // implementer.ts:129 — git_commit tool await execAsync(`git commit -m "${input.message}"`); // planner.ts:45 — glob tool await execAsync(`find . -name "${input.pattern}" -type f`);

An LLM producing a branch name like foo; rm -rf / would execute arbitrary commands. This is the most serious issue in the codebase.

Fix: Use Bun.spawn() with argument arrays (no shell interpolation), or at minimum escape/validate all shell arguments against [a-zA-Z0-9._/-] patterns.


CRITICAL-2: SQL Injection in SQLiteCheckpointStorage

File: src/orchestrator/checkpoint.ts:95-104

Severity: Critical | Category: Security

The save() method uses string interpolation to build SQL:

typescript
await this.db.exec(` INSERT OR REPLACE INTO checkpoints (id, trace_id, phase, state, timestamp) VALUES ( '${checkpoint.id}', '${checkpoint.traceId}', '${checkpoint.phase}', '${JSON.stringify(checkpoint.state)}', ${checkpoint.timestamp.getTime()} ) `);

If any checkpoint field contains a single quote, this breaks. More critically, crafted input could inject SQL. The load() method at line 110 correctly uses parameterized queries — inconsistency within the same class.

Fix: Use parameterized queries via this.db.query() with ? placeholders for all values.


CRITICAL-3: CLI Pipeline Runs Simulated, Not Real

File: src/cli/index.ts:169-194

The forge run command uses simulatePhase() (setTimeout delays) instead of actual agent execution. The Pipeline class exists and is functional, but the CLI never calls it.

typescript
console.log('\nšŸ“‹ Phase 1: Planning...'); // TODO: planner.execute() await simulatePhase('planning', 1000); // Just a sleep!

Meanwhile, the beads mode path at line 114 correctly imports and runs the actual BeadsPipeline. The standard pipeline path should import and run Pipeline from ./orchestrator/pipeline.ts.

Fix: Replace simulatePhase() calls with actual Pipeline.run() invocation using the loaded config.


HIGH-1: Duplicate Type Definitions

Files: src/types/index.ts (lines 662-757) vs src/core/errors.ts

ForgeBaseError, AgentError, ToolError, ErrorSource, and ErrorSeverity are defined in both files with different implementations:

LocationForgeBaseError constructorErrorSource values
src/types/index.ts:711Takes ForgeBaseErrorOptions4 values: AGENT, TOOL, ORCHESTRATOR, SYSTEM
src/core/errors.ts:32Takes Omit<ErrorDetails, 'stack'>6 values: +SAFETY, +CONFIG, +EXTERNAL

The core/errors.ts version is richer (includes details property, getters, and more error subclasses including OrchestratorError, SafetyError, ConfigError, ExternalError). The types/index.ts version also has a simpler ForgeError class (line 662) that is separate from ForgeBaseError.

Impact: Depending on which import path is used, errors may have different shapes. The safety module imports from core/errors.ts while agents import from types/index.ts.

Fix: Remove the error definitions from types/index.ts and re-export from core/errors.ts. Keep one canonical error hierarchy.


HIGH-2: Two Competing EventBus Implementations

Files: src/core/bus.ts vs src/events/bus.ts

Featurecore/bus.ts (InMemoryEventBus)events/bus.ts (EventBus)
StorageIn-memory array onlySQLite-persisted via Drizzle ORM
ID GenDate.now().toString(36)-randomulid()
WildcardsVia Node.js EventEmitter listenersDedicated wildcardHandlers Set
CheckpointsNoBuilt-in snapshot() / getCheckpoint()
Cleanupclear() methodclose() (SQLite connection)

The CLI and Pipeline use core/bus.ts (InMemoryEventBus), while the events/bus.ts is the full-featured version with SQLite persistence matching the system design. The core/bus.ts import path is '../core/bus.js' and the CLI references '../core/bus.js'.

Impact: Events are lost on process restart. The design specifies SQLite-persisted events for audit trails and replay.

Fix: Consolidate to events/bus.ts as the canonical implementation. The InMemoryEventBus can be kept as a test fixture.


HIGH-3: better-sqlite3 Instead of bun:sqlite

Files: package.json, src/events/bus.ts, src/memory/store.ts

The project uses better-sqlite3 (external C++ addon) despite the CLAUDE.md rule: "bun:sqlite for SQLite. Don't use better-sqlite3."

better-sqlite3 adds native compilation overhead, is slower on Bun, and requires @types/better-sqlite3 as a dev dependency.

Fix: Migrate to import { Database } from 'bun:sqlite' with the Drizzle bun-sqlite driver.


HIGH-4: TypeScript Errors in Safety Module

File: src/safety/breakers.ts — 5 errors

All related to noUncheckedIndexedAccess strictness:

  • Line 207: Object possibly undefined when accessing array element
  • Lines 336-388: string | undefined not assignable to string

These are real type safety gaps — array access without bounds checking in the error rate sliding window.


HIGH-5: any Type in Pi-Agent Bridges

Files: src/agents/pi-model-bridge.ts:31,45, src/agents/pi-adapter.ts:44

typescript
// pi-model-bridge.ts:31 export function resolvePiModel(config: LLMConfig): Model<any> { // pi-adapter.ts:44 private model: Model<any>;

Per CLAUDE.md rules: "Don't use any — use unknown and narrow." The Model<any> suppresses all type checking on the model's response shape.


HIGH-6: Unbounded Memory Growth in InMemoryEventBus

File: src/core/bus.ts:35

typescript
this.events.push(fullEvent); // Never pruned!

The in-memory event array grows forever. For a long-running pipeline with many iterations, this will eventually consume all available memory.

Fix: Add a max events config with LRU eviction, or switch to the SQLite-backed EventBus.