February 8, 2026

Implementation Sub-Plan: Deferred Features & Future Extensibility

Section: 15 - What This Design Explicitly Defers
Generated: 2026-02-07
Status: Draft
Dependencies: System Design (SYSTEM-DESIGN.md), Roadmap, Orchestration, Self-Improvement, Evaluation, Human-AI Collaboration


Executive Summary

This plan catalogs every feature that is deliberately left out of the MVP but that the architecture is explicitly designed to accommodate. The goal is to ensure MVP architecture decisions don't paint us into a corner. Each deferred item specifies:

  1. What the MVP MUST do to not block this later
  2. What the MVP MUST NOT do to avoid breaking changes
  3. Estimated effort to add post-MVP
  4. The trigger/milestone that indicates it's time to build this

Philosophy: The MVP is sequential and single-process. Every abstraction boundary is designed so that swapping the implementation doesn't break calling code.


1. Parallel Agent Execution

Current: Sequential pipeline (Plan → Implement → Review → Test → Deploy)
Future: Parallel implementation agents working on independent modules

1.1 MVP Requirements to Enable This

MUST implement:

typescript
// Agent interface already assumes statelessness
interface Agent {
  id: string;
  type: AgentType;
  // CRITICAL: No mutable shared state; all inputs arrive via parameters
  execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput>;
}
// Context is READ-ONLY for agents
interface AgentContext {
  readonly memory: MemoryStore;
  readonly llm: LLMProvider;
  readonly bus: EventBus;
  readonly safety: SafetyControls;
  // Agents CANNOT modify these; they only read and emit events
}

MUST design:

  • Event bus must support concurrent emitters (already true for in-memory pub/sub)
  • Each agent execution must have isolated working memory (no shared scratch space)
  • Phase outputs must be immutable once returned (freeze objects)
  • Tool execution must be concurrency-safe (each tool call gets its own sandbox)

MUST enforce:

typescript
// In base.ts
abstract class BaseAgent {
  private workingMemory: WorkingMemory; // Local to this execution
  async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
    // NEVER mutate input or ctx; treat both as immutable
    const localContext = this.createLocalContext(ctx);
    const result = await this.runLoop(input, localContext);
    // Return immutable output (note: Object.freeze is shallow)
    return Object.freeze(result);
  }
}
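Because `Object.freeze` only freezes the top-level object, nested arrays and objects in a phase output would still be mutable. A minimal sketch of a recursive freeze follows. The helper name `deepFreeze` and the sample output shape are illustrative assumptions, not part of the design, and the helper is only intended for plain JSON-like data (it does not freeze the contents of `Map` or `Set`, and typed arrays cannot be frozen).

```typescript
// Hypothetical helper: recursively freeze a plain-data phase output.
function deepFreeze<T>(value: T): Readonly<T> {
  if (value !== null && typeof value === 'object' && !Object.isFrozen(value)) {
    // Freeze first so that cyclic references terminate on the isFrozen check.
    Object.freeze(value);
    for (const child of Object.values(value)) {
      deepFreeze(child);
    }
  }
  return value;
}

// Illustrative output shape (assumed, not taken from the design).
const output = deepFreeze({ files: [{ path: 'a.ts', lines: 10 }] });
```

With this in place, both `output.files` and each element of it are frozen, so an accidental write from a parallel agent fails instead of silently racing.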

1.2 MVP Must NOT Do

Forbidden patterns:

typescript
// ❌ BAD: Shared mutable state
class PlannerAgent {
  private cachedPlan: Plan; // Parallel agents would clobber this
}
// ❌ BAD: Context mutation
async execute(input, ctx) {
  ctx.sharedState.lastResult = result; // Race conditions
}
// ❌ BAD: Sequential assumptions
const code = await implementer.execute(plan); // Assumes single implementer; breaks if parallel

Safe alternatives:

typescript
// ✅ GOOD: Stateless with explicit state passing
async execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput> {
  const state = this.perceive(input, ctx); // Local state
  const result = await this.act(state);
  return result; // No side effects on ctx
}

1.3 Orchestrator Changes for Parallelism

MVP orchestrator (sequential):

typescript
// pipeline.ts (MVP)
async function runPipelineSequential(task: string) {
  const plan = await runPhase('planning', { task });
  const code = await runPhase('implementation', plan); // One at a time
  // ...
}

Post-MVP orchestrator (parallel):

typescript
// pipeline.ts (Post-MVP)
async function runPipelineParallel(task: string) {
  const plan = await runPhase('planning', { task });
  // Split plan into independent modules
  const modules = plan.tasks.filter(t => t.dependencies.length === 0);
  // Spawn parallel implementers
  const implementations = await Promise.all(
    modules.map(module => spawnImplementer(module, plan.architecture))
  );
  // Work-stealing for dependent tasks
  const remaining = plan.tasks.filter(t => t.dependencies.length > 0);
  const completed = await workStealingExecution(remaining, implementations);
  // Merge results
  const code = mergeImplementations([...implementations, ...completed]);
  // ...
}

Interface changes needed: NONE — execute() signature stays the same. Only orchestrator internals change.
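The dependency handling in the parallel orchestrator could be approximated, more simply than with work stealing, by grouping tasks into waves in which every task's dependencies were completed by an earlier wave. The sketch below shows that grouping only. `PlanTask` and `planWaves` are hypothetical names, and this is a wave-based schedule rather than the work-stealing strategy mentioned above.

```typescript
// Assumed minimal task shape: an ID plus the IDs it depends on.
interface PlanTask {
  id: string;
  dependencies: string[];
}

// Group tasks into waves; tasks within one wave can run concurrently.
function planWaves(tasks: PlanTask[]): string[][] {
  const done = new Set<string>();
  let pending = [...tasks];
  const waves: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter(t => t.dependencies.every(d => done.has(d)));
    if (ready.length === 0) {
      // Nothing can make progress: a cycle or a dependency on an unknown task.
      throw new Error('Cyclic or unsatisfiable dependencies');
    }
    waves.push(ready.map(t => t.id));
    ready.forEach(t => done.add(t.id));
    pending = pending.filter(t => !done.has(t.id));
  }
  return waves;
}

const waves = planWaves([
  { id: 'db', dependencies: [] },
  { id: 'api', dependencies: ['db'] },
  { id: 'ui', dependencies: [] },
  { id: 'e2e', dependencies: ['api', 'ui'] },
]);
// waves is [['db', 'ui'], ['api'], ['e2e']]
```

An orchestrator could then run each wave with `Promise.all` and move to the next wave once it settles. Waves leave some parallelism unused compared with work stealing, since a task waits for its whole wave rather than only for its own dependencies.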

1.4 Context Bus Changes

MVP (in-memory):

typescript
interface ContextBus {
  get<T>(key: string): T;
  events: EventBus;
}

Post-MVP (concurrent-safe):

typescript
interface ContextBus {
  get<T>(key: string): Readonly<T>; // Enforce immutability
  getSnapshot(): Snapshot; // Atomic read
  events: EventBus;
}

1.5 Checkpoint System Changes

MVP (phase-level checkpoints):

typescript
interface Checkpoint {
  phase: PhaseName;
  state: Record<string, unknown>; // One state per phase
}

Post-MVP (agent-level checkpoints):

typescript
interface Checkpoint {
  phase: PhaseName;
  agents: Map<AgentId, AgentState>; // Multiple concurrent agents
  dependencies: DependencyGraph; // Track which tasks depend on which
  completed: Set<TaskId>; // Assumed field: IDs of tasks that have already finished
}
async function resumeFromCheckpoint(cp: Checkpoint) {
  // Resume all agents that weren't waiting on dependencies.
  // Map has no filter(), so materialize its values first.
  const resumable = [...cp.agents.values()].filter(a =>
    a.dependencies.every(d => cp.completed.has(d))
  );
  await Promise.all(resumable.map(a => a.resume()));
}

1.6 Estimated Effort

Engineering time: 2-3 weeks
Breaking changes: None (if MVP interfaces are designed correctly)
Components affected:

  • Orchestrator (major changes)
  • Checkpoint system (extend schema)
  • Context bus (add immutability enforcement)
  • Agents (no changes if stateless)

1.7 Trigger to Build This

Indicators:

  • Implementation phase takes >10 minutes for tasks that have independent modules
  • Profiling shows agents spending >50% of their time idle, waiting on sequential dependencies
  • User requests feature that spans >5 independent modules

Milestone: After 500+ successful sequential pipeline runs with no shared-state bugs.


2. Kubernetes Deployment

Current: Single Bun process on developer machine
Future: Distributed system across K8s cluster

2.1 MVP Requirements to Enable This

MUST separate concerns:

forge/
├── src/
│   ├── orchestrator/    # Coordinator service (K8s Deployment)
│   ├── agents/          # Worker pods (K8s Jobs or Deployments)
│   ├── tools/           # Tool executors (might stay in agent pods)
│   ├── memory/          # Becomes external service (K8s StatefulSet)
│   └── core/            # Shared library

MUST design interfaces for network boundaries:

typescript
// In the MVP, this is an in-process function call
interface MemoryStore {
  store(memory: Memory): Promise<void>;
  recall(query: Query): Promise<Memory[]>;
}
// Post-MVP, this becomes a gRPC/HTTP API,
// but the interface signature DOES NOT CHANGE
class RemoteMemoryStore implements MemoryStore {
  async store(memory: Memory): Promise<void> {
    await this.httpClient.post('/memory', memory);
  }
  // recall() is omitted here; it would call the remote API in the same way
}

MUST externalize configuration:

typescript
// MVP: config.ts exports a constant
export const config = {
  llm: { provider: 'anthropic', apiKey: process.env.ANTHROPIC_API_KEY },
  memory: { dbPath: './.forge/memory.db' },
};
// Post-MVP: config comes from environment/ConfigMap
export const config = {
  llm: {
    provider: process.env.LLM_PROVIDER,
    apiKey: process.env.LLM_API_KEY,
  },
  memory: {
    dbUrl: process.env.DATABASE_URL, // PostgreSQL connection string
  },
};

2.2 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Hardcoded file paths
const dbPath = '/home/user/.forge/memory.db';
// ❌ BAD: Process-global singletons
class Orchestrator {
  private static instance: Orchestrator; // Doesn't work across pods
}
// ❌ BAD: In-memory state that can't be serialized
const cache = new Map(); // Lost when pod restarts

Safe alternatives:

typescript
// ✅ GOOD: Environment-driven paths
const dbPath = process.env.FORGE_DB_PATH || './.forge/memory.db';
// ✅ GOOD: Dependency injection
class Orchestrator {
  constructor(
    private bus: EventBus,
    private memory: MemoryStore,
    private llm: LLMProvider
  ) {}
}
// ✅ GOOD: External cache with TTL
const cache = new Redis({ url: process.env.REDIS_URL });

2.3 Component Containerization Plan

Orchestrator (Deployment, 1 replica):

yaml
# k8s/orchestrator.yaml
# Note: spec.selector and the pod template labels (required by apps/v1) are omitted for brevity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: forge-orchestrator
spec:
  replicas: 1 # Single coordinator
  template:
    spec:
      containers:
        - name: orchestrator
          image: forge/orchestrator:latest
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: forge-secrets
                  key: database-url

Agents (Jobs, ephemeral):

yaml
# Spawned dynamically by orchestrator
apiVersion: batch/v1
kind: Job
metadata:
  name: implementer-{{ .TaskId }}
spec:
  template:
    spec:
      # Note: a Job's pod template also needs restartPolicy (Never or OnFailure); omitted here
      containers:
        - name: implementer
          image: forge/agents:latest
          env:
            - name: AGENT_TYPE
              value: "implementer"
            - name: TASK_INPUT
              value: "{{ .TaskInputJSON }}"

Memory Store (StatefulSet with persistent volume):

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: forge-memory
spec:
  serviceName: "memory"
  replicas: 1
  template:
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: memory-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: memory-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi

2.4 State Migration

SQLite → PostgreSQL:

sql
-- MVP schema (SQLite)
CREATE TABLE events (
  id TEXT PRIMARY KEY,
  trace_id TEXT NOT NULL,
  -- ...
);
-- Post-MVP schema (PostgreSQL)
CREATE TABLE events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  trace_id UUID NOT NULL,
  -- Same columns, different types
);

Migration script:

typescript
async function migrateSQLiteToPostgres() {
  const sqlite = new Database('.forge/memory.db');
  const postgres = new Client(process.env.DATABASE_URL);
  await postgres.connect(); // Open the connection before issuing queries
  try {
    const events = sqlite.prepare('SELECT * FROM events').all();
    for (const event of events) {
      await postgres.query(
        'INSERT INTO events (...) VALUES (...)',
        mapSQLiteToPostgres(event)
      );
    }
  } finally {
    await postgres.end(); // Always release the connection
  }
}

2.5 Event Bus Migration

In-memory → Redis Pub/Sub:

typescript
// MVP: bus.ts
class InMemoryEventBus implements EventBus {
  private handlers = new Map<string, Set<EventHandler>>();
  async emit(event: ForgeEvent) {
    const handlers = this.handlers.get(event.type) || new Set();
    handlers.forEach(h => h(event));
  }
}
// Post-MVP: redis-bus.ts (implements the same interface!)
class RedisEventBus implements EventBus {
  // A Redis connection in subscriber mode cannot publish, so use two connections
  private pub: Redis;
  private sub: Redis;
  async emit(event: ForgeEvent) {
    // Publish to the per-type channel that on() subscribes to
    await this.pub.publish(`forge:events:${event.type}`, JSON.stringify(event));
  }
  on(type: string, handler: EventHandler) {
    const channel = `forge:events:${type}`;
    this.sub.subscribe(channel);
    this.sub.on('message', (ch, message) => {
      if (ch !== channel) return; // Ignore messages for other event types
      handler(JSON.parse(message));
    });
  }
}

Interface doesn't change — swap implementation via config.
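One way to make the swap a pure configuration change is a small factory keyed by a config value, so calling code only ever sees the `EventBus` interface. The sketch below is an illustration under assumptions: `createEventBus`, `busFactories`, and the simplified `ForgeEvent` shape are hypothetical, and only the in-memory variant is wired up.

```typescript
// Simplified, assumed shapes for illustration only.
type ForgeEvent = { type: string; payload?: unknown };
type EventHandler = (event: ForgeEvent) => void;

interface EventBus {
  emit(event: ForgeEvent): Promise<void>;
  on(type: string, handler: EventHandler): void;
}

class InMemoryEventBus implements EventBus {
  private handlers = new Map<string, Set<EventHandler>>();

  async emit(event: ForgeEvent): Promise<void> {
    this.handlers.get(event.type)?.forEach(h => h(event));
  }

  on(type: string, handler: EventHandler): void {
    const set = this.handlers.get(type) ?? new Set<EventHandler>();
    set.add(handler);
    this.handlers.set(type, set);
  }
}

// Registry of available implementations; a Redis-backed factory would be added here.
const busFactories: Record<string, () => EventBus> = {
  memory: () => new InMemoryEventBus(),
};

// Hypothetical factory: the kind would come from configuration.
function createEventBus(kind: string): EventBus {
  const factory = busFactories[kind];
  if (!factory) throw new Error(`Unknown event bus kind: ${kind}`);
  return factory();
}

const bus = createEventBus('memory');
const received: string[] = [];
bus.on('phase.entered', e => received.push(e.type));
void bus.emit({ type: 'phase.entered' });
```

Callers depend only on `createEventBus` and the interface, so adding a `redis` entry to the registry would not require touching any agent or orchestrator code.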

2.6 Estimated Effort

Engineering time: 4-6 weeks
Breaking changes: None if interfaces are clean
New infrastructure:

  • Kubernetes cluster setup
  • PostgreSQL instance
  • Redis instance
  • Container registry
  • Monitoring (Prometheus/Grafana)

2.7 Trigger to Build This

Indicators:

  • Users running Forge on multiple machines need shared memory
  • Single-process execution can't scale to team usage
  • Need deployment isolation per user/team

Milestone: After 1000+ successful runs in single-process mode with no data loss.


3. Vector Database for Memory

Current: Brute-force cosine similarity in SQLite
Future: pgvector / Qdrant / ChromaDB for fast similarity search

3.1 MVP Requirements to Enable This

MUST abstract similarity search behind interface:

typescript
// memory/store.ts (MVP)
interface MemoryStore {
  store(memory: Memory): Promise<void>;
  // This interface stays the same forever
  recall(query: RecallQuery): Promise<Memory[]>;
}
interface RecallQuery {
  context: string; // Will be embedded
  type?: MemoryType;
  limit?: number;
  minConfidence?: number;
}

MVP implementation (brute force):

typescript
class SQLiteMemoryStore implements MemoryStore {
  async recall(query: RecallQuery): Promise<Memory[]> {
    // Embed query
    const queryEmbedding = await this.llm.embed(query.context);
    // Fetch all memories (brute force!)
    const allMemories = await this.db.select().from(memories);
    // Calculate similarity in-memory
    const withScores = allMemories.map(m => ({
      memory: m,
      score: cosineSimilarity(queryEmbedding, m.embedding)
    }));
    // Sort and filter
    return withScores
      .filter(m => m.score > (query.minConfidence || 0))
      .sort((a, b) => b.score - a.score)
      .slice(0, query.limit || 10)
      .map(m => m.memory);
  }
}

This is O(N) per query in both time and memory, because every stored embedding is loaded and scored. That is acceptable for <100K memories.
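The brute-force store relies on a `cosineSimilarity` helper that is not shown. A minimal sketch of it follows, assuming embeddings are equal-length numeric vectors; returning 0 for a zero-length vector is a choice made here, not something the design specifies.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions differ');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumed convention: a zero vector is treated as similar to nothing.
  return denom === 0 ? 0 : dot / denom;
}
```

Accepting `ArrayLike<number>` lets the same function work for plain arrays and for `Float32Array` embeddings.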

3.2 Post-MVP Implementation (vector DB)

typescript
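class PgVectorMemoryStore implements MemoryStore {
  async recall(query: RecallQuery): Promise<Memory[]> {
    const queryEmbedding = await this.llm.embed(query.context);
    // pgvector accepts a vector literal such as '[0.1,0.2,...]'
    const vec = `[${Array.from(queryEmbedding).join(',')}]`;
    // pgvector does the heavy lifting. A SELECT alias cannot be referenced in WHERE,
    // and ordering by raw distance (ascending) is what allows the HNSW index to be used.
    const results = await this.db.execute(sql`
      SELECT *, 1 - (embedding <=> ${vec}::vector) AS similarity
      FROM memories
      WHERE 1 - (embedding <=> ${vec}::vector) > ${query.minConfidence || 0}
      ORDER BY embedding <=> ${vec}::vector
      LIMIT ${query.limit || 10}
    `);
    return results.map(r => this.mapToMemory(r));
  }
}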

This is approximately O(log N) per query with an HNSW index (an approximate nearest-neighbor search) and scales to millions of memories.

3.3 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Leaking implementation details
interface RecallQuery {
  embedding: Float32Array; // Forces caller to know about embeddings
}
// ❌ BAD: Vector-DB-specific query syntax
interface RecallQuery {
  hnswQuery: HNSWQuery; // Locks us into HNSW
}

Safe:

typescript
// ✅ GOOD: Abstract, high-level query
interface RecallQuery {
  context: string; // Store handles embedding internally
  filters?: RecallFilters; // Generic filters, not DB-specific
}

3.4 Migration Path

Step 1: Add vector DB as secondary store (dual-write)

typescript
class DualWriteMemoryStore implements MemoryStore {
  constructor(
    private sqlite: SQLiteMemoryStore,
    private vector: PgVectorMemoryStore
  ) {}
  async store(memory: Memory) {
    await Promise.all([
      this.sqlite.store(memory),
      this.vector.store(memory) // Write to both
    ]);
  }
  async recall(query: RecallQuery) {
    return this.vector.recall(query); // Read from vector DB
  }
}

Step 2: Backfill historical data

typescript
async function backfillToVectorDB() {
  const allMemories = await sqlite.getAll();
  for (const memory of allMemories) {
    await vectorDB.store(memory);
  }
}

Step 3: Remove SQLite (single-write)

3.5 Performance Threshold

Brute-force is acceptable until:

  • Memory count > 100,000
  • Recall latency > 500ms (p95)
  • Memory usage > 2GB for embeddings

When to migrate:

  • If recall queries take >1 second
  • If memory table exceeds 100K rows
  • If adding an embedding index to SQLite doesn't help
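The latency thresholds above are stated as percentiles, so the check needs a percentile over recorded recall latencies. The sketch below uses the nearest-rank method and the "when to migrate" values from this section (more than 100K rows, or recall slower than 1 second at p95). `percentile` and `shouldMigrate` are hypothetical helpers, and how latencies are collected is left open.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('No samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Assumed decision rule built from this section's migration thresholds.
function shouldMigrate(rowCount: number, recallLatenciesMs: number[]): boolean {
  const tooManyRows = rowCount > 100_000;
  const tooSlow = percentile(recallLatenciesMs, 95) > 1_000;
  return tooManyRows || tooSlow;
}
```

For example, `shouldMigrate(10_000, [100, 200, 300, 400, 2000])` is true because the p95 latency is 2000 ms, even though the table is still small.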

3.6 Estimated Effort

Engineering time: 1-2 weeks
Breaking changes: None (swap implementation)
New infrastructure:

  • pgvector extension on PostgreSQL, OR
  • Qdrant instance, OR
  • ChromaDB instance

3.7 Trigger to Build This

Indicators:

  • Memory table exceeds 50K rows
  • Recall queries taking >500ms
  • Users complaining about slow memory retrieval

Milestone: When brute-force becomes measurably slow (>1s p95).


4. Multi-Repo Intelligence

Current: Memory scoped to single repository
Future: Learn patterns across all codebases a user works on

4.1 MVP Requirements to Enable This

MUST namespace memories by repo:

typescript
// memory/schema.ts
export const memories = sqliteTable('memories', {
  id: text('id').primaryKey(),
  // MVP: Add this field even though we only use one repo
  repoId: text('repo_id').notNull().default('current'),
  type: text('type').notNull(),
  content: text('content').notNull(),
  // ...
});
// Composite index for efficient filtering
// CREATE INDEX idx_repo_type ON memories(repo_id, type);

MUST support repo-scoped and global queries:

typescript
interface RecallQuery {
  context: string;
  // MVP: Always set to 'current', but the interface supports multi-repo
  scope?: 'current' | 'global' | `repo:${string}`;
  limit?: number;
}
class MemoryStore {
  async recall(query: RecallQuery): Promise<Memory[]> {
    const scope = query.scope || 'current';
    if (scope === 'global') {
      // Search across all repos
      return this.searchAllRepos(query);
    } else if (scope.startsWith('repo:')) {
      // Search specific repo (slice keeps IDs that themselves contain ':')
      const repoId = scope.slice('repo:'.length);
      return this.searchRepo(repoId, query);
    } else {
      // Search current repo (MVP default)
      return this.searchRepo(this.currentRepoId, query);
    }
  }
}

4.2 Privacy Considerations

Pattern extraction vs code storage:

typescript
interface Memory {
  content: string; // High-level pattern, NOT actual code
  context: string; // When this applies
  // ❌ NEVER store actual code from other repos
  codeSnippet?: never;
  // ✅ GOOD: Store abstract patterns
  // Example: "When implementing auth, use JWT middleware pattern"
  // Example: "Timestamp columns should be TIMESTAMPTZ in Postgres"
}

Anonymization:

typescript
async function extractPattern(episode: Episode): Promise<Memory> {
  const pattern = await llm.chat({
    system: `Extract a GENERAL pattern from this execution.
DO NOT include:
- Specific variable names from this codebase
- Business logic details
- API keys or secrets
DO include:
- Architectural patterns
- Language/framework best practices
- Common pitfalls and solutions`,
    messages: [{ role: 'user', content: episode.events }]
  });
  return {
    type: 'semantic',
    content: pattern.content,
    context: pattern.applicableWhen,
    repoId: 'global', // This pattern is universal
  };
}

4.3 Portability Classification

Which patterns are portable?

typescript
// Example patterns at each portability level (values, so declared as a constant)
const patternPortability = {
  universal: [
    'error-handling-strategies',
    'testing-patterns',
    'async-await-best-practices',
  ],
  framework_specific: [
    'react-component-patterns',
    'nextjs-routing-patterns',
    'drizzle-migration-patterns',
  ],
  project_specific: [
    'this-api-authentication-flow',
    'this-database-schema-decisions',
    'this-deployment-process',
  ],
} as const;
async function classifyPattern(memory: Memory): Promise<PortabilityLevel> {
  const classification = await llm.chat({
    system: 'Classify this pattern as universal, framework-specific, or project-specific',
    messages: [{ role: 'user', content: memory.content }]
  });
  return classification.level;
}

4.4 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Hardcode single-repo assumption
const rows = db.select().from(memories); // No repo filter
// ❌ BAD: Store repo-specific code as universal pattern
await memory.store({
  type: 'semantic',
  content: 'Use the UserService class from src/services/user.ts', // Too specific
  repoId: 'global' // Wrong!
});

Safe:

typescript
// ✅ GOOD: Always filter by repo
const rows = db.select()
  .from(memories)
  .where(eq(memories.repoId, currentRepoId));
// ✅ GOOD: Abstract patterns only
await memory.store({
  type: 'semantic',
  content: 'For user management, use a service layer to encapsulate business logic',
  repoId: 'global' // This is actually universal
});

4.5 Multi-Repo Memory Recall Strategy

typescript
async function recallWithFallback(query: RecallQuery): Promise<Memory[]> {
  // Honor the caller's limit; 5 is the default used in this sketch
  const limit = query.limit ?? 5;
  // 1. Try current repo first (most relevant)
  const repoMemories = await recall({ ...query, scope: 'current', limit });
  // 2. If not enough, fall back to global
  if (repoMemories.length < limit) {
    const globalMemories = await recall({ ...query, scope: 'global', limit });
    // Note: a global search may return current-repo memories again; dedupe if needed
    return [...repoMemories, ...globalMemories].slice(0, limit);
  }
  return repoMemories;
}

4.6 Estimated Effort

Engineering time: 2-3 weeks
Breaking changes: None if repoId field added in MVP
Components affected:

  • Memory schema (add repoId column)
  • Recall queries (add repo filtering)
  • Pattern extraction (classify portability)
  • CLI (detect repo context)

4.7 Trigger to Build This

Indicators:

  • User works on >3 repos with similar tech stacks
  • Patterns from one repo would benefit another
  • User requests "use the same pattern as in repo X"

Milestone: After 100+ learnings in a single repo, indicating pattern extraction works well.


5. Autonomous Deployment

Current: Always require human approval for production deploys
Future: Auto-deploy low-risk changes, human approval for high-risk

5.1 Automation Ladder State

MVP:

typescript
interface AutomationConfig {
  level: 0 | 1 | 2 | 3 | 4;
  // Level 0: Human does everything (not our target)
  // Level 1: AI suggests, human decides (MVP)
  // Level 2: AI acts, human reviews (post-MVP)
  // Level 3: AI acts, human notified (post-MVP)
  // Level 4: Full autonomy for low-risk (post-MVP)
}
// MVP: Always Level 1
const config = {
  automation: {
    level: 1,
    allowedActions: ['suggest', 'analyze'],
    requiredApprovals: ['deploy', 'merge', 'security-changes'],
  }
};

5.2 Earning Higher Automation Levels

Metrics required to advance:

typescript
// Thresholds are runtime values, so they are declared as constants
// (an interface cannot hold values)
// To reach Level 2 (from 1):
const level2Requirements = {
  totalRuns: 50, // Minimum experience
  successRate: 0.95, // 95% of runs succeed
  falsePositiveRate: 0.20, // <20% review findings dismissed
  humanOverrideRate: 0.05, // <5% of suggestions rejected
};
// To reach Level 3 (from 2):
const level3Requirements = {
  totalRuns: 200,
  successRate: 0.98,
  falsePositiveRate: 0.05, // <5% false positives
  missedCriticalBugs: 0, // Zero critical bugs missed
  averageHumanEditSize: 0.10, // Edits are <10% of code
};
// To reach Level 4 (from 3):
const level4Requirements = {
  totalRuns: 500,
  successRate: 0.99,
  productionIncidents: 0, // Zero incidents from auto-deploys
  rollbackRate: 0.01, // <1% of deploys rolled back
  humanInterventionRate: 0.02, // <2% require human help
};
async function evaluateAutomationLevel(): Promise<AutomationLevel> {
  const metrics = await getSystemMetrics();
  if (meetsRequirements(metrics, level4Requirements)) {
    return 4;
  } else if (meetsRequirements(metrics, level3Requirements)) {
    return 3;
  } else if (meetsRequirements(metrics, level2Requirements)) {
    return 2;
  } else {
    return 1; // Stay conservative
  }
}
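The requirement tables mix two kinds of threshold: some metrics must reach a minimum (total runs, success rate) while others must stay under a maximum (false positives, overrides, rollbacks). A `meetsRequirements` check therefore needs to know the direction of each bound. The sketch below makes that explicit. The `Bound` type, the `level2Bounds` name, and the use of inclusive comparisons are assumptions for illustration; the Level 2 numbers are the ones listed above.

```typescript
// Each requirement is either a floor or a ceiling on a named metric.
type Bound = { min: number } | { max: number };
type Requirements = Record<string, Bound>;

// Level 2 thresholds from this section, with their direction spelled out.
const level2Bounds: Requirements = {
  totalRuns: { min: 50 },
  successRate: { min: 0.95 },
  falsePositiveRate: { max: 0.20 },
  humanOverrideRate: { max: 0.05 },
};

function meetsRequirements(
  metrics: Record<string, number>,
  reqs: Requirements
): boolean {
  return Object.entries(reqs).every(([name, bound]) => {
    const value = metrics[name];
    // A missing metric counts as a failure so the system stays conservative.
    if (value === undefined) return false;
    return 'min' in bound ? value >= bound.min : value <= bound.max;
  });
}
```

Treating a missing metric as a failure means a level can never be earned from incomplete data, which matches the "stay conservative" default in the evaluation function.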

5.3 Risk-Based Deployment Strategy

typescript
interface DeploymentDecision {
  risk: RiskLevel;
  automation: AutomationLevel;
  // Decision matrix
  strategy: DeploymentStrategy;
}
function selectDeploymentStrategy(
  risk: RiskLevel,
  automation: AutomationLevel
): DeploymentStrategy {
  // At Level 1 (MVP): Everything requires approval
  if (automation === 1) {
    return { type: 'manual', requireApproval: true };
  }
  // At Level 2: Low-risk auto-deploys to staging
  if (automation === 2) {
    if (risk === 'low') {
      return { type: 'auto-staging', requireApproval: false };
    } else {
      return { type: 'manual', requireApproval: true };
    }
  }
  // At Level 3: Auto-deploy low/medium with notification
  if (automation === 3) {
    if (risk === 'low' || risk === 'medium') {
      return { type: 'auto-notify', requireApproval: false, notifyHuman: true };
    } else {
      return { type: 'manual', requireApproval: true };
    }
  }
  // At Level 4: Full autonomy for low-risk, gated for high-risk
  if (automation === 4) {
    if (risk === 'low') {
      return { type: 'fully-auto', requireApproval: false };
    } else if (risk === 'medium') {
      return { type: 'auto-notify', requireApproval: false, notifyHuman: true };
    } else {
      return { type: 'manual', requireApproval: true };
    }
  }
  // Level 0 or any unrecognized level: fall back to manual approval
  return { type: 'manual', requireApproval: true };
}

5.4 Canary Automation

Post-MVP: Auto-promote or auto-rollback based on health:

typescript
// Declared as classes because TypeScript interfaces cannot contain method bodies
class CanaryDeployment {
  stages: CanaryStage[];
  healthChecks: HealthCheck[];
  async deploy(artifact: Artifact): Promise<DeploymentResult> {
    for (const stage of this.stages) {
      // Deploy to stage (e.g., 5% of traffic)
      await this.deployToStage(stage, artifact);
      // Wait for bake time
      await this.wait(stage.bakeTime);
      // Check health
      const health = await this.checkHealth(stage);
      if (!health.healthy) {
        // Auto-rollback
        await this.rollback(stage);
        return { status: 'rolled-back', reason: health.issues };
      }
      // Auto-promote to next stage
    }
    return { status: 'deployed' };
  }
}
class HealthCheck {
  metric: 'error_rate' | 'latency' | 'throughput';
  baseline: number;
  threshold: number; // e.g., 1.2x baseline
  async check(deployment: Deployment): Promise<boolean> {
    const current = await this.measure(deployment);
    return current < this.baseline * this.threshold;
  }
}
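The `check` comparison above (`current < baseline * threshold`) fits metrics where lower is better, such as error rate and latency. For throughput, higher is better, so the same comparison would pass a canary whose throughput had collapsed. A hedged sketch of a direction-aware comparison follows. `isHealthy` and its `tolerance` parameter are hypothetical names, and dividing the baseline by the tolerance for throughput is one possible convention.

```typescript
type Metric = 'error_rate' | 'latency' | 'throughput';

// tolerance is a multiplier such as 1.2 (allow a 20% regression).
function isHealthy(
  metric: Metric,
  baseline: number,
  current: number,
  tolerance: number
): boolean {
  if (metric === 'throughput') {
    // Higher is better: unhealthy when throughput drops below baseline / tolerance.
    return current >= baseline / tolerance;
  }
  // Lower is better: unhealthy when the metric exceeds baseline * tolerance.
  return current <= baseline * tolerance;
}
```

With a tolerance of 1.2, a latency of 110 ms against a 100 ms baseline is healthy, while throughput of 500 requests per second against a baseline of 1000 is not.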

5.5 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Hardcode approval requirements
async function deploy(artifact: Artifact) {
  // This won't work when we add automation levels
  const approved = await requestHumanApproval();
  if (!approved) throw new Error('Not approved');
  await executeDeploy(artifact);
}

Safe:

typescript
// ✅ GOOD: Strategy-based deployment
async function deploy(artifact: Artifact, risk: RiskLevel) {
  const strategy = selectDeploymentStrategy(risk, config.automation.level);
  if (strategy.requireApproval) {
    const approved = await requestHumanApproval();
    if (!approved) throw new Error('Not approved');
  }
  if (strategy.notifyHuman) {
    notifyHuman('Deploying', artifact);
  }
  await executeDeploy(artifact, strategy);
}

5.6 Storing Automation State

typescript
// Add to schema.ts
export const automationState = sqliteTable('automation_state', {
  id: text('id').primaryKey(),
  currentLevel: integer('current_level').notNull().default(1),
  // Track metrics for level advancement
  totalRuns: integer('total_runs').notNull().default(0),
  successfulRuns: integer('successful_runs').notNull().default(0),
  falsePositives: integer('false_positives').notNull().default(0),
  criticalMisses: integer('critical_misses').notNull().default(0),
  // Last level evaluation
  lastEvaluated: integer('last_evaluated', { mode: 'timestamp_ms' }),
  levelAdvancedAt: integer('level_advanced_at', { mode: 'timestamp_ms' }),
});

5.7 Estimated Effort

Engineering time: 3-4 weeks
Breaking changes: None
Components affected:

  • Deployment strategy selector (new)
  • Automation metrics tracker (new)
  • Canary health checks (new)
  • Human notification system (extend)

5.8 Trigger to Build This

Indicators:

  • System achieves Level 2 requirements (50+ runs, 95% success rate)
  • Users request faster deploys for low-risk changes
  • Manual approval becomes a bottleneck

Milestone: Never auto-enable. Require explicit opt-in even after metrics are met.
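One way to encode the opt-in rule is to keep the level earned by metrics separate from the level the user has opted into, and to act on the lower of the two. This is a minimal sketch; `effectiveLevel` and the idea of storing an opted-in level are assumptions, not part of the schema above.

```typescript
type AutomationLevel = 0 | 1 | 2 | 3 | 4;

// The system may only act at a level that is both earned and explicitly enabled.
function effectiveLevel(
  earned: AutomationLevel,
  optedIn: AutomationLevel
): AutomationLevel {
  return Math.min(earned, optedIn) as AutomationLevel;
}
```

For instance, a system that has earned Level 3 but whose user has only opted into Level 1 keeps operating at Level 1, and opting into Level 4 before it is earned has no effect.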


6. Natural Language Requirements

Current: Structured task descriptions (e.g., "Add user authentication with JWT")
Future: Plain English feature requests (e.g., "Users should be able to log in")

6.1 MVP Planner Agent Design

MVP: Assume well-formed requirements:

typescript
interface PlannerInput {
  task: string; // Specific, actionable: "Add JWT auth middleware"
  constraints?: string[];
  acceptanceCriteria?: string[];
}
async function plan(input: PlannerInput): Promise<ImplementationPlan> {
  // Assume task is clear and unambiguous
  const architecture = await designArchitecture(input.task);
  const tasks = await decompose(input.task, architecture);
  return { architecture, tasks };
}

6.2 Post-MVP: Ambiguity Handling

Clarification loop:

typescript
interface PlannerInput {
  task: string; // Vague: "Users should be able to log in"
  constraints?: string[];
  acceptanceCriteria?: string[];
}
async function planWithClarification(input: PlannerInput): Promise<ImplementationPlan> {
  // 1. Analyze for ambiguity
  const ambiguities = await detectAmbiguities(input.task);
  let task = input.task;
  if (ambiguities.length > 0) {
    // 2. Ask clarifying questions
    const questions = ambiguities.map(a => a.question);
    const answers = await askHuman(questions);
    // 3. Refine requirements (without mutating the caller's input)
    task = await refineRequirements(input.task, answers);
  }
  // 4. Proceed with planning
  return plan({ ...input, task });
}
interface Ambiguity {
  aspect: string; // e.g., "authentication method"
  question: string; // "Should we use JWT, OAuth, or session cookies?"
  options: string[];
}
async function detectAmbiguities(task: string): Promise<Ambiguity[]> {
  const analysis = await llm.chat({
    system: `Analyze this task for ambiguities.
Identify aspects that have multiple valid interpretations.
Generate clarifying questions.`,
    messages: [{ role: 'user', content: task }]
  });
  return analysis.ambiguities;
}

6.3 Figma/Design Integration

Post-MVP: Visual requirements:

typescript
interface VisualRequirement {
  type: 'figma' | 'screenshot' | 'mockup';
  url?: string;
  image?: Buffer;
}
interface PlannerInput {
  task: string;
  visual?: VisualRequirement;
}
async function planWithVisual(input: PlannerInput): Promise<ImplementationPlan> {
  let task = input.task;
  if (input.visual) {
    // Extract requirements from design
    const extracted = await extractFromVisual(input.visual);
    // Merge with text requirements (without mutating the caller's input)
    task = mergeRequirements(input.task, extracted);
  }
  return plan({ ...input, task });
}
async function extractFromVisual(visual: VisualRequirement): Promise<ExtractedRequirements> {
  // Use vision model (GPT-4V, Claude with vision, etc.)
  const analysis = await visionModel.analyze(visual.image, {
    prompt: `Extract UI requirements from this design:
- Layout structure
- Components needed
- Interactions (buttons, forms, etc.)
- Styling details`
  });
  return {
    components: analysis.components,
    layout: analysis.layout,
    interactions: analysis.interactions,
  };
}

6.4 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Assume requirements are always clear
async function plan(task: string) {
  // No validation: will produce garbage for vague input
  return decompose(task);
}
// ❌ BAD: Hardcode structured input format
interface PlannerInput {
  title: string;
  description: string;
  acceptanceCriteria: string[]; // Forces structured input
}

Safe:

typescript
// ✅ GOOD: Accept string, validate quality
async function plan(input: string | PlannerInput) {
  const task = typeof input === 'string' ? input : input.task;
  // Validate requirement quality
  const quality = await assessRequirementQuality(task);
  if (quality.score < 0.7) {
    // Request clarification, then re-plan with the clarified task
    const clarified = await requestClarification(task, quality.gaps);
    return plan(clarified);
  }
  return executePlanning(task);
}

6.5 Estimated Effort

Engineering time: 3-4 weeks
Breaking changes: None (input type can be string or structured)
Components affected:

  • Planner agent (add clarification loop)
  • Human interaction (add Q&A workflow)
  • Vision model integration (new)

6.6 Trigger to Build This

Indicators:

  • Users frequently provide vague requirements
  • Planner produces poor plans due to ambiguity
  • Users request Figma integration

Milestone: After 200+ successful plans from well-formed requirements.


7. Real-Time Dashboards

Current: CLI output with progress indicators
Future: Web UI with live updates, cost tracking, memory browser

7.1 MVP CLI Requirements

MUST provide rich event stream:

typescript
// The events table already captures everything
// CLI just needs to subscribe and render
class CLIRenderer {
  async watchPipeline(traceId: string) {
    bus.on('*', (event: ForgeEvent) => {
      if (event.traceId === traceId) {
        this.render(event);
      }
    });
  }
  private render(event: ForgeEvent) {
    switch (event.type) {
      case 'phase.entered':
        console.log(`\n▶ Entering phase: ${event.payload.phase}`);
        break;
      case 'agent.iteration':
        process.stdout.write('.');
        break;
      case 'finding.detected':
        console.log(`${event.payload.message}`);
        break;
      // ...
    }
  }
}

Event stream is the source of truth — CLI and dashboard consume the same data.
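Because both consumers read the same events, the current pipeline state can be derived by folding over the event list rather than stored separately. The sketch below shows one such reducer. `derivePipelineStatus`, `PhaseStatus`, and the `phase.exited` event type are assumptions for illustration; only `phase.entered` appears in the renderer above.

```typescript
type PhaseStatus = 'pending' | 'running' | 'done';

// Simplified, assumed event shape for this sketch.
interface ForgeEvent {
  traceId: string;
  type: string;
  payload: { phase?: string };
}

const PHASES = ['planning', 'implementation', 'review', 'testing', 'deployment'] as const;

// Fold the event stream into a per-phase status map.
function derivePipelineStatus(events: ForgeEvent[]): Record<string, PhaseStatus> {
  const status: Record<string, PhaseStatus> = {};
  for (const phase of PHASES) {
    status[phase] = 'pending';
  }
  for (const event of events) {
    const phase = event.payload.phase;
    if (!phase) continue;
    if (event.type === 'phase.entered') status[phase] = 'running';
    if (event.type === 'phase.exited') status[phase] = 'done'; // hypothetical event type
  }
  return status;
}

const status = derivePipelineStatus([
  { traceId: 't1', type: 'phase.entered', payload: { phase: 'planning' } },
  { traceId: 't1', type: 'phase.exited', payload: { phase: 'planning' } },
  { traceId: 't1', type: 'phase.entered', payload: { phase: 'implementation' } },
]);
```

The same pure function could back a CLI progress line and a dashboard hook, and replaying stored events through it reconstructs the state of a past run.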

7.2 Post-MVP Web Dashboard Architecture

Tech stack (matches user's preference):

typescript
// Use TanStack Start since it's in the user's stack
forge-dashboard/
├── app/
│   ├── routes/
│   │   ├── index.tsx          // Dashboard home
│   │   ├── runs.$runId.tsx    // Run detail page
│   │   ├── memory.tsx         // Memory browser
│   │   └── metrics.tsx        // Cost & quality trends
│   ├── components/
│   │   ├── pipeline-viz.tsx   // Pipeline state visualization
│   │   ├── event-stream.tsx   // Live event feed
│   │   └── cost-chart.tsx     // Cost over time
│   └── lib/
│       └── api.ts             // API client for Forge backend

WebSocket streaming:

typescript
// Backend: Expose WebSocket endpoint
import { WebSocketServer } from 'ws';
const wss = new WebSocketServer({ port: 3001 });
wss.on('connection', (ws) => {
  // Subscribe client to event bus
  const unsubscribe = bus.on('*', (event) => {
    ws.send(JSON.stringify(event));
  });
  ws.on('close', () => {
    unsubscribe();
  });
});
// Frontend: Subscribe to events
import { useEffect, useState } from 'react';
function useEventStream(traceId: string) {
  const [events, setEvents] = useState<ForgeEvent[]>([]);
  useEffect(() => {
    const ws = new WebSocket('ws://localhost:3001');
    ws.onmessage = (msg) => {
      const event = JSON.parse(msg.data);
      if (event.traceId === traceId) {
        setEvents(prev => [...prev, event]);
      }
    };
    return () => ws.close();
  }, [traceId]);
  return events;
}

7.3 Dashboard Components

Pipeline Status:

typescript
function PipelineVisualization({ traceId }: { traceId: string }) {
  const events = useEventStream(traceId);
  const status = usePipelineStatus(events);

  return (
    <div className="pipeline">
      {['planning', 'implementation', 'review', 'testing', 'deployment'].map(phase => (
        <Phase
          key={phase}
          name={phase}
          status={status[phase]}
          events={events.filter(e => e.phase === phase)}
        />
      ))}
    </div>
  );
}
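`usePipelineStatus` is not defined in this plan. As a minimal sketch of the reduction it might wrap, the function below folds an event list into a per-phase status. The event types `phase.completed` and `phase.failed`, the `PhaseStatus` values, and the trimmed `PhaseEvent` shape are assumptions made for illustration; only `phase.entered` appears elsewhere in this document.

```typescript
// Hypothetical names and shapes, trimmed to what this sketch needs.
type PhaseName = 'planning' | 'implementation' | 'review' | 'testing' | 'deployment';
type PhaseStatus = 'pending' | 'running' | 'done' | 'failed';

interface PhaseEvent {
  type: string;       // 'phase.completed' and 'phase.failed' are assumed types
  phase?: PhaseName;  // events without a phase are ignored
}

// Pure reduction over the event list: later events overwrite earlier ones,
// so the result reflects the most recent state of each phase.
function derivePipelineStatus(events: PhaseEvent[]): Record<PhaseName, PhaseStatus> {
  const status: Record<PhaseName, PhaseStatus> = {
    planning: 'pending',
    implementation: 'pending',
    review: 'pending',
    testing: 'pending',
    deployment: 'pending',
  };
  for (const event of events) {
    if (!event.phase) continue;
    if (event.type === 'phase.entered') status[event.phase] = 'running';
    else if (event.type === 'phase.completed') status[event.phase] = 'done';
    else if (event.type === 'phase.failed') status[event.phase] = 'failed';
  }
  return status;
}
```

Because the function is pure, the same code could back both the CLI renderer and a memoized React hook.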

Cost Tracking:

typescript
function CostDashboard() {
  const { data: runs } = useQuery({
    queryKey: ['runs'],
    queryFn: () => fetch('/api/runs').then(r => r.json()),
  });

  const totalCost = runs?.reduce((sum, r) => sum + r.totalCostUsd, 0) || 0;
  const avgCostPerRun = totalCost / (runs?.length || 1);

  return (
    <div>
      <Metric label="Total Cost" value={`$${totalCost.toFixed(2)}`} />
      <Metric label="Avg per Run" value={`$${avgCostPerRun.toFixed(2)}`} />
      <CostChart data={runs} />
    </div>
  );
}

Memory Browser:

typescript
function MemoryBrowser() {
  const [query, setQuery] = useState('');

  const { data: memories } = useQuery({
    queryKey: ['memories', query],
    // Encode the query so spaces, '&' and '#' don't corrupt the URL
    queryFn: () =>
      fetch(`/api/memory/search?q=${encodeURIComponent(query)}`).then(r => r.json()),
    enabled: query.length > 0,
  });

  return (
    <div>
      <SearchInput value={query} onChange={setQuery} />
      <MemoryList memories={memories} />
    </div>
  );
}

7.4 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Render output directly in agent code
class PlannerAgent {
  async execute(input) {
    console.log('Starting planning...'); // Couples agent to CLI
  }
}

// ❌ BAD: Store UI state in agent
class PlannerAgent {
  private progressBar: ProgressBar; // UI concerns in agent
}

Safe:

typescript
// ✅ GOOD: Emit events, let consumers render
class PlannerAgent {
  // The bus arrives via the read-only AgentContext, not as instance state
  async execute(input, ctx) {
    await ctx.bus.emit({
      type: 'planning.started',
      payload: { task: input.task },
    });
    // Agent doesn't know or care who's listening
  }
}

7.5 Estimated Effort

Engineering time: 2-3 weeks
Breaking changes: None (agents already emit events)
New components:

  • WebSocket server
  • TanStack Start dashboard app
  • Dashboard components

7.6 Trigger to Build This

Indicators:

  • Users want to monitor long-running pipelines remotely
  • Teams want shared visibility into agent activity
  • CLI output is insufficient for debugging

Milestone: After 100+ CLI users, or when the first team adopts Forge.


8. ClickHouse / Kafka Migration

Current: SQLite events table, in-memory event bus
Future: ClickHouse for analytics, Kafka for event streaming

8.1 Performance Thresholds

SQLite is fine until:

  • Events table exceeds 1M rows
  • Event insertion latency > 50ms (p95)
  • Analytics queries take > 5 seconds

When to migrate:

  • If the events table grows beyond 10 GB
  • If multiple processes need to share events (distributed deployment)
  • If real-time analytics are needed (dashboards querying events table)
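The "SQLite is fine until" thresholds are easy to check mechanically. The sketch below is one possible shape for that check, assuming a hypothetical `EventStoreMetrics` snapshot (how the samples are collected is out of scope) and a nearest-rank p95. The names `EventStoreMetrics`, `p95`, and `exceededThresholds` are illustrative, not part of the design.

```typescript
// Hypothetical metrics snapshot for the events table.
interface EventStoreMetrics {
  rowCount: number;
  insertLatenciesMs: number[];      // recent insertion latency samples
  slowestAnalyticsQueryMs: number;
}

// Nearest-rank 95th percentile; integer arithmetic avoids float rounding in the rank.
function p95(samples: number[]): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1] ?? 0;
}

// Returns which of the thresholds listed in 8.1 the snapshot exceeds.
function exceededThresholds(m: EventStoreMetrics): string[] {
  const reasons: string[] = [];
  if (m.rowCount > 1_000_000) reasons.push('rows > 1M');
  if (p95(m.insertLatenciesMs) > 50) reasons.push('insert p95 > 50ms');
  if (m.slowestAnalyticsQueryMs > 5_000) reasons.push('analytics query > 5s');
  return reasons;
}
```

An empty result means none of the listed thresholds is exceeded and SQLite is still within its comfortable range.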

8.2 MVP Event Bus Design

Already abstracted:

typescript
interface EventBus {
  emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void>;
  on(type: string, handler: EventHandler): () => void;
  replay(traceId: string): Promise<ForgeEvent[]>;
}

This interface works for:

  • In-memory (MVP)
  • Redis Pub/Sub (K8s deployment)
  • Kafka (high-volume production)
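For comparison with the Kafka variant, here is a minimal sketch of what the in-memory MVP implementation of this interface could look like. The trimmed `ForgeEvent` shape, the counter-based id, and the in-process `log` array (standing in for the SQLite events table) are assumptions for illustration, not the actual MVP code.

```typescript
// Minimal stand-ins for types defined elsewhere in the design.
interface ForgeEvent {
  id: string;
  timestamp: Date;
  traceId: string;
  type: string;
  payload: Record<string, unknown>;
}
type EventHandler = (event: ForgeEvent) => void;

interface EventBus {
  emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void>;
  on(type: string, handler: EventHandler): () => void;
  replay(traceId: string): Promise<ForgeEvent[]>;
}

class InMemoryEventBus implements EventBus {
  private handlers = new Map<string, Set<EventHandler>>();
  private log: ForgeEvent[] = [];  // stands in for the SQLite events table
  private nextId = 0;              // counter id for the sketch only

  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void> {
    // Freeze so handlers cannot mutate an event after it is emitted
    const full: ForgeEvent = Object.freeze({
      ...event,
      id: String(++this.nextId),
      timestamp: new Date(),
    });
    this.log.push(full);
    // Deliver to exact-type subscribers, then to wildcard subscribers
    [full.type, '*'].forEach(key => {
      this.handlers.get(key)?.forEach(handler => handler(full));
    });
  }

  on(type: string, handler: EventHandler): () => void {
    const set = this.handlers.get(type) ?? new Set<EventHandler>();
    set.add(handler);
    this.handlers.set(type, set);
    return () => { set.delete(handler); };
  }

  async replay(traceId: string): Promise<ForgeEvent[]> {
    return this.log.filter(e => e.traceId === traceId);
  }
}
```

Callers that only depend on `EventBus` should not notice when this class is swapped for a Redis- or Kafka-backed one, provided they never rely on handlers having run by the time `emit` resolves.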

8.3 Kafka Implementation

typescript
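import { Producer, Consumer } from 'kafkajs';
import { ClickHouseClient } from '@clickhouse/client';
import { ulid } from 'ulid';

class KafkaEventBus implements EventBus {
  // One consumer loop dispatches to every registered handler
  private handlers = new Set<{ type: string; handler: EventHandler }>();

  constructor(
    private producer: Producer,
    private consumer: Consumer,
    private clickhouse: ClickHouseClient,
  ) {}

  // Call once at startup: subscribe and start the consumer a single time,
  // rather than once per on() call
  async start() {
    await this.consumer.subscribe({ topic: 'forge-events' });
    await this.consumer.run({
      eachMessage: async ({ message }) => {
        if (!message.value) return;
        const event: ForgeEvent = JSON.parse(message.value.toString());
        for (const { type, handler } of this.handlers) {
          if (type === '*' || event.type === type) handler(event);
        }
      },
    });
  }

  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>) {
    const full: ForgeEvent = { ...event, id: ulid(), timestamp: new Date() };
    await this.producer.send({
      topic: 'forge-events',
      // Keying by traceId keeps one run's events in a single partition
      messages: [{ key: full.traceId, value: JSON.stringify(full) }],
    });
  }

  on(type: string, handler: EventHandler) {
    const entry = { type, handler };
    this.handlers.add(entry);
    return () => { this.handlers.delete(entry); };
  }

  async replay(traceId: string): Promise<ForgeEvent[]> {
    // Query ClickHouse instead of SQLite
    const result = await this.clickhouse.query({
      query: 'SELECT * FROM events WHERE trace_id = {traceId: String} ORDER BY timestamp',
      query_params: { traceId },
      format: 'JSONEachRow',
    });
    return result.json();
  }
}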

8.4 ClickHouse Schema

sql
-- ClickHouse (optimized for analytics)
CREATE TABLE events (
  id String,
  trace_id String,
  timestamp DateTime64(3),
  source String,
  type String,
  phase String,
  payload String,           -- JSON
  tokens_used UInt32,
  cost_usd Decimal(10, 4),
  duration_ms UInt32,
  -- ClickHouse-specific optimizations
  date Date MATERIALIZED toDate(timestamp)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (trace_id, timestamp);

-- Fast queries:
SELECT type, count(*) FROM events WHERE date >= today() - 7 GROUP BY type;
SELECT avg(cost_usd), sum(tokens_used) FROM events WHERE trace_id = '...';

8.5 Migration Path

Step 1: Dual-write

typescript
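class DualWriteEventBus implements EventBus {
  constructor(
    private sqlite: SQLiteEventBus,
    private kafka: KafkaEventBus
  ) {}

  // Caveat: if each bus assigns its own id/timestamp inside emit(), the two
  // stores will disagree; assign them once before fanning out.
  async emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>) {
    await Promise.all([
      this.sqlite.emit(event),
      this.kafka.emit(event),
    ]);
  }

  // Subscriptions stay on the existing bus until cutover
  on(type: string, handler: EventHandler) {
    return this.sqlite.on(type, handler);
  }

  // Gradually shift reads from SQLite → ClickHouse
  async replay(traceId: string) {
    try {
      return await this.kafka.replay(traceId); // Prefer new system
    } catch (err) {
      return await this.sqlite.replay(traceId); // Fallback to old
    }
  }
}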

Step 2: Backfill historical events

typescript
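// Simplest form. For large tables, page through SQLite and insert in
// batches rather than loading everything and inserting row by row.
async function backfillToClickHouse() {
  const events = await sqlite.query('SELECT * FROM events');
  for (const event of events) {
    await clickhouse.insert('events', event);
  }
}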

Step 3: Single-write to Kafka/ClickHouse

8.6 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: SQL queries directly in agents
const rows = await db.select().from(events).where(eq(events.traceId, traceId));

// ❌ BAD: Assuming events table is in same DB as memories
const result = await db.query(`
  SELECT e.*, m.content
  FROM events e
  JOIN memories m ON e.source = m.id
`); // Won't work if events move to ClickHouse

Safe:

typescript
// ✅ GOOD: Use EventBus interface
const events = await bus.replay(traceId);

// ✅ GOOD: Keep events and memories separate, each behind its own interface
const runEvents = await bus.replay(traceId);
const memories = await memory.recall(query);

8.7 Estimated Effort

Engineering time: 3-4 weeks
Breaking changes: None if EventBus interface is used consistently
New infrastructure:

  • Kafka cluster
  • ClickHouse instance
  • Schema registry (for Kafka)

8.8 Trigger to Build This

Indicators:

  • Events table exceeds 1M rows
  • Analytics queries taking >5 seconds
  • Multiple Forge instances need shared event log

Milestone: When SQLite bottlenecks are measurable.


9. Extension Architecture

Current: Hardcoded agents, tools, providers
Future: Plugin system for custom agents, tools, integrations

9.1 MVP Abstraction Boundaries

Already designed for extensibility:

typescript
// 1. Tool interface: anyone can implement a tool
interface Tool<TInput, TOutput> {
  name: string;
  description: string;
  schema: { input: ZodSchema<TInput>; output: ZodSchema<TOutput> };
  execute(input: TInput, ctx: ToolContext): Promise<TOutput>;
}

// 2. Agent interface: anyone can implement an agent
interface Agent {
  id: string;
  type: AgentType;
  execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput>;
}

// 3. LLM provider interface: swap providers
interface LLMProvider {
  chat(request: ChatRequest): Promise<ChatResponse>;
  embed(text: string): Promise<Float32Array>;
}

9.2 Plugin System Design

Plugin manifest:

typescript
// forge-plugin-jira/plugin.json
{
  "name": "forge-plugin-jira",
  "version": "1.0.0",
  "type": "tool",
  "entry": "./dist/index.js",
  "provides": {
    "tools": ["jira_create_issue", "jira_search"],
    "agents": [],
    "integrations": ["jira"]
  }
}

// forge-plugin-jira/src/index.ts
import { z } from 'zod';
import { definePlugin, Tool } from '@forge/plugin-api';

export default definePlugin({
  tools: [
    {
      name: 'jira_create_issue',
      description: 'Create a Jira issue',
      schema: {
        input: z.object({
          project: z.string(),
          summary: z.string(),
          description: z.string(),
        }),
        output: z.object({
          issueKey: z.string(),
          url: z.string(),
        }),
      },
      async execute(input, ctx) {
        // JiraClient stands in for whichever Jira SDK the plugin wraps
        const jira = new JiraClient(ctx.config.jiraUrl);
        const issue = await jira.createIssue(input);
        return { issueKey: issue.key, url: issue.url };
      },
    },
  ],
});

Plugin loader:

typescript
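import fs from 'node:fs/promises';
import path from 'node:path';
import { pathToFileURL } from 'node:url';

class PluginManager {
  private plugins = new Map<string, Plugin>();

  constructor(
    private toolRegistry: ToolRegistry,
    private agentRegistry: AgentRegistry,
  ) {}

  async load(pluginPath: string): Promise<void> {
    // Read manifest (plain JSON, so parse it rather than import it)
    const manifest = JSON.parse(
      await fs.readFile(path.join(pluginPath, 'plugin.json'), 'utf8'),
    );

    // Load module; `entry` is relative to the plugin directory
    const entryUrl = pathToFileURL(path.resolve(pluginPath, manifest.entry)).href;
    const module = await import(entryUrl);
    const plugin = module.default;

    // Register tools and agents; a plugin may provide only one kind
    for (const tool of plugin.tools ?? []) {
      this.toolRegistry.register(tool);
    }
    for (const agent of plugin.agents ?? []) {
      this.agentRegistry.register(agent);
    }

    this.plugins.set(manifest.name, plugin);
  }

  async loadAllPlugins() {
    const pluginDirs = await fs.readdir('./plugins');
    await Promise.all(pluginDirs.map(dir => this.load(`./plugins/${dir}`)));
  }
}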

9.3 Custom Agent Registration

typescript
// forge-plugin-custom-reviewer/src/index.ts
import { definePlugin, Agent } from '@forge/plugin-api';
// Plugin-local helper holding the custom review logic (not shown)
import { runSecurityChecks } from './checks';

export default definePlugin({
  agents: [
    {
      id: 'custom-security-reviewer',
      type: 'reviewer',
      async execute(input, ctx) {
        // Custom security review logic
        const findings = await runSecurityChecks(input.code);
        return {
          approved: findings.length === 0,
          findings,
        };
      },
    },
  ],
});

// In forge.config.ts, user selects which reviewer to use
export default defineConfig({
  agents: {
    reviewer: 'custom-security-reviewer', // Instead of default
  },
});

9.4 Webhook System

For external integrations:

typescript
interface WebhookConfig {
  events: string[]; // Which event types to forward
  url: string;
  headers?: Record<string, string>;
  transform?: (event: ForgeEvent) => unknown; // Optional transformation
}

class WebhookManager {
  private webhooks: WebhookConfig[] = [];

  async init() {
    // Subscribe to all events
    bus.on('*', async (event) => {
      // Forward to matching webhooks
      const matching = this.webhooks.filter(w =>
        w.events.includes(event.type) || w.events.includes('*')
      );
      // allSettled: one failing endpoint must not block the others
      await Promise.allSettled(matching.map(w => this.sendWebhook(w, event)));
    });
  }

  private async sendWebhook(webhook: WebhookConfig, event: ForgeEvent) {
    const payload = webhook.transform ? webhook.transform(event) : event;
    await fetch(webhook.url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        ...webhook.headers,
      },
      body: JSON.stringify(payload),
    });
  }
}

// In forge.config.ts
export default defineConfig({
  webhooks: [
    {
      events: ['run.completed', 'deployment.success'],
      url: 'https://slack.com/api/webhook/...',
      transform: (event) => ({
        text: `Forge run completed: ${event.payload.task}`,
      }),
    },
  ],
});

9.5 API Surface for Third-Party Consumers

typescript
// Expose HTTP API for external consumers
import { Hono } from 'hono';

const app = new Hono();

// Start a run
app.post('/api/runs', async (c) => {
  const { task } = await c.req.json();
  const traceId = await orchestrator.startRun(task);
  return c.json({ traceId });
});

// Get run status
app.get('/api/runs/:traceId', async (c) => {
  const { traceId } = c.req.param();
  // select() returns an array; take the first row or report a missing run
  const [run] = await db.select().from(runs).where(eq(runs.id, traceId));
  if (!run) return c.json({ error: 'Run not found' }, 404);
  return c.json(run);
});

// Get run events
app.get('/api/runs/:traceId/events', async (c) => {
  const { traceId } = c.req.param();
  const events = await bus.replay(traceId);
  return c.json(events);
});

// Get memories
app.get('/api/memory', async (c) => {
  const { q } = c.req.query();
  const memories = await memory.recall({ context: q });
  return c.json(memories);
});

9.6 MVP Must NOT Do

Forbidden:

typescript
// ❌ BAD: Hardcode tool list
const tools = [gitTool, lintTool, testTool]; // Can't add plugins

// ❌ BAD: Tight coupling to specific implementations
import { PlannerAgent } from './agents/planner';
const planner = new PlannerAgent(); // Can't swap

Safe:

typescript
// ✅ GOOD: Registry-based tool discovery
const tools = toolRegistry.getAll();

// ✅ GOOD: Agent factory
const planner = agentRegistry.get(config.agents.planner);
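The effort estimate for this section assumes the registries already exist in the MVP. A minimal sketch of one generic registry that could serve both tools (keyed by `name`) and agents (keyed by `id`) follows. The class name `Registry` and its `keyOf` parameter are hypothetical; the plan only names `toolRegistry` and `agentRegistry` with `register`, `get`, and `getAll`.

```typescript
// Generic keyed registry; `keyOf` extracts the lookup key from an item.
class Registry<T> {
  private items = new Map<string, T>();
  private keyOf: (item: T) => string;

  constructor(keyOf: (item: T) => string) {
    this.keyOf = keyOf;
  }

  register(item: T): void {
    const key = this.keyOf(item);
    // Fail loudly so two plugins cannot silently shadow each other
    if (this.items.has(key)) throw new Error(`Duplicate registration: ${key}`);
    this.items.set(key, item);
  }

  get(key: string): T {
    const item = this.items.get(key);
    if (item === undefined) throw new Error(`Not registered: ${key}`);
    return item;
  }

  getAll(): T[] {
    return Array.from(this.items.values());
  }
}

// Usage with trimmed stand-in shapes for tools and agents
const exampleToolRegistry = new Registry<{ name: string }>(tool => tool.name);
const exampleAgentRegistry = new Registry<{ id: string }>(agent => agent.id);
exampleToolRegistry.register({ name: 'git' });
exampleAgentRegistry.register({ id: 'planner' });
```

Rejecting duplicates is a design choice for the sketch; a plugin system that allows overriding built-in agents would need an explicit replace operation instead.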

9.7 Estimated Effort

Engineering time: 3-4 weeks
Breaking changes: None if registries already exist
Components:

  • Plugin loader
  • Plugin API types
  • Webhook manager
  • HTTP API server

9.8 Trigger to Build This

Indicators:

  • Users request custom tools (e.g., Jira, Linear, Notion)
  • Users want to customize agents
  • Third-party tools want to integrate with Forge

Milestone: After core agents/tools are stable (200+ runs).


10. Seams and Abstractions

This section catalogs every abstraction boundary in the MVP and what it future-proofs.

10.1 Tool Abstraction

Interface:

typescript
interface Tool<TInput, TOutput> {
  name: string;
  description: string;
  schema: { input: ZodSchema<TInput>; output: ZodSchema<TOutput> };
  execute(input: TInput, ctx: ToolContext): Promise<TOutput>;
}

Future-proofs:

  • Plugin tools (9. Extension Architecture)
  • Remote tools (tools run in separate processes/containers)
  • Tool versioning (same name, different versions)
  • Tool authentication (ctx provides credentials)

Must NOT:

  • Return non-serializable objects (e.g., class instances) — breaks remote tools
  • Mutate global state — breaks parallel execution
  • Assume local filesystem — breaks containerization
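The "no non-serializable objects" rule can be enforced in tests or at the tool boundary. The sketch below is one possible strict check, with the hypothetical name `isJsonSerializable`: it accepts only plain objects, arrays, strings, finite numbers, booleans, and null, and rejects anything JSON would drop or distort.

```typescript
// Strict JSON-safety check. `path` tracks the objects currently being
// visited so that circular references are rejected, while the same object
// appearing twice in different branches is still accepted.
function isJsonSerializable(value: unknown, path: Set<object> = new Set()): boolean {
  if (value === null) return true;
  switch (typeof value) {
    case 'string':
    case 'boolean':
      return true;
    case 'number':
      return isFinite(value as number);   // NaN and Infinity become null in JSON
    case 'object': {
      const obj = value as object;
      if (path.has(obj)) return false;    // circular reference
      const proto = Object.getPrototypeOf(obj);
      const isPlain = proto === Object.prototype || proto === null;
      if (!Array.isArray(obj) && !isPlain) return false;  // class instance, Date, Map, ...
      path.add(obj);
      const children: unknown[] = Array.isArray(obj)
        ? obj
        : Object.keys(obj).map(k => (obj as Record<string, unknown>)[k]);
      const ok = children.every(child => isJsonSerializable(child, path));
      path.delete(obj);
      return ok;
    }
    default:
      return false;                       // undefined, function, symbol, bigint
  }
}
```

The same check applies to event payloads and checkpoint state, which carry the same serializability constraint.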

10.2 LLM Provider Abstraction

Interface:

typescript
interface LLMProvider {
  chat(request: ChatRequest): Promise<ChatResponse>;
  embed(text: string): Promise<Float32Array>;
}

Future-proofs:

  • Swap Claude → OpenAI → Ollama → custom model
  • Multi-provider routing (cheap model for simple tasks, expensive for complex)
  • Provider fallback (if primary fails, try secondary)
  • Local model support (Ollama, LlamaCPP)

Must NOT:

  • Use provider-specific features in prompts (e.g., Claude-only XML tags) — breaks portability
  • Hardcode model names in agents — use config
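Provider fallback falls out of the interface almost for free, since a wrapper can implement the same interface as the providers it wraps. The sketch below shows the idea for `chat` only. `ChatProvider`, `FallbackProvider`, and the trimmed request and response shapes are assumptions for illustration; the real `LLMProvider` also has `embed`.

```typescript
// Trimmed stand-ins; the real ChatRequest/ChatResponse are defined elsewhere.
interface ChatRequest { prompt: string }
interface ChatResponse { text: string; provider: string }

interface ChatProvider {
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Tries providers in order and returns the first successful response.
class FallbackProvider implements ChatProvider {
  private providers: ChatProvider[];

  constructor(providers: ChatProvider[]) {
    if (providers.length === 0) throw new Error('At least one provider is required');
    this.providers = providers;
  }

  async chat(request: ChatRequest): Promise<ChatResponse> {
    const errors: unknown[] = [];
    for (const provider of this.providers) {
      try {
        return await provider.chat(request);
      } catch (err) {
        errors.push(err);  // remember the failure and try the next provider
      }
    }
    throw new Error(`All ${errors.length} providers failed`);
  }
}
```

Agents that receive a `FallbackProvider` through their context cannot tell it apart from a single provider, which is the property this abstraction is meant to protect.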

10.3 Memory Store Abstraction

Interface:

typescript
interface MemoryStore {
  store(memory: Memory): Promise<void>;
  recall(query: RecallQuery): Promise<Memory[]>;
  update(id: string, updates: Partial<Memory>): Promise<void>;
  consolidate(): Promise<void>;
}

Future-proofs:

  • SQLite → PostgreSQL (2. Kubernetes)
  • Brute-force → Vector DB (3. Vector Database)
  • Single-repo → Multi-repo (4. Multi-Repo Intelligence)
  • Local → Shared (team memory)

Must NOT:

  • Expose SQL queries to agents — breaks DB independence
  • Return DB-specific types (e.g., Drizzle objects) — breaks when we swap DBs

10.4 Event Bus Abstraction

Interface:

typescript
interface EventBus {
  emit(event: Omit<ForgeEvent, 'id' | 'timestamp'>): Promise<void>;
  on(type: string, handler: EventHandler): () => void;
  replay(traceId: string): Promise<ForgeEvent[]>;
}

Future-proofs:

  • In-memory → Redis → Kafka (8. ClickHouse/Kafka)
  • Single-process → Distributed (2. Kubernetes)
  • Synchronous → Asynchronous (buffered writes)
  • Local → Remote (WebSocket streaming for dashboards)

Must NOT:

  • Assume synchronous delivery — events might be buffered
  • Mutate events after emitting — they might be serialized
  • Store non-serializable payloads — breaks remote bus

10.5 Agent Abstraction

Interface:

typescript
interface Agent {
  id: string;
  type: AgentType;
  execute(input: PhaseInput, ctx: AgentContext): Promise<PhaseOutput>;
}

Future-proofs:

  • Sequential → Parallel (1. Parallel Execution)
  • Single implementer → Multiple (swarm development)
  • Local → Remote (agents run in K8s Jobs)
  • Default agents → Custom agents (9. Extension Architecture)

Must NOT:

  • Share mutable state between agents — breaks parallelism
  • Assume execute() runs in same process as orchestrator — breaks remote execution
  • Store instance state across execute() calls — breaks stateless scaling

10.6 Safety Controls Abstraction

Interface:

typescript
interface SafetyControls {
  check(state: ExecutionState): BreakerResult;
  configure(config: SafetyConfig): void;
}

interface BreakerResult {
  shouldBreak: boolean;
  reason?: string;
  breaker?: string;
}

Future-proofs:

  • Static thresholds → Dynamic thresholds (learned from history)
  • Hardcoded breakers → Configurable breakers
  • Phase-level → Agent-level (parallel agents have independent budgets)

Must NOT:

  • Hardcode threshold values in breakers — use config
  • Assume global state — each agent should have isolated breaker state
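To make the two "must not" rules concrete, here is a minimal sketch of a breaker whose thresholds come from config and whose state is per instance. The fields of `SafetyConfig` and `ExecutionState` shown here (`maxIterations`, `maxCostUsd`, `iterations`, `costUsd`) and the class name `AgentSafetyControls` are assumptions; the real types are defined elsewhere in the design.

```typescript
// Hypothetical shapes for illustration only.
interface SafetyConfig { maxIterations: number; maxCostUsd: number }
interface ExecutionState { iterations: number; costUsd: number }
interface BreakerResult { shouldBreak: boolean; reason?: string; breaker?: string }

// One instance per agent: thresholds are injected, nothing is global.
class AgentSafetyControls {
  private config: SafetyConfig;

  constructor(config: SafetyConfig) {
    this.config = { ...config };   // copy so later caller mutations have no effect
  }

  configure(config: SafetyConfig): void {
    this.config = { ...config };
  }

  check(state: ExecutionState): BreakerResult {
    if (state.iterations >= this.config.maxIterations) {
      return {
        shouldBreak: true,
        breaker: 'iterations',
        reason: `reached ${this.config.maxIterations} iterations`,
      };
    }
    if (state.costUsd >= this.config.maxCostUsd) {
      return {
        shouldBreak: true,
        breaker: 'cost',
        reason: `spent $${state.costUsd.toFixed(2)}`,
      };
    }
    return { shouldBreak: false };
  }
}
```

With parallel agents, each agent would receive its own instance, so one agent exhausting its budget does not trip the breaker for the others.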

10.7 Checkpoint Abstraction

Interface:

typescript
interface CheckpointSystem {
  save(phase: PhaseName, state: State): Promise<Checkpoint>;
  restore(checkpointId: string): Promise<State>;
  list(traceId: string): Promise<Checkpoint[]>;
}

Future-proofs:

  • Phase-level → Agent-level (parallel agents)
  • Local files → Remote storage (S3, DB)
  • Manual resume → Auto-resume (on crash)

Must NOT:

  • Store non-serializable state — breaks restore in different process
  • Assume checkpoints are local files — breaks distributed deployment

10.8 Human Gate Abstraction

Interface:

typescript
interface HumanGate {
  id: string;
  condition: (ctx: Context) => boolean;
  request(ctx: Context): Promise<Approval>;
}

interface Approval {
  approved: boolean;
  reason?: string;
  modifications?: unknown;
}

Future-proofs:

  • CLI approval → Web UI approval (7. Dashboards)
  • Single approver → Multi-level approval (per topic 10)
  • Synchronous → Asynchronous (approval can come hours later)
  • Manual → Conditional automation (5. Autonomous Deployment)

Must NOT:

  • Assume approver is available immediately — add timeouts
  • Block forever on approval — escalate after timeout
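One way to honor both rules is to wrap any gate's `request` in a timeout. The sketch below assumes a hypothetical helper `requestWithTimeout` whose `onTimeout` callback decides what happens when nobody answers, for example returning a rejection or triggering an escalation.

```typescript
interface Approval { approved: boolean; reason?: string; modifications?: unknown }

// Resolves with the approver's answer, or with `onTimeout()` if no answer
// arrives within `timeoutMs`. Whichever settles first wins.
function requestWithTimeout(
  request: () => Promise<Approval>,
  timeoutMs: number,
  onTimeout: () => Approval,
): Promise<Approval> {
  return new Promise<Approval>((resolve, reject) => {
    const timer = setTimeout(() => resolve(onTimeout()), timeoutMs);
    request().then(
      approval => { clearTimeout(timer); resolve(approval); },
      err => { clearTimeout(timer); reject(err); },
    );
  });
}
```

A gate could be called as `requestWithTimeout(() => gate.request(ctx), timeoutMs, () => ({ approved: false, reason: 'timed out, escalating' }))`, with the timeout value taken from config rather than hardcoded.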

10.9 Configuration Abstraction

Interface:

typescript
interface ForgeConfig {
  llm: LLMConfig;
  tools: ToolConfig;
  safety: SafetyConfig;
  memory: MemoryConfig;
  github?: GitHubConfig;
  plugins?: PluginConfig[];
}

Future-proofs:

  • File-based → Environment-based (12-factor app)
  • Per-project → Per-user → Per-team
  • Static → Dynamic (config can change without restart)

Must NOT:

  • Hardcode defaults in code — use config file
  • Require config values that might not exist — provide defaults

10.10 Observability Abstraction

Events as audit trail:

  • Every decision logged with rationale
  • Every action attributed to agent
  • Every cost tracked per execution

Future-proofs:

  • CLI rendering → Dashboard rendering (same events)
  • SQLite → ClickHouse (same event schema)
  • Local analysis → Remote analytics

Must NOT:

  • Log events with ad-hoc formats — use typed events
  • Store logs separately from events — events ARE the logs
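"Typed events" can be expressed as a discriminated union, so that every consumer is checked by the compiler. The sketch below covers only the three event types used in the CLI renderer example; the union name `TypedForgeEvent` and the payload shapes are assumptions for illustration.

```typescript
// Hypothetical subset of event types with assumed payload shapes.
type TypedForgeEvent =
  | { type: 'phase.entered'; traceId: string; payload: { phase: string } }
  | { type: 'agent.iteration'; traceId: string; payload: { iteration: number } }
  | { type: 'finding.detected'; traceId: string; payload: { message: string } };

// Exhaustive formatter: adding a new event type to the union without
// handling it here becomes a compile-time error at the `never` assignment.
function describeEvent(event: TypedForgeEvent): string {
  switch (event.type) {
    case 'phase.entered':
      return `Entering phase: ${event.payload.phase}`;
    case 'agent.iteration':
      return `Iteration ${event.payload.iteration}`;
    case 'finding.detected':
      return `Finding: ${event.payload.message}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

Both the CLI renderer and a dashboard component could share a function like this, which keeps the two views consistent as event types are added.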

Summary Table: Deferred Features

| Feature | MVP Must Do | MVP Must NOT Do | Effort | Trigger |
| --- | --- | --- | --- | --- |
| 1. Parallel Agents | Stateless agents, immutable context | Shared mutable state | 2-3 weeks | >10min implementation phase |
| 2. Kubernetes | Env-driven config, serializable state | Hardcoded paths, singletons | 4-6 weeks | Team usage, shared memory |
| 3. Vector DB | Abstract similarity behind interface | Expose DB-specific queries | 1-2 weeks | >50K memories, >500ms recall |
| 4. Multi-Repo | Namespace memories by repoId | Store repo-specific code globally | 2-3 weeks | >3 repos with similar stacks |
| 5. Autonomous Deploy | Strategy-based deployment | Hardcode approval requirements | 3-4 weeks | Level 2 metrics achieved |
| 6. Natural Language | Accept string input, validate quality | Assume structured input | 3-4 weeks | Frequent vague requirements |
| 7. Dashboards | Rich event stream | Render in agent code | 2-3 weeks | Team adoption, remote monitoring |
| 8. ClickHouse/Kafka | EventBus interface abstraction | SQL queries in agents | 3-4 weeks | >1M events, >5s queries |
| 9. Extensions | Tool/Agent registries | Hardcode tool lists | 3-4 weeks | Users request custom tools |
| 10. Seams | All interfaces abstract, serializable | Non-serializable state, DB exposure | N/A | Foundation for all above |

Conclusion

The MVP is deliberately constrained — sequential, single-process, human-gated. But every interface is designed so that upgrading to parallel, distributed, autonomous execution requires NO breaking changes to calling code. We swap implementations, not interfaces.

Key principle: If an agent can't tell the difference between local and remote, sync and async, single and parallel, then the abstraction is correct.

This plan ensures we can ship fast (8 weeks to MVP) while keeping the door wide open for the future (parallel agents, K8s, autonomous deployment, etc.) without rewriting the core.