Architecture
This document maps factory's internals so a new contributor can locate the right module quickly.
High level
flowchart LR
A[CLI args] --> B[loadConfig] --> C[auth] --> D[provider] --> E[tools / MCP]
E --> F{isInteractiveTty}
F -- yes --> G[renderApp — TUI]
F -- no --> H[runHeadless]
G --> I[agent loop<br/>core/agent]
H --> I

The entry in src/index.ts parses CLI flags, loads config, runs auth, picks a provider, registers tools, attaches MCP servers, then checks isInteractiveTty and dispatches to either renderApp (TUI) or runHeadless (scripted/non-TTY) — both eventually call into the same agent loop.
Module map
src/index.ts
Top-level main(). Parses argv, applies --debug, branches on --version / --help, loads config, applies rotation overrides, runs auth, wires hooks, registers tools, instantiates the McpManager (src/mcp/client.ts), prints the welcome banner, and dispatches to TUI or headless mode based on isInteractiveTty. Also installs unhandledRejection / uncaughtException handlers.
src/cli/
- args.ts — argv parser (manual, no commander), printUsage(), printVersion().
- picker.ts — line-based interactive provider/model selection.
- prompts.ts — Ink-based variant of the picker prompts.
- auth/ — credential resolution and interactive auth flows (storage and model validation live in src/core/auth/):
  - index.ts — credential resolution: CLI → env var → config file → interactive prompt. Probes providers in parallel during startup. Returns StartupCredentials keyed by provider.
  - flows.ts — interactive auth flow helpers used during startup.
- startup/ — startup orchestration:
  - phases.ts — per-phase functions called from main().
  - config.ts — pure helpers for rotation/experimental/source decisions.
  - menu.tsx — Ink-rendered startup picker.
  - parse-rotation.ts — parses the --rotate <p:m,p:m> chain syntax.
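The resolution order listed for auth/index.ts (CLI → env var → config file → interactive prompt) can be read as a first-match chain. The sketch below is illustrative only: CredentialInputs, ResolvedCredential, and resolveCredential are hypothetical names, and the interactive flow is reduced to a synchronous callback, so the real resolver may look quite different.

```typescript
// Hypothetical shapes; the real resolver in src/cli/auth/index.ts may differ.
type CredentialSource = "cli" | "env" | "config" | "prompt";

interface ResolvedCredential {
  source: CredentialSource;
  apiKey: string;
}

interface CredentialInputs {
  cliKey?: string;
  envKey?: string;
  configKey?: string;
  // Stand-in for the interactive auth flow (assumed synchronous here).
  promptForKey: () => string;
}

function resolveCredential(inputs: CredentialInputs): ResolvedCredential {
  // Candidates are checked in priority order; the first non-empty one wins.
  const candidates: Array<[CredentialSource, string | undefined]> = [
    ["cli", inputs.cliKey],
    ["env", inputs.envKey],
    ["config", inputs.configKey],
  ];
  for (const [source, apiKey] of candidates) {
    if (apiKey !== undefined && apiKey.trim() !== "") {
      return { source, apiKey };
    }
  }
  // Nothing was supplied non-interactively, so fall back to the prompt.
  return { source: "prompt", apiKey: inputs.promptForKey() };
}
```

With only envKey and configKey set, the sketch returns the env value and never invokes the prompt callback.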
src/core/
The agent core. agent/run-agent.ts is the turn orchestrator: append user input, pre-flight compact, call/stream the model, parse/sanitize output, execute tool calls, and re-enter the loop until a terminal turn-complete stop reason (completed, user-abort, token-limit, or error). The StopReason union in types.ts also includes turn-limit (e.g. subagent runners); the main runAgent generator does not emit that value. Sibling files under agent/ split the loop into call/parse/run/compact/recover phases.
agent/ — the loop and its phases:
- run-agent.ts — the top-level async generator; orchestrates one turn end-to-end.
- parse-response.ts — content + tool-call extraction (with text-fallback recovery).
- compaction.ts — invokes the context manager when the budget is hit.
- recovery-state.ts — per-run counters for replay and token-limit retries.
- types.ts — AgentEvent, AgentOptions, RotationOptions.

cache/ — read-cache + cache-boundary helpers:
- file-cache.ts — Read-tool dedup (mtime + lazy hash) plus compaction-summary fingerprints.
- cache-boundaries.ts — provider cache-boundary placement for system prompt + history.

call-model/ — provider invocation wrapper:
- call-model.ts — entry that delegates to retry/rotation helpers and returns a chunk stream.
- call-model-retry.ts — transient retry policy.
- call-model-rotation.ts — fail-over across rotation chain on rate-limit/auth.
- provider-errors.ts — rotation classifier.
- provider-retry.ts — low-level retry-with-backoff primitive.
- repeat-detector.ts — detects model output loops and aborts the turn.
- weak-tier.ts — picks a cheaper model for the corrector sub-call.

tool-calls/ — tool-call processing (parsing, correction, dispatch, execution):
- run-tool-calls.ts — top-level dispatcher; permission/security gate + correction loop.
- run-tool-calls-execute.ts — executes one approved call and records the result.
- run-tool-calls-cache.ts — cache-hit short-circuit for repeat Read calls.
- bash-dedup.ts — fires a one-shot nudge when the model spins on near-duplicate Bash calls.
- text-tool-parser.ts — fallback parser for prose-embedded tool calls.
- tool-call-corrector.ts — LLM-driven correction of malformed calls.
- tool-result-format.ts — sentinel framing + imitation strip.

context/ — conversation state passed to the model:
- conversation.ts — message history with token-cap elision.
- context-manager.ts — recency window + summary compaction.
- system-prompt.ts — dynamic system prompt generation (project facts, capabilities, tool list).
- project-facts.ts — best-effort metadata extraction (cloc counts, README excerpt).

auth/ — credential storage (interactive resolution lives in src/cli/auth/):
- credentials.ts — multi-key store and migration from older single-key formats.
- model-validation.ts — model-id sanity checks at startup. Lives under auth/ for historical reasons but is about provider model IDs, not credentials.

config/ — config file load/merge/save with zod-validated schema (index.ts, merge.ts, types.ts, validate.ts).

session/ — per-session telemetry:
- session-log.ts — JSONL per-session logging in ~/.factory/sessions/. Tracks provider auth, tool calls, model changes, errors.
- key-stats.ts — per-key request and rate-limit counters.

hooks/ — user-configured shell hooks:
- index.ts — hook executor (sandboxed env, forbidden-command guard).
- discovery.ts — resolve hook entries by event + tool/matcher.
- trust.ts — first-run trust prompts for new hook scripts.

skills/ — experimental conditional skills:
- loader.ts — discover skill files and parse their headers.
- matcher.ts — match-on-path / match-on-text rules.
- index.ts — SkillsRegistry (in-memory, dedupes injections).

subagent/ — experimental delegation target:
- runner.ts — runs a sub-agent loop with its own conversation and tool registry.
- bash-allowlist.ts — restricted Bash policy for sub-agents.
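The fail-over behavior attributed to call-model-rotation.ts (moving along the rotation chain on rate-limit/auth failures) might look roughly like the sketch below. All names here (RotationTuple, AttemptResult, callWithRotation) are hypothetical, the attempt callback is synchronous for brevity, and treating every other failure as terminal is an assumption rather than documented behavior.

```typescript
// Hypothetical shapes for illustration; the real types live in src/core.
interface RotationTuple {
  provider: string;
  model: string;
}

type AttemptResult<T> =
  | { ok: true; value: T }
  | { ok: false; kind: "rate-limit" | "auth" | "other"; message: string };

type RotationOutcome<T> =
  | { status: "ok"; value: T; used: RotationTuple; skipped: RotationTuple[] }
  | { status: "exhausted"; skipped: RotationTuple[] }
  | { status: "failed"; message: string };

function callWithRotation<T>(
  chain: RotationTuple[],
  attempt: (tuple: RotationTuple) => AttemptResult<T>,
): RotationOutcome<T> {
  const skipped: RotationTuple[] = [];
  for (const tuple of chain) {
    const result = attempt(tuple);
    if (result.ok) {
      return { status: "ok", value: result.value, used: tuple, skipped };
    }
    // Assumption: only rate-limit and auth failures justify moving on.
    if (result.kind === "rate-limit" || result.kind === "auth") {
      skipped.push(tuple);
      continue;
    }
    // Any other failure stops the rotation immediately.
    return { status: "failed", message: result.message };
  }
  // Every tuple in the chain was skipped.
  return { status: "exhausted", skipped };
}
```

The "exhausted" outcome is where an event such as tuple-rotation-exhausted would plausibly be emitted.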
src/providers/
All providers implement the Provider interface. The shared openai/ folder is an internal adapter (no registry entry of its own) that owns SSE parsing, streaming chunk handling, tool-call accumulation, and usage extraction; most providers delegate to it via buildChatBody / sendOpenAiChat / streamOpenAiChat.
Three flavors:
- Flat-file consumers of the shared adapter — cerebras.ts, groq.ts, mistral.ts, openrouter.ts, vercel.ts, llamacpp.ts, workersai.ts are single files that delegate streaming and tool-call handling to openai/. huggingface.ts reuses only the tool-call helpers (its transport is the @huggingface/inference SDK).
- Folder-with-own-auth on the shared adapter — copilot/ and googleaistudio/ use openai/ for transport but carry their own auth flow alongside; opencodezen/ is a proxy folder that re-exposes Anthropic/Google through a single OpenAI-compatible endpoint.
- Truly native — anthropic.ts (uses @anthropic-ai/sdk), ollama.ts (uses the ollama package), and cohere.ts (hand-rolled) parse their own response shapes and don't touch openai/.
Cross-cutting files:
- types.ts — the Provider interface every adapter implements.
- registry.ts — maps a provider name to a constructor (no openai entry — the adapter is internal).
- descriptors.ts — per-provider metadata (display label, aliases, env vars, default host).
src/tools/
Built-in tools plus the registry that exposes them. Each tool implements ToolHandler with a definition (LLM-facing JSON schema) and an execute that returns a ToolResult.
- read.ts — file read with byte-range and line-range support.
- write.ts — atomic file write.
- edit.ts — string-replace edit with read-before-write check.
- bash.ts — sandboxed shell execution.
- glob.ts — glob-based file listing.
- grep.ts — content search.
- delegate.ts — delegates to a sub-agent when the experimental subagents flag is on.
- registry.ts — maps tool name to handler; ToolRegistry registers each built-in.
- index.ts — exports the shared defaultRegistry instance.
- types.ts — ToolHandler, ToolDefinition, ToolResult shapes.
- web/ — HTTP fetch + HTML rendering pipeline backing the WebFetch tool. The pipeline is: fetch.ts → html-tokenize.ts → html-render.ts → html-to-markdown.ts
  - index.ts — the WebFetch tool handler.
  - fetch.ts — HTTP client with content-type sniffing.
  - html-tokenize.ts — lightweight HTML tokenizer.
  - html-render.ts — top-level renderer entry.
  - html-to-markdown.ts — DOM walker that emits Markdown.
Tool execution is gated by src/security/permissions.ts (allow-once / allow-always / deny / domain whitelist) and the security policies in src/security/ (path jail, env scrubbing, bash rule matching).
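The ToolHandler contract described in this section (a definition carrying the LLM-facing JSON schema, plus an execute that returns a ToolResult) could be pictured as follows. The field names are assumptions, execute is shown as synchronous for brevity, and both the Echo tool and SketchToolRegistry are hypothetical; src/tools/types.ts and src/tools/registry.ts hold the real shapes.

```typescript
// Assumed shapes: the real ToolDefinition / ToolResult / ToolHandler live in
// src/tools/types.ts and may carry different fields (and an async execute).
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON schema shown to the model
}

interface ToolResult {
  ok: boolean;
  output: string;
}

interface ToolHandler {
  definition: ToolDefinition;
  execute(args: Record<string, unknown>): ToolResult;
}

// Hypothetical example tool, not one of the built-ins.
const echoTool: ToolHandler = {
  definition: {
    name: "Echo",
    description: "Returns its input text unchanged.",
    parameters: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
  },
  execute(args) {
    const text = args["text"];
    if (typeof text !== "string") {
      return { ok: false, output: "Echo requires a string 'text' argument." };
    }
    return { ok: true, output: text };
  },
};

// Minimal name → handler map in the spirit of ToolRegistry.
class SketchToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  register(handler: ToolHandler): void {
    this.handlers.set(handler.definition.name, handler);
  }

  run(name: string, args: Record<string, unknown>): ToolResult {
    const handler = this.handlers.get(name);
    if (!handler) return { ok: false, output: `Unknown tool: ${name}` };
    return handler.execute(args);
  }
}
```

The sketch leaves out the permission and security gates; it only shows how a name lookup and a uniform result shape let built-in and MCP-backed tools share one registry.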
src/security/
- permissions.ts — per-tool / per-domain interactive approval state machine.
- paths.ts — symlink-aware path jail with built-in deny list (.ssh, .aws, .gnupg, /etc/shadow, etc.) and user-configurable rules.
- bash-rules.ts — built-in forbidden patterns (rm -rf /, fork bomb, curl|sh, dd to /dev/*) plus user globs. Built-ins cannot be overridden.
- env.ts — deny-by-default env scrubbing with a small safe-vars whitelist.
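Deny-by-default env scrubbing, as described for env.ts, amounts to copying only whitelisted variables into the environment handed to a spawned shell. A minimal sketch, assuming a hypothetical safe list and a hypothetical scrubEnv helper (the real whitelist and function names are not shown in this document):

```typescript
// Hypothetical safe list; the real whitelist lives in src/security/env.ts.
const SAFE_ENV_VARS: ReadonlySet<string> = new Set(["PATH", "HOME", "LANG", "TERM"]);

function scrubEnv(env: Record<string, string | undefined>): Record<string, string> {
  const scrubbed: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    const value = env[name];
    // Deny by default: anything not explicitly whitelisted is dropped.
    if (value !== undefined && SAFE_ENV_VARS.has(name)) {
      scrubbed[name] = value;
    }
  }
  return scrubbed;
}
```

Because the default is to drop, a newly introduced secret-bearing variable is excluded without any rule having to name it.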
src/ui/
Two render targets: the React + Ink TUI under tui/, and the non-TTY headless.ts. Both paths call into the same core/agent/run-agent.ts loop.
tui/:
- index.tsx — renderApp() entry.
- App.tsx — tab host.
- Session.tsx — one tab's REPL.
- format.ts — assistant-text formatting helpers (used by the renderer).
- types.ts — DisplayItem, ToolCallSummary, and other view-model types.
- agent-loop/ — DI'd helpers around the React hook that drives the loop:
  - use-agent-loop.ts — the orchestrator hook itself.
  - init.ts, setup.ts, swap.ts — session bootstrap and provider/model swap.
  - run-loop.ts — drives one turn, consumes AgentEvents.
  - event-handler.ts — maps AgentEvent → display-item / state mutations.
  - history.ts — input history stack.
  - git-state.ts — branch + dirty refresh.
  - compose-system-prompt.ts — per-turn system prompt composition.
  - agent-loop-types.ts — RunRefs, AgentLoopApi, AgentLoopDeps shapes.
- components/ — Ink components (status bar, conversation display, permission panel, plan-approval panel, rotation prompt, etc.).
- slash/ — user-typed slash commands; dispatch.ts is the entry, individual handlers live alongside (hooks.ts, keys.ts, rotate.ts, rotate-helpers.ts, rotate-subcommands.ts, stats.ts).
- hooks/ — small React hooks (use-rotation-fallback.ts, use-session-input.ts).
- tabs/ — tab registry/context (tabs-registry.ts, TabsContext.tsx, use-tabs.ts).
Top-level files:
- headless.ts — non-TTY entry point used for scripted runs and CI. Reads from stdin, writes to stdout, no Ink.
- renderer.ts — markdown-to-terminal pipeline (marked + marked-terminal, with a patch for inline-token rendering and a guard against unsupported highlight.js languages).
src/mcp/
Model Context Protocol integration.
- client.ts — connects to external MCP servers as child processes.
- adapter.ts — wraps each remote tool as a ToolHandler so they slot into the same registry as built-in tools.
- types.ts — shared MCP types.
src/utils/
Small helpers.
- errors.ts — extract error.message.
- git.ts — branch / dirty detection.
- tokens.ts — CHARS_PER_TOKEN constant + estimator.
- glob-match.ts — POSIX-glob to regex.
- atomic-write.ts — write-then-rename file write.
- json-extract.ts — extract JSON from prose.
- build-info.ts — build metadata loader.
- debug.ts — debug logging gate.
- timeout.ts — promise timeout helper.
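As an illustration of the kind of helper json-extract.ts describes (pulling JSON out of surrounding prose), here is one possible approach: scan for a balanced {...} span while respecting string literals, then hand it to JSON.parse. The function name extractFirstJsonObject is hypothetical and the actual utility may use a different strategy.

```typescript
// Hypothetical helper: returns the first parseable JSON object embedded in
// `text`, or undefined if none is found.
function extractFirstJsonObject(text: string): unknown {
  for (
    let start = text.indexOf("{");
    start !== -1;
    start = text.indexOf("{", start + 1)
  ) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < text.length; i++) {
      const ch = text[i];
      if (inString) {
        // Inside a string literal, braces do not affect nesting depth.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') inString = true;
      else if (ch === "{") depth++;
      else if (ch === "}") {
        depth--;
        if (depth === 0) {
          try {
            return JSON.parse(text.slice(start, i + 1));
          } catch {
            break; // balanced but not valid JSON; try the next "{"
          }
        }
      }
    }
  }
  return undefined;
}
```

Tracking string state matters because a model can emit values such as "a}b", which a naive brace counter would cut short.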
Build & distribution
- Source: src/ → dist/ via tsc (Node16 module, ES2022 target, declaration maps).
- Entry: dist/index.js ships with a #!/usr/bin/env node shebang.
- package.json:bin.factory makes factory available globally after npm install -g or npm link.
- package.json:files allowlists dist, README.md, LICENSE for the npm tarball; .npmignore is defense-in-depth.
- Ambient declarations live in src/globals.d.ts (e.g. marked-terminal typings).
- Unit tests run directly against TS source via tsx --test 'test/unit/**/*.test.ts' (no compile step). Unit tests mirror src/ under test/unit/.
- End-to-end tests compile both src/ and test/ into dist-test/ via tsconfig.test.json and run with node --test. The harness lives at the top of test/: e2e-mocks.test.ts and e2e-no-mocks.test.ts are the suites, cli-harness.ts spawns the built binary, and mock-copilot-server.ts / mock-ollama-server.ts stub provider endpoints.
Data flow: one user prompt → one response
- User submits input in Session.tsx.
- useSessionInput routes slash commands to dispatchSlashCommand; plain text reaches run-loop.ts:processInput via the useAgentLoop API (session log + UI item + optional skill injection, then the loop driver).
- run-loop.ts:runAgentLoopInternal sets per-turn runtime refs (abort controller, timeout, mutable cwdRef, responses-chain adapter, rotation context) and calls core/agent/run-agent.ts:runAgent.
- runAgent appends the user message to Conversation, fires the UserPromptSubmit hook (optional), then enters while (true).
- Each loop iteration does pre-flight compaction (maybeCompact), emits pre-turn-stats when a contextManager is present, then calls call-model.ts:callModel.
- callModel streams provider ChatChunks into text-chunk events, handles retry/rotation, and returns final text/tool calls/usage metadata.
- runAgent parses/sanitizes response text (parse-response.ts), emits text-done, stores assistant content, and either:
  - completes the turn if no tool calls, or
  - runs tool-calls/run-tool-calls.ts (permissions/security/hooks/execution/corrector), appends tool results to conversation, and loops again so the model can react.
- The turn exits only via turn-complete with stop reason completed, user-abort, token-limit, or error from this generator (turn-limit is reserved for other runners — see StopReason in types.ts).
- In TUI mode, event-handler.ts consumes each AgentEvent and mutates UI state; in headless mode, src/ui/headless.ts consumes the same events and renders stdout/stderr + process exit codes.
- core/session/session-log.ts records JSONL telemetry throughout (turn/model/tool/hook/rotation events).
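The loop described in the steps above can be condensed into a small generator. This is a deliberately simplified, synchronous sketch: the real runAgent is an async generator with streaming, compaction, hooks, permissions, and retry/rotation, none of which appear here. The event names text-done, tool-call-result, and turn-complete come from this document; every type and field shape below (SketchEvent, ModelReply, runTurn, and so on) is hypothetical.

```typescript
// Hypothetical, simplified shapes; see core/agent/types.ts for the real ones.
type SketchStopReason = "completed" | "user-abort" | "error";
interface SketchToolCall { name: string; input: string }
interface ModelReply { text: string; toolCalls: SketchToolCall[] }

type SketchEvent =
  | { type: "text-done"; text: string }
  | { type: "tool-call-result"; name: string; output: string }
  | { type: "turn-complete"; reason: SketchStopReason };

function* runTurn(
  userInput: string,
  callModel: (history: string[]) => ModelReply,
  runTool: (call: SketchToolCall) => string,
  isAborted: () => boolean = () => false,
): Generator<SketchEvent> {
  const history: string[] = [`user: ${userInput}`];
  while (true) {
    if (isAborted()) {
      yield { type: "turn-complete", reason: "user-abort" };
      return;
    }
    let reply: ModelReply | undefined;
    try {
      reply = callModel(history);
    } catch {
      reply = undefined;
    }
    if (reply === undefined) {
      yield { type: "turn-complete", reason: "error" };
      return;
    }
    history.push(`assistant: ${reply.text}`);
    if (reply.text !== "") yield { type: "text-done", text: reply.text };
    if (reply.toolCalls.length === 0) {
      // No tool calls: the turn is finished.
      yield { type: "turn-complete", reason: "completed" };
      return;
    }
    for (const call of reply.toolCalls) {
      // Tool results go back into history so the model can react next pass.
      const output = runTool(call);
      history.push(`tool(${call.name}): ${output}`);
      yield { type: "tool-call-result", name: call.name, output };
    }
  }
}
```

Both renderers would consume such a stream the same way, which is why the TUI and headless paths can share one loop.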
AgentEvent lifecycle and UI state transitions
AgentEvent is the contract between the core loop (core/agent) and renderers (ui/tui, ui/headless).
sequenceDiagram
autonumber
participant U as User
participant T as TUI/Headless driver
participant R as runAgent
participant M as callModel
participant X as runToolCalls
U->>T: submit prompt
T->>R: runAgent(userInput, options)
loop turn loop (while true)
R->>R: maybeCompact + pre-turn-stats
R->>M: callModel(messages, tools)
M-->>R: text-chunk* / retries / rotations
R-->>T: AgentEvent stream
R->>R: parseModelResponse + text-done
alt tool calls present
R->>X: runToolCalls(toolCalls)
X-->>R: tool-call-* / permission-request / hook-*
R-->>T: AgentEvent stream
else no tool calls
R-->>T: turn-complete(completed)
end
end

Typical sequence for one successful turn:
- text-chunk (0..N)
- text-done (if any text)
- tool-call-start → optional permission-request → tool-call-result / tool-call-denied (0..N calls)
- repeat model + tool phases as needed
- turn-complete
Important event families (full union: AgentEvent in core/agent/types.ts):
- Context/control: compaction-start, compaction, pre-turn-stats (only with a context manager), turn-complete, error
- Model transport: provider-retry, key-rotation, key-rotation-exhausted, tuple-rotation, tuple-rotation-exhausted, repetition-detected
- Permissions: permission-request (interactive approval before a gated tool runs)
- Tool calls: tool-call-start, tool-call-result, tool-call-denied, tool-call-planned, tool-call-recovered, tool-call-corrected, tool-call-corrector-aborted, all-denied-halt, bash-dedup-nudge
- Recovery/safety: auto-retry-injected, auto-retry-exhausted, tool-result-imitation-stripped, output-cap-reached, empty-turn-warning
- Hooks/cache: hook-fired, hook-error, hook-veto, read-cache-hit
TUI run-state transitions (agent-loop/event-handler.ts):
- idle → running on prompt submission.
- running → awaiting-permission on permission-request.
- awaiting-permission → running after user decision resolves.
- turn-finalization path resets transient activity labels (retrying…, rotating…) and returns to idle at loop end.
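These transitions form a small state machine. A minimal sketch follows, assuming a hypothetical nextRunState reducer; the trigger names prompt-submitted and permission-resolved are invented for illustration, while permission-request and turn-complete are the event names used above.

```typescript
type RunState = "idle" | "running" | "awaiting-permission";

// "prompt-submitted" and "permission-resolved" are hypothetical trigger names.
type RunTrigger =
  | "prompt-submitted"
  | "permission-request"
  | "permission-resolved"
  | "turn-complete";

function nextRunState(state: RunState, trigger: RunTrigger): RunState {
  switch (trigger) {
    case "prompt-submitted":
      return state === "idle" ? "running" : state;
    case "permission-request":
      return state === "running" ? "awaiting-permission" : state;
    case "permission-resolved":
      return state === "awaiting-permission" ? "running" : state;
    case "turn-complete":
      // Assumption: finalization returns to idle from any state.
      return "idle";
  }
}
```

Triggers that do not apply to the current state leave it unchanged, so a stray permission-request while idle is ignored in this sketch.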
Headless consumes the same events but maps them to stderr notices and exit codes (error→1, token-limit→5, non-TTY permission block→3).
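The exit-code mapping can be written as a single exhaustive switch. In the sketch below the codes 1, 3, and 5 come from the text above; the HeadlessOutcome type, the exitCodeFor name, and the 0 for a completed run are assumptions.

```typescript
// Hypothetical outcome labels for a headless run.
type HeadlessOutcome = "completed" | "error" | "token-limit" | "permission-blocked";

function exitCodeFor(outcome: HeadlessOutcome): number {
  switch (outcome) {
    case "completed":
      return 0; // assumption: conventional success code
    case "error":
      return 1;
    case "permission-blocked":
      return 3; // non-TTY permission block
    case "token-limit":
      return 5;
  }
}
```

Distinct codes let CI scripts tell a blocked permission apart from a generic failure without parsing stderr.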
Permission and security layering
flowchart TB
A[user input] --> B[slash dispatcher]
C[agent tool_call] --> D[permissions]
B --> E["tool execute()"]
D --> E
E --> F["security/{paths, bash-rules, env}"]
F --> G[actual I/O]

src/security/permissions.ts decides per-tool / per-domain whether a call needs interactive approval. src/security/ checks the content of the call: paths must pass the jail, bash commands must not match a forbidden pattern, env vars passed to spawned shells are scrubbed against the deny list. Both layers must approve before execution.
Built-in security rules cannot be overridden by user config — only extended.
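The "both layers must approve" rule and the "extend but never override" rule can be sketched together for the Bash case. Everything below is hypothetical: the two regexes only approximate the rm -rf / and curl|sh built-ins, user rules are modeled as RegExp values rather than globs, and mayExecute is not a real function in src/security/.

```typescript
// Approximations of two built-in patterns; the real list is in bash-rules.ts.
const BUILT_IN_FORBIDDEN: readonly RegExp[] = [
  /\brm\s+-rf\s+\/(?:\s|$)/,
  /\bcurl\b[^|]*\|\s*(?:sh|bash)\b/,
];

// User rules are appended; there is no way to remove a built-in.
function buildBashRules(userPatterns: readonly RegExp[]): RegExp[] {
  return [...BUILT_IN_FORBIDDEN, ...userPatterns];
}

type PermissionDecision = "allow-once" | "allow-always" | "deny";

function mayExecute(
  command: string,
  decision: PermissionDecision,
  userPatterns: readonly RegExp[] = [],
): boolean {
  // Layer 1: the permission decision for this tool call.
  if (decision === "deny") return false;
  // Layer 2: content check against built-in plus user rules.
  return !buildBashRules(userPatterns).some((rule) => rule.test(command));
}
```

Even an allow-always decision does not bypass the content check in this sketch, which mirrors the layering in the diagram.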
Where to start when adding...
| Change | Touch |
|---|---|
| New CLI flag | src/cli/args.ts (parser + usage) → src/index.ts (apply) |
| New provider | src/providers/<name>.ts (or folder) + descriptors.ts + registry.ts. For OpenAI-compatible APIs, delegate transport to ./openai/index.js (buildChatBody, sendOpenAiChat, streamOpenAiChat) — see vercel.ts or groq.ts as a template. (See CONTRIBUTING.md.) |
| New tool | src/tools/<name>.ts + src/tools/registry.ts (register in ToolRegistry) |
| New slash command | src/ui/tui/slash/<name>.ts + src/ui/tui/slash/dispatch.ts (dispatcher) |
| New session-log event | add a method on the SessionLogger interface in src/core/session/session-log.ts (alongside logModelChange, logToolCall, etc.) + call it from where the event fires |
| New security rule | src/security/<area>.ts — built-in rules are an export list |
| New hook event | src/core/hooks/discovery.ts (event enum) + caller (where the hook fires) |
| New skill matcher | src/core/skills/matcher.ts |