
Providers

factory supports 16 providers. Each one ships with default model discovery, capability inference, and (where the SDK supports it) tool-call streaming.

Provider matrix

| Provider | CLI alias | Auth | Notes |
| --- | --- | --- | --- |
| Anthropic Claude | anthropic, claude | ANTHROPIC_API_KEY | Native protocol; explicit prompt caching supported. |
| Cerebras | cerebras | CEREBRAS_API_KEY | OpenAI-compatible; very fast inference. |
| Codestral | codestral | CODESTRAL_API_KEY | Mistral's code-focused endpoint. Same SDK as Mistral. |
| Cohere | cohere | COHERE_API_KEY | Native protocol. Streaming not yet implemented (uses non-streaming chat). |
| GitHub Copilot | copilot | GITHUB_COPILOT_API_KEY or COPILOT_API_KEY, or interactive OAuth | Catalog reflects Copilot's own picker/policy flags. |
| Google AI Studio | googleaistudio, gemini | GEMINI_API_KEY or GOOGLE_API_KEY, or GOOGLE_APPLICATION_CREDENTIALS (ADC), or gcloud auth application-default login | Filters out non-chat models (embeddings, image/video, TTS). |
| Groq | groq | GROQ_API_KEY | OpenAI-compatible. Excludes audio/guardrail models. |
| HuggingFace | huggingface, hf | HF_TOKEN or HUGGING_FACE_HUB_TOKEN | Curated starter list of chat models; HF doesn't expose a single "chat models I can use" catalog. |
| llama.cpp | llamacpp | none (local) | Lists the currently loaded model set from the llama.cpp server. |
| Mistral | mistral | MISTRAL_API_KEY | OpenAI-compatible. |
| Ollama | ollama | none (local) or token via --token | Default; assumes http://localhost:11434. |
| OpenAI | openai, oai | OPENAI_API_KEY | OpenAI-compatible (its native shape). Filters non-chat endpoints (embeddings, audio, image, moderation, realtime). Drops temperature for all reasoning models (o-series, gpt-5, codex). Drops parallel_tool_calls for o-series only; gpt-5 and codex keep it. Codex variants route through /v1/responses with chained previous_response_id and a per-model reasoning.effort default (medium for codex, low for other reasoning models). |
| OpenCode Zen | opencodezen | OPENCODE_ZEN_API_KEY or OPENCODE_API_KEY | Routes to Anthropic-, Google-, or OpenAI-shaped backends per model. GPT models routed via the legacy chat-completions path; native /responses is on the roadmap. |
| OpenRouter | openrouter, or | OPENROUTER_API_KEY | OpenAI-compatible; preserves Anthropic cache_control blocks for anthropic/... model ids. |
| Vercel AI Gateway | vercel | AI_GATEWAY_API_KEY or VERCEL_OIDC_TOKEN | OpenAI-compatible. Filters to language models only. |
| Workers AI | workersai | CLOUDFLARE_API_TOKEN (+ CLOUDFLARE_ACCOUNT_ID) | OpenAI-compatible. Cloudflare account ID can also be inferred from the host URL. |
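
The OpenAI row describes several per-model request adjustments. A minimal sketch of that logic follows, assuming reasoning models can be recognized by id prefix or substring; the helper names (`isOSeries`, `isReasoning`, `shapeParams`, `defaultEffort`) and the matching patterns are hypothetical and not taken from factory's source.

```typescript
// Illustrative only: names and id patterns below are assumptions.
type ChatParams = {
  model: string;
  temperature?: number;
  parallel_tool_calls?: boolean;
};

// Assumed pattern: o-series ids start with "o" followed by a digit (o1, o3, o4-mini).
const isOSeries = (model: string): boolean => /^o\d/.test(model);

// Assumed pattern: reasoning models are o-series, gpt-5*, or any id containing "codex".
const isReasoning = (model: string): boolean =>
  isOSeries(model) || model.startsWith("gpt-5") || model.includes("codex");

// Drop the parameters the table says are removed for each model family.
function shapeParams(params: ChatParams): ChatParams {
  const out: ChatParams = { ...params };
  if (isReasoning(out.model)) delete out.temperature;       // all reasoning models
  if (isOSeries(out.model)) delete out.parallel_tool_calls; // o-series only
  return out;
}

// Per-model reasoning.effort default: medium for codex, low for other reasoning models.
function defaultEffort(model: string): "medium" | "low" | undefined {
  if (model.includes("codex")) return "medium";
  return isReasoning(model) ? "low" : undefined;
}
```

Under these assumptions, a gpt-5 request keeps parallel_tool_calls but loses temperature, while a non-reasoning model such as gpt-4.1 passes through unchanged.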

Auth precedence

For every provider:

  1. CLI --token takes precedence.
  2. Then the env var(s) listed above.
  3. Then a key entry from the config file (keys.<provider>[].token). Multiple entries enable key rotation — see /keys for the saved set and /rotate for the active chain.
  4. Then an interactive prompt if the picker is on the screen.
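
The precedence order above can be sketched as a single lookup function. This is an illustration under stated assumptions, not factory's actual code: `AuthSources`, `KeyEntry`, and `resolveToken` are hypothetical names, env vars are assumed to be tried in the order the matrix lists them, and key rotation is left out (the first saved key is used).

```typescript
// Illustrative only: these shapes are assumptions, not factory's real types.
interface KeyEntry {
  token: string;
}

interface AuthSources {
  cliToken?: string;                          // value passed via --token
  env: Record<string, string | undefined>;    // e.g. process.env
  envVars: string[];                          // env var names for the provider
  configKeys?: KeyEntry[];                    // keys.<provider>[] from the config file
  promptForToken?: () => string | undefined;  // only provided when the picker is on screen
}

function resolveToken(src: AuthSources): string | undefined {
  // 1. CLI --token takes precedence.
  if (src.cliToken) return src.cliToken;

  // 2. First non-empty env var (list order is an assumption).
  for (const name of src.envVars) {
    const value = src.env[name];
    if (value) return value;
  }

  // 3. First non-empty key entry from the config file (rotation omitted here).
  const saved = src.configKeys?.find((k) => k.token.length > 0);
  if (saved) return saved.token;

  // 4. Interactive prompt, if one is available.
  return src.promptForToken?.();
}
```

For example, with `--token` unset and `OPENAI_API_KEY` exported, the env value is returned even when the config file also holds saved keys.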

Model discovery

Most providers expose /v1/models and factory paginates through it. Some quirks:

  • Anthropic uses a curated allowlist instead of a live list (the public catalog is broader than what we want to surface in a CLI picker).
  • HuggingFace has no "models I can use with this token" endpoint, so factory uses a curated starter list.
  • Workers AI uses Cloudflare's models/search endpoint and filters to text generation tasks.
  • Google AI Studio filters out non-chat methods (embedding, imagen, veo, lyria, tts, speech, music, image, video, live).
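
As an illustration of the Google AI Studio quirk, the sketch below filters a model list using the markers named in the last bullet. It assumes the filter is a case-insensitive substring match on the model id, which the text above does not confirm; `NON_CHAT_MARKERS` and `isChatModel` are hypothetical names.

```typescript
// Markers taken from the list above; substring matching on the id is an assumption.
const NON_CHAT_MARKERS: readonly string[] = [
  "embedding", "imagen", "veo", "lyria", "tts",
  "speech", "music", "image", "video", "live",
];

// True when the id contains none of the non-chat markers.
function isChatModel(id: string): boolean {
  const lower = id.toLowerCase();
  return !NON_CHAT_MARKERS.some((marker) => lower.includes(marker));
}

// Example ids are for illustration only.
const discovered = [
  "gemini-2.5-pro",
  "text-embedding-004",
  "imagen-3.0-generate-002",
  "gemini-2.5-flash-preview-tts",
];
const chatModels = discovered.filter(isChatModel);
```

A plain substring match is easy to maintain but can over-match (any id containing "live" or "veo" is dropped), so a real filter may need to be stricter.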

Tool support

| Level | Description |
| --- | --- |
| native | Provider exposes a tool-call API; tool calls round-trip through structured fields. |
| basic | Tool calls are supported but with limitations (no parallel calls, no reasoning preservation). |
| none | No structured tool support; the agent falls back to text-based tool calls (recovered by src/core/agent/tool-calls/text-tool-parser.ts). |

Per-model capability is reported by getCapabilities(model) on each provider — see src/providers/<provider>.ts (or src/providers/<provider>/index.ts for folder-based providers like copilot/, googleaistudio/, opencodezen/).
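
To show how a caller might act on the three levels, here is a minimal sketch. Only the method name `getCapabilities(model)` comes from the text above; its return shape (`Capabilities` with a `tools` field), the `Provider` interface, and `toolCallStrategy` are assumptions made for illustration.

```typescript
// Illustrative only: the real return type of getCapabilities may differ.
type ToolSupport = "native" | "basic" | "none";

interface Capabilities {
  tools: ToolSupport; // assumed field name
}

interface Provider {
  getCapabilities(model: string): Capabilities;
}

type Strategy = "structured" | "structured-sequential" | "text-fallback";

// Map the reported tool level to how the agent could issue tool calls.
function toolCallStrategy(provider: Provider, model: string): Strategy {
  switch (provider.getCapabilities(model).tools) {
    case "native":
      return "structured";            // structured fields, parallel calls allowed
    case "basic":
      return "structured-sequential"; // structured, but no parallel calls
    case "none":
      return "text-fallback";         // parse tool calls out of plain text
  }
}
```

A stub provider that reports `none` for every model would therefore always take the text-based path.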

For maintainer notes on how the picker filters, sorts, and infers capabilities (and what to update when OpenAI ships a new flagship), see picker-internals.md.

Examples

```bash
factory --provider anthropic --model claude-sonnet-4-6
factory -p copilot -m gpt-4.1
factory -p ollama -m qwen2.5-coder
factory -p googleaistudio -m gemini-2.5-pro
factory -p workersai -m '@cf/qwen/qwen2.5-coder-32b-instruct'
factory --host http://remote:11434                  # remote Ollama
factory -p llamacpp --host http://localhost:8080    # local llama.cpp server
```

Released under the Apache-2.0 License.