# Model Configuration
Configure AI models for your Agents and Sub Agents
Configure models at Project (required), Agent, or Sub Agent levels. Settings inherit down the hierarchy.
## Configuration Hierarchy
You must configure at least the base model at the project level:
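For example, a minimal sketch of a project-level configuration. The `project()` helper, the `@inkeep/agents-sdk` import path, and the exact field names are assumptions based on the TypeScript SDK and may differ in your version:

```typescript
import { project } from '@inkeep/agents-sdk';

export const myProject = project({
  id: 'my-project',
  name: 'My Project',
  models: {
    // base is required at the project level
    base: { model: 'anthropic/claude-sonnet-4-5' },
    // Optional: both fall back to base when omitted
    structuredOutput: { model: 'openai/gpt-4.1-mini' },
    summarizer: { model: 'openai/gpt-4.1-nano' },
  },
});
```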
Override at agent or sub agent level:
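As an illustrative sketch (the `agent()` helper and its fields are assumptions and may differ in your SDK version), an agent can override only the model types it needs; anything omitted keeps inheriting from the project:

```typescript
import { agent } from '@inkeep/agents-sdk';

// Overrides only base; structuredOutput and summarizer
// continue to inherit from the project configuration.
export const supportAgent = agent({
  id: 'support-agent',
  name: 'Support Agent',
  prompt: 'You help users with support questions.',
  models: {
    base: { model: 'anthropic/claude-haiku-4-5' },
  },
});
```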
## Model Types
| Type | Purpose | Fallback |
|---|---|---|
| `base` | Text generation and reasoning | Required at project level |
| `structuredOutput` | JSON/structured output only | Falls back to `base` |
| `summarizer` | Summaries and status updates | Falls back to `base` |
## Supported Models
| Provider | Example Models | API Key |
|---|---|---|
| Anthropic | `anthropic/claude-sonnet-4-6`, `anthropic/claude-sonnet-4-5`, `anthropic/claude-haiku-4-5`, `anthropic/claude-opus-4-6` | `ANTHROPIC_API_KEY` |
| OpenAI | `openai/gpt-5.4-pro`, `openai/gpt-5.4`, `openai/gpt-5.2`, `openai/gpt-5.1`, `openai/gpt-4.1`, `openai/gpt-4.1-mini`, `openai/gpt-4.1-nano`, `openai/gpt-5*` | `OPENAI_API_KEY` |
| Azure OpenAI | `azure/my-gpt4-deployment`, `azure/my-gpt35-deployment` | `AZURE_API_KEY` |
| Google | `google/gemini-3.1-pro-preview`, `google/gemini-2.5-flash`, `google/gemini-2.5-flash-lite` | `GOOGLE_GENERATIVE_AI_API_KEY` |
| OpenRouter | `openrouter/anthropic/claude-sonnet-4-0`, `openrouter/meta-llama/llama-3.1-405b` | `OPENROUTER_API_KEY` |
| Gateway | `gateway/openai/gpt-4.1-mini` | `AI_GATEWAY_API_KEY` |
| NVIDIA NIM | `nim/nvidia/llama-3.3-nemotron-super-49b-v1.5`, `nim/nvidia/nemotron-4-340b-instruct` | `NIM_API_KEY` |
| Custom OpenAI-compatible | `custom/my-custom-model`, `custom/llama-3-custom` | `CUSTOM_LLM_API_KEY` |
| Mock | `mock/default` | None required |
`openai/gpt-5`, `openai/gpt-5-mini`, and `openai/gpt-5-nano` require a verified OpenAI organization. If your organization is not yet verified, these models will not be available.

### Pinned vs Unpinned Models
Pinned models include a specific date or version (e.g., anthropic/claude-sonnet-4-20250514) and always use that exact version.
Unpinned models use generic identifiers (e.g., anthropic/claude-sonnet-4-5) and let the provider choose the latest version, which may change over time as providers update their models.
The TypeScript SDK also provides constants for common models:
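A hypothetical sketch of how such constants might be used; the constant and namespace names below are invented for illustration, so check the SDK's actual exports:

```typescript
// Hypothetical identifiers; the SDK's real constant names may differ.
import { AnthropicModels, OpenAIModels } from '@inkeep/agents-sdk';

const models = {
  base: { model: AnthropicModels.ClaudeSonnet45 },    // e.g. 'anthropic/claude-sonnet-4-5'
  structuredOutput: { model: OpenAIModels.Gpt41Mini }, // e.g. 'openai/gpt-4.1-mini'
};
```

Using constants instead of raw strings lets the compiler catch typos in model identifiers.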
## Provider Options
Inkeep Agents supports all Vercel AI SDK provider options.
### How providerOptions works
providerOptions accepts two types of values:
- Scalars (`temperature`, `topP`, `maxOutputTokens`, `seed`, `maxDuration`) — standard generation parameters applied to every call
- Objects (`anthropic: {}`, `openai: {}`, `gateway: {}`, etc.) — provider-specific options for that provider
This means you can mix them freely:
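A sketch of a mixed configuration (the surrounding config shape is an assumption; the `anthropic.thinking` option is the Vercel AI SDK's extended-thinking setting):

```typescript
const baseModel = {
  model: 'anthropic/claude-sonnet-4-5',
  providerOptions: {
    // Scalars: applied to every call regardless of provider
    temperature: 0.7,
    maxOutputTokens: 4096,
    // Provider-specific object: only used when this provider serves the call
    anthropic: {
      thinking: { type: 'enabled', budgetTokens: 10000 },
    },
  },
};
```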
Constructor-level config (baseURL, headers, resourceName, apiVersion) is always specified at the top level of providerOptions, not nested under a provider key.
## Complete Examples
Basic configuration:
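A minimal sketch, assuming the `project()` helper and field names from the TypeScript SDK (these may differ in your version):

```typescript
import { project } from '@inkeep/agents-sdk';

export const basicProject = project({
  id: 'basic-project',
  name: 'Basic Project',
  models: {
    base: { model: 'anthropic/claude-sonnet-4-5' },
    structuredOutput: { model: 'openai/gpt-4.1-mini' },
    summarizer: { model: 'openai/gpt-4.1-nano' },
  },
});
```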
OpenAI with reasoning:
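A sketch using the Vercel AI SDK's OpenAI reasoning options (`reasoningEffort`, `reasoningSummary`); the surrounding `models` shape is an assumption:

```typescript
const models = {
  base: {
    model: 'openai/gpt-5.1',
    providerOptions: {
      openai: {
        // Controls how much internal reasoning the model performs
        reasoningEffort: 'high',
        // Ask the provider to return a summary of its reasoning
        reasoningSummary: 'auto',
      },
    },
  },
};
```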
Anthropic with thinking:
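A sketch using the Vercel AI SDK's Anthropic extended-thinking option; the surrounding `models` shape is an assumption:

```typescript
const models = {
  base: {
    model: 'anthropic/claude-sonnet-4-5',
    providerOptions: {
      anthropic: {
        // Enable extended thinking with a token budget for the reasoning phase
        thinking: { type: 'enabled', budgetTokens: 10000 },
      },
    },
  },
};
```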
Google with thinking:
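A sketch using the Vercel AI SDK's Google `thinkingConfig` option; the surrounding `models` shape is an assumption:

```typescript
const models = {
  base: {
    model: 'google/gemini-2.5-flash',
    providerOptions: {
      google: {
        // Token budget for thinking; includeThoughts surfaces thought summaries
        thinkingConfig: { thinkingBudget: 2048, includeThoughts: true },
      },
    },
  },
};
```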
Vercel AI Gateway with model routing:
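A sketch of gateway routing under the assumption that the gateway options accept an ordered `models` list (exact option names may differ):

```typescript
const models = {
  base: {
    model: 'gateway/openai/gpt-4.1-mini',
    providerOptions: {
      gateway: {
        // Tried in order; the gateway falls through to the next on failure
        models: ['openai/gpt-4.1-mini', 'anthropic/claude-sonnet-4-5'],
      },
    },
  },
};
```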
The Gateway provider supports routing requests across multiple models with automatic fallback. If the primary model fails or is unavailable, the gateway tries the next model in the list.
All models in the models array must be valid Vercel AI Gateway model IDs. The gateway falls through to the next model on failure — if all models fail, the request errors. Set AI_GATEWAY_API_KEY in your environment for authentication.
Azure OpenAI:
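A sketch for a standard Azure OpenAI deployment; the deployment and resource names are placeholders, and the `apiVersion` value is illustrative:

```typescript
const models = {
  base: {
    model: 'azure/my-gpt4-deployment',
    providerOptions: {
      // Constructor-level config goes at the top level, not under a provider key
      resourceName: 'my-azure-resource',
      apiVersion: '2024-10-01-preview',
    },
  },
};
```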
Azure OpenAI requires either resourceName (for standard Azure OpenAI deployments) or baseURL (for custom endpoints) in providerOptions. The AZURE_API_KEY environment variable must be set for authentication. Note that only one Azure OpenAI resource can be used at a time since authentication is handled via a single environment variable.
Custom OpenAI-compatible provider:
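A sketch with a placeholder endpoint; the surrounding `models` shape is an assumption:

```typescript
const models = {
  base: {
    model: 'custom/my-custom-model',
    providerOptions: {
      // Required: the OpenAI-compatible endpoint to call
      baseURL: 'https://llm.example.com/v1',
    },
  },
};
```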
Custom OpenAI-compatible providers require a base URL to be specified in providerOptions.baseURL or providerOptions.baseUrl. The CUSTOM_LLM_API_KEY environment variable will be automatically used for authentication if present.
## Context Window Override
For custom or unlisted models, you can explicitly specify the context window size:
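A sketch of the override; the exact placement of `contextWindowSize` within the model config is an assumption:

```typescript
const models = {
  base: {
    model: 'custom/my-custom-model',
    providerOptions: {
      baseURL: 'https://llm.example.com/v1',
    },
    // Override the detected context window size (in tokens)
    contextWindowSize: 128000,
  },
};
```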
The contextWindowSize option is useful when:
- Using a custom model not in the built-in registry
- The framework incorrectly detects the context window size
- You want to artificially limit the context window for testing
This affects compression triggers and oversized artifact detection (artifacts exceeding 30% of the context window).
## CLI Defaults
When using `inkeep init`, defaults are set based on your chosen provider:
| Provider | Base | Structured Output | Summarizer |
|---|---|---|---|
| Anthropic | claude-sonnet-4-5 | claude-sonnet-4-5 | claude-sonnet-4-5 |
| OpenAI | gpt-4.1 | gpt-4.1-mini | gpt-4.1-nano |
| Google | gemini-2.5-flash | gemini-2.5-flash-lite | gemini-2.5-flash-lite |