
Providers

KosmoKrator supports 20+ LLM providers out of the box, from major cloud APIs to local models and Chinese-market providers. You can also add custom OpenAI-compatible endpoints.

Built-in Providers

Every built-in provider is ready to use after entering credentials. The table below lists all providers shipped with KosmoKrator, their authentication mode, and key notes.

| Provider ID | Label | Auth Mode | Notes |
|---|---|---|---|
| anthropic | Anthropic | API Key | Claude family: Opus 4.5, Sonnet 4.5, Haiku 4.5 |
| openai | OpenAI | API Key | GPT-4o, GPT-4.1 family, o-series reasoning models |
| codex | Codex (ChatGPT) | OAuth | Browser/device login flow; uses your ChatGPT subscription |
| gemini | Google Gemini | API Key | Gemini 2.5 Pro and Flash |
| deepseek | DeepSeek | API Key | DeepSeek V3 (chat), R1 (reasoning) |
| groq | Groq | API Key | Ultra-fast inference on dedicated hardware |
| mistral | Mistral | API Key | Mistral Large, Codestral |
| xai | xAI | API Key | Grok 3, with reasoning support |
| openrouter | OpenRouter | API Key | Meta-router for 100+ models from multiple providers |
| perplexity | Perplexity | API Key | Online search-augmented models |
| ollama | Ollama | None | Local models; no remote credentials required |
| kimi | Kimi (Moonshot) | API Key | Long-context Chinese/English models |
| kimi-coding | Kimi Coding | API Key | Code-optimized Moonshot endpoint |
| mimo | Xiaomi MiMo Token Plan | API Key | MiMo models via token-plan key (free tier available) |
| mimo-api | Xiaomi MiMo API | API Key | MiMo pay-as-you-go API |
| minimax | MiniMax | API Key | MiniMax models |
| minimax-cn | MiniMax CN | API Key | MiniMax China-region endpoint |
| z | Z.AI | API Key | Z.AI coding endpoint |
| z-api | Z.AI API | API Key | Z.AI standard API endpoint |
| stepfun | StepFun | API Key | Step models |
| stepfun-plan | StepFun Plan | API Key | Step Plan subscription endpoint with reasoning support |

Authentication Setup

First-run wizard

The easiest way to configure credentials is the interactive setup command, which walks you through provider selection and API key entry:

```shell
kosmokrator setup
```

API key storage

API keys entered through the setup wizard or the /settings command are encrypted and stored in the local SQLite database at ~/.kosmokrator/data/kosmokrator.db. Keys are never written to plain-text config files.

Environment variables

Alternatively, you can set provider API keys via environment variables. These are read from your Prism PHP configuration and take effect if no key is stored in the database. Common variables:

  • ANTHROPIC_API_KEY — Anthropic
  • OPENAI_API_KEY — OpenAI
  • DEEPSEEK_API_KEY — DeepSeek
  • GROQ_API_KEY — Groq
  • MISTRAL_API_KEY — Mistral
  • XAI_API_KEY — xAI
  • OPENROUTER_API_KEY — OpenRouter
  • PERPLEXITY_API_KEY — Perplexity
  • GEMINI_API_KEY — Google Gemini
  • KIMI_API_KEY — Kimi / Kimi Coding
  • MIMO_API_KEY — MiMo (token plan)
  • MIMO_PAYG_API_KEY — MiMo (pay-as-you-go API)
  • MINIMAX_API_KEY — MiniMax
  • MINIMAX_CN_API_KEY — MiniMax CN (China region)
  • STEPFUN_API_KEY — StepFun / StepFun Plan
  • ZAI_API_KEY — Z.AI / Z.AI API

Database-stored keys always take precedence over environment variables: if you set a key via /settings and also export the corresponding environment variable, the stored key is used.

OAuth flow (Codex / ChatGPT)

The codex provider uses a browser-based OAuth device login flow tied to your ChatGPT subscription. When you select Codex as your provider:

  1. KosmoKrator starts a local callback server on port 9876 (configurable in config/kosmokrator.yaml).
  2. Your browser opens to a ChatGPT authorization page.
  3. After granting access, the OAuth tokens are stored and refreshed automatically.
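
If port 9876 clashes with another local service, it can be changed in config/kosmokrator.yaml. A sketch of what that entry might look like (the key names here are assumptions; check the config file shipped with your version):

```yaml
# config/kosmokrator.yaml (hypothetical key names)
oauth:
  callback_port: 9876   # port the local OAuth callback server listens on
```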

Token status is shown in the settings UI — including the associated email, expiration state, and whether a refresh is due.

Switching Providers

You can change the active provider and model at any time during a session:

  1. Open the settings panel with the /settings command.
  2. Navigate to the Agent category.
  3. Change default_provider to the desired provider ID.
  4. Change default_model to a model supported by that provider.

Both settings are marked applies_now: the change takes effect on the very next LLM call, with no session restart required.

The model selector is filtered by the currently selected provider. Change the provider first, then pick from its available models.

Per-Depth Model Overrides

KosmoKrator supports running different models at different agent depths. This lets you use a powerful (and more expensive) model for the main agent while routing subagents to faster or cheaper models.

| Depth | Role | Settings | Fallback |
|---|---|---|---|
| 0 | Main agent | default_provider / default_model | (none) |
| 1 | Subagents | subagent_provider / subagent_model | Inherits from depth 0 |
| 2+ | Sub-subagents | subagent_depth2_provider / subagent_depth2_model | Inherits from depth 1, then depth 0 |

The resolution cascade works as follows: depth-2+ overrides fall back to depth-1 overrides, which fall back to the main agent defaults. Leave a setting empty to inherit from the parent depth.
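The cascade can be sketched as a per-setting lookup. Key names mirror the /settings fields from the table above, but the function itself is illustrative, not KosmoKrator's actual implementation:

```python
def resolve(depth: int, kind: str, settings: dict) -> str:
    """Resolve 'provider' or 'model' for an agent at a given depth.

    Each setting cascades independently: an empty depth-2 value falls back
    to the depth-1 value, which falls back to the main-agent default.
    """
    candidates = []
    if depth >= 2:
        candidates.append(f"subagent_depth2_{kind}")
    if depth >= 1:
        candidates.append(f"subagent_{kind}")
    candidates.append(f"default_{kind}")

    for key in candidates:
        value = settings.get(key)
        if value:                        # empty/missing inherits from parent depth
            return value
    raise ValueError(f"default_{kind} must be set")

settings = {
    "default_provider": "anthropic",
    "default_model": "claude-opus-4-5-20250929",
    "subagent_provider": "anthropic",
    "subagent_model": "claude-haiku-4-5-20251001",
    # depth-2 overrides left empty, so depth 2+ inherits from depth 1
}
print(resolve(2, "model", settings))  # claude-haiku-4-5-20251001
```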

Example: cost-optimized hierarchy

```yaml
# Main agent: the most capable model
default_provider: anthropic
default_model: claude-opus-4-5-20250929

# Subagents: fast and affordable
subagent_provider: anthropic
subagent_model: claude-haiku-4-5-20251001

# Sub-subagents: inherit from subagent settings
# (leave subagent_depth2_provider and subagent_depth2_model empty)
```

Per-depth overrides are configured under the Subagents category in /settings. Each setting applies immediately when changed.

Custom Providers

Any OpenAI-compatible API endpoint can be added as a custom provider. This is useful for self-hosted models, corporate proxies, or providers not yet included in the built-in catalog.

Adding a custom provider

  1. Open /settings and navigate to Provider Setup.
  2. Add a new provider with a unique ID.
  3. Configure the required fields:
| Field | Description | Example |
|---|---|---|
| label | Human-readable name shown in the UI | My Corporate LLM |
| base_url | Full URL to the chat completions endpoint | https://llm.corp.example/v1 |
| api_key | API key for authentication | sk-corp-... |
| default_model | Model identifier to use by default | llama-3.1-70b |

Custom providers use the relay system for request/response normalization, so they work with tool calling, streaming, and all other agent features as long as the endpoint implements the OpenAI chat completions format.
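To make the contract concrete, here is a sketch of the request shape an OpenAI-compatible endpoint must accept. The helper function and the "sk-corp-XXX" key are hypothetical; KosmoKrator's relay produces an equivalent request internally.

```python
import json

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build the URL, headers, and body of an OpenAI-style chat completions call."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,          # KosmoKrator streams responses
    }).encode()
    return url, headers, body

url, headers, body = build_chat_request(
    "https://llm.corp.example/v1", "sk-corp-XXX", "llama-3.1-70b", "hello"
)
print(url)  # https://llm.corp.example/v1/chat/completions
```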

Reasoning Support

Some providers support extended thinking / reasoning modes, where the model performs chain-of-thought reasoning before producing its final answer. KosmoKrator controls this via the reasoning_effort setting (under the Agent category in /settings).

| Provider | Reasoning Behavior | Effort Levels |
|---|---|---|
| openai | Controllable via reasoning_effort for o-series models (o1, o3, o4-mini) | low / medium / high |
| xai | Controllable via reasoning_effort for Grok 3 Think models | low / medium / high |
| deepseek | Always-on reasoning for R1 models | Not configurable |
| stepfun, stepfun-plan | Always-on reasoning | Not configurable |
| kimi, kimi-coding | Always-on reasoning | Not configurable |
| groq | Always-on reasoning | Not configurable |
| mistral | Always-on reasoning | Not configurable |
| perplexity | Always-on reasoning | Not configurable |
| openrouter | Always-on reasoning | Not configurable |
| z, z-api | Always-on reasoning | Not configurable |
| minimax, minimax-cn | Always-on reasoning | Not configurable |
| mimo, mimo-api | Always-on reasoning | Not configurable |
| All others | No reasoning support | Setting is safely ignored |

Anthropic supports extended thinking (chain-of-thought) via Prism's native driver, but this is not controlled through the reasoning_effort parameter. It is handled internally by the driver when supported models are used.

The available effort levels are off, low, medium, and high. Setting the value to off disables reasoning parameters entirely, even for providers that support it.

Reasoning models tend to produce longer, more thorough responses but use significantly more tokens. Use low or medium for routine tasks and reserve high for complex multi-step problems.
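The effort-level behavior can be sketched as a mapping from the setting onto request parameters. This is a hypothetical sketch: the reasoning_effort parameter name matches the OpenAI o-series API, other providers may differ, and KosmoKrator handles the actual mapping internally.

```python
def reasoning_params(provider: str, effort: str) -> dict:
    """Map the reasoning_effort setting onto request parameters.

    Only providers with controllable reasoning receive the parameter;
    "off" omits reasoning parameters entirely for every provider.
    """
    controllable = {"openai", "xai"}          # per the provider table
    if effort == "off" or provider not in controllable:
        return {}                             # omit reasoning parameters entirely
    return {"reasoning_effort": effort}       # "low" / "medium" / "high"

print(reasoning_params("openai", "high"))    # {'reasoning_effort': 'high'}
print(reasoning_params("deepseek", "high"))  # {} (R1 reasoning is always on)
```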

LLM Clients

Under the hood, KosmoKrator uses two client implementations to communicate with LLM providers. The correct client is selected automatically based on the provider.

AsyncLlmClient

The primary client for most providers. Built on Amp HTTP, it sends raw HTTP requests to OpenAI-compatible chat completions endpoints with full async streaming support. Used for:

  • OpenAI, DeepSeek, Groq, Mistral, xAI, OpenRouter, Perplexity
  • Ollama, Kimi, Kimi Coding, MiMo, MiMo API, Z.AI, Z.AI API, StepFun, StepFun Plan
  • All custom providers (OpenAI-compatible endpoints)

PrismService

A synchronous client backed by the Prism PHP SDK. Used for providers that have native Prism drivers with specialized request/response handling:

  • Anthropic (Claude) — uses Prism's native Anthropic driver with prompt caching
  • Google Gemini — uses Prism's native Gemini driver
  • MiniMax, MiniMax CN — uses Prism's Anthropic-compatible driver (Anthropic-format endpoints)

RetryableLlmClient

A decorator that wraps either client, adding automatic retry logic with exponential backoff and jitter. Retries are triggered on:

  • Rate limits (HTTP 429) — honors Retry-After headers from the provider
  • Server errors (HTTP 5xx) — transient provider outages
  • Network failures — connection timeouts, DNS resolution errors

The maximum number of retry attempts is configurable via the max_retries setting. A value of 0 means unlimited retries (the agent keeps trying until the provider responds successfully).