Documentation
Providers & Models
Dino connects to your own LLM providers — it doesn’t include a built-in model. This gives you full control over cost, quality, and data privacy. You can add multiple providers and switch between them freely, even mid-conversation.
Provider Types
API Key Providers
These are the major LLM platforms where you pay per token. Connect your API key and Dino handles the rest.
| Provider | SDK Type | Models |
|---|---|---|
| OpenAI | openai-responses |
GPT-5, GPT-5-Codex, GPT-4.1 |
| Anthropic | anthropic |
Claude Sonnet 4.5, Claude Opus 4.1 |
google |
Gemini 3.1 Pro, Gemini 3.5 Flash | |
| Mistral | openai-completions |
Devstral 2, Codestral |
| DeepSeek | openai-completions |
DeepSeek V4 Pro, DeepSeek V4 Flash |
| MiniMax | anthropic |
MiniMax M3, MiniMax M2.7 |
| Z.ai | openai-completions |
GLM-5.1, GLM-5 |
| Moonshot AI | openai-completions |
Kimi K2.5, Kimi K2.6 |
| Synthetic | openai-completions |
Open-source models via hosted inference |
| OpenRouter | openai-completions |
Access to many providers through a single API key |
Some providers offer regional endpoints. For example, MiniMax, Z.ai, and Moonshot offer separate endpoints for China (with domestic signup URLs).
Coding Plans
Coding plans are flat-rate subscriptions from LLM providers — you pay a fixed monthly fee instead of per-token. They’re recommended for agentic coding since they’re usually 5× to 10× cheaper than paying per token.
Available coding plans include:
| Provider | What You Get | Highlights |
|---|---|---|
| Synthetic - Coding Plan | Kimi K2.6, Qwen 3.6, GLM-5.1 | Low-latency inference |
| Fireworks AI - Fire Pass | Kimi K2.6 Turbo | Low-latency inference |
| Z.ai - GLM Coding Plan | GLM-5.1, GLM-5, GLM-5 Turbo | First-party from Z.ai |
| Moonshot - Kimi Coding Plan | Kimi K2.5, Kimi K2.6 | First-party from Moonshot |
| MiniMax - Token Plan | MiniMax M3, MiniMax M2.7 | First-party from MiniMax |
| OpenCode Go | GLM-5.1, Kimi K2.6, MiniMax M2.7, DeepSeek V4 Pro | Multi-model aggregator |
| Ollama Cloud - Subscription | GLM-5.1, DeepSeek V4, Kimi K2.6 | Hosted models |
And several China-specific plans from Baidu (千帆), Tencent Cloud, Volcengine (字节跳动), and others.
To set up a coding plan, sign up on the provider’s website, get your API key, and add it as a provider in Dino.
Local Models
Run models on your own hardware with zero per-token cost:
| Server | Default Endpoint | Notes |
|---|---|---|
| Ollama | http://localhost:11434/v1 |
Wide model support |
| LM Studio | http://localhost:1234/v1 |
GUI model manager |
| llama.cpp | http://localhost:8080/v1 |
Lightweight, fast C++ inference |
| mlx-lm | http://localhost:8080/v1 |
Apple Silicon optimized |
| Jan | http://localhost:1337/v1 |
Privacy-focused local AI |
Local model providers don’t require an API key — just make sure the server is running before you start a conversation.
Custom Providers
Any endpoint with OpenAI or Anthropic API format works as a custom provider. If the provider doesn’t return a model list, you’ll need to add models manually after setup. You’ll need to specify:
- Name — A display name for your reference
- SDK Type — The API format (see below)
- Endpoint — The base URL for API calls
- API Key — If required by the endpoint
SDK Types
Dino supports four API formats:
| SDK Type | Use For | API Format |
|---|---|---|
openai-responses |
OpenAI (Responses API) | OpenAI Responses API |
openai-completions |
OpenAI-compatible providers | OpenAI Chat Completions API |
anthropic |
Anthropic and Anthropic-compatible | Anthropic Messages API |
google |
Google Gemini | Google Generative Language API |
Most third-party providers use the openai-completions SDK type, even if they aren’t OpenAI. If a provider advertises “OpenAI-compatible,” use openai-completions.
Adding a Provider
- Open Settings from the chat title bar
- Go to the Models & Providers tab
- Click Add Provider
- Choose from preset providers or select Custom
- Enter your API key
- Dino will auto-load available models — if it doesn’t, add models manually
- Select the models you want available in the model switcher
You can add as many providers and models as you like and switch between them at any time.
Managing Models
After adding a provider, you can manage its model list:
- Auto-load — Dino fetches the model list from the provider’s API automatically
- Manual add — Enter a model ID manually if auto-fetch doesn’t return it
- Display names — Give models custom names for easy identification
- Remove — Delete models you don’t use to keep the list clean
Model Switching
You can switch models at any time during a conversation. Click the model name in the status line at the bottom of the chat to open the model picker. If a response is in progress, the switch takes effect after it completes.
This is useful for:
- Using a fast, cheap model for simple questions (
gemini-3.5-flash,deepseek-v4-flash) - Switching to a powerful model for complex refactors (
glm-5.1,gpt-5.5) - Testing the same prompt across different models to compare results
- Falling back to an alternative provider if you hit rate limits
A practical strategy: start with a fast, affordable model for the initial pass. If it struggles, switch to a more powerful model to finish the job. And if that model also has trouble, try a different strong model — each model has different strengths.
Choosing a Model
General guidance for coding tasks:
| Budget | Strategy |
|---|---|
| Minimize cost | Use a coding plan (flat-rate) with GLM-5.1 or Kimi K2.6 |
| Local / offline | Run Ollama with a coding-tuned model |
| Experimenting | Use OpenRouter to access many models with one key |
A Practical Approach to Cost
The biggest challenge for using coding agents today is cost. API key providers charge per token — typically $3–5 per million input tokens — and coding agents use a lot of tokens.
The most cost-effective approach: subscribe to the cheapest monthly coding plan from two providers and rotate between them. Most plans start at $18–20/month and provide enough daily usage for several hours of active coding. When one plan’s daily quota runs low, switch to the other.
This gives you near-unlimited coding for around $40/month — a fraction of what per-token API usage would cost for the same workload.
Dino is designed to work well across all major model families. Precise diff editing, multi-file edits, and the diff review workflow all function reliably regardless of which model you choose.