Providers & Models

Dino connects to your own LLM providers — it doesn’t include a built-in model. This gives you full control over cost, quality, and data privacy. You can add multiple providers and switch between them freely, even mid-conversation.

Provider Types

API Key Providers

These are the major LLM platforms where you pay per token. Connect your API key and Dino handles the rest.

Provider	SDK Type	Models
OpenAI	`openai-responses`	GPT-5, GPT-5-Codex, GPT-4.1
Anthropic	`anthropic`	Claude Sonnet 4.5, Claude Opus 4.1
Google	`google`	Gemini 3.1 Pro, Gemini 3.5 Flash
Mistral	`openai-completions`	Devstral 2, Codestral
DeepSeek	`openai-completions`	DeepSeek V4 Pro, DeepSeek V4 Flash
MiniMax	`anthropic`	MiniMax M3, MiniMax M2.7
Z.ai	`openai-completions`	GLM-5.1, GLM-5
Moonshot AI	`openai-completions`	Kimi K2.5, Kimi K2.6
Synthetic	`openai-completions`	Open-source models via hosted inference
OpenRouter	`openai-completions`	Access to many providers through a single API key

Some providers offer regional endpoints. For example, MiniMax, Z.ai, and Moonshot offer separate endpoints for China (with domestic signup URLs).

Coding Plans

Coding plans are flat-rate subscriptions from LLM providers — you pay a fixed monthly fee instead of per-token. They’re recommended for agentic coding since they’re usually 5× to 10× cheaper than paying per token.

Available coding plans include:

Provider	What You Get	Highlights
Synthetic - Coding Plan	Kimi K2.6, Qwen 3.6, GLM-5.1	Low-latency inference
Fireworks AI - Fire Pass	Kimi K2.6 Turbo	Low-latency inference
Z.ai - GLM Coding Plan	GLM-5.1, GLM-5, GLM-5 Turbo	First-party from Z.ai
Moonshot - Kimi Coding Plan	Kimi K2.5, Kimi K2.6	First-party from Moonshot
MiniMax - Token Plan	MiniMax M3, MiniMax M2.7	First-party from MiniMax
OpenCode Go	GLM-5.1, Kimi K2.6, MiniMax M2.7, DeepSeek V4 Pro	Multi-model aggregator
Ollama Cloud - Subscription	GLM-5.1, DeepSeek V4, Kimi K2.6	Hosted models

And several China-specific plans from Baidu (千帆), Tencent Cloud, Volcengine (字节跳动), and others.

To set up a coding plan, sign up on the provider’s website, get your API key, and add it as a provider in Dino.

Local Models

Run models on your own hardware with zero per-token cost:

Server	Default Endpoint	Notes
Ollama	`http://localhost:11434/v1`	Wide model support
LM Studio	`http://localhost:1234/v1`	GUI model manager
llama.cpp	`http://localhost:8080/v1`	Lightweight, fast C++ inference
mlx-lm	`http://localhost:8080/v1`	Apple Silicon optimized
Jan	`http://localhost:1337/v1`	Privacy-focused local AI

Local model providers don’t require an API key — just make sure the server is running before you start a conversation.

Custom Providers

Any endpoint with OpenAI or Anthropic API format works as a custom provider. If the provider doesn’t return a model list, you’ll need to add models manually after setup. You’ll need to specify:

Name — A display name for your reference
SDK Type — The API format (see below)
Endpoint — The base URL for API calls
API Key — If required by the endpoint

SDK Types

Dino supports four API formats:

SDK Type	Use For	API Format
`openai-responses`	OpenAI (Responses API)	OpenAI Responses API
`openai-completions`	OpenAI-compatible providers	OpenAI Chat Completions API
`anthropic`	Anthropic and Anthropic-compatible	Anthropic Messages API
`google`	Google Gemini	Google Generative Language API

Most third-party providers use the openai-completions SDK type, even if they aren’t OpenAI. If a provider advertises “OpenAI-compatible,” use openai-completions.

Adding a Provider

Open Settings from the chat title bar
Go to the Models & Providers tab
Click Add Provider
Choose from preset providers or select Custom
Enter your API key
Dino will auto-load available models — if it doesn’t, add models manually
Select the models you want available in the model switcher

You can add as many providers and models as you like and switch between them at any time.

Managing Models

After adding a provider, you can manage its model list:

Auto-load — Dino fetches the model list from the provider’s API automatically
Manual add — Enter a model ID manually if auto-fetch doesn’t return it
Display names — Give models custom names for easy identification
Remove — Delete models you don’t use to keep the list clean

Model Switching

You can switch models at any time during a conversation. Click the model name in the status line at the bottom of the chat to open the model picker. If a response is in progress, the switch takes effect after it completes.

This is useful for:

Using a fast, cheap model for simple questions (gemini-3.5-flash, deepseek-v4-flash)
Switching to a powerful model for complex refactors (glm-5.1, gpt-5.5)
Testing the same prompt across different models to compare results
Falling back to an alternative provider if you hit rate limits

A practical strategy: start with a fast, affordable model for the initial pass. If it struggles, switch to a more powerful model to finish the job. And if that model also has trouble, try a different strong model — each model has different strengths.

Choosing a Model

General guidance for coding tasks:

Budget	Strategy
Minimize cost	Use a coding plan (flat-rate) with `GLM-5.1` or `Kimi K2.6`
Local / offline	Run `Ollama` with a coding-tuned model
Experimenting	Use `OpenRouter` to access many models with one key

A Practical Approach to Cost

The biggest challenge for using coding agents today is cost. API key providers charge per token — typically $3–5 per million input tokens — and coding agents use a lot of tokens.

The most cost-effective approach: subscribe to the cheapest monthly coding plan from two providers and rotate between them. Most plans start at $18–20/month and provide enough daily usage for several hours of active coding. When one plan’s daily quota runs low, switch to the other.

This gives you near-unlimited coding for around $40/month — a fraction of what per-token API usage would cost for the same workload.

Dino is designed to work well across all major model families. Precise diff editing, multi-file edits, and the diff review workflow all function reliably regardless of which model you choose.