Providers & Models

Dino connects to your own LLM providers — it doesn’t include a built-in model. This gives you full control over cost, quality, and data privacy. You can add multiple providers and switch between them freely, even mid-conversation.

Provider Types

API Key Providers

These are the major LLM platforms where you pay per token. Connect your API key and Dino handles the rest.

Provider SDK Type Models
OpenAI openai-responses GPT-5, GPT-5-Codex, GPT-4.1
Anthropic anthropic Claude Sonnet 4.5, Claude Opus 4.1
Google google Gemini 3.1 Pro, Gemini 3.5 Flash
Mistral openai-completions Devstral 2, Codestral
DeepSeek openai-completions DeepSeek V4 Pro, DeepSeek V4 Flash
MiniMax anthropic MiniMax M3, MiniMax M2.7
Z.ai openai-completions GLM-5.1, GLM-5
Moonshot AI openai-completions Kimi K2.5, Kimi K2.6
Synthetic openai-completions Open-source models via hosted inference
OpenRouter openai-completions Access to many providers through a single API key

Some providers offer regional endpoints. For example, MiniMax, Z.ai, and Moonshot offer separate endpoints for China (with domestic signup URLs).

Coding Plans

Coding plans are flat-rate subscriptions from LLM providers — you pay a fixed monthly fee instead of per-token. They’re recommended for agentic coding since they’re usually 5× to 10× cheaper than paying per token.

Available coding plans include:

Provider What You Get Highlights
Synthetic - Coding Plan Kimi K2.6, Qwen 3.6, GLM-5.1 Low-latency inference
Fireworks AI - Fire Pass Kimi K2.6 Turbo Low-latency inference
Z.ai - GLM Coding Plan GLM-5.1, GLM-5, GLM-5 Turbo First-party from Z.ai
Moonshot - Kimi Coding Plan Kimi K2.5, Kimi K2.6 First-party from Moonshot
MiniMax - Token Plan MiniMax M3, MiniMax M2.7 First-party from MiniMax
OpenCode Go GLM-5.1, Kimi K2.6, MiniMax M2.7, DeepSeek V4 Pro Multi-model aggregator
Ollama Cloud - Subscription GLM-5.1, DeepSeek V4, Kimi K2.6 Hosted models

And several China-specific plans from Baidu (千帆), Tencent Cloud, Volcengine (字节跳动), and others.

To set up a coding plan, sign up on the provider’s website, get your API key, and add it as a provider in Dino.

Local Models

Run models on your own hardware with zero per-token cost:

Server Default Endpoint Notes
Ollama http://localhost:11434/v1 Wide model support
LM Studio http://localhost:1234/v1 GUI model manager
llama.cpp http://localhost:8080/v1 Lightweight, fast C++ inference
mlx-lm http://localhost:8080/v1 Apple Silicon optimized
Jan http://localhost:1337/v1 Privacy-focused local AI

Local model providers don’t require an API key — just make sure the server is running before you start a conversation.

Custom Providers

Any endpoint with OpenAI or Anthropic API format works as a custom provider. If the provider doesn’t return a model list, you’ll need to add models manually after setup. You’ll need to specify:

  • Name — A display name for your reference
  • SDK Type — The API format (see below)
  • Endpoint — The base URL for API calls
  • API Key — If required by the endpoint

SDK Types

Dino supports four API formats:

SDK Type Use For API Format
openai-responses OpenAI (Responses API) OpenAI Responses API
openai-completions OpenAI-compatible providers OpenAI Chat Completions API
anthropic Anthropic and Anthropic-compatible Anthropic Messages API
google Google Gemini Google Generative Language API

Most third-party providers use the openai-completions SDK type, even if they aren’t OpenAI. If a provider advertises “OpenAI-compatible,” use openai-completions.

Adding a Provider

  1. Open Settings from the chat title bar
  2. Go to the Models & Providers tab
  3. Click Add Provider
  4. Choose from preset providers or select Custom
  5. Enter your API key
  6. Dino will auto-load available models — if it doesn’t, add models manually
  7. Select the models you want available in the model switcher

You can add as many providers and models as you like and switch between them at any time.

Managing Models

After adding a provider, you can manage its model list:

  • Auto-load — Dino fetches the model list from the provider’s API automatically
  • Manual add — Enter a model ID manually if auto-fetch doesn’t return it
  • Display names — Give models custom names for easy identification
  • Remove — Delete models you don’t use to keep the list clean

Model Switching

You can switch models at any time during a conversation. Click the model name in the status line at the bottom of the chat to open the model picker. If a response is in progress, the switch takes effect after it completes.

This is useful for:

  • Using a fast, cheap model for simple questions (gemini-3.5-flash, deepseek-v4-flash)
  • Switching to a powerful model for complex refactors (glm-5.1, gpt-5.5)
  • Testing the same prompt across different models to compare results
  • Falling back to an alternative provider if you hit rate limits

A practical strategy: start with a fast, affordable model for the initial pass. If it struggles, switch to a more powerful model to finish the job. And if that model also has trouble, try a different strong model — each model has different strengths.

Choosing a Model

General guidance for coding tasks:

Budget Strategy
Minimize cost Use a coding plan (flat-rate) with GLM-5.1 or Kimi K2.6
Local / offline Run Ollama with a coding-tuned model
Experimenting Use OpenRouter to access many models with one key

A Practical Approach to Cost

The biggest challenge for using coding agents today is cost. API key providers charge per token — typically $3–5 per million input tokens — and coding agents use a lot of tokens.

The most cost-effective approach: subscribe to the cheapest monthly coding plan from two providers and rotate between them. Most plans start at $18–20/month and provide enough daily usage for several hours of active coding. When one plan’s daily quota runs low, switch to the other.

This gives you near-unlimited coding for around $40/month — a fraction of what per-token API usage would cost for the same workload.

Dino is designed to work well across all major model families. Precise diff editing, multi-file edits, and the diff review workflow all function reliably regardless of which model you choose.

© 2026 smartdino.dev·Email Us·Privacy Policy

Lofi music by lofidreams & others via Pixabay