How Tokens Are Priced
AI usage is billed in tokens, not minutes, so “how much does this conversation cost” comes down to how many tokens it uses. This page lays out the mechanics in plain language.
What Is a Token?
A token is the smallest unit the model processes. Not characters, not words — the model segments text into its own pieces.
Rough conversion:
| Language | 1 Token ≈ |
|---|---|
| English | ~0.75 word (e.g., “apple” = 1 token, “refrigerator” = 2 tokens) |
| Chinese | ~0.5–0.7 character (1 character ≈ 1.5–2 tokens; e.g., “你好” = 3–4 tokens) |
| Code | 0.25–0.5 line (varies by language) |
Example
In English:
“Hello, the weather is nice today, let’s go to the park!”
That’s 11 words, which the model typically splits into about 13 tokens.
In Chinese:
“你好,今天天氣真好,我們去公園散步吧!”
That’s 16 characters (19 counting punctuation), which the model splits into about 28 tokens.
Chinese uses more tokens than English for the same content, so chatting in Chinese often costs 30–50% more than chatting in English.
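The rough ratios in the conversion table can be turned into a quick estimator. The sketch below is only a heuristic: the function name `estimate_tokens` and the 1.75 tokens-per-character midpoint are assumptions made for illustration, and a real tokenizer will give different counts.

```python
import re

# Ratios from the rough conversion table; estimates, not tokenizer output.
TOKENS_PER_ENGLISH_WORD = 1 / 0.75   # 1 token ≈ 0.75 English word
TOKENS_PER_CJK_CHAR = 1.75           # assumed midpoint of 1.5–2 tokens per character

CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")             # common Chinese characters
ENGLISH_WORD = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")  # words, incl. "let's"

def estimate_tokens(text: str) -> int:
    """Very rough token estimate for mixed English/Chinese text."""
    cjk_chars = len(CJK_CHAR.findall(text))
    words = len(ENGLISH_WORD.findall(text))
    return round(cjk_chars * TOKENS_PER_CJK_CHAR + words * TOKENS_PER_ENGLISH_WORD)

print(estimate_tokens("你好,今天天氣真好,我們去公園散步吧!"))
```

Punctuation is ignored here, so treat the result as a floor-ish guess. For billing-grade numbers, use the token counts reported by the provider.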
Cost per Exchange
Formula:
Exchange cost =
Input tokens × input price
+ Output tokens × output price
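As a minimal sketch, the formula can be written as a small function. The name `exchange_cost` and its parameters are illustrative, and prices are assumed to be quoted in USD per 1M tokens, as in the pricing tables on this page.

```python
def exchange_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """USD cost of one exchange; both prices are USD per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Illustrative numbers: 480 input tokens and 250 output tokens,
# priced at $2.50 (input) and $10.00 (output) per 1M tokens.
cost = exchange_cost(480, 250, 2.50, 10.00)
print(f"${cost:.4f}")  # $0.0037
```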
Key concept: input includes the conversation history.
Message 1 from you: 100 tokens
AI reply 1: 300 tokens
────────────────────────────────
Message 2 from you: 80 tokens + prior 400 = 480 tokens input
AI reply 2: 250 tokens
────────────────────────────────
Message 3 from you: 60 tokens + prior 730 = 790 tokens input
AI reply 3: 200 tokens
The longer the chat, the more each request costs, because every turn carries history. This is why context compression exists.
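The three-turn example above can be reproduced with a few lines of code. This is a sketch under simple assumptions: the full history is resent on every turn, with no system prompt, caching, or compression. The function name `input_tokens_per_turn` is hypothetical.

```python
def input_tokens_per_turn(turns):
    """turns: list of (user_tokens, reply_tokens) pairs.

    Returns the billed input tokens for each turn, assuming every
    request carries all earlier messages and replies.
    """
    history = 0
    inputs = []
    for user_tokens, reply_tokens in turns:
        inputs.append(history + user_tokens)    # new message + everything before it
        history += user_tokens + reply_tokens   # both sides join the history
    return inputs

turns = [(100, 300), (80, 250), (60, 200)]
inputs = input_tokens_per_turn(turns)
print(inputs)       # [100, 480, 790]
print(sum(inputs))  # 1370
```

The three messages themselves total only 240 tokens, but the billed input is 1,370 tokens because of the history carried along.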
Model Pricing (2026-04)
OpenAI
| Model | Input / 1M | Output / 1M | Best For |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | Complex reasoning, creative work |
| GPT-4o-mini | $0.15 | $0.60 | Daily chat, support |
| o1 | $15.00 | $60.00 | Deep reasoning (slow + expensive) |
| o1-mini | $3.00 | $12.00 | Mid-tier reasoning |
Anthropic
| Model | Input / 1M | Output / 1M | Best For |
|---|---|---|---|
| Claude 3.5 Sonnet | $3.00 | $15.00 | Code, long-context |
| Claude 3.5 Haiku | $0.80 | $4.00 | Fast responses |
| Claude 3 Opus | $15.00 | $75.00 | Flagship |
Google
| Model | Input / 1M | Output / 1M | Best For |
|---|---|---|---|
| Gemini 1.5 Pro | $1.25 | $5.00 | Long context (2M) |
| Gemini 1.5 Flash | $0.075 | $0.30 | Cheap, high volume |
| Gemini 2.0 Flash | $0.10 | $0.40 | Newer version |
Others
| Provider / Model | Note |
|---|---|
| Groq (Llama, Mixtral) | Fast and cheap |
| DeepSeek | Very low price, strong on Chinese |
| Azure OpenAI | Same models as OpenAI, slightly different pricing |
Prices change often because providers are in a price war. The tables above reflect 2026-04; confirm current pricing on each vendor’s site.
Example: One Day of Support Conversations
Ada handles 50 customer questions per day. Each conversation averages 5 turns, with ~100 tokens per turn on either side.
Total tokens:
- Message portion: 50 × 5 × 2 × 100 = 50,000 tokens (raw messages)
- With history: because each turn resends the prior turns, billed usage is estimated at roughly 150,000 input tokens + 50,000 output tokens
By model:
| Model | Cost / Day (USD) | Month (30 days) |
|---|---|---|
| GPT-4o-mini | $0.05 | $1.5 |
| GPT-4o | $0.88 | $26.4 |
| Claude 3.5 Sonnet | $1.20 | $36.0 |
| Gemini 1.5 Flash | $0.03 | $0.9 |
Takeaway: for the same workload, the most expensive model here costs about 40× the cheapest ($1.20 vs. $0.03 per day).
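The daily figures can be checked by applying the cost formula to the estimated 150,000 input and 50,000 output tokens. The sketch below copies prices from the tables on this page and assumes the Sonnet and Flash rows refer to Claude 3.5 Sonnet and Gemini 1.5 Flash; the names `PRICES` and `daily_cost` are illustrative.

```python
# USD per 1M tokens as (input, output), copied from the pricing tables on this page.
# Assumption: the Sonnet and Flash rows mean Claude 3.5 Sonnet and Gemini 1.5 Flash.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
}

INPUT_TOKENS = 150_000   # estimated daily input, including resent history
OUTPUT_TOKENS = 50_000   # estimated daily output

def daily_cost(model):
    """USD per day for the support workload described above."""
    input_price, output_price = PRICES[model]
    return (INPUT_TOKENS * input_price + OUTPUT_TOKENS * output_price) / 1_000_000

for model in PRICES:
    day = daily_cost(model)
    print(f"{model:<18} ${day:.2f}/day  ${day * 30:.2f}/month")
```

Monthly values printed here are computed from unrounded daily costs, so they can differ slightly from a table that multiplies the rounded daily figure by 30.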
Checking Your Own Consumption
In the Admin Panel:
- Each companion tab → Usage tab
- See Usage Dashboard Reference
Quickest “gut check”:
- Ask the AI any question
- Some chat UIs show “this exchange: X tokens” at the bottom
- Multiply by the model’s unit price → cost of this exchange
Estimating Your Bill
Personal use (~20 exchanges/day):
- GPT-4o-mini → $5–10 / month
- Claude Haiku → $15–30 / month
- GPT-4o → $30–60 / month
Small-team support (100 exchanges/day):
- GPT-4o-mini → $30–60 / month
- Claude Sonnet → $200–400 / month
Mid-size e-commerce support (500 exchanges/day):
- GPT-4o-mini → $150–300 / month
- GPT-4o → $2,000–4,000 / month
Rule of thumb: picking the right model can cut cost by 10–20×. If your exchanges are as simple as saying “hello” and getting “hi” back, you probably don’t need o1 at $15 per 1M input tokens.