How Tokens Are Priced

AI usage is billed by the token, not by the minute, so “how much does this conversation cost?” comes down to counting tokens. This page lays out the mechanics in plain language.

What Is a Token?

A token is the smallest unit of text the model processes. Tokens are neither characters nor words: a tokenizer segments text into its own sub-word pieces.

Rough conversion:

Language	Rough conversion
English	1 token ≈ 0.75 word (e.g., “apple” = 1 token, “refrigerator” = 2 tokens)
Chinese	1 character ≈ 1.5–2 tokens (e.g., “你好” = 3–4 tokens)
Code	1 token ≈ 0.25–0.5 line (varies by language)

Example

In English:

“Hello, the weather is nice today, let’s go to the park!”

That’s 11 words, which the model typically splits into about 13 tokens.

In Chinese:

“你好，今天天氣真好，我們去公園散步吧！”

That’s 16 characters (19 counting punctuation), which the model splits into about 28 tokens.

Chinese uses more tokens than English for the same content, so chatting in Chinese often costs 30–50% more than chatting in English.
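The rough conversions above can be turned into a quick estimator. This is a minimal sketch that applies only the rules of thumb from the table. `estimate_tokens` is a hypothetical helper, not a real tokenizer; it ignores punctuation, and real counts vary by model.

```python
import re

# Rules of thumb from the conversion table (approximations, not a tokenizer).
WORDS_PER_TOKEN_EN = 0.75          # 1 token is roughly 0.75 English words
TOKENS_PER_CJK_CHAR = (1.5, 2.0)   # 1 Chinese character is roughly 1.5-2 tokens

# English words, allowing straight or curly apostrophes inside a word.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")
# Characters in the CJK Unified Ideographs block (punctuation is excluded).
CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def estimate_tokens(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate for English and/or Chinese text."""
    words = len(WORD_RE.findall(text))
    cjk_chars = len(CJK_RE.findall(text))
    english_tokens = words / WORDS_PER_TOKEN_EN
    low = english_tokens + cjk_chars * TOKENS_PER_CJK_CHAR[0]
    high = english_tokens + cjk_chars * TOKENS_PER_CJK_CHAR[1]
    return round(low), round(high)


english = "Hello, the weather is nice today, let’s go to the park!"
chinese = "你好，今天天氣真好，我們去公園散步吧！"

print(estimate_tokens(english))  # 11 words      -> (15, 15)
print(estimate_tokens(chinese))  # 16 characters -> (24, 32)
```

The estimates land near, not on, the counts quoted above. For exact numbers, use the tokenizer published for the model you are calling.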

Cost per Exchange

Formula:

Exchange cost =
  Input tokens × input price
+ Output tokens × output price

Key concept: input includes the conversation history.

Message 1 from you: 100 tokens
AI reply 1: 300 tokens
────────────────────────────────
Message 2 from you: 80 tokens + prior 400 = 480 tokens input
AI reply 2: 250 tokens
────────────────────────────────
Message 3 from you: 60 tokens + prior 730 = 790 tokens input
AI reply 3: 200 tokens

The longer the chat, the more each request costs, because every turn carries history. This is why context compression exists.
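The three-turn example above can be reproduced in a few lines. This is a sketch that assumes the full history is resent on every turn and nothing else (such as a system prompt) is in the context. The function names are hypothetical, and the prices in the usage line are illustrative placeholders expressed in USD per 1M tokens.

```python
def billed_inputs(turns: list[tuple[int, int]]) -> list[int]:
    """Input tokens billed on each turn when the full history is resent.

    `turns` is a list of (user_message_tokens, reply_tokens) pairs.
    """
    history = 0
    inputs = []
    for user_tokens, reply_tokens in turns:
        inputs.append(history + user_tokens)       # new message + everything before it
        history += user_tokens + reply_tokens      # both sides join the history
    return inputs


def exchange_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD for one exchange; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


turns = [(100, 300), (80, 250), (60, 200)]
inputs = billed_inputs(turns)
print(inputs)  # [100, 480, 790]

# Whole conversation at illustrative prices of $2.50 in / $10.00 out per 1M tokens.
total = sum(
    exchange_cost(billed, reply, 2.50, 10.00)
    for billed, (_, reply) in zip(inputs, turns)
)
print(f"${total:.6f}")
```

Note that the raw messages add up to 990 tokens, while the billed input alone is 1,370 tokens: the difference is history being paid for again on every turn.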

Model Pricing (2026-04)

OpenAI

Model	Input / 1M	Output / 1M	Best For
GPT-4o	$2.50	$10.00	Complex reasoning, creative work
GPT-4o-mini	$0.15	$0.60	Daily chat, support
o1	$15.00	$60.00	Deep reasoning (slow + expensive)
o1-mini	$3.00	$12.00	Mid-tier reasoning

Anthropic

Model	Input / 1M	Output / 1M	Best For
Claude 3.5 Sonnet	$3.00	$15.00	Code, long-context
Claude 3.5 Haiku	$0.80	$4.00	Fast responses
Claude 3 Opus	$15.00	$75.00	Flagship

Google

Model	Input / 1M	Output / 1M	Best For
Gemini 1.5 Pro	$1.25	$5.00	Long context (2M)
Gemini 1.5 Flash	$0.075	$0.30	Cheap, high volume
Gemini 2.0 Flash	$0.10	$0.40	Newer version

Others

Provider	Note
Groq (Llama, Mixtral)	Fast and cheap
DeepSeek	Very low price, strong on Chinese
Azure OpenAI	Same models as OpenAI, slightly different pricing

Prices change often — providers are in a price war. The table above reflects 2026-04; confirm current pricing on each vendor’s site.

Example: One Day of Support Conversations

Ada handles 50 customer questions per day. Each conversation averages 5 turns, with ~100 tokens per turn on either side.

Total tokens:

Message portion: 50 × 5 × 2 × 100 = 50,000 tokens (raw messages)
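History accumulation: each turn resends the prior turns, so billed input grows well beyond the raw messages. This example budgets roughly 150,000 input tokens + 50,000 output tokens per day, a generous round-up (strict arithmetic on the figures above gives 125,000 input and 25,000 output).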
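The accumulation can be checked with strict arithmetic. This sketch assumes every message and reply is exactly 100 tokens, the full history is resent each turn, and there is no system prompt. `daily_tokens` is a hypothetical helper.

```python
def daily_tokens(conversations: int, turns: int, tokens_per_message: int) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) per day with full history resent each turn."""
    # Turn k (1-based) sends the new message plus (k - 1) earlier message/reply pairs.
    input_per_conversation = sum(
        tokens_per_message + (k - 1) * 2 * tokens_per_message
        for k in range(1, turns + 1)
    )
    output_per_conversation = turns * tokens_per_message
    return (conversations * input_per_conversation,
            conversations * output_per_conversation)


print(daily_tokens(conversations=50, turns=5, tokens_per_message=100))  # (125000, 25000)
```

Under these assumptions the totals come out below the rounded 150,000 / 50,000 figures. The gap may reflect system prompts or headroom, which this sketch does not model, so treat the rounded figures as a generous budget.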

By model:

Model	Cost / Day (USD)	Cost / Month (30 days, USD)
GPT-4o-mini	$0.05	$1.50
GPT-4o	$0.88	$26.40
Claude 3.5 Sonnet	$1.20	$36.00
Gemini 1.5 Flash	$0.03	$0.90

Takeaway: for the same workload, the most expensive model here (Claude Sonnet) costs about 40× as much as the cheapest (Gemini Flash).
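The daily figures follow directly from the cost formula and the listed prices. The sketch below recomputes them from 150,000 input and 50,000 output tokens per day. It assumes the Claude and Gemini rows refer to Claude 3.5 Sonnet and Gemini 1.5 Flash, since those prices reproduce the table.

```python
# USD per 1M tokens as (input, output), copied from the pricing tables on this page.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),   # assumed match for "Claude Sonnet"
    "Gemini 1.5 Flash": (0.075, 0.30),    # assumed match for "Gemini Flash"
}

DAILY_INPUT_TOKENS = 150_000
DAILY_OUTPUT_TOKENS = 50_000


def daily_cost(model: str) -> float:
    """Daily cost in USD for the support workload on the given model."""
    input_price, output_price = PRICES[model]
    return (DAILY_INPUT_TOKENS * input_price
            + DAILY_OUTPUT_TOKENS * output_price) / 1_000_000


for model in PRICES:
    cost = daily_cost(model)
    print(f"{model:<18} ${cost:.4f}/day   ${cost * 30:.2f}/month")
```

The table rounds each daily cost to cents before multiplying by 30, so unrounded monthly totals differ slightly. Before rounding, the spread between the most and least expensive model is about 46×.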

Checking Your Own Consumption

In the Admin Panel:

Open a companion’s tab → Usage tab
See Usage Dashboard Reference

Quickest “gut check”:

Ask the AI any question
Some chat UIs show “this exchange: X tokens” at the bottom
Multiply by the model’s unit price → cost of this exchange
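A chat UI often shows only a single total, with no split between input and output tokens. Since input and output are priced differently, one way to run the gut check is to bracket the cost between "all tokens billed at the cheaper rate" and "all tokens billed at the dearer rate". This is a sketch; `cost_bounds` is a hypothetical helper, and prices are USD per 1M tokens.

```python
def cost_bounds(total_tokens: int, input_price: float, output_price: float) -> tuple[float, float]:
    """Bound the USD cost of an exchange when only the total token count is known."""
    low_price, high_price = sorted((input_price, output_price))
    return (total_tokens * low_price / 1_000_000,
            total_tokens * high_price / 1_000_000)


# Example: the UI reports 1,000 tokens for an exchange on GPT-4o ($2.50 in / $10.00 out).
low, high = cost_bounds(1_000, input_price=2.50, output_price=10.00)
print(f"${low:.4f} to ${high:.4f}")  # $0.0025 to $0.0100
```

If the UI does report input and output separately, apply the cost formula directly instead.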

Estimating Your Bill

Personal use (~20 exchanges/day):

GPT-4o-mini → $5–10 / month
Claude Haiku → $15–30 / month
GPT-4o → $30–60 / month

Small-team support (100 exchanges/day):

GPT-4o-mini → $30–60 / month
Claude Sonnet → $200–400 / month

Mid-size e-commerce support (500 exchanges/day):

GPT-4o-mini → $150–300 / month
GPT-4o → $2,000–4,000 / month

Rule of thumb: picking the right model can cut cost by 10–20×. If your typical exchange is a “hello” answered by a quick “hi”, you probably don’t need o1 at $15 per 1M input tokens.
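To project a bill for your own workload, the same formula scales by exchanges per day and days per month. This is a sketch, and `monthly_bill` is a hypothetical helper. The average token counts per exchange below are placeholders, not the assumptions behind the ranges above; substitute averages from your own usage data.

```python
def monthly_bill(exchanges_per_day: int, avg_input_tokens: int, avg_output_tokens: int,
                 input_price: float, output_price: float, days: int = 30) -> float:
    """Projected monthly cost in USD; prices are USD per 1M tokens."""
    per_exchange = (avg_input_tokens * input_price
                    + avg_output_tokens * output_price) / 1_000_000
    return exchanges_per_day * days * per_exchange


# Placeholder averages per exchange (history included in the input figure).
AVG_INPUT, AVG_OUTPUT = 3_000, 300

# Prices copied from the OpenAI table on this page.
for name, (input_price, output_price) in {
    "GPT-4o-mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
}.items():
    bill = monthly_bill(100, AVG_INPUT, AVG_OUTPUT, input_price, output_price)
    print(f"{name}: ${bill:.2f}/month")
```

The result is very sensitive to the average input size, because history and any system prompt are billed on every exchange.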