Costs & Usage
AI is not priced like traditional SaaS. Every conversation consumes tokens; every token has a cost. This section helps you track the spend and shrink it.
Three Things to Know
1. realvco Subscription ≠ AI Usage Fees
- realvco monthly fee: pays for the host and operations
- AI usage fees: paid to OpenAI / Anthropic / Google for API calls
These are billed separately. A realvco subscription gets you the host and companion framework; the AI API keys are yours (or procured via realvco).
2. Tokens Are the Unit — Not Message Count
- 1 Chinese character ≈ 1.5–2 tokens
- 1 English word ≈ 1.3 tokens
- AI replies also count, and output tokens typically cost 3–5× as much as input tokens
- Longer conversations compound — each request includes the full history
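A typical short exchange (you ask 100 words, the AI answers 300 words) is about 520 tokens of text at the rates above; once any system prompt and carried history are counted, budget roughly 1,000 tokens.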
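The rules of thumb above can be turned into a quick estimator. This is a rough sketch, not how any provider actually counts: the function names are made up for illustration, the 1.75 tokens-per-character figure is an assumed midpoint of the 1.5–2 range, and real counts depend on each provider's tokenizer.

```python
# Ballpark token estimates from the rules of thumb in this section.
# Real tokenizers differ by provider and model; use these for budgeting only.

TOKENS_PER_ENGLISH_WORD = 1.3
TOKENS_PER_CHINESE_CHAR = 1.75  # assumption: midpoint of the 1.5-2 range


def estimate_tokens(english_words: int = 0, chinese_chars: int = 0) -> int:
    """Estimate the token count of a piece of text."""
    return round(english_words * TOKENS_PER_ENGLISH_WORD
                 + chinese_chars * TOKENS_PER_CHINESE_CHAR)


def input_tokens_at_turn(turn: int, user_tokens: int, reply_tokens: int) -> int:
    """Input size of the Nth request (1-based) when the full history is resent.

    Counts conversation text only; any system prompt is extra.
    """
    earlier_exchanges = (turn - 1) * (user_tokens + reply_tokens)
    return earlier_exchanges + user_tokens


question = estimate_tokens(english_words=100)
answer = estimate_tokens(english_words=300)
print("question:", question, "answer:", answer)

# The same 100-word question costs far more input late in a conversation.
for turn in (1, 5, 10):
    print("turn", turn, "input tokens:", input_tokens_at_turn(turn, question, answer))
```

With these assumptions, the tenth request carries about 37 times the input of the first, even though the user typed the same amount each time.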
3. Models Vary Wildly in Price
Same task, different costs:
| Model | Input / 1M tokens | Output / 1M tokens | Best For |
|---|---|---|---|
| GPT-4o-mini | $0.15 | $0.60 | Daily chat, support |
| GPT-4o | $2.50 | $10.00 | Complex reasoning, creative work |
| Claude 3.5 Sonnet | $3.00 | $15.00 | Code, long-context |
| Claude 3.5 Haiku | $0.80 | $4.00 | Fast responses |
| Gemini 1.5 Flash | $0.075 | $0.30 | High volume, cheap |
GPT-4o costs about 17× as much as GPT-4o-mini ($2.50 vs. $0.15 per 1M input tokens, $10.00 vs. $0.60 for output). Daily chat rarely needs the premium tier.
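To compare models on your own traffic, the table's prices can be plugged into a small calculator. The sketch below copies the prices from the table (they may be out of date; check each provider's current pricing), and the 700-input / 300-output request is a made-up workload for illustration.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, with input and output priced separately."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 1,000 requests of 700 input + 300 output tokens each.
for model in PRICES:
    total = 1_000 * request_cost(model, input_tokens=700, output_tokens=300)
    print(f"{model:<18} ${total:.3f} per 1,000 requests")
```

Because output tokens cost several times more than input tokens, the input/output mix of a workload matters as much as its total token count.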
Deep Dives
- Usage Dashboard: what every number on the Admin Panel's Usage tab means
- How Tokens Are Priced: why a 1,000-word conversation can cost 5 cents
- Budget Alerts: monthly caps, daily alerts, hard stops
- Cost Optimization: switching models, compressing context, caching, rate limits
Top 5 Quick Wins
If costs feel high today, work through these in order:
- Downshift daily chat to a cheap model — move Ada from GPT-4o to GPT-4o-mini; expect ~80% savings
- Enable context compression — long conversations auto-summarize old turns, cutting history carried per request
- Cap response length — set `maxTokens` so the AI stops writing novels
- Set a monthly budget cap — stop before costs spiral
- Run bulk tasks on Gemini Flash — cheap enough to be nearly free
Each has step-by-step instructions in Cost Optimization.
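As a rough illustration of why context compression pays off, the sketch below compares total input tokens with and without it. It is a simplified model, not realvco's actual compression logic: `keep_recent` and `summary_tokens` are hypothetical parameters, every turn is assumed to be the same size, and the cost of generating the summary itself is ignored.

```python
from typing import Optional


def total_input_tokens(
    turns: int,
    user_tokens: int,
    reply_tokens: int,
    keep_recent: Optional[int] = None,  # hypothetical: exchanges kept verbatim
    summary_tokens: int = 0,            # hypothetical: size of the summary
) -> int:
    """Total input tokens billed across a whole conversation.

    keep_recent=None models resending the full history on every request.
    Otherwise, only the last `keep_recent` exchanges are sent verbatim and
    everything older is replaced by one summary of `summary_tokens` tokens.
    """
    exchange = user_tokens + reply_tokens
    total = 0
    for earlier in range(turns):  # exchanges completed before this request
        if keep_recent is None or earlier <= keep_recent:
            history = earlier * exchange
        else:
            history = keep_recent * exchange + summary_tokens
        total += history + user_tokens
    return total


full = total_input_tokens(30, user_tokens=130, reply_tokens=390)
compressed = total_input_tokens(30, user_tokens=130, reply_tokens=390,
                                keep_recent=3, summary_tokens=200)
print("full history:", full)
print("compressed:  ", compressed)
print(f"input saved:  {1 - compressed / full:.0%}")
```

The saving grows with conversation length: full-history input grows quadratically with the number of turns, while the compressed version grows linearly once the cap is reached.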
Related Docs
- Admin Panel Overview
- Engine Selection Guide — engine choice also affects cost