realvco Docs

Costs & Usage

AI is not priced like traditional SaaS. Every conversation consumes tokens; every token has a cost. This section helps you track the spend and shrink it.


Three Things to Know

1. realvco Subscription ≠ AI Usage Fees

  • realvco monthly fee: pays for the host and operations
  • AI usage fees: paid to OpenAI / Anthropic / Google for API calls

These are billed separately. A realvco subscription gets you the host and companion framework; the AI API keys are yours (or procured via realvco).

2. Tokens Are the Unit — Not Message Count

  • 1 Chinese character ≈ 1.5–2 tokens
  • 1 English word ≈ 1.3 tokens
  • AI replies also count (output tokens typically 3–5× the input price)
  • Longer conversations compound — each request includes the full history

A typical short exchange (you ask 100 words, AI answers 300 words) is roughly 1,000 tokens.

3. Models Vary Wildly in Price

Same task, different costs:

ModelInput / 1M tokensOutput / 1M tokensBest For
GPT-4o-mini$0.15$0.60Daily chat, support
GPT-4o$2.50$10.00Complex reasoning, creative work
Claude 3.5 Sonnet$3.00$15.00Code, long-context
Claude 3.5 Haiku$0.80$4.00Fast responses
Gemini 1.5 Flash$0.075$0.30High volume, cheap

GPT-4o costs 17× more than GPT-4o-mini. Daily chat rarely needs the premium tier.


Deep Dives


Top 5 Quick Wins

If costs feel high today, work through these in order:

  1. Downshift daily chat to a cheap model — move Ada from GPT-4o to GPT-4o-mini; expect ~80% savings
  2. Enable context compression — long conversations auto-summarize old turns, cutting history carried per request
  3. Cap response length — set maxTokens so the AI stops writing novels
  4. Set a monthly budget cap — stop before costs spiral
  5. Run bulk tasks on Gemini Flash — cheap enough to be nearly free

Each has step-by-step instructions in Cost Optimization.