Your OpenClaw agent is running. It's useful. But your Anthropic invoice just hit and... wait, $187 last month? For a personal assistant?

You're not alone. The most common complaint in the OpenClaw community isn't bugs or features. It's cost. And almost every expensive setup I've audited has the same fixable problems.

1. Stop Using Your Best Model for Everything

This is the #1 money pit. If your openclaw.json has one model configured, every single message goes through that model. Every "ok, done" confirmation. Every "good morning" reply.

The fix: Model routing. Use cheap models for simple tasks, premium for complex ones.

| Task | With Opus ($0.15/msg) | With Routing | Savings |
| --- | --- | --- | --- |
| "Good morning" reply | $0.15 | $0.001 (Flash) | 99% |
| Calendar check | $0.15 | $0.007 (Haiku) | 95% |
| Write a blog post | $0.15 | $0.15 (Opus) | 0% |
| Heartbeat check | $0.15 | $0.00 (local) | 100% |

💰 Typical result: 60-70% savings from routing alone
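Here's a rough sketch of the routing idea in Python. This is not OpenClaw's config format or API: the tier names, the keyword heuristics, and the `route` function are all made up for illustration. Only the per-message costs come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_msg: float  # USD per message, from the table above


# Hypothetical tiers; swap in whatever models you actually run.
LOCAL = Tier("local", 0.00)
FLASH = Tier("flash", 0.001)
HAIKU = Tier("haiku", 0.007)
OPUS = Tier("opus", 0.15)

# Toy signal for "this needs the premium model" (an assumption, tune it).
COMPLEX_KEYWORDS = ("write", "draft", "refactor", "analyze")


def route(message: str, is_heartbeat: bool = False) -> Tier:
    """Pick the cheapest tier that plausibly fits the message."""
    text = message.strip().lower()
    if is_heartbeat:
        return LOCAL
    if any(word in text for word in COMPLEX_KEYWORDS):
        return OPUS
    if len(text.split()) <= 4:
        return FLASH  # greetings, "ok, done", and similar
    return HAIKU


def savings_vs_opus(tier: Tier) -> float:
    """Percent saved compared with sending the same message to Opus."""
    return round(100 * (1 - tier.cost_per_msg / OPUS.cost_per_msg), 1)
```

Keyword matching is crude. A real setup would more likely route by channel, task type, or a cheap classifier, but the shape is the same: decide the tier first, then call the model.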

2. Run a Local Model for the Easy Stuff

If you're on a Mac (any Apple Silicon) or a Linux box with 16GB+ RAM, you can run Ollama with a small model for $0/month. Forever.

| Model | Size | RAM | Speed | Good For |
| --- | --- | --- | --- | --- |
| Qwen 2.5 7B | 4.7GB | 8GB+ | Fast | Quick tasks, routing |
| Llama 3 8B | 4.7GB | 8GB+ | Fast | General use, basic coding |
| Mistral 7B | 4.1GB | 8GB+ | Fast | Multilingual |
| Qwen 2.5 14B | 8.9GB | 16GB+ | Medium | Better reasoning, still free |
| Qwen 3.5 32B | 19GB | 32GB+ | Slower | Near-Sonnet quality |

The hybrid approach: Local model handles 80% of messages. Cloud API handles the complex 20%. Monthly bill drops from $80 to $10-15.
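If you want to poke at the local side directly, here's a minimal sketch of calling Ollama over its local HTTP API from Python. It assumes Ollama is running on its default port (11434) and that you've already pulled a model; the `qwen2.5:7b` tag and the helper names `build_request` and `ask_local` are assumptions for illustration, not anything OpenClaw ships.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes a stock install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "qwen2.5:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, timeout: float = 30.0) -> str:
    """Send the prompt to the local model and return its reply text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local("Reply with a one-line good morning.")` costs nothing per call. If the server isn't running, `urlopen` raises an error, which is a reasonable place to fall back to a cloud model.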

3. Set Token Limits and Context Windows

Every message includes context: memory, conversation history, system prompts. After a week, your agent can be sending 50,000+ tokens with every message. At Sonnet rates ($3 per million input tokens), that's ~$0.15 per message just in input tokens.

💰 20-40% reduction in per-message costs
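The arithmetic is easy to check, and trimming is simple to sketch. In the Python below, the function names are hypothetical and the 4-characters-per-token estimate is a rough assumption; use your provider's tokenizer if you need real counts.

```python
def input_cost(tokens: int, usd_per_million: float) -> float:
    """Input-token cost in USD for a single message."""
    return tokens * usd_per_million / 1_000_000


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit within max_tokens."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

`input_cost(50_000, 3.0)` comes to $0.15, matching the figure above. Capping history with something like `trim_history(history, 8_000)` before each call keeps the input side from growing without limit.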

4. Batch Your Agent's Work

Your agent doesn't need to respond instantly to everything. If you're using heartbeats, each one costs a full API call.

💰 30-50% reduction in total API calls
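To see why the heartbeat interval matters, count the calls. The helper below is a hypothetical one-liner; the 5-minute and 30-minute intervals are the ones mentioned elsewhere in this post. It counts heartbeat calls only, so it's one input to your total, not the whole picture.

```python
def heartbeats_per_month(interval_minutes: int, days: int = 30) -> int:
    """Number of heartbeat API calls in a month at a fixed interval."""
    calls_per_day = (24 * 60) // interval_minutes
    return calls_per_day * days


every_5_min = heartbeats_per_month(5)    # 8640 calls
every_30_min = heartbeats_per_month(30)  # 1440 calls
```

Going from 5 minutes to 30 minutes cuts heartbeat calls by a factor of six before you change anything else.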

5. Monitor What You're Actually Spending

You can't optimize what you don't measure.
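If your provider's dashboard isn't enough, a few lines of bookkeeping go a long way. The sketch below is a hypothetical `SpendTracker`, not part of OpenClaw. It counts input tokens only (real bills also include output tokens), and its default thresholds are the $50 and $100 alerts suggested in the playbook.

```python
from collections import defaultdict


class SpendTracker:
    """Accumulates estimated input-token spend per model and flags alerts."""

    def __init__(self, alerts=(50.0, 100.0)):
        self.alerts = sorted(alerts)          # monthly thresholds in USD
        self.by_model = defaultdict(float)    # model name -> USD so far

    def record(self, model: str, input_tokens: int, usd_per_million: float) -> None:
        """Add one call's estimated input cost to the running total."""
        self.by_model[model] += input_tokens * usd_per_million / 1_000_000

    @property
    def total(self) -> float:
        return sum(self.by_model.values())

    def crossed_alerts(self) -> list[float]:
        """Thresholds that the running total has reached or passed."""
        return [a for a in self.alerts if self.total >= a]
```

Call `record(...)` after every API response and check `crossed_alerts()` once a day. The per-model breakdown in `by_model` is usually the interesting part: it shows which model is eating the budget.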

6. Pick the Right Provider

| Provider | Budget | Mid-Tier | Premium | Notes |
| --- | --- | --- | --- | --- |
| Anthropic | Haiku $0.25/M | Sonnet $3/M | Opus $15/M | Best tool calling |
| OpenAI | 4o Mini $0.15/M | GPT-4o $2.50/M | o1 $15/M | Broad ecosystem |
| Google | Flash $0.075/M | Pro $1.25/M | Ultra TBD | Cheapest budget |
| DeepSeek | V3 $0.27/M | n/a | R1 $0.55/M | Near-Sonnet at 1/5 price |
| Local (Ollama) | Free | Free | Free | Needs hardware |

⚡ The Optimization Playbook (Do This Today)

  1. Configure model routing: cheap for simple, premium for complex
  2. Install Ollama with Qwen 2.5 7B for heartbeats and quick replies
  3. Trim your system prompt to under 500 words
  4. Set context window limits (8K-16K tokens)
  5. Increase heartbeat intervals to 30+ minutes
  6. Set billing alerts at $50 and $100/month
  7. Review your API dashboard weekly for the first month

Expected result: $80-150/month → $15-40/month

Before/After: Real Examples

Personal Assistant

Before: Claude Opus for everything, 5-min heartbeats, 32K context. $187/month.

After: Haiku for quick tasks, Sonnet for complex, local for heartbeats, 8K context. $28/month.

💰 85% savings

Business Agent (5-person team)

Before: GPT-4o for all channels including group chat. $340/month.

After: 4o Mini for group monitoring, GPT-4o for DMs and tasks, context trimming. $95/month.

💰 72% savings

Developer Agent

Before: Opus for coding, Opus for everything else too. $420/month.

After: Opus for coding only, Qwen 3.5 32B local for everything else. $65/month.

💰 85% savings

Don't Want to Do This Yourself?

ClawReady's $49 Cost & Security Audit does all of this for you. Full spending analysis, model routing configuration, local model recommendations, context optimization, and security review included.

Average client saves $50-120/month after an audit. At that rate, the $49 pays for itself within the first month.

Book Your $49 Audit →

Written by the team at ClawReady. We set up and optimize OpenClaw agents for a living. If your agent is costing more than $50/month, we can probably fix that.