Your OpenClaw agent is running. It's useful. But your Anthropic invoice just hit and, wait, $187 last month? For a personal assistant?
You're not alone. The most common complaint in the OpenClaw community isn't bugs or features. It's cost. And almost every expensive setup I've audited has the same fixable problems.
1. Stop Using Your Best Model for Everything
This is the #1 money pit. If your openclaw.json has one model configured, every single message goes through that model. Every "ok, done" confirmation. Every "good morning" reply.
| Task | With Opus ($0.15/msg) | With Routing | Savings |
|---|---|---|---|
| "Good morning" reply | $0.15 | $0.001 (Flash) | 99% |
| Calendar check | $0.15 | $0.007 (Haiku) | 95% |
| Write a blog post | $0.15 | $0.15 (Opus) | 0% |
| Heartbeat check | $0.15 | $0.00 (local) | 100% |
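Here's what that routing logic might look like as a rough Python sketch. This is not OpenClaw's actual routing configuration: `pick_tier`, the keyword lists, and the tier names are hypothetical, and the per-message prices are just the figures from the table above.

```python
# Hypothetical routing sketch. Prices per message are copied from the table above.
COST_PER_MESSAGE = {
    "local": 0.00,   # heartbeat checks
    "flash": 0.001,  # greetings and acknowledgements
    "haiku": 0.007,  # simple lookups such as a calendar check
    "opus": 0.15,    # long-form writing and other complex work
}

# Illustrative keyword lists; a real setup would need something more robust.
SMALL_TALK = {"ok", "ok, done", "thanks", "good morning"}
LOOKUP_WORDS = ("calendar", "schedule", "remind", "what time")


def pick_tier(message: str, is_heartbeat: bool = False) -> str:
    """Return the cheapest tier that plausibly handles the message."""
    text = message.strip().lower()
    if is_heartbeat:
        return "local"
    if text in SMALL_TALK:
        return "flash"
    if any(word in text for word in LOOKUP_WORDS):
        return "haiku"
    return "opus"


def savings_vs_opus(tier: str) -> float:
    """Fraction saved compared with sending the same message to Opus."""
    return 1 - COST_PER_MESSAGE[tier] / COST_PER_MESSAGE["opus"]


if __name__ == "__main__":
    for msg in ["Good morning", "Check my calendar for Friday", "Write a blog post"]:
        tier = pick_tier(msg)
        print(f"{msg!r} -> {tier} (${COST_PER_MESSAGE[tier]:.3f}/msg)")
```

Keyword matching is crude, but it shows the point: anything that falls through to the last line is the only traffic that should ever reach your premium model.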
2. Run a Local Model for the Easy Stuff
If you're on a Mac (any Apple Silicon) or a Linux box with 16GB+ RAM, you can run Ollama with a small model for $0/month in API fees. Forever.
| Model | Size | RAM | Speed | Good For |
|---|---|---|---|---|
| Qwen 2.5 7B | 4.7GB | 8GB+ | Fast | Quick tasks, routing |
| Llama 3 8B | 4.7GB | 8GB+ | Fast | General use, basic coding |
| Mistral 7B | 4.1GB | 8GB+ | Fast | Multilingual |
| Qwen 2.5 14B | 8.9GB | 16GB+ | Medium | Better reasoning, still free |
| Qwen 3.5 32B | 19GB | 32GB+ | Slower | Near-Sonnet quality |
3. Set Token Limits and Context Windows
Every message includes context: memory, conversation history, system prompts. After a week, your agent is sending 50,000+ tokens with every message. At Sonnet rates ($3 per million input tokens), that's ~$0.15 per message just in input tokens.
- Set max context length: 8K-16K tokens is plenty
- Archive old conversations: your agent doesn't need last Tuesday's grocery list
- Use memory files: loaded once and referenced, not re-sent every message
- Trim system prompts: keep SOUL.md under 500 words; put details in reference files
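To see why the context cap matters, here's the arithmetic plus a minimal trimming sketch. Everything in it is illustrative: the 4-characters-per-token estimate is a rough assumption (use your provider's tokenizer for real numbers), and `trim_history` is a hypothetical helper, not an OpenClaw function.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept: list[str] = []
    used = 0
    # Walk backwards so the newest messages are kept first.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def input_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    sonnet = 3.0  # dollars per million input tokens
    print(f"50K context: ${input_cost(50_000, sonnet):.3f} per message")
    print(f" 8K context: ${input_cost(8_000, sonnet):.3f} per message")
```

Dropping from 50K to 8K tokens takes the input cost of each message from about $0.15 to about $0.024 at the same model.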
4. Batch Your Agent's Work
Your agent doesn't need to respond instantly to everything. If you're using heartbeats, each one costs a full API call.
- Increase heartbeat intervals: 30 min instead of 5 min = 6x fewer calls
- Use HEARTBEAT_OK: simple ack instead of full reasoning cycle
- Batch notifications: check email hourly, not every 10 minutes
- Route overnight heartbeats through a local model (free)
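The heartbeat math is easy to check yourself. A quick sketch, assuming a 30-day month and a flat cost per call; the $0.007 figure is the Haiku per-message number from the routing table earlier, used here purely as an example.

```python
MINUTES_PER_DAY = 24 * 60


def heartbeats_per_day(interval_minutes: int) -> int:
    """Number of heartbeat calls in a day at a fixed interval."""
    return MINUTES_PER_DAY // interval_minutes


def monthly_heartbeat_cost(interval_minutes: int, cost_per_call: float, days: int = 30) -> float:
    """Dollar cost of heartbeats over `days` days at a flat cost per call."""
    return heartbeats_per_day(interval_minutes) * days * cost_per_call


if __name__ == "__main__":
    for interval in (5, 30):
        calls = heartbeats_per_day(interval)
        cost = monthly_heartbeat_cost(interval, cost_per_call=0.007)
        print(f"every {interval} min: {calls} calls/day, ${cost:.2f}/month")
```

At 5-minute intervals that's 288 calls a day; at 30 minutes it's 48. Same agent, one sixth of the heartbeat bill.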
5. Monitor What You're Actually Spending
You can't optimize what you don't measure.
- Cost per channel: group chats multiply costs (agent processes every message)
- Cost per task type: coding at $0.15/msg might be worth it; "what time is it?" is not
- Daily spending trends: catch spikes early; a runaway heartbeat loop can burn $20 overnight
- Set billing alerts at your API provider ($50/mo, $100/mo thresholds)
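If your provider's dashboard doesn't break spending down this way, you can do it from your own logs. The sketch below assumes a hypothetical log format (a list of dicts with `channel`, `task`, and `cost` keys) and flags any day that costs more than twice the median day. Both the format and the 2x rule are illustrative choices, not anything OpenClaw ships.

```python
from collections import defaultdict
from statistics import median


def cost_by(records: list[dict], key: str) -> dict[str, float]:
    """Sum the 'cost' field of each record, grouped by the given key."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    return dict(totals)


def spike_days(daily_totals: dict[str, float], factor: float = 2.0) -> list[str]:
    """Days whose spend exceeds `factor` times the median day."""
    if not daily_totals:
        return []
    threshold = factor * median(daily_totals.values())
    return [day for day, total in daily_totals.items() if total > threshold]


def crossed_thresholds(month_to_date: float, thresholds=(50.0, 100.0)) -> list[float]:
    """Alert thresholds (in dollars) that month-to-date spend has reached."""
    return [t for t in thresholds if month_to_date >= t]


if __name__ == "__main__":
    log = [
        {"channel": "dm", "task": "coding", "cost": 0.15},
        {"channel": "group", "task": "chat", "cost": 0.15},
        {"channel": "group", "task": "chat", "cost": 0.15},
    ]
    print(cost_by(log, "channel"))
    print(spike_days({"mon": 2.0, "tue": 2.5, "wed": 21.0, "thu": 2.2}))
```

A day like the `wed` entry in the example is exactly what a runaway heartbeat loop looks like in the numbers.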
6. Pick the Right Provider
| Provider | Budget | Mid-Tier | Premium | Notes |
|---|---|---|---|---|
| Anthropic | Haiku $0.25/M | Sonnet $3/M | Opus $15/M | Best tool calling |
| OpenAI | 4o Mini $0.15/M | GPT-4o $2.50/M | o1 $15/M | Broad ecosystem |
| Google | Flash $0.075/M | Pro $1.25/M | Ultra TBD | Cheapest budget |
| DeepSeek | V3 $0.27/M | N/A | R1 $0.55/M | Near-Sonnet at 1/5 price |
| Local (Ollama) | Free | Free | Free | Needs hardware |
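To turn those per-million prices into a monthly number for your own workload, multiply them out. The sketch below treats the table's figures as input-token prices, which is an assumption (the table doesn't say), and it ignores output tokens entirely, so read the results as a floor. The workload numbers are made up.

```python
# Prices copied from the table above, in dollars per million tokens.
# Assumption: these are input-token prices; output tokens cost extra.
PRICE_PER_MILLION = {
    "Haiku": 0.25,
    "Sonnet": 3.00,
    "Opus": 15.00,
    "4o Mini": 0.15,
    "GPT-4o": 2.50,
    "o1": 15.00,
    "Flash": 0.075,
    "Pro": 1.25,
    "V3": 0.27,
    "R1": 0.55,
}


def monthly_input_cost(messages: int, tokens_per_message: int, price_per_million: float) -> float:
    """Dollar cost of input tokens for a month of traffic."""
    return messages * tokens_per_message / 1_000_000 * price_per_million


def rank_models(messages: int, tokens_per_message: int) -> list[tuple[str, float]]:
    """(model, monthly input cost) pairs, cheapest first."""
    costs = [
        (model, monthly_input_cost(messages, tokens_per_message, price))
        for model, price in PRICE_PER_MILLION.items()
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 1,000 messages a month at 8K tokens of context each.
    for model, cost in rank_models(1_000, 8_000):
        print(f"{model:8s} ${cost:7.2f}")
```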
The Optimization Playbook (Do This Today)
- Configure model routing: cheap for simple, premium for complex
- Install Ollama with Qwen 2.5 7B for heartbeats and quick replies
- Trim your system prompt to under 500 words
- Set context window limits (8K-16K tokens)
- Increase heartbeat intervals to 30+ minutes
- Set billing alerts at $50 and $100/month
- Review your API dashboard weekly for the first month
Expected result: $80-150/month down to $15-40/month
Before/After: Real Examples
Personal Assistant
Before: Claude Opus for everything, 5-min heartbeats, 32K context. $187/month.
After: Haiku for quick tasks, Sonnet for complex, local for heartbeats, 8K context. $28/month.
Business Agent (5-person team)
Before: GPT-4o for all channels including group chat. $340/month.
After: 4o Mini for group monitoring, GPT-4o for DMs and tasks, context trimming. $95/month.
Developer Agent
Before: Opus for coding, Opus for everything else too. $420/month.
After: Opus for coding only, Qwen 3.5 32B local for everything else. $65/month.
Don't Want to Do This Yourself?
ClawReady's $49 Cost & Security Audit does all of this for you. Full spending analysis, model routing configuration, local model recommendations, context optimization, and security review included.
Average client saves $50-120/month after an audit. The $49 pays for itself in the first week.
Book Your $49 Audit
Written by the team at ClawReady. We set up and optimize OpenClaw agents for a living. If your agent is costing more than $50/month, we can probably fix that.