CrabTrap: Brex Built an LLM-as-a-Judge HTTP Proxy to Run OpenClaw Agents Safely in Production
Brex — the corporate card and financial software company — open-sourced CrabTrap this week: an HTTP/HTTPS proxy that intercepts every outbound network request from an AI agent and uses an LLM-as-a-judge to determine if it should be allowed.
The motivation, straight from Brex's engineering blog:
"While OpenClaw is the fastest-growing project on GitHub, there are few successful cases of enterprise deployments. Brex decided to change that."
The community summary is even punchier: "OpenClaw + Composio alone gets you to 'the agent works.' Add CrabTrap and you get to 'the agent works and I can sleep.'"
The Problem CrabTrap Solves
When you run an OpenClaw agent with real credentials — API keys, OAuth tokens, service accounts — those credentials can be used to take actions with production consequences. The agent can hallucinate destructive requests. It can get prompt-injected. And once a request leaves the process, it hits real APIs.
Existing guardrails have gaps:
- MCP gateways — enforce policy at protocol layer, but only for MCP traffic
- LLM provider guardrails — tied to a single model, opaque to customize
- NVIDIA OpenShell — sandbox-level egress control, not request-level judgment
- Scoped tools + per-action approvals — don't scale as agent capabilities grow
Brex needed something that sat between every agent and every network request, framework-agnostic, with the ability to make nuanced decisions. They built it.
How CrabTrap Works
Transport-layer interception: Set HTTP_PROXY and HTTPS_PROXY in the agent's environment. Every outbound request from an HTTP client that honors those variables then routes through CrabTrap, with no SDK, wrappers, or per-tool integration required. It works with OpenClaw regardless of which channels or tools it's using.
TLS interception for HTTPS: CrabTrap generates a certificate per host, negotiates TLS with the client, and opens a separate upstream connection to the real server. Optionally, add iptables rules that block direct outbound connections as an extra enforcement layer, so traffic that ignores the proxy settings is dropped rather than sent unjudged.
Two-stage evaluation on every request:
- Static rules first — deterministic URL pattern matching (prefix, exact, glob), optionally scoped to HTTP methods. Deny rules always take priority. Executes in microseconds from cached regexps.
- LLM-as-a-judge for the long tail: if no static rule matches, the full request context goes to the judge along with a natural-language policy for that agent. The judge returns a structured JSON decision: allow, deny, or allow-with-modification.
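The two-stage flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not CrabTrap's actual code: the `Rule` class, the `evaluate` function, and the `judge` callback are hypothetical names, and glob matching here uses Python's `fnmatch` instead of cached regexps.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical static rule; field names are illustrative only."""
    action: str                               # "allow" or "deny"
    kind: str                                 # "exact", "prefix", or "glob"
    pattern: str
    methods: Optional[FrozenSet[str]] = None  # None means any HTTP method

    def matches(self, method: str, url: str) -> bool:
        # A method-scoped rule only applies to the listed methods.
        if self.methods is not None and method.upper() not in self.methods:
            return False
        if self.kind == "exact":
            return url == self.pattern
        if self.kind == "prefix":
            return url.startswith(self.pattern)
        if self.kind == "glob":
            return fnmatchcase(url, self.pattern)
        raise ValueError(f"unknown rule kind: {self.kind}")


def evaluate(
    method: str,
    url: str,
    rules: List[Rule],
    judge: Callable[[str, str], str],
) -> str:
    """Stage 1: static rules, deny wins. Stage 2: ask the judge."""
    matched = [rule for rule in rules if rule.matches(method, url)]
    if any(rule.action == "deny" for rule in matched):
        return "deny"            # deny rules always take priority
    if matched:
        return "allow"           # at least one allow rule and no deny rule
    return judge(method, url)    # no static rule matched: defer to the judge
```

With an allow rule on a URL prefix and a deny rule on a glob scoped to `DELETE`, a `DELETE` that matches both is denied, and the judge is only called for requests that match neither rule.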
The natural-language policy is the key innovation. Instead of hand-coding every allowed API pattern, you write what the agent is supposed to do in plain English: "This agent manages calendar events and reads email. It should not write to external databases, send messages to contacts it hasn't received messages from first, or access financial APIs." The LLM judge enforces that intent.
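To make the judge step concrete, here is a rough Python sketch of how a policy and a request might be combined into a prompt, and how a structured JSON reply might be validated. Everything here is an assumption for illustration: the prompt wording, the `decision` and `reason` field names, and the choice to deny on any malformed reply are not taken from CrabTrap's source.

```python
import json

# Decision labels mirror the three outcomes described above; the exact
# strings are hypothetical.
ALLOWED_DECISIONS = {"allow", "deny", "allow-with-modification"}


def build_judge_prompt(policy: str, method: str, url: str, body: str) -> str:
    """Combine the natural-language policy with the request context."""
    return (
        "You are reviewing an outbound HTTP request made by an AI agent.\n"
        f"Policy:\n{policy}\n\n"
        f"Request:\n{method} {url}\n{body}\n\n"
        'Reply with JSON only: {"decision": "...", "reason": "..."}'
    )


def parse_decision(raw: str) -> dict:
    """Validate the judge's reply; treat anything malformed as a deny."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"decision": "deny", "reason": "judge returned invalid JSON"}
    if not isinstance(data, dict) or data.get("decision") not in ALLOWED_DECISIONS:
        return {"decision": "deny", "reason": "judge returned an unknown decision"}
    return {"decision": data["decision"], "reason": str(data.get("reason", ""))}
```

Failing closed on unparseable output is a design choice made for this sketch: a judge that can be confused into emitting free text should not thereby let a request through.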
Setup for OpenClaw
# Install CrabTrap
git clone https://github.com/brexhq/CrabTrap
cd CrabTrap && docker compose up -d
# In your OpenClaw environment
export HTTP_PROXY=http://localhost:8080
export HTTPS_PROXY=http://localhost:8080
# Define your agent policy (natural language)
# CrabTrap evaluates all outbound requests against it
The proxy is framework-agnostic — it works with any OpenClaw channel, any tool, any skill. You don't change how OpenClaw is configured; you change where its network traffic goes.
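Because the whole setup depends on the agent's process actually seeing the proxy variables, a quick sanity check can help. The Python sketch below uses the standard library's `urllib.request.getproxies_environment()` to report whether the HTTP and HTTPS proxies point at the expected address. The function name `check_proxy_env` is hypothetical, and `http://localhost:8080` is simply the address used in the setup above.

```python
import urllib.request


def check_proxy_env(expected: str) -> dict:
    """Report, per scheme, whether the environment proxy equals `expected`."""
    # getproxies_environment() reads <scheme>_proxy variables from the
    # environment, case-insensitively, preferring the lowercase names.
    proxies = urllib.request.getproxies_environment()
    return {scheme: proxies.get(scheme) == expected for scheme in ("http", "https")}


if __name__ == "__main__":
    for scheme, ok in check_proxy_env("http://localhost:8080").items():
        status = "routed through proxy" if ok else "NOT routed through proxy"
        print(f"{scheme}: {status}")
```

This only confirms what the environment says. A client that ignores these variables would still bypass the proxy, which is the gap the optional iptables rules are meant to close.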
This Week's Security Layer Stack
This week brought the SecurityScorecard report (28K exposed instances), NVIDIA NemoClaw (sandbox-level execution), and CrabTrap (request-level LLM judgment). Combined with perimeter hardening, agent permissions, and patching, they fill out a layered security stack for OpenClaw production deployments:
| Layer | Tool | What It Protects |
|---|---|---|
| Network perimeter | Cloudflare Tunnel + gateway auth | Keeps gateway off public internet |
| Agent behavior | SOUL.md + AGENTS.md permissions | Limits what agent is allowed to attempt |
| Sandbox isolation | NVIDIA NemoClaw / OpenShell | Isolates agent execution environment |
| Request-level enforcement | CrabTrap | Judges every outbound API call in real time |
| Patch cadence | OpenClaw update monitoring | Keeps CVEs patched (3 active this week) |
For most individual and small business operators, the first two layers in the table (network perimeter hardening and proper SOUL.md/AGENTS.md configuration) are sufficient and achievable in a single setup session. CrabTrap and NemoClaw add meaningful protection for production deployments with real credentials doing real work.