Skip to content

trimwire — context pruning for Claude Code

Claude Code sessions degrade as context fills up. trimwire prunes the dead weight on every request — silently, in under 2 ms, with no certificate, no restart, and no credential changes. The same gateway pattern as LiteLLM, Vercel AI Gateway, and Cloudflare AI Gateway.

Every Claude Code session piles up dead weight: re-reads of files you’ve since edited, old tool output that scrolled off three tools ago, reasoning blocks from problems already solved. All of it ships to Anthropic on every request, crowding out the context that actually matters right now.

trimwire sits between Claude Code and Anthropic using ANTHROPIC_BASE_URL, the same gateway mechanism Anthropic documents and that LiteLLM, Vercel AI Gateway, and Cloudflare AI Gateway use.

01 · Intercept

Claude Code sends its normal /v1/messages request to trimwire on localhost instead of straight to api.anthropic.com.

02 · Prune

trimwire rewrites only the messages[] array: dedup, bloat trimming, stale-read removal, and more. It never touches system, tools[], your auth header, or sampling.

03 · Forward

The slimmer payload goes to Anthropic; the response streams back byte-for-byte, unbuffered. Overhead is sub-2 ms, off the network critical path.

04 · Stay warm

Stable-prefix re-pruning keeps the pruned prefix byte-identical turn-to-turn, so Anthropic’s prompt cache holds across the session.

Your on-disk Claude Code transcript is never modified. On any error trimwire forwards your original bytes unchanged. The worst case is “no pruning this turn.”

Truly transparent

No CA certificate, no TLS interception, no Claude Code restart, no system-prompt changes. Just the documented ANTHROPIC_BASE_URL.

Eight strategies, one opt-in ninth

The default profile runs all eight cache-safe strategies; simhash_dedup is opt-in.

Keeps the prompt cache warm

Stable-prefix re-pruning reuses pruning decisions while the conversation stays append-only, so the pruned prefix stays byte-identical.

Content-free ledger

The local SQLite ledger stores byte counts, hashes, and timings. Never message text. trimwire stats / recall show savings + cache health.

Two profiles, one knob

default prunes hardest; gentle is the lightest touch. Both keep stable-prefix re-pruning on.

Opt-in summarizer (multi-backend)

Off by default. For long reasoning-dense sessions, summarize old context with a local model (ollama) or a cloud API you choose; any failure falls back to model-free pruning. Configure with trimwire summarizer setup.

It doesn't store your code

The ledger records byte counts and prefix hashes, never message text. Disable it with [ledger] enabled = false.

Your transcript is untouched

trimwire shapes the outbound request only; it never writes your ~/.claude session files.

ToS-safe

ANTHROPIC_BASE_URL is Anthropic’s documented gateway mechanism. Claude Code stays verbatim; system is never altered.

Try it risk-free

trimwire preview <session.jsonl> replays the real strategies over a recorded transcript and shows exactly what would be trimmed (read-only, no network).

Full answers to “is it ToS-safe?”, “does it see my code?”, and “what if it crashes?” are on the FAQ & Trust page.

Requires Rust 1.85+ (edition 2024).

Terminal window
cargo binstall trimwire # prebuilt binary (fastest); or `cargo install trimwire`
# from source: cargo install --path . · script: curl -LsSf https://raw.githubusercontent.com/AZagatti/trimwire/main/scripts/install.sh | sh
# wire trimwire into Claude Code (idempotent, safe to re-run)
trimwire install
exec $SHELL # pick up the new ANTHROPIC_BASE_URL, then use `claude` as normal
# check the setup, then watch the savings
trimwire doctor # config + gateway health + ANTHROPIC_BASE_URL wiring
trimwire stats # bytes pruned, reduction %, ~tokens, per-strategy breakdown
trimwire stats --since 2026-06-01 # savings within a UTC date window

Optional: trimwire completions <shell> for tab-completion, trimwire man for man pages. On systemd / launchd, trimwire install starts an always-up service; trimwire uninstall reverses everything.


Open source under MIT OR Apache-2.0 · context pruning, not magic. It reports headroom, not dollar savings.