Why it matters
Caching policy changes can swing the effective cost of agentic coding dramatically. If your workflow depends on long-running projects and repeated context reuse, a cache-window reduction can turn ‘steady’ spend into surprise overages overnight.
Tokenmaxxing read
Tokenmaxxing isn’t just verbosity—it’s cache economics. Track cache hit rates and billed input tokens, budget for cache misses, and design flows to avoid replaying long context (summaries, state files, retrieval, smaller diffs).
Source takeaway
The article points to Claude Code JSONL logs (`~/.claude/projects/`) with cache fields like `ephemeral_5m_input_tokens` vs `ephemeral_1h_input_tokens`, plus a mid-April Anthropic comment on a Claude Code GitHub issue about the shorter cache.
