Politeness is Costing You Thousands of LLM Tokens and $$$
If you use Claude Code, Cursor, or GitHub Copilot for hours every day, you know that LLM verbosity is a real tax on your wallet and your focus.
There’s a project called Caveman that cuts output tokens by ~65% without losing technical accuracy.
- Faster iterations: responses arrive roughly 3x faster because the model skips filler like “I’d be happy to help you with that!”
- Lower costs: with ~65% fewer output tokens, heavy users see a proportional drop in spend.
- Cleaner context: a companion caveman-compress skill shrinks your memory files, saving input tokens every time you start a session.
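To see where the savings come from, here is a rough illustration (not from the Caveman repo) comparing a polite reply to a terse one. It uses a naive whitespace split as a stand-in tokenizer; real counts depend on the model’s actual tokenizer, so treat the numbers as a sketch.

```python
# Hypothetical example: polite vs. terse phrasing of the same answer.
polite = (
    "I'd be happy to help you with that! To rename the variable, "
    "you can simply update the declaration as shown below. Let me "
    "know if you have any other questions!"
)
terse = "Rename variable at declaration."

def approx_tokens(text: str) -> int:
    # Whitespace split: a crude stand-in for a real BPE tokenizer.
    return len(text.split())

saved = 1 - approx_tokens(terse) / approx_tokens(polite)
print(f"polite: {approx_tokens(polite)} tokens, terse: {approx_tokens(terse)} tokens")
print(f"~{saved:.0%} fewer output tokens")
```

The politeness here isn’t free: every filler phrase is billed at the same per-token rate as the actual answer.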
If you’re looking to optimize your AI dev workflow and stop paying for polite filler text, check out Julius Brussee’s Caveman repo, or just add the Caveman skill to your favorite coding editor.