I got curious where my tokens were actually going, so I wrote a quick audit tool that reads the Claude Code session logs.
Turns out the biggest sink isn't your prompts. It's the agent's own tool calls. In my case, grep alone ate 3.5M tokens across ~350 sessions. 1800+ calls, most of them dumping raw output that the agent barely used. Full file reads for one function signature. Complete test output when only failures matter.
So I built wrappers: function signatures without bodies (~90% smaller), test output with just failures, that kind of thing. 2.3M tokens saved so far.
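The signature-only wrapper idea can be sketched in a few lines. This is a hypothetical illustration, not tokenlean's actual code: for a Python file it emits just the def/class lines with bodies elided, which is roughly where a ~90% shrink comes from on large modules.

```python
# Hypothetical sketch: print a module's function/class signatures without
# bodies, so an agent reads a handful of lines instead of the whole file.
import ast
import sys

def signatures(source: str) -> str:
    """Return only the def/class lines of a module, bodies elided."""
    tree = ast.parse(source)
    nodes = [n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))]
    nodes.sort(key=lambda n: n.lineno)  # ast.walk is breadth-first; restore source order
    out = []
    for n in nodes:
        indent = " " * n.col_offset  # preserve nesting so methods stay under their class
        if isinstance(n, ast.ClassDef):
            out.append(f"{indent}class {n.name}: ...")
        else:
            args = ", ".join(a.arg for a in n.args.args)
            out.append(f"{indent}def {n.name}({args}): ...")
    return "\n".join(out)

if __name__ == "__main__":
    print(signatures(open(sys.argv[1]).read()))
```

A real wrapper would cover more languages (e.g. via tree-sitter), but the principle is the same: hand the model the interface, not the implementation.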
You can audit your own sessions without installing anything:
For a start they could make the answers less talkative?
I switched back to ChatGPT out of necessity: Claude stopped working for me after two queries, both of which got overly elaborate answers (about a simple web app config).
But Claude isn't alone.
It seems to be a recent (subjective) trend that Claude and ChatGPT give very lengthy answers on the free plans, with a lot of repetition of the original query.
I've gotten used to adding "answer briefly" to keep the noise in check.
https://github.com/edimuj/tokenlean
> Anthropic recently accidentally released part of its internal source code for Claude Code due to "human error".
I wonder who was counting on that human in the lead-up to this "human error" ...
That's why there are now work-hours restrictions