Claude Code recently had a notable billing incident: a user merely started the CLI and had not made any explicit request, yet a large local HERMES.md file was read automatically, generating a significant charge.
This is worth looking at because it exposes a new risk in AI coding tools. Once a tool automatically reads context, local files can become real token cost.
What Happened
The public issue shows that the user had a large HERMES.md file in the working directory. When Claude Code started, the CLI scanned and loaded project context. The problem was that this file was automatically included in context and counted toward API usage.
The user never explicitly asked the model to process that file, yet billing had already occurred. The trickier part is that this can happen during initialization or context preparation, so users may not immediately realize that cost is being generated.
Anthropic later replied in the issue that it would refund the abnormal charge and provide extra credits. That confirms the problem was acknowledged and handled, but it also reminds users that “automatic context” in an AI CLI is not free.
Why HERMES.md Triggered It
HERMES.md itself is not the point. It could be any large file: logs, exported documents, test data, database dumps, generated reports.
The real issue is the combination of three things:
- Claude Code automatically reads project context.
- The file being read may be large.
- Context tokens enter the billing path.
If a file is large enough, even being pulled in “incidentally” can create noticeable cost. For token-based models, stronger automation needs clearer boundaries.
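To see why size alone is enough to matter, here is a rough back-of-envelope sketch. Both numbers in it are illustrative assumptions, not Anthropic's actual tokenizer or rates: the common ~4 characters per token heuristic, and a hypothetical input price of $3 per million tokens.

```python
# Rough estimate of what auto-loading a large file can cost.
# Both constants are illustrative assumptions, not real pricing.
CHARS_PER_TOKEN = 4              # common rough heuristic
PRICE_PER_MILLION_TOKENS = 3.00  # hypothetical input price in USD

def estimated_cost(file_size_bytes: int) -> float:
    """Approximate dollar cost of sending a file as input context once."""
    tokens = file_size_bytes / CHARS_PER_TOKEN
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# A 10 MB file pulled into context on every session start:
size = 10 * 1024 * 1024
print(f"~{size // CHARS_PER_TOKEN:,} tokens, ~${estimated_cost(size):.2f} per inclusion")
```

Under these assumptions a single 10 MB file costs several dollars each time it is included, and automatic inclusion means that happens without any user decision.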
This Is Not an Ordinary Bug
An ordinary CLI bug may mean a failed command, wrong output, or broken feature. A billing bug is more sensitive because it affects the user’s bill directly.
For AI coding tools, the billing boundary can be blurry:
- System prompts consume tokens.
- Project rules consume tokens.
- Automatically read files consume tokens.
- Tool call results consume tokens.
- Retries, compression, and summaries can keep consuming tokens.
Users may see only “starting the tool” or “one chat,” while the background may already have sent multiple requests with a large amount of context.
How Users Can Reduce Risk
If you use Claude Code, Codex, Cline, or similar AI coding tools, start with a few habits:
- Do not put large files directly in the project root.
- Add logs, exported data, build outputs, and temporary files to ignore rules.
- Check whether the tool supports `.ignore` files, context exclusion, or file allowlists.
- Enable budget alerts or usage limits.
- Test in a small directory before running in a large repository.
If a repository must keep large files, explicitly tell the tool not to read them. Project rules can also say: do not proactively read logs, dumps, datasets, archives, or large Markdown files.
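Before pointing an AI tool at a repository, it helps to know which files would be expensive if pulled into context. Here is a small audit script; the 256 KB threshold is an arbitrary illustration, not any tool's default:

```python
# Pre-flight audit: list files in a project that are large enough to be
# expensive if an AI tool reads them into context automatically.
# The 256 KB threshold is an arbitrary illustration, not a tool default.
from pathlib import Path

THRESHOLD_BYTES = 256 * 1024

def oversized_files(root: str) -> list[tuple[str, int]]:
    """Return (path, size) pairs for files above the threshold, largest first."""
    hits = [
        (str(p), p.stat().st_size)
        for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_size > THRESHOLD_BYTES
    ]
    return sorted(hits, key=lambda t: t[1], reverse=True)

if __name__ == "__main__":
    for path, size in oversized_files("."):
        print(f"{size / 1024:8.0f} KB  {path}")
```

Anything this prints is a candidate for ignore rules or an explicit "do not read" instruction in the project rules.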
What Tool Vendors Should Improve
This cannot rely only on user caution. Tools should provide hard boundaries.
Better designs include:
- Initialization should not silently bill for large files.
- Reading very large files automatically should require confirmation.
- The CLI should show estimated tokens and cost range for the request.
- Common large files and generated directories should be ignored by default.
- Abnormal token spikes should have protective thresholds.
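The last two points can be sketched together. This is a hypothetical design, not Claude Code's actual implementation, and every threshold in it is an illustrative assumption: a confirmation gate for large automatic reads, plus a hard session budget that stops a token spike before it becomes a surprise bill.

```python
# Sketch of two vendor-side guards (hypothetical design, not any real tool's
# implementation): a confirmation gate for large auto-reads, and a running
# session token budget with a hard stop.
CHARS_PER_TOKEN = 4              # rough heuristic
AUTO_READ_TOKEN_LIMIT = 20_000   # ask the user above this (illustrative)
SESSION_TOKEN_BUDGET = 200_000   # hard stop per session (illustrative)

class TokenBudgetExceeded(Exception):
    pass

class ContextLoader:
    def __init__(self):
        self.tokens_used = 0

    def add_file(self, name: str, content: str,
                 confirm=lambda name, tokens: False) -> bool:
        """Add a file to context; return False if it was skipped."""
        tokens = len(content) // CHARS_PER_TOKEN
        # Guard 1: very large files require explicit user confirmation.
        if tokens > AUTO_READ_TOKEN_LIMIT and not confirm(name, tokens):
            return False  # skipped, nothing billed
        # Guard 2: never exceed the session budget silently.
        if self.tokens_used + tokens > SESSION_TOKEN_BUDGET:
            raise TokenBudgetExceeded(f"{name} would exceed the session budget")
        self.tokens_used += tokens
        return True
```

With guards like these, a large HERMES.md would either be skipped, prompt the user, or fail loudly; in no case would it bill silently.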
The more AI coding tools behave like autonomous agents, the more transparent their costs need to be. Otherwise users cannot judge how much a single operation will cost.
Summary
The Claude Code HERMES.md billing incident is essentially a conflict between automatic context and usage-based billing.
For users, the key is to control project context: do not expose large files to AI tools by default, and set budget and usage limits. For tool vendors, automatic file reading needs visible cost prompts and protective mechanisms.