<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Claude Code on KnightLi Blog</title>
        <link>https://www.knightli.com/en/tags/claude-code/</link>
        <description>Recent content in Claude Code on KnightLi Blog</description>
        <generator>Hugo -- gohugo.io</generator>
        <language>en</language>
        <lastBuildDate>Sun, 19 Apr 2026 18:27:23 +0800</lastBuildDate><atom:link href="https://www.knightli.com/en/tags/claude-code/index.xml" rel="self" type="application/rss+xml" /><item>
        <title>Karpathy&#39;s 65-Line CLAUDE.md: Helping AI Coding Avoid Three Common Mistakes</title>
        <link>https://www.knightli.com/en/2026/04/19/karpathy-claude-md-ai-coding-rules/</link>
        <pubDate>Sun, 19 Apr 2026 18:27:23 +0800</pubDate>
        
        <guid>https://www.knightli.com/en/2026/04/19/karpathy-claude-md-ai-coding-rules/</guid>
        <description>&lt;p&gt;A GitHub project about AI coding has been getting a lot of attention recently. Its core is not a complex codebase, but a roughly 65-line &lt;code&gt;CLAUDE.md&lt;/code&gt; file. The reason it attracted so many stars is not technical complexity. It is that it captures problems many people repeatedly run into when using AI to write code.&lt;/p&gt;
&lt;p&gt;The background starts with Andrej Karpathy&amp;rsquo;s observations on AI coding. Karpathy is an influential educator and engineer in AI: a Stanford PhD, an early OpenAI contributor, and a former Tesla AI leader responsible for Autopilot&amp;rsquo;s vision system. He has continued to share his views on large models, education, and AI tools, so his comments on changes in programming workflows tend to draw a lot of attention from developers.&lt;/p&gt;
&lt;p&gt;He once said that after using Claude Code for a few weeks, his programming style changed noticeably. Previously it was roughly 80% handwritten code and 20% AI assistance. Now it is closer to 80% AI-written code and 20% his own edits. He described it as &amp;ldquo;programming in English&amp;rdquo;, telling an LLM what to write through natural language.&lt;/p&gt;
&lt;p&gt;But he also pointed out several recurring problems in AI coding.&lt;/p&gt;
&lt;h2 id=&#34;01-wrong-assumptions&#34;&gt;01 Wrong Assumptions
&lt;/h2&gt;&lt;p&gt;The first problem is that models readily make assumptions on the user&amp;rsquo;s behalf, then keep writing along that path. They do not always surface their own confusion, and they do not always stop to ask questions when the requirement is ambiguous.&lt;/p&gt;
&lt;p&gt;For example, if the user only says &amp;ldquo;add a user export feature&amp;rdquo;, the model might assume it should export all users, output JSON, write to a local file, and skip any confirmation around permissions or fields. Only after the code is done does the user discover that the model&amp;rsquo;s understanding does not match the real scenario.&lt;/p&gt;
&lt;p&gt;A better approach is to list the uncertainties first: should it export all users or filtered results? Should it trigger a browser download or run as a background job? Which fields are needed? How large is the data set? Are there permission constraints? If these questions are not clarified, writing faster only means drifting farther.&lt;/p&gt;
&lt;h2 id=&#34;02-over-complexity&#34;&gt;02 Over-Complexity
&lt;/h2&gt;&lt;p&gt;The second problem is that models often turn simple problems into complex ones. A task that could be handled with one function might receive abstract classes, strategy patterns, factory patterns, configuration layers, and a pile of extension points that may never be needed.&lt;/p&gt;
&lt;p&gt;This kind of code can look well engineered, but in practice it raises maintenance cost. AI is especially good at generating large structures quickly, yet it does not always judge whether those structures are necessary. The result is that a task solvable in 100 lines balloons into 1,000.&lt;/p&gt;
&lt;p&gt;The test is straightforward: would a senior engineer look at the change and think it is over-designed? If the answer is yes, remove the extra layers and solve the current problem with the least code needed.&lt;/p&gt;
&lt;h2 id=&#34;03-collateral-damage&#34;&gt;03 Collateral Damage
&lt;/h2&gt;&lt;p&gt;The third problem is that models sometimes modify or delete code they do not fully understand. While fixing a small bug, they may casually change comments, reformat nearby code, clean up imports that look unused, or even touch logic unrelated to the current task.&lt;/p&gt;
&lt;p&gt;These &amp;ldquo;drive-by improvements&amp;rdquo; are risky because they expand the change scope and make review harder. The user may only want to fix a validator crash caused by an empty email, but the model may also enhance email validation, add username validation, and rewrite docstrings. In the end, it becomes hard to tell which line changed behavior.&lt;/p&gt;
&lt;p&gt;A safer rule is: only change what must be changed, and only clean up issues caused by your own change. Existing dead code, formatting problems, or historical baggage should not be touched unless the task explicitly asks for it. At most, mention it.&lt;/p&gt;
&lt;h2 id=&#34;04-turning-complaints-into-claudemd&#34;&gt;04 Turning Complaints Into CLAUDE.md
&lt;/h2&gt;&lt;p&gt;After Karpathy&amp;rsquo;s comments spread widely, developer Forrest Cheung did something clever: he organized these complaints into executable behavior rules and put them into a &lt;code&gt;CLAUDE.md&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;The project does not contain complicated code. Its key idea is to turn the most failure-prone parts of AI coding into clear working rules. They can be summarized as four principles.&lt;/p&gt;
&lt;p&gt;The first is to think before writing. Do not silently assume. Do not hide confusion. If a requirement has multiple interpretations, list them. If there is a simpler approach, say so. Ask when clarification is needed, and push back when warranted.&lt;/p&gt;
&lt;p&gt;The second is to keep things simple. Do not add features that were not requested. Do not abstract one-off code. Do not add unnecessary configuration. Do not write large amounts of defensive code for extremely unlikely scenarios. If 50 lines can solve it, do not write 200.&lt;/p&gt;
&lt;p&gt;The third is to make precise changes. Every changed line should trace directly back to the user&amp;rsquo;s request. Do not improve nearby code as a side quest. Do not refactor something that is not broken. Match the existing project style as much as possible.&lt;/p&gt;
&lt;p&gt;The fourth is goal-driven execution. Do not give the model only a vague instruction. Give it a verifiable success criterion. For example, &amp;ldquo;fix the bug&amp;rdquo; can become &amp;ldquo;write a test that reproduces the bug, then make it pass&amp;rdquo;; &amp;ldquo;add validation&amp;rdquo; can become &amp;ldquo;write invalid-input tests and make them pass&amp;rdquo;. The clearer the success criterion, the easier it is for the model to loop toward completion.&lt;/p&gt;
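&lt;p&gt;Condensed into a file, the four principles might look something like the sketch below. This is an illustrative outline in my own wording, not the actual project&amp;rsquo;s file:&lt;/p&gt;

```markdown
# CLAUDE.md (illustrative sketch, not the original project's file)

## Think before writing
- If a requirement has multiple interpretations, list them and ask.
- State your assumptions explicitly; never hide confusion.

## Keep it simple
- Solve the current problem with the least code that works.
- No speculative abstractions, configuration layers, or extension points.

## Make precise changes
- Every changed line must trace back to the request.
- Do not reformat, refactor, or "improve" unrelated code; mention issues instead.

## Work toward a verifiable goal
- Restate the success criterion (e.g. a failing test to make pass).
- Stop when the criterion is met.
```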
&lt;h2 id=&#34;05-why-it-took-off&#34;&gt;05 Why It Took Off
&lt;/h2&gt;&lt;p&gt;This project became popular not because the content is mysterious, but because it is close to real development work.&lt;/p&gt;
&lt;p&gt;Many people using AI for coding have seen similar scenes: the model confidently misunderstands the requirement, the code gets more complex as it goes, or it touches places it should not touch. The value of &lt;code&gt;CLAUDE.md&lt;/code&gt; is that it turns those experiences into collaboration rules that can be placed inside a project.&lt;/p&gt;
&lt;p&gt;The entry cost is also low: one file can start making a difference, with no complicated integration. Combined with Karpathy&amp;rsquo;s influence and the project&amp;rsquo;s practical comparison examples, it naturally spread through the Claude Code user base and the broader AI coding community.&lt;/p&gt;
&lt;p&gt;More importantly, these rules are not only for Claude Code. No matter which AI coding tool you use, the underlying issues are similar: the model needs to know when to ask, when to simplify, when to stop, and how to decide that the task is complete.&lt;/p&gt;
&lt;h2 id=&#34;06-what-developers-can-take-away&#34;&gt;06 What Developers Can Take Away
&lt;/h2&gt;&lt;p&gt;The lesson for ordinary developers is simple: AI coding is not about throwing one sentence at a model and waiting for a miracle. The effective approach is to give the model boundaries.&lt;/p&gt;
&lt;p&gt;When the requirement is unclear, ask it to expose its assumptions first. When the implementation starts getting complicated, ask it to return to the smallest viable solution. When changing code, keep it focused on the task goal. When finishing work, use tests, commands, or explicit checkpoints to verify the result.&lt;/p&gt;
&lt;p&gt;AI is already very capable at writing code, but it still needs good collaboration constraints. The fact that a short &lt;code&gt;CLAUDE.md&lt;/code&gt; can attract so much attention shows that developers do not only need smarter models. They also need more reliable ways of working.&lt;/p&gt;
&lt;p&gt;In short:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Think before writing to reduce wrong assumptions.&lt;/li&gt;
&lt;li&gt;Keep things simple to avoid over-design.&lt;/li&gt;
&lt;li&gt;Make precise changes to control change scope.&lt;/li&gt;
&lt;li&gt;Work toward goals with verifiable success criteria.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These four rules are not complicated, but they are practical. The prerequisite for AI coding to truly improve efficiency is not making the model write more. It is making it write more accurately, with less code, and under better control.&lt;/p&gt;
</description>
        </item>
        <item>
        <title>Using Claude Code Quota More Efficiently: Models, Context, Caching, and /compact</title>
        <link>https://www.knightli.com/en/2026/04/19/claude-code-usage-context-compact-notes/</link>
        <pubDate>Sun, 19 Apr 2026 15:29:06 +0800</pubDate>
        
        <guid>https://www.knightli.com/en/2026/04/19/claude-code-usage-context-compact-notes/</guid>
        <description>&lt;p&gt;Many Claude Code or Claude Max users run into the same problem: even after paying for Pro, Max 5x, or Max 20x, the usage warning appears quickly, or they have to wait for the next reset. This feels especially obvious when Claude Code reads many files, fixes complicated bugs, or runs long tasks in a large project.&lt;/p&gt;
&lt;p&gt;The key point is this: usage is not deducted linearly by &amp;ldquo;minutes.&amp;rdquo; It depends on the model, context length, attachments, codebase size, conversation history, tool calls, and current capacity. In the same 5-hour window, one person may work for a long time while another hits the limit in minutes. Usually the account is not broken; each request is simply too heavy.&lt;/p&gt;
&lt;p&gt;This note collects a set of practical habits for using quota more efficiently.&lt;/p&gt;
&lt;h2 id=&#34;01-first-understand-claudes-usage-window&#34;&gt;01 First Understand Claude&amp;rsquo;s Usage Window
&lt;/h2&gt;&lt;p&gt;Claude Pro and Max both have usage limits. Claude Code usage is shared with Claude on web, desktop, and mobile under the same subscription quota. Anthropic&amp;rsquo;s help center explains that message counts depend on message length, attachment size, current conversation length, and the model or feature used; it also notes that Claude Code usage is affected by project complexity, codebase size, and auto-accept settings.&lt;/p&gt;
&lt;p&gt;A simple way to think about it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pro: suitable for light usage and small projects.&lt;/li&gt;
&lt;li&gt;Max 5x: suitable for more frequent usage and larger codebases.&lt;/li&gt;
&lt;li&gt;Max 20x: suitable for heavier daily collaboration.&lt;/li&gt;
&lt;li&gt;Usage windows reset on a 5-hour session basis.&lt;/li&gt;
&lt;li&gt;Long messages, long conversations, large files, and complex tasks consume usage faster.&lt;/li&gt;
&lt;li&gt;Stronger models such as Opus hit limits faster than Sonnet.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So &amp;ldquo;I only used it for 20 minutes&amp;rdquo; does not explain much by itself. What matters is how much context Claude read during those 20 minutes, which model was used, whether large files were processed repeatedly, and whether the same long conversation kept accumulating more tasks.&lt;/p&gt;
&lt;h2 id=&#34;02-first-habit-do-not-default-to-the-most-expensive-model&#34;&gt;02 First Habit: Do Not Default to the Most Expensive Model
&lt;/h2&gt;&lt;p&gt;The Claude model family is commonly positioned like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Opus&lt;/code&gt;: strongest capability, suitable for complex reasoning, architecture decisions, and hard bugs.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Sonnet&lt;/code&gt;: balanced capability and cost, suitable for most everyday coding tasks.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Haiku&lt;/code&gt;: lighter, suitable for simple classification, summarization, and format conversion.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For daily scripts, small bug fixes, documentation cleanup, and code explanation, Sonnet is usually enough. Save Opus for cases such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Complex architecture design.&lt;/li&gt;
&lt;li&gt;Deep multi-file refactors.&lt;/li&gt;
&lt;li&gt;Bugs that are hard to reproduce.&lt;/li&gt;
&lt;li&gt;Long-chain troubleshooting.&lt;/li&gt;
&lt;li&gt;Tasks where the normal model is clearly stuck.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In Claude Code, use &lt;code&gt;/model&lt;/code&gt; to switch models, or set the default in &lt;code&gt;/config&lt;/code&gt;. A steadier habit is to use Sonnet by default and switch to Opus only at key points, rather than running the whole task on Opus.&lt;/p&gt;
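&lt;p&gt;As a quick reference, the switch can happen either inside a session or at startup. The exact aliases accepted may vary by Claude Code version, so treat the spellings below as illustrative:&lt;/p&gt;

```text
/model                  inside a session: open the model picker or switch models
claude --model sonnet   at startup: begin a new session on a given model
```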
&lt;h2 id=&#34;03-second-habit-control-context-do-not-drag-old-tasks-along&#34;&gt;03 Second Habit: Control Context, Do Not Drag Old Tasks Along
&lt;/h2&gt;&lt;p&gt;The longer the context, the more Claude needs to process on each turn, and the faster usage is consumed. The Claude Code docs explicitly recommend proactive context management:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;/clear&lt;/code&gt; when switching to an unrelated task.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/compact&lt;/code&gt; when one phase is done but important context should remain.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/context&lt;/code&gt; to see what is taking space.&lt;/li&gt;
&lt;li&gt;Configure a status line if you want continuous status visibility.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A useful rhythm:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;2
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;3
&lt;/span&gt;&lt;span class=&#34;lnt&#34;&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Small phase done: /compact
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Large task done: /clear
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Switching to unrelated work: /clear
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;Context usage getting high: /compact early
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;code&gt;/compact&lt;/code&gt; summarizes earlier conversation history while preserving key task state, conclusions, file paths, and remaining work. It reduces the amount of history carried into later requests. You can also add a short instruction:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/compact Preserve changed files, test results, remaining TODOs, and key design decisions
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Do not wait for automatic compaction. The docs note that Claude Code auto-compacts when context approaches the limit, but manually compacting at phase boundaries is usually easier to control.&lt;/p&gt;
&lt;h2 id=&#34;04-third-habit-long-conversations-and-large-files-make-every-request-heavier&#34;&gt;04 Third Habit: Long Conversations and Large Files Make Every Request Heavier
&lt;/h2&gt;&lt;p&gt;Many people assume that &amp;ldquo;I only asked one more question&amp;rdquo; should be cheap. But in a long conversation, that one question drags a lot of weight behind it: accumulated history, file summaries, tool definitions, and system rules.&lt;/p&gt;
&lt;p&gt;Things that easily bloat context include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Long conversations that are never cleared.&lt;/li&gt;
&lt;li&gt;Asking Claude to read entire large files.&lt;/li&gt;
&lt;li&gt;Pasting long logs, build output, or test output.&lt;/li&gt;
&lt;li&gt;Adding many screenshots or images at once.&lt;/li&gt;
&lt;li&gt;Asking it to repeatedly scan the whole repository.&lt;/li&gt;
&lt;li&gt;An overly long &lt;code&gt;CLAUDE.md&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Too many MCP servers enabled.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A more efficient approach: paste only key errors from logs, include only failing parts of test output, and let Claude use &lt;code&gt;rg&lt;/code&gt;, &lt;code&gt;head&lt;/code&gt;, &lt;code&gt;tail&lt;/code&gt;, and symbol search before reading only the necessary parts. If command-line filtering can shrink the content, do not paste the whole thing into context.&lt;/p&gt;
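&lt;p&gt;The filtering step can be as simple as a couple of standard commands run before anything is pasted. A minimal sketch, using &lt;code&gt;grep&lt;/code&gt; as a portable stand-in for &lt;code&gt;rg&lt;/code&gt; and an illustrative &lt;code&gt;build.log&lt;/code&gt; filename:&lt;/p&gt;

```shell
# Create a stand-in log file (illustrative; yours comes from a real build).
printf 'info: compiling\nerror: undefined symbol foo\ninfo: done\n' > build.log

# Paste only the error lines, not the whole log.
grep -i 'error' build.log

# Or peek at just the head or tail of very long output.
head -n 5 build.log
tail -n 5 build.log
```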
&lt;h2 id=&#34;05-fourth-habit-understand-caching-but-do-not-worship-it&#34;&gt;05 Fourth Habit: Understand Caching, but Do Not Worship It
&lt;/h2&gt;&lt;p&gt;Anthropic&amp;rsquo;s Prompt Caching can cache repeated prompt prefixes. The default cache lifetime is 5 minutes, and a 1-hour cache is also supported. When cache hits, large repeated context does not need to be fully reprocessed, which helps reduce cost and improve rate limit utilization.&lt;/p&gt;
&lt;p&gt;But caching has limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Content must match exactly, including text and images.&lt;/li&gt;
&lt;li&gt;The default cache is short-lived.&lt;/li&gt;
&lt;li&gt;Changing models, tools, system prompts, or context structure may reduce cache hits.&lt;/li&gt;
&lt;li&gt;Output tokens do not disappear because of caching; the response still needs to be generated.&lt;/li&gt;
&lt;li&gt;How Claude Code uses caching is a product-level implementation detail, so do not treat it as permanent &amp;ldquo;free memory.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In practice, the important part is not studying every caching detail. It is keeping the session stable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Avoid frequent model switching within the same phase.&lt;/li&gt;
&lt;li&gt;Do not repeatedly rewrite large rule blocks mid-task.&lt;/li&gt;
&lt;li&gt;Do not keep adding new images inside the same task.&lt;/li&gt;
&lt;li&gt;Do not leave a task idle for a long stretch and then return with another huge request.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/compact&lt;/code&gt; at phase boundaries.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This makes repeated context easier to reuse and reduces later request weight.&lt;/p&gt;
&lt;h2 id=&#34;06-about-peak-hours-avoid-them-when-you-can-but-do-not-treat-them-as-a-formula&#34;&gt;06 About Peak Hours: Avoid Them When You Can, but Do Not Treat Them as a Formula
&lt;/h2&gt;&lt;p&gt;People often say certain hours feel tighter. Anthropic&amp;rsquo;s help center is more careful: message counts can be affected by current Claude capacity, conversation length, attachments, model, and features. In other words, peak capacity can affect the experience, but do not treat a specific local time window as a permanent rule.&lt;/p&gt;
&lt;p&gt;Practical suggestions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Put large refactors and heavy analysis in periods when both your network and the service are stable.&lt;/li&gt;
&lt;li&gt;Do not start a huge task right before you plan to step away.&lt;/li&gt;
&lt;li&gt;If you expect to leave for a long time, run &lt;code&gt;/compact&lt;/code&gt; or &lt;code&gt;/clear&lt;/code&gt; first.&lt;/li&gt;
&lt;li&gt;For small edits, do not use Opus with a long context unless you really need it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is more reliable than memorizing a fixed &amp;ldquo;do not use it from X to Y&amp;rdquo; rule.&lt;/p&gt;
&lt;h2 id=&#34;07-slim-down-claudemd-rules-mcp-and-skills&#34;&gt;07 Slim Down CLAUDE.md, rules, MCP, and skills
&lt;/h2&gt;&lt;p&gt;Claude Code loads project rules, tool information, and some environment context into the session. The official docs also recommend separating general rules from specialized rules so every session does not start with a large amount of unrelated text.&lt;/p&gt;
&lt;p&gt;A useful split:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt;: only global rules that always apply.&lt;/li&gt;
&lt;li&gt;rules: path-specific or file-type-specific rules.&lt;/li&gt;
&lt;li&gt;skills: specific workflows, such as publishing posts, deployment, image generation, or committing code.&lt;/li&gt;
&lt;li&gt;MCP: only enable servers that the current task actually needs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If &lt;code&gt;CLAUDE.md&lt;/code&gt; is hundreds or thousands of lines long, every session carries that cost. A better pattern is to move occasional workflows into skills and load them only when needed.&lt;/p&gt;
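&lt;p&gt;As a concrete picture, the split might be laid out like this. This is a hypothetical structure; the exact file and directory names depend on your Claude Code version and configuration:&lt;/p&gt;

```text
CLAUDE.md              short, always-loaded global rules
.claude/
  rules/               path- or filetype-scoped rules
  skills/
    publish-post/      workflow loaded only when invoked
    deploy/
```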
&lt;p&gt;MCP is similar. More tools do not automatically mean more efficiency. The Claude Code docs mention using &lt;code&gt;/mcp&lt;/code&gt; to view and disable unnecessary servers, and &lt;code&gt;/context&lt;/code&gt; to see what is consuming context space.&lt;/p&gt;
&lt;h2 id=&#34;08-practical-command-list&#34;&gt;08 Practical Command List
&lt;/h2&gt;&lt;p&gt;These are the most useful daily commands:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/model
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Switch models. Sonnet is a good default; use Opus for complex reasoning.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/clear
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Clear the current context. Use it when switching to unrelated work.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/compact
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Compress conversation history. Use it when a phase is done but the same task continues.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/context
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Inspect context usage and find what is taking space.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/status
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Check subscription or usage-related status. Anthropic&amp;rsquo;s help center also recommends monitoring remaining allocation.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;div class=&#34;chroma&#34;&gt;
&lt;table class=&#34;lntable&#34;&gt;&lt;tr&gt;&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code&gt;&lt;span class=&#34;lnt&#34;&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class=&#34;lntd&#34;&gt;
&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;/mcp
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;View and manage MCP servers, and disable tools not needed for the current task.&lt;/p&gt;
&lt;p&gt;If you use API billing, &lt;code&gt;/cost&lt;/code&gt; can be useful. But for Pro/Max subscriptions, the Claude Code docs explain that the dollar estimate from &lt;code&gt;/cost&lt;/code&gt; is not the right billing reference; subscribers should rely more on usage information such as &lt;code&gt;/stats&lt;/code&gt; and &lt;code&gt;/status&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;09-a-quota-saving-workflow&#34;&gt;09 A Quota-Saving Workflow
&lt;/h2&gt;&lt;p&gt;A practical workflow looks like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;/clear&lt;/code&gt; before starting a new task.&lt;/li&gt;
&lt;li&gt;Use Sonnet by default.&lt;/li&gt;
&lt;li&gt;Let Claude inspect project structure and key files first, not the whole repository.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;/compact&lt;/code&gt; after each small phase.&lt;/li&gt;
&lt;li&gt;Switch to Opus only for hard blockers.&lt;/li&gt;
&lt;li&gt;Filter logs, errors, and test output before pasting them.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;/clear&lt;/code&gt; after the task is done; do not start new work with stale context.&lt;/li&gt;
&lt;li&gt;Periodically review &lt;code&gt;CLAUDE.md&lt;/code&gt;, MCP, and skills to shrink always-on context.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The core idea is simple: let Claude see only what it truly needs for the current task.&lt;/p&gt;
&lt;h2 id=&#34;10-summary&#34;&gt;10 Summary
&lt;/h2&gt;&lt;p&gt;Claude Code usage running out quickly is usually not caused by one thing. It is often a combination of high-cost models, long uncleared conversations, too many files and logs, heavy MCP and rule context, weaker cache reuse, and peak capacity fluctuations.&lt;/p&gt;
&lt;p&gt;The practical fixes are also simple:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use Sonnet for daily work.&lt;/li&gt;
&lt;li&gt;Save Opus for truly complex problems.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/compact&lt;/code&gt; when a phase is done.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/clear&lt;/code&gt; when switching tasks.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;/context&lt;/code&gt; to find context bloat.&lt;/li&gt;
&lt;li&gt;Slim down &lt;code&gt;CLAUDE.md&lt;/code&gt;, rules, MCP, and skills.&lt;/li&gt;
&lt;li&gt;Do not dump the whole repository, full logs, or large image batches into context.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;How much work the same Pro or Max plan can support depends heavily on how you manage context. Make the context smaller and task boundaries clearer, and Claude Code will feel much steadier.&lt;/p&gt;
&lt;h2 id=&#34;references&#34;&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Claude Help Center: Using Claude Code with your Pro or Max plan: &lt;a class=&#34;link&#34; href=&#34;https://support.claude.com/en/articles/11145838-using-claude-code-with-your-pro-or-max-plan&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://support.claude.com/en/articles/11145838-using-claude-code-with-your-pro-or-max-plan&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Claude Help Center: About Claude&amp;rsquo;s Max Plan Usage: &lt;a class=&#34;link&#34; href=&#34;https://support.anthropic.com/en/articles/11014257-about-claude-s-max-plan-usage/&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://support.anthropic.com/en/articles/11014257-about-claude-s-max-plan-usage/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Claude Code Docs: Manage costs effectively: &lt;a class=&#34;link&#34; href=&#34;https://code.claude.com/docs/en/costs&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://code.claude.com/docs/en/costs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Anthropic Docs: Prompt caching: &lt;a class=&#34;link&#34; href=&#34;https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching&#34;  target=&#34;_blank&#34; rel=&#34;noopener&#34;
    &gt;https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        </item>
        
    </channel>
</rss>
