When people first look at Codex usage limits, it is easy to assume that the 5-hour limit is a short-term allowance, and that the weekly limit only starts decreasing after the 5-hour quota is used up.
That is not how it works. Codex is better understood as checking multiple limit windows at the same time: a short window prevents burst usage, while the weekly window controls total usage over the week. A Codex request usually counts against both.
So it is usually normal to see the weekly quota drop while the 5-hour quota still has plenty of room.
01 The Short Version
You can understand Codex usage with three rules:
- The 5-hour limit and the weekly limit apply at the same time.
- If the weekly limit is exhausted, you usually cannot continue using the same subscription quota pool even if the 5-hour quota still has room.
- Codex is not priced by simple message count. Usage depends on the model, tokens, task complexity, context size, and execution location.
Put simply: a request goes through only if both windows have quota remaining, and its usage is then counted against both.
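The rules above can be sketched as follows. This is a hypothetical illustration of the dual-window rule, not OpenAI's actual implementation; the function and parameter names are invented for clarity:

```python
# Hypothetical sketch: a request is allowed only when BOTH windows
# still have quota remaining. Not OpenAI's actual code.
def can_send(five_hour_left: float, weekly_left: float) -> bool:
    return five_hour_left > 0 and weekly_left > 0

print(can_send(five_hour_left=80.0, weekly_left=0.0))   # weekly exhausted -> False
print(can_send(five_hour_left=80.0, weekly_left=35.0))  # both have room -> True
```

Note that a 5-hour quota of 80% does not help in the first call: once the weekly window hits zero, requests stop.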
When the 5-hour window resets, only the 5-hour quota is restored. It does not restore weekly quota. Weekly quota resets on its own schedule, or you may be able to buy extra credits on supported plans.
02 Why Both Windows Decrease
Think of Codex limits as two gates:
| Window | Purpose |
|---|---|
| 5-hour window | Prevents high-frequency burst usage |
| Weekly window | Controls total weekly usage |
Each Codex task creates real usage. That usage is reflected in the relevant rate limit windows.
It is not a sequence where the 5-hour quota is spent first and the weekly quota only starts draining afterward. It is closer to a single request being recorded in both windows at the same time.
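A minimal sketch of that accounting, assuming (hypothetically) that each request's cost is simply added to every active window:

```python
# Hypothetical accounting: one request's cost is recorded in every
# active window, not drawn from one window before the other.
windows = {"5-hour": 0.0, "weekly": 0.0}

def record_usage(cost: float) -> None:
    for name in windows:      # the same cost hits both windows
        windows[name] += cost

record_usage(3.5)
record_usage(1.0)
print(windows)  # both windows show 4.5 used
```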
That is why weekly usage can drop even when the 5-hour quota is not exhausted.
03 Look at Token-Based Credits
OpenAI does not publish a formula that lets users fully reproduce the exact Codex charge. What is public is the rate card, the main factors, and per-model credit pricing.
As of 2026-04-15, the main Codex rate card model is token-based credits. Usage is estimated from input tokens, cached input tokens, and output tokens.
Example official rates:
| Model | Input / 1M tokens | Cached input / 1M tokens | Output / 1M tokens |
|---|---|---|---|
| GPT-5.4 | 62.50 credits | 6.250 credits | 375 credits |
| GPT-5.4-Mini | 18.75 credits | 1.875 credits | 113 credits |
| GPT-5.3-Codex | 43.75 credits | 4.375 credits | 350 credits |
| GPT-5.2-Codex | 43.75 credits | 4.375 credits | 350 credits |
| GPT-5.1-Codex-Max | 31.25 credits | 3.125 credits | 250 credits |
| GPT-5.1-Codex-mini | 6.25 credits | 0.625 credits | 50 credits |
A rough estimate multiplies each token count by its per-million rate and sums the results: credits ≈ (input tokens / 1M) × input rate + (cached input tokens / 1M) × cached rate + (output tokens / 1M) × output rate.
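That estimate can be written out directly. The rates below are copied from the table above; the function itself is a rough sketch, not an exact billing formula:

```python
# Rates from the article's table (credits per 1M tokens); two models shown.
RATES = {
    "gpt-5.3-codex":      {"input": 43.75, "cached_input": 4.375, "output": 350.0},
    "gpt-5.1-codex-mini": {"input": 6.25,  "cached_input": 0.625, "output": 50.0},
}

def estimate_credits(model, input_tokens, cached_input_tokens, output_tokens,
                     fast_mode=False):
    """Rough estimate: tokens / 1M * per-million rate, summed.
    Fast mode doubles the total, per the rate card."""
    r = RATES[model]
    credits = (
        input_tokens / 1_000_000 * r["input"]
        + cached_input_tokens / 1_000_000 * r["cached_input"]
        + output_tokens / 1_000_000 * r["output"]
    )
    return credits * (2 if fast_mode else 1)

# 100k fresh input + 50k cached input + 20k output on GPT-5.3-Codex:
print(round(estimate_credits("gpt-5.3-codex", 100_000, 50_000, 20_000), 3))
# prints 11.594 -- note that output tokens dominate the cost
```

The breakdown makes the trend visible: 20k output tokens cost 7.0 credits here, more than the 4.4 credits for 100k fresh input tokens.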
This is not an exact billing formula, but it explains the trend: output is expensive, long context is expensive, and stronger models cost more. The official rate card also says Fast mode uses 2x credits, and Code review uses GPT-5.3-Codex pricing.
04 Do Not Only Count Messages
Ten Codex messages can consume very different amounts of quota.
Light tasks are usually cheaper:
- Editing one small function
- Explaining a short code snippet
- Writing a short paragraph
- Making a local change in a clearly specified file
Heavy tasks cost more:
- Scanning a large codebase
- Running a long agent session
- Repeated read, edit, test, and fix loops
- Generating lots of code or a long report
- Using cloud tasks
- Enabling fast mode
So message count is only a rough proxy. It does not tell you the real usage.
05 Local Tasks vs Cloud Tasks
Execution location can make a big difference.
A local task works in your local workspace: reading files, editing code, and running commands. A cloud task is delegated to a hosted cloud environment, which is better for longer and more automated workflows.
Cloud tasks are often more expensive because they involve:
- A hosted execution environment
- Longer tasks
- More tool calls
- Larger context
- A more complete automation loop
For normal code edits, article cleanup, or small fixes, local tasks are usually cheaper. Use cloud tasks when the job truly needs hosted execution.
06 Why Weekly Usage Drops Fast
If your 5-hour quota barely moves but weekly usage drops a lot, common causes include:
- You used cloud tasks.
- You used a more expensive model.
- You enabled fast mode.
- The context was large, with many files or a long conversation.
- The output was long, such as lots of code, a long report, or log analysis.
- The task chain was long: search, edit, test, fix, test again.
- Your quota script mislabeled the limit windows.
If you read fields from something like /backend-api/wham/usage, do not trust only processed labels such as five_hour% or weekly%. Check the raw JSON fields:
- limit_window_seconds
- percent_left
- reset_at
- bucket / feature name
Typical windows: 18,000 seconds for the 5-hour limit and 604,800 seconds for the weekly limit.
If your script labels the windows backwards, the quota display will be misleading.
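Labeling by the raw window length avoids that mistake. The sketch below assumes a hypothetical response shape built from the fields named above; the real /backend-api/wham/usage payload may be structured differently:

```python
import json

# Hypothetical payload using the fields named in the article; the real
# /backend-api/wham/usage response may differ in shape.
raw = json.loads("""
{"rate_limits": [
  {"limit_window_seconds": 18000,  "percent_left": 82.0, "reset_at": "2026-04-15T12:00:00Z"},
  {"limit_window_seconds": 604800, "percent_left": 35.5, "reset_at": "2026-04-20T00:00:00Z"}
]}
""")

def label(window_seconds: int) -> str:
    # Label by the raw window length, never by list order.
    if window_seconds == 5 * 3600:
        return "5-hour"
    if window_seconds == 7 * 24 * 3600:
        return "weekly"
    return f"{window_seconds}s"

for w in raw["rate_limits"]:
    print(label(w["limit_window_seconds"]),
          f'{w["percent_left"]}% left, resets {w["reset_at"]}')
```

If the entries ever arrive in a different order, this still prints the right labels, which is exactly what a position-based script gets wrong.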
07 How to Save Quota
To make weekly quota last longer:
- Split large jobs into smaller tasks.
- Prefer local tasks when possible.
- Tell Codex the relevant paths to reduce unnecessary scanning.
- Avoid dumping huge logs, long files, or unrelated context.
- Use cheaper mini models for light work.
- Ask for a plan before starting a long task.
- Ask for concise answers when you do not need a long report.
A useful mental model: every task spends tokens, token cost scales with the model, the context, and the output, and that cost is charged to the 5-hour and weekly windows at the same time.
This model is not exact billing math, but it explains most Codex usage-limit behavior.