When people first look at Codex usage limits, it is easy to assume that the 5-hour limit is a short-term allowance, and that the weekly limit only starts decreasing after the 5-hour quota is used up.
That is not how it works. Codex is better understood as checking multiple limit windows at the same time: a short window prevents burst usage, while the weekly window controls total usage over the week. A Codex request usually counts against both.
So it is usually normal to see the weekly quota drop while the 5-hour quota still has plenty of room.
01 The Short Version
You can understand Codex usage with three rules:
- The 5-hour limit and the weekly limit apply at the same time.
- If the weekly limit is exhausted, you usually cannot continue using the same subscription quota pool even if the 5-hour quota still has room.
- Codex is not priced by simple message count. Usage depends on the model, tokens, task complexity, context size, and execution location.
Put simply: a request goes through only if both windows have quota remaining, and its usage is then counted against both.
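The rules above can be sketched as follows. This is a hypothetical illustration of the dual-window rule, not OpenAI's actual implementation; the function and parameter names are invented for clarity:

```python
# Hypothetical sketch: a request is allowed only when BOTH windows
# still have quota remaining. Not OpenAI's actual code.
def can_send(five_hour_left: float, weekly_left: float) -> bool:
    return five_hour_left > 0 and weekly_left > 0

print(can_send(five_hour_left=80.0, weekly_left=0.0))   # weekly exhausted -> False
print(can_send(five_hour_left=80.0, weekly_left=35.0))  # both have room -> True
```

Note that a 5-hour quota of 80% does not help in the first call: once the weekly window hits zero, requests stop.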
When the 5-hour window resets, only the 5-hour quota is restored. It does not restore weekly quota. Weekly quota resets on its own schedule, or you may be able to buy extra credits on supported plans.
02 Why Both Windows Decrease
Think of Codex limits as two gates:
| Window | Purpose |
|---|---|
| 5-hour window | Prevents high-frequency burst usage |
| Weekly window | Controls total weekly usage |
Each Codex task creates real usage. That usage is reflected in the relevant rate limit windows.
It is not a sequence where the 5-hour quota is spent first and the weekly quota only starts draining afterward. It is closer to a single request being recorded in both windows at the same time.
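A minimal sketch of that accounting, assuming (hypothetically) that each request's cost is simply added to every active window:

```python
# Hypothetical accounting: one request's cost is recorded in every
# active window, not drawn from one window before the other.
windows = {"5-hour": 0.0, "weekly": 0.0}

def record_usage(cost: float) -> None:
    for name in windows:      # the same cost hits both windows
        windows[name] += cost

record_usage(3.5)
record_usage(1.0)
print(windows)  # both windows show 4.5 used
```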
That is why weekly usage can drop even when the 5-hour quota is not exhausted.
03 Look at Token-Based Credits
OpenAI does not publish a formula that lets users fully reproduce the exact Codex charge. What is public is the rate card, the main factors, and per-model credit pricing.
As of 2026-04-15, the main Codex rate card model is token-based credits. Usage is estimated from input tokens, cached input tokens, and output tokens.
Example official rates:
| Model | Input / 1M tokens | Cached input / 1M tokens | Output / 1M tokens |
|---|---|---|---|
| GPT-5.4 | 62.50 credits | 6.250 credits | 375 credits |
| GPT-5.4-Mini | 18.75 credits | 1.875 credits | 113 credits |
| GPT-5.3-Codex | 43.75 credits | 4.375 credits | 350 credits |
| GPT-5.2-Codex | 43.75 credits | 4.375 credits | 350 credits |
| GPT-5.1-Codex-Max | 31.25 credits | 3.125 credits | 250 credits |
| GPT-5.1-Codex-mini | 6.25 credits | 0.625 credits | 50 credits |
A rough estimate multiplies each token count by its per-million rate and sums the results: credits ≈ (input tokens / 1M) × input rate + (cached input tokens / 1M) × cached rate + (output tokens / 1M) × output rate.
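That estimate can be written out directly. The rates below are copied from the table above; the function itself is a rough sketch, not an exact billing formula:

```python
# Rates from the article's table (credits per 1M tokens); two models shown.
RATES = {
    "gpt-5.3-codex":      {"input": 43.75, "cached_input": 4.375, "output": 350.0},
    "gpt-5.1-codex-mini": {"input": 6.25,  "cached_input": 0.625, "output": 50.0},
}

def estimate_credits(model, input_tokens, cached_input_tokens, output_tokens,
                     fast_mode=False):
    """Rough estimate: tokens / 1M * per-million rate, summed.
    Fast mode doubles the total, per the rate card."""
    r = RATES[model]
    credits = (
        input_tokens / 1_000_000 * r["input"]
        + cached_input_tokens / 1_000_000 * r["cached_input"]
        + output_tokens / 1_000_000 * r["output"]
    )
    return credits * (2 if fast_mode else 1)

# 100k fresh input + 50k cached input + 20k output on GPT-5.3-Codex:
print(round(estimate_credits("gpt-5.3-codex", 100_000, 50_000, 20_000), 3))
# prints 11.594 -- note that output tokens dominate the cost
```

The breakdown makes the trend visible: 20k output tokens cost 7.0 credits here, more than the 4.4 credits for 100k fresh input tokens.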
This is not an exact billing formula, but it explains the trend: output is expensive, long context is expensive, and stronger models cost more. The official rate card also says Fast mode uses 2x credits, and Code review uses GPT-5.3-Codex pricing.
04 Do Not Only Count Messages
Ten Codex messages can consume very different amounts of quota.
Light tasks are usually cheaper:
- Editing one small function
- Explaining a short code snippet
- Writing a short paragraph
- Making a local change in a clearly specified file
Heavy tasks cost more:
- Scanning a large codebase
- Running a long agent session
- Repeated read, edit, test, and fix loops
- Generating lots of code or a long report
- Using cloud tasks
- Enabling fast mode
So message count is only a rough proxy. It does not tell you the real usage.
05 Local Tasks vs Cloud Tasks
Execution location can make a big difference.
A local task works in your local workspace: reading files, editing code, and running commands. A cloud task is delegated to a hosted cloud environment, which is better for longer and more automated workflows.
Cloud tasks are often more expensive because they involve:
- A hosted execution environment
- Longer tasks
- More tool calls
- Larger context
- A more complete automation loop
For normal code edits, article cleanup, or small fixes, local tasks are usually cheaper. Use cloud tasks when the job truly needs hosted execution.
06 Why Weekly Usage Drops Fast
If your 5-hour quota barely moves but weekly usage drops a lot, common causes include:
- You used cloud tasks.
- You used a more expensive model.
- You enabled fast mode.
- The context was large, with many files or a long conversation.
- The output was long, such as lots of code, a long report, or log analysis.
- The task chain was long: search, edit, test, fix, test again.
- Your quota script mislabeled the limit windows.
If you read fields from something like /backend-api/wham/usage, do not trust only processed labels such as five_hour% or weekly%. Check the raw JSON fields:
- limit_window_seconds
- percent_left
- reset_at
- bucket / feature name
Typical windows: 18,000 seconds for the 5-hour limit and 604,800 seconds for the weekly limit.
If your script labels the windows backwards, the quota display will be misleading.
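Labeling by the raw window length avoids that mistake. The sketch below assumes a hypothetical response shape built from the fields named above; the real /backend-api/wham/usage payload may be structured differently:

```python
import json

# Hypothetical payload using the fields named in the article; the real
# /backend-api/wham/usage response may differ in shape.
raw = json.loads("""
{"rate_limits": [
  {"limit_window_seconds": 18000,  "percent_left": 82.0, "reset_at": "2026-04-15T12:00:00Z"},
  {"limit_window_seconds": 604800, "percent_left": 35.5, "reset_at": "2026-04-20T00:00:00Z"}
]}
""")

def label(window_seconds: int) -> str:
    # Label by the raw window length, never by list order.
    if window_seconds == 5 * 3600:
        return "5-hour"
    if window_seconds == 7 * 24 * 3600:
        return "weekly"
    return f"{window_seconds}s"

for w in raw["rate_limits"]:
    print(label(w["limit_window_seconds"]),
          f'{w["percent_left"]}% left, resets {w["reset_at"]}')
```

If the entries ever arrive in a different order, this still prints the right labels, which is exactly what a position-based script gets wrong.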
07 How to Save Quota
To make weekly quota last longer:
- Split large jobs into smaller tasks.
- Prefer local tasks when possible.
- Tell Codex the relevant paths to reduce unnecessary scanning.
- Avoid dumping huge logs, long files, or unrelated context.
- Use cheaper mini models for light work.
- Ask for a plan before starting a long task.
- Ask for concise answers when you do not need a long report.
A useful mental model: every task spends tokens, token cost scales with the model, the context, and the output, and that cost is charged to the 5-hour and weekly windows at the same time.
This model is not exact billing math, but it explains most Codex usage-limit behavior.