If your website logs suddenly show a large number of 404 and 400 responses, the cause is often not normal users clicking broken links. It is usually an automated scanner probing paths such as .env, .git, wp-admin, phpmyadmin, and xmlrpc.php.
These requests create several problems:
- access log grows quickly
- error log fills with useless noise
- static sites or reverse proxy services waste connections on invalid requests
- real issues get buried under scan noise
Nginx can use limit_req and limit_conn to control this. But first, one point matters: Nginx cannot natively rate-limit directly by “response status code is 404 or 400”, because rate limiting happens before the response is generated.
The practical approach is to rate-limit scan paths, suspicious sources, and high-frequency site-wide requests before they produce 404 / 400 responses.
Basic idea
Use three layers:
- Apply gentle site-wide rate limiting to prevent one IP from hammering the site.
- Apply strict rate limiting to common scan paths and return 404 directly.
- Limit concurrent connections per IP.
A safer rollout order is: first add the scan path rule and access_log off, observe for one day, and only add site-wide limit_req if there are still many random-path 404 requests.
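To judge whether random-path 404s are still frequent during that observation day, you can count 404 responses per requested path. This is a minimal sketch, assuming the default "combined" access log format (request path in field 7, status code in field 9) and the stock log location /var/log/nginx/access.log; adjust both if your setup differs:

```shell
# Top requested paths that returned 404
# (combined log format: field 7 = path, field 9 = status code)
awk '$9 == 404 { count[$7]++ } END { for (p in count) print count[p], p }' \
    /var/log/nginx/access.log | sort -rn | head
```

If the top entries are scan targets such as /.env or /wp-admin, extend the scan-path location instead of tightening the site-wide limit.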
Define rate-limit zones in http first
limit_req_zone and limit_conn_zone must be placed inside http {}. They cannot be placed inside a single site’s server {}.
You can add them directly to the http {} block in /etc/nginx/nginx.conf:
```nginx
http {
    # ... existing settings ...

    limit_req_zone  $binary_remote_addr zone=perip_general:20m rate=5r/s;
    limit_req_zone  $binary_remote_addr zone=perip_scan:20m rate=1r/s;
    limit_conn_zone $binary_remote_addr zone=addr_conn:20m;
}
```
You can also create a new file under /etc/nginx/conf.d/ (the name is arbitrary; ratelimit.conf here is just an example):

```bash
sudo nano /etc/nginx/conf.d/ratelimit.conf
```
Write:
```nginx
limit_req_zone  $binary_remote_addr zone=perip_general:20m rate=5r/s;
limit_req_zone  $binary_remote_addr zone=perip_scan:20m rate=1r/s;
limit_conn_zone $binary_remote_addr zone=addr_conn:20m;
```
This assumes your nginx.conf really includes this inside http {}:
```nginx
include /etc/nginx/conf.d/*.conf;
```
Then use the zones in server
A site config file is usually something like /etc/nginx/sites-enabled/www.example.com, and it usually contains server {}. Do not write limit_req_zone there. Only use zones already defined earlier.
Example:
```nginx
server {
    listen 80;
    server_name www.example.com;

    # Gentle site-wide rate limit per IP
    limit_req zone=perip_general burst=30 nodelay;

    # Per-IP concurrent connection limit
    limit_conn addr_conn 20;

    # Strict limit on common scan paths; answer 404 without logging
    location ~* ^/(\.env|\.git|wp-admin|phpmyadmin|xmlrpc\.php) {
        limit_req zone=perip_scan burst=5 nodelay;
        access_log off;
        return 404;
    }

    location / {
        # normal site configuration
    }
}
```
If you are worried about hurting normal traffic with site-wide rate limiting, start with only the scan path rule:
```nginx
location ~* ^/(\.env|\.git|wp-admin|phpmyadmin|xmlrpc\.php) {
    limit_req zone=perip_scan burst=5 nodelay;
    access_log off;
    return 404;
}
```
What these parameters mean
This line:
```nginx
limit_req_zone $binary_remote_addr zone=perip_general:20m rate=5r/s;
```
means:
- `limit_req_zone`: defines the accounting zone for request rate limiting.
- `$binary_remote_addr`: uses the client IP as the rate-limit key, and uses less memory than `$remote_addr`.
- `zone=perip_general:20m`: creates a shared memory zone named `perip_general` with size `20m`.
- `rate=5r/s`: each IP is allowed an average of 5 requests per second.
This line:
```nginx
limit_req_zone $binary_remote_addr zone=perip_scan:20m rate=1r/s;
```
is similar, but stricter:
- `perip_scan`: used specifically for suspicious scan paths.
- `rate=1r/s`: each IP is allowed only 1 request per second.
This line:
```nginx
limit_conn_zone $binary_remote_addr zone=addr_conn:20m;
```
means:
- `limit_conn_zone`: defines the accounting zone for concurrent connection limits.
- `$binary_remote_addr`: still counts by client IP.
- `zone=addr_conn:20m`: creates a connection-counting shared memory zone named `addr_conn`.
The actual concurrent connection limit is:
```nginx
limit_conn addr_conn 20;
```
It means each IP can have at most 20 simultaneous connections.
Understanding burst and nodelay
For example:
```nginx
limit_req zone=perip_general burst=30 nodelay;
```
You can read it like this:
- `rate=5r/s`: the long-term average rate is 5 requests per second.
- `burst=30`: allow 30 extra requests during a short burst.
- `nodelay`: when requests exceed the average rate but are still within `burst`, process them immediately instead of queueing; reject only after `burst` is exceeded.
Without nodelay, Nginx tries to delay and queue some requests. For ordinary web pages, nodelay is usually easier to reason about. For APIs or especially sensitive endpoints, adjust according to actual behavior.
Common error: limit_req_zone in the wrong place
If you see an error like this when testing the configuration:

```
nginx: [emerg] "limit_req_zone" directive is not allowed here in /etc/nginx/sites-enabled/www.example.com:3
```
It means limit_req_zone was written in a context where it is not allowed.
A common incorrect configuration is placing it inside server {}:

```nginx
# Wrong: limit_req_zone is not allowed inside server {}
server {
    listen 80;
    limit_req_zone $binary_remote_addr zone=perip_general:20m rate=5r/s;
    # ...
}
```
This will not work.
One-line memory aid:
- `limit_req_zone` defines the pool, so put it in `http {}`.
- `limit_req` uses the pool, so put it in `server {}` or `location {}`.
- `limit_conn_zone` defines the connection pool, so put it in `http {}`.
- `limit_conn` uses the connection pool, so put it in `server {}` or `location {}`.
Temporarily block clearly abnormal IPs
If the logs confirm that several IPs are continuously sending scan requests, you can temporarily block them:
```nginx
# Example addresses only; substitute the IPs you actually see in your logs
deny 203.0.113.10;
deny 203.0.113.11;
```
These deny directives can be placed inside server {} or a specific location {}. Whether to keep them long term depends on false-positive risk and traffic sources.
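Before writing deny rules, it helps to confirm which IPs are actually responsible. A minimal sketch, again assuming the default "combined" log format (client IP in field 1, status code in field 9) and the stock log location:

```shell
# Top client IPs ranked by number of 404 responses
# (combined log format: field 1 = client IP, field 9 = status code)
awk '$9 == 404 { count[$1]++ } END { for (ip in count) print count[ip], ip }' \
    /var/log/nginx/access.log | sort -rn | head
```

Only IPs that dominate this list over a sustained period are good candidates for a temporary deny.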
Check and reload
Check the configuration first:
```bash
sudo nginx -t
```
If there is no error, reload:
```bash
sudo systemctl reload nginx
```
Do not restart the service directly. reload lets Nginx load the new configuration gracefully, which is safer.
Recommended parameters
For a personal site or static site, start with:
- normal pages: `rate=5r/s` to `10r/s`
- scan paths: `rate=1r/s`
- scan path `burst=5`
- site-wide `burst=30`
- per-IP concurrency: `10` to `20`
If normal traffic is very low, the settings can be stricter. If the site has many images, scripts, or API requests, loosen the normal page limit to avoid hurting real users.
The safest approach is a staged rollout:
- First apply `access_log off` + `return 404` to scan paths.
- Then add strict `perip_scan` rate limiting.
- Observe logs for one day.
- If random-path 404s are still heavy, enable gentle site-wide rate limiting.