How to read Nginx access logs
When something breaks, the access log tells a story. Who hit the site. What they asked for. Which URLs failed. When a spike started. I used these steps on a real project and they worked. The examples below are safe to copy and run, then tweak for your setup.
What are Nginx access logs?
Nginx access logs are text files that record every HTTP request your web server receives. Each line represents one request and contains details like the client's IP address, timestamp, HTTP method, URL path, status code, response size, and more.
These logs are automatically generated by Nginx as it processes requests. Every time someone visits your site, clicks a link, submits a form, or even hits a broken URL, Nginx writes a line to the access log. This happens in real time, so the log file is constantly growing.
How access logs work
Nginx uses a configurable log format defined in your nginx.conf file. The default format looks like this:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent"';
This creates log entries like:
192.168.1.100 - - [20/Aug/2025:10:30:45 +0000] "GET /blog/post HTTP/1.1" 200 1234 "https://example.com/" "Mozilla/5.0..."
Many modern setups use a more detailed structured format with key=value pairs (in the style of logfmt), which is what the examples in this post assume.
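Such a format might be defined along these lines. This is an illustrative sketch, not a standard: the field names (status, request, x_forwarded_for, and so on) are chosen to match the examples below, and the log file name is a placeholder. Check your own nginx.conf for the real definition.

```nginx
log_format kv 'time_local="$time_local" remote_addr="$remote_addr" '
              'x_forwarded_for="$http_x_forwarded_for" '
              'request="$request" status="$status" '
              'body_bytes_sent="$body_bytes_sent" '
              'request_length="$request_length"';

access_log /var/log/nginx/ssl-example.com.access.log kv;
```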
CDN vs. Origin: Where to look for logs
If you're using a CDN (like Cloudflare, CloudFront, or Fastly), there are two different sets of logs to consider:
CDN edge logs: Show requests from actual users hitting the CDN. These contain the real client IPs, user agents, and geographic locations.
Origin server logs (what this post covers): Show requests that made it through the CDN to your Nginx server. These logs often show the CDN's IP address as the client, not the end user's IP.
For troubleshooting, you usually want the CDN logs first. Only check origin logs when you need to see what actually reached your server or when debugging server-side issues.
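One quick sanity check: count distinct remote_addr values in your origin logs. If a handful of addresses dominate, they are likely your CDN's edge IPs, and the real client lives in x_forwarded_for. Here is a runnable sketch against a fake log; the IPs and field names are made up to match the key=value format this post assumes:

```shell
# Two requests from different users, both arriving via the same CDN edge IP
printf 'remote_addr="203.0.113.7" x_forwarded_for="198.51.100.1"\nremote_addr="203.0.113.7" x_forwarded_for="198.51.100.2"\n' > /tmp/origin.log

# Count by remote_addr: the single CDN edge IP shows up twice
awk -F'remote_addr="' '{print $2}' /tmp/origin.log | cut -d'"' -f1 | sort | uniq -c
# → 2 203.0.113.7
```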
All examples assume key=value logs (fields like status="404" and request="GET /path HTTP/1.1"), and files under /var/log/nginx/. Adjust names as needed.
What you will learn
- How to read live and gzipped logs in one go
- How to find top IPs and top failing URLs
- How to pull everything from one IP
- How to zoom in on a short time window
- How to tie access logs to WAF blocks
- How to keep results fast and useful
Quick start
Read everything, whether the file is a plain .log or a rotated .gz:
zgrep -h . /var/log/nginx/ssl-*.access.log*
That zgrep trick handles both normal and rotated files. Add filters after it.
Tools you'll use
zgrep: Like grep but works on both regular files and compressed (.gz) files. Perfect for log files that get rotated and compressed.
awk: A powerful text processing tool that can split lines by delimiters and extract specific fields. Great for parsing structured log formats.
cut: Extracts specific columns or fields from text. Use -d to specify a delimiter (like quotes or commas) and -f to pick which field number you want.
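To see how these tools combine, here is a toy pipeline you can run anywhere. It fakes two key=value log lines (the sample data is invented) and counts status codes, which is the same split-then-count pattern used throughout this post:

```shell
# Fake two log lines in the key=value style this post assumes
printf 'status="404" request="GET /a HTTP/1.1"\nstatus="404" request="GET /b HTTP/1.1"\n' > /tmp/sample.log

# awk splits each line at status=", cut keeps the value up to the closing quote
awk -F'status="' '{print $2}' /tmp/sample.log | cut -d'"' -f1 | sort | uniq -c
# → 2 404
```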
Find top IPs behind 404 storms
When editors report "lots of broken links," start here.
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '$2 ~ /^404"/' \
| awk -F'x_forwarded_for="' '{print $2}' \
| cut -d'"' -f1 | cut -d',' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head
Why it helps. You see the worst offenders first. If an IP is scanning random paths, you will spot it fast.
See everything a single IP did
Handy when you need to explain a block or a spike from one source.
IP="112.134.209.112"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| grep -F 'x_forwarded_for="'$IP
Tip. In some stacks the client IP can live in src or src_ip. If so, swap the field name in the grep.
Top URLs hit by that IP (ignore query strings)
IP="112.134.209.112"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| grep -F 'x_forwarded_for="'$IP \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head
Why it helps. Dropping the query string groups "the same page with different params" together, which makes patterns obvious.
Top failing URLs by status
404s:
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '$2 ~ /^404"/' \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head
5xx:
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '{split($2,a,"\""); if (a[1] ~ /^5[0-9][0-9]$/) print}' \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head
Why it helps. This gives you a ranked list of problem paths. Fix the top five and you often fix most of the pain.
Zoom in on a short time window
When a spike hits at a known time, you want before and after.
Simple and quick (good enough for a 10-minute window inside the same hour):
# Example: 09/Aug/2025 23:15–23:25
zgrep -h 'time_local="' /var/log/nginx/ssl-*.access.log* \
| grep -E '09/Aug/2025:23:1[5-9]|09/Aug/2025:23:2[0-5]'
Pipe the result into any of the counts above.
Why it helps. You compare traffic right before and right after an event, which is often all you need to see what changed.
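For example, you can see how 404s trend minute by minute inside the window by counting on the HH:MM portion of time_local. This sketch runs against a fake log so it works anywhere; the timestamps are invented:

```shell
# Three fake 404s: two in minute 23:15, one in 23:16
printf 'time_local="09/Aug/2025:23:15:01 +0000" status="404"\ntime_local="09/Aug/2025:23:15:40 +0000" status="404"\ntime_local="09/Aug/2025:23:16:05 +0000" status="404"\n' > /tmp/window.log

# Keep 404s, pull the timestamp, then count per minute (fields 2,3 are HH:MM)
awk -F'status="' '$2 ~ /^404"/' /tmp/window.log \
  | awk -F'time_local="' '{print $2}' | cut -d'"' -f1 \
  | cut -d: -f2,3 | sort | uniq -c
# → 2 23:15, then 1 23:16
```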
Form posts and large bodies
Large POSTs to a form can trip WAF rules. Measure them.
PATH_RE="/about/.*speaker-request-form"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'request="' -v r="$PATH_RE" '
{ split($2,a,"\""); split(a[1],b," "); m=b[1]; u=b[2];
if (m=="POST" && u ~ r) print }' \
| awk -F'request_length="' '{print $2}' | cut -d'"' -f1 \
| awk '{sum+=$1; c++} END{print "POST count="c, "avg_request_length=" (c?sum/c:0)}'
Why it helps. If average request size is high, your WAF may block inspection. You now have proof and numbers.
Tie access logs to WAF blocks
- Grab the exact time and rule from your WAF event.
- Filter access logs for that window using the "time window" trick.
- Look for the same path, the same IP, or a large request_length.
- If the WAF rule talks about body size or inspection limits, confirm with the POST analysis above.
This closes the loop. You can say what happened and why it happened.
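Putting those steps together, a correlation pass might look like this. Everything here is a placeholder: the IP comes from a hypothetical WAF event, the log is faked so the sketch runs anywhere, and the field names assume the key=value format from above:

```shell
# Fake two origin entries from the WAF's time window; one has a large POST body
printf 'time_local="09/Aug/2025:23:16:02 +0000" x_forwarded_for="203.0.113.50" request="POST /form HTTP/1.1" request_length="185000"\ntime_local="09/Aug/2025:23:17:11 +0000" x_forwarded_for="203.0.113.50" request="GET /form HTTP/1.1" request_length="420"\n' > /tmp/waf.log

IP="203.0.113.50"   # client IP taken from the WAF event (placeholder)

# Filter to that client, then surface the largest request body in the window
grep -F 'x_forwarded_for="'$IP /tmp/waf.log \
  | awk -F'request_length="' '{print $2}' | cut -d'"' -f1 \
  | sort -n | tail -1
# → 185000, the oversized body that likely tripped the WAF's inspection limit
```

Against real logs, swap the fake file for your /var/log/nginx/ glob and add the time-window grep from earlier.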
Speed tips that matter
- Filter early. Add grep 'status="404"' before you sort.
- Use LC_ALL=C sort for faster, deterministic byte-order sorting.
- Drop query strings to group pages by path.
- Sample first. Run head on a pipeline to check you are slicing the right field.
- Document your log_format in the repo. Future you will thank you.
Common pitfalls
- Counting by remote_addr when your app sits behind a proxy. Use x_forwarded_for or your real client IP field.
- Assuming all log files use the same format. Check your nginx.conf log_format before parsing.
- Forgetting that rotated logs (.gz files) need zgrep, not grep.
- Using complex regex patterns that break when log formats change. Keep it simple.
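The zgrep pitfall is easy to demonstrate for yourself with a self-contained example:

```shell
# Simulate logrotate compressing an old log
printf 'status="404" request="GET /old HTTP/1.1"\n' > /tmp/rot.log
gzip -f /tmp/rot.log

# zgrep decompresses transparently; plain grep would only see gzip bytes
zgrep -c 'status="404"' /tmp/rot.log.gz
# → 1
```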
Takeaways
Logs answer real questions fast. Start with a clear question, pipe only what you need, and count. Tie what you see in Nginx to what your WAF reports. You will move from "something felt slow" to "this URL failed 2,379 times from one IP in ten minutes, and here is the fix."