How to read Nginx access logs

2025-08-20 · 12 min read

When something breaks, the access log tells a story. Who hit the site. What they asked for. Which URLs failed. When a spike started. I used these steps on a real project and they worked. The examples below are safe to copy and run, then tweak for your setup.

What are Nginx access logs?

Nginx access logs are text files that record every HTTP request your web server receives. Each line represents one request and contains details like the client's IP address, timestamp, HTTP method, URL path, status code, response size, and more.

These logs are automatically generated by Nginx as it processes requests. Every time someone visits your site, clicks a link, submits a form, or even hits a broken URL, Nginx writes a line to the access log. This happens in real-time, so the log file is constantly growing.

How access logs work

Nginx uses a configurable log format defined in your nginx.conf file. The default format looks like this:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent"';

This creates log entries like:

192.168.1.100 - - [20/Aug/2025:10:30:45 +0000] "GET /blog/post HTTP/1.1" 200 1234 "https://example.com/" "Mozilla/5.0..."

Many modern setups use a more detailed JSON-like format with key=value pairs, which is what the examples in this post assume.
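To make that concrete, here is a hypothetical key=value line and the basic extraction move every pipeline in this post repeats: split on the key name with awk, then keep everything up to the closing quote with cut. The field names and values are illustrative, not from a real log.

```shell
# A hypothetical key=value log line in the format the examples assume
line='remote_addr="10.0.0.5" time_local="20/Aug/2025:10:30:45 +0000" request="GET /blog/post HTTP/1.1" status="404" x_forwarded_for="203.0.113.9"'

# Split on the key name, then keep everything up to the closing quote
echo "$line" | awk -F'status="' '{print $2}' | cut -d'"' -f1   # → 404
```

The same two-step pattern works for any field: swap status=" for request=", x_forwarded_for=", or whatever key your log_format emits.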

CDN vs. Origin: Where to look for logs

If you're using a CDN (like Cloudflare, CloudFront, or Fastly), there are two different sets of logs to consider:

CDN edge logs: Show requests from actual users hitting the CDN. These contain the real client IPs, user agents, and geographic locations.

Origin server logs (what this post covers): Show requests that made it through the CDN to your Nginx server. These logs often show the CDN's IP address as the client, not the end user's IP.

For troubleshooting, you usually want the CDN logs first. Only check origin logs when you need to see what actually reached your server or when debugging server-side issues.

All examples assume key=value logs (fields like status="404" and request="GET /path HTTP/1.1"), and files under /var/log/nginx/. Adjust names as needed.

What you will learn

  • How to read live and gzipped logs in one go
  • How to find top IPs and top failing URLs
  • How to pull everything from one IP
  • How to zoom in on a short time window
  • How to tie access logs to WAF blocks
  • How to keep results fast and useful

Quick start

Read everything, no matter if the file is .log or .gz:

zgrep -h . /var/log/nginx/ssl-*.access.log*

The "." pattern matches every non-empty line, and zgrep transparently decompresses rotated .gz files, so one command covers both current and archived logs. Add filters after it.

Tools you'll use

zgrep: Like grep but works on both regular files and compressed (.gz) files. Perfect for log files that get rotated and compressed.

awk: A powerful text processing tool that can split lines by delimiters and extract specific fields. Great for parsing structured log formats.

cut: Extracts specific columns or fields from text. Use -d to specify a delimiter (like quotes or commas) and -f to pick which field number you want.
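As a quick sanity check on how these tools combine, here is a made-up request field run through the awk-then-cut pattern used throughout this post: extract the quoted request, keep the URL, drop the query string.

```shell
# A made-up line; extract the URL path without its query string
line='request="GET /blog/post?utm=x HTTP/1.1" status="200"'
echo "$line" \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1   # → /blog/post
```

If a stage of a longer pipeline surprises you, cut the pipeline short and echo a single line through it like this to see exactly what each step produces.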

Find top IPs behind 404 storms

When editors report "lots of broken links," start here.

zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '$2 ~ /^404"/' \
| awk -F'x_forwarded_for="' '{print $2}' \
| cut -d'"' -f1 | cut -d',' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head

Why it helps. You see the worst offenders first. If an IP is scanning random paths, you will spot it fast.

See everything a single IP did

Handy when you need to explain a block or a spike from one source.

IP="112.134.209.112"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| grep -F "x_forwarded_for=\"$IP"

Tip. In some stacks the client IP can live in src or src_ip. If so, swap the field name in the grep.

Top URLs hit by that IP (ignore query strings)

IP="112.134.209.112"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| grep -F "x_forwarded_for=\"$IP" \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head

Why it helps. Dropping the query string groups "the same page with different params" together, which makes patterns obvious.

Top failing URLs by status

404s:

zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '$2 ~ /^404"/' \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head

5xx:

zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'status="' '{split($2,a,"\""); if (a[1] ~ /^5[0-9][0-9]$/) print}' \
| awk -F'request="' '{print $2}' | cut -d'"' -f1 \
| awk '{print $2}' | cut -d'?' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr | head

Why it helps. This gives you a ranked list of problem paths. Fix the top five and you often fix most of the pain.

Zoom in on a short time window

When a spike hits at a known time, you want before and after.

Simple and quick (good enough for a 10-minute window inside the same hour):

# Example: 09/Aug/2025 23:15–23:25
zgrep -h 'time_local="' /var/log/nginx/ssl-*.access.log* \
| grep -E '09/Aug/2025:23:1[5-9]|09/Aug/2025:23:2[0-5]'

Pipe the result into any of the counts above.

Why it helps. You compare traffic right before and right after an event, which is often all you need to see what changed.
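Here is the composition on two fake lines (timestamps, statuses, and IPs are invented for the demo): the window filter keeps only the line inside 23:15-23:19, and the earlier IP-counting stage does the rest.

```shell
# Two fake lines: only the first falls inside the 23:15-23:19 window
printf '%s\n' \
  'time_local="09/Aug/2025:23:17:02 +0000" status="404" x_forwarded_for="203.0.113.9"' \
  'time_local="09/Aug/2025:22:05:41 +0000" status="404" x_forwarded_for="198.51.100.7"' \
| grep -E '09/Aug/2025:23:1[5-9]' \
| awk -F'x_forwarded_for="' '{print $2}' | cut -d'"' -f1 \
| LC_ALL=C sort | uniq -c | sort -nr
```

Against real logs, replace the printf with the zgrep from the quick start and the same two filter stages apply unchanged.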

Form posts and large bodies

Large POSTs to a form can trip WAF rules. Measure them.

PATH_RE="/about/.*speaker-request-form"
zgrep -h . /var/log/nginx/ssl-*.access.log* \
| awk -F'request="' -v r="$PATH_RE" '
  { split($2,a,"\""); split(a[1],b," "); m=b[1]; u=b[2];
    if (m=="POST" && u ~ r) print }' \
| awk -F'request_length="' '{print $2}' | cut -d'"' -f1 \
| awk '{sum+=$1; c++} END{print "POST count="c, "avg_request_length=" (c?sum/c:0)}'

Why it helps. If average request size is high, your WAF may block inspection. You now have proof and numbers.

Tie access logs to WAF blocks

  1. Grab the exact time and rule from your WAF event.
  2. Filter access logs for that window using the "time window" trick.
  3. Look for the same path, the same IP, or a large request_length.
  4. If the WAF rule talks about body size or inspection limits, confirm with the POST analysis above.
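Step 4 can be sketched on a single sample line. The 8192-byte threshold here is a hypothetical inspection limit, not a universal WAF default; substitute the limit your WAF event actually reports.

```shell
# A sample blocked request; 8192 bytes stands in for the WAF's inspection limit
line='request="POST /about/contact HTTP/1.1" request_length="16384" status="403"'
echo "$line" \
| awk -F'request_length="' '{split($2,a,"\""); if (a[1]+0 > 8192) print "over-limit:", a[1]}'
```

Run the same awk over the filtered time window and every over-limit line that reached your origin becomes evidence you can line up against the WAF's block events.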

This closes the loop. You can say what happened and why it happened.

Speed tips that matter

  • Filter early. Add grep 'status="404"' before you sort.
  • Use LC_ALL=C sort for faster and stable sorting.
  • Drop query strings to group pages by path.
  • Sample first. Run head on a pipeline to check you are slicing the right field.
  • Document your log_format in the repo. Future you will thank you.
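"Filter early" in practice: put the cheap literal match before the expensive parsing, so awk only ever sees lines that matter. A two-line demo (sample data, not real logs):

```shell
# Filter early: the cheap literal grep discards lines before awk parses anything
printf '%s\n' \
  'request="GET /a HTTP/1.1" status="200"' \
  'request="GET /missing HTTP/1.1" status="404"' \
| grep 'status="404"' \
| awk -F'request="' '{print $2}' | cut -d'"' -f1   # → GET /missing HTTP/1.1
```

On gigabytes of rotated logs, moving that grep to the front of the pipeline is often the difference between seconds and minutes.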

Common pitfalls

  • Counting by remote_addr when your app sits behind a proxy. Use x_forwarded_for or your real client IP field.
  • Assuming all log files use the same format. Check your nginx.conf log_format before parsing.
  • Forgetting that rotated logs (.gz files) need zgrep, not grep.
  • Using complex regex patterns that break when log formats change. Keep it simple.

Takeaways

Logs answer real questions fast. Start with a clear question, pipe only what you need, and count. Tie what you see in Nginx to what your WAF reports. You will move from "something felt slow" to "this URL failed 2,379 times from one IP in ten minutes, and here is the fix."