Analyze Web Traffic¶
Understand how HTTP traffic behaves, catch spikes in errors, and share data-backed summaries with stakeholders.
When to Use This Guide¶
- Operating Nginx or Apache services that emit combined-format access logs.
- Investigating customer complaints about latency or failed requests.
- Building quick daily reports without moving data into another system.
Before You Start¶
- Ensure the access logs include the fields you need. The bundled sample `examples/simple_combined.log` follows the Apache/Nginx combined format. `request_time` is only present when you add it to your Nginx log format; if it is missing, prefer backend application metrics or use upstream timing fields instead.
- Run commands from the repo root, or update paths to point at your own logs (Kelora handles `.gz` files automatically).
Step 1: Inspect a Sample¶
Confirm the log format and field names before you start filtering.
Key fields available in the combined format:
- `ip`, `timestamp`, `method`, `path`, `status`, `bytes`
- Optional: `request_time` (Nginx custom field), `referer`, `user_agent`
- Use `--keys` or `-k` to display additional headers if you extended the format; a sample inspection run follows below.
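A minimal inspection run, assuming a bare invocation with `-k` prints the selected fields for each parsed event (piping to `head` is plain shell and just limits the sample):

# show the first few parsed events to confirm field names
kelora -f combined examples/simple_combined.log \
  -k ip,timestamp,method,path,status | head -n 5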
Step 2: Highlight Errors and Hotspots¶
Filter for failing requests and capture the context you need for triage.
kelora -f combined examples/simple_combined.log \
--filter 'e.status >= 500' \
-k timestamp,ip,status,request
Tips:
- For client errors, use `'e.status >= 400 && e.status < 500'`.
- When the application encodes errors in the URI, add a second filter such as `e.path.contains("/api/")`; see the combined example after these tips.
- Use `--before-context` and `--after-context` if you need to see neighbouring requests from the same source.
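For example, a client-error sweep scoped to the API might look like this; the `/api/` prefix is illustrative, so substitute a path from your own traffic:

kelora -f combined examples/simple_combined.log \
  --filter 'e.status >= 400 && e.status < 500 && e.path.contains("/api/")' \
  -k timestamp,ip,status,request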
Step 3: Investigate Slow Endpoints¶
Track latency outliers to confirm performance complaints or detect resource exhaustion.
kelora -f combined examples/simple_combined.log \
--filter 'e.get_path("request_time", "0").to_float() > 1.0' \
-k timestamp,method,path,request_time,status
If `request_time` is not logged:
- Switch to backend service logs via Build a Service Health Snapshot.
- Consider adding upstream timing variables (`$upstream_response_time`, `$request_time`) to your Nginx format so Kelora can read them directly, as in the sketch below.
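If you do log `$upstream_response_time`, the same filter pattern applies; the `upstream_response_time` field name below is an assumption about what your extended log format calls it:

# flag requests where the upstream took longer than one second
kelora -f combined examples/simple_combined.log \
  --filter 'e.get_path("upstream_response_time", "0").to_float() > 1.0' \
  -k timestamp,method,path,upstream_response_time,status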
Step 4: Summarise by Status and Source¶
Generate quick aggregates to prioritise remediation work or include in change reviews.
kelora -f combined examples/simple_combined.log \
-e 'track_count("status_" + e.status)' \
-e 'track_count("method_" + e.method)' \
-e 'track_count(e.ip)' \
--metrics
- `track_count(e.ip)` highlights noisy consumers or suspicious sources.
- Use `track_bucket()` with `request_time` to build latency histograms (sketched below).
- Run with `--stats` for throughput metrics and parse error counts.
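A latency histogram could look like the sketch below; the exact `track_bucket()` signature used here (metric name plus numeric value) is an assumption, so check the function reference for your Kelora version:

# assumed signature: track_bucket(name, value)
kelora -f combined examples/simple_combined.log \
  -e 'track_bucket("request_time", e.get_path("request_time", "0").to_float())' \
  --metrics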
Step 5: Export a Shareable Slice¶
Deliver the findings to teammates or import them into downstream tools.
kelora -f combined examples/simple_combined.log \
--filter 'e.status >= 500' \
-k timestamp,ip,status,request,user_agent \
-F csv > web-errors.csv
Alternatives:
- `-J` to produce JSON for ingestion into a SIEM (see the example below).
- Add `--no-diagnostics` to suppress diagnostics if the output is piped into another script.
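Put together, a SIEM-friendly export might look like the following, assuming `-J` emits one JSON object per event:

kelora -f combined examples/simple_combined.log \
  --filter 'e.status >= 500' \
  --no-diagnostics -J > web-errors.json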
Variations¶
- Focus on a specific endpoint (sketched below)
- Compare time windows
- Detect suspicious behaviour
- Process rotated archives (sketched below)
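Two of these as quick sketches: the endpoint path is hypothetical, and the archive glob assumes Kelora accepts multiple input files (it decompresses `.gz` automatically, as noted above):

# focus on one (hypothetical) endpoint
kelora -f combined examples/simple_combined.log \
  --filter 'e.path == "/api/checkout"' \
  -k timestamp,status,request_time

# process rotated archives (paths are illustrative)
kelora -f combined /var/log/nginx/access.log.*.gz \
  --filter 'e.status >= 500' \
  -k timestamp,ip,status,request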
Validate and Communicate¶
- Use `--strict` if you added new log fields and want Kelora to stop on parsing mistakes (combined with `--stats` in the sketch below).
- Attach `--stats` output to change reports so readers can see event counts and error rates.
- Note whether `request_time` or other custom fields were available; this influences how teams interpret latency results.
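A simple validation pass combining both flags, as a sketch:

# stop on parse errors and report event counts alongside the error slice
kelora -f combined examples/simple_combined.log \
  --strict --stats \
  --filter 'e.status >= 500' \
  -k timestamp,ip,status,request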
See Also¶
- Process Archives at Scale for large historical datasets.
- Design Streaming Alerts to notify on live 5xx spikes.
- Investigate Syslog Sources for load balancer or reverse-proxy system logs.