In this module, you will learn how to collect analytics data using the web server itself — no custom JavaScript collector required. Using tracking pixels and custom log formats, the server becomes the data collector.
Note: tracking-pixel.php requires a PHP-capable server, for example Apache with mod_php or PHP's built-in development server (php -S localhost:8000).
Modules 01 and 02 used JavaScript to collect data and send beacons. This module takes a completely different approach — using the web server itself as the data collector. No JavaScript required.
This is the oldest analytics technique on the web, predating Google Analytics by a decade. In the mid-1990s, tools like Analog and Webalizer parsed server access logs to produce traffic reports. The approach fell out of fashion as JavaScript-based analytics offered richer client-side data, but it remains useful as a fallback and for capturing traffic that JavaScript cannot see.
The key insight is that every HTTP request the browser makes is logged by the web server. If you can trigger a request with useful data attached, the server will record it for you automatically.
A tracking pixel is a 1x1 transparent GIF image embedded in the page. When the browser renders the page, it requests the image, and the server logs that request. The image is invisible to the user, but the HTTP request it generates carries valuable data:
<!-- Basic tracking pixel -->
<img src="https://analytics.example.com/pixel.php?page=/home&t=pageview"
width="1" height="1" alt="" style="position:absolute;left:-9999px">
<!-- With cache-busting (prevents browser from caching the pixel) -->
<img src="https://analytics.example.com/pixel.php?page=/home&r=1706123456789"
width="1" height="1" alt="">
<!-- noscript fallback — works even when JavaScript is disabled -->
<noscript>
<img src="https://analytics.example.com/pixel.php?page=/home&js=0"
width="1" height="1" alt="">
</noscript>
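The style="position:absolute;left:-9999px" pushes the pixel off-screen so it does not affect page layout. The cache-busting technique appends a unique value (typically a timestamp) to the query string, so the browser makes a fresh request every time rather than serving the image from cache. The value has to be generated for each page view, for example by the server-side template that renders the page. A number hard-coded into static HTML, like the one shown here, stops busting the cache after the first request.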
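The <noscript> version is important: the browser renders its contents only when JavaScript is disabled, so the js=0 parameter flags visitors that a script-based collector never sees. A plain image request is one of the few collection methods that still works with JavaScript turned off, which is why tag snippets such as Google Tag Manager's ship with a <noscript> fallback.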
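As a rough illustration of the cache-busting idea, the sketch below builds a pixel URL with a fresh value on each call. It is written in JavaScript so it can run in Node.js (for example in a server-side template) or in the browser. The helper name buildPixelUrl is hypothetical; the endpoint and the page, t, and r parameter names simply mirror the HTML examples above.

```javascript
// Hypothetical helper: builds a tracking-pixel URL with a cache-busting value.
// The endpoint and parameter names (page, t, r) follow the HTML examples above.
function buildPixelUrl(endpoint, page, type, now = Date.now()) {
  const url = new URL(endpoint);
  url.searchParams.set('page', page);      // values are percent-encoded automatically
  url.searchParams.set('t', type);
  url.searchParams.set('r', String(now));  // changes on every call, so caches never match
  return url.toString();
}

// A fixed timestamp is passed here only to make the output reproducible.
const pixelUrl = buildPixelUrl(
  'https://analytics.example.com/pixel.php', '/home', 'pageview', 1706123456789);
console.log(pixelUrl);
// https://analytics.example.com/pixel.php?page=%2Fhome&t=pageview&r=1706123456789
```

The slash in /home is encoded as %2F; PHP decodes it again when it populates $_GET, so the logged page value is unchanged.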
The tracking-pixel.php script does two things: serves a 1x1 transparent GIF to the browser, and logs the request data to a file. Let us walk through it section by section.
<?php
// tracking-pixel.php — Serves a 1x1 transparent GIF and logs the request
// Prevent caching so every page view generates a new request
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Pragma: no-cache');
header('Expires: 0');
These headers are essential. If the browser caches the pixel image, subsequent page views will not generate new HTTP requests — the browser will simply reuse the cached version. That means lost data. The three headers cover different caching layers:
- Cache-Control: no-store tells the browser and any intermediate proxies not to store the response at all
- Pragma: no-cache provides HTTP/1.0 backward compatibility
- Expires: 0 marks the response as already expired
// Set content type to GIF image
header('Content-Type: image/gif');
// One of the smallest valid transparent GIFs (42 bytes): a 1x1 transparent pixel
// This is the GIF89a header, a two-color palette, and a single transparent pixel
echo base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
The base64 string decodes to a 42-byte GIF89a image, about as small as a valid transparent GIF gets. The browser receives a real image, renders it (invisibly), and the user sees nothing. Meanwhile, the server has captured the request.
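To check what the base64 string actually contains, a short Node.js sketch can decode it and read a few fields from the GIF structure. The variable names are illustrative; Buffer is a Node.js global, so this does not run unchanged in a browser. The decoded length is 42 bytes (56 base64 characters with no padding).

```javascript
// Decode the pixel and inspect its structure (Node.js only: Buffer is a Node global).
const PIXEL_BASE64 = 'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7';
const pixel = Buffer.from(PIXEL_BASE64, 'base64');

const signature = pixel.subarray(0, 6).toString('latin1'); // file signature and version
const gifWidth = pixel.readUInt16LE(6);    // logical screen width, little-endian
const gifHeight = pixel.readUInt16LE(8);   // logical screen height, little-endian
const trailer = pixel[pixel.length - 1];   // 0x3B marks the end of a GIF stream

console.log(pixel.length, signature, gifWidth, gifHeight, trailer.toString(16));
// 42 GIF89a 1 1 3b
```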
// Log the hit (in production, write to a database or log file)
$data = [
'timestamp' => date('c'),
'ip' => $_SERVER['REMOTE_ADDR'] ?? '',
'ua' => $_SERVER['HTTP_USER_AGENT'] ?? '',
'referer' => $_SERVER['HTTP_REFERER'] ?? '',
'page' => $_GET['page'] ?? '',
'type' => $_GET['t'] ?? 'pageview',
'language' => $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '',
];
// Append to a JSON Lines file
$logFile = __DIR__ . '/pixel-hits.jsonl';
file_put_contents($logFile, json_encode($data) . "\n", FILE_APPEND | LOCK_EX);
The script extracts data from two sources: the $_GET superglobal (query parameters you control) and the $_SERVER superglobal (the client address and the HTTP headers the browser sends automatically). It writes each hit as a single line of JSON to a .jsonl (JSON Lines) file: one JSON object per line, easy to parse with standard tools.
The LOCK_EX flag ensures that concurrent requests do not corrupt the file by writing simultaneously.
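Because each line is an independent JSON object, the file can be processed one line at a time. The following is a minimal sketch that counts hits per page; the function name countHitsByPage and the sample records are made up for illustration, while the field names match the $data array above. In practice the text would be read from pixel-hits.jsonl rather than from an inline string.

```javascript
// Hypothetical helper: count hits per page from JSON Lines text.
function countHitsByPage(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue;   // skip blank lines, including the trailing one
    let hit;
    try {
      hit = JSON.parse(line);
    } catch {
      continue;                         // skip a truncated or malformed line
    }
    const page = hit.page || '(unknown)';
    counts.set(page, (counts.get(page) || 0) + 1);
  }
  return counts;
}

// Invented sample data standing in for the contents of pixel-hits.jsonl
const sampleHits = [
  '{"timestamp":"2024-01-24T19:10:56+00:00","ip":"203.0.113.7","page":"/home","type":"pageview"}',
  '{"timestamp":"2024-01-24T19:11:02+00:00","ip":"203.0.113.9","page":"/pricing","type":"pageview"}',
  '{"timestamp":"2024-01-24T19:11:30+00:00","ip":"203.0.113.7","page":"/home","type":"pageview"}',
  '',
].join('\n');

const hitsByPage = countHitsByPage(sampleHits);
console.log(Object.fromEntries(hitsByPage)); // { '/home': 2, '/pricing': 1 }
```

Skipping unparseable lines instead of aborting is a design choice: one damaged record should not block a report over the rest of the file.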
Instead of writing PHP to log data, you can configure the web server itself to capture richer data in its access log. This requires no application code at all — just a configuration change.
# Standard Combined Log Format (what most servers use by default)
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
# Extended format for analytics: adds response time, Accept-Language, and a session cookie
LogFormat "%h %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\" \"%{Accept-Language}i\" \"%{_session}C\"" analytics
# Using the custom format
CustomLog /var/log/apache2/analytics.log analytics
Each format directive captures a specific piece of data:
| Directive | Meaning |
|---|---|
| %h | Remote host (client IP address) |
| %t | Timestamp of the request |
| %r | Request line (e.g., GET /page.html HTTP/1.1) |
| %>s | Final HTTP status code |
| %b | Response size in bytes |
| %D | Time to serve the request in microseconds |
| %{Header}i | Value of the named input (request) header |
| %{Cookie}C | Value of the named cookie |
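Once the log is written, each line has to be split back into fields. The sketch below shows one possible way to do that for the extended analytics format with a regular expression. The function name parseAnalyticsLine and the sample line are invented for illustration, and the pattern assumes the exact field order of that LogFormat; any change to the format requires a matching change to the pattern.

```javascript
// Field order assumed from the "analytics" LogFormat:
// %h %t "%r" %>s %b %D "%{Referer}i" "%{User-Agent}i" "%{Accept-Language}i" "%{_session}C"
const QUOTED = '"((?:[^"\\\\]|\\\\.)*)"'; // quoted field; Apache escapes embedded quotes as \"
const ANALYTICS_LINE = new RegExp(
  '^(\\S+) \\[([^\\]]+)\\] ' + QUOTED + ' (\\d{3}) (\\S+) (\\d+) ' +
  QUOTED + ' ' + QUOTED + ' ' + QUOTED + ' ' + QUOTED + '$'
);

// Hypothetical helper: returns an object of fields, or null if the line does not match.
function parseAnalyticsLine(line) {
  const m = ANALYTICS_LINE.exec(line);
  if (!m) return null;
  const [method, path] = m[3].split(' ');    // request line: METHOD PATH PROTOCOL
  return {
    ip: m[1],
    time: m[2],
    method,
    path,
    status: Number(m[4]),
    bytes: m[5] === '-' ? 0 : Number(m[5]),  // %b logs "-" when no body was sent
    durationMs: Number(m[6]) / 1000,         // %D is in microseconds
    referer: m[7],
    userAgent: m[8],
    language: m[9],
    session: m[10],
  };
}

// Invented sample line using a documentation-range IP address
const sampleLine =
  '203.0.113.7 [24/Jan/2024:19:10:56 +0000] "GET /home?ref=nav HTTP/1.1" 200 5120 1834 ' +
  '"https://example.com/" "Mozilla/5.0" "en-US,en;q=0.9" "abc123"';
console.log(parseAnalyticsLine(sampleLine));
```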
# Nginx: extended analytics format
log_format analytics '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'"$http_accept_language" '
'$request_time '
'"$http_x_forwarded_for"';
access_log /var/log/nginx/analytics.log analytics;
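Nginx uses variable names instead of percent directives. The $request_time variable gives response time in seconds with millisecond precision, and $http_x_forwarded_for logs the X-Forwarded-For request header, which carries the original client IP when the server sits behind a reverse proxy or load balancer. Because clients can send this header themselves, trust it only when your own proxy sets it.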
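Two of these Nginx fields usually need light normalization before analysis. The sketch below uses hypothetical helper names and assumes that the leftmost X-Forwarded-For entry is the original client, which only holds when your own proxy sets or sanitizes the header.

```javascript
// Hypothetical helper: pick the client IP from X-Forwarded-For, falling back to $remote_addr.
// Assumption: the leftmost entry is the original client (true only behind a trusted proxy).
function clientIpFromForwardedFor(xff, remoteAddr) {
  if (!xff || xff === '-') return remoteAddr;  // Nginx logs "-" when the header is absent
  const first = xff.split(',')[0].trim();
  return first || remoteAddr;
}

// Hypothetical helper: $request_time is seconds with millisecond resolution, e.g. "0.005".
function requestTimeToMs(requestTime) {
  return Math.round(parseFloat(requestTime) * 1000);
}

console.log(clientIpFromForwardedFor('198.51.100.4, 10.0.0.2', '10.0.0.2')); // 198.51.100.4
console.log(clientIpFromForwardedFor('-', '203.0.113.7'));                   // 203.0.113.7
console.log(requestTimeToMs('0.005'));                                       // 5
```

Converting to milliseconds also puts the value on a scale that is easy to compare with Apache's %D, which is reported in microseconds.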
The User-Agent string has become increasingly unreliable — browsers are freezing and reducing their UA strings to prevent fingerprinting. HTTP Client Hints provide a structured alternative. The server opts in by sending an Accept-CH response header, and the browser then includes the requested hints in subsequent requests.
# Server response header (opt-in)
Accept-CH: Sec-CH-UA, Sec-CH-UA-Platform, Sec-CH-UA-Mobile, ECT, Downlink, RTT
# Subsequent browser request headers
Sec-CH-UA: "Chromium";v="120", "Google Chrome";v="120"
Sec-CH-UA-Platform: "macOS"
Sec-CH-UA-Mobile: ?0
ECT: 4g
Downlink: 10
RTT: 50
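These headers provide structured, machine-readable data instead of a monolithic User-Agent string that must be parsed with complex regex patterns. The network-quality hints (ECT, Downlink, RTT) are particularly valuable for analytics: they report the user's effective connection type, estimated bandwidth in Mbps, and estimated round-trip time in milliseconds. Client Hints are currently sent only by Chromium-based browsers and only over HTTPS, so expect these fields to be empty for a share of your traffic.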
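To show what "structured" means in practice, here is a small sketch that turns a logged Sec-CH-UA value into a list of brand and version pairs. The helper names are hypothetical, and the regular expression handles only the simple shape shown in the example headers above; it is not a complete Structured Fields parser.

```javascript
// Hypothetical helper: extract brand/version pairs from a Sec-CH-UA header value.
// Handles only the common  "Brand";v="version"  shape, not every Structured Fields case.
function parseSecChUa(headerValue) {
  const brands = [];
  const pattern = /"((?:[^"\\]|\\.)*)"\s*;\s*v="([^"]*)"/g;
  for (const match of headerValue.matchAll(pattern)) {
    brands.push({ brand: match[1], version: match[2] });
  }
  return brands;
}

// Sec-CH-UA-Mobile is a structured boolean: ?1 means true, ?0 means false.
function parseSecChUaMobile(headerValue) {
  return headerValue.trim() === '?1';
}

const uaBrands = parseSecChUa('"Chromium";v="120", "Google Chrome";v="120"');
console.log(uaBrands);
console.log(parseSecChUaMobile('?0')); // false
```

An empty array is returned for a missing or unrecognized header, so callers can treat "no Client Hints" as a normal case.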
You can capture Client Hints in your server log format:
# Apache — capture Client Hints in log
LogFormat "%h %t \"%r\" %>s \"%{Sec-CH-UA}i\" \"%{Sec-CH-UA-Platform}i\" \"%{ECT}i\" \"%{Downlink}i\"" hints
JavaScript can set cookies, and cookies are sent as HTTP headers with every request. This creates a bridge between client-side data collection and server-side logging: JavaScript collects data that is only available in the browser, stores it in a cookie, and the server logs the cookie value from the HTTP headers.
// Client-side: store viewport size in a cookie
document.cookie = 'vp=' + window.innerWidth + 'x' + window.innerHeight +
';path=/;max-age=1800;SameSite=Lax';
# Server-side: log the cookie value
LogFormat "%h %t \"%r\" %>s \"%{vp}C\"" viewport-log
The first request from a user will not include the cookie (it has not been set yet), but every subsequent request will carry the viewport data in the Cookie header. The server logs it without any application code.
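When reading the viewport-log afterwards, the cookie value needs validating, since the first request logs a placeholder and cookies are under the client's control. A minimal sketch, with a hypothetical helper name and assuming the WIDTHxHEIGHT format written by the client-side snippet above:

```javascript
// Hypothetical helper: parse a logged "vp" cookie value such as "1280x720".
// Returns null for "-" (Apache's placeholder when the cookie is absent) or any unexpected value.
function parseViewport(cookieValue) {
  const m = /^(\d{1,5})x(\d{1,5})$/.exec(cookieValue || '');
  if (!m) return null;
  return { width: Number(m[1]), height: Number(m[2]) };
}

console.log(parseViewport('1280x720')); // { width: 1280, height: 720 }
console.log(parseViewport('-'));        // null
```

Rejecting anything that is not two short runs of digits keeps tampered or malformed cookie values out of the analysis.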
How does server-log collection compare to the JavaScript beacon approach from Modules 01 and 02?
| | JavaScript Beacons (Modules 01-02) | Server-Log Collection |
|---|---|---|
| Works without JS | No | Yes (tracking pixel) |
| Client-side data | Full access (viewport, JS APIs) | Limited (headers, Client Hints) |
| Implementation | Custom script | Server config |
| Data granularity | High (custom events, timing) | Lower (page views, basic metadata) |
| Bot traffic | Can filter client-side | Captures everything |
| Privacy | Can check consent in JS | Harder to add consent |