Build the same collection endpoint in PHP using PDO prepared statements. The validation and enrichment logic is identical to the Node.js version — same fields, same 204 response, same security considerations — but the runtime model is fundamentally different.
The collection logic is the same in both languages: receive a POST, parse the JSON body, validate the fields, enrich with server-side data, insert into the database, and return 204. The differences are in how the runtime handles requests.
| Aspect | PHP | Node.js |
|---|---|---|
| Execution model | Per-request process — script starts, runs, and exits for every request | Long-running process — single event loop handles all requests |
| State between requests | None by default — no shared memory between requests | Shared — variables persist across requests in the same process |
| Database driver | PDO (PHP Data Objects) | mysql2 / pg / sqlite3 |
| Connection pooling | No application-level pool; each PHP worker process can reuse a connection via PDO::ATTR_PERSISTENT | Handled in application code (e.g., mysql2/promise pool) |
| Crash isolation | A fatal error kills only one request | An unhandled exception can crash the entire server |
| Deployment | Drop files into the web server directory | Run a process, manage with pm2 or systemd |
PHP's per-request model means every request gets a clean slate. There is no risk of memory leaks accumulating over time, and a bug in one request cannot corrupt state for another. The tradeoff is that PHP must re-establish database connections and re-initialize state on every request (though persistent connections mitigate this).
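As a rough illustration of the persistent-connection option mentioned above, the PDO options could be built by a small helper. The function name collectorPdoOptions is hypothetical and not part of the original collect.php; this is a sketch, not the tutorial's actual configuration.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns the option
// array handed to the PDO constructor.
function collectorPdoOptions(bool $persistent = true): array
{
    return [
        // Throw exceptions instead of returning false on failure.
        PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
        // Ask PDO to reuse an open connection cached by this PHP process
        // instead of opening a new one on every request.
        PDO::ATTR_PERSISTENT => $persistent,
    ];
}
```

It would be passed as the fourth constructor argument, for example new PDO($dsn, $user, $password, collectorPdoOptions()). Persistent connections are cached per PHP process, so the saving depends on how the web server reuses its workers.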
The collector script sends beacons to /collect, but the actual PHP file is collect.php. Apache's mod_rewrite bridges this gap:
RewriteEngine On
RewriteRule ^collect$ collect.php [L,QSA]
This rule tells Apache: when a request comes in for /collect (no file extension), internally rewrite it to collect.php. The client never sees the rewrite — the URL stays as /collect.
- L: Last rule. Stop processing further rewrite rules after this match.
- QSA: Query String Append. Combines the original query string with the one in the rewritten URL. Because this substitution has no query string of its own, Apache already passes the original through, so the flag is redundant here but harmless, and it keeps the rule correct if a query string is added to the target later.

mod_rewrite must be enabled on the server. On Ubuntu/Debian, run sudo a2enmod rewrite and restart Apache. Also ensure the directory's AllowOverride directive includes FileInfo or All so that .htaccess files are respected.
In the Node.js version, routing is handled in application code (app.post('/collect', ...)). In PHP, routing is typically delegated to the web server via rewrite rules. This is a fundamental architectural difference — PHP separates the routing concern from the application logic.
Here is collect.php broken down step by step.
Just like the Node.js version, the PHP endpoint must send CORS headers to allow cross-origin beacons:
header('Access-Control-Allow-Origin: *');
header('Access-Control-Allow-Methods: POST, OPTIONS');
header('Access-Control-Allow-Headers: Content-Type');
if ($_SERVER['REQUEST_METHOD'] === 'OPTIONS') {
http_response_code(204);
exit;
}
PHP's header() function sets HTTP response headers. The exit after handling OPTIONS ensures no further code runs for preflight requests.
Reject anything that is not a POST:
if ($_SERVER['REQUEST_METHOD'] !== 'POST') {
http_response_code(405);
exit;
}
In Node.js/Express, this is implicit — app.post('/collect', ...) only matches POST requests. In PHP, you must check explicitly because the script runs for any HTTP method.
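The two method checks can be folded into one decision function, which makes the logic easy to test outside a web server. This is a minimal sketch: gateRequest is a hypothetical name, and the Allow header on the 405 response is an addition not present in the original script (HTTP expects a 405 to list the permitted methods).

```php
<?php
// Hypothetical helper: decides how to answer based on the HTTP method alone.
// Returns null when the request should continue to the collection logic.
function gateRequest(string $method): ?array
{
    if ($method === 'OPTIONS') {
        // CORS preflight: answer with an empty 204.
        return ['status' => 204, 'headers' => []];
    }
    if ($method !== 'POST') {
        // Anything else is rejected; Allow tells the client what is accepted.
        return ['status' => 405, 'headers' => ['Allow: POST, OPTIONS']];
    }
    return null;
}
```

In collect.php the result would be applied with header() and http_response_code() followed by exit, passing $_SERVER['REQUEST_METHOD'] as the argument.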
PHP does not automatically parse JSON request bodies the way Express middleware does. You must read the raw input stream and decode it manually:
$raw = file_get_contents('php://input');
$data = json_decode($raw, true);
php://input is a read-only stream that gives you the raw request body. The true parameter to json_decode returns an associative array instead of an object, which is more convenient for field access with the ?? null coalescing operator.
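One detail worth guarding against: json_decode returns null for malformed JSON, and valid JSON scalars such as 42 or "text" decode to non-array values. A small wrapper can normalize both cases. The name decodeBeacon is an assumption for illustration.

```php
<?php
// Hypothetical helper: decode a raw request body into an associative array,
// or return null when the body is not a JSON object or array.
function decodeBeacon(string $raw): ?array
{
    $data = json_decode($raw, true);

    // Malformed JSON yields null; scalars decode to int/string/bool.
    // Only an array is usable for field access like $data['url'].
    return is_array($data) ? $data : null;
}
```

Usage would look like $data = decodeBeacon(file_get_contents('php://input')); followed by a null check.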
Silently reject invalid data with a 204 — do not leak validation details to potential attackers:
if (!is_array($data) || !is_string($data['url'] ?? null) || $data['url'] === '') {
http_response_code(204);
exit;
}
$allowedTypes = ['pageview', 'event', 'error', 'performance'];
$type = in_array($data['type'] ?? '', $allowedTypes, true) ? $data['type'] : 'pageview';
Note the difference from the Node.js version: here we return 204 even for invalid data instead of 400. This is a deliberate security choice — returning different status codes for valid vs. invalid data tells an attacker which fields are required and which values are accepted. A uniform 204 reveals nothing.
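The type allowlist can be isolated into a function that always returns a safe value. This sketch assumes the same four types as above; normalizeType and COLLECT_ALLOWED_TYPES are hypothetical names. It passes true as the third argument to in_array so the comparison is strict; with loose comparison, PHP versions before 8.0 treat the integer 0 as equal to any non-numeric string.

```php
<?php
// Assumed allowlist, copied from the validation step above.
const COLLECT_ALLOWED_TYPES = ['pageview', 'event', 'error', 'performance'];

// Hypothetical helper: returns the submitted type if it is on the allowlist,
// otherwise falls back to 'pageview'.
function normalizeType($candidate): string
{
    // Strict mode (third argument) requires an exact string match.
    return in_array($candidate, COLLECT_ALLOWED_TYPES, true)
        ? $candidate
        : 'pageview';
}
```

For example, normalizeType($data['type'] ?? null) yields 'event' for a valid event beacon and 'pageview' for anything missing or unexpected.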
$serverTimestamp = date('Y-m-d H:i:s');
$clientIp = $_SERVER['REMOTE_ADDR'] ?? '';
Same enrichment as the Node.js version: a server-side timestamp (because client clocks cannot be trusted) and the client IP address (for geolocation and bot detection).
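The enrichment step can be sketched as a pure function that takes the server array as input, which keeps it testable. Two assumptions to note: enrichBeacon is a hypothetical name, and this version uses gmdate() rather than date() so the stored timestamp is UTC regardless of the server's configured timezone.

```php
<?php
// Hypothetical helper: produce the server-side fields added to every beacon.
// $now is injectable so the function can be tested with a fixed clock.
function enrichBeacon(array $server, ?int $now = null): array
{
    return [
        // UTC timestamp in MySQL DATETIME format (gmdate is a substitution
        // for the date() call shown above).
        'server_timestamp' => gmdate('Y-m-d H:i:s', $now ?? time()),
        // Address of the connecting peer; empty string if unavailable.
        'client_ip'        => $server['REMOTE_ADDR'] ?? '',
    ];
}
```

In the endpoint it would be called as enrichBeacon($_SERVER). Behind a reverse proxy, REMOTE_ADDR is the proxy's address, so the value reflects whatever connects directly to PHP.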
This is where PHP and Node.js differ most at the code level. PHP uses PDO with positional placeholders (?):
$pdo = new PDO(
'mysql:host=localhost;dbname=analytics;charset=utf8mb4',
'root', '',
[PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);
$stmt = $pdo->prepare(
'INSERT INTO pageviews (url, type, user_agent, viewport_width,
viewport_height, referrer, client_timestamp, server_timestamp,
client_ip, session_id, payload)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)'
);
$stmt->execute([
substr($data['url'], 0, 2048),
$type,
substr($data['userAgent'] ?? '', 0, 512),
isset($data['viewportWidth']) ? (int)$data['viewportWidth'] : null,
isset($data['viewportHeight']) ? (int)$data['viewportHeight'] : null,
substr($data['referrer'] ?? '', 0, 2048),
$data['timestamp'] ?? null,
$serverTimestamp,
$clientIp,
substr($data['sessionId'] ?? '', 0, 64),
isset($data['payload']) ? json_encode($data['payload']) : null,
]);
Key points about the PDO approach:
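- ? placeholders are bound by the database driver, not inserted by string concatenation. This prevents SQL injection by design.
- substr() for length limits: truncate strings to match the database column sizes. Without this, MySQL would silently truncate or throw an error depending on strict mode. substr() counts bytes, so it can split a multi-byte UTF-8 character; mb_substr() avoids that where the mbstring extension is available.
- Null coalescing (??): safely handle missing fields without triggering undefined index notices.
- Type casting: (int) for viewport dimensions ensures the values are integers, not strings.
- PDO::ERRMODE_EXCEPTION makes PDO throw exceptions on errors instead of returning false. This is critical on PHP versions before 8.0, where failed queries otherwise silently return false and your data disappears. PHP 8.0 made exceptions the default, but setting the mode explicitly keeps behavior consistent across versions.

try {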
// ... PDO code above ...
} catch (PDOException $e) {
error_log('Analytics collect error: ' . $e->getMessage());
}
http_response_code(204);
Errors are logged server-side but never exposed to the client. The endpoint always returns 204, whether the INSERT succeeded or failed. This prevents information leakage and ensures the collector script is not affected by database issues.
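The parameter list passed to execute() can be assembled by a pure function, which makes the truncation and casting rules checkable without a database. This is a sketch under assumptions: buildInsertParams and clipString are hypothetical names, and the type checks (is_string, is_numeric, is_scalar) are stricter than the inline version so that a nested JSON value in a text field cannot raise a TypeError.

```php
<?php
// Hypothetical helper: byte-truncate a string field; non-strings become ''.
function clipString($value, int $max): string
{
    return is_string($value) ? substr($value, 0, $max) : '';
}

// Hypothetical helper: returns the 11 positional values in the same order
// as the columns in the INSERT statement.
function buildInsertParams(array $data, string $type, string $serverTimestamp, string $clientIp): array
{
    $width  = $data['viewportWidth'] ?? null;
    $height = $data['viewportHeight'] ?? null;
    $ts     = $data['timestamp'] ?? null;

    return [
        clipString($data['url'] ?? '', 2048),
        $type,
        clipString($data['userAgent'] ?? '', 512),
        is_numeric($width) ? (int)$width : null,
        is_numeric($height) ? (int)$height : null,
        clipString($data['referrer'] ?? '', 2048),
        is_scalar($ts) ? $ts : null,
        $serverTimestamp,
        $clientIp,
        clipString($data['sessionId'] ?? '', 64),
        isset($data['payload']) ? json_encode($data['payload']) : null,
    ];
}
```

The result would be passed straight to $stmt->execute(buildInsertParams($data, $type, $serverTimestamp, $clientIp)) inside the try block.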
PDO prepared statements are the primary defense against SQL injection. Compare the unsafe approach with the safe one:
// UNSAFE - string concatenation allows injection
$pdo->query("INSERT INTO pageviews (url) VALUES ('$url')");
// SAFE - prepared statement separates SQL from data
$stmt = $pdo->prepare("INSERT INTO pageviews (url) VALUES (?)");
$stmt->execute([$url]);
With prepared statements, the database driver handles escaping. Even if $url contains SQL metacharacters like ' or ;, they are treated as literal data, not SQL syntax. There is no way to break out of the placeholder.
When you later display the collected data in a dashboard, use htmlspecialchars() to prevent stored XSS:
// When displaying collected data in HTML
echo htmlspecialchars($row['url'], ENT_QUOTES, 'UTF-8');
An attacker could send a beacon with a URL like <script>alert('xss')</script>. If you render this directly in your dashboard HTML without escaping, the script executes. htmlspecialchars() converts < to &lt;, neutralizing the attack.
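To make the escaping concrete, here is a minimal sketch that wraps the call in a helper and applies it to the attack string from the example. The name escapeHtml is an assumption; a dashboard might use a template engine's auto-escaping instead.

```php
<?php
// Hypothetical helper: escape a stored value for safe output in HTML text
// or attribute context. ENT_QUOTES also encodes single and double quotes.
function escapeHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The malicious "URL" from the example above.
$stored = "<script>alert('xss')</script>";
$safe = escapeHtml($stored);
// $safe is now: &lt;script&gt;alert(&#039;xss&#039;)&lt;/script&gt;
```

The browser renders the escaped value as visible text rather than executing it.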
Use test.html to send a test beacon to the PHP endpoint. Open the page, click the button, and check your browser's Network tab for the 204 response.
You can also test with curl from the command line:
curl -X POST http://localhost/collect \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com","type":"pageview","userAgent":"curl/test"}' \
-v
Look for HTTP/1.1 204 No Content in the response. Then verify the data was inserted:
mysql -u root analytics -e "SELECT * FROM pageviews ORDER BY id DESC LIMIT 5;"
The test page sends a beacon with all the standard fields (URL, type, user agent, viewport dimensions, referrer, session ID) and displays the result. It uses the same approach as the Node.js test page — a simple fetch POST with a JSON body.
- .htaccess with mod_rewrite maps clean URLs (/collect) to PHP scripts (collect.php)
- php://input reads the raw request body; json_decode parses it
- Use substr() to enforce field length limits at collection time
- Log errors with error_log(), never expose them to clients