Module 01: Node.js Collection Endpoint

Build an Express.js server that receives analytics beacons, validates the payload, enriches it with server-side data, and inserts it into MySQL.

Demo Files

1. Project Setup

Create a new directory and initialize a Node.js project. We need two packages: express for HTTP handling and mysql2 for the database connection.

mkdir analytics-endpoint && cd analytics-endpoint
npm init -y
npm install express mysql2

After installing, your package.json will list both dependencies. The mysql2 package is preferred over the older mysql package because it supports Promises, prepared statements, and the newer MySQL authentication protocol out of the box.

Prerequisite: You need MySQL installed and running locally. Create the analytics database before proceeding: CREATE DATABASE analytics;

Once the database exists, run the schema file to create the pageviews table:

mysql -u root analytics < schema.sql

2. Database Schema

The schema.sql file defines the pageviews table. Let us walk through each column and why it exists:

Column Type Purpose
id INT AUTO_INCREMENT Unique row identifier. Every INSERT gets the next integer automatically.
url VARCHAR(2048) The page URL that generated this beacon. 2048 characters accommodates even the longest query strings.
type ENUM Restricts values to pageview, event, error, or performance. The database enforces the allowlist, not just the application.
user_agent VARCHAR(512) The browser's User-Agent string. Useful for identifying browser type, version, and device category.
viewport_width / height SMALLINT UNSIGNED The browser viewport dimensions in pixels. Tells you whether the user is on mobile, tablet, or desktop without parsing the User-Agent.
referrer VARCHAR(2048) The referring page URL. Reveals where traffic comes from: search engines, social media, direct navigation.
client_timestamp DATETIME When the event occurred on the client. May be inaccurate due to clock skew, but still useful for measuring client-side latency.
server_timestamp DATETIME NOT NULL When the server received the beacon. This is the authoritative timestamp for ordering events.
client_ip VARCHAR(45) The client's IP address. VARCHAR(45) accommodates both IPv4 (max 15 chars) and IPv6 (max 45 chars).
session_id VARCHAR(64) An optional session identifier set by the collector. Links multiple pageviews into a single user session.
payload JSON A flexible column for additional data that does not fit the fixed schema: custom events, performance metrics, error details.
created_at TIMESTAMP Automatically set by MySQL when the row is inserted. Useful for auditing and as a secondary time reference.

The schema includes three indexes for common query patterns:

Why InnoDB? InnoDB is MySQL's default storage engine. It supports transactions, row-level locking, and foreign keys. For an analytics workload with many concurrent INSERTs, InnoDB's row-level locking is far better than MyISAM's table-level locking.

3. The Express Endpoint

The collect.js file is roughly 80 lines. Let us walk through each section.

Step 1: Imports and Connection Pool

const express = require('express');
const mysql = require('mysql2');

const app = express();
const PORT = 3006;

const pool = mysql.createPool({
  host: 'localhost',
  user: 'root',
  database: 'analytics',
  waitForConnections: true,
  connectionLimit: 10
});

We use a connection pool instead of a single connection. A pool maintains a set of reusable connections to the database. When a request arrives, it borrows a connection from the pool, executes the query, and returns the connection for the next request. This avoids the overhead of establishing a new TCP connection to MySQL for every beacon and prevents connection exhaustion under load.

Step 2: CORS Middleware

app.use((req, res, next) => {
  res.header('Access-Control-Allow-Origin', '*');
  res.header('Access-Control-Allow-Methods', 'POST, OPTIONS');
  res.header('Access-Control-Allow-Headers', 'Content-Type');
  if (req.method === 'OPTIONS') return res.sendStatus(204);
  next();
});

CORS headers allow the browser to send beacons from any origin to this endpoint. The OPTIONS preflight handler returns 204 immediately — no need to process a preflight request any further. In production, replace '*' with your actual domain.

Step 3: JSON Body Parsing

app.use(express.json());

Express does not parse request bodies by default. This middleware reads the raw body, parses it as JSON, and attaches the result to req.body. Without this line, req.body would be undefined.

Step 4: The POST /collect Route

app.post('/collect', (req, res) => {
  const data = req.body;

  // Validate required fields
  if (!data || typeof data.url !== 'string' || !data.url) {
    return res.status(400).json({ error: 'Missing required field: url' });
  }

  const ALLOWED_TYPES = ['pageview', 'event', 'error', 'performance'];
  const type = data.type || 'pageview';
  if (!ALLOWED_TYPES.includes(type)) {
    return res.status(400).json({ error: 'Invalid type' });
  }

Validation happens before anything touches the database. Two checks:

  // Enrich with server-side data
  const serverTimestamp = new Date().toISOString()
    .slice(0, 19).replace('T', ' ');
  const clientIp = req.ip;

The server adds two fields the client cannot reliably provide:

  // Parameterized INSERT
  const sql = `INSERT INTO pageviews
    (url, type, user_agent, viewport_width, viewport_height,
     referrer, client_timestamp, server_timestamp, client_ip,
     session_id, payload)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`;

  const params = [
    data.url,
    type,
    data.userAgent   || null,
    data.viewportW   || null,
    data.viewportH   || null,
    data.referrer    || null,
    data.timestamp   || null,
    serverTimestamp,
    clientIp,
    data.sessionId   || null,
    data.payload ? JSON.stringify(data.payload) : null
  ];

  pool.execute(sql, params, (err) => {
    if (err) {
      console.error('MySQL INSERT error:', err.message);
      return res.sendStatus(500);
    }
    res.sendStatus(204);
  });
});

Key design decisions:

Never use string concatenation for SQL queries. Even if you think the data is safe, always use parameterized queries. A single unparameterized query is a SQL injection vulnerability waiting to happen.

Step 5: Start the Server

app.listen(PORT, () => {
  console.log(`Collection endpoint listening on http://localhost:${PORT}`);
});

Run the server with node collect.js. It will listen on port 3006 and begin accepting POST requests at /collect.

4. Testing

Open test.html in a browser (you can serve it from the same Express server or open it directly). The test page includes a minimal collector script that sends a beacon to http://localhost:3006/collect on page load.

Step 1: Start the Server

node collect.js
# Output: Collection endpoint listening on http://localhost:3006

Step 2: Open the Test Page

Open test.html in your browser. Open the browser developer tools (F12) and check the Network tab. You should see a POST request to /collect with a 204 response.

Step 3: Verify the INSERT

Connect to MySQL and query the pageviews table:

mysql -u root analytics -e "SELECT id, url, type, server_timestamp, client_ip FROM pageviews ORDER BY id DESC LIMIT 5;"

You should see a row for the test page visit with the URL, type pageview, a server timestamp, and a client IP of ::1 (IPv6 localhost) or 127.0.0.1.

Debugging tip: If the POST returns 500, check the terminal where collect.js is running. The error message will tell you what went wrong — usually a missing database, wrong credentials, or a schema mismatch.

5. What We Built

Here is the full data flow from browser to database:

Browser Express (collect.js) MySQL +--------------+ +------------------------+ +-----------+ | test.html | | POST /collect | | analytics | | | POST /collect | | | | | sendBeacon() +------------------>| 1. Parse JSON body | | | | or fetch() | {url, type, ...} | 2. Validate url, type | | | | | | 3. Enrich: timestamp, | | | | | | IP address | | | | | | 4. Parameterized | INSERT| | | | | INSERT INTO --------+------>| pageviews | | |<--- 204 ---------| 5. Return 204 | | | +--------------+ No Content +------------------------+ +-----------+

In this module you built:

In the next module, we will build the same endpoint in PHP to compare how the two server-side languages handle the same problem.