Module 04: Custom Endpoint

In this module, you will build a Node.js/Express collection endpoint that receives, validates, and stores analytics beacons, and update the collector script with a configurable endpoint URL and a cascading delivery strategy.

Demo Files

Run npm init -y && npm install express to install dependencies, then start the server with node endpoint.js.

Why Build Your Own Endpoint?

In Modules 01 and 02, we sent beacons to /collect — but nothing was receiving them. The browser fired the request and the server returned a 404. Now we build the other side: a server that receives, validates, and stores analytics data.

Why build your own instead of using a third-party service?

Browser                              Express Server
┌─────────────┐                      ┌──────────────────┐
│ collector   │                      │ endpoint.js      │
│ -v3.js      │    POST /collect     │                  │
│ sendBeacon()├────────────────────>│ 1. Parse JSON    │
│ or fetch()  │                      │ 2. Validate      │
│             │<─────── 204 ────────│ 3. Add timestamp │
└─────────────┘     No Content       │ 4. Append .jsonl │
                                     └──────────────────┘

Building the Express Endpoint

We will build endpoint.js step by step. By the end, you will have a working analytics server in roughly 60 lines of code.

Step 1: Basic Server Setup

Start with the imports and configuration. We need Express for HTTP handling, fs for writing to disk, and path for safe file path construction:

const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();
const PORT = 3005;
const LOG_FILE = path.join(__dirname, 'analytics.jsonl');

The LOG_FILE constant points to analytics.jsonl in the same directory as the server. This file will store every validated beacon as one JSON object per line.

Step 2: CORS Middleware

CORS (Cross-Origin Resource Sharing) headers are required when the collector and the endpoint are on different origins. For example, if your website runs on example.com but your analytics endpoint is on analytics.example.com, the browser will block the request unless the server explicitly permits it.

// CORS headers — required when collector and endpoint are on different origins
app.use((req, res, next) => {
  res.header('Access-Control-Allow-Origin', '*');
  res.header('Access-Control-Allow-Methods', 'POST, OPTIONS');
  res.header('Access-Control-Allow-Headers', 'Content-Type');
  if (req.method === 'OPTIONS') {
    return res.sendStatus(204);
  }
  next();
});

Without these headers, the browser enforces the Same-Origin Policy and refuses to send the beacon across origins. The OPTIONS preflight handling is necessary because fetch with Content-Type: application/json triggers a CORS preflight request. The server must respond to this preflight with the appropriate headers before the browser will send the actual POST.

Note: In production, replace '*' with your actual domain (e.g., 'https://example.com') to prevent other sites from sending data to your endpoint. A wildcard origin is fine for development but too permissive for production.
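One way to tighten this is an origin allowlist. A minimal sketch, assuming your site is served from the origins listed in ALLOWED_ORIGINS (an example list, not part of endpoint.js as built so far):

```javascript
// Sketch: origin allowlist for the CORS middleware.
// ALLOWED_ORIGINS is an example list; substitute your real site origins.
const ALLOWED_ORIGINS = ['https://example.com', 'https://www.example.com'];

// Returns the origin to echo back in Access-Control-Allow-Origin,
// or null when the request's origin is not on the allowlist.
function corsOriginFor(requestOrigin) {
  return ALLOWED_ORIGINS.includes(requestOrigin) ? requestOrigin : null;
}
```

In the middleware, you would replace the wildcard with something like: const origin = corsOriginFor(req.get('Origin')); if (origin) res.header('Access-Control-Allow-Origin', origin);. Requests from other origins get no CORS header, so the browser refuses to deliver the response.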

Step 3: JSON Body Parsing

Express does not parse request bodies by default. We need the built-in JSON parser middleware:

app.use(express.json());

This middleware reads the request body, parses it as JSON, and attaches the result to req.body. Without it, req.body would be undefined.

Step 4: The Collection Endpoint

This is the core of the server — the POST handler that receives, validates, enriches, and stores each beacon:

app.post('/collect', (req, res) => {
  const payload = req.body;

  // Validate: must have url and type at minimum
  if (!payload || !payload.url || !payload.type) {
    return res.status(400).json({ error: 'Missing required fields: url, type' });
  }

  // Add server-side timestamp (client clocks can be wrong)
  payload.serverTimestamp = new Date().toISOString();

  // Add IP address (Express provides this)
  payload.ip = req.ip;

  // Append to JSON Lines file
  const line = JSON.stringify(payload) + '\n';
  fs.appendFile(LOG_FILE, line, (err) => {
    if (err) {
      console.error('Write error:', err);
      return res.sendStatus(500);
    }
    res.sendStatus(204); // No Content — success, nothing to return
  });
});

Several design decisions here are worth understanding:

  - Validate before writing: beacons missing url or type are rejected with a 400 before they reach disk.
  - Server-side timestamp: client clocks can be skewed or wrong, so the server records its own serverTimestamp.
  - Append, not rewrite: fs.appendFile adds one line without reading or rewriting the existing file, so writes stay cheap as the log grows.
  - 204 No Content on success: the collector does not need a response body, so the server returns the lightest possible success status.

Step 5: Serve Static Files for Testing

For development, we serve the test page and collector script from the same directory as the endpoint:

app.use(express.static(__dirname));

app.listen(PORT, () => {
  console.log(`Analytics endpoint listening on http://localhost:${PORT}`);
  console.log(`Test page: http://localhost:${PORT}/test.html`);
  console.log(`Data file: ${LOG_FILE}`);
});

This makes test.html, collector-v3.js, and any other files in the directory accessible at the server root. In production, your endpoint would be a standalone service and would not serve static files.

JSON Lines Format

The endpoint stores data in JSON Lines format (.jsonl) — one JSON object per line, no wrapping array, no commas between records:

{"url":"https://example.com/","type":"pageview","serverTimestamp":"2026-01-15T08:30:00Z"}
{"url":"https://example.com/about","type":"pageview","serverTimestamp":"2026-01-15T08:30:05Z"}
{"url":"https://example.com/about","type":"pageview","serverTimestamp":"2026-01-15T08:31:12Z"}

Why JSON Lines instead of a JSON array or a CSV file?

| Property | JSON Lines | JSON Array | CSV |
| --- | --- | --- | --- |
| Append-only writes | Yes — just append a line | No — must read, parse, push, rewrite | Yes — append a row |
| Line-by-line parsing | Yes — each line is independent | No — must parse entire file | Yes |
| Nested data | Yes — full JSON | Yes | No — flat only |
| Schema flexibility | High — each line can have different fields | High | Low — fixed columns |
| Grep-friendly | Yes | No | Partially |
| Memory usage | Low — process one line at a time | High — entire array in memory | Low |

JSON Lines is the de facto standard for analytics log storage. Tools like jq, BigQuery, and Elasticsearch all support it natively. You can filter records with grep, count them with wc -l, and process them with simple line-by-line readers.
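To get a feel for why the format is convenient, here is a minimal sketch of reading the log back in Node. The parseJsonl and pageviewCounts helpers are illustrations, not part of endpoint.js:

```javascript
// Sketch: parsing a JSON Lines string back into records.
// Each line is an independent JSON document, so we can split,
// skip blank lines, and parse line by line.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Example analysis: count pageviews per URL.
function pageviewCounts(records) {
  const counts = {};
  for (const r of records) {
    if (r.type === 'pageview') counts[r.url] = (counts[r.url] || 0) + 1;
  }
  return counts;
}

// Reading the real file (assumes analytics.jsonl exists):
// const fs = require('fs');
// const records = parseJsonl(fs.readFileSync('analytics.jsonl', 'utf8'));
```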

Updating the Collector: sendBeacon → fetch Fallback

In Module 01, the collector used sendBeacon with a simple fetch fallback. Now we implement a more robust cascading delivery strategy:

function send(payload) {
  const json = JSON.stringify(payload);
  const blob = new Blob([json], { type: 'application/json' });

  // Strategy 1: sendBeacon (preferred — survives unload)
  if (navigator.sendBeacon) {
    const sent = navigator.sendBeacon(ENDPOINT, blob);
    if (sent) return;
  }

  // Strategy 2: fetch with keepalive (survives unload, has response)
  fetch(ENDPOINT, {
    method: 'POST',
    body: json,
    headers: { 'Content-Type': 'application/json' },
    keepalive: true
  }).catch(() => {
    // Strategy 3: plain fetch (last resort)
    fetch(ENDPOINT, {
      method: 'POST',
      body: json,
      headers: { 'Content-Type': 'application/json' }
    }).catch(() => {});
  });
}

The cascade works in order of reliability during page unload:

  1. sendBeacon — The most reliable during unload. Fire-and-forget: once queued, the browser takes responsibility for delivery even as the page is being torn down. Returns true if the data was queued, false if it could not be (for example, when the payload quota is exhausted).
  2. fetch with keepalive: true — Survives unload like sendBeacon, but gives you access to the response (useful for debugging). Falls back to this if sendBeacon is not available or returns false.
  3. Plain fetch — Last resort. Does not survive unload but works in all modern browsers. Only reached if both previous strategies fail.
Note: fetch with keepalive: true is subject to a browser-enforced quota of roughly 64 KB on the total size of in-flight keepalive request bodies. If your analytics data is large (e.g., full resource timing entries for a page with hundreds of assets), you may hit this limit. In most browsers sendBeacon shares the same quota and returns false when a payload would exceed it, which is precisely the case the fetch fallback in the cascade handles.
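If oversized payloads are a real risk for your data, the collector can check the serialized size before choosing a strategy. A sketch, assuming a conservative 60 KB threshold (QUOTA_BYTES is an example value, not a spec constant; exact limits vary by browser):

```javascript
// Sketch: checking payload size against an assumed ~64 KB beacon quota.
// QUOTA_BYTES is a conservative example threshold.
const QUOTA_BYTES = 60 * 1024;

function fitsBeaconQuota(json) {
  // Blob.size reports the byte length of the UTF-8 encoded payload.
  return new Blob([json]).size <= QUOTA_BYTES;
}
```

In send(), you could skip straight to a plain fetch, or split the payload into smaller batches, whenever fitsBeaconQuota(json) is false, rather than letting sendBeacon fail first.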

Configurable Endpoint URL

The updated collector introduces a configurable ENDPOINT variable at the top of the script:

const ENDPOINT = 'http://localhost:3005/collect';

In production, you would change this to your analytics server's URL:

const ENDPOINT = 'https://analytics.example.com/collect';

This is a simple approach. In Module 08 (Configuration API), we will refactor the collector into a library with a proper init() method that accepts configuration options including the endpoint URL.
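One intermediate step short of a full init() API is letting the page override the endpoint without editing the script. A sketch, assuming a data-endpoint attribute on the script tag (this attribute is a hypothetical convention for illustration, not part of the collector built so far):

```javascript
// Sketch: resolving the endpoint from an optional per-page override.
// Falls back to the development default when no override is given.
function resolveEndpoint(override) {
  return override || 'http://localhost:3005/collect';
}

// In the collector, the override would come from the script tag:
// const ENDPOINT = resolveEndpoint(
//   document.currentScript && document.currentScript.dataset.endpoint
// );
//
// <script src="collector-v3.js"
//         data-endpoint="https://analytics.example.com/collect"></script>
```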

Input Validation

The endpoint validates incoming data before storing it. This is essential because anyone can send POST requests to your endpoint — not just your collector script. Bots, scrapers, and curious developers can all craft and send arbitrary payloads.

At minimum, always validate:

  - Required fields are present (here, url and type)
  - Fields have the expected types (url should be a string, not an object or array)
  - The payload is a reasonable overall size

The validation in endpoint.js is intentionally minimal for this tutorial. In production, you would add stricter type checks, field length limits, and payload size enforcement at the middleware level.
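As one possible shape for that stricter version, here is a hedged sketch. The allowed type values and the length limit are assumptions; adjust them to your actual beacon schema:

```javascript
// Sketch: stricter beacon validation. The limit and event types here
// are example values, not requirements from the tutorial schema.
const MAX_URL_LENGTH = 2048;
const ALLOWED_TYPES = ['pageview', 'event']; // assumed set of event types

// Returns an error message, or null when the payload is acceptable.
function validateBeacon(payload) {
  if (!payload || typeof payload !== 'object' || Array.isArray(payload)) {
    return 'payload must be a JSON object';
  }
  if (typeof payload.url !== 'string' || payload.url.length === 0) {
    return 'url must be a non-empty string';
  }
  if (payload.url.length > MAX_URL_LENGTH) {
    return 'url exceeds maximum length';
  }
  if (!ALLOWED_TYPES.includes(payload.type)) {
    return 'type must be one of: ' + ALLOWED_TYPES.join(', ');
  }
  return null;
}
```

In the handler this would replace the inline check: const err = validateBeacon(req.body); if (err) return res.status(400).json({ error: err });. For payload size enforcement at the middleware level, Express's JSON parser accepts a limit option, e.g. app.use(express.json({ limit: '10kb' })).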

Cross-Reference: See the Node.js Tutorial Module 05 (REST API) for more on Express POST handlers and JSON APIs. The existing production endpoint at collector.cse135.site/endpoint.js follows this same pattern — POST handler, validation, and JSONL storage.

Summary