Module 04: Custom Endpoint

In this module, you will build a Node.js/Express collection endpoint that receives, validates, and stores analytics beacons, and update the collector script with a configurable endpoint URL and a cascading delivery strategy.

Demo Files

Run npm init -y && npm install express to install dependencies, then start the server with node endpoint.js.

Why Build Your Own Endpoint?

In Modules 01 and 02, we sent beacons to /collect — but nothing was receiving them. The browser fired the request and the server returned a 404. Now we build the other side: a server that receives, validates, and stores analytics data.

Why build your own instead of using a third-party service?

Browser                              Express Server
┌─────────────┐                      ┌──────────────────┐
│ collector   │                      │ endpoint.js      │
│ -v3.js      │    POST /collect     │                  │
│ sendBeacon()├────────────────────>│ 1. Parse JSON    │
│ or fetch()  │                      │ 2. Validate      │
│             │<─────── 204 ────────│ 3. Add timestamp │
└─────────────┘     No Content       │ 4. Append .jsonl │
                                     └──────────────────┘

Building the Express Endpoint

We will build endpoint.js step by step. By the end, you will have a working analytics server in roughly 60 lines of code.

Step 1: Basic Server Setup

Start with the imports and configuration. We need Express for HTTP handling, fs for writing to disk, and path for safe file path construction:

const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();
const PORT = 3005;
const LOG_FILE = path.join(__dirname, 'analytics.jsonl');

The LOG_FILE constant points to analytics.jsonl in the same directory as the server. This file will store every validated beacon as one JSON object per line.

Step 2: CORS Middleware

CORS (Cross-Origin Resource Sharing) headers are required when the collector and the endpoint are on different origins. For example, if your website runs on example.com but your analytics endpoint is on analytics.example.com, the browser will block the request unless the server explicitly permits it.

// CORS headers — required when collector and endpoint are on different origins
app.use((req, res, next) => {
  res.header('Access-Control-Allow-Origin', '*');
  res.header('Access-Control-Allow-Methods', 'POST, OPTIONS');
  res.header('Access-Control-Allow-Headers', 'Content-Type');
  if (req.method === 'OPTIONS') {
    return res.sendStatus(204);
  }
  next();
});

Without these headers, the browser enforces the Same-Origin Policy and refuses to send the beacon across origins. The OPTIONS preflight handling is necessary because fetch with Content-Type: application/json triggers a CORS preflight request. The server must respond to this preflight with the appropriate headers before the browser will send the actual POST.

Note: In production, replace '*' with your actual domain (e.g., 'https://example.com') to prevent other sites from sending data to your endpoint. A wildcard origin is fine for development but too permissive for production.
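One way to tighten this is an origin allowlist. A minimal sketch, assuming your site is served from the origins listed in ALLOWED_ORIGINS (an example list, not part of endpoint.js as built so far):

```javascript
// Sketch: origin allowlist for the CORS middleware.
// ALLOWED_ORIGINS is an example list; substitute your real site origins.
const ALLOWED_ORIGINS = ['https://example.com', 'https://www.example.com'];

// Returns the origin to echo back in Access-Control-Allow-Origin,
// or null when the request's origin is not on the allowlist.
function corsOriginFor(requestOrigin) {
  return ALLOWED_ORIGINS.includes(requestOrigin) ? requestOrigin : null;
}
```

In the middleware, you would replace the wildcard with something like: const origin = corsOriginFor(req.get('Origin')); if (origin) res.header('Access-Control-Allow-Origin', origin);. Requests from other origins get no CORS header, so the browser refuses to deliver the response.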

Step 3: JSON Body Parsing

Express does not parse request bodies by default. We need the built-in JSON parser middleware:

app.use(express.json());

This middleware reads the request body, parses it as JSON, and attaches the result to req.body. Without it, req.body would be undefined.

Step 4: The Collection Endpoint

This is the core of the server — the POST handler that receives, validates, enriches, and stores each beacon:

app.post('/collect', (req, res) => {
  const payload = req.body;

  // Validate: must have url and type at minimum
  if (!payload || !payload.url || !payload.type) {
    return res.status(400).json({ error: 'Missing required fields: url, type' });
  }

  // Add server-side timestamp (client clocks can be wrong)
  payload.serverTimestamp = new Date().toISOString();

  // Add IP address (Express provides this)
  payload.ip = req.ip;

  // Append to JSON Lines file
  const line = JSON.stringify(payload) + '\n';
  fs.appendFile(LOG_FILE, line, (err) => {
    if (err) {
      console.error('Write error:', err);
      return res.sendStatus(500);
    }
    res.sendStatus(204); // No Content — success, nothing to return
  });
});

Several design decisions here are worth understanding:

  - Validate before writing: beacons missing url or type are rejected with a 400 before they reach disk.
  - Server-side timestamp: client clocks can be skewed or wrong, so the server records its own serverTimestamp.
  - Append, not rewrite: fs.appendFile adds one line without reading or rewriting the existing file, so writes stay cheap as the log grows.
  - 204 No Content on success: the collector does not need a response body, so the server returns the lightest possible success status.

Step 5: Serve Static Files for Testing

For development, we serve the test page and collector script from the same directory as the endpoint:

app.use(express.static(__dirname));

app.listen(PORT, () => {
  console.log(`Analytics endpoint listening on http://localhost:${PORT}`);
  console.log(`Test page: http://localhost:${PORT}/test.html`);
  console.log(`Data file: ${LOG_FILE}`);
});

This makes test.html, collector-v3.js, and any other files in the directory accessible at the server root. In production, your endpoint would be a standalone service and would not serve static files.

JSON Lines Format

The endpoint stores data in JSON Lines format (.jsonl) — one JSON object per line, no wrapping array, no commas between records:

{"url":"https://example.com/","type":"pageview","serverTimestamp":"2026-01-15T08:30:00Z"}
{"url":"https://example.com/about","type":"pageview","serverTimestamp":"2026-01-15T08:30:05Z"}
{"url":"https://example.com/about","type":"pageview","serverTimestamp":"2026-01-15T08:31:12Z"}

Why JSON Lines instead of a JSON array or a CSV file?

| Property | JSON Lines | JSON Array | CSV |
| --- | --- | --- | --- |
| Append-only writes | Yes — just append a line | No — must read, parse, push, rewrite | Yes — append a row |
| Line-by-line parsing | Yes — each line is independent | No — must parse entire file | Yes |
| Nested data | Yes — full JSON | Yes | No — flat only |
| Schema flexibility | High — each line can have different fields | High | Low — fixed columns |
| Grep-friendly | Yes | No | Partially |
| Memory usage | Low — process one line at a time | High — entire array in memory | Low |

JSON Lines is the de facto standard for analytics log storage. Tools like jq, BigQuery, and Elasticsearch all support it natively. You can filter records with grep, count them with wc -l, and process them with simple line-by-line readers.
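To get a feel for why the format is convenient, here is a minimal sketch of reading the log back in Node. The parseJsonl and pageviewCounts helpers are illustrations, not part of endpoint.js:

```javascript
// Sketch: parsing a JSON Lines string back into records.
// Each line is an independent JSON document, so we can split,
// skip blank lines, and parse line by line.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Example analysis: count pageviews per URL.
function pageviewCounts(records) {
  const counts = {};
  for (const r of records) {
    if (r.type === 'pageview') counts[r.url] = (counts[r.url] || 0) + 1;
  }
  return counts;
}

// Reading the real file (assumes analytics.jsonl exists):
// const fs = require('fs');
// const records = parseJsonl(fs.readFileSync('analytics.jsonl', 'utf8'));
```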

Updating the Collector: sendBeacon → fetch Fallback

In Module 01, the collector used sendBeacon with a simple fetch fallback. Now we implement a more robust cascading delivery strategy:

function send(payload) {
  const json = JSON.stringify(payload);
  const blob = new Blob([json], { type: 'application/json' });

  // Strategy 1: sendBeacon (preferred — survives unload)
  if (navigator.sendBeacon) {
    const sent = navigator.sendBeacon(ENDPOINT, blob);
    if (sent) return;
  }

  // Strategy 2: fetch with keepalive (survives unload, has response)
  fetch(ENDPOINT, {
    method: 'POST',
    body: json,
    headers: { 'Content-Type': 'application/json' },
    keepalive: true
  }).catch(() => {
    // Strategy 3: plain fetch (last resort)
    fetch(ENDPOINT, {
      method: 'POST',
      body: json,
      headers: { 'Content-Type': 'application/json' }
    }).catch(() => {});
  });
}

The cascade works in order of reliability during page unload:

  1. sendBeacon — The most reliable during unload. Fire-and-forget: once queued, the browser takes responsibility for delivery even as the page is being torn down. Returns true if the data was queued, false if it could not be (for example, when the payload quota is exhausted).
  2. fetch with keepalive: true — Survives unload like sendBeacon, but gives you access to the response (useful for debugging). Falls back to this if sendBeacon is not available or returns false.
  3. Plain fetch — Last resort. Does not survive unload but works in all modern browsers. Only reached if both previous strategies fail.
Note: fetch with keepalive: true is subject to a browser-enforced quota of roughly 64 KB on the total size of in-flight keepalive request bodies. If your analytics data is large (e.g., full resource timing entries for a page with hundreds of assets), you may hit this limit. In most browsers sendBeacon shares the same quota and returns false when a payload would exceed it, which is precisely the case the fetch fallback in the cascade handles.
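If oversized payloads are a real risk for your data, the collector can check the serialized size before choosing a strategy. A sketch, assuming a conservative 60 KB threshold (QUOTA_BYTES is an example value, not a spec constant; exact limits vary by browser):

```javascript
// Sketch: checking payload size against an assumed ~64 KB beacon quota.
// QUOTA_BYTES is a conservative example threshold.
const QUOTA_BYTES = 60 * 1024;

function fitsBeaconQuota(json) {
  // Blob.size reports the byte length of the UTF-8 encoded payload.
  return new Blob([json]).size <= QUOTA_BYTES;
}
```

In send(), you could skip straight to a plain fetch, or split the payload into smaller batches, whenever fitsBeaconQuota(json) is false, rather than letting sendBeacon fail first.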

Configurable Endpoint URL

The updated collector introduces a configurable ENDPOINT variable at the top of the script:

const ENDPOINT = 'http://localhost:3005/collect';

In production, you would change this to your analytics server's URL:

const ENDPOINT = 'https://analytics.example.com/collect';

This is a simple approach. In Module 08 (Configuration API), we will refactor the collector into a library with a proper init() method that accepts configuration options including the endpoint URL.
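One intermediate step short of a full init() API is letting the page override the endpoint without editing the script. A sketch, assuming a data-endpoint attribute on the script tag (this attribute is a hypothetical convention for illustration, not part of the collector built so far):

```javascript
// Sketch: resolving the endpoint from an optional per-page override.
// Falls back to the development default when no override is given.
function resolveEndpoint(override) {
  return override || 'http://localhost:3005/collect';
}

// In the collector, the override would come from the script tag:
// const ENDPOINT = resolveEndpoint(
//   document.currentScript && document.currentScript.dataset.endpoint
// );
//
// <script src="collector-v3.js"
//         data-endpoint="https://analytics.example.com/collect"></script>
```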

Input Validation

The endpoint validates incoming data before storing it. This is essential because anyone can send POST requests to your endpoint — not just your collector script. Bots, scrapers, and curious developers can all craft and send arbitrary payloads.

At minimum, always validate:

  - Required fields are present (here, url and type)
  - Fields have the expected types (url should be a string, not an object or array)
  - The payload is a reasonable overall size

The validation in endpoint.js is intentionally minimal for this tutorial. In production, you would add stricter type checks, field length limits, and payload size enforcement at the middleware level.
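As one possible shape for that stricter version, here is a hedged sketch. The allowed type values and the length limit are assumptions; adjust them to your actual beacon schema:

```javascript
// Sketch: stricter beacon validation. The limit and event types here
// are example values, not requirements from the tutorial schema.
const MAX_URL_LENGTH = 2048;
const ALLOWED_TYPES = ['pageview', 'event']; // assumed set of event types

// Returns an error message, or null when the payload is acceptable.
function validateBeacon(payload) {
  if (!payload || typeof payload !== 'object' || Array.isArray(payload)) {
    return 'payload must be a JSON object';
  }
  if (typeof payload.url !== 'string' || payload.url.length === 0) {
    return 'url must be a non-empty string';
  }
  if (payload.url.length > MAX_URL_LENGTH) {
    return 'url exceeds maximum length';
  }
  if (!ALLOWED_TYPES.includes(payload.type)) {
    return 'type must be one of: ' + ALLOWED_TYPES.join(', ');
  }
  return null;
}
```

In the handler this would replace the inline check: const err = validateBeacon(req.body); if (err) return res.status(400).json({ error: err });. For payload size enforcement at the middleware level, Express's JSON parser accepts a limit option, e.g. app.use(express.json({ limit: '10kb' })).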

Cross-Reference: See the Node.js Tutorial Module 05 (REST API) for more on Express POST handlers and JSON APIs. The existing production endpoint at collector.cse135.site/endpoint.js follows this same pattern — POST handler, validation, and JSONL storage.

Summary