Module 10: Production Readiness

Our collector works — it tracks pageviews, technographics, performance timing, Web Vitals, and errors, and it supports plugins. But it is not production-ready. Production means the script loads without blocking the page, handles user consent, filters bots, retries failed sends, measures its own overhead, and ships as a minimal embed snippet. This module closes every gap between a demo collector and one you could deploy to a real site.

Demo Files

Run: Open loader.html to see the command queue pattern in action.

The Gap Between Demo and Production

Everything we have built so far works correctly in a controlled demo environment: the script is loaded synchronously, the user has implicitly consented by opening a test page, no bots are visiting, and the network is reliable. None of those assumptions hold in production.

A production analytics collector must solve six additional problems:

  1. Non-blocking loading — the script must not delay page rendering
  2. Consent — the collector must respect privacy regulations and user preferences
  3. Bot filtering — automated visitors must be detected and excluded
  4. Delivery reliability — failed beacons must be retried, not silently lost
  5. Self-measurement — the collector must track its own performance overhead
  6. Minimal footprint — the embed snippet should be a few lines, and the script should be minified

We will address each of these in order, building up to the final collector-v9.js.

Async Script Loading

Browsers offer three strategies for loading external scripts, each with different blocking behavior:

<!-- Blocking (bad for analytics scripts) -->
<script src="collector.js"></script>

<!-- Async (downloads in parallel, executes when ready) -->
<script async src="collector.js"></script>

<!-- Defer (downloads in parallel, executes after DOM parsed) -->
<script defer src="collector.js"></script>

A blocking <script> tag halts HTML parsing until the script downloads and executes. For analytics, this is unacceptable — a slow CDN or DNS lookup could add hundreds of milliseconds to the page load, directly harming the user experience the collector is supposed to measure.

The async attribute tells the browser to download the script in parallel with HTML parsing and execute it as soon as the download finishes. The defer attribute also downloads in parallel, but delays execution until after the DOM is fully parsed.

For analytics, async is preferred: we want the collector running as early as possible to catch errors, start vitals observers, and record the initial pageview. We do not need to wait for the DOM to be complete — the collector attaches global listeners, not DOM elements.

But async loading creates a new problem: if the page calls collector.init() in an inline script, and the collector file has not finished downloading yet, the call fails with a ReferenceError. This is where the command queue pattern comes in.
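
The failure is easy to reproduce. In this sketch (assuming, as above, that the collector exposes a global named collector), the inline call races the async download:

<script async src="collector.js"></script>
<script>
  // Throws whenever collector.js has not finished loading yet:
  collector.init({ endpoint: '/collect' });
  // Uncaught ReferenceError: collector is not defined
</script>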

The Command Queue Pattern

The solution is elegant: instead of calling methods on the collector directly, the page pushes commands into a plain JavaScript array. When the collector script eventually loads, it processes the queue and replaces the array with a live proxy that executes commands immediately.

The Embed Snippet

This is what site owners paste into their <head>:

<!-- Analytics Collector -- Place in <head> -->
<script>
  window._cq = window._cq || [];
  _cq.push(['init', { endpoint: '/collect' }]);
  _cq.push(['set', 'pageType', 'article']);
  _cq.push(['track', 'pageview']);
</script>
<script async src="collector.js"></script>

The snippet initializes window._cq as an array (if it does not already exist), then pushes configuration commands. Each command is an array where the first element is the method name and the remaining elements are arguments. The actual collector script loads asynchronously — it might arrive 50ms later, or 500ms later, or even seconds later on a slow connection.

Processing the Queue

Inside the collector IIFE, after the public API is defined, the queue is drained:

// Inside the IIFE, after publicAPI is defined...
function processQueue() {
  const queue = window._cq || [];
  for (const args of queue) {
    const method = args[0];
    const params = args.slice(1);
    if (typeof publicAPI[method] === 'function') {
      publicAPI[method](...params);
    }
  }
  // Replace the array with a live object
  // that calls methods immediately
  window._cq = {
    push: (args) => {
      const method = args[0];
      const params = args.slice(1);
      if (typeof publicAPI[method] === 'function') {
        publicAPI[method](...params);
      }
    }
  };
}

The key insight is the replacement step: after draining the queue, window._cq is replaced with an object that has a push() method. Any _cq.push() call made after the script has loaded will execute the command immediately instead of queuing it. Because the object has a push method, the calling code does not need to know whether the script has loaded yet — _cq.push() always works.
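
The effect is that _cq.push() has the same contract for the entire lifetime of the page. For example (hypothetical event name):

// Early in the page: command is queued in the plain array
_cq.push(['track', 'newsletter_signup']);

// After collector.js loads: _cq is the live object,
// so the identical call executes immediately
_cq.push(['track', 'newsletter_signup']);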

Industry Pattern: This is exactly how Google Analytics (gtag.js) works: the gtag() function pushes commands to a queue (dataLayer), and the actual script processes them when it loads. Any calls made after script load execute immediately. Segment, Amplitude, Mixpanel, and virtually every modern analytics library use this same approach.

Consent Management

Privacy regulations (GDPR, CCPA, ePrivacy Directive) require that analytics collection respect user consent. The collector must check consent before doing anything — before attaching observers, before sending beacons, before writing to storage.

function hasConsent() {
  // Check Global Privacy Control signal
  if (navigator.globalPrivacyControl) {
    return false;
  }

  // Check for consent cookie
  const cookies = document.cookie.split(';');
  for (const c of cookies) {
    const cookie = c.trim();
    if (cookie.indexOf('analytics_consent=') === 0) {
      return cookie.split('=')[1] === 'true';
    }
  }

  // No consent signal found -- default depends on jurisdiction
  // For GDPR: default to false (opt-in required)
  // For non-EU: could default to true (opt-out model)
  return false;
}

The function checks two signals in order. First, the Global Privacy Control (GPC) header, exposed as navigator.globalPrivacyControl. GPC is a browser-level "do not sell/share" signal that is legally binding under CCPA and recognized by GDPR supervisory authorities. If GPC is set, the collector stops immediately.

Second, it checks for an analytics_consent cookie, which is set by a consent banner (see below). The value is either true or false. If no consent cookie is found, the default depends on the jurisdiction — GDPR requires opt-in (default false), while some other frameworks allow opt-out (default true).

The Consent Module

The separate consent.js file provides a ConsentManager object with methods for checking, granting, and revoking consent, plus a simple banner UI:

// consent.js -- Consent management for the collector
const ConsentManager = {
  check: function() { /* ... */ },
  grant: function() { /* ... */ },
  revoke: function() { /* ... */ },
  showBanner: function(options) { /* ... */ }
};
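
The method bodies are elided above. As a rough sketch (assuming the analytics_consent cookie that hasConsent() reads; the real consent.js may differ), grant() and revoke() could be as simple as:

// Persist the user's choice for one year
function grant() {
  document.cookie =
    'analytics_consent=true; path=/; max-age=31536000; SameSite=Lax';
}

function revoke() {
  document.cookie =
    'analytics_consent=false; path=/; max-age=31536000; SameSite=Lax';
}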

In the collector's init() method, consent is checked before starting collection. If consent is not granted, the collector logs a message and does nothing — no observers, no beacons, no storage.

Cross-Reference: See analytics-overview.html Section 9 (Privacy & Consent) for GDPR, CCPA, and ePrivacy requirements, and why consent must be checked before any data collection begins.

Bot Detection

Bots, crawlers, and automated testing tools generate analytics data that looks like real user traffic but is not. Without filtering, bot traffic inflates pageview counts, skews performance metrics (bots tend to be very fast), and pollutes behavioral data. Client-side heuristics can catch the most common cases:

function isBot() {
  // Check WebDriver flag (Puppeteer, Selenium, Playwright)
  if (navigator.webdriver) return true;

  // Check for headless indicators
  const ua = navigator.userAgent;
  if (/HeadlessChrome|PhantomJS|Lighthouse/i.test(ua)) return true;

  // Check for missing Chrome object in Chrome UA
  if (/Chrome/.test(ua) && !window.chrome) return true;

  // Check for automation properties
  if (window._phantom || window.__nightmare || window.callPhantom) return true;

  return false;
}

The checks work as follows:

  1. navigator.webdriver: the WebDriver specification requires automation tools (Puppeteer, Selenium, Playwright) to set this flag to true
  2. User-agent strings: headless browsers and auditing tools commonly identify themselves, e.g. HeadlessChrome, PhantomJS, or Lighthouse
  3. Missing window.chrome: real Chrome exposes a window.chrome object; a Chrome user agent without it suggests a spoofed UA or headless automation
  4. Automation globals: older tools such as PhantomJS and Nightmare inject detectable properties like window._phantom and window.__nightmare

Warning: Bot detection is an arms race. These heuristics catch basic bots and testing tools, but sophisticated bots can spoof all of these signals. For production, combine client-side heuristics with server-side analysis: abnormal request patterns, known bot IP ranges (via the IAB/ABC International Spiders & Bots List), and request frequency analysis. See analytics-overview.html Section 15 (Bot Traffic) for a comprehensive treatment.

Beacon Retry Queue

Network requests fail. The user might be on a flaky mobile connection, a corporate firewall might block the endpoint, or the analytics server might be temporarily down. Without retry logic, those beacons are silently lost.

The strategy: if sendBeacon returns false (queue full) and the fetch fallback also fails — whether from a network error or an HTTP error response — queue the payload in sessionStorage. On the next page load (or the next successful send), drain the retry queue.

function send(payload) {
  const json = JSON.stringify(payload);

  if (config.debug) {
    console.log('[Collector] Debug:', payload);
    return;
  }

  let sent = false;
  if (navigator.sendBeacon) {
    sent = navigator.sendBeacon(
      config.endpoint,
      new Blob([json], { type: 'application/json' })
    );
  }

  if (!sent) {
    fetch(config.endpoint, {
      method: 'POST',
      body: json,
      headers: { 'Content-Type': 'application/json' },
      keepalive: true
    }).then((res) => {
      // fetch() resolves on HTTP errors, so check the status too
      if (!res.ok) queueForRetry(payload);
    }).catch(() => {
      // Network failure -- queue for retry
      queueForRetry(payload);
    });
  }
}

function queueForRetry(payload) {
  const queue = JSON.parse(
    sessionStorage.getItem('_collector_retry') || '[]'
  );
  if (queue.length >= 50) queue.shift(); // Cap the queue -- drop the oldest
  queue.push(payload);
  sessionStorage.setItem(
    '_collector_retry', JSON.stringify(queue)
  );
}

function processRetryQueue() {
  const queue = JSON.parse(
    sessionStorage.getItem('_collector_retry') || '[]'
  );
  if (!queue.length) return;
  sessionStorage.removeItem('_collector_retry');
  queue.forEach((payload) => { send(payload); });
}

The retry queue is capped at 50 entries to prevent unbounded storage growth. If the endpoint is truly unreachable for an extended period, the oldest entries are simply dropped. The queue uses sessionStorage rather than localStorage because retry data is inherently session-scoped — retrying a pageview beacon from yesterday would be misleading.
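
One way to wire up the drain step (a sketch; where exactly it runs is a design choice):

// During init(), after the gates described later have passed:
processRetryQueue();

// Optionally, also retry whenever connectivity returns:
window.addEventListener('online', processRetryQueue);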

Accurate Time-on-Page

Naive time-on-page calculations (difference between page load and unload) overcount because they include time the tab is in the background. A user who opens your page, switches to another tab for 20 minutes, then comes back has not spent 20 minutes engaging with your content.

The solution is to track only the time the page is actually visible, using the visibilitychange event:

let pageShowTime = Date.now();
let totalVisibleTime = 0;

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    totalVisibleTime += Date.now() - pageShowTime;
    // Send time-on-page with exit beacon
    send({
      type: 'page_exit',
      timeOnPage: totalVisibleTime,
      url: window.location.href,
      timestamp: new Date().toISOString()
    });
  } else {
    pageShowTime = Date.now();
  }
});

When the page becomes hidden (tab switch, minimize, navigation), the elapsed visible time is accumulated and an exit beacon reports the running total. When the page becomes visible again, the timer resets. Note that a user who hides and reshows the tab several times will generate several page_exit beacons; since each carries the cumulative visible time, the server should keep only the latest one per pageview. The result is an accurate measure of actual engagement rather than raw wall-clock time.

Self-Measurement

An analytics collector should not degrade the experience it is measuring. The performance.mark() and performance.measure() APIs let the collector measure its own overhead:

function send(payload) {
  performance.mark('collector_send_start');
  // ... send logic ...
  performance.mark('collector_send_end');
  performance.measure(
    'collector_send',
    'collector_send_start',
    'collector_send_end'
  );
}

These marks and measures appear in the browser's Performance tab, making it easy to verify that the collector adds negligible overhead. In a well-built collector, the send() operation should take less than 1ms — the actual network transfer happens asynchronously via sendBeacon.

You can also measure initialization time:

performance.mark('collector_init_start');
// ... initialization logic ...
performance.mark('collector_init_end');
performance.measure(
  'collector_init',
  'collector_init_start',
  'collector_init_end'
);
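
Beyond the DevTools Performance tab, the measures can also be read back programmatically, for example to include the collector's own overhead in a beacon. A sketch:

// Log the most recent duration recorded for each custom measure
for (const name of ['collector_init', 'collector_send']) {
  const entries = performance.getEntriesByName(name, 'measure');
  if (entries.length) {
    const last = entries[entries.length - 1];
    console.log(name + ': ' + last.duration.toFixed(2) + 'ms');
  }
}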

The Final Embed Snippet

With the command queue in place, the recommended embed pattern for site owners is just a few lines of code:

<!-- Analytics Collector -- Place in <head> -->
<script>
  window._cq = window._cq || [];
  _cq.push(['init', {
    endpoint: 'https://analytics.example.com/collect',
    enableVitals: true,
    enableErrors: true,
    sampleRate: 1.0
  }]);
</script>
<script async src="https://analytics.example.com/collector-v9.min.js"></script>

This is the pattern used by every major analytics service. The inline script is tiny (under 200 bytes) and executes instantly. The actual collector loads asynchronously and processes the queued init command when ready. Site owners never need to touch the collector code — all configuration happens through the init options.

Init Option      Type      Default     Description
endpoint         string    (required)  URL of the analytics collection endpoint
enableVitals     boolean   true        Enable Web Vitals observers (LCP, CLS, INP)
enableErrors     boolean   true        Enable JS error and resource failure tracking
sampleRate       number    1.0         Fraction of sessions to track (0.0 to 1.0)
debug            boolean   false       Log payloads to console instead of sending
respectConsent   boolean   true        Check consent before collecting
detectBots       boolean   true        Filter out detected bots

Minification

The development version of collector-v9.js is well-commented and readable — exactly what you want during development and debugging. The production version, collector-v9.min.js, strips all comments, removes unnecessary whitespace, and shortens internal variable names. The result is the same functionality in a significantly smaller file.

Version               Size     Purpose
collector-v9.js       ~14 KB   Development — readable, commented, debuggable
collector-v9.min.js   ~6 KB    Production — minified, smaller transfer size

In practice, you would use a tool like Terser or esbuild for minification. These tools parse the JavaScript AST, rename local variables to single characters, remove dead code, and compress the output. With gzip compression on the server, the transfer size drops even further — typically 60-70% smaller than the raw minified file.
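
As a concrete sketch, a minimal build script using esbuild's JavaScript API might look like this (file names from this module; the exact configuration is up to you):

// build.js -- run with: node build.js  (after: npm install esbuild)
const esbuild = require('esbuild');

esbuild.build({
  entryPoints: ['collector-v9.js'],
  minify: true,                   // rename locals, strip whitespace and comments
  outfile: 'collector-v9.min.js'
}).catch(() => process.exit(1));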

The Full Architecture

Here is the complete flow from page load to page exit, showing every gate and subsystem:

Page Load
  |
  |--> Embed snippet pushes to _cq queue
  |
  |--> collector-v9.min.js loads (async)
  |      |
  |      |--> Check consent ----> No? Stop.
  |      |--> Check bot --------> Bot? Stop.
  |      |--> Check sampling ---> Not sampled? Stop.
  |      |
  |      |--> Process _cq queue
  |      |--> Start observers (vitals, errors)
  |      |--> Process retry queue
  |      |
  |      '--> On load: send pageview + timing
  |
  '--> On page hide:
         |--> Send final vitals
         |--> Send time-on-page
         '--> Extensions flush final data

The three gates — consent, bot detection, and sampling — are checked in that order during init(). If any gate fails, the collector stops completely. No observers are attached, no beacons are sent, no storage is written. This ensures zero overhead for visitors who should not be tracked.
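
As a sketch, the top of init() might gate collection like this (assuming config has been merged from the defaults and the init options listed above):

// Inside init(), before any observers are attached or beacons sent
if (config.respectConsent && !hasConsent()) return; // Gate 1: consent
if (config.detectBots && isBot()) return;           // Gate 2: bot filtering
if (Math.random() >= config.sampleRate) return;     // Gate 3: sampling

// All gates passed: attach observers, drain the retry queue,
// and send the initial pageview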

Cross-Reference: This module brings together concepts from across the analytics overview: Section 9 (Privacy & Consent) covers the legal requirements for consent, Section 15 (Bot Traffic) covers bot detection strategies and the IAB bot list, and Section 16 (Data Quality) covers sampling, deduplication, and data validation. The command queue pattern is used by every major analytics tool — Google Analytics, Segment, Amplitude, and Mixpanel all use this approach.

Summary