Our collector works — it tracks pageviews, technographics, performance timing, Web Vitals, errors, and supports plugins. But it is not production-ready. Production means the script loads without blocking the page, handles user consent, filters bots, retries failed sends, measures its own overhead, and ships as a minimal embed snippet. This module closes every gap between a demo collector and one you could deploy to a real site.
Run: Open loader.html to see the command queue pattern in action.
Everything we have built so far works correctly in a controlled demo environment: the script is loaded synchronously, the user has implicitly consented by opening a test page, no bots are visiting, and the network is reliable. None of those assumptions hold in production.
A production analytics collector must solve six additional problems:

1. Non-blocking loading: the script must load asynchronously, with a command queue so early calls are not lost.
2. Consent: collection must check user consent and privacy signals before doing anything.
3. Bot filtering: traffic from crawlers and automated tools must be excluded.
4. Retry: failed sends must be queued and retried rather than silently dropped.
5. Self-measurement: the collector must measure its own overhead instead of degrading the experience it reports on.
6. Deployment: the collector must ship as a minimal embed snippet backed by a minified build.
We will address each of these in order, building up to the final collector-v9.js.
Browsers offer three strategies for loading external scripts, each with different blocking behavior:
<!-- Blocking (bad for analytics scripts) -->
<script src="collector.js"></script>
<!-- Async (downloads in parallel, executes when ready) -->
<script async src="collector.js"></script>
<!-- Defer (downloads in parallel, executes after DOM parsed) -->
<script defer src="collector.js"></script>
A blocking <script> tag halts HTML parsing until the script downloads and executes. For analytics, this is unacceptable — a slow CDN or DNS lookup could add hundreds of milliseconds to the page load, directly harming the user experience the collector is supposed to measure.
The async attribute tells the browser to download the script in parallel with HTML parsing and execute it as soon as the download finishes. The defer attribute also downloads in parallel, but delays execution until after the DOM is fully parsed.
For analytics, async is preferred: we want the collector running as early as possible to catch errors, start vitals observers, and record the initial pageview. We do not need to wait for the DOM to be complete — the collector attaches global listeners, not DOM elements.
But async loading creates a new problem: if the page calls collector.init() in an inline script, and the collector file has not finished downloading yet, the call fails with a ReferenceError. This is where the command queue pattern comes in.
The solution is elegant: instead of calling methods on the collector directly, the page pushes commands into a plain JavaScript array. When the collector script eventually loads, it processes the queue and replaces the array with a live proxy that executes commands immediately.
This is what site owners paste into their <head>:
<!-- Analytics Collector -- Place in <head> -->
<script>
window._cq = window._cq || [];
_cq.push(['init', { endpoint: '/collect' }]);
_cq.push(['set', 'pageType', 'article']);
_cq.push(['track', 'pageview']);
</script>
<script async src="collector.js"></script>
The snippet initializes window._cq as an array (if it does not already exist), then pushes configuration commands. Each command is an array where the first element is the method name and the remaining elements are arguments. The actual collector script loads asynchronously — it might arrive 50ms later, or 500ms later, or even seconds later on a slow connection.
Inside the collector IIFE, after the public API is defined, the queue is drained:
// Inside the IIFE, after publicAPI is defined...
function processQueue() {
  const queue = window._cq || [];
  for (const args of queue) {
    const method = args[0];
    const params = args.slice(1);
    if (typeof publicAPI[method] === 'function') {
      publicAPI[method](...params);
    }
  }

  // Replace the array with a live object
  // that calls methods immediately
  window._cq = {
    push: (args) => {
      const method = args[0];
      const params = args.slice(1);
      if (typeof publicAPI[method] === 'function') {
        publicAPI[method](...params);
      }
    }
  };
}
The key insight is the replacement step: after draining the queue, window._cq is replaced with an object that has a push() method. Any _cq.push() call made after the script has loaded will execute the command immediately instead of queuing it. Because the object has a push method, the calling code does not need to know whether the script has loaded yet — _cq.push() always works.
This is exactly how Google Analytics works: its gtag() function pushes commands onto a queue (the dataLayer array), and the actual script processes them when it loads. Any calls made after script load execute immediately. Segment, Amplitude, Mixpanel, and virtually every modern analytics library use this same approach.
Privacy regulations (GDPR, CCPA, ePrivacy Directive) require that analytics collection respect user consent. The collector must check consent before doing anything — before attaching observers, before sending beacons, before writing to storage.
function hasConsent() {
  // Check Global Privacy Control signal
  if (navigator.globalPrivacyControl) {
    return false;
  }

  // Check for consent cookie
  const cookies = document.cookie.split(';');
  for (const c of cookies) {
    const cookie = c.trim();
    if (cookie.indexOf('analytics_consent=') === 0) {
      return cookie.split('=')[1] === 'true';
    }
  }

  // No consent signal found -- default depends on jurisdiction
  // For GDPR: default to false (opt-in required)
  // For non-EU: could default to true (opt-out model)
  return false;
}
The function checks two signals in order. First, the Global Privacy Control (GPC) header, exposed as navigator.globalPrivacyControl. GPC is a browser-level "do not sell/share" signal that is legally binding under CCPA and recognized by GDPR supervisory authorities. If GPC is set, the collector stops immediately.
Second, it checks for an analytics_consent cookie, which is set by a consent banner (see below). The value is either true or false. If no consent cookie is found, the default depends on the jurisdiction — GDPR requires opt-in (default false), while some other frameworks allow opt-out (default true).
The separate consent.js file provides a ConsentManager object with methods for checking, granting, and revoking consent, plus a simple banner UI:
// consent.js -- Consent management for the collector
const ConsentManager = {
check: function() { /* ... */ },
grant: function() { /* ... */ },
revoke: function() { /* ... */ },
showBanner: function(options) { /* ... */ }
};
- check() — inspects GPC and the consent cookie, returns true or false
- grant() — sets the analytics_consent=true cookie with a 1-year expiry
- revoke() — sets analytics_consent=false and clears sessionStorage
- showBanner() — creates a DOM-based consent banner with Accept/Decline buttons

In the collector's init() method, consent is checked before starting collection. If consent is not granted, the collector logs a message and does nothing — no observers, no beacons, no storage.
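The method bodies are elided above. Here is a minimal sketch of how they could be implemented, using the analytics_consent cookie name and one-year expiry described in the list; the banner markup and wording are illustrative assumptions, not the actual consent.js.

// Sketch of consent.js -- cookie name and expiry come from the description above;
// banner structure and button labels are illustrative assumptions.
const ConsentManager = {
  check: function() {
    if (navigator.globalPrivacyControl) return false;
    const match = document.cookie
      .split(';')
      .map((c) => c.trim())
      .find((c) => c.indexOf('analytics_consent=') === 0);
    return match ? match.split('=')[1] === 'true' : false;
  },
  grant: function() {
    const oneYear = 60 * 60 * 24 * 365; // seconds
    document.cookie =
      'analytics_consent=true; max-age=' + oneYear + '; path=/; SameSite=Lax';
  },
  revoke: function() {
    document.cookie =
      'analytics_consent=false; max-age=' + (60 * 60 * 24 * 365) +
      '; path=/; SameSite=Lax';
    sessionStorage.clear();
  },
  showBanner: function(options) {
    const banner = document.createElement('div');
    banner.textContent = (options && options.message) ||
      'We use analytics to improve this site. ';
    const accept = document.createElement('button');
    accept.textContent = 'Accept';
    accept.onclick = () => { ConsentManager.grant(); banner.remove(); };
    const decline = document.createElement('button');
    decline.textContent = 'Decline';
    decline.onclick = () => { ConsentManager.revoke(); banner.remove(); };
    banner.append(accept, decline);
    document.body.appendChild(banner);
  }
};

After grant(), the page would typically re-run the collector's init() (or simply reload) so collection can start for the rest of the session.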
Bots, crawlers, and automated testing tools generate analytics data that looks like real user traffic but is not. Without filtering, bot traffic inflates pageview counts, skews performance metrics (bots tend to be very fast), and pollutes behavioral data. Client-side heuristics can catch the most common cases:
function isBot() {
  // Check WebDriver flag (Puppeteer, Selenium, Playwright)
  if (navigator.webdriver) return true;

  // Check for headless indicators
  const ua = navigator.userAgent;
  if (/HeadlessChrome|PhantomJS|Lighthouse/i.test(ua)) return true;

  // Check for missing Chrome object in Chrome UA
  if (/Chrome/.test(ua) && !window.chrome) return true;

  // Check for automation properties
  if (window._phantom || window.__nightmare || window.callPhantom) return true;

  return false;
}
The checks work as follows:
- navigator.webdriver — set to true by Puppeteer, Selenium, and Playwright when controlling a browser
- HeadlessChrome, PhantomJS, Lighthouse — headless tools that identify themselves in the user-agent string
- Chrome UA without window.chrome — real Chrome exposes window.chrome; headless Chrome (older versions) and spoofed UAs do not
- Automation globals — PhantomJS sets window._phantom and window.callPhantom; Nightmare.js sets window.__nightmare

Network requests fail. The user might be on a flaky mobile connection, a corporate firewall might block the endpoint, or the analytics server might be temporarily down. Without retry logic, those beacons are silently lost.
The strategy: if sendBeacon returns false (queue full) and the fetch fallback also fails, queue the payload in sessionStorage. On the next page load (or the next successful send), drain the retry queue.
function send(payload) {
  const json = JSON.stringify(payload);

  if (config.debug) {
    console.log('[Collector] Debug:', payload);
    return;
  }

  let sent = false;
  if (navigator.sendBeacon) {
    sent = navigator.sendBeacon(
      config.endpoint,
      new Blob([json], { type: 'application/json' })
    );
  }

  if (!sent) {
    fetch(config.endpoint, {
      method: 'POST',
      body: json,
      headers: { 'Content-Type': 'application/json' },
      keepalive: true
    }).catch(() => {
      // Failed -- queue for retry
      queueForRetry(payload);
    });
  }
}

function queueForRetry(payload) {
  const queue = JSON.parse(
    sessionStorage.getItem('_collector_retry') || '[]'
  );
  if (queue.length >= 50) return; // Cap the queue
  queue.push(payload);
  sessionStorage.setItem(
    '_collector_retry', JSON.stringify(queue)
  );
}

function processRetryQueue() {
  const queue = JSON.parse(
    sessionStorage.getItem('_collector_retry') || '[]'
  );
  if (!queue.length) return;
  sessionStorage.removeItem('_collector_retry');
  queue.forEach((payload) => { send(payload); });
}
The retry queue is capped at 50 entries to prevent unbounded storage growth. If the endpoint is truly unreachable for an extended period, the oldest entries are simply dropped. The queue uses sessionStorage rather than localStorage because retry data is inherently session-scoped — retrying a pageview beacon from yesterday would be misleading.
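The code above does not show where processRetryQueue() is called. One reasonable wiring, assumed here rather than taken from collector-v9.js, is to drain the queue once init() has passed its gates and again whenever connectivity returns:

// Assumed wiring -- drain leftovers from the previous page in this session,
// and drain again when the browser reports it is back online.
processRetryQueue();
window.addEventListener('online', () => processRetryQueue());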
Naive time-on-page calculations (difference between page load and unload) overcount because they include time the tab is in the background. A user who opens your page, switches to another tab for 20 minutes, then comes back has not spent 20 minutes engaging with your content.
The solution is to track only the time the page is actually visible, using the visibilitychange event:
let pageShowTime = Date.now();
let totalVisibleTime = 0;
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    totalVisibleTime += Date.now() - pageShowTime;

    // Send time-on-page with exit beacon
    send({
      type: 'page_exit',
      timeOnPage: totalVisibleTime,
      url: window.location.href,
      timestamp: new Date().toISOString()
    });
  } else {
    pageShowTime = Date.now();
  }
});
When the page becomes hidden (tab switch, minimize, navigation), the elapsed visible time is accumulated. When the page becomes visible again, the timer resets. The exit beacon sends the total visible time, giving you an accurate measure of actual engagement.
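One edge case worth noting: some older browsers did not reliably fire visibilitychange before a page was unloaded. A pagehide fallback, which is an addition and not part of the collector code above, reuses the same counters:

// Optional safety net (assumption: not in collector-v9 as shown above).
// If the page is unloaded while still visible, fold in the remaining
// visible time and send a final exit beacon.
window.addEventListener('pagehide', () => {
  if (document.visibilityState === 'visible') {
    totalVisibleTime += Date.now() - pageShowTime;
  }
  send({
    type: 'page_exit',
    timeOnPage: totalVisibleTime,
    url: window.location.href,
    timestamp: new Date().toISOString()
  });
});

If both events fire, you would also want to deduplicate so a hidden-then-pagehide sequence does not produce two exit beacons.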
An analytics collector should not degrade the experience it is measuring. The performance.mark() and performance.measure() APIs let the collector measure its own overhead:
function send(payload) {
  performance.mark('collector_send_start');

  // ... send logic ...

  performance.mark('collector_send_end');
  performance.measure(
    'collector_send',
    'collector_send_start',
    'collector_send_end'
  );
}
These marks and measures appear in the browser's Performance tab, making it easy to verify that the collector adds negligible overhead. In a well-built collector, the send() operation should take less than 1ms — the actual network transfer happens asynchronously via sendBeacon.
You can also measure initialization time:
performance.mark('collector_init_start');
// ... initialization logic ...
performance.mark('collector_init_end');
performance.measure(
'collector_init',
'collector_init_start',
'collector_init_end'
);
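Reading the measures back takes a single call; the collector_ prefix below matches the mark and measure names used above:

// List the collector's own overhead measurements in the console.
performance.getEntriesByType('measure')
  .filter((m) => m.name.startsWith('collector_'))
  .forEach((m) => console.log(`${m.name}: ${m.duration.toFixed(2)}ms`));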
With the command queue in place, the recommended embed pattern for site owners is just four lines of meaningful code:
<!-- Analytics Collector -- Place in <head> -->
<script>
window._cq = window._cq || [];
_cq.push(['init', {
endpoint: 'https://analytics.example.com/collect',
enableVitals: true,
enableErrors: true,
sampleRate: 1.0
}]);
</script>
<script async src="https://analytics.example.com/collector-v9.min.js"></script>
This is the pattern used by every major analytics service. The inline script is tiny (under 200 bytes) and executes instantly. The actual collector loads asynchronously and processes the queued init command when ready. Site owners never need to touch the collector code — all configuration happens through the init options.
| Init Option | Type | Default | Description |
|---|---|---|---|
| `endpoint` | string | (required) | URL of the analytics collection endpoint |
| `enableVitals` | boolean | `true` | Enable Web Vitals observers (LCP, CLS, INP) |
| `enableErrors` | boolean | `true` | Enable JS error and resource failure tracking |
| `sampleRate` | number | `1.0` | Fraction of sessions to track (0.0 to 1.0) |
| `debug` | boolean | `false` | Log payloads to console instead of sending |
| `respectConsent` | boolean | `true` | Check consent before collecting |
| `detectBots` | boolean | `true` | Filter out detected bots |
The development version of collector-v9.js is well-commented and readable — exactly what you want during development and debugging. The production version, collector-v9.min.js, strips all comments, removes unnecessary whitespace, and shortens internal variable names. The result is the same functionality in a significantly smaller file.
| Version | Size | Purpose |
|---|---|---|
| `collector-v9.js` | ~14 KB | Development — readable, commented, debuggable |
| `collector-v9.min.js` | ~6 KB | Production — minified, smaller transfer size |
In practice, you would use a tool like Terser or esbuild for minification. These tools parse the JavaScript AST, rename local variables to single characters, remove dead code, and compress the output. With gzip compression on the server, the transfer size drops even further — typically 60-70% smaller than the raw minified file.
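As an illustration, a minimal build script using esbuild's JavaScript API might look like the following; it assumes esbuild is installed as a dev dependency, and a Terser CLI invocation would do the same job.

// build.js -- illustrative; assumes `npm install --save-dev esbuild`.
const esbuild = require('esbuild');

esbuild.build({
  entryPoints: ['collector-v9.js'],
  minify: true,                   // rename locals, strip whitespace and comments
  bundle: false,                  // single file, nothing to bundle
  outfile: 'collector-v9.min.js'
}).catch(() => process.exit(1));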
Here is the complete flow from page load to page exit, showing every gate and subsystem:
The three gates — consent, bot detection, and sampling — are checked in that order during init(). If any gate fails, the collector stops completely. No observers are attached, no beacons are sent, no storage is written. This ensures zero overhead for visitors who should not be tracked.
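A sketch of how those gates could be ordered inside init(), using the option names from the table above (startObservers() and trackPageview() are placeholder names, not the literal collector-v9 internals):

// Gate order inside init() -- a sketch, not the actual collector-v9 source.
function init(options) {
  config = Object.assign({}, defaults, options); // defaults assumed module-level

  // Gate 1: consent (GPC signal or consent cookie)
  if (config.respectConsent && !hasConsent()) return;

  // Gate 2: bot detection
  if (config.detectBots && isBot()) return;

  // Gate 3: sampling -- track only the configured fraction of sessions
  if (Math.random() >= config.sampleRate) return;

  // All gates passed: attach observers, drain the retry queue,
  // and record the initial pageview (placeholder helper names).
  startObservers();
  processRetryQueue();
  trackPageview();
}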
- sessionStorage handles failed beacon sends gracefully
- visibilitychange tracks accurate time-on-page by counting only visible time
- performance.mark() measures the collector's own overhead