Web Performance

The Performance Golden Rule

"Send less data, less often, from nearby, when it is needed."

CSE 135 — Full Overview

Section 1: Why Performance Matters

Speed is not a feature. Speed is the baseline expectation.

Speed and User Behavior

The data is in and it is irrefutable:

  • 47% of consumers expect page loads under 2 seconds
  • Users become dissatisfied at 2–4 seconds
  • 40% abandon sites that take 3+ seconds
  • Going from a 5-second to a 19-second load increases bounce rates by 65%
  • A 1-second delay may cost 7% of conversions
  • Amazon: "100ms = 1% sales" — Google: "500ms delay = 20% fewer searches"

It's Not Just a Web Thing: Chronos & Kairos

The ancient Greeks distinguished two concepts of time:

  • Chronos: quantitative, clock time — measurable milliseconds
  • Kairos: qualitative, experiential time — how it feels

A minute waiting for a page to load feels longer than a minute reading interesting content.

The expectation floor only moves in one direction: bank lines → pneumatic tubes → ATMs → card swipe → phone tap. Once users experience faster, they cannot un-experience it.

User tolerance correlates with needs and wants. How long would you tolerate a page load if you really wanted something? (DMV line vs. prize giveaway) The ultimate "grader" of quality is a human shaped by their experience.

The Clock Starts Before Your Site

The user's total time: unlock device → open browser → type query → DNS → TCP → then your server sees a byte.

Three phases of the user journey:

  1. Before they arrive — expectations set by experiences elsewhere
  2. During their visit — more of a montage
  3. After they leave — the takeaway, what they remember
The 99% Rule: 99% of the time, people are elsewhere. That shapes their perception of your 1%. If Amazon transacts in a second, users bring that expectation to your site — fair or not.

UX/DX Tension & Mobile Reality

The Economic Tension

  • Delivery is a variable cost — every user pays the byte tax
  • Development is a more fixed cost — write once, ship many times
  • Framework convenience saves dev hours but costs bytes in every download
  • Savings in creation may be far exceeded by costs in use!

Mobile Reality

  • Mobile phones are not desktop computers with small screens
  • Even flagship phones are an order of magnitude slower than laptops
  • Developers use flagships; real users are on median Android devices
"The web has moved to relatively underpowered mobile devices with connections that are often slow, flaky, or both." — Addy Osmani, Google

Section 2: Key Definitions: Bandwidth and Latency

Latency is the real enemy.

Bandwidth vs. Latency

Bandwidth

Data capacity per unit time. Diameter of the pipe.

A wider pipe carries more water — but doesn't make a drop arrive sooner.

Latency

Network delay / travel time. Length of the pipe.

Includes propagation, processing, queuing, and transmission delays at every hop.
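
To put the pipe analogy in numbers, here is a toy model in Python. It assumes load time is simply round trips multiplied by RTT plus raw transfer time; the page weight, round-trip count, and function name are illustrative assumptions, not measurements.

```python
def load_time_ms(total_bytes: int, bandwidth_mbps: float,
                 rtt_ms: float, round_trips: int) -> float:
    """Toy model: time waiting on round trips plus time pushing bytes through the pipe."""
    transfer_ms = total_bytes * 8 / (bandwidth_mbps * 1_000_000) * 1000
    return round_trips * rtt_ms + transfer_ms


PAGE_BYTES = 250_000   # assumed page weight
ROUND_TRIPS = 30       # assumed number of sequential round trips

baseline = load_time_ms(PAGE_BYTES, bandwidth_mbps=5, rtt_ms=100, round_trips=ROUND_TRIPS)
more_bandwidth = load_time_ms(PAGE_BYTES, bandwidth_mbps=10, rtt_ms=100, round_trips=ROUND_TRIPS)
less_latency = load_time_ms(PAGE_BYTES, bandwidth_mbps=5, rtt_ms=80, round_trips=ROUND_TRIPS)

print(f"baseline:       {baseline:.0f} ms")
print(f"2x bandwidth:   {more_bandwidth:.0f} ms")
print(f"-20 ms RTT:     {less_latency:.0f} ms")
```

Under these assumed numbers, doubling bandwidth saves about 200 ms, while trimming 20 ms from each round trip saves about 600 ms.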

  • Upgrading bandwidth from 5 to 10 Mbps: only ~5% load time improvement
  • Reducing round-trip time in 20 ms steps: load time improves linearly with each step
"Latency is the real enemy." Bandwidth gets the marketing (100 Gbps! Unlimited!); latency gets the blame.

Fallacies of Hope & Scale ≠ Speed

Common misconceptions:
  • More bandwidth ≠ faster (latency dominates)
  • More servers ≠ faster (capacity ≠ speed)
  • Geographically close ≠ network close
  • Network conditions are NOT constant
  • New technologies will NOT just solve it
"Scale ≠ Speed." Think of a store with too few registers — adding a checker improves throughput (scale) but doesn't make any individual checkout faster (speed). They are related but different.

Key Definitions

Term | Definition
TTFB | Time to First Byte: request sent to first byte arriving
TTLB | Time to Last Byte: request sent to final byte arriving
FCP | First Contentful Paint: first DOM content rendered
TTI | Time to Interactive: visually rendered AND responsive to input
LCP | Largest Contentful Paint: largest visible element rendered (Core Web Vital)
CLS | Cumulative Layout Shift: visual stability (Core Web Vital)
INP | Interaction to Next Paint: responsiveness to interactions (Core Web Vital)

Section 3: RAIL: The Performance Model

Response < 100ms. Animation < 10ms. Idle < 50ms. Load < 1s.

RAIL Breakdown

Phase | Budget | What It Means
Response | < 100ms | Tap/click to visible feedback must feel instant
Animation | ~10ms/frame | 60 fps = ~16.66ms/frame. ~10ms of work to leave room for rendering. Failure = jank
Idle | 0–50ms | Background work in ≤ 50ms chunks. You share the main thread with the UI!
Load | < 1000ms | Critical above-the-fold content on screen. Avoid the White Screen of Death
Jank Busting: Your client-side JavaScript shares the execution thread with the browser's paint process. What you do can literally stall the browser itself!
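
The Idle and Animation budgets can be sketched as simple arithmetic. The following Python sketch groups background task durations into batches that fit the 50 ms idle budget and computes the per-frame budget for a given frame rate. The function names and the sample durations are hypothetical; a real page would do this in the browser, for example by yielding between batches.

```python
from typing import Iterable, Iterator, List

IDLE_BUDGET_MS = 50.0  # RAIL idle budget from the table above


def frame_budget_ms(fps: int = 60) -> float:
    """Total time available per frame at the given frame rate."""
    return 1000 / fps


def chunk_tasks(durations_ms: Iterable[float],
                budget_ms: float = IDLE_BUDGET_MS) -> Iterator[List[float]]:
    """Group task durations into batches whose total stays within budget_ms.

    A single task longer than the budget still gets its own batch;
    it would need to be split further to avoid blocking the main thread.
    """
    batch: List[float] = []
    used = 0.0
    for duration in durations_ms:
        if batch and used + duration > budget_ms:
            yield batch
            batch, used = [], 0.0
        batch.append(duration)
        used += duration
    if batch:
        yield batch


for batch in chunk_tasks([20, 20, 20, 40, 60, 5]):
    print(batch, "->", sum(batch), "ms")
```

The 60 ms task in the sample input ends up alone in a batch that still exceeds the budget, which is the signal that the task itself needs to be broken up.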

The 90% Problem & RAIL Is Not Binary

"~90% of user-response time issues are client-side." — Steve Souders

Start with client-side optimizations. They are usually simpler to apply, easier to measure, and affect the largest portion of the user's experience. The client side as a whole is still the harder environment, because you do not control the user's device, browser, or network.

RAIL Is a Spectrum

  • Meeting: User experience meets the RAIL target
  • Tolerating: Near the target but degraded
  • Failing: Significantly misses the target
Third-Party Lack of Control: When you rely on external services and linked scripts, you give away control of your performance outcome. Performance guarantees (SLAs) cost real money.

Section 4: Content Selection & Payload Reduction

The fastest byte is the one you never send.

Do You Even Need It?

The first question: do we really need this object?

The "localhost" effect: the developer's perception of performance isn't the user's reality on a slow phone over cellular.

Content You Might Not Need

  • Marketing <meta> tags that add no user-facing value
  • Comments, excessive whitespace, redundant markup
  • Unused CSS rules in a monolithic stylesheet
  • Unused JS code paths in a monolithic bundle
  • High-resolution images displayed at thumbnail sizes
Not all bytes are the same. JS must be downloaded, parsed, AND compiled. 200 KB of JavaScript is significantly more expensive than 200 KB of images.

Framework Bloat & The Tangibility Problem

Importing all of Bootstrap CSS just to center a few elements. Including an entire JS library for one utility function. The entire framework ships to every user, every time.

  • Single framework <link> tag = high DX (easy for devs)
  • Custom-building with care = lower DX but better UX (fewer bytes)
  • Savings in creation may be far exceeded by costs in use!
The (Un)clarity of (In)tangibility: Other engineering disciplines acknowledge the nature of their materials. We don't see the humming data centers. At the end of the day every byte has a power cost — poor performance has costs just like single-use plastic bottles.

Section 5: Minification: HTML

"Code for maintenance, but prepare for delivery."

Dev-Performance Pragmatism

"Code for maintenance, but prepare for delivery." Minification is automated — part of a build pipeline, not manual effort. Your source stays readable; your delivered code is optimized.

Minify first, then compress. They address different things and compound: minification removes structural redundancy, compression removes statistical redundancy.

HTML Minification Techniques

Technique | Example
Whitespace removal | Collapse multiple spaces (preserve <pre>, <textarea>)
Optional quote removal | <p id="foo"> → <p id=foo>
Comment removal | Strip <!-- --> (also reduces info leakage)
Boolean shortening | <hr noshade="noshade"> → <hr noshade>
Self-closing cleanup | <br /> → <br>
Entity remapping | &#174; → &reg; (whichever shorter)
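
Two of the techniques above, comment removal and whitespace collapsing, can be sketched in a few lines of Python. This is a deliberately naive, regex-based illustration and not how production minifiers work (they parse the document); the function name is hypothetical, and removing whitespace between tags can change the rendering of inline elements.

```python
import re

# Blocks whose whitespace is significant and must be left untouched.
_PRESERVE = re.compile(r"(<pre\b.*?</pre>|<textarea\b.*?</textarea>)",
                       re.DOTALL | re.IGNORECASE)
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def minify_html(html: str) -> str:
    """Strip comments and collapse whitespace outside <pre>/<textarea>."""
    out = []
    # split() with one capture group alternates: text, preserved block, text, ...
    for index, part in enumerate(_PRESERVE.split(html)):
        if index % 2 == 1:
            out.append(part)                      # preserved verbatim
            continue
        part = _COMMENT.sub("", part)             # comment removal
        part = re.sub(r"\s+", " ", part)          # collapse runs of whitespace
        part = re.sub(r">\s+<", "><", part)       # drop whitespace between tags
        out.append(part)
    return "".join(out).strip()


source = "<div>\n  <!-- note -->\n  <p>Hello   world</p>\n  <pre>a\n  b</pre>\n</div>"
print(len(source), "->", len(minify_html(source)), "bytes")
```

The sketch leaves a single space next to preserved blocks rather than guessing whether it is safe to remove, which is the conservative choice.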

HTML Is the Base Object

The final product is HTML: it is the atom of web content. As the root document that triggers all other fetches, any delay in it delays everything downstream.

Don't cache base HTML objects aggressively. The root HTML is what points at versioned CSS and JS URLs, so while a stale copy sits in a cache, clients keep requesting the old versions and you cannot invalidate them.
Markup Quality: Valid, semantic markup reduces bytes, improves structure, helps accessibility (a11y), and even helps bots. A seasoned engineer knows that even the smallest thing can have an outsized impact in a complex system.

Section 6: Minification: CSS

CSS impact isn't just delivery — it's render blocking.

The Growing CSS Problem

  • CSS requests per page: 1–3 → 6+ over a decade (p90: 9 → 18)
  • Misused CSS causes rendering problems and expensive reflows
  • Font dependencies can block the critical render path
  • Improper loading causes FOUC (Flash of Unstyled Content)
  • Impact is measured by LCP and CLS

Key Techniques

Technique | Example
Unused CSS removal | PurifyCSS, PurgeCSS
Rule shorthands | margin-left/right/top/bottom → margin
Value recasting | #ff0000 → red, bold → 700
Unit elimination | 0px → 0
Rule merging | Collapse redundant selectors
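
As a rough illustration of value recasting and unit elimination, the sketch below applies three regex substitutions in Python. It is an assumption-laden toy (the function name is hypothetical, and it only handles the exact cases from the table); real CSS minifiers tokenize the stylesheet instead of pattern-matching text.

```python
import re


def shorten_css_values(css: str) -> str:
    """Apply a few value-level rewrites from the table above."""
    # Unit elimination: a zero length needs no unit (length units only;
    # percentages and times are left alone on purpose).
    css = re.sub(r"\b0(?:px|em|rem|pt)\b", "0", css)
    # Value recasting: the named color is shorter than the hex form.
    css = re.sub(r"#ff0000\b", "red", css, flags=re.IGNORECASE)
    # Value recasting: numeric weight is shorter than the keyword.
    css = re.sub(r"(font-weight\s*:\s*)bold\b", r"\g<1>700", css)
    return css


print(shorten_css_values("a{margin:0px 10px;color:#ff0000;font-weight:bold}"))
```

Note that `10px` is left untouched: the pattern only matches a standalone zero.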

Critical CSS & The CSP Tension

Critical CSS

Extract above-the-fold CSS and inline it in <head>. Eliminates a render-blocking request for the initial viewport. Load full stylesheet asynchronously afterward.

The Three-Way Trade-Off

  • Performance: inlining CSS eliminates a network request
  • Security: CSP disallows inline styles to prevent injection
  • Complexity: CSP nonces/hashes add build pipeline work
CSS optimization is not just about bytes — it's about render blocking. The browser must parse CSS entirely before rendering. Excessive HTML and CSS in JavaScript gets in the way of what the browser has been optimized to do!
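
One way to keep inlined critical CSS while still running a CSP is to allow the inline block by hash. A minimal Python sketch of the build step, assuming a hypothetical `critical_css` string and helper name; the hash must be computed over the exact text of the style element, including whitespace.

```python
import base64
import hashlib


def csp_style_hash(inline_css: str) -> str:
    """Return a CSP hash source ('sha256-<base64 digest>') for an inline style block."""
    digest = hashlib.sha256(inline_css.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


# Hypothetical critical CSS extracted for the initial viewport.
critical_css = "body{margin:0}h1{font-size:2rem}"

header_value = f"style-src 'self' {csp_style_hash(critical_css)}"
print("Content-Security-Policy:", header_value)
```

This is the "complexity" corner of the trade-off: every change to the inlined CSS changes the hash, so the header has to be regenerated by the build pipeline.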

Section 7: Minification: JavaScript

JS abuse is arguably the worst performance issue on the modern web.

JS Costs More Than You Think

JavaScript must be downloaded, parsed, and compiled. It's a triple cost.

The Uncanny Valley: A page visually loaded but not usable. Measured by TTI, and it can be severe on median Android phones. This has led some to describe the modern WWW as more "Wealthy Western Web" than "World Wide Web."

Key Techniques

Technique | Savings
Variable name rewriting (var myLongName → var x) | Significant
Dead code elimination | Variable; can be very large in frameworks
Whitespace reduction | Moderate (watch for ASI)
Repetition rewrites | Moderate
Code optimizations (i=i+1 → i++) | Small, adds up

Bundling, Code Splitting & "The Demo Illusion"

  • Separate files = developer value. Bundle for delivery.
  • But not a monolith — code split by route (homepage.js, checkout.js)
  • Postel's Law: be conservative in optimizations unless you control all code
"The Demo Illusion": A library is 40 KB gzipped — but 200 KB+ ungzipped. Then it fetches plugins and extensions. The Getting Started demo is small and easy — that's the illusion. Performance reality crashes in once it's too late to remove the dependency. Know what a dependency does before you adopt it.

Section 8: Fonts, URI Paths & Finishing Touches

Fine polish for large-scale optimization.

Font Optimization & URI Path Reduction

Fonts

  • Is the font worth it? Most users can't tell one sans-serif typeface from another
  • Consider system fonts — free delivery!
  • Variable fonts: one file represents many variations
  • Subset fonts: only glyphs you need. 150 KB → 15 KB for Latin subset
  • Do you need all weight versions? Semi-bold, bold, light, extra-bold, regular?

URI Path Reduction

<!-- Before -->
<img src="/images/UCSD_logo.png">
<link href="/css/main-styles.css">

<!-- After -->
<img src="/d/i/u0.png">
<link href="/d/c/m0.css">

Only rewrite dependent resource paths — user-facing URLs stay readable.

Fine polish for large-scale sites. At hyperscale, every byte truly matters. For smaller sites, these yield diminishing returns.

Section 9: Compression

Up to 70% savings — usually just a server config change.

HTTP Compression

Client                                  Server
  |                                       |
  |--- GET /app.js ------------------>    |
  |    Accept-Encoding: gzip, br          |
  |                                       |
  |                            [compress] |
  |                                       |
  |<-- 200 OK ----------------------      |
  |    Content-Encoding: br               |
  |    167,422 bytes → 48,291 bytes       |
  |    Savings: ~71%                      |

Format | Notes
gzip | Universal support. The safe default.
Brotli | Better ratios than gzip. Browsers only offer it over HTTPS. Now widely supported.
Deflate | Older. Largely superseded by gzip.
Dictionary | Emerging. Leverages "sameness" across web pages.
Minify first, then compress. They are complementary — minification removes structural redundancy, compression removes statistical redundancy.
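
The negotiation and the savings can both be sketched with Python's standard library. The encoding picker below is a simplification (it ignores q-values in `Accept-Encoding`), the function names are hypothetical, and the sample payload is an assumed piece of repetitive markup. Only gzip is measured because Brotli is not in the standard library.

```python
import gzip
from typing import Optional, Sequence


def choose_encoding(accept_encoding: str,
                    server_prefers: Sequence[str] = ("br", "gzip")) -> Optional[str]:
    """Pick the server's most preferred encoding that the client offered (q-values ignored)."""
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    for encoding in server_prefers:
        if encoding in offered:
            return encoding
    return None  # fall back to sending the body uncompressed


def gzip_savings(payload: bytes) -> float:
    """Fraction of bytes saved by gzip-compressing the payload."""
    return 1 - len(gzip.compress(payload)) / len(payload)


# Assumed sample: repetitive markup, typical of text assets.
sample = b'<li class="item"><a href="/products/widget">Widget</a></li>\n' * 200

print("encoding:", choose_encoding("gzip, br"))
print(f"gzip savings: {gzip_savings(sample):.0%}")
```

Highly repetitive input like this sample compresses far better than typical assets, so treat the printed figure as an upper bound on what real pages achieve.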

Image Compression

Images are the biggest and most obvious byte savings opportunity — yet consistently neglected.

  • Modern formats: WebP, AVIF offer significant improvements over JPEG/PNG
  • But consider user context — if users want to save images locally, exotic formats frustrate
  • Inline SVG replaces raster images for simple visuals
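Don't be "packet stupid." Once a resource is down to 1–2 KB, making it smaller doesn't help: it already fits in one or two TCP packets (roughly 1.4 KB of payload each on a typical 1500-byte MTU), so the number of round trips stays the same. Also, decompression time for huge images can outweigh delivery savings.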

Section 10: Caching

"Why do you keep sending me that logo?!" — Your browser

Cache-Control Directives

Directive | Meaning | Use Case
public | Any cache can store | Static assets
private | Browser only | Personalized content
max-age=N | Fresh for N seconds | 31536000 (1 year) for versioned assets
no-cache | Revalidate before use | Changeable content
no-store | Don't cache at all | Sensitive data
immutable | Never changes | Content-hashed assets

Validation: ETags & 304

  • Time-based: max-age — browser doesn't ask at all until expired
  • Content-based: ETag + If-None-Match → 304 Not Modified (no body)
  • 304 saves bandwidth but still incurs a round-trip
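
The content-based validation flow can be sketched as a small server-side function in Python. This is a simplified illustration: the helper names are hypothetical, the ETag is derived from a content hash (one common choice, assumed here), and it compares a single strong ETag rather than handling lists, weak validators, or `*`.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from a hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes,
                    if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET with an optional If-None-Match header."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""        # Not Modified: headers only, no body
    return 200, headers, body


status, headers, payload = conditional_get(b"body { margin: 0 }", None)
print(status, headers["ETag"], len(payload))

status, _, payload = conditional_get(b"body { margin: 0 }", headers["ETag"])
print(status, len(payload))
```

The second request still costs a round trip, but the empty body is where the bandwidth saving comes from.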

Cache Busting, Vary & Layers

Cache Busting

URL = cache key. To invalidate, change the URL:

  • Content hash: app.3f2a1b.js (best practice)
  • Version: logo-v2.gif
  • Query string: logo.jpg?ts=324243 (quick & dirty)
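
A content-hashed filename can be produced in a build step along these lines. This Python sketch is an assumption about how such a step might look: the function name, the choice of SHA-256, and the six-character digest length are illustrative, not a specific tool's behavior.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 6) -> str:
    """Insert a short content hash before the file extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


print(hashed_name("static/app.js", b"console.log('v1');"))
print(hashed_name("static/app.js", b"console.log('v2');"))
```

Because the name changes only when the bytes change, such files can be served with a long max-age and immutable without risking stale content.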

Vary Header

One URL, multiple versions (compressed/uncompressed, WebP/JPEG, en/es). Vary tells caches which request headers create distinct versions. The cache key becomes the URL plus the values of the request headers named in Vary.
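
A minimal Python sketch of how a cache might build its key from the URL and the Vary header. The function name is hypothetical, and real caches typically normalize header values before keying, which this sketch does not do.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Combine the URL with the values of the request headers listed in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


brotli_key = cache_key("/app.js", {"Accept-Encoding": "br"}, "Accept-Encoding")
gzip_key = cache_key("/app.js", {"Accept-Encoding": "gzip"}, "Accept-Encoding")
print(brotli_key != gzip_key)  # same URL, two distinct cache entries
```

Headers not named in Vary do not affect the key, which is why varying on something like User-Agent fragments a cache so badly.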

Caching Layers

Browser → Proxy → CDN edge → Reverse proxy at origin

Caching is the most effective latency elimination technique. A cached resource has zero network latency.

Section 11: Latency Reduction: CDNs and DNS

Move content closer to users.

CDNs & DNS Optimization

User (Tokyo)
     |
[CDN Edge - Tokyo]       ← Cache hit: 15ms
     |
     | Cache miss?
     v
[Origin - San Diego]     ← Full round-trip: 180ms

DNS Tips

  • Use a DNS vendor (Cloudflare, Google, Route 53) + own backup
  • Short TTLs = faster failover; Long TTLs = fewer lookups
  • Short domain names, handle typos (wwwamazon.com)
  • It's a running joke: when something's broken, it's probably DNS
DNS centralization is a double-edged sword. Speed from centralized DNS, but concentration creates single points of failure. The 2021 outages took large portions of the internet offline.
Pay for Performance: Cloud vendors offer latency improvement at cost. The faster you want to go, the more you pay — just like a car! Don't pick based on brand loyalty alone.

Section 12: Preloading, Prefetch & Demand-Driven Loading

Send the bytes when you need them — no earlier, no later.

Resource Hints

Very little work for potentially quite a lot of gain:

Hint | Purpose | When
preload | Fetch resource needed for current page but discovered late | Critical fonts, hero images
prefetch | Fetch resource likely needed for next navigation | Next-page assets
preconnect | Establish TCP+TLS early | CDN, analytics, font origins
dns-prefetch | Resolve DNS only (lighter) | Domains you might need

Lazy Loading

<img src="photo.jpg" loading="lazy" alt="Below the fold">
<img src="hero.jpg" loading="eager" alt="Hero image">

Native standard — no JavaScript needed. If browsers support it, go native.

Code Splitting & HTTP/2 Implications

Don't send a monolithic bundle. Split by route — homepage gets homepage.js, checkout gets checkout.js.

Optimization techniques have a shelf life. Bundling was essential under HTTP/1.1, where browsers open only about six connections per host. Under HTTP/2, it can hurt performance by preventing fine-grained caching. Just as modem-era patterns made the 2010s web worse, 2010s patterns can make the 2020s web worse. Beware of expired "best practices," especially those baked into LLM training data.

Section 13: Service Workers & PWAs

Make the network a controlled — and even optional — component.

Cache Strategies

Strategy | How It Works | Best For
Cache-first | Serve from cache; fetch on miss | Static assets (CSS, JS, images)
Network-first | Try network; fall back to cache | Dynamic content, APIs
Stale-while-revalidate | Serve stale immediately; update in background | Feeds, dashboards

Page              Service Worker          Cache             Network
 |                      |                   |                  |
 |-- fetch(/api) ------>|                   |                  |
 |                      |-- check cache --->|                  |
 |                      |<-- cache hit -----|                  |
 |<-- cached data ------|                   |                  |
 |                      |-- bg fetch -------|----------------->|
 |                      |<-- fresh ---------|<-----------------|
 |                      |-- update cache -->|                  |

[stale-while-revalidate: instant response, silent update]

Service Workers are the logical endpoint of the performance progression. The network should be considered a progressive enhancement target — this observation is behind the "local first" pattern of software.
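
The stale-while-revalidate strategy can be simulated outside the browser to make its behavior concrete. The Python sketch below is not the Service Worker API (which is JavaScript and asynchronous); the class name, the `fetch` callable, and the explicit `run_background` step are assumptions standing in for the real background fetch.

```python
from typing import Callable, Dict, List


class StaleWhileRevalidateCache:
    """Serve cached values immediately and refresh them afterwards."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                 # stands in for the network
        self._store: Dict[str, str] = {}
        self._pending: List[str] = []       # URLs awaiting a background refresh

    def get(self, url: str) -> str:
        if url in self._store:
            self._pending.append(url)       # schedule the silent update
            return self._store[url]         # instant, possibly stale response
        value = self._fetch(url)            # cache miss: must wait for the network
        self._store[url] = value
        return value

    def run_background(self) -> None:
        """Perform the deferred refreshes (the 'bg fetch' step)."""
        while self._pending:
            url = self._pending.pop(0)
            self._store[url] = self._fetch(url)


versions = iter(["v1", "v2", "v3"])
cache = StaleWhileRevalidateCache(lambda url: next(versions))

first = cache.get("/api")     # miss: fetched from the "network"
second = cache.get("/api")    # hit: stale copy served, refresh queued
cache.run_background()
third = cache.get("/api")     # hit: the refreshed copy
print(first, second, third)
```

The second caller gets an answer with no network wait at the price of seeing data one version behind, which is why the strategy suits feeds and dashboards rather than transactional content.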

Section 14: Monitoring, Analytics & RAIL Conformance

Performance optimization without measurement is guessing.

What to Monitor

Metric | What It Measures | Target
LCP | Largest visible element renders | ≤ 2.5s
INP | Responsiveness to interactions | ≤ 200ms
CLS | Visual stability | ≤ 0.1
TTFB | Server + network responsiveness | ≤ 800ms
FCP | First visible content | ≤ 1.8s
TTI | Reliably interactive | ≤ 3.8s
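
Checking field data against these targets can be sketched in Python. The "good" limits come from the table above; the "poor" limits (4.0 s, 500 ms, 0.25) and the use of the 75th percentile follow Google's published Core Web Vitals guidance. The function names and sample values are hypothetical.

```python
import math
from typing import List

# (good_limit, poor_limit) per Core Web Vital; LCP and INP in milliseconds.
THRESHOLDS = {
    "LCP": (2500, 4000),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def percentile_75(samples: List[float]) -> float:
    """Nearest-rank 75th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, value: float) -> str:
    """Classify a value as good, needs improvement, or poor."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


lcp_samples_ms = [1000, 2000, 3000, 4000]   # assumed field measurements
p75 = percentile_75(lcp_samples_ms)
print("LCP p75:", p75, "->", rate("LCP", p75))
```

Rating the 75th percentile rather than the average keeps a fast majority from hiding the slow tail of real users.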

Testing Tools

  • WebPageTest — real browser testing from global locations
  • Chrome DevTools / Lighthouse — lab-based auditing
  • Throttle testing — simulate slow network, mid-range devices
RUM (Real User Monitoring) reveals reality, and it will humble you. Lab tests on developer machines don't capture the 90th-percentile user. See the OpenTelemetry Overview for building a telemetry pipeline.

Section 15: Interface Illusion & the Perception Stack

Perceived speed and actual speed are different things.

When Engineering Runs Out

Keep people busy and occupied — from waiters bringing drinks early to Disneyland engaging you in a long line. The web has its own techniques:

  • Preloaders & spinners: Set expectations, signal progress
  • Skeleton screens: Show page structure before content arrives
  • Progress bars: Determinate bars feel psychologically faster than spinners
  • Optimistic UI: Show result before server confirms ("sent" before acknowledged)
Load Time (ms):  0 ---- 500 ---- 1000 ---- 1500 ---- 2000 ---- 2500

WSOD:      [            blank white screen            ] [content!]
           User perceives: "slow, broken?"

Skeleton:  [skeleton] [partial] [      content fills in       ]
           User perceives: "fast, responsive"

Same total load time. Different user experience.

The best performance work addresses both actual speed and perceived speed. Engineering handles the first; design handles the second. Neither alone is sufficient. But if bytes = money, no amount of UI/UX stagecraft changes the cost.

Performance/Security Tensions & Privacy

  • Inlining CSS/JS helps performance but conflicts with CSP
  • Caching sensitive data requires careful private / no-store
  • Short URI paths add some scraping resistance but also maintenance complexity
Browser private mode isn't as private as users think. The browser forgets, but DNS caches, ISP logs, transit proxies, and server logs do not. Device fingerprinting goes beyond cookies. The technology isn't the problem — it's what is done with the data.

Summary: Key Takeaways

"Send less data, less often, from nearby, when it is needed."

Web Performance at a Glance

Concept | Key Takeaway
Why It Matters | Speed is the baseline expectation. Users abandon slow sites. Developer perceptions lie.
Bandwidth vs. Latency | Latency is the real enemy. Scale ≠ Speed.
RAIL | R < 100ms, A < 10ms, I < 50ms, L < 1s. ~90% is client-side.
Content Selection | Fastest byte = one you never send. JS costs far more than images.
Minification | Code for maintenance, prepare for delivery. Minify first, then compress.
Compression | gzip/Brotli: up to 70% savings. Server config, not code change.
Caching | Zero-latency for cached resources. Versioned filenames + long max-age.
CDNs & DNS | Move content closer. DNS optimization prevents bottlenecks.
Loading Strategy | Preload now, prefetch next, lazy-load below fold. HTTP/2 changes the calculus.
Service Workers | Make the network optional. Cache-first / network-first / stale-while-revalidate.
Monitoring | RUM reveals reality. Core Web Vitals. RAIL is a spectrum, not binary.
Interface Illusion | Perceived ≠ actual speed. Skeleton screens and optimistic UI are legitimate tools.