Web Performance

The Performance Golden Rule

"Send less data, less often, from nearby, when it is needed."

CSE 135 — Full Overview


Section 1: Why Performance Matters

Speed is not a feature. Speed is the baseline expectation.

Speed and User Behavior

The data is in and it is irrefutable:

47% of consumers expect page loads under 2 seconds
Users become dissatisfied at 2–4 seconds
40% abandon sites that take 3+ seconds
5 → 19 seconds increases bounce rates by 65%
A 1-second delay may cost 7% of conversions
Amazon: "100ms = 1% sales" — Google: "500ms delay = 20% fewer searches"

Speed is not a feature. Speed is the baseline expectation.


It's Not Just a Web Thing: Chronos & Kairos

The ancient Greeks distinguished two concepts of time:

Chronos: quantitative, clock time — measurable milliseconds
Kairos: qualitative, experiential time — how it feels

A minute waiting for a page to load feels longer than a minute reading interesting content.

The expectation floor only moves in one direction: bank lines → pneumatic tubes → ATMs → card swipe → phone tap. Once users experience faster, they cannot un-experience it.

User tolerance correlates with needs and wants. How long would you tolerate a page load if you really wanted something? (DMV line vs. prize giveaway) The ultimate "grader" of quality is a human shaped by their experience.


The Clock Starts Before Your Site

The user's total time: unlock device → open browser → type query → DNS → TCP → then your server sees a byte.

Three phases of the user journey:

Before they arrive — expectations set by experiences elsewhere
During their visit — more of a montage
After they leave — the takeaway, what they remember

The 99% Rule: 99% of the time, people are elsewhere. That shapes their perception of your 1%. If Amazon transacts in a second, users bring that expectation to your site — fair or not.


UX/DX Tension & Mobile Reality

The Economic Tension

Delivery is a variable cost — every user pays the byte tax
Development is a more fixed cost — write once, ship many times
Framework convenience saves dev hours but costs bytes in every download
Savings in creation may be far exceeded by costs in use!

Mobile Reality

Mobile phones are not desktop computers with small screens
Even flagship phones are an order of magnitude slower than laptops
Developers use flagships; real users are on median Android devices

"The web has moved to relatively underpowered mobile devices with connections that are often slow, flaky, or both." — Addy Osmani, Google


Section 2: Key Definitions: Bandwidth and Latency

Latency is the real enemy.

Bandwidth vs. Latency

Bandwidth

Data capacity per unit time. Diameter of the pipe.

A wider pipe carries more water — but doesn't make a drop arrive sooner.

Latency

Network delay / travel time. Length of the pipe.

Includes propagation, processing, queuing, and transmission delays at every hop.

Upgrading bandwidth from 5 to 10 Mbps: only ~5% load time improvement
Each 20ms reduction in round-trip time: a steady, linear load time improvement
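
The pipe analogy can be put into rough numbers. What follows is a minimal sketch, assuming a deliberately simplified model in which load time is the number of sequential round trips times the RTT, plus raw transfer time. The function name `load_time_ms` and the sample page (500 KB, 30 round trips) are illustrative assumptions, not measurements.

```python
def load_time_ms(total_bytes: int, bandwidth_mbps: float,
                 rtt_ms: float, round_trips: int) -> float:
    """Toy model: latency cost plus transfer cost, in milliseconds."""
    transfer_ms = (total_bytes * 8 * 1000) / (bandwidth_mbps * 1_000_000)
    latency_ms = round_trips * rtt_ms
    return latency_ms + transfer_ms


# Hypothetical page: 500 KB total, 30 sequential round trips.
baseline = load_time_ms(500_000, bandwidth_mbps=5, rtt_ms=100, round_trips=30)
more_bandwidth = load_time_ms(500_000, bandwidth_mbps=10, rtt_ms=100, round_trips=30)
less_latency = load_time_ms(500_000, bandwidth_mbps=5, rtt_ms=80, round_trips=30)

print(baseline, more_bandwidth, less_latency)
```

In this toy model, doubling bandwidth only shrinks the transfer term, while trimming the RTT pays off once per round trip. Real pages differ, but the shape of the trade-off is the same.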

"Latency is the real enemy." Bandwidth gets the marketing (100 Gbps! Unlimited!); latency gets the blame.


Fallacies of Hope & Scale ≠ Speed

Common misconceptions:

More bandwidth ≠ faster (latency dominates)
More servers ≠ faster (capacity ≠ speed)
Geographically close ≠ network close
Network conditions are NOT constant
New technologies will NOT just solve it

"Scale ≠ Speed." Think of a store with too few registers — adding a checker improves throughput (scale) but doesn't make any individual checkout faster (speed). They are related but different.


Key Definitions

Term	Definition
TTFB	Time to First Byte — request sent to first byte arriving
TTLB	Time to Last Byte — request sent to final byte arriving
FCP	First Contentful Paint — first DOM content rendered
TTI	Time to Interactive — visually rendered AND responsive to input
LCP	Largest Contentful Paint — largest visible element rendered (Core Web Vital)
CLS	Cumulative Layout Shift — visual stability (Core Web Vital)
INP	Interaction to Next Paint — responsiveness to interactions (Core Web Vital)


Section 3: RAIL: The Performance Model

Response < 100ms. Animation < 10ms. Idle < 50ms. Load < 1s.

RAIL Breakdown

Phase	Budget	What It Means
Response	< 100ms	Tap/click to visible feedback must feel instant
Animation	~10ms/frame	60 fps = ~16.67ms/frame. ~10ms of work to leave room for rendering. Failure = jank
Idle	0–50ms	Background work in ≤ 50ms chunks. You share the main thread with the UI!
Load	< 1000ms	Critical above-the-fold content on screen. Avoid the White Screen of Death

Jank Busting: Your client-side JavaScript shares the execution thread with the browser's paint process. What you do can literally stall the browser itself!
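
The Idle budget amounts to slicing background work so that no slice holds the main thread longer than 50 ms. Below is a language-neutral sketch in Python, assuming each task comes with an estimated cost in milliseconds. `chunk_by_budget` and `frame_budget_ms` are hypothetical helper names, and in a browser the gaps between chunks would be points where the script yields back to the event loop.

```python
from typing import Iterable, List


def frame_budget_ms(fps: int = 60) -> float:
    """Time available per frame at the given frame rate."""
    return 1000 / fps


def chunk_by_budget(task_costs_ms: Iterable[float],
                    budget_ms: float = 50.0) -> List[List[float]]:
    """Group estimated task costs into chunks that each fit within budget_ms.

    A single task that exceeds the budget gets a chunk of its own,
    because this sketch cannot split an individual task.
    """
    chunks: List[List[float]] = []
    current: List[float] = []
    used = 0.0
    for cost in task_costs_ms:
        if current and used + cost > budget_ms:
            chunks.append(current)      # close the chunk; yield point goes here
            current, used = [], 0.0
        current.append(cost)
        used += cost
    if current:
        chunks.append(current)
    return chunks


plan = chunk_by_budget([20, 20, 20, 45, 10, 80, 5])
print(plan)
```

The 80 ms task ends up alone in an over-budget chunk, which is the case that produces jank and is worth breaking up at the source.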


The 90% Problem & RAIL Is Not Binary

"~90% of user-response time issues are client-side." — Steve Souders

Start with client-side optimizations. They are usually simpler to apply, easier to measure, and affect the largest portion of the user's experience. The client side is still the harder environment overall, because it runs on devices and networks you do not control.

RAIL Is a Spectrum

Meeting: User experience meets the RAIL target
Tolerating: Near the target but degraded
Failing: Significantly misses the target
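
The three bands above can be expressed as a small classifier. This is a sketch only: the budgets come from the RAIL table, but the `tolerance` factor that separates "tolerating" from "failing" is an assumption made here for illustration, since no numeric cut-off is given.

```python
# Budgets in milliseconds, taken from the RAIL breakdown.
RAIL_BUDGETS_MS = {"response": 100, "animation": 10, "idle": 50, "load": 1000}


def classify(phase: str, measured_ms: float, tolerance: float = 1.5) -> str:
    """Return 'meeting', 'tolerating', or 'failing' for one measurement.

    tolerance is a hypothetical multiplier: up to budget * tolerance
    counts as 'tolerating'. Pick a value that suits your own product.
    """
    budget = RAIL_BUDGETS_MS[phase]
    if measured_ms <= budget:
        return "meeting"
    if measured_ms <= budget * tolerance:
        return "tolerating"
    return "failing"


print(classify("response", 80), classify("response", 140), classify("load", 2500))
```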

3rd Party Lack of Control: When you rely on external services and linked scripts you have given away control of your performance outcome. Performance guarantees (SLAs) cost real money.


Section 4: Content Selection & Payload Reduction

The fastest byte is the one you never send.

Do You Even Need It?

The first question: do we really need this object?

The "localhost" effect: the developer's perception of performance isn't the user's reality on a slow phone over cellular.

Content You Might Not Need

Marketing <meta> tags that add no user-facing value
Comments, excessive whitespace, redundant markup
Unused CSS rules in a monolithic stylesheet
Unused JS code paths in a monolithic bundle
High-resolution images displayed at thumbnail sizes

Not all bytes are the same. JS must be downloaded, parsed, AND compiled. 200 KB of JavaScript is significantly more expensive than 200 KB of images.


Framework Bloat & The Tangibility Problem

Importing all of Bootstrap CSS just to center a few elements. Including an entire JS library for one utility function. The entire framework ships to every user, every time.

Single framework <link> tag = high DX (easy for devs)
Custom-building with care = lower DX but better UX (fewer bytes)
Savings in creation may be far exceeded by costs in use!

The (Un)clarity of (In)tangibility: Other engineering disciplines acknowledge the nature of their materials. We don't see the humming data centers. At the end of the day every byte has a power cost — poor performance has costs just like single-use plastic bottles.


Section 5: Minification: HTML

"Code for maintenance, but prepare for delivery."

Dev-Performance Pragmatism

"Code for maintenance, but prepare for delivery." Minification is automated — part of a build pipeline, not manual effort. Your source stays readable; your delivered code is optimized.

Minify first, then compress. They address different things and compound: minification removes structural redundancy, compression removes statistical redundancy.

HTML Minification Techniques

Technique	Example
Whitespace removal	Collapse multiple spaces (preserve `<pre>`, `<textarea>`)
Optional quote removal	`<p id="foo">` → `<p id=foo>`
Comment removal	Strip `<!-- -->` (also reduces info leakage)
Boolean shortening	`<hr noshade="noshade">` → `<hr noshade>`
Self-closing cleanup	`<br />` → `<br>`
Entity remapping	`&reg;` → `®` (whichever is shorter)
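
Two of the techniques in the table, whitespace collapsing and comment removal, can be sketched in a few lines. This is a naive, regex-based illustration and not a substitute for a real minifier: `minify_html` is a hypothetical name, and the sketch ignores cases such as conditional comments, inline scripts, and `white-space: pre` styling.

```python
import re

# Blocks whose whitespace is significant and must be left alone.
_PROTECTED = re.compile(r"(<(?:pre|textarea)\b.*?</(?:pre|textarea)>)",
                        re.DOTALL | re.IGNORECASE)
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
_WHITESPACE = re.compile(r"\s+")


def minify_html(html: str) -> str:
    """Strip comments and collapse whitespace outside <pre>/<textarea>."""
    parts = _PROTECTED.split(html)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # captured protected block: keep verbatim
        else:
            part = _COMMENT.sub("", part)
            out.append(_WHITESPACE.sub(" ", part))
    return "".join(out)


sample = ("<div>\n   <p>Hello   world</p>  <!-- note -->\n"
          "<pre>  keep\n  this </pre>\n</div>")
minified = minify_html(sample)
print(minified)
```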


HTML Is the Base Object

The final product is HTML — it is the atoms of web content. As the root document that triggers all other fetches, any delay adds delays to everything downstream.

Don't cache base HTML objects aggressively. The root HTML is what references the versioned CSS and JS URLs, so while a stale copy of it sits in a cache, users keep requesting the old dependent objects and you have no way to invalidate them.

Markup Quality: Valid, semantic markup reduces bytes, improves structure, helps accessibility (a11y), and even helps bots. A seasoned engineer knows that even the smallest thing can have an outsized impact in a complex system.


Section 6: Minification: CSS

CSS impact isn't just delivery — it's render blocking.

The Growing CSS Problem

CSS requests per page: 1–3 → 6+ over a decade (p90: 9 → 18)
Misused CSS causes rendering problems and expensive reflows
Font dependencies can block the critical render path
Improper loading causes FOUC (Flash of Unstyled Content)
Measured by LCP and CLS

Key Techniques

Technique	Example
Unused CSS removal	PurifyCSS, PurgeCSS
Rule shorthands	`margin-left/right/top/bottom` → `margin`
Value recasting	`#ff0000` → `red`, `bold` → `700`
Unit elimination	`0px` → `0`
Rule merging	Collapse redundant selectors
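
The value recasting and unit elimination rows can be illustrated with a few substitutions. This is a toy sketch covering only the exact examples in the table. `recast_css_values` is a hypothetical name, and a production minifier works on a parsed stylesheet, not on regular expressions.

```python
import re


def recast_css_values(css: str) -> str:
    """Apply three byte-saving rewrites from the techniques table."""
    # Unit elimination: a bare zero length needs no unit (10px is untouched).
    css = re.sub(r"(?<![\w.#-])0(?:px|em|rem)\b", "0", css)
    # Value recasting: the named colour is shorter than the hex form.
    css = re.sub(r"#ff0000\b", "red", css, flags=re.IGNORECASE)
    # Value recasting: numeric weight is shorter than the keyword.
    css = re.sub(r"(font-weight\s*:\s*)bold\b", r"\g<1>700", css)
    return css


before = "p{margin:0px;color:#FF0000;font-weight:bold;padding:10px}"
after = recast_css_values(before)
print(after)
```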


Critical CSS & The CSP Tension

Critical CSS

Extract above-the-fold CSS and inline it in <head>. Eliminates a render-blocking request for the initial viewport. Load full stylesheet asynchronously afterward.

The Three-Way Trade-Off

Performance: inlining CSS eliminates a network request
Security: a strict CSP disallows inline styles to prevent injection
Complexity: CSP nonces/hashes add build pipeline work
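
The complexity item above is mostly bookkeeping: a CSP hash source is the base64-encoded SHA-256 digest of the exact inline content. Here is a minimal sketch of that build step. `csp_style_hash` and the sample `critical_css` string are illustrative assumptions.

```python
import base64
import hashlib


def csp_style_hash(inline_css: str) -> str:
    """Build a CSP hash source for the contents of one inline <style> block."""
    digest = hashlib.sha256(inline_css.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


# Hypothetical critical CSS that a build step would inline in <head>.
critical_css = "body{margin:0}"
header_value = f"style-src 'self' {csp_style_hash(critical_css)}"
print(header_value)
```

The hash covers every byte between the `<style>` tags, so any later whitespace change or re-minification means regenerating the header, which is where the pipeline work comes from.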

CSS optimization is not just about bytes — it's about render blocking. The browser must parse CSS entirely before rendering. Excessive HTML and CSS in JavaScript gets in the way of what the browser has been optimized to do!


Section 7: Minification: JavaScript

JS abuse is arguably the worst performance issue on the modern web.

JS Costs More Than You Think

JavaScript must be downloaded, parsed, and compiled. It's a triple cost.

The Uncanny Valley: A page visually loaded but not usable. Measured by TTI, and it can be severe on median Android phones. This has led some to describe the modern WWW as more "Wealthy Western Web" than "World Wide Web."

Key Techniques

Technique	Savings
Variable name rewriting (`var myLongName` → `var x`)	Significant
Dead code elimination	Variable — can be very large in frameworks
Whitespace reduction	Moderate (watch for ASI)
Repetition rewrites	Moderate
Code optimizations (`i=i+1` → `i++`)	Small, adds up


Bundling, Code Splitting & "The Demo Illusion"

Separate files = developer value. Bundle for delivery.
But not a monolith — code split by route (homepage.js, checkout.js)
Postel's Law: be conservative in optimizations unless you control all code

"The Demo Illusion": A library is 40 KB gzipped — but 200 KB+ ungzipped. Then it fetches plugins and extensions. The Getting Started demo is small and easy — that's the illusion. Performance reality crashes in once it's too late to remove the dependency. Know what a dependency does before you adopt it.


Section 8: Fonts, URI Paths & Finishing Touches

Fine polish for large-scale optimization.

Font Optimization & URI Path Reduction

Fonts

Is the font worth it? Most users can't tell one sans-serif from another
Consider system fonts — free delivery!
Variable fonts: one file represents many variations
Subset fonts: only glyphs you need. 150 KB → 15 KB for Latin subset
Do you need all weight versions? Semi-bold, bold, light, extra-bold, regular?

URI Path Reduction

<img src="/images/UCSD_logo.png"> → <img src="/d/i/u0.png">
<link href="/css/main-styles.css"> → <link href="/d/c/m0.css">

Only rewrite dependent resource paths — user-facing URLs stay readable.

Fine polish for large-scale sites. At hyperscale, every byte truly matters. For smaller sites, these yield diminishing returns.


Section 9: Compression

Up to 70% savings — usually just a server config change.

HTTP Compression

Format	Notes
gzip	Universal support. The safe default.
Brotli	Better ratios than gzip. Browsers accept it only over HTTPS. Supported by all major modern browsers.
Deflate	Older. Largely superseded by gzip.
Dictionary	Emerging. Leverages "sameness" across web pages.

Minify first, then compress. They are complementary — minification removes structural redundancy, compression removes statistical redundancy.
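
The size of the win is easy to check locally with the standard library. This is a small sketch, assuming an artificially repetitive sample, so the ratio it prints is higher than a typical real page would give. `gzip_savings` is a hypothetical helper name.

```python
import gzip


def gzip_savings(payload: bytes) -> float:
    """Fraction of bytes saved by gzip, e.g. 0.7 means 70% smaller."""
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Repetitive markup compresses very well; real content is less uniform.
sample = b'<li class="item"><a href="/products/widget">Widget</a></li>\n' * 200
ratio = gzip_savings(sample)
print(f"saved {ratio:.0%} of {len(sample)} bytes")
```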


Image Compression

Images are the biggest and most obvious byte savings opportunity — yet consistently neglected.

Modern formats: WebP, AVIF offer significant improvements over JPEG/PNG
But consider user context — if users want to save images locally, exotic formats frustrate
Inline SVG replaces raster images for simple visuals

Don't be "packet stupid." Once a resource is down to 1–2 KB, making it smaller doesn't help: it already fits in roughly one TCP packet, and a packet costs the same trip however full it is. Also, decompression (decode) time for huge images can outweigh delivery savings.
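
The packet floor is simple arithmetic. The sketch below assumes a 1460-byte maximum segment size, a common value on Ethernet paths with IPv4 and no TCP options. It ignores HTTP headers and TLS framing, and `packets_needed` is a hypothetical name.

```python
import math


def packets_needed(payload_bytes: int, mss_bytes: int = 1460) -> int:
    """Number of TCP segments needed to carry a payload (at least one)."""
    return max(1, math.ceil(payload_bytes / mss_bytes))


for size in (600, 1200, 1461, 14_600):
    print(size, "bytes ->", packets_needed(size), "packet(s)")
```

Shrinking a 1200-byte icon to 600 bytes still costs one packet, while crossing a segment boundary adds one.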


Section 10: Caching

"Why do you keep sending me that logo?!" — Your browser

Cache-Control Directives

Directive	Meaning	Use Case
`public`	Any cache can store	Static assets
`private`	Browser only	Personalized content
`max-age=N`	Fresh for N seconds	`31536000` (1 year) for versioned assets
`no-cache`	Revalidate before use	Changeable content
`no-store`	Don't cache at all	Sensitive data
`immutable`	Never changes	Content-hashed assets
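
One way to apply the table is a single policy function that maps a resource to its header value. The sketch below is illustrative only: `cache_control_for`, the hashed-filename pattern, and the one-hour fallback are assumptions made here, not a recommended universal policy.

```python
import re

# Matches content-hashed names such as app.3f2a1b.js (assumed convention).
_HASHED = re.compile(r"\.[0-9a-f]{6,}\.\w+$")


def cache_control_for(path: str, personalized: bool = False,
                      sensitive: bool = False) -> str:
    """Pick a Cache-Control value using the directives in the table."""
    if sensitive:
        return "no-store"                     # never written to any cache
    if personalized:
        return "private, no-cache"            # browser only, revalidate
    if _HASHED.search(path):
        return "public, max-age=31536000, immutable"  # 1 year, URL changes on edit
    if path.endswith("/") or path.endswith(".html"):
        return "no-cache"                     # base HTML: always revalidate
    return "public, max-age=3600"             # hypothetical default: 1 hour


for p in ("/static/app.3f2a1b.js", "/index.html", "/img/logo.png"):
    print(p, "->", cache_control_for(p))
```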

Validation: ETags & 304

Time-based: max-age — browser doesn't ask at all until expired
Content-based: ETag + If-None-Match → 304 Not Modified (no body)
304 saves bandwidth but still incurs a round-trip
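
The content-based flow can be sketched as a tiny request handler. It is simplified on purpose: it handles a single strong ETag and ignores weak validators and comma-separated `If-None-Match` lists. `make_etag` and `respond` are hypothetical names.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted validator from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes,
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); 304 carries no body."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""
    return 200, headers, body


logo = b"...logo bytes..."
status1, headers1, body1 = respond(logo, None)              # first visit
status2, headers2, body2 = respond(logo, headers1["ETag"])  # revalidation
print(status1, len(body1), status2, len(body2))
```

The second exchange still costs a round trip, but the body never crosses the wire again.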


Cache Busting, Vary & Layers

Cache Busting

URL = cache key. To invalidate, change the URL:

Content hash: app.3f2a1b.js (best practice)
Version: logo-v2.gif
Query string: logo.jpg?ts=324243 (quick & dirty)
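
Content-hash names are normally produced by the build tool, but the idea fits in a few lines. This is a minimal sketch, assuming SHA-256 and a 6-character prefix to match the `app.3f2a1b.js` shape. `hashed_name` is a hypothetical helper.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 6) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


v1 = hashed_name("js/app.js", b"a")
v2 = hashed_name("js/app.js", b"b")
print(v1, v2)
```

Because the name is derived from the bytes, an unchanged file keeps its URL and stays cached, while any edit yields a new URL automatically.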

Vary Header

One URL, multiple versions (compressed/uncompressed, WebP/JPEG, en/es). Vary tells caches which request headers create distinct versions. The cache key becomes the URL plus the values of the headers named in Vary.
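
How the key is assembled can be sketched as follows. This is an illustration under simplifying assumptions: real caches also normalize header values, which this sketch does not, and `cache_key` is a hypothetical name.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Combine the URL with the request headers that Vary names."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


gzip_key = cache_key("/app.js", {"Accept-Encoding": "gzip", "User-Agent": "A"},
                     "Accept-Encoding")
br_key = cache_key("/app.js", {"Accept-Encoding": "br", "User-Agent": "A"},
                   "Accept-Encoding")
other_ua = cache_key("/app.js", {"Accept-Encoding": "gzip", "User-Agent": "B"},
                     "Accept-Encoding")
print(gzip_key, br_key)
```

Only headers listed in Vary split the cache, which is why varying on something like `User-Agent` fragments it badly.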

Caching Layers

Browser → Proxy → CDN edge → Reverse proxy at origin

Caching is the most effective latency elimination technique. A cached resource has zero network latency.


Section 11: Latency Reduction: CDNs and DNS

Move content closer to users.

CDNs & DNS Optimization

User (Tokyo)
    |
[CDN Edge - Tokyo]      ← Cache hit: 15ms
    |
    | Cache miss?
    v
[Origin - San Diego]    ← Full round-trip: 180ms
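
The benefit of an edge depends on how often it hits. Below is a small worked sketch using the two figures from the diagram (15 ms on a hit, 180 ms on a miss). The hit ratios tried and the name `expected_latency_ms` are assumptions for illustration.

```python
def expected_latency_ms(hit_ratio: float, edge_ms: float = 15.0,
                        origin_ms: float = 180.0) -> float:
    """Average latency when a fraction hit_ratio of requests hit the edge."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * edge_ms + (1.0 - hit_ratio) * origin_ms


for ratio in (0.0, 0.5, 0.9, 1.0):
    print(f"hit ratio {ratio:.0%}: {expected_latency_ms(ratio):.1f} ms")
```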

DNS Tips

Use a DNS vendor (Cloudflare, Google, Route 53) + own backup
Short TTLs = faster failover; Long TTLs = fewer lookups
Short domain names, handle typos (wwwamazon.com)
It's a running joke: when something's broken, it's probably DNS

DNS centralization is a double-edged sword. Speed from centralized DNS, but concentration creates single points of failure. The 2021 outages took large portions of the internet offline.

Pay for Performance: Cloud vendors offer latency improvement at cost. The faster you want to go, the more you pay — just like a car! Don't pick based on brand loyalty alone.


Section 12: Preloading, Prefetch & Demand-Driven Loading

Send the bytes when you need them — no earlier, no later.

Resource Hints

Very little work for potentially quite a lot of gain:

Hint	Purpose	When
preload	Fetch resource needed for current page but discovered late	Critical fonts, hero images
prefetch	Fetch resource likely needed for next navigation	Next-page assets
preconnect	Resolve DNS and establish TCP+TLS early	CDN, analytics, font origins
dns-prefetch	Resolve DNS only (lighter)	Domains you might need

Lazy Loading

<img src="photo.jpg" loading="lazy" alt="Below the fold">
<img src="hero.jpg" loading="eager" alt="Hero image">

Native standard — no JavaScript needed. If browsers support it, go native.


Code Splitting & HTTP/2 Implications

Don't send a monolithic bundle. Split by route — homepage gets homepage.js, checkout gets checkout.js.

Optimization techniques have a shelf life. Bundling was essential under HTTP/1.1, where browsers open only about 6 connections per host. Under HTTP/2, it can hurt performance by preventing fine-grained caching. Just as modem-era patterns made the 2010s web worse, 2010s patterns can make the 2020s web worse. Beware of expired "best practices," especially those baked into LLM training data.


Section 13: Service Workers & PWAs

Make the network a controlled — and even optional — component.

Cache Strategies

Strategy	How It Works	Best For
Cache-first	Serve from cache; fetch on miss	Static assets (CSS, JS, images)
Network-first	Try network; fall back to cache	Dynamic content, APIs
Stale-while-revalidate	Serve stale immediately; update in background	Feeds, dashboards

Page              Service Worker        Cache            Network
 |                     |                  |                 |
 |-- fetch(/api) ----->|                  |                 |
 |                     |-- check cache -->|                 |
 |                     |<-- cache hit ----|                 |
 |<-- cached data -----|                  |                 |
 |                     |-- bg fetch ------|---------------->|
 |                     |<-- fresh --------|<----------------|
 |                     |-- update cache ->|                 |
[stale-while-revalidate: instant response, silent update]
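
The stale-while-revalidate flow can be simulated without a browser. The sketch below is a language-neutral model in Python, not Service Worker code: the class name, the injected `fetch` callable, and the explicit `run_background_updates` step (standing in for work done after the response is returned) are all assumptions.

```python
from typing import Callable, Dict, List


class StaleWhileRevalidateCache:
    """Serve cached data immediately; refresh it afterwards."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                  # stand-in for the network
        self._store: Dict[str, str] = {}
        self._pending: List[str] = []        # URLs awaiting a silent refresh

    def get(self, url: str) -> str:
        if url in self._store:
            self._pending.append(url)        # schedule the background fetch
            return self._store[url]          # instant, possibly stale
        value = self._fetch(url)             # cache miss: wait for network
        self._store[url] = value
        return value

    def run_background_updates(self) -> None:
        while self._pending:
            url = self._pending.pop(0)
            self._store[url] = self._fetch(url)


# Fake network that returns a newer version on every call.
versions = iter(["v1", "v2", "v3"])
cache = StaleWhileRevalidateCache(lambda url: next(versions))

first = cache.get("/api")        # miss: fetched from the network
second = cache.get("/api")       # hit: stale copy served at once
cache.run_background_updates()   # silent update happens afterwards
third = cache.get("/api")        # hit: now the refreshed copy
print(first, second, third)
```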

Service Workers are the logical endpoint of the performance progression. The network should be considered a progressive enhancement target — this observation is behind the "local first" pattern of software.


Section 14: Monitoring, Analytics & RAIL Conformance

Performance optimization without measurement is guessing.

What to Monitor

Metric	What It Measures	Target
LCP	Largest visible element renders	≤ 2.5s
INP	Responsiveness to interactions	≤ 200ms
CLS	Visual stability	≤ 0.1
TTFB	Server + network responsiveness	≤ 800ms
FCP	First visible content	≤ 1.8s
TTI	Reliably interactive	≤ 3.8s

Testing Tools

WebPageTest — real browser testing from global locations
Chrome DevTools / Lighthouse — lab-based auditing
Throttle testing — simulate slow network, mid-range devices

RUM reveals reality — and it will humble you. Lab tests on developer machines don't capture the 90th percentile user. See the OpenTelemetry Overview for building a telemetry pipeline.
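
The gap between a typical user and the tail is easy to see once RUM samples are compared against the targets by percentile. This is a minimal sketch with made-up samples: the targets come from the table above, while the nearest-rank percentile, the sample values, and the names `percentile` and `failing_metrics` are assumptions.

```python
import math
from typing import Dict, List, Sequence

# Targets in milliseconds, from the monitoring table.
TARGETS_MS = {"LCP": 2500, "INP": 200, "TTFB": 800, "FCP": 1800}


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def failing_metrics(rum: Dict[str, List[float]], pct: float = 90) -> List[str]:
    """Metrics whose chosen percentile exceeds the target."""
    return sorted(m for m, s in rum.items() if percentile(s, pct) > TARGETS_MS[m])


# Hypothetical field data, one value per page view.
rum_samples = {
    "LCP": [1200, 1500, 1800, 2000, 2100, 2200, 2300, 2400, 2600, 4000],
    "INP": [50, 80, 90, 100, 120, 130, 150, 160, 180, 190],
}

print("median LCP:", percentile(rum_samples["LCP"], 50))
print("p90 LCP:", percentile(rum_samples["LCP"], 90))
print("failing at p90:", failing_metrics(rum_samples, 90))
```

With these samples the median LCP looks healthy while the 90th percentile misses the target, which is the kind of result a lab test on a developer machine tends to hide.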


Section 15: Interface Illusion & the Perception Stack

Perceived speed and actual speed are different things.

When Engineering Runs Out

Keep people busy and occupied — from waiters bringing drinks early to Disneyland engaging you in a long line. The web has its own techniques:

Preloaders & spinners: Set expectations, signal progress
Skeleton screens: Show page structure before content arrives
Progress bars: Determinate bars feel psychologically faster than spinners
Optimistic UI: Show result before server confirms ("sent" before acknowledged)

Load Time (ms):  0 ---- 500 ---- 1000 ---- 1500 ---- 2000 ---- 2500
WSOD:            [          blank white screen          ] [content!]
                 User perceives: "slow, broken?"
Skeleton:        [skeleton] [partial] [    content fills in    ]
                 User perceives: "fast, responsive"
Same total load time. Different user experience.

The best performance work addresses both actual speed and perceived speed. Engineering handles the first; design handles the second. Neither alone is sufficient. But if bytes = money, no amount of UI/UX stage craft changes the cost.


Performance/Security Tensions & Privacy

Inlining CSS/JS helps performance but conflicts with CSP
Caching sensitive data requires careful private / no-store
Short URI paths add some scraping resistance but also maintenance complexity

Browser private mode isn't as private as users think. The browser forgets, but DNS caches, ISP logs, transit proxies, and server logs do not. Device fingerprinting goes beyond cookies. The technology isn't the problem — it's what is done with the data.


Summary: Key Takeaways

"Send less data, less often, from nearby, when it is needed."

Web Performance at a Glance

Concept	Key Takeaway
Why It Matters	Speed is the baseline expectation. Users abandon slow sites. Developer perceptions lie.
Bandwidth vs. Latency	Latency is the real enemy. Scale ≠ Speed.
RAIL	R < 100ms, A < 10ms, I < 50ms, L < 1s. ~90% is client-side.
Content Selection	Fastest byte = one you never send. JS costs far more than images.
Minification	Code for maintenance, prepare for delivery. Minify first, then compress.
Compression	gzip/Brotli: up to 70% savings. Server config, not code change.
Caching	Zero-latency for cached resources. Versioned filenames + long max-age.
CDNs & DNS	Move content closer. DNS optimization prevents bottlenecks.
Loading Strategy	Preload now, prefetch next, lazy-load below fold. HTTP/2 changes the calculus.
Service Workers	Make the network optional. Cache-first / network-first / stale-while-revalidate.
Monitoring	RUM reveals reality. Core Web Vitals. RAIL is a spectrum, not binary.
Interface Illusion	Perceived ≠ actual speed. Skeleton screens and optimistic UI are legitimate tools.
