Web Servers Overview

The Gatekeepers of the Web

"Web servers are the gatekeepers between clients and your application. They handle HTTP, serve files, route requests, terminate TLS, and forward traffic to application servers. Before your code ever runs, the web server has already done the heavy lifting."

CSE 135 — Full Overview

Section 1. What is a Web Server?

Software that listens for HTTP requests and sends responses — the process between the network and your code.

DNS → TCP → TLS → HTTP

Browser                        DNS Server              Web Server
   │                               │                       │
   │ 1. "What IP is example.com?"  │                       │
   │ ─────────────────────────────▸│                       │
   │        "93.184.216.34"        │                       │
   │ ◂─────────────────────────────│                       │
   │                                                       │
   │ 2. TCP connect to 93.184.216.34:443                   │
   │ ─────────────────────────────────────────────────────▸│
   │ 3. TLS handshake                                      │
   │ ◂────────────────────────────────────────────────────▸│
   │ 4. GET /index.html HTTP/1.1                           │
   │ ─────────────────────────────────────────────────────▸│
   │ 5. HTTP/1.1 200 OK                                    │
   │ ◂─────────────────────────────────────────────────────│
  • Port 80 — HTTP (unencrypted)  |  Port 443 — HTTPS (encrypted)
  • Ports below 1024 are privileged — server starts as root, then drops to www-data
  • Static content: read file, send it  |  Dynamic content: run code, build response

Section 2. Common Web Servers

Apache, Nginx, Node.js, and Caddy — four servers with distinct architectures.

Server Comparison

| Server  | Architecture                 | Config Style           | Strengths                        | Best For                          |
|---------|------------------------------|------------------------|----------------------------------|-----------------------------------|
| Apache  | Process/thread MPMs          | .htaccess + httpd.conf | Module ecosystem, per-dir config | Traditional / shared hosting      |
| Nginx   | Event-driven                 | nginx.conf blocks      | High concurrency, reverse proxy  | Modern deployments, LB            |
| Node.js | Event loop (single-threaded) | Programmatic JS        | Full-stack JS, custom logic      | API servers, real-time apps       |
| Caddy   | Event-driven                 | Caddyfile (minimal)    | Automatic HTTPS, zero-config TLS | Simple deploys, personal projects |
In production, you often use two servers together. Nginx handles TLS, static files, compression, and load balancing. Node.js (or PHP-FPM) handles your application logic. Best of both worlds.

Section 3. Server Architecture Patterns

How do you handle 10,000 simultaneous connections? Three models, each with trade-offs.

Process-per-Request vs Event-Driven

Process-per-Request (Apache Prefork)

Master Process (httpd)
   │
   ├─ fork() ─▸ Child 1 ─▸ Req A
   ├─ fork() ─▸ Child 2 ─▸ Req B
   ├─ fork() ─▸ Child 3 ─▸ Req C
   └─ fork() ─▸ Child 4 ─▸ Req D

Each child: ~10MB, full isolation

Event-Driven (Nginx, Node.js)

Single Thread (Event Loop)
   │
   │  ┌──────────────────────────────┐
   │  │ Event Queue                  │
   │  │ [Conn A] [Data B] [Write C]  │
   │  └──────────────────────────────┘
   │
   │  1. Pick event    2. Handle it
   │  3. If I/O, register callback
   │  4. Repeat — never block

Architecture Comparison & C10K

| Architecture        | Memory per 10K Connections | Isolation            | Blocking Tolerance          | Example        |
|---------------------|----------------------------|----------------------|-----------------------------|----------------|
| Process-per-request | ~100 GB                    | Complete             | High (each independent)     | Apache prefork |
| Thread-per-request  | ~10 GB                     | Partial (shared mem) | High (each independent)     | Apache worker  |
| Event-driven        | ~100 MB                    | None (single thread) | Zero (one block stalls all) | Nginx, Node.js |
The C10K Problem (1999): How to handle 10,000 simultaneous connections on one server? Process-per-request couldn't do it. Event-driven architecture, enabled by epoll (Linux) and kqueue (BSD/macOS), was the answer.
A single blocking operation stalls ALL connections in event-driven servers. This is why Node.js uses async I/O for everything — never run fs.readFileSync() in a request handler.
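That stall is easy to demonstrate in Node.js itself. In this sketch, the 50 ms busy-wait stands in for any synchronous call such as fs.readFileSync:

```javascript
// One blocked event loop delays every queued callback behind it.
const order = [];

setImmediate(() => order.push('queued callback'));

// Simulated blocking work on the main thread:
const start = Date.now();
while (Date.now() - start < 50) { /* event loop is stuck here */ }
order.push('blocking work done');

setImmediate(() => {
  order.push('later callback');
  // The queued callback could only run after the blocking work finished.
  console.log(order);
});
```

Even though the first callback was queued before the blocking loop started, it cannot run until the synchronous code yields control.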

Section 4. Configuration: Virtual Hosts

One server, many domains — the Host header decides which site to serve.

Virtual Hosting

                      DNS                       Server (93.184.216.34)
example.com      ───▸ 93.184.216.34 ──────────▸│
blog.example.com ───▸ 93.184.216.34 ──────────▸│  Host header routing:
api.example.com  ───▸ 93.184.216.34 ──────────▸│
                                               │  Host: example.com      → /var/www/example/
                                               │  Host: blog.example.com → /var/www/blog/
                                               │  Host: api.example.com  → proxy to localhost:3000

Nginx vs Apache Config

# Nginx server block
server {
    listen 80;
    server_name example.com www.example.com;
    root /var/www/example/public_html;
    index index.html;
}

# Apache VirtualHost
<VirtualHost *:80>
    ServerName example.com
    ServerAlias www.example.com
    DocumentRoot /var/www/example/public_html
</VirtualHost>
Always reload, never restart. nginx -s reload applies config changes without dropping active connections. A restart kills every active download, WebSocket, and in-flight API call.

Section 5. Routing and URL Handling

How the server matches a URL to a response — location priority, redirects, and rewrites.

Nginx Location Priority & Redirects vs Rewrites

Nginx Location Priority (highest to lowest):

1. Exact match           location = /favicon.ico { ... }
2. Preferential prefix   location ^~ /static/ { ... }
3. Regex                 location ~* \.(css|js)$ { ... }
4. Prefix (longest)      location /api/ { ... }
5. Default               location / { ... }

SPA fallback:
location / {
    try_files $uri $uri/ /index.html;  # Let JS handle routing
}
Rewrites happen internally — the client never knows. Redirects send a 301/302 back to the client, causing a new request. Use rewrites for clean URLs (serving /products/42 internally as /products.php?id=42). Use redirects when the URL has actually moved.
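The distinction as two plain functions (a sketch, not Nginx's rewrite engine; URLs match the examples above):

```javascript
// Rewrite: change which resource is served; the client never sees it.
function rewrite(url) {
  const m = url.match(/^\/products\/(\d+)$/);
  return m ? `/products.php?id=${m[1]}` : url;
}

// Redirect: answer 301/302 with a Location header; the client re-requests.
function redirect(url) {
  if (url === '/old-page') {
    return { status: 301, headers: { Location: '/new-page' } };
  }
  return null;  // no redirect; continue normal handling
}
```

A rewrite returns new internal routing input; a redirect returns a complete response that ends this request.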

Section 6. Static File Serving

sendfile() zero-copy, compression, caching headers, and cache busting strategies.

sendfile() Zero-Copy & Caching

Traditional serving (4 copies):
Disk → Kernel buffer → User buffer → Kernel buffer → Network socket

sendfile() zero-copy (2 copies):
Disk → Kernel buffer ────────────────────────▸ Network socket

User-space is never involved. The kernel handles the entire transfer.

Compression & Cache Busting

| Algorithm | Compression    | Speed                            | Browser Support     |
|-----------|----------------|----------------------------------|---------------------|
| Gzip      | ~70% reduction | Fast                             | Universal           |
| Brotli    | ~80% reduction | Slower compress, fast decompress | Modern (HTTPS only) |
  • Hashed filenames (best): app.a1b2c3.js — cache forever with immutable
  • Query strings: style.css?v=42 — simple but some CDNs ignore query strings
  • HTML files: use no-cache so they always revalidate (they reference the hashed assets)

Section 7. Reverse Proxying

Nginx sits between clients and your app servers — TLS, static files, load balancing, and security isolation.

Reverse Proxy Architecture

                                         ┌──────────────────┐
Client ───▸ Nginx (Reverse Proxy) ──┬───▸│ App Server 1     │
            :443                    │    │ (Node.js :3001)  │
                                    │    └──────────────────┘
            Handles:                │    ┌──────────────────┐
            - TLS termination       ├───▸│ App Server 2     │
            - Static files          │    │ (Node.js :3002)  │
            - Compression           │    └──────────────────┘
            - Load balancing        │    ┌──────────────────┐
                                    └───▸│ App Server 3     │
Client ──────✗ (no direct access         │ (Node.js :3003)  │
               to app servers)           └──────────────────┘

Header Forwarding

| Header            | Purpose                        | Example                |
|-------------------|--------------------------------|------------------------|
| X-Real-IP         | Client's actual IP             | 203.0.113.42           |
| X-Forwarded-For   | Chain of IPs (client, proxies) | 203.0.113.42, 10.0.0.1 |
| X-Forwarded-Proto | Original protocol              | https                  |
| Host              | Pass through original Host     | example.com            |
Always forward the real client IP via X-Forwarded-For. Without it, your app sees the proxy's IP for every request — breaking rate limiting, geolocation, and access logs.
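On the application side, recovering that IP is a one-liner. A sketch (only trust this header when the request really arrived via your own proxy, since clients can forge it):

```javascript
// X-Forwarded-For is "client, proxy1, proxy2, ...":
// the leftmost entry is the original client.
function clientIp(headers, socketAddr) {
  const xff = headers['x-forwarded-for'];
  if (!xff) return socketAddr;      // direct connection, no proxy
  return xff.split(',')[0].trim();  // first hop = real client
}
```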

Load Balancing Strategies

| Method            | How It Works                                  | Best For                              |
|-------------------|-----------------------------------------------|---------------------------------------|
| Round-robin       | Cycle through servers in order                | Equal-capacity servers, stateless apps |
| Least connections | Send to server with fewest active connections | Uneven request durations              |
| IP hash           | Same client IP always hits same server        | Session affinity (sticky sessions)    |
| Weighted          | Traffic proportional to server weight         | Mixed-capacity servers                |
# Nginx upstream with load balancing
upstream app_servers {
    least_conn;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003 weight=2;   # Gets 2x traffic
    server 127.0.0.1:3004 backup;     # Only if others are down
}
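Round-robin, the simplest strategy in the table, fits in a closure. A pure-function sketch with the same upstream addresses (not how Nginx implements it internally):

```javascript
// Cycle through the upstream list, one server per call.
function makeRoundRobin(servers) {
  let i = 0;
  return () => servers[i++ % servers.length];
}

const next = makeRoundRobin(['127.0.0.1:3001', '127.0.0.1:3002', '127.0.0.1:3003']);
```

Least-connections and IP hash replace the counter with live connection counts or a hash of the client address, but the shape is the same: a function from state to the next upstream.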

Section 8. TLS/HTTPS Configuration

Certificate chains, Let's Encrypt, and common TLS errors.

Certificate Chain & Common Errors

Certificate Chain of Trust:

┌────────────────────────────────┐
│ Root CA Certificate            │ ← Built into browsers/OS
│ (DigiCert, Let's Encrypt)      │   (pre-trusted, self-signed)
└───────────────┬────────────────┘
                │ signs
┌───────────────▼────────────────┐
│ Intermediate CA Certificate    │ ← Signed by root CA
│ (Let's Encrypt R3)             │   (server must send this)
└───────────────┬────────────────┘
                │ signs
┌───────────────▼────────────────┐
│ Server Certificate             │ ← Your certificate
│ (example.com)                  │   (contains your public key)
└────────────────────────────────┘
| Error                              | Likely Cause               | Fix                             |
|------------------------------------|----------------------------|---------------------------------|
| ERR_CERT_DATE_INVALID              | Certificate expired        | certbot renew                   |
| ERR_CERT_COMMON_NAME_INVALID       | Cert doesn't match domain  | Reissue with correct domain(s)  |
| ERR_CERT_AUTHORITY_INVALID         | Missing intermediate cert  | Use fullchain.pem, not cert.pem |
| ERR_SSL_VERSION_OR_CIPHER_MISMATCH | TLS version/cipher mismatch| Enable TLS 1.2 + modern ciphers |

Section 9. Logging and Monitoring

Access logs (what happened), error logs (what broke), and correlation IDs (connecting the dots).

Access Log Anatomy

93.184.216.34 - jane [10/Oct/2025:13:55:36 -0700] "GET /api/books HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"

  93.184.216.34                  → Client IP
  jane                           → User (auth)
  [10/Oct/2025:13:55:36 -0700]   → Timestamp
  "GET /api/books HTTP/1.1"      → Request line
  200                            → Status code
  2326                           → Size (bytes)
  "https://example.com/"         → Referer
  "Mozilla/5.0"                  → User-Agent

Log Formats

| Format       | Fields                                | Best For                      | Machine-Parseable? |
|--------------|---------------------------------------|-------------------------------|--------------------|
| Common (CLF) | IP, user, time, request, status, size | Basic / small sites           | Somewhat (regex)   |
| Combined     | CLF + Referer + User-Agent            | General purpose               | Somewhat (regex)   |
| JSON         | Structured key-value pairs            | Log aggregation (ELK, Splunk) | Yes                |
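"Somewhat (regex)" looks like this in practice. A sketch of parsing one Combined-format line; the regex is a workable approximation, not a complete grammar, which is exactly why JSON logs are preferred for aggregation:

```javascript
// Groups: ip, ident, user, time, request, status, size, referer, user-agent.
const COMBINED = /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/;

function parseLogLine(line) {
  const m = COMBINED.exec(line);
  if (!m) return null;
  const [, ip, , user, time, request, status, size, referer, userAgent] = m;
  return {
    ip, user, time, request, referer, userAgent,
    status: Number(status),
    size: size === '-' ? 0 : Number(size),
  };
}
```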

Correlation IDs & Key Metrics

Request:       GET /api/orders/42    X-Request-ID: req-abc-123

Nginx log:     [req-abc-123] GET /api/orders/42 → 200
                     │
                     ▼
App log:       [req-abc-123] Fetching order 42 from database
                     │
                     ▼
Database log:  [req-abc-123] SELECT * FROM orders WHERE id = 42 (3ms)
| Metric             | What It Measures              | Warning Sign             |
|--------------------|-------------------------------|--------------------------|
| Request rate       | Requests per second           | Sudden spikes or drops   |
| P95 latency        | 95th percentile response time | Increasing trend         |
| Error rate (5xx)   | % of server errors            | Above 1%                 |
| Active connections | Current open connections      | Near configured maximum  |
Start with access logs + error logs. Add metrics next. Only add distributed tracing when you have multiple services that need correlation.
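P95 itself is simple to compute from raw response times. A sketch using the nearest-rank method; monitoring systems may interpolate slightly differently:

```javascript
// Sort the samples, take the value at the p-th percentile rank.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}
```

P95 is preferred over the average because a handful of very slow requests can hide behind a healthy-looking mean while still hurting 1 in 20 users.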

Section 10. Security Hardening

Defense in depth: rate limiting, security headers, and principle of least privilege.

Security Headers & Checklist

| Header                    | Purpose                  | Recommended Value                      |
|---------------------------|--------------------------|----------------------------------------|
| Strict-Transport-Security | Force HTTPS              | max-age=31536000; includeSubDomains    |
| Content-Security-Policy   | Control resource loading | default-src 'self'; script-src 'self'  |
| X-Content-Type-Options    | Prevent MIME sniffing    | nosniff                                |
| X-Frame-Options           | Prevent clickjacking     | DENY or SAMEORIGIN                     |
| Referrer-Policy           | Control Referer leakage  | strict-origin-when-cross-origin        |
| Permissions-Policy        | Restrict browser features| camera=(), microphone=()               |
Security through obscurity is not a defense strategy. Hiding your server version (server_tokens off) is one thin layer. Always combine with real protections: TLS, rate limiting, security headers, proper access controls.
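In the hybrid setup, Nginx add_header directives would set these. If raw Node.js must do it, a sketch (values copied from the table above):

```javascript
// Defaults drawn from the table; per-response headers may override.
const SECURITY_HEADERS = {
  'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
  'Content-Security-Policy': "default-src 'self'; script-src 'self'",
  'X-Content-Type-Options': 'nosniff',
  'X-Frame-Options': 'DENY',
  'Referrer-Policy': 'strict-origin-when-cross-origin',
  'Permissions-Policy': 'camera=(), microphone=()',
};

function withSecurityHeaders(headers = {}) {
  return { ...SECURITY_HEADERS, ...headers };
}
```

Centralizing the defaults in one place means every response gets them, and a route that genuinely needs a looser policy overrides explicitly rather than by omission.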

Section 11. Performance Tuning

Identify the bottleneck first, then tune. A fast Nginx won't help if your database is slow.

Key Tunables & Benchmarking

| Tunable                 | Default | Recommended        | Why                           |
|-------------------------|---------|--------------------|-------------------------------|
| worker_processes        | 1       | auto (= CPU cores) | One worker per core           |
| worker_connections      | 512     | 1024–4096          | Max connections per worker    |
| keepalive_timeout       | 75s     | 30–65s             | Reuse vs freeing resources    |
| client_body_buffer_size | 8k/16k  | 16k–128k           | Avoid writing body to disk    |
| gzip_comp_level         | 1       | 4–6                | Compression vs CPU trade-off  |

Benchmarking Tools

  • ab (Apache Bench): ab -n 1000 -c 100 http://localhost/
  • wrk: wrk -t4 -c100 -d30s http://localhost/
  • k6: Scriptable load testing with JavaScript (realistic scenarios)
Profile your application before tuning the server. Use $request_time in access logs to find slow endpoints, then optimize the actual bottleneck.

Section 12. Common Issues and Debugging

Reproduce, isolate, read logs, hypothesize, test. 90% of issues are in the error log.

502 Bad Gateway & 504 Gateway Timeout

Debugging 502 Bad Gateway (proxy can't reach backend):

1. Is the backend running?   systemctl status your-app
2. Right port?               ss -tlnp | grep :3000
3. Can you reach it?         curl -v http://127.0.0.1:3000/
4. Error log says what?      tail -20 /var/log/nginx/error.log
   Look for: "connect() failed (111: Connection refused)"

504 Gateway Timeout (backend too slow):
  • Slow database query
  • External API timeout
  • Insufficient proxy_read_timeout

Other Common Errors

| Error                    | Likely Cause                               | First Step               |
|--------------------------|--------------------------------------------|--------------------------|
| 403 Forbidden            | File permissions, no index.html, SELinux   | ls -la /var/www/         |
| 404 Not Found            | Wrong root, case mismatch, missing rewrite | Check document root path |
| "Address already in use" | Another process on the port                | ss -tlnp \| grep :80     |

Essential Debugging Tools

| Tool       | Purpose                            | Example                          |
|------------|------------------------------------|----------------------------------|
| curl -v    | See full HTTP request/response     | curl -v http://localhost/        |
| ss -tlnp   | Show listening ports and processes | ss -tlnp \| grep :80             |
| lsof -i    | List open network connections      | lsof -i :3000                    |
| Error logs | Server-reported issues             | tail -f /var/log/nginx/error.log |
| nginx -t   | Validate config before applying    | nginx -t && nginx -s reload      |
Never restart a production server to debug it. Use reload to apply config changes gracefully. Restarting drops all active connections — every download, WebSocket, and in-flight request.

Section 13. The Node.js Server Model

The application is the server — full control, full responsibility.

Three Models Compared

Traditional (Apache + PHP):
  Apache (web server) ──▸ PHP (runs per request, dies after response)
  Apache handles: ports, TLS, static files, connections, logging

Raw Node.js (app IS the server):
  Node.js handles EVERYTHING: ports, HTTP parsing, routing,
  static files, responses — process runs continuously

Hybrid (production — Nginx + Node.js):
  Nginx (reverse proxy) ──▸ Node.js (business logic only)
  Nginx handles: TLS, static     Node.js handles: API, routing,
  files, compression, rate       database queries
  limiting, LB
| Responsibility      | Apache/Nginx         | Raw Node.js  | Hybrid     |
|---------------------|----------------------|--------------|------------|
| Static file serving | Built-in, optimized  | Your code    | Nginx      |
| TLS termination     | Built-in             | Your code    | Nginx      |
| Rate limiting       | Config directive     | Your code    | Nginx      |
| Crash recovery      | Auto-restart workers | Process dies | PM2/systemd|
| Compression         | Config directive     | Your code    | Nginx      |
Never run raw Node.js facing the internet in production. Put Nginx in front. Node.js is single-threaded — one unhandled exception takes down the entire server and every active connection.

Section 14. Serverless and Edge Computing

"Serverless" means you don't manage servers — your code runs in short-lived containers managed by a provider.

Trade-offs & When It Fits

| Aspect                 | Traditional Server   | Serverless                        |
|------------------------|----------------------|-----------------------------------|
| Scaling                | Manual (add servers) | Automatic (per request)           |
| Cost model             | Pay for uptime (24/7)| Pay per invocation (idle = free)  |
| Cold starts            | None                 | 100ms–2s first request after idle |
| Persistent connections | WebSockets, SSE      | Not supported                     |
| Debugging              | SSH in, check logs   | Cloud-based, limited visibility   |
| Vendor lock-in         | Low                  | High                              |

When to Use / When to Avoid

  • Use: Sporadic traffic, event-driven tasks (webhooks, image processing), prototypes
  • Avoid: Consistent high-throughput, WebSocket apps, latency-sensitive, stateful operations
  • Edge: Code at CDN nodes (Cloudflare Workers, Vercel Edge) — lower latency, more constraints
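The programming-model difference shows in the handler shape: no listening socket, no long-lived process, just a function from request to response. A local sketch of the idea (this mirrors neither Lambda's nor Workers' exact API):

```javascript
// A serverless-style handler: pure request-in, response-out.
function handleEvent(request) {
  if (request.path === '/hello') {
    const name = (request.query && request.query.name) || 'world';
    return { status: 200, body: `Hello, ${name}` };
  }
  return { status: 404, body: 'Not Found' };
}
```

Because nothing persists between invocations, any state must live in external storage, which is why stateful and connection-heavy workloads fit poorly.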

Section 15. Choosing Your Architecture

No single best architecture — start simple, add complexity only when you have a specific problem.

Architecture Decision Table

| Use Case                | Recommended Architecture             | Why                      |
|-------------------------|--------------------------------------|--------------------------|
| Static site / blog      | CDN + object storage (S3, R2)        | No server needed         |
| Traditional web app     | Nginx + PHP/Python/Ruby              | Proven, well-documented  |
| API server              | Nginx + Node.js/Go                   | Event-driven concurrency |
| Real-time (chat, games) | Nginx + Node.js + WebSockets         | Persistent connections   |
| High-scale              | Nginx LB + multiple instances        | Horizontal scaling       |
| Sporadic / event-driven | Serverless (Lambda, Cloud Functions) | Pay only when used       |
Start with the simplest architecture that works. A single Nginx + Node.js server handles more traffic than most apps will ever see. Every layer of abstraction makes debugging harder.

Summary. Key Takeaways

15 sections of web server fundamentals in one table.

Web Servers at a Glance

| Concept              | Key Points                                                                                                       |
|----------------------|------------------------------------------------------------------------------------------------------------------|
| What is a Web Server | Software listening for HTTP requests. Handles TLS, routing, access control, logging before your code runs.         |
| Common Servers       | Apache (process/thread), Nginx (event-driven), Node.js (app is server), Caddy (auto HTTPS). Often paired.          |
| Architecture         | Process-per-request (isolated), thread-per-request (lighter), event-driven (10K+ connections). Hybrid is production answer. |
| Virtual Hosts        | One server, many domains via Host header. Declarative config. Reload, never restart.                               |
| Routing              | Location matching priority. Redirects (301/302) vs rewrites (internal). SPA fallback with try_files.               |
| Static Serving       | sendfile() zero-copy. Gzip/Brotli compression. Hashed filenames for cache busting.                                 |
| Reverse Proxying     | Nginx in front of app servers. Forward X-Forwarded-For. Load balancing: round-robin, least-conn, IP hash.          |
| TLS/HTTPS            | Root CA → Intermediate → Server cert. Let's Encrypt. Use fullchain.pem.                                            |
| Logging              | Access + error logs. Combined format. Correlation IDs. Monitor request rate, P95, error rate.                      |
| Security             | Defense in depth: rate limiting, HSTS, CSP, X-Frame-Options. Obscurity is not defense.                             |
| Performance          | worker_processes auto, worker_connections 1024+. Profile bottleneck first.                                         |
| Debugging            | Error log first. 502 = backend unreachable, 504 = too slow. nginx -t before reload.                                |
| Node.js Model        | App IS the server. Full control, full responsibility. Always put Nginx in front.                                   |
| Serverless           | No server management. Cold starts, no WebSockets. Best for sporadic traffic.                                       |
| Choosing             | Start simple. Single Nginx + app handles most traffic. Add complexity for specific problems.                       |