The Gatekeepers of the Web
"Web servers are the gatekeepers between clients and your application. They handle HTTP, serve files, route requests, terminate TLS, and forward traffic to application servers. Before your code ever runs, the web server has already done the heavy lifting."
CSE 135
Software that listens for HTTP requests and sends responses — the process between the network and your code.
Apache, Nginx, Node.js, and Caddy: four servers with distinct architectures.
| Server | Architecture | Config Style | Strengths | Best For |
|---|---|---|---|---|
| Apache | Process/thread MPMs | .htaccess + httpd.conf | Module ecosystem, per-dir config | Traditional / shared hosting |
| Nginx | Event-driven | nginx.conf blocks | High concurrency, reverse proxy | Modern deployments, LB |
| Node.js | Event loop (single-threaded) | Programmatic JS | Full-stack JS, custom logic | API servers, real-time apps |
| Caddy | Event-driven | Caddyfile (minimal) | Automatic HTTPS, zero-config TLS | Simple deploys, personal projects |
How do you handle 10,000 simultaneous connections? Three models, each with trade-offs.
| Architecture | Memory per 10K Connections | Isolation | Blocking Tolerance | Example |
|---|---|---|---|---|
| Process-per-request | ~100 GB | Complete | High (each independent) | Apache prefork |
| Thread-per-request | ~10 GB | Partial (shared mem) | High (each independent) | Apache worker |
| Event-driven | ~100 MB | None (single thread) | Zero (one block stalls all) | Nginx, Node.js |
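
The totals in the table follow from simple multiplication. The sketch below reproduces them under assumed per-connection footprints (about 10 MB per process, 1 MB per thread, 10 KB per event-driven connection); those figures are order-of-magnitude assumptions chosen to match the table, not measurements.

```javascript
// Rough per-connection footprints implied by the table above.
// These are order-of-magnitude assumptions, not measurements.
const BYTES_PER_CONNECTION = {
  process: 10 * 1000 * 1000, // ~10 MB per process
  thread: 1 * 1000 * 1000,   // ~1 MB per thread (mostly stack)
  event: 10 * 1000,          // ~10 KB of state per connection
};

// Estimate total memory in bytes for a concurrency model and connection count.
function estimateMemoryBytes(model, connections) {
  const perConnection = BYTES_PER_CONNECTION[model];
  if (perConnection === undefined) {
    throw new Error(`unknown model: ${model}`);
  }
  return perConnection * connections;
}

// Format bytes using decimal units (1 GB = 1e9 bytes).
function formatBytes(bytes) {
  if (bytes >= 1e9) return `${bytes / 1e9} GB`;
  if (bytes >= 1e6) return `${bytes / 1e6} MB`;
  return `${bytes / 1e3} KB`;
}

for (const model of Object.keys(BYTES_PER_CONNECTION)) {
  console.log(model, formatBytes(estimateMemoryBytes(model, 10000)));
}
```
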
Event-driven I/O, built on epoll (Linux) and kqueue (BSD/macOS), was the answer.
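Never call a blocking function such as fs.readFileSync() in a request handler: on a single-threaded event loop, one blocked handler stalls every other connection.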
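
To see why a blocking call is so costly on a single thread, here is a toy model of an event loop in which handlers run strictly one after another. The function name and the durations are hypothetical; it is a sketch of the queueing effect, not of Node.js internals.

```javascript
// Model a single-threaded event loop: handlers run one at a time, so each
// request finishes only after every handler queued before it has returned.
function finishTimes(blockingMsPerHandler) {
  let clock = 0;
  return blockingMsPerHandler.map((ms) => {
    clock += ms;
    return clock;
  });
}

// Three quick handlers (1 ms of CPU each).
const allFast = finishTimes([1, 1, 1]);
// Same queue, but the first handler blocks for 500 ms
// (for example, a synchronous disk read).
const oneBlocking = finishTimes([500, 1, 1]);

console.log(allFast);     // [ 1, 2, 3 ]
console.log(oneBlocking); // [ 500, 501, 502 ]
```

In real Node.js code the usual remedy is the asynchronous variant (fs.readFile or fs.promises.readFile), which lets the loop serve other requests while the read is pending.
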
One server, many domains — the Host header decides which site to serve.
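
A minimal sketch of Host-based selection, assuming a hypothetical site table and document roots (none of these paths come from the course material). Real servers express this declaratively in config; the logic is roughly:

```javascript
// Hypothetical site table: hostname -> document root.
const SITES = new Map([
  ['example.com', '/var/www/example'],
  ['blog.example.com', '/var/www/blog'],
]);
const DEFAULT_ROOT = '/var/www/default';

// Pick a document root from the request's Host header.
function selectDocumentRoot(hostHeader) {
  if (typeof hostHeader !== 'string' || hostHeader === '') return DEFAULT_ROOT;
  // Host may carry a port ("example.com:8080") and is case-insensitive.
  const hostname = hostHeader.toLowerCase().replace(/:\d+$/, '');
  return SITES.get(hostname) ?? DEFAULT_ROOT;
}

console.log(selectDocumentRoot('Blog.Example.com:8080'));
console.log(selectDocumentRoot('unknown.test'));
```

Falling back to a default site for unknown or missing Host values mirrors how servers typically designate a default virtual host.
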
nginx -s reload applies config changes without dropping active connections. A restart kills every active download, WebSocket, and in-flight API call.
How the server matches a URL to a response — location priority, redirects, and rewrites.
Use rewrites for internal mappings the client never sees (/products/42 → /products.php?id=42). Use redirects when the URL has actually moved.
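
The difference between a rewrite and a redirect can be sketched as a routing function. The rule tables below are hypothetical, and the /old-about entry is invented purely for illustration.

```javascript
// Hypothetical rules for illustration.
const REWRITES = [
  // Internal: the client keeps seeing /products/42.
  { pattern: /^\/products\/(\d+)$/, target: (m) => `/products.php?id=${m[1]}` },
];
const REDIRECTS = new Map([
  // External: the resource has permanently moved, so tell the client.
  ['/old-about', '/about'],
]);

function route(urlPath) {
  const movedTo = REDIRECTS.get(urlPath);
  if (movedTo !== undefined) {
    return { type: 'redirect', status: 301, location: movedTo };
  }
  for (const rule of REWRITES) {
    const match = rule.pattern.exec(urlPath);
    if (match) return { type: 'rewrite', internalPath: rule.target(match) };
  }
  return { type: 'pass', internalPath: urlPath };
}

console.log(route('/products/42'));
console.log(route('/old-about'));
```

A redirect costs the client an extra round trip and changes the address bar; a rewrite does neither, which is why it suits clean URLs in front of legacy scripts.
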
sendfile() zero-copy, compression, caching headers, and cache busting strategies.
| Algorithm | Compression | Speed | Browser Support |
|---|---|---|---|
| Gzip | ~70% reduction | Fast | Universal |
| Brotli | ~80% reduction | Slower compress, fast decompress | Modern (HTTPS only) |
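
For the caching-header side of this section, the following sketch chooses a Cache-Control value from the URL. The hash pattern (six or more hex characters between two dots) and the one-hour fallback are assumptions made for this example.

```javascript
// Matches a content-hash segment such as "app.a1b2c3.js".
// Assumption: hashes are 6+ hex characters between two dots.
const HASHED_ASSET = /\.[0-9a-f]{6,}\.[a-z0-9]+$/i;

function cacheControlFor(urlPath) {
  if (HASHED_ASSET.test(urlPath)) {
    // The URL changes whenever the content changes, so cache "forever".
    return 'public, max-age=31536000, immutable';
  }
  if (urlPath.endsWith('.html') || urlPath.endsWith('/')) {
    // Entry documents reference the hashed assets, so always revalidate.
    return 'no-cache';
  }
  // Unhashed assets: a short lifetime (an arbitrary choice for this sketch).
  return 'public, max-age=3600';
}

console.log(cacheControlFor('/static/app.a1b2c3.js'));
console.log(cacheControlFor('/index.html'));
```
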
- app.a1b2c3.js: hashed filename; cache forever with immutable
- style.css?v=42: query-string version; simple, but some CDNs ignore query strings
- HTML pages: serve with no-cache so they always revalidate (they reference the hashed assets)

Nginx sits between clients and your app servers: TLS, static files, load balancing, and security isolation.
| Header | Purpose | Example |
|---|---|---|
| X-Real-IP | Client's actual IP | 203.0.113.42 |
| X-Forwarded-For | Chain of IPs (client, proxies) | 203.0.113.42, 10.0.0.1 |
| X-Forwarded-Proto | Original protocol | https |
| Host | Pass through original Host | example.com |
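
As an illustration of how a proxy might assemble these headers, here is a small sketch. The function name and argument shape are hypothetical; header keys are lowercase, which is how Node.js exposes incoming headers.

```javascript
// Build the headers a reverse proxy would send upstream.
// incomingHeaders uses lowercase keys, as Node.js does for req.headers.
function buildUpstreamHeaders(incomingHeaders, clientIp, protocol) {
  const prior = incomingHeaders['x-forwarded-for'];
  return {
    host: incomingHeaders.host, // pass the original Host through
    'x-real-ip': clientIp,
    // Append this hop's view of the client to any existing chain.
    'x-forwarded-for': prior ? `${prior}, ${clientIp}` : clientIp,
    'x-forwarded-proto': protocol,
  };
}

console.log(buildUpstreamHeaders({ host: 'example.com' }, '203.0.113.42', 'https'));
```

Because clients can send their own X-Forwarded-For, an application should only trust the chain when the request arrived from a proxy it knows.
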
| Method | How It Works | Best For |
|---|---|---|
| Round-robin | Cycle through servers in order | Equal-capacity servers, stateless apps |
| Least connections | Send to server with fewest active connections | Uneven request durations |
| IP hash | Same client IP always hits same server | Session affinity (sticky sessions) |
| Weighted | Traffic proportional to server weight | Mixed-capacity servers |
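
Three of these methods can be sketched in a few lines each. The pool structure and function names are hypothetical, and the IP hash is a simple string hash chosen for the example, not the algorithm Nginx uses.

```javascript
// Hypothetical backend pool: each entry tracks its active connection count.
function createPool(addresses) {
  return addresses.map((address) => ({ address, active: 0 }));
}

// Round-robin: cycle through servers in order.
function createRoundRobin(pool) {
  let next = 0;
  return () => {
    const server = pool[next];
    next = (next + 1) % pool.length;
    return server;
  };
}

// Least connections: pick the server with the fewest active connections.
function pickLeastConnections(pool) {
  return pool.reduce((best, s) => (s.active < best.active ? s : best));
}

// IP hash: the same client IP always maps to the same server.
function pickByIpHash(pool, clientIp) {
  let hash = 0;
  for (const ch of clientIp) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return pool[hash % pool.length];
}
```

Note that a plain modulo hash remaps most clients whenever the pool size changes, which is one reason sticky sessions are fragile compared with stateless app servers.
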
Certificate chains, Let's Encrypt, and common TLS errors.
| Error | Likely Cause | Fix |
|---|---|---|
| ERR_CERT_DATE_INVALID | Certificate expired | certbot renew |
| ERR_CERT_COMMON_NAME_INVALID | Cert doesn't match domain | Reissue with correct domain(s) |
| ERR_CERT_AUTHORITY_INVALID | Missing intermediate cert | Use fullchain.pem, not cert.pem |
| ERR_SSL_VERSION_OR_CIPHER_MISMATCH | TLS version/cipher mismatch | Enable TLS 1.2 + modern ciphers |
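
A name-mismatch error comes down to comparing the requested hostname with the names listed in the certificate. The sketch below shows that comparison, including the single-label wildcard rule; the function names are hypothetical, and real clients apply further checks that are omitted here.

```javascript
// Does a certificate name (possibly a wildcard) cover the hostname?
function nameMatches(certName, hostname) {
  const pattern = certName.toLowerCase();
  const host = hostname.toLowerCase();
  if (!pattern.startsWith('*.')) return pattern === host;
  // A wildcard covers exactly one left-most label: *.example.com matches
  // www.example.com but not example.com or a.b.example.com.
  const suffix = pattern.slice(1); // ".example.com"
  if (!host.endsWith(suffix)) return false;
  const label = host.slice(0, host.length - suffix.length);
  return label.length > 0 && !label.includes('.');
}

// True if any name on the certificate covers the hostname.
function certCoversHost(subjectAltNames, hostname) {
  return subjectAltNames.some((name) => nameMatches(name, hostname));
}

console.log(certCoversHost(['example.com', '*.example.com'], 'www.example.com'));
console.log(certCoversHost(['*.example.com'], 'example.com'));
```

This is why a certificate issued only for *.example.com still fails on the bare domain: the reissue needs both names.
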
Access logs (what happened), error logs (what broke), and correlation IDs (connecting the dots).
| Format | Fields | Best For | Machine-Parseable? |
|---|---|---|---|
| Common (CLF) | IP, user, time, request, status, size | Basic / small sites | Somewhat (regex) |
| Combined | CLF + Referer + User-Agent | General purpose | Somewhat (regex) |
| JSON | Structured key-value pairs | Log aggregation (ELK, Splunk) | Yes |
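
The "somewhat (regex)" entries can be made concrete. Below is a sketch that parses one Combined-format line into the kind of structured record a JSON log would give directly; the sample line is made up for the example, and the regex assumes no escaped quotes inside fields.

```javascript
// Combined log format: CLF fields plus Referer and User-Agent.
const COMBINED =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/;

function parseCombinedLine(line) {
  const m = COMBINED.exec(line);
  if (!m) return null; // line did not match the expected format
  return {
    ip: m[1],
    user: m[3] === '-' ? null : m[3],
    time: m[4],
    request: m[5],
    status: Number(m[6]),
    bytes: m[7] === '-' ? 0 : Number(m[7]),
    referer: m[8],
    userAgent: m[9],
  };
}

// Made-up sample line for illustration.
const sampleLine =
  '203.0.113.42 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"';
const entry = parseCombinedLine(sampleLine);
console.log(JSON.stringify(entry));
```
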
| Metric | What It Measures | Warning Sign |
|---|---|---|
| Request rate | Requests per second | Sudden spikes or drops |
| P95 latency | 95th percentile response time | Increasing trend |
| Error rate (5xx) | % of server errors | Above 1% |
| Active connections | Current open connections | Near configured maximum |
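
Two of these metrics are easy to compute from parsed log records. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions; monitoring tools may interpolate instead.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Share of responses with a 5xx status, as a percentage.
function errorRatePercent(statuses) {
  if (statuses.length === 0) return 0;
  const errors = statuses.filter((s) => s >= 500 && s <= 599).length;
  return (errors * 100) / statuses.length;
}
```

For example, with latencies of 10, 20, ..., 200 ms (20 samples), percentile(latencies, 95) selects the 19th sorted value, 190 ms.
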
Defense in depth: rate limiting, security headers, and principle of least privilege.
| Header | Purpose | Recommended Value |
|---|---|---|
| Strict-Transport-Security | Force HTTPS | max-age=31536000; includeSubDomains |
| Content-Security-Policy | Control resource loading | default-src 'self'; script-src 'self' |
| X-Content-Type-Options | Prevent MIME sniffing | nosniff |
| X-Frame-Options | Prevent clickjacking | DENY or SAMEORIGIN |
| Referrer-Policy | Control Referer leakage | strict-origin-when-cross-origin |
| Permissions-Policy | Restrict browser features | camera=(), microphone=() |
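
When the application itself sets these headers (for instance in a raw Node.js server), the table translates into a small helper. This is a sketch using the recommended values above, with DENY chosen for X-Frame-Options; the helper name is hypothetical.

```javascript
// Recommended values from the table above.
const SECURITY_HEADERS = {
  'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
  'Content-Security-Policy': "default-src 'self'; script-src 'self'",
  'X-Content-Type-Options': 'nosniff',
  'X-Frame-Options': 'DENY',
  'Referrer-Policy': 'strict-origin-when-cross-origin',
  'Permissions-Policy': 'camera=(), microphone=()',
};

// Works with any object exposing setHeader(name, value),
// such as Node's http.ServerResponse.
function applySecurityHeaders(res) {
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}
```

In a hybrid setup the same values would more commonly be set once at the proxy rather than in every application.
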
Obscurity (for example, hiding the version with server_tokens off) is one thin layer. Always combine it with real protections: TLS, rate limiting, security headers, and proper access controls.
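
Rate limiting, one of the layers named in this section, can be illustrated with a token bucket keyed by client IP. This is a generic sketch with hypothetical names and parameters; it does not reproduce how any particular server implements its limiter.

```javascript
// Token bucket: each client may burst up to `capacity` requests,
// and tokens refill at `refillPerSecond`.
function createRateLimiter(capacity, refillPerSecond) {
  const buckets = new Map(); // clientIp -> { tokens, lastMs }
  return function allow(clientIp, nowMs) {
    const bucket = buckets.get(clientIp) ?? { tokens: capacity, lastMs: nowMs };
    const elapsedSeconds = (nowMs - bucket.lastMs) / 1000;
    bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
    bucket.lastMs = nowMs;
    buckets.set(clientIp, bucket);
    if (bucket.tokens < 1) return false; // caller would answer 429 Too Many Requests
    bucket.tokens -= 1;
    return true;
  };
}

// Usage: allow a burst of 2 requests, refilling 1 token per second.
const allowRequest = createRateLimiter(2, 1);
console.log(allowRequest('203.0.113.42', 0));    // true
console.log(allowRequest('203.0.113.42', 0));    // true
console.log(allowRequest('203.0.113.42', 0));    // false
console.log(allowRequest('203.0.113.42', 1000)); // true
```

Passing the clock in as an argument keeps the limiter deterministic and easy to test; a production version would also need to evict idle buckets.
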
Identify the bottleneck first, then tune. A fast Nginx won't help if your database is slow.
| Tunable | Default | Recommended | Why |
|---|---|---|---|
| worker_processes | 1 | auto (= CPU cores) | One worker per core |
| worker_connections | 512 | 1024–4096 | Max connections per worker |
| keepalive_timeout | 75s | 30–65s | Reuse vs freeing resources |
| client_body_buffer_size | 8k/16k | 16k–128k | Avoid writing body to disk |
| gzip_comp_level | 1 | 4–6 | Compression vs CPU trade-off |
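
The first two tunables multiply into a connection ceiling. The sketch below works through that arithmetic; halving for proxied traffic is a rule of thumb that assumes each client connection is paired with one upstream connection.

```javascript
// Upper bound on simultaneous connections: workers x connections per worker.
function maxConnections(workerProcesses, workerConnections) {
  return workerProcesses * workerConnections;
}

// When proxying, each client typically also holds one upstream connection,
// so the client-facing ceiling is roughly half.
function maxProxiedClients(workerProcesses, workerConnections) {
  return Math.floor(maxConnections(workerProcesses, workerConnections) / 2);
}

console.log(maxConnections(4, 1024));    // 4096
console.log(maxProxiedClients(4, 1024)); // 2048
```

The operating system's open-file limit also caps this figure, so raising worker_connections alone may not be enough.
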
ab -n 1000 -c 100 http://localhost/
wrk -t4 -c100 -d30s http://localhost/
Log $request_time in access logs to find slow endpoints, then optimize the actual bottleneck.
Reproduce, isolate, read logs, hypothesize, test. 90% of issues are in the error log.
| Error | Likely Cause | First Step |
|---|---|---|
| 403 Forbidden | File permissions, no index.html, SELinux | ls -la /var/www/ |
| 404 Not Found | Wrong root, case mismatch, missing rewrite | Check document root path |
| "Address already in use" | Another process on the port | ss -tlnp \| grep :80 |
| Tool | Purpose | Example |
|---|---|---|
| curl -v | See full HTTP request/response | curl -v http://localhost/ |
| ss -tlnp | Show listening ports and processes | ss -tlnp \| grep :80 |
| lsof -i | List open network connections | lsof -i :3000 |
| Error logs | Server-reported issues | tail -f /var/log/nginx/error.log |
| nginx -t | Validate config before applying | nginx -t && nginx -s reload |
Use reload to apply config changes gracefully. Restarting drops all active connections: every download, WebSocket, and in-flight request.
The application is the server — full control, full responsibility.
| Responsibility | Apache/Nginx | Raw Node.js | Hybrid |
|---|---|---|---|
| Static file serving | Built-in, optimized | Your code | Nginx |
| TLS termination | Built-in | Your code | Nginx |
| Rate limiting | Config directive | Your code | Nginx |
| Crash recovery | Auto-restart workers | Process dies | PM2/systemd |
| Compression | Config directive | Your code | Nginx |
"Serverless" means you don't manage servers — your code runs in short-lived containers managed by a provider.
| Aspect | Traditional Server | Serverless |
|---|---|---|
| Scaling | Manual (add servers) | Automatic (per request) |
| Cost model | Pay for uptime (24/7) | Pay per invocation (idle = free) |
| Cold starts | None | 100ms–2s first request after idle |
| Persistent connections | WebSockets, SSE | Not supported |
| Debugging | SSH in, check logs | Cloud-based, limited visibility |
| Vendor lock-in | Low | High |
No single best architecture — start simple, add complexity only when you have a specific problem.
| Use Case | Recommended Architecture | Why |
|---|---|---|
| Static site / blog | CDN + object storage (S3, R2) | No server needed |
| Traditional web app | Nginx + PHP/Python/Ruby | Proven, well-documented |
| API server | Nginx + Node.js/Go | Event-driven concurrency |
| Real-time (chat, games) | Nginx + Node.js + WebSockets | Persistent connections |
| High-scale | Nginx LB + multiple instances | Horizontal scaling |
| Sporadic / event-driven | Serverless (Lambda, Cloud Functions) | Pay only when used |
16 sections of web server fundamentals in one table.
| Concept | Key Points |
|---|---|
| What is a Web Server | Software listening for HTTP requests. Handles TLS, routing, access control, logging before your code runs. |
| Common Servers | Apache (process/thread), Nginx (event-driven), Node.js (app is server), Caddy (auto HTTPS). Often paired. |
| Architecture | Process-per-request (isolated), thread-per-request (lighter), event-driven (10K+ connections). Hybrid is production answer. |
| Virtual Hosts | One server, many domains via Host header. Declarative config. Reload, never restart. |
| Routing | Location matching priority. Redirects (301/302) vs rewrites (internal). SPA fallback with try_files. |
| Static Serving | sendfile() zero-copy. Gzip/Brotli compression. Hashed filenames for cache busting. |
| Reverse Proxying | Nginx in front of app servers. Forward X-Forwarded-For. Load balancing: round-robin, least-conn, IP hash. |
| TLS/HTTPS | Root CA → Intermediate → Server cert. Let's Encrypt. Use fullchain.pem. |
| Logging | Access + error logs. Combined format. Correlation IDs. Monitor request rate, P95, error rate. |
| Security | Defense in depth: rate limiting, HSTS, CSP, X-Frame-Options. Obscurity is not defense. |
| Performance | worker_processes auto, worker_connections 1024+. Profile bottleneck first. |
| Debugging | Error log first. 502 = backend unreachable, 504 = too slow. nginx -t before reload. |
| Node.js Model | App IS the server. Full control, full responsibility. Always put Nginx in front. |
| Serverless | No server management. Cold starts, no WebSockets. Best for sporadic traffic. |
| Choosing | Start simple. Single Nginx + app handles most traffic. Add complexity for specific problems. |