Authentication & Identity

The Lock Everyone Uses and Thousands Try to Pick

"Authentication is the one part of your application that, if wrong, compromises everything else."

CSE 135

Section 1: The Authentication Problem

Proving identity — three factors, high stakes, deceptively hard.

Three Factors of Authentication

#	Factor	What It Is	Examples
1	Something you know	A secret only you should possess	Password, PIN, security question
2	Something you have	A physical object in your possession	Phone, hardware key, smart card
3	Something you are	A biometric characteristic	Fingerprint, face scan, retina

Security is more mindset than technology. Every decision — how you store credentials, handle failures, recover accounts — has security implications that compound. A small mistake in one area can undermine every other precaution.

Auth vs. Authz & the Stakes

Authentication vs. Authorization

Authentication: "Who are you?" — verifies identity
Authorization: "What can you do?" — determines permissions
You must authenticate before you can authorize

Why It Matters

Credential stuffing: leaked pairs tried on other sites
Account takeover: fraud, data theft, breach liability
Trust: once lost, nearly impossible to rebuild

You're building a lock that many people use and many will actively try to pick. Authentication is deceptively hard because mistakes compound across the entire system.

Section 2: Build or Delegate?

The eternal "buy vs. build" question — applied to the most security-critical part of your app.

Delegate: Let Someone Else Handle It

Services like Auth0, Firebase Auth, Supabase Auth, and Clerk handle password storage, hashing, MFA, compliance, and reset flows.

When Delegation Makes Sense

Small team — can't dedicate engineering time to security infrastructure
Need compliance certifications (HIPAA, SOC2) you can't staff for
Want MFA, social login, passwordless without building each from scratch
Prefer per-user pricing over maintaining the system yourself

You always own every requirement. Customers don't care that your vendor is having a bad day; they view you as the responsible party. The vendor's Terms of Service usually push liability back onto you. Real ownership costs money one way or another.

Build It Yourself & the Middle Ground

When You Must Build

Privacy is a core requirement — can't send credentials to a third party
Government, military, or air-gapped environments
Custom org requirements that don't fit a provider's model
Need fine-grained control providers don't offer

The Honest Middle Ground

Most real-world apps land in between: delegate the hard parts (password hashing, MFA tokens, social login) while owning session management, authorization, and user data storage.

A partially maintained auth system is worse than a fully delegated one. Be honest about what you can maintain long-term.

Section 3: Password Storage Done Right

Never store passwords. Store proof that someone knows the password.

Hashing vs. Encryption

Concept	Direction	Purpose
Encryption	Reversible — if you have the key, get the original back	Protecting data in transit or at rest
Hashing	One-way — cannot get the original from the hash	Storing proof of passwords

Why Fast Hashes Are the Enemy

MD5, SHA-256 — designed to be fast. A GPU does billions per second.
Rainbow tables — precomputed hash → password lookups for common algorithms
Salting — unique random value per user, stored with hash, defeats rainbow tables

If you hash password123 with plain SHA-256, it's already in a rainbow table. Fast hashing without salts is barely better than plaintext.
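The effect of a per-user salt can be seen in a few lines. This is a minimal sketch using Node's built-in crypto module; the function names are illustrative, and SHA-256 appears only to show what a salt changes, not as a recommended password hash.

```javascript
const crypto = require('node:crypto');

// Unsalted fast hash: the same password always maps to the same digest,
// so one precomputed table covers every user.
function unsaltedHash(password) {
  return crypto.createHash('sha256').update(password).digest('hex');
}

// Salted hash: a unique random salt per user makes identical passwords
// produce different digests. (Still a FAST hash; illustration only.)
function saltedHash(password, salt = crypto.randomBytes(16)) {
  const digest = crypto.createHash('sha256')
    .update(salt)
    .update(password)
    .digest('hex');
  return { salt: salt.toString('hex'), digest };
}

const alice = saltedHash('password123');
const bob = saltedHash('password123');
console.log(unsaltedHash('password123') === unsaltedHash('password123')); // true
console.log(alice.digest === bob.digest); // false (different salts)
```

A salt alone defeats precomputed tables but does nothing about raw guessing speed, which is why the slow algorithms in the next table matter.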

The Right Algorithms

Algorithm	Key Property	Status
bcrypt	Built-in salt, configurable work factor, intentionally slow	Battle-tested, widely supported, still good
scrypt	Memory-hard — requires significant RAM, not just CPU	Good alternative, used in some crypto systems
Argon2id	Memory-hard, resistant to GPU & side-channel attacks	Current recommendation (PHC winner)

Work factor controls how slow the hash is to compute. bcrypt cost 12 takes roughly 250 ms on typical server hardware: imperceptible to a user, devastating to an attacker trying billions of guesses. Each +1 in cost doubles the work, so increase it as hardware gets faster.
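To make the work-factor trade-off concrete, here is a rough back-of-the-envelope sketch. The wordlist size and both guess rates are assumptions chosen for illustration (a single attacker thread at the 250 ms figure above versus a GPU doing a billion fast hashes per second); real attackers parallelize, so treat the numbers as orders of magnitude only.

```javascript
// bcrypt's cost parameter is logarithmic: cost N means 2^N rounds of the
// expensive key setup, so each +1 roughly doubles the time per hash.
function bcryptRounds(cost) {
  return 2 ** cost;
}

// Time for an attacker to try every candidate at a given guess rate.
function secondsToExhaust(candidates, guessesPerSecond) {
  return candidates / guessesPerSecond;
}

const candidates = 1e9;    // hypothetical wordlist size
const fastRate = 1e9;      // assumed: about 1 billion fast hashes per second
const slowRate = 1 / 0.25; // assumed: 250 ms per bcrypt hash => 4 guesses/s

console.log(bcryptRounds(12));                       // 4096
console.log(secondsToExhaust(candidates, fastRate)); // 1
const years = secondsToExhaust(candidates, slowRate) / 86400 / 365;
console.log(years.toFixed(1));                       // 7.9
```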

Password Hashing Flow

User enters password
        │
        ▼
┌──────────────┐     ┌──────────────────┐
│  plaintext   │────▶│ Generate unique  │
│  password    │     │ random salt      │
└──────────────┘     └────────┬─────────┘
                              │
                              ▼
                  ┌───────────────────────┐
                  │ Hash function         │
                  │ (bcrypt / Argon2id)   │
                  │ password + salt       │
                  │ + work factor         │
                  └───────────┬───────────┘
                              │
                              ▼
                  ┌───────────────────────┐
                  │ Store in database:    │
                  │ salt + hash           │
                  │ (bcrypt embeds salt   │
                  │ in the output)        │
                  └───────────────────────┘

At login: hash the entered password with the stored salt and compare. Never decrypt; just compare.
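The flow above can be sketched end to end with Node's built-in crypto module. Note the substitution: this sketch uses scrypt (with Node's default cost parameters) rather than bcrypt or Argon2id, only because scrypt ships with Node and needs no packages. The `salt:hash` storage format and the function names are assumptions of the example.

```javascript
const crypto = require('node:crypto');

// Hash a password with a fresh random salt. The "salt:hash" hex format is
// an assumption of this sketch, not a standard.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Re-hash the submitted password with the stored salt and compare the two
// digests in constant time. Nothing is ever decrypted.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}

const record = hashPassword('correct horse battery staple');
console.log(verifyPassword('correct horse battery staple', record)); // true
console.log(verifyPassword('wrong guess', record));                  // false
```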

Section 4: What NOT to Do (Hall of Shame)

Every item on this list has been found in production systems — some with millions of users.

Danger Items 1–4

Storing passwords in plaintext. Database breached = every password immediately compromised. No effort required by the attacker.

Storing passwords encrypted with a reversible key. Encryption is not hashing. One compromised key = all passwords recoverable.

Using fast hashes (MD5, SHA-256) without salt or work factor. A GPU tries billions per second. Rainbow tables exist for every common password.

Hardcoding a single salt for all users. Identical passwords still produce identical hashes. Attacker needs one rainbow table for your entire database.

Danger Items 5–8

Logging passwords in server logs or error messages. A log line such as "Login failed for user admin with password hunter2" exposes the secret to anyone who can read the logs. PII and secrets in logs are a general problem.

Sending passwords in URL query strings. URLs appear in browser history, server logs, proxy logs, referrer headers, and analytics tools.

Emailing users their password. If a "forgot password" flow sends your actual password, the system stored it in recoverable form. Red flag.

Displaying password "hints." Hints like "rhymes with bassword" often give away the password entirely. A relic of the 1990s.

How to check: Try the "forgot password" flow on services you use. If they email you your password, they stored it in plaintext or another recoverable form. If they send a reset link, they're at least doing that part right.

Section 5: Password Policies

NIST 800-63B overturned decades of conventional wisdom — many organizations still haven't caught up.

NIST 800-63B Key Changes

Length over complexity: A long passphrase (20+ characters) like angry purple unicorn battery powered by 7 beats P@ssw0rd!
No forced rotation: Mandatory expiration produces Password1!, Password2!, Password3!
Check against breach databases: Use the HIBP API (k-anonymity: send first 5 chars of SHA-1, get matching suffixes)
Discourage complexity rules: They produce predictable patterns: capitalize first letter, add 1! at the end
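The k-anonymity breach check mentioned in the list can be sketched as two pure functions: one that splits the SHA-1 digest, and one that scans the range response. The function names are illustrative, the network call is left out so the example runs offline, and the response body below is made-up sample data in the API's `SUFFIX:COUNT` line format.

```javascript
const crypto = require('node:crypto');

// Split the uppercase SHA-1 hex digest into the 5-char prefix that is sent
// to the API and the 35-char suffix that never leaves the server.
function hibpParts(password) {
  const sha1 = crypto.createHash('sha1').update(password).digest('hex').toUpperCase();
  return { prefix: sha1.slice(0, 5), suffix: sha1.slice(5) };
}

// The range endpoint answers with lines of "SUFFIX:COUNT". Return the count
// for our suffix, or 0 if it is absent.
function breachCount(responseBody, suffix) {
  for (const line of responseBody.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(':');
    if (candidate === suffix) return Number(count);
  }
  return 0;
}

const { prefix, suffix } = hibpParts('password');
console.log(prefix); // 5BAA6
// A real check would GET https://api.pwnedpasswords.com/range/5BAA6;
// the body below is fabricated sample data for illustration.
const sampleBody = `0018A45C4D1DEF81644B54AB7F969B88D65:3\r\n${suffix}:42`;
console.log(breachCount(sampleBody, suffix)); // 42
```

Only the five-character prefix is ever transmitted, so the service cannot tell which of the many matching passwords was being checked.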

Change passwords when there's evidence of compromise, not on a calendar.

Policy Comparison

Policy	NIST Recommendation	Common (Bad) Practice
Minimum length	8 absolute min, 12+ recommended	6 characters (far too short)
Maximum length	At least 64 characters	16-character cap (you're hashing it anyway)
Complexity rules	Don't require them	Must have upper, lower, number, symbol
Expiration	Don't expire unless compromised	Every 90 days
Breach check	Yes, at registration and on password change	Not done at all

Bad policies have real usability costs: users write passwords on sticky notes, reuse passwords across sites, and use password managers solely to satisfy arbitrary rules rather than for genuine security.

Section 6: Hardening the Login Flow

A correct password hash is necessary but not sufficient.

Rate Limiting, Tarpitting & Lockout

Technique	How It Works	Trade-off
Rate limiting	Cap attempts per account/IP in time window	Hard cutoff = DoS vector (attacker locks any account)
Tarpitting	Progressive delays: 1s after 3 fails, 5s after 5, 30s after 10	More elegant — expensive for attacker, mild for real users
Temp lockout	15 min after N failures	Reasonable balance
Permanent lockout	Locked until admin intervention	DoS vulnerability — attacker can lock every account
CAPTCHA	Challenge after many failures	Last resort — accessibility problems, increasingly solvable by AI

CAPTCHA is overkill for most situations. Try rate limiting and tarpitting first, and escalate only when you see problems. LLM-generated code tends to reproduce common poor choices, so know what to ask for.
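The tarpitting schedule from the table (1s after 3 failures, 5s after 5, 30s after 10) might look like the following minimal sketch. The in-memory `Map` and the function names are assumptions for illustration; a real deployment would more likely keep counters in a shared store and expire them.

```javascript
// Progressive delay schedule: 1s after 3 failures, 5s after 5, 30s after 10.
// Below 3 failures there is no delay.
function tarpitDelayMs(failedAttempts) {
  if (failedAttempts >= 10) return 30_000;
  if (failedAttempts >= 5) return 5_000;
  if (failedAttempts >= 3) return 1_000;
  return 0;
}

// Hypothetical in-memory failure counter keyed by account name.
const failures = new Map();

// Record a failed login and return the delay to apply before responding.
function recordFailure(account) {
  const count = (failures.get(account) ?? 0) + 1;
  failures.set(account, count);
  return tarpitDelayMs(count);
}

// A successful login clears the counter.
function recordSuccess(account) {
  failures.delete(account);
}

for (let i = 1; i <= 5; i++) {
  console.log(i, recordFailure('alice')); // delays: 0, 0, 1000, 1000, 5000
}
```

Because the delay grows instead of cutting access off, an attacker cannot use it to lock a real user out entirely.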

Account Enumeration & Decision Tree

Enumeration Prevention

Error messages: Always "Invalid username or password" — never reveal which was wrong
Registration/reset: "If an account exists, we've sent a reset link"
Constant-time comparison: Prevent timing attacks that reveal "user not found" vs "password wrong"
Information disclosure: Don't leak info via HTTP headers, session names, or error pages
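One way to combine generic errors with timing-safe checks is sketched below. The in-memory user table, the dummy record, and the use of scrypt from Node's built-in crypto module are all assumptions of the example; the point is that an unknown username costs the same hash computation and yields the same message as a wrong password.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory user table: username -> { salt, hash }.
const users = new Map();
function addUser(username, password) {
  const salt = crypto.randomBytes(16);
  users.set(username, { salt, hash: crypto.scryptSync(password, salt, 64) });
}

// Dummy record so a missing user still costs one full hash computation.
const dummy = { salt: crypto.randomBytes(16), hash: crypto.randomBytes(64) };

function checkLogin(username, password) {
  const record = users.get(username) ?? dummy;
  const candidate = crypto.scryptSync(password, record.salt, 64);
  // timingSafeEqual compares every byte regardless of where a mismatch occurs.
  const match = crypto.timingSafeEqual(candidate, record.hash);
  return users.has(username) && match
    ? { ok: true }
    : { ok: false, error: 'Invalid username or password' };
}

addUser('alice', 's3cret-passphrase');
console.log(checkLogin('alice', 's3cret-passphrase').ok); // true
console.log(checkLogin('alice', 'wrong').error);  // Invalid username or password
console.log(checkLogin('nobody', 'wrong').error); // Invalid username or password
```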

Login attempt received
         │
         ▼
┌─────────────────┐   Yes    ┌────────────────────┐
│ IP rate limit   │─────────▶│ Return 429         │
│ exceeded?       │          │ Too Many Requests  │
└────────┬────────┘          └────────────────────┘
         │ No
         ▼
┌─────────────────┐   Yes    ┌───────────────────┐
│ Account tarpit  │─────────▶│ Delay response    │
│ active?         │          │ (1s / 5s / 30s)   │
└────────┬────────┘          └────────┬──────────┘
         │ No                         │
         ▼                            ▼
┌─────────────────┐          Check credentials
│ Check           │          then ── Valid → session
│ credentials     │               ── Invalid → generic error
└────────┬────────┘                  + increment counter
         │
   Valid │ Invalid
         │
  Valid:   ↓ session
  Invalid: ↓ generic error + incr counter

Section 7: Account Recovery

Reset, never recover. If a service can send you your password, it stored it wrong.

Secure Password Reset Flow

User: "I forgot my password"
           │
           ▼
┌──────────────────────┐
│ Enter email address  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐     Always respond:
│ Is email in system?  │───▶ "If an account exists,
└──────────┬───────────┘      we've sent a reset link."
           │ (internally)
           ▼
┌──────────────────────┐
│ Generate token:      │
│ • Crypto-random      │
│ • Single-use         │
│ • Expires: 15-60min  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ User clicks link,    │
│ enters new password  │
└──────────┬───────────┘
           │
           ▼
┌──────────────────────┐
│ Hash new password,   │
│ invalidate token,    │
│ invalidate sessions  │
└──────────────────────┘
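The token step of this flow can be sketched as follows. The in-memory store, the 30-minute lifetime, and the function names are assumptions; storing only a hash of the token (so a database leak does not expose usable reset links) is a design choice of this sketch rather than something the flow above requires.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 30 * 60 * 1000; // 30 minutes, inside the 15-60 min range
const resetTokens = new Map();       // hypothetical store: tokenHash -> { userId, expiresAt }

const sha256 = (value) => crypto.createHash('sha256').update(value).digest('hex');

// Create a crypto-random token; only its hash is kept server-side.
function issueResetToken(userId, now = Date.now()) {
  const token = crypto.randomBytes(32).toString('hex');
  resetTokens.set(sha256(token), { userId, expiresAt: now + TOKEN_TTL_MS });
  return token; // emailed to the user inside the reset link
}

// Return the userId if the token is valid, and burn it so it is single-use.
function redeemResetToken(token, now = Date.now()) {
  const key = sha256(token);
  const entry = resetTokens.get(key);
  if (!entry) return null;
  resetTokens.delete(key);
  return now <= entry.expiresAt ? entry.userId : null;
}

const token = issueResetToken('user-42');
console.log(redeemResetToken(token)); // user-42
console.log(redeemResetToken(token)); // null (already used)
```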

Recovery Mistakes

Security questions: "Mother's maiden name?" is on Facebook. "First car?" has a small search space. A parallel auth path with weaker security than the password itself.

SMS-only recovery: Phone numbers can be hijacked via SIM swapping — attacker convinces your carrier to transfer your number to their SIM.

No token expiration: A reset link that works forever is a permanent backdoor. If the email account is compromised later, every old reset link becomes an attack vector.

The recovery flow is often the weakest link. It's a parallel authentication path — if it's weaker than the primary path, it undermines everything. Password managers eliminate "I forgot my password" almost entirely.

Section 8: GeoIP & Risk-Based Auth

Contextual signals beyond username and password — same tech as ad-tech tracking.

Contextual Signals

GeoIP Monitoring

Map IP addresses to geographic locations
Impossible travel: New York then Beijing in 30 minutes = at least one login isn't legitimate
Trigger additional verification for logins from new countries
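An impossible-travel check reduces to distance divided by elapsed time. The sketch below assumes GeoIP lookup has already produced coordinates for each login; the 1000 km/h threshold (roughly airliner speed), the approximate city coordinates, and the function names are illustrative assumptions.

```javascript
// Great-circle distance between two points (haversine formula), in km.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371; // mean Earth radius in km
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(h));
}

// Flag two logins whose implied speed exceeds a threshold.
// `time` is a millisecond timestamp; maxKmh is an assumed cutoff.
function isImpossibleTravel(prev, next, maxKmh = 1000) {
  const hours = (next.time - prev.time) / 3_600_000;
  if (hours <= 0) return haversineKm(prev, next) > 0;
  return haversineKm(prev, next) / hours > maxKmh;
}

const newYork = { lat: 40.71, lon: -74.01, time: Date.UTC(2024, 0, 1, 12, 0) };
const beijing = { lat: 39.90, lon: 116.41, time: Date.UTC(2024, 0, 1, 12, 30) };
console.log(isImpossibleTravel(newYork, beijing)); // true
```

GeoIP data is coarse and VPNs distort it, so a hit is better treated as a trigger for extra verification than as proof of compromise.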

Device Fingerprinting

Combines browser, OS, screen, fonts, WebGL, timezone into near-unique ID
Known device = reduce friction; new device = increase verification

Machine Registration

"Remember this computer" stores a long-lived token
Future logins from that device are lower risk — the trusted device model

Risk Scoring Engine

┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Device       │ │ IP / Geo     │ │ Time         │ │ Behavior     │
│ fingerprint  │ │ location     │ │ of day       │ │ patterns     │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
       │                │                │                │
       ▼                ▼                ▼                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         RISK SCORING ENGINE                         │
│                                                                     │
│ Known device + home IP + usual time   = LOW RISK  → Allow           │
│ Known device + new IP + usual time    = MEDIUM    → Email alert     │
│ New device + new country + odd hour   = HIGH RISK → Require MFA     │
│ Impossible travel detected            = CRITICAL  → Block + alert   │
└─────────────────────────────────────────────────────────────────────┘
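The four rules in the diagram translate into a small decision function. This is a sketch under stated assumptions: the signal names are invented for the example, and the fallback for combinations the diagram does not list (treat as high risk) is a conservative guess, not part of the source.

```javascript
// Map contextual signals to a risk level and action, following the four
// rows of the diagram. Signal names are hypothetical booleans.
function assessLogin({ knownDevice, knownIp, knownCountry, usualTime, impossibleTravel }) {
  if (impossibleTravel) return { risk: 'CRITICAL', action: 'block_and_alert' };
  if (!knownDevice && !knownCountry && !usualTime) {
    return { risk: 'HIGH', action: 'require_mfa' };
  }
  if (knownDevice && knownIp && usualTime) return { risk: 'LOW', action: 'allow' };
  if (knownDevice && !knownIp && usualTime) return { risk: 'MEDIUM', action: 'email_alert' };
  // Combinations the diagram does not list: err on the side of verification.
  return { risk: 'HIGH', action: 'require_mfa' };
}

console.log(assessLogin({
  knownDevice: true, knownIp: false, knownCountry: true,
  usualTime: true, impossibleTravel: false,
})); // { risk: 'MEDIUM', action: 'email_alert' }
```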

The line between security and surveillance is thin. Fingerprinting, location tracking, behavioral analysis — identical to methods ad-tech companies use for tracking users across the web.

Section 9: Sessions & Tokens

Authentication happens once. HTTP is stateless. How do you remember that the user is logged in?

Server-Side Sessions vs. JWT

Aspect	Server-Side Sessions	JWT (JSON Web Tokens)
Storage	Server (memory, Redis, DB)	Client (cookie or header)
Revocation	Easy — delete on server	Hard — can't revoke before expiry without blacklist
Scaling	Needs shared store (Redis)	Stateless — any server can verify
Size	Small session ID in cookie	Larger token with embedded claims
Best for	Traditional web apps	APIs, microservices, SPAs

Token refresh pattern: Short-lived access token (15 min–1 hr) + longer-lived refresh token (days–weeks). Limits exposure if access token is stolen, keeps user logged in.
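The short-lived/long-lived split can be illustrated with a hand-rolled HS256 token built on Node's crypto module. This is a teaching sketch only: the helper names and the `use` claim are assumptions, the verifier does not inspect the header because it only ever accepts HS256, and production code would more likely use a vetted library such as jsonwebtoken and keep refresh tokens revocable on the server.

```javascript
const crypto = require('node:crypto');

const b64url = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

// Sign a compact HS256 JWT with an expiry (exp is in seconds, per the JWT spec).
function signToken(claims, secret, ttlSeconds, nowSeconds = Math.floor(Date.now() / 1000)) {
  const header = b64url({ alg: 'HS256', typ: 'JWT' });
  const payload = b64url({ ...claims, iat: nowSeconds, exp: nowSeconds + ttlSeconds });
  const sig = crypto.createHmac('sha256', secret)
    .update(`${header}.${payload}`).digest('base64url');
  return `${header}.${payload}.${sig}`;
}

// Return the claims if the signature is valid and the token is unexpired, else null.
function verifyToken(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const [header, payload, sig] = token.split('.');
  if (!header || !payload || !sig) return null;
  const expected = crypto.createHmac('sha256', secret)
    .update(`${header}.${payload}`).digest();
  const given = Buffer.from(sig, 'base64url');
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString());
  return claims.exp > nowSeconds ? claims : null;
}

const secret = crypto.randomBytes(32);
const access = signToken({ sub: 'user-42', use: 'access' }, secret, 15 * 60);          // 15 min
const refresh = signToken({ sub: 'user-42', use: 'refresh' }, secret, 14 * 24 * 3600); // 14 days
console.log(verifyToken(access, secret).sub);  // user-42
console.log(verifyToken(refresh, secret).use); // refresh
```

When the access token expires, the client presents the refresh token to obtain a new one, so a stolen access token is useful for minutes rather than weeks.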

Secure Cookie Flags

Flag	Purpose	Omitting It
`HttpOnly`	Cookie cannot be accessed by JavaScript	XSS attacks can steal the cookie
`Secure`	Cookie only sent over HTTPS	Cookie transmitted in plaintext over HTTP
`SameSite=Strict` / `Lax`	Cookie not sent on cross-site requests (`Lax` still sends it on top-level navigations)	Vulnerable to CSRF attacks
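Setting all three flags together is a one-liner once the header value is assembled. In this sketch the cookie name `sid`, the `Path`, and the one-hour `Max-Age` are illustrative choices, not requirements from the table.

```javascript
// Build a Set-Cookie header value carrying the three flags from the table.
function sessionCookie(sessionId, { sameSite = 'Lax', maxAgeSeconds = 3600 } = {}) {
  return [
    `sid=${encodeURIComponent(sessionId)}`,
    'Path=/',
    `Max-Age=${maxAgeSeconds}`,
    'HttpOnly',            // not readable from JavaScript
    'Secure',              // only sent over HTTPS
    `SameSite=${sameSite}`, // withheld on cross-site requests
  ].join('; ');
}

console.log(sessionCookie('abc123'));
// sid=abc123; Path=/; Max-Age=3600; HttpOnly; Secure; SameSite=Lax
```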

Be careful with layered mechanisms. Visualize a person with 5 locks on their door in a NYC apartment — it does very little against someone smashing the door down or crawling through the fire escape window.

Section 10: Implementation Patterns

Node.js assembles libraries; PHP bundles auth primitives into the language.

Node.js Auth Stack

bcrypt / bcryptjs for password hashing
express-session + connect-redis for sessions
jsonwebtoken for JWT-based auth
passport for strategy-based auth (local, OAuth, SAML)

// Registration: hash and store
// Assumes `db` is a node-postgres (pg) Pool or a compatible client.
const bcrypt = require('bcrypt');
const saltRounds = 12;

async function register(username, password) {
  const hash = await bcrypt.hash(password, saltRounds);
  await db.query(
    'INSERT INTO users (username, password_hash) VALUES ($1, $2)',
    [username, hash]
  );
}

// Login: compare hash
async function login(username, password) {
  // pg resolves to a result object; the matching rows are in `rows`
  const { rows } = await db.query(
    'SELECT * FROM users WHERE username = $1',
    [username]
  );
  const user = rows[0];
  if (!user) return false; // don't reveal "user not found"
  const match = await bcrypt.compare(password, user.password_hash);
  return match ? user : false;
}

PHP Auth Stack

// Registration: hash and store
$hash = password_hash($password, PASSWORD_BCRYPT);
// or: PASSWORD_ARGON2ID (PHP 7.3+, when PHP is built with Argon2 support)
$stmt = $pdo->prepare(
    'INSERT INTO users (username, password_hash) VALUES (:user, :hash)'
);
$stmt->execute(['user' => $username, 'hash' => $hash]);

// Login: verify
$stmt = $pdo->prepare('SELECT * FROM users WHERE username = :user');
$stmt->execute(['user' => $username]);
$user = $stmt->fetch();

if ($user && password_verify($password, $user['password_hash'])) {
    session_start();
    session_regenerate_id(true); // new session ID on login prevents fixation
    $_SESSION['user_id'] = $user['id'];
}

Key difference: PHP's password_hash() and password_verify() are built-in — no packages to install, no version conflicts, no supply-chain risk. Node requires npm install bcrypt with native C++ bindings. Both produce correct bcrypt hashes — PHP is simpler to get right.

Section 11: OAuth & Federated Identity

Let Google handle passwords: a compelling idea, but is it the right one?

OAuth 2.0 Authorization Code Flow

┌──────────┐                          ┌──────────────┐
│ Your     │                          │ Identity     │
│ App      │                          │ Provider     │
│ (Client) │                          │ (Google, etc)│
└────┬─────┘                          └──────┬───────┘
     │ 1. Redirect to provider               │
     │    /authorize?client_id=...           │
     │    &scope=openid email profile        │
     │──────────────────────────────────────▶│
     │ 2. User logs in at provider           │
     │ 3. User consents to scopes            │
     │ 4. Redirect back with CODE            │
     │◀──────────────────────────────────────│
     │ 5. Exchange code for ACCESS TOKEN     │
     │    POST /token {code, client_secret}  │
     │──────────────────────────────────────▶│
     │ 6. Provider returns tokens            │
     │    {access_token, id_token}           │
     │◀──────────────────────────────────────│
     │ 7. GET /userinfo with bearer token    │
     │──────────────────────────────────────▶│
     │ 8. Returns {name, email, picture}     │
     │◀──────────────────────────────────────│
     │ 9. Create local session/account       │
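Steps 1 and 4 of the flow can be sketched without contacting any provider. The endpoint and redirect URLs below are hypothetical placeholders. The `response_type`, `redirect_uri`, and `state` parameters do not appear in the diagram; they are standard OAuth 2.0 authorization-request parameters, and `state` is included here to tie the callback to the request that started it.

```javascript
const crypto = require('node:crypto');

// Step 1: build the redirect to the provider's authorization endpoint.
function buildAuthorizeUrl({ authorizeEndpoint, clientId, redirectUri, scope }) {
  const state = crypto.randomBytes(16).toString('hex'); // keep in the user's session
  const url = new URL(authorizeEndpoint);
  url.search = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    redirect_uri: redirectUri,
    scope,
    state,
  }).toString();
  return { url: url.toString(), state };
}

// Step 4: on the callback, accept the code only if state matches what we issued.
function readCallback(callbackUrl, expectedState) {
  const params = new URL(callbackUrl).searchParams;
  if (params.get('state') !== expectedState) return null;
  return params.get('code');
}

const { url, state } = buildAuthorizeUrl({
  authorizeEndpoint: 'https://idp.example.com/authorize', // hypothetical provider
  clientId: 'my-client-id',
  redirectUri: 'https://app.example.com/callback',        // hypothetical app URL
  scope: 'openid email profile',
});
console.log(url);
console.log(readCallback(`https://app.example.com/callback?code=abc&state=${state}`, state)); // abc
```

The code returned here is then exchanged server-to-server in step 5, where the client secret never passes through the browser.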

OIDC, SSO & Genuine Benefits

OpenID Connect (OIDC)

OAuth 2.0 is authorization ("can this app access my photos?"). OIDC adds an identity layer on top — ID tokens with user claims (name, email, picture).

Single Sign-On (SSO)

Log in once → authenticated across all trusting services. Enterprise SSO (SAML/OIDC) = one login for email, Slack, GitHub, internal tools.

Genuine Benefits

No password to store — you never see the user's password
No reset flow to build — provider handles "forgot my password"
Built-in MFA — if user has MFA on Google, your app gets it free
Reduced attack surface — can't leak passwords you never had

Section 12: The Trust Problem with OAuth

Benefits are real — costs are rarely discussed honestly.

Surveillance, Scope Creep & Lock-In

Surveillance by Design

"Sign in with Google" gives Google a log of every time your user visits your site. For an advertising company, this is valuable behavioral data you're handing over for free.

Scope Creep

Start with openid email, but providers make it easy to request more: contacts, calendar, drive files. Consent screens become checkboxes users click through without reading.

Vendor Lock-In

Google bans a user's account → they lose access to your service
Provider changes API / raises prices → you scramble to adapt
Provider outage → your users can't log in

Trust Asymmetry & Alternatives

The "Free" Illusion

Social login costs nothing in dollars. But the currency is your users' privacy. The provider knows which services each person uses, when, and how often.

Trust Asymmetry

You trust the provider with your users' identity. The provider has no obligation to your users — they're not the customer, they're the product.

GDPR implications: Transferring identity data to a third-party provider (especially outside the EU) requires a lawful basis such as informed consent. "Sign in with Google" is not automatically GDPR-compliant.

Self-Hosted Alternatives

Keycloak, Ory, Authentik — OAuth/OIDC capabilities without sending user data to a third party. Protocol benefits without the surveillance trade-off. Cost = infrastructure and maintenance.

Section 13: Multi-Factor Authentication

Even if the password is stolen, the attacker still needs the second factor.

Factor Comparison

Method	How It Works	Strength	Weakness
Hardware keys (FIDO2/WebAuthn)	USB/NFC, cryptographic challenge-response	Phishing-resistant, no shared secrets	Cost (~$25-50), can be lost
TOTP apps	Shared secret + time = 6-digit code; offline	No network needed, widely supported	Phishable (fake site trick)
Push notifications	"Was this you?" tap to approve	Convenient, low friction	MFA fatigue attacks
SMS codes	6-digit code via text	Familiar, no app needed	SIM swapping, SS7 intercept
Email codes	Code or link sent to email	Universal (everyone has email)	Email accounts are targets
Backup codes	Single-use recovery codes	Work when all else fails	Must be stored securely

The Factor Hierarchy

2FA Method Hierarchy (strongest to weakest):

████████████████████████████████████████  Hardware keys (FIDO2/WebAuthn)
███████████████████████████████████       Authenticator apps (TOTP)
██████████████████████████                Push notifications
█████████████████████                     SMS codes
████████████████                          Email codes
▓▓▓▓▓▓▓▓▓▓▓▓                              Nothing (password only)

◀─────────────────────────────────────▶
More phishing-resistant ──▶ Less resistant
Harder to intercept ──────▶ Easier to intercept

TOTP explained: Shared secret (QR code at setup) + current time (30-sec windows) = 6-digit code computed independently by server and app. Secret never transmitted after setup — only the derived code is sent at login.
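The computation both sides perform is small enough to show in full. This sketch follows the standard TOTP construction (HMAC-SHA1 over the time-step counter, then dynamic truncation); the function name is illustrative, and the secret is the published RFC 6238 test key, not something to use in practice.

```javascript
const crypto = require('node:crypto');

// Compute a TOTP code: HMAC-SHA1 over the 30-second counter, then
// dynamic truncation to the requested number of digits.
function totp(secret, unixSeconds, step = 30, digits = 6) {
  const counter = Math.floor(unixSeconds / step);
  const msg = Buffer.alloc(8);
  msg.writeBigUInt64BE(BigInt(counter));
  const mac = crypto.createHmac('sha1', secret).update(msg).digest();
  const offset = mac[mac.length - 1] & 0x0f;                 // low nibble picks the window
  const binary = mac.readUInt32BE(offset) & 0x7fffffff;      // drop the sign bit
  return String(binary % 10 ** digits).padStart(digits, '0');
}

// RFC 6238 test secret (ASCII "12345678901234567890") at T = 59 seconds.
const secret = Buffer.from('12345678901234567890', 'ascii');
console.log(totp(secret, 59)); // 287082
```

A server would typically also accept the code from the adjacent time window to tolerate clock drift between the phone and the server.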

Section 14: The 2FA Surveillance Paradox

Security measures designed to protect you create new vectors for surveillance.

Phone as Universal ID & Location Tracking

Phone Number as Universal Identifier

Registering a phone for 2FA links your online identity to a physical device & carrier account
Phone numbers are tied to government ID, billing address, physical SIM
Far more identifying than an email address

Location Tracking

Your phone connects to cell towers; towers know where you are
Binding auth to a phone = someone with carrier access can correlate auth events with location
Approving a push notification for your corporate VPN tells your employer you were awake, near your phone, and roughly where

Biometric Leakage

Face/fingerprint data can leak into systems beyond authentication
Technology cannot solve societal abuses — pure technosolutionism doesn't work

Cross-Service Correlation & the Alternative

Cross-Service Correlation

Same phone number for 2FA on ten services = an adversary with carrier access can correlate all ten identities. Banking, social media, work VPN — all linked by one number.

The Irony

Mandatory 2FA (especially SMS-based) requires giving an organization a phone number — a personal identifier with far more tracking potential than an email address.

The privacy-preserving alternative: hardware security keys (FIDO2/WebAuthn). No phone number, no network connection, no location data. Cryptographic challenge-response on the device itself. The server learns the correct key was present — nothing more.
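The challenge-response idea can be sketched with an ordinary key pair. This is a deliberate simplification, not the WebAuthn protocol: it uses Ed25519 signatures from Node's crypto module, keeps both "sides" in one process, and the origins are hypothetical. It does show the two properties described above: the server stores only a public key, and a response produced for one site does not verify for another.

```javascript
const crypto = require('node:crypto');

// Registration: the key pair is created on the authenticator; the server
// stores only the public key.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// Server side: issue a fresh random challenge for each login.
function newChallenge() {
  return crypto.randomBytes(32);
}

// Authenticator side: sign the challenge together with the site origin, so a
// signature produced for a look-alike site does not verify for the real one.
function signChallenge(challenge, origin, key) {
  return crypto.sign(null, Buffer.concat([challenge, Buffer.from(origin)]), key);
}

// Server side: verify against the stored public key and the expected origin.
function verifyResponse(challenge, origin, signature, key) {
  return crypto.verify(null, Buffer.concat([challenge, Buffer.from(origin)]), key, signature);
}

const challenge = newChallenge();
const good = signChallenge(challenge, 'https://bank.example', privateKey);
const phished = signChallenge(challenge, 'https://bank-login.example', privateKey);
console.log(verifyResponse(challenge, 'https://bank.example', good, publicKey));    // true
console.log(verifyResponse(challenge, 'https://bank.example', phished, publicKey)); // false
```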

Section 15: Security vs. Usability

The question is never "is this more secure?" but "is the security improvement worth the usability cost?"

Friction Budgets & Adaptive Security

Friction Budgets

Users have finite tolerance for security friction. Once the budget is spent, they find workarounds: reuse passwords, share accounts, write credentials on sticky notes, or stop using the service.

Risk-Based / Adaptive Security

Low-risk: Viewing a dashboard → basic auth is enough
High-risk: Changing password, transferring money → re-authenticate, MFA
Progressive security keeps the common path fast, protects critical actions

Security Theater

Security questions — guessable, publicly findable
Password expiration — produces predictable sequences
CAPTCHAs for every login — annoying, increasingly solvable by bots
Complexity rules — produce P@ssw0rd! not genuinely strong passwords

Password Managers & Passkeys

The Password Manager Paradox

The most secure option — a unique, random 30+ character password for every site — requires a tool most users don't have, don't understand, or don't trust. The most secure path isn't the easiest path.

Passkeys: The Emerging Answer

Cryptographic key pairs stored on the user's devices that replace passwords entirely
No password to remember, steal, or phish
User authenticates with biometric (fingerprint, face) or device PIN
Backed by Apple, Google, Microsoft — built on FIDO2/WebAuthn
Sync across devices via iCloud Keychain, Google Password Manager, etc.

The design principle: make the secure path the easy path. If the most secure option is also the most convenient, users will choose it — not because they care about security, but because it's easier. Passkeys are that convergence point.

Summary: Key Takeaways

15 concepts of authentication and identity in one table.

Authentication at a Glance

Concept	Key Takeaway
The Auth Problem	Three factors (know, have, are); deceptively hard; security is mindset over technology
Build or Delegate?	Hosted services reduce risk but add dependency; most teams delegate the hard parts
Password Storage	Never store passwords; use bcrypt or Argon2id with unique salts and configurable work factor
Hall of Shame	Plaintext, reversible encryption, fast hashes, single salts, logged/emailed passwords
Password Policies	NIST 800-63B: length over complexity, no forced expiration, check breach databases
Hardening Login	Rate limiting, tarpitting, generic errors, constant-time comparison
Account Recovery	Reset, never recover; crypto-random, single-use, time-limited tokens
GeoIP & Risk Auth	Device + location + time + behavior = risk score; same tech as ad-tech tracking
Sessions & Tokens	Server sessions (revocable) vs. JWTs (stateless); always use HttpOnly, Secure, SameSite
Implementation	PHP: built-in password_hash/verify; Node: bcrypt package; both correct, PHP simpler
OAuth & Federated	Authorization Code flow; no password to store, built-in MFA; genuine benefits
The Trust Problem	Provider logs every login; vendor lock-in, scope creep; Keycloak/Ory as alternatives
MFA	Hardware keys > TOTP > push > SMS > nothing; MFA fatigue and SIM swapping are real
2FA Surveillance	Phone-based 2FA = tracking vectors; hardware keys are the privacy-preserving answer
Security vs. Usability	Friction budgets are finite; passkeys make the secure path the easy path
