Data Storage & Databases

From Flat Files to Full Databases

"The right question isn't 'which database is best?' It's 'what does my application actually need?'"

CSE 135 — Full Overview | Review Questions

Section 1The Storage Landscape

Six storage options — parallel choices, not a ranking.

Storage Options at a Glance

Storage TypeSetupQuery PowerConcurrencyBest For
In-MemoryNoneCode-levelSingle processPrototyping, caching
Flat FilesMinimalRead/parse entire filePoor — race conditionsTutorials, config
SQLiteLowFull SQLSingle writerMobile, embedded, dev
PostgreSQL/MySQLMediumFull SQL + extensionsExcellent — ACIDProduction web apps
MongoDBMediumDocument queriesGood — built for scaleFlexible schemas
RedisMediumKey-valueExcellent — atomicCaching, sessions
NO SERVER REQUIRED REQUIRES A SERVER In-Memory Flat Files SQLite PostgreSQL MongoDB Redis
These aren't rungs on a ladder — they're parallel choices. Don't reach for PostgreSQL when a JSON file will do, or use a JSON file when you need transactions.

Section 2Flat-File Storage

Read file, parse JSON, modify array, write file back — the simplest persistence.

The Read-Modify-Write Cycle

Request comes in: 1. Read entire file from disk ┌──────────────┐ ─────────────────────────────▶ │ items.json │ │ [{"id":1, │ 2. Parse JSON into memory │ "name":"…"}│ items = JSON.parse(data) │ ... │ └──────────────┘ 3. Modify array in memory items.push(newItem) 4. Write entire file back ┌──────────────┐ ─────────────────────────────▶ │ items.json │ │ [updated │ │ contents] │ └──────────────┘ RACE CONDITION: Two requests read → modify → write simultaneously = second write overwrites first write's changes
ProsCons
Zero setup — no database serverNo query language — load entire file
Human-readable — open and see dataRace conditions with concurrent writes
Easy to debug — just look at JSONNo indexing — slow with large data
Portable — copy one fileNo relationships — no JOINs
Great for learning CRUDRewrites entire file on every change
Flat files are fine for prototyping and tutorials. The moment you need concurrent users or complex queries, it's time for a real database.

Section 3Relational Databases & SQL

Tables, rows, foreign keys, ACID transactions — the backbone of web applications.

Schema & CRUD-to-SQL-to-REST Mapping

The Stories Table

ColumnTypeConstraintsPurpose
idSERIALPRIMARY KEYAuto-incrementing unique ID
titleVARCHAR(200)NOT NULLRequired, max 200 chars
descriptionTEXT(nullable)Optional longer text
priorityVARCHAR(10)DEFAULT 'medium'low / medium / high
statusVARCHAR(20)DEFAULT 'todo'todo / in-progress / done
created_atTIMESTAMPDEFAULT CURRENT_TIMESTAMPAuto-set on creation

Three-Layer Mapping

CRUDSQL StatementHTTP MethodREST Endpoint
CreateINSERT INTOPOST/api/stories
Read (all)SELECT * FROMGET/api/stories
Read (one)SELECT ... WHERE id=$1GET/api/stories/:id
UpdateUPDATE ... SET ... WHEREPUT / PATCH/api/stories/:id
DeleteDELETE FROM ... WHEREDELETE/api/stories/:id

SQL Examples & RETURNING

-- CREATE: Insert a new story INSERT INTO stories (title, description, priority) VALUES ('Build login page', 'Add auth', 'high') RETURNING *; -- READ: Get all stories SELECT * FROM stories ORDER BY created_at DESC; -- READ: Filter stories SELECT * FROM stories WHERE status = 'todo' AND priority = 'high'; -- UPDATE: Change status UPDATE stories SET status = 'in-progress' WHERE id = 1 RETURNING *; -- DELETE: Remove a story DELETE FROM stories WHERE id = 1 RETURNING *;
The RETURNING * clause is a PostgreSQL feature that returns affected rows after INSERT/UPDATE/DELETE — no second query needed. Extremely useful in REST APIs: return the created/updated resource in the response directly.

Section 4Connecting from Code

Node.js (pg) vs PHP (PDO) — same database, different client libraries.

Node.js vs PHP — Side by Side

ConceptNode.js (pg)PHP (PDO)
Installnpm install pgBuilt-in (pdo_pgsql extension)
Connectionnew Pool({host, database, ...})new PDO($dsn, $user, $password)
Simple querypool.query('SELECT ...')$pdo->query('SELECT ...')
Parameterizedpool.query('...$1...', [val])$stmt->execute(['id' => $val])
PlaceholdersPositional: $1, $2, $3Named: :id, :title or ?
Get rowsresult.rows$stmt->fetchAll()
Connection poolingBuilt into PoolPersistent connections or external pooler
Error handlingtry/catch with async/awaittry/catch with PDO::ERRMODE_EXCEPTION
Never hardcode database passwords in source code. Use environment variables (process.env.DB_PASSWORD / getenv('DB_PASSWORD')). Bots scan public repos for leaked credentials.

Section 5SQL Injection

The #1 web vulnerability — and the most preventable.

The Attack

Normal input: alice Query: SELECT * FROM users WHERE username = 'alice' ✓ OK Malicious input: ' OR '1'='1' -- Query: SELECT * FROM users WHERE username = '' OR '1'='1' --' What the database sees: username = '' → false OR '1'='1' → always true! -- → comment (ignores the rest) Result: Returns ALL users. Attacker is "logged in."
Even worse input: '; DROP TABLE stories; -- Query: SELECT * FROM users WHERE username = ''; DROP TABLE stories; --' Database executes: 1. SELECT * FROM users WHERE username = '' (returns nothing) 2. DROP TABLE stories (deletes your table!) 3. --' (rest is a comment)
SQL injection is the #1 web vulnerability (OWASP Top 10). It can expose data, bypass auth, delete tables, and in some cases execute OS commands on the database server.

The Fix: Parameterized Queries

String Concatenation (VULNERABLE): Your code: "SELECT * FROM users WHERE name = '" + input + "'" Database sees: One big string to parse as SQL Problem: Input becomes part of the SQL structure Parameterized Query (SAFE): Step 1 - PREPARE: Database parses "SELECT * FROM users WHERE name = $1" SQL structure is locked in — one condition, one value Step 2 - EXECUTE: Database receives $1 = "'; DROP TABLE users; --" Treated as a data value, period. Cannot change the query structure.
WRONG — ConcatenationRIGHT — Parameterized
pool.query(`...WHERE id = ${id}`)

Attacker sends: 1; DROP TABLE stories
Query executes the DROP!
pool.query('...WHERE id = $1', [id])

Attacker sends: 1; DROP TABLE stories
DB sees id = '1; DROP TABLE stories' (string value, no damage)
$pdo->query("...WHERE id = " . $_GET['id'])

Same vulnerability in PHP
$stmt = $pdo->prepare('...WHERE id = :id')
$stmt->execute(['id' => $_GET['id']])

Safe in PHP
If you only remember one thing: never put user input directly into a SQL string. Always use parameterized queries ($1, $2 in pg) or prepared statements (:name in PDO).

Section 6NoSQL / Document Databases

Flexible JSON-like documents instead of rigid table rows.

SQL vs NoSQL

Terminology Mapping

SQL ConceptMongoDB Equivalent
DatabaseDatabase
TableCollection
RowDocument
ColumnField
SELECTfind()
INSERTinsertOne()
UPDATEupdateOne()
DELETEdeleteOne()
JOINEmbedded doc or $lookup

When to Choose

Choose SQL (PostgreSQL) WhenChoose NoSQL (MongoDB) When
Data has consistent, well-defined structureData structure varies between records
You need JOINs across related dataRelated data is naturally nested
ACID transactions are criticalEventual consistency is acceptable
Complex queries with aggregationSimple lookups by key or basic filters
Data integrity constraints matterSchema flexibility is more important
Don't choose based on hype. Many teams used MongoDB for everything circa 2012–2015 — including clearly relational data. Choose based on your data's natural shape, not the latest trend.

Section 7ORMs & Query Builders

Mapping tables to classes — convenience vs understanding.

Raw SQL vs ORM

OperationRaw SQLORM (Sequelize / Eloquent)
Get allSELECT * FROM storiesStory.findAll() / Story::all()
Get by IDSELECT ... WHERE id = $1Story.findByPk(id) / Story::find($id)
CreateINSERT INTO stories ...Story.create({title}) / Story::create([...])
UpdateUPDATE ... SET ... WHEREstory.update({status}) / $story->update([...])
DeleteDELETE FROM ... WHEREstory.destroy() / $story->delete()

Active Record vs Data Mapper

PatternHow It WorksExamplesTrade-off
Active RecordObject saves itself: story.save()Sequelize, EloquentSimple; harder to test
Data MapperSeparate mapper handles persistencePrisma, DoctrineMore code; cleaner separation

Popular ORMs

LanguageToolType
Node.jsSequelizeORM (Active Record)
Node.jsPrismaORM (Data Mapper-ish)
Node.jsKnexQuery Builder
PHPEloquentORM (Active Record)
PHPDoctrineORM (Data Mapper)
ORMs are convenient but hide what's happening. Learn raw SQL first, then use an ORM when you understand what it generates. The "N+1 query problem" is a common trap for ORM-only developers.

Section 8Choosing Your Storage

Match the tool to the problem — not the hype.

Decision Framework

Use CaseRecommendedWhy
Learning CRUD / tutorialsFlat files or in-memoryZero setup; focus on HTTP and REST
Personal project / prototypeSQLiteReal SQL, no server, one file
Production web appPostgreSQLACID, battle-tested, rich ecosystem
CMS / content managementPostgreSQL or MongoDBDepends on content structure
Sessions / cachingRedisBlazing fast, built-in TTL
Analytics / event loggingLog files → PostgreSQLAppend-only capture; SQL analysis
Mobile with offlineSQLite + PostgreSQLSQLite offline, sync to server
MicroservicesMix per service"Polyglot persistence"
In MVC, the Model layer is where all database logic lives. The model handles validation, storage, retrieval, and business rules — the controller just coordinates. See MVC Overview.

SummaryKey Takeaways

9 concepts of data storage in one table.

Data Storage at a Glance

ConceptKey Takeaway
Storage LandscapeSix options (in-memory, flat files, SQLite, PostgreSQL, MongoDB, Redis) — parallel choices, not a ranking
Flat FilesRead/write entire JSON file; zero setup; no concurrency; fine for tutorials, bad for production
Relational DatabasesTables, rows, columns; SQL for CRUD; foreign keys; ACID transactions for integrity
SQL ↔ CRUD ↔ RESTINSERT=POST, SELECT=GET, UPDATE=PUT/PATCH, DELETE=DELETE — same pattern at every layer
Connecting from CodeNode.js pg with Pool + $1; PHP PDO with :name placeholders
SQL Injection#1 web vulnerability; string concatenation is the cause; parameterized queries are the fix
NoSQL / MongoDBFlexible documents; good for varied schemas; use SQL when you need JOINs and transactions
ORMsMap tables to objects (Sequelize, Prisma, Eloquent, Doctrine); learn raw SQL first
Choosing StorageMatch tool to problem; PostgreSQL is the safe default; don't over-engineer for tutorials