Normalization introduction

Last reviewed May 28, 2026 Content v20260528

Track mode

sql_sandbox

Means

SQL sandbox

Reading

~2 min

Level

intermediate

This lesson

An orientation to the SQL track—relational concepts, query patterns, and how to practice until the SQL sandbox lab ships.

You need a clear map of the SQL track so tables, keys, JOINs, and aggregates do not feel like magic.

You will apply Normalization introduction in contexts like: Postgres, MySQL, SQLite, warehouses, and ORMs that still expose SQL.

Copy SQL from each lesson into SQLite (sqlite3), DB Fiddle, or local Postgres—read result grids and row counts. The in-browser SQL lab (sql_sandbox) will run queries when the runner ships; until then, local clients are the practice path. Also read the interview prep blocks.

After basic programming literacy—before ORM-heavy frameworks assume you can read the SQL they generate.

Normalization reduces redundancy by splitting data into related tables. First normal form (1NF) through third normal form (3NF) are interview staples and guide practical schema design.

Problem: duplication

Storing customer name on every order row duplicates data—update anomalies if Ada renames and only some rows change.

Split tables

-- Normalized: customer data once
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
  id INTEGER PRIMARY KEY,
  customer_id INTEGER REFERENCES customers(id),
  total REAL
);

Practice: Run DDL in a fresh SQLite file or DB Fiddle schema pane. Drop test tables when experimenting.

Denormalize deliberately

Reporting caches, read replicas, and materialized views sometimes denormalize for speed—start normalized, denormalize with measured need. Data Science pipelines may flatten tables for ML features.

Normal forms sketch

1NF — atomic columns, no repeating groups
2NF — no partial dependency on composite keys
3NF — no transitive dependency on non-key attributes

Important interview questions and answers

Q: Always 3NF?
A: OLTP apps usually yes; analytics may denormalize star schemas in warehouses.
Q: JOIN cost vs duplication?
A: JOINs cost CPU; duplication costs consistency—normalize first, optimize with indexes and caches.

Self-check

What update anomaly does duplication cause?
Name the first three normal forms in order.

Tip: Normalize OLTP schemas first; denormalize reporting only with measured need.

Interview prep

Why normalize?: Reduce duplication and update anomalies.
3NF sketch?: No non-key attribute depends on another non-key attribute.

Discussion

Past discussion is visible to everyone. Only logged-in users can post comments and replies.

Starter discussion topics

3NF enough?
Denormalize when?

No discussion yet. Be the first to ask a question.