Lesson 1

SQL Foundations (Databases, Tables, Rows, and Columns Explained)

Every SQL query you write targets a specific location inside a structured hierarchy. Before you can write a single SELECT statement, you need to understand that hierarchy — the four levels that organize all data in a relational database system. This lesson introduces those four levels, explains how they relate to each other, and shows how the SQL language is designed to navigate them.

The Four Levels of a SQL Database

A relational database system organizes data into four nested levels, each containing the next. You can think of them as stackable containers — like measuring cups that fit inside one another, each holding something more specific than the one surrounding it. When you query a database, your SQL statement must specify enough of this hierarchy for the database engine to locate exactly the data you are asking for.

The four levels, from outermost to innermost, are: the database, the table, the row, and the column.

Level 1 — The Database

The database is the outermost container. It is a named collection of related tables, managed as a unit by the database engine. A single database server can host dozens or hundreds of individual databases — one per application, one per client, or one per domain, depending on how the server is organized. When you connect to a database server using a tool like DBeaver, Beekeeper Studio, or HeidiSQL, one of the first things you specify is which database you want to work with.

The database named relationaldbdesign — shown at the top of the diagram — is the container for all the tables that store the data behind this website. Every table in this lesson's examples lives inside that database. In SQL, you reference the database in connection strings and, in some dialects, in fully qualified table names: relationaldbdesign.Customers.

Level 2 — The Table

Inside a database are one or more tables. A table is a named, two-dimensional structure that stores data about a single entity type — customers, products, orders, employees. The table named Customers stores data about customers. The table named Orders stores data about orders. Each table has its own structure, defined by its schema — the list of column names, data types, and constraints that determine what data the table can hold and how it is organized.

A database can contain zero or more tables, though most production databases contain many. The table is the primary object you interact with when writing SQL — SELECT, INSERT, UPDATE, and DELETE all operate on tables.

Level 3 — The Row

Inside a table are zero or more rows. Each row represents one instance of the entity the table describes — one customer, one product, one order. A table with three rows contains data about three distinct entities. A table with zero rows has a defined structure but no data yet — an empty container whose schema is ready to receive records.

SQL theorists use the term tuple to describe a row, because each row is formally a tuple in the mathematical sense — an ordered set of values, one per column. In everyday SQL development, the terms row and record are used interchangeably, and both mean the same thing: one horizontal slice of data across all columns in the table.

Level 4 — The Column

Inside each row are one or more column values. A column represents one attribute of the entity — the customer's name, the product's price, the order's date. Every row in a table has the same set of columns, defined by the table's schema. Where rows represent individual entities, columns represent the properties that describe every entity of that type.

SQL theorists use the term attribute to describe a column, for the same reason they use tuple for row — relational algebra treats each column as an attribute of the relation. In practice, column and attribute are used interchangeably. Each column has a data type that constrains what values it can store: an integer column cannot store text, a date column cannot store negative numbers, and a VARCHAR column has a maximum character length defined at schema creation time.

SQL database hierarchy showing Database, Table, Row, and Column as nested containers — A SQL database contains tables. Tables contain rows. Rows contain columns. Every query you write targets this four-level hierarchy.

SQL — Structured Query Language

What SQL Is and What It Does

SQL stands for Structured Query Language. It is the standard language for interacting with relational database management systems — creating their structures, populating them with data, querying that data, modifying it, and deleting it. SQL is not a general-purpose programming language like Java or Python; it is a specialized query language designed specifically for relational data operations.

SQL lets you access and manipulate databases at every level of the four-level hierarchy. You use SQL to create a database, to create tables within it, to insert rows into those tables, to update existing rows, to delete rows, and to query rows — retrieving exactly the data you need by specifying the columns, the table, the conditions, and the relationships that define your request.

SQL as a Set-Oriented Language

SQL is described as a set-oriented query language because it operates on entire sets of rows rather than on individual rows one at a time. When you write a SELECT statement with a WHERE clause, you are not telling the database engine to examine row 1, then row 2, then row 3. You are describing a set of rows — all rows where a certain condition is true — and asking the engine to return that entire set as a result.

This set orientation is what makes SQL declarative: you describe what data you want, not how to retrieve it. The database engine's query planner decides the most efficient physical execution path — which indexes to use, in what order to scan tables, how to join multiple tables — and executes that plan to produce your result set. As a SQL developer, your job is to state the request correctly; the engine's job is to fulfill it efficiently.

The ANSI/ISO Standard

SQL is an ANSI (American National Standards Institute) and ISO (International Organization for Standardization) standard. The standard defines a core set of SQL syntax, data types, and behavioral guarantees that conforming database systems must support. Major SQL standards were published in 1986, 1992 (SQL-92), 1999, 2003, 2008, 2011, 2016, and 2023, with each revision adding new features and clarifying existing ones.

In practice, every major relational database — PostgreSQL, MySQL, SQL Server, Oracle, SQLite, IBM Db2 — implements the SQL standard with extensions and variations specific to that product. These variations are called SQL dialects. The core SELECT, INSERT, UPDATE, DELETE, and CREATE TABLE syntax is consistent across all major dialects; the differences appear in advanced features, data type names, and built-in functions. The SQL you learn in this course targets the standard core that works across all major systems.

Tables, Rows, and Columns in Detail

The Schema — A Table's Structure

Every table has a schema — a formal definition of its structure that specifies the table's name, the name of each column, the data type of each column, and any constraints that apply. The schema is created when the table is created using the CREATE TABLE statement, and it persists independently of the data. A table's schema does not change when rows are inserted or deleted; it only changes when a database administrator explicitly alters the table structure.

The schema is what makes a relational table queryable in a predictable way. Because the engine knows the name and type of every column before any query runs, it can validate your SQL at parse time — catching errors like referencing a column that does not exist or comparing an integer column to a text value — before spending any resources on data retrieval.

Columns as Attributes

Each column in a table represents one attribute of the entity the table describes. In the Customers table shown in the diagram, three attributes are defined: ID (a numeric identifier), NAME (a text name), and ADDRESS (a text address). Every row in the Customers table has exactly these three attributes — no more, no less. A customer without a name cannot be inserted if the NAME column is defined as NOT NULL; a customer with two addresses cannot be stored in a single row without violating the column structure.

This uniformity is one of the relational model's foundational principles: every row in a table conforms to the same schema. It is what allows SQL to treat a table as a set — every member of the set has the same attributes, making set operations like union and intersection mathematically meaningful.

Rows as Tuples — One Record Per Row

Each row in a table represents one complete record — one instance of the entity described by the table's schema. The example table below contains three rows, each describing one mathematician. The terms row and record are used interchangeably throughout SQL documentation and in this course; they refer to the same concept. In formal relational theory, a row is called a tuple — an ordered sequence of attribute values, one for each column in the table's schema.

An example of a table is shown below:

Table 1. Example Database Table

ID	NAME	ADDRESS
1	Gauss	29 Magnetic Street
2	Riemann	31 Prime Road
3	Hilbert	37 Vector Space

This table contains three rows of data and three columns: ID, NAME, and ADDRESS. The column headers define the schema — the structure that every row must conform to. ID is the primary key, NAME is a text attribute, and ADDRESS is a text attribute. Each row represents one mathematician: Gauss at 29 Magnetic Street, Riemann at 31 Prime Road, and Hilbert at 37 Vector Space.

The Primary Key

What Makes ID a Primary Key

The ID column in the example table is designated as the primary key. A primary key is a column — or combination of columns — whose values uniquely identify each row in the table. No two rows can share the same primary key value, and the primary key column cannot contain a NULL value. These two constraints together guarantee that every row in the table is uniquely addressable.

In the example table, Gauss has ID 1, Riemann has ID 2, and Hilbert has ID 3. These values are unique — no two mathematicians share an ID. If you attempt to insert a fourth row with ID 1, the database engine rejects the INSERT with a primary key violation error. This enforcement happens automatically, at the database level, regardless of which application submitted the INSERT.

In module 3, the primary key concept was introduced through the attorney analogy — the attorney's name as the unique identifier linking the Calendar table to the Files table. In the Customers table, ID plays the same role: it is the value that other tables use as a foreign key reference when they need to link to a specific customer record.

Why Every Table Needs One

A table without a primary key cannot guarantee that any two rows are distinct. Without uniqueness enforcement, the same customer could be inserted twice with identical data — and the database engine would have no way to tell them apart or to update only one of them. In practice, every well-designed table has a primary key, and most database design tools warn or refuse when a table is created without one.

The primary key is also the mechanism by which relational databases implement the JOIN operation. When the Orders table stores a customer identifier as a foreign key, it is storing the primary key value from the Customers table. The JOIN condition — ON Orders.CustomerID = Customers.ID — connects the two tables through this shared value. Without a reliable primary key in the Customers table, that join cannot be guaranteed to produce correct results.

NULL — When a Cell Has No Value

NULL is a special marker in SQL that means a column value is absent — the value is unknown, not applicable, or not yet provided. NULL is not the same as zero, not the same as an empty string, and not the same as the text "NULL". It is the absence of a value, represented distinctly from any actual data value.

In the example table, every cell has a value — every mathematician has an ID, a name, and an address. In real-world tables, not every row will have a complete set of values for every column. A customer record might be created before a shipping address is known. An employee record might have a NULL end date because the employee is still active. A product might have a NULL discount percentage because no discount applies.

Columns can be defined as NOT NULL — meaning the database engine will reject any INSERT or UPDATE that leaves that column without a value. Primary key columns are always implicitly NOT NULL. Other columns are NULL-able by default unless the schema explicitly prohibits it. Understanding NULL behavior is essential for writing correct SQL, because NULL values participate in comparisons differently from regular values — a topic covered in detail in later lessons.

SQL Commands That Work With This Structure

SQL provides five categories of commands for working with the four-level hierarchy. Each category targets a specific layer of the database structure.

CREATE TABLE — Defining the Schema

The CREATE TABLE statement defines a new table's schema — its name, columns, data types, and constraints. Once created, the schema persists in the database's metadata catalog until explicitly changed or dropped. The table exists as an empty container, ready to receive rows, the moment the CREATE TABLE statement executes successfully.

CREATE TABLE Customers (
    ID      INT          NOT NULL,
    NAME    VARCHAR(100) NOT NULL,
    ADDRESS VARCHAR(200),
    CONSTRAINT pk_customers PRIMARY KEY (ID)
);

INSERT — Adding Rows

The INSERT statement adds one or more rows to an existing table. Each INSERT must provide values that satisfy the table's schema — the correct number of columns, the correct data types, and compliance with all NOT NULL and PRIMARY KEY constraints.

INSERT INTO Customers (ID, NAME, ADDRESS)
VALUES (1, 'Gauss',   '29 Magnetic Street'),
       (2, 'Riemann', '31 Prime Road'),
       (3, 'Hilbert', '37 Vector Space');

SELECT — Querying Rows

The SELECT statement retrieves rows from one or more tables, returning a result set that matches the specified columns and conditions. SELECT is the most frequently used SQL command and the primary focus of this course.

SELECT ID, NAME, ADDRESS
FROM Customers
WHERE ID = 2;

This query returns one row — Riemann at 31 Prime Road — by filtering the Customers table to the row where ID equals 2. The WHERE clause is the condition; the SELECT clause specifies the columns to include in the result; the FROM clause names the table to query.

UPDATE — Modifying Rows

The UPDATE statement modifies the values in one or more existing rows. An UPDATE without a WHERE clause modifies every row in the table — a common and consequential mistake. In production systems, UPDATE statements should always include a WHERE clause that precisely identifies the rows to be changed.

UPDATE Customers
SET ADDRESS = '42 Hilbert Space'
WHERE ID = 3;

DELETE — Removing Rows

The DELETE statement removes one or more rows from a table. Like UPDATE, a DELETE without a WHERE clause removes every row in the table — leaving the table structure intact but empty. The schema and the table itself remain; only the row data is removed.

DELETE FROM Customers
WHERE ID = 1;

Navigating the Four Levels in a SQL Query

Every SQL query you write navigates the four-level hierarchy — database, table, row, column — to locate and retrieve the specific data you need. The SELECT statement makes this navigation explicit in its structure.

Consider the query:

SELECT NAME, ADDRESS
FROM Customers
WHERE ID = 2;

Each clause corresponds to one level of the hierarchy. The FROM Customers clause identifies the table — Level 2. The WHERE ID = 2 clause identifies the row — Level 3, the specific record where ID equals 2. The SELECT NAME, ADDRESS clause identifies the columns — Level 4, the specific attribute values to return from that row.

The database level — Level 1 — is specified implicitly by the database connection. When you connect to the relationaldbdesign database using DBeaver or HeidiSQL, every table reference in your SQL statements is resolved within that database unless you explicitly qualify it otherwise.

This four-level navigation structure — connection establishes the database, FROM identifies the table, WHERE identifies the rows, SELECT identifies the columns — is the foundation of every SQL query you will write in this course, from the simplest single-table SELECT to the most complex multi-table JOIN with aggregation and subqueries. Understanding the hierarchy now means that every new SQL concept introduced in the lessons ahead has a clear structural context to attach to.