| Lesson 3 |
How to create a table |
| Objective |
Learn how to Create a SQL Table and how table structure affects SQL. |
Table Structure and SQL Queries (CREATE TABLE, Column Types, and Views)
The structure of a database table — the names you give its columns, the data types you assign them, and the constraints you place on them — directly determines how you write every SQL query against that table. This lesson introduces the CREATE TABLE statement, explains how each structural decision you make at table creation time flows through into the SELECT, WHERE, and INSERT statements you write later, and covers views as a mechanism for defining reusable derived queries on top of base tables. Lesson 4 will cover the full CREATE TABLE syntax in detail; this lesson focuses on the structural concepts and their SQL consequences.
How Table Structure Affects Your SQL
When you create a table, you are making decisions that will shape every SQL statement written against that table for its entire lifetime. Three structural decisions have the most direct impact on query syntax.
First, the column names you define in CREATE TABLE are the exact names you must use in SELECT, WHERE, INSERT, and UPDATE statements. If you name a column Firstname in CREATE TABLE, you reference it as Firstname in SELECT — not firstname, not FirstName, not first_name. Most database engines are case-insensitive for column names, but the name must otherwise match exactly. A column named PhoneNumber cannot be queried as Phone.
Second, the data type of each column determines what operations are valid against it in SQL. A column defined as CHAR(10) stores text and supports string functions and pattern matching with LIKE. A column defined as INT supports arithmetic and numeric comparisons. A column defined as DATE supports date arithmetic and date functions. Attempting to apply a numeric operation to a text column, or a string function to a date column, either produces an error or an unexpected result depending on the database engine.
Third, whether a column allows NULL values — determined by the NULL or NOT NULL constraint in CREATE TABLE — affects how you write WHERE clause conditions for that column. A column defined as NULL-able requires special handling in comparisons, because NULL does not equal anything, including itself. These implications are covered in detail in the sections below.
A Simple CREATE TABLE Example
The BasicTable Definition
The following CREATE TABLE statement defines a simple contact table with three columns:
CREATE TABLE BasicTable (
Firstname CHAR(50) NULL,
Lastname CHAR(50) NULL,
PhoneNumber CHAR(10) NULL
);
This statement creates a new table named
BasicTable with three columns. Each column is defined with a name, a data type, and a NULL constraint. The parentheses enclose the column list; each column definition is separated by a comma; the statement ends with a semicolon.
Reading the Column Properties
The column definitions in the CREATE TABLE statement map directly to the structural properties of the table:
| Property |
Firstname |
Lastname |
PhoneNumber |
| Column Type |
CHAR(50) — text, 50 characters |
CHAR(50) — text, 50 characters |
CHAR(10) — text, 10 characters |
| NULL Allowed |
Yes |
Yes |
Yes |
| Column Description |
Contact's first name |
Contact's last name |
Contact's phone number |
The column names —
Firstname,
Lastname,
PhoneNumber — are the names you use in every SQL statement that references this table. A SELECT that retrieves all contacts looks like this:
SELECT Firstname, Lastname, PhoneNumber
FROM BasicTable
ORDER BY Lastname, Firstname;
The column names in the SELECT clause match the column names in the CREATE TABLE statement exactly. The table name in the FROM clause matches the table name in CREATE TABLE exactly. This one-to-one correspondence between the table definition and the query syntax is the fundamental connection this lesson is built on.
NULL and NOT NULL — Choosing Wisely
The
NULL keyword in a column definition indicates that the column accepts NULL values — the absence of a value is permitted for that column. In the BasicTable example, all three columns are NULL-able: a contact can be stored with no first name, no last name, and no phone number.
The
NOT NULL constraint prevents NULL values from being stored in a column. If you want to require that every contact has at least a last name, you define that column as NOT NULL:
CREATE TABLE BasicTable (
Firstname CHAR(50) NULL,
Lastname CHAR(50) NOT NULL,
PhoneNumber CHAR(10) NULL
);
With this definition, any INSERT that omits Lastname or explicitly provides NULL for Lastname will be rejected by the database engine with a constraint violation error. The NOT NULL constraint is enforced automatically at the database level — application code does not need to validate this separately.
The choice between NULL and NOT NULL is a data modeling decision: which attributes are mandatory for every record, and which are optional? Primary key columns are always implicitly NOT NULL. Columns that represent optional attributes — a middle name, a fax number, a secondary email address — are typically defined as NULL-able. Columns that represent core identity or required business data — a customer ID, an order date, a product name — should generally be NOT NULL.
We will examine the full CREATE TABLE statement more closely in the next lesson.
Column Types and Their SQL Implications
Character Types — CHAR vs VARCHAR
The BasicTable example uses
CHAR(50) for the name columns and
CHAR(10) for the phone number. CHAR is a fixed-length character type — every stored value is padded with spaces to the declared length. A first name of "Ali" stored in a
CHAR(50) column actually occupies 50 bytes, with 47 trailing spaces. CHAR is efficient for columns where every value has the same length (a two-letter state abbreviation, a fixed-format code), but wasteful for variable-length text.
VARCHAR(n) is the variable-length alternative. A first name of "Ali" stored in a
VARCHAR(50) column occupies only 3 bytes plus a small overhead for the length indicator. VARCHAR is the standard choice for name, address, and description columns where values vary in length. The SQL implications are identical — you reference a VARCHAR column the same way you reference a CHAR column in SELECT, WHERE, and INSERT — but the storage efficiency is significantly better for variable-length data.
Numeric Types — INT, DECIMAL, FLOAT
Numeric column types enable arithmetic operations and numeric comparisons in SQL.
INT stores whole numbers without decimal places — customer IDs, quantities, counts.
DECIMAL(p,s) stores exact decimal values with
p total digits and
s digits after the decimal point — the correct choice for monetary values, where rounding errors in floating-point representation are unacceptable.
FLOAT stores approximate floating-point values — suitable for scientific measurements where a small margin of imprecision is acceptable but inappropriate for financial data.
A column defined as
INT supports queries like:
SELECT Firstname, Lastname
FROM Contacts
WHERE AreaCode = 602;
The same column defined as
CHAR(3) requires a string comparison:
SELECT Firstname, Lastname
FROM Contacts
WHERE AreaCode = '602';
The single quotes indicate a string literal. Using numeric literals against a CHAR column, or string literals against a numeric column, either produces an error or forces an implicit type conversion — a source of subtle bugs and performance problems that the correct initial data type choice prevents entirely.
Date and Time Types
DATE,
TIME,
DATETIME, and
TIMESTAMP columns enable date arithmetic and temporal filtering that text columns cannot support. A column defined as VARCHAR that stores dates as text strings ("2026-04-15") cannot be queried with date range conditions without string manipulation. A column defined as DATE supports natural date range queries:
SELECT Firstname, Lastname
FROM Contacts
WHERE DateAdded BETWEEN '2026-01-01' AND '2026-04-15';
The database engine interprets the string literals as dates because the column is declared as DATE, enabling index use and correct temporal ordering.
NULL in WHERE Clauses — IS NULL vs = NULL
One of the most common SQL mistakes for developers new to NULL is writing
WHERE column = NULL. This condition never returns any rows — not because there are no NULL values, but because NULL is not equal to anything, including NULL itself. The SQL standard defines NULL comparisons as producing a third logical value — UNKNOWN — rather than TRUE or FALSE. A WHERE clause requires TRUE to include a row in the result; UNKNOWN means the row is excluded.
The correct syntax for testing NULL is
IS NULL and
IS NOT NULL:
-- Find contacts with no phone number recorded
SELECT Firstname, Lastname
FROM BasicTable
WHERE PhoneNumber IS NULL;
-- Find contacts who have a phone number
SELECT Firstname, Lastname
FROM BasicTable
WHERE PhoneNumber IS NOT NULL;
This behavior is a direct consequence of defining PhoneNumber as NULL-able in CREATE TABLE. If PhoneNumber were defined as NOT NULL, the IS NULL condition would never return rows and the question would never arise. The NULL constraint decision at table creation time creates the IS NULL query requirement at query time.
Column Naming and SQL Query Impact
Names Must Match Exactly
The column names defined in CREATE TABLE are the identifiers that SQL uses to reference those columns in every subsequent statement. If the column is named
PhoneNumber, the SELECT clause must say
PhoneNumber. Most database engines treat column names as case-insensitive by default —
phonenumber,
PHONENUMBER, and
PhoneNumber all reference the same column. However, spaces in column names, special characters, and reserved words require quoting — wrapping the name in double quotes (SQL standard), backticks (MySQL), or square brackets (SQL Server) depending on the database engine.
Naming Conventions — CamelCase and snake_case
Two naming conventions dominate relational database column naming: CamelCase (
FirstName,
PhoneNumber,
OrderDate) and snake_case (
first_name,
phone_number,
order_date). Both are valid. The important principle is consistency within a schema — mixing conventions within the same table or database creates confusion and increases the chance of typos in SQL statements.
The BasicTable example uses CamelCase. A snake_case equivalent would be:
CREATE TABLE basic_table (
first_name CHAR(50) NULL,
last_name CHAR(50) NULL,
phone_number CHAR(10) NULL
);
The SQL queries against this table use the snake_case names:
SELECT first_name, last_name FROM basic_table.
Reserved Words as Column Names
SQL has a set of reserved words — keywords that have special meaning in the SQL grammar — that cannot be used as unquoted column names. Words like
ORDER,
DATE,
TABLE,
SELECT,
FROM, and
WHERE are reserved in most SQL dialects. Naming a column
date or
order without quoting it will cause a syntax error in most database engines. The solution is to either avoid reserved words as column names — use
order_date instead of
date,
sort_order instead of
order — or quote the names wherever they appear in SQL statements.
Relational Algebra — Base and Derived Relations
The operators of the
relational algebra allow us to start with given relations and derive further relations from them — through selection, projection, join, union, and other operations. The given relations are called base relations; the results of applying operators to base relations are called derived relations.
Base Relations and CREATE TABLE
In SQL, base relations are implemented as base tables — the concrete, persistently stored tables created with CREATE TABLE. A base table physically stores data rows. When you INSERT a row into a base table, that row is written to the database's storage system and remains there until explicitly deleted.
CREATE TABLE Contacts (
ContactID INT NOT NULL,
Firstname CHAR(50) NULL,
Lastname CHAR(50) NOT NULL,
PhoneNumber CHAR(10) NULL,
CONSTRAINT pk_contacts PRIMARY KEY (ContactID)
);
This CREATE TABLE statement defines a base relation named Contacts. Every INSERT into Contacts adds a physical row to the base table. Every SELECT against Contacts reads from those physically stored rows.
Derived Relations and Views
A derived relation does not store data physically. Instead, it stores a query expression — a definition of how to compute a result from one or more base relations at the time the derived relation is referenced. In SQL, the primary mechanism for named derived relations is the view.
CREATE VIEW — Defining a Named Derived Relation
The View Syntax
A view is a named relation whose value at any point in time is the result of evaluating its defining query expression against the current state of the underlying base tables. Here is a concrete example based on the Contacts table defined above:
CREATE VIEW ContactsWithPhone AS
SELECT ContactID, Firstname, Lastname, PhoneNumber
FROM Contacts
WHERE PhoneNumber IS NOT NULL;
This view, named
ContactsWithPhone, presents only those contacts who have a phone number recorded. Once defined,
ContactsWithPhone can be queried exactly like a base table:
SELECT Firstname, Lastname, PhoneNumber
FROM ContactsWithPhone
WHERE Lastname = 'Gauss';
The database engine evaluates the view's defining query at execution time and combines it with the outer query — effectively executing the view's filter (PhoneNumber IS NOT NULL) and the outer query's filter (Lastname = 'Gauss') together against the base Contacts table.
When to Use Views
Views serve several practical purposes in SQL development. They hide complexity — a view can encapsulate a multi-table JOIN with aggregation so that application developers can query a simple named view rather than writing the full JOIN every time. They present a subset of columns — a view can expose only the columns that a particular application or user role needs, without granting access to sensitive columns in the underlying table. They enforce row-level access — a view filtered to specific rows can be granted to users who should see only that subset of data.
Views also provide a layer of abstraction between the logical data model (what the application sees) and the physical data model (how the data is actually stored). If a table is restructured — columns renamed, tables split — the view's defining query can be updated to maintain the same interface for applications, without requiring those applications to change their SQL.
Views vs Materialized Views
A standard view, as described above, does not store data — it stores only the query definition. Every time the view is queried, the defining expression is evaluated against the current base table data. This means the view always reflects the latest state of the underlying data, but it also means the view's query executes every time the view is referenced.
A materialized view (also called a snapshot in some database systems) stores the result of the view's defining query physically, like a base table. The stored result is refreshed periodically or on demand. Materialized views improve query performance for expensive aggregations or joins, at the cost of data currency — the materialized result reflects the state of the base data as of the last refresh, not the current moment.
For the standard SQL work covered in this course, views are non-materialized — they are purely conceptual overlays on top of base tables. When you query a view, you can think of the database engine constructing a temporary base table whose value is obtained by evaluating the view's defining expression at that moment, and then executing your outer query against that temporary result. This mental model — materialization as a way of thinking, not a physical reality — captures the correct conceptual behavior without implying that any physical storage is involved.
