SQL Foundations  «Prev  Next»

Lesson 4 Create a table statement
Objective Write a SQL Statement to create a table to match Requirements

SQL CREATE TABLE Statement (Column Types, Constraints, and Table Structure)

Lesson 3 introduced the connection between table structure and SQL queries. This lesson applies that foundation by walking through the CREATE TABLE statement in detail — keyword by keyword, column by column, constraint by constraint. By the end of this lesson you will be able to write a complete CREATE TABLE statement from a set of business requirements, choose the correct data type for each column, and apply NULL, NOT NULL, and PRIMARY KEY constraints correctly.

Tables, Rows, Columns, and Values

Terminology — Codd Model vs SQL Standard vs File Processing

If you are familiar with databases already, you have heard alternative terms for similar concepts. Figure 4-4.1 shows how these terms are related. Codd's relational-model terms are in the first column; SQL-standard and DBMS-documentation terms are in the second column; and the third-column terms are holdovers from traditional non-relational file processing. This course uses SQL terminology throughout — table, column, and row — though in formal academic texts the relational model terms and SQL terms are kept distinct.

Three-column diagram mapping Codd model terms (Relation, Attribute, Tuple) to SQL terms (Table, Column, Row) and file processing terms (File, Field, Record)
Figure 4-4.1: Model, SQL, Files — Codd's relational model terms map to SQL standard terms and traditional file processing terms. This course uses SQL terminology throughout.

Database Tables

From a user's point of view, a database is a collection of one or more tables — and nothing but tables. A table has four defining characteristics that distinguish it from other data structures.

First, a table is the database structure that holds data. All data in a relational database lives in tables — not in files, not in objects, not in flat structures. Tables are the fundamental storage unit of the relational model.

Second, a table contains data about a specific entity type. An entity type is a class of distinguishable real-world objects, events, or concepts with common properties — patients, movies, genes, weather conditions, invoices, projects, or appointments. Patients and appointments are different entities, so you store information about them in different tables. Mixing entity types in one table is the source of the normalization violations and data anomalies covered in module 3.

Third, a table is a two-dimensional grid characterized by rows and columns. Rows run horizontally; columns run vertically. Every row has the same set of columns, defined by the table's schema. The diagram below (Figure 4-4.2) shows this structure with a concrete five-row, five-column example.

Fourth, a table holds a data item called a value at each row-column intersection. The value in row 2 of the PRICE column is $20.26. That value belongs to exactly one row (the B. Gupta record) and exactly one column (PRICE). It is the atomic unit of data in the relational model.

A Table Is a Two-Dimensional Grid

Figure 4-4.2 shows a concrete representation of a database table. The ROWS are numbered 1 through 5 on the right side — each row is one complete order record. The COLUMNS are labeled at the bottom with cyan arrows — ID, CUSTOMER, PRODUCT, PRICE, and DATE are five distinct attributes, each forming a vertical slice across all rows. The highlighted cell in row 2 (B. Gupta, Widget B) demonstrates the value concept: the PRICE column intersects row 2 at exactly one cell, which holds the value $20.26. The DETAIL VIEW tooltip shows that a single value can carry additional context — in a real application, this cell's full record includes STATUS: PENDING and a precise timestamp.

Database table diagram showing rows numbered 1-5, five columns (ID, CUSTOMER, PRODUCT, PRICE, DATE), and a highlighted value cell demonstrating the row-column intersection
Figure 4-4.2: A database table is a two-dimensional grid of rows and columns. Each cell holds a single value — the intersection of one row and one column. Row 2 (B. Gupta) × PRICE column = $20.26.

Figure 4-4.3 shows how a table typically appears in database software and documentation — as a plain-text grid with column headers on the first line. This table has 3 columns, 4 rows, and 3 × 4 = 12 values. The header row (au_id, au_fname, au_lname) is not a data row; it displays the column names defined in the table's schema.

au_id   au_fname   au_lname
-----   --------   --------
A01     Sarah      Buchman
A02     Wendy      Heydemark
A03     Hallie     Hull
A04     Klee       Hull
Figure 4-4.3: This grid represents an actual (not abstract) table as it usually appears in database software and books. This table has 3 columns, 4 rows, and 3 × 4 = 12 values. The top row is a header displaying column names, not a data row.

The CREATE TABLE Statement — Keyword by Keyword

The basic concept you need to understand is the fundamental format of the CREATE TABLE statement. It has three parts: the declaration, the column list, and the closing.

Declaring the Table Name

The statement opens with the CREATE TABLE keywords followed by the name of the new table:
CREATE TABLE BasicTable

This portion tells the SQL engine: "I want to create a new table and it should be named BasicTable." The keywords CREATE TABLE are reserved — the engine recognizes them as the command to define a new base relation. The table name follows immediately. Table names follow the same naming rules as column names: no spaces without quoting, no reserved words without quoting, consistent with your schema's naming convention.

Defining the Column List

After the table name, an opening parenthesis begins the column list. Each column is defined on its own line with a name, a data type, and optional constraints. A comma separates each column definition. A closing parenthesis and a semicolon end the statement:

(
    Firstname   CHAR(50) NULL,
    Lastname    CHAR(50) NULL,
    PhoneNumber CHAR(10) NULL
)

The Complete BasicTable Statement

Combining the declaration and the column list produces the complete CREATE TABLE statement:

CREATE TABLE BasicTable (
    Firstname   CHAR(50) NULL,
    Lastname    CHAR(50) NULL,
    PhoneNumber CHAR(10) NULL
);
This statement creates a table named BasicTable with three columns. Firstname and Lastname are CHAR(50) columns — text values up to 50 characters. PhoneNumber is a CHAR(10) column — text up to 10 characters. All three columns allow NULL values. After this statement executes, the table exists in the database with its schema defined and zero rows of data.


Column Data Types

What remains after naming each column is defining what type of information it holds. The data type tells the database engine what values the column can store, how much space to allocate, and what operations are valid against the column in SQL queries.

Character Types — CHAR, VARCHAR, TEXT

Character types store text values — names, descriptions, codes, addresses, and any other data that consists of letters, numbers, spaces, and symbols.
CHAR(n) stores fixed-length text. Every stored value is padded to exactly n characters with trailing spaces. CHAR is efficient for columns where every value has the same length — a two-letter state code, a fixed-format product code, a country code.

VARCHAR(n) stores variable-length text up to n characters. Each value occupies only the space it needs plus a small length indicator. VARCHAR is the standard choice for name, address, and description columns where values vary in length. Most databases support VARCHAR lengths up to 65,535 characters or more — not the 255-character limit noted in older documentation.

TEXT (or its equivalents: CLOB, LONGTEXT, NVARCHAR(MAX)) stores text of unknown or very large length — article bodies, notes, log entries, JSON data. TEXT columns typically cannot be indexed directly and have limitations in WHERE clause comparisons.

Numeric Types — INT, BIGINT, DECIMAL, FLOAT

Numeric types store numbers and support arithmetic operations and numeric comparisons in SQL.

INT stores whole numbers — customer IDs, quantities, counts, ages. The standard INT range is approximately −2 billion to +2 billion. BIGINT extends this to approximately −9 quintillion to +9 quintillion, used for large counters, timestamps, and system-generated IDs.

DECIMAL(p,s) stores exact decimal values with p total digits and s digits after the decimal point. DECIMAL(10,2) stores values like 12345678.99 — the correct type for monetary amounts, where floating-point rounding errors are unacceptable.

FLOAT and REAL store approximate floating-point numbers — suitable for scientific measurements where a small margin of imprecision is acceptable, but never appropriate for financial data.

Date and Time Types — DATE, DATETIME, TIMESTAMP

DATE stores a calendar date (year, month, day) with no time component. DATETIME stores both date and time. TIMESTAMP stores a point in time, often with automatic population on INSERT or UPDATE. The specific type names and behaviors vary slightly between database engines — MySQL uses DATETIME and TIMESTAMP distinctly; PostgreSQL uses TIMESTAMP; SQL Server uses DATETIME2.

Boolean Types — BOOLEAN / BIT

BOOLEAN (PostgreSQL, MySQL 8+) or BIT (SQL Server, older MySQL) stores a true/false value. Used for flags, status indicators, and yes/no attributes. In databases that lack a native boolean type, a CHAR(1) column with 'Y'/'N' or a TINYINT with 0/1 is a common substitute.

Why PhoneNumber Is CHAR, Not Numeric

The PhoneNumber column in BasicTable is defined as CHAR(10), not INT or DECIMAL, even though phone numbers contain digits. The reason is that phone numbers are not quantities — you never add two phone numbers together or calculate their average. More importantly, phone numbers include non-numeric characters: parentheses around area codes, hyphens between segments, plus signs for international prefixes, and leading zeros that numeric types silently drop. A phone number stored as INT loses the leading zero from numbers like 0207-123-4567. Stored as CHAR(10), the value is preserved exactly as entered.

This principle extends to any identifier that looks numeric but is not a quantity: postal codes, social security numbers, product codes, ISBN numbers. When the digits represent an identifier rather than a measurable value, use a character type.

Data Type Reference

Type Use Case Example Column
CHAR(n)Fixed-length textStateCode CHAR(2)
VARCHAR(n)Variable-length textCustomerName VARCHAR(100)
TEXTLarge or unknown-length textNotes TEXT
INTWhole numbersQuantity INT
BIGINTLarge whole numbersTransactionID BIGINT
DECIMAL(p,s)Exact decimal — moneyPrice DECIMAL(10,2)
FLOATApproximate decimal — scienceTemperature FLOAT
DATECalendar dateOrderDate DATE
DATETIMEDate and timeCreatedAt DATETIME
BOOLEANTrue/false flagIsActive BOOLEAN

Column Constraints

NULL and NOT NULL

The NULL and NOT NULL constraints control whether a column accepts absent values. NULL-able columns accept rows where that column has no value — the absence is stored as NULL. NOT NULL columns reject any INSERT or UPDATE that provides no value for that column.

In BasicTable, all three columns are NULL-able. A more realistic contact table would make Lastname NOT NULL — a contact without a last name is not useful — while leaving PhoneNumber NULL-able, since a contact might not have a phone number on file:
CREATE TABLE BasicTable (
    Firstname   CHAR(50)     NULL,
    Lastname    CHAR(50) NOT NULL,
    PhoneNumber CHAR(10)     NULL
);

PRIMARY KEY

A primary key uniquely identifies each row in the table. No two rows can share the same primary key value, and the primary key column cannot be NULL. In practice, most tables use a system-generated integer as their primary key — a value the database engine assigns automatically, guaranteed to be unique:

CREATE TABLE Contacts (
ContactID   INT      NOT NULL AUTO_INCREMENT,
Firstname   CHAR(50)     NULL,
Lastname    CHAR(50) NOT NULL,
PhoneNumber CHAR(10)     NULL,
PRIMARY KEY (ContactID)
);
AUTO_INCREMENT (MySQL) or IDENTITY (SQL Server) or SERIAL (PostgreSQL) tells the engine to generate the next sequential integer value automatically when a new row is inserted. The application never needs to supply or track the ContactID — the engine handles it.

Named Constraints with the CONSTRAINT Keyword

Constraints can be given explicit names using the CONSTRAINT keyword. Named constraints are easier to reference in error messages, easier to drop with ALTER TABLE, and make the schema's intent clearer:

CREATE TABLE Contacts (
ContactID   INT      NOT NULL,
Firstname   CHAR(50)     NULL,
Lastname    CHAR(50) NOT NULL,
PhoneNumber CHAR(10)     NULL,
CONSTRAINT pk_contacts PRIMARY KEY (ContactID)
);

IF NOT EXISTS — The Safe CREATE TABLE Pattern

Running a CREATE TABLE statement against a database that already contains a table with the same name produces an error. The IF NOT EXISTS clause prevents this — the statement executes only if no table with that name exists, and silently succeeds (does nothing) if the table already exists:

CREATE TABLE IF NOT EXISTS Contacts (
    ContactID   INT      NOT NULL,
    Firstname   CHAR(50)     NULL,
    Lastname    CHAR(50) NOT NULL,
    PhoneNumber CHAR(10)     NULL,
    CONSTRAINT pk_contacts PRIMARY KEY (ContactID)
);

IF NOT EXISTS is supported in MySQL, PostgreSQL, and SQLite. SQL Server uses a different pattern: check for table existence in the system catalog before executing CREATE TABLE.


Database Engines and Column Type Variations

What Is a Database Engine?

A database engine is the underlying software component that a database management system (DBMS) uses to create, read, update, and delete data. Most database management systems include their own application programming interface (API) that allows users and applications to interact with the engine without going through the DBMS's user interface directly. The term database engine is frequently used interchangeably with database server and database management system.

A database instance refers to the processes and memory structures of a running database engine — the live execution environment, as distinct from the software installation itself. Many modern DBMS products support multiple storage engines within the same database system.

MySQL InnoDB and MyISAM

MySQL is a notable example of a DBMS that supports multiple storage engines. InnoDB is MySQL's default and recommended storage engine — it supports ACID transactions, foreign key constraints, and row-level locking. MyISAM is an older engine that lacks transaction support and foreign key enforcement but offers fast full-text search. For new MySQL tables, InnoDB is always the correct choice unless a specific MyISAM feature is required.

Type Names Vary by Engine

Different database engines use different names for the same conceptual data types. The column types used in this lesson's examples reflect broadly supported SQL standard types. When working with a specific engine, verify the exact type names and behavior in that engine's documentation. The major modern relational databases — PostgreSQL, MySQL/MariaDB, Microsoft SQL Server, Oracle Database, and SQLite — all support the core types covered in this lesson, with minor naming and behavioral variations.

Create Table - Exercise

Click the Exercise link below to write a CREATE TABLE statement that matches a specific set of column requirements.
Create Table - Exercise
In the next lesson, we will discuss what a database index is and how it is used to speed up SQL queries.

SEMrush Software 4 SEMrush Banner 4