Relational Constructs   «Prev  Next»

Lesson 5 Table Column Rules
Objective List the rules for table columns.

Table Column Rules in a Relational Database

Table columns are the structural units of a relational database. Every piece of data stored in a relational database is stored in a column, and the design of those columns - their names, data types, constraints, and relationships - determines how reliably the database stores, retrieves, and protects that data. The following nine rules govern the creation and maintenance of table columns in a well-designed relational database.

  1. Naming Conventions. Use clear, concise, and meaningful names for table columns. Column names should communicate the attribute they represent without requiring external documentation. Avoid spaces, special characters, and reserved SQL keywords in column names. Use underscores to separate words in multi-word names, and follow a consistent naming pattern throughout the schema. snake_case (e.g. last_name, purchase_date) is widely preferred in RDBMS environments because it is unambiguous and compatible with all major database platforms. Inconsistent naming across tables - mixing CustID in one table with customer_id in another - creates confusion during queries and maintenance.
  2. Data Types. Assign the data type that most accurately represents the nature of the data the column will store. Common data types include INTEGER and BIGINT for whole numbers, DECIMAL and NUMERIC for precise fractional values such as prices, VARCHAR and CHAR for text, and DATE and TIMESTAMP for temporal values. Choose types that reflect domain meaning rather than convenience - storing a date as a VARCHAR may seem easier but prevents date arithmetic, range queries, and format validation. Choose the smallest type that accommodates the full range of expected values to minimize storage requirements and improve performance.
  3. Null Values. Explicitly determine whether each column may contain null values. A null value represents the absence of a known value - it is not zero, not an empty string, and not a default. Allow nulls only in columns where the absence of data is semantically meaningful, such as a DateTerminated column for an employee who is still active. For all other columns, apply NOT NULL to ensure that critical attributes are always present. Leaving columns nullable by default is a common source of data quality problems that are difficult to correct after data has accumulated.
  4. Primary Keys. Designate one or more columns as the primary key to uniquely identify each row in the table. Primary keys must be unique across all rows and cannot contain null values. Choose primary keys that are stable - values unlikely to change over the life of the record - and minimal, using as few columns as necessary. A surrogate key (a system-generated integer such as an auto-increment ID) is often preferred over a natural key when natural key values are subject to change. Primary keys are essential for maintaining data consistency and supporting efficient retrieval and modification of records.
  1. Foreign Keys. When a column contains values that reference rows in another table, define it as a foreign key. Foreign keys establish relationships between tables, enforce referential integrity, and enable efficient multi-table queries. A foreign key constraint tells the RDBMS that the value in this column must exist as a primary key (or unique key) value in the referenced table. Pair foreign key definitions with ON DELETE and ON UPDATE rules that reflect the intended business behavior - CASCADE, SET NULL, or RESTRICT - so that the database handles related data correctly when parent rows change.
  2. Default Values. Assign default values to columns where a standard value applies when no explicit value is provided. Default values are automatically inserted when a new record is added without specifying a value for that column. For example, a Status column might default to 'active', or a CreatedDate column might default to CURRENT_TIMESTAMP. Default values maintain data consistency, reduce the need for application-level logic to populate common values, and decrease the likelihood of unintended nulls in columns that should always have data.
  3. Column Constraints. Use constraints to enforce rules on the data stored in a column beyond what the data type alone can express. UNIQUE prevents duplicate values in a column while permitting nulls. CHECK enforces a boolean condition on column values - for example, requiring that a price column always contain a positive value. NOT NULL prevents missing values. Constraints defined in the schema apply universally to all data written to the table, regardless of which application or user performs the write. This makes constraint-based validation more reliable than application-level validation alone.
  4. Indexing. Create indexes on columns that are frequently used in WHERE clause predicates, JOIN conditions, or ORDER BY clauses. Indexes allow the RDBMS to locate rows matching a query condition without scanning the entire table, which dramatically improves query performance on large tables. Most RDBMS systems automatically create an index on the primary key column. Create additional indexes on foreign key columns and on any columns frequently used to filter or sort results. Use indexes judiciously - each index consumes additional storage and adds overhead to INSERT, UPDATE, and DELETE operations. Over-indexing write-heavy tables can degrade write performance significantly.
  5. Comments and Documentation. Document each column with a description of its purpose, the constraints that apply to it, and any relationships to other tables. Most RDBMS platforms support column-level comments that are stored in the data dictionary alongside the schema definition. This documentation is invaluable during database maintenance, troubleshooting, onboarding of new team members, and future development. A column named status_cd with no documentation is ambiguous; a column with a comment explaining that it stores a two-character ISO status code referencing the status_lookup table is unambiguous.

Applying these nine rules consistently produces a schema whose columns are self-describing, reliably constrained, and efficient to query. Well-designed columns reduce the burden on application code, prevent data quality problems at the source, and support the long-term maintainability of the database.

Columns Store Attributes of the Entity

When you create a table, you use its columns to store the attributes of the entity the table represents. A computer entity, for example, has attributes that describe its hardware configuration: brand, processor, storage capacity, RAM, storage type, monitor size, and category. Each of those attributes becomes a column in the Computer table.

Dell Desktop Computer table showing seven attributes: Brand, Processor, Storage,
RAM, Storage Type, Monitor, and Category
Brand Processor Storage RAM Storage Type Monitor Category
Dell Intel Core Ultra 9 2 TB SSD 64 GB NVMe 27" 4K Desktop

In relational notation this table is expressed as:

Computer (CompID, Brand, Processor, Storage, RAM, StorageType, Monitor, Category)

Each column name is distinct, each stores values of a single type, and together they describe every relevant attribute of the Computer entity. The CompID column serves as the primary key, uniquely identifying each computer record.

Two Fundamental Rules for Table Columns

Beyond the nine design rules above, every column in a relational table must satisfy two fundamental structural rules:

  1. The column must have a unique name within the table. Every column must represent a different attribute of the entity the table describes. No two columns in the same table can share a name. This rule is not merely a convention - it is a requirement of the relational model. If the same attribute name appears in more than one table, that shared attribute typically indicates a relationship[2] between those tables rather than a duplication error within a single table.
  2. The values for the column must come from a single domain[1]. All values stored in a column must be of the same type and represent the same kind of data. A column defined to store processor names should not also store monitor sizes or purchase dates. The domain of a column defines the set of valid values for that attribute. Domain constraints are covered in detail in a later lesson.

Imagine that your database contains two tables: one listing computer manufacturers (the MANUFACTURING table) and another listing computers owned by your organization (the COMPUTER table). Both tables contain the ManuID column.

ManuID field creates the relationship between the MANUFACTURING table (primary key)
and the COMPUTER table (foreign key)
The ManuID field creates the relationship between these tables. ManuID is the primary key in the MANUFACTURING table and the foreign key in the COMPUTER table.

Shared Columns Create Relationships Between Tables

The presence of ManuID in both the MANUFACTURING table and the COMPUTER table is not a violation of the unique name rule - it is the mechanism by which the two tables are related. In the MANUFACTURING table, ManuID is the primary key that uniquely identifies each manufacturer. In the COMPUTER table, ManuID is a foreign key that references the primary key of the MANUFACTURING table.

Establishing this relationship enables you to link the two tables in a query. If a computer develops a fault and you need to contact its manufacturer, you look up the computer's record in the COMPUTER table, note its ManuID value, and use that value to find the manufacturer's contact details - including phone number and support URL - in the MANUFACTURING table. A single foreign key value connects the two records across tables without duplicating the manufacturer's contact information in every computer record.

The relationship between these tables is a one-to-many relationship[3]: one manufacturer can be associated with many computers, but each computer record references exactly one manufacturer. This cardinality is enforced by the foreign key constraint. If referential integrity[4] is enforced between the two tables, the Relational Database Management System[5] (RDBMS) will reject any attempt to insert a ManuID value in the COMPUTER table that does not exist in the MANUFACTURING table. It would make no sense to record a computer as manufactured by an organization that is not in the MANUFACTURING table - and with referential integrity enforced, the database makes that mistake impossible.

Glossary

[1] Domain: Determines the type of data values that are permitted for a given attribute. All values in a column must come from the same domain.
[2] Relationship: If the same attribute occurs in more than one table, a relationship exists between those two tables.
[3] One-to-many relationship (1:N): For one instance of entity A, there exist zero, one, or many instances of entity B; but for one instance of entity B, there exists zero or one instance of entity A.
[4] Referential integrity: The means of maintaining the integrity of data between related tables. A foreign key column value must be null or match a value in the corresponding primary key column of the referenced table. Usually enforced with foreign key constraints.
[5] RDBMS (Relational Database Management System): A software package that manages and provides access to a relational database. These systems follow Codd's 12 rules of relational databases and use SQL to access and manipulate data.

The next lesson lists the rules for table rows.


SEMrush Software 5 SEMrush Banner 5