SQL Foundations  «Prev  Next»

Lesson 6 Create an index
Objective Create an Index on a Column in the Database Table.

Create Index on Column (SQL CREATE INDEX Statement with Examples)

Lesson 5 explained what a database index is and how it speeds up queries by replacing full table scans with direct lookups. This lesson applies that foundation: you will write CREATE INDEX statements against the BasicTable and Contacts tables defined in earlier lessons, learn the naming conventions that make indexes maintainable, see how syntax varies across the major database engines, and learn how to view and drop indexes. By the end of this lesson you will be able to create a complete indexing strategy for any table you design.

Creating an Index on BasicTable

The Motivating Query — Searching by Last Name

BasicTable, defined in lesson 4, has three columns: Firstname, Lastname, and PhoneNumber. The most common query against a contact table is a search by last name — finding everyone named Smith, or everyone whose last name starts with H. Without an index on Lastname, every such query performs a full table scan, reading every row to find the matches. As BasicTable grows from dozens of rows to thousands, this scan grows proportionally slower.



Creating an index on the Lastname column gives the SQL engine a sorted lookup structure it can use to find matching last names in O(log n) time instead of O(n) time — the same improvement described in lesson 5's book index analogy. The index is the sorted guide; the engine uses it automatically whenever a query filters or sorts on Lastname.

The CREATE INDEX Statement

As with the CREATE TABLE statement from lesson 4, the CREATE INDEX statement begins with the CREATE keyword followed by the object type — INDEX — and then the object name and its definition:

CREATE INDEX index_name
ON table_name (column_name);

For BasicTable, indexing the Lastname column:
CREATE INDEX LastnameIndex
ON BasicTable (Lastname);

This statement creates an index named LastnameIndex on the Lastname column of BasicTable. The index is built immediately when the statement executes — the engine reads all existing Lastname values, sorts them into the index structure, and stores the row pointers. From that point forward, the engine maintains the index automatically: every INSERT that adds a new row also adds an entry to LastnameIndex; every UPDATE that changes a Lastname value updates the corresponding index entry; every DELETE removes the entry.

What Happens After the Index Is Created

After LastnameIndex is created, queries that filter or sort on Lastname can use it without any change to the SQL syntax. The query:

SELECT Firstname, Lastname, PhoneNumber
FROM BasicTable
WHERE Lastname = 'Hilbert';

executes the same way it did before the index was created — but the engine now uses LastnameIndex to locate the matching rows directly rather than scanning the entire table. The index is invisible to the SQL developer; it operates transparently behind the scenes. Use EXPLAIN (covered in lesson 5) to confirm the engine is using the index for a given query.

Creating and managing indexes is a standard developer skill, not an exclusively DBA task. In tools like DBeaver, HeidiSQL, and Beekeeper Studio, you can view and create indexes through the schema browser or by executing CREATE INDEX statements directly in the SQL editor. For production systems with high traffic or strict SLA requirements, a DBA review of indexing decisions is best practice — but the syntax and concepts are part of every SQL developer's toolkit.

Extending to a Composite Index

If queries frequently sort or filter on both Lastname and Firstname together — for example, finding all contacts with the last name Hull and then sorting by first name — a composite index on both columns is more efficient than two separate single-column indexes:

CREATE INDEX idx_basictable_lastname_firstname
ON BasicTable (Lastname, Firstname);
This composite index supports queries that filter on Lastname alone, or on Lastname and Firstname together. The Lastname column is the leading column — the index is sorted first by Lastname, then by Firstname within each Lastname group. A query that filters only on Firstname without Lastname in the WHERE clause cannot use this composite index efficiently, because Firstname is not the leading column.

Applying the same pattern to the Contacts table from lesson 4:
-- Single-column index on the foreign-key lookup column
CREATE INDEX idx_contacts_lastname
ON Contacts (Lastname);

-- Composite index for full-name searches and sorts
CREATE INDEX idx_contacts_lastname_firstname
ON Contacts (Lastname, Firstname);


Index Naming Conventions

The idx_ Prefix Pattern

The legacy LastnameIndex name in the original lesson conveys the indexed column but does not identify the table. In a database with dozens of tables, an index name like LastnameIndex is ambiguous — which table's Lastname column does it index? The standard convention used throughout this module follows the pattern idx_tablename_columnname:
-- Pattern: idx_tablename_columnname
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname);

-- Composite: idx_tablename_col1_col2
CREATE INDEX idx_contacts_lastname_firstname
ON Contacts (Lastname, Firstname);
This convention makes the index's purpose self-documenting. Any developer reading a list of indexes in the schema browser immediately knows which table each index covers and which column or columns it indexes.

Why Naming Matters for Maintenance

Index names appear in error messages, query execution plans, monitoring dashboards, and DROP INDEX statements. An ambiguous name like index1 or LastnameIndex forces developers to look up the index definition to understand what it covers. A descriptive name like idx_contacts_lastname_firstname communicates the full context without additional lookup.

Consistent naming also makes it easier to identify unused indexes — if EXPLAIN output or the database's index usage statistics show that idx_basictable_lastname has never been used in six months of query traffic, you know immediately which table and column to examine and whether dropping it makes sense.


Syntax Variations by Database Engine

The core CREATE INDEX syntax is consistent across all major relational databases. The variations appear in optional clauses, index types, and storage options.

MySQL / MariaDB

MySQL and MariaDB use the standard CREATE INDEX syntax with an optional index type specifier:
-- Standard index (B-tree by default)
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname);

-- Explicitly specify index type
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname)
USING BTREE;
MySQL InnoDB stores indexes as part of the table's data file — the clustered primary key index and all secondary indexes live together. Secondary indexes store the primary key value as the row pointer rather than a physical row address, which means secondary index lookups involve two B-tree traversals: one to find the key in the secondary index, and one to find the full row via the primary key.

PostgreSQL

PostgreSQL uses the same standard syntax and also supports a CONCURRENTLY option that builds the index without locking the table — essential for production systems where locking a large table during index creation is not acceptable:
-- Standard index
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname);

-- Build without locking the table (takes longer but allows concurrent writes)
CREATE INDEX CONCURRENTLY idx_basictable_lastname
ON BasicTable (Lastname);
PostgreSQL also supports several index types beyond B-tree: Hash (equality-only), GiST (geometric and full-text), and GIN (for array and JSONB columns). The default B-tree index is appropriate for the column types covered in this course.

SQL Server

SQL Server uses the standard syntax with optional CLUSTERED or NONCLUSTERED specifiers (SQL Server-specific, as noted in lesson 5):
-- Standard non-clustered index (default)
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname);

-- SQL Server online index build (equivalent to PostgreSQL CONCURRENTLY)
CREATE INDEX idx_basictable_lastname
ON BasicTable (Lastname)
WITH (ONLINE = ON);

ALTER TABLE ADD INDEX (MySQL Alternative)

MySQL also supports adding an index as part of an ALTER TABLE statement — useful when modifying a table's structure and adding an index in a single operation:
ALTER TABLE BasicTable
ADD INDEX idx_basictable_lastname (Lastname);

-- Or as a unique index
ALTER TABLE BasicTable
ADD UNIQUE INDEX idx_basictable_phone (PhoneNumber);
This syntax is MySQL-specific. PostgreSQL and SQL Server use CREATE INDEX exclusively for index creation outside of table definition.

Viewing Existing Indexes

Before creating a new index, it is good practice to check what indexes already exist on the table. Duplicate indexes — two indexes covering the same column — waste storage and write overhead without improving query performance.

MySQL — SHOW INDEX

-- List all indexes on a table
SHOW INDEX FROM BasicTable;

-- Or using information_schema (standard SQL, works across engines)
SELECT index_name, column_name, non_unique
FROM information_schema.statistics
WHERE table_name = 'BasicTable'
ORDER BY index_name, seq_in_index;

PostgreSQL — \d and pg_indexes

-- In psql client: show table structure including indexes
\d BasicTable

-- Query the pg_indexes system view
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'basictable'
ORDER BY indexname;

SQL Server — sp_helpindex

-- List all indexes on a table
EXEC sp_helpindex 'BasicTable';

-- Or query sys.indexes
SELECT i.name, i.type_desc, c.name AS column_name
FROM sys.indexes i
JOIN sys.index_columns ic ON i.object_id = ic.object_id
    AND i.index_id = ic.index_id
JOIN sys.columns c ON ic.object_id = c.object_id
    AND ic.column_id = c.column_id
WHERE OBJECT_NAME(i.object_id) = 'BasicTable'
ORDER BY i.name, ic.key_ordinal;

Dropping an Index

DROP INDEX Syntax

The DROP INDEX statement removes an existing index. The syntax varies by engine:
-- MySQL / MariaDB: requires table name
DROP INDEX idx_basictable_lastname ON BasicTable;

-- PostgreSQL: index name only (indexes are schema-scoped, not table-scoped)
DROP INDEX idx_basictable_lastname;

-- PostgreSQL: safe drop — no error if index does not exist
DROP INDEX IF EXISTS idx_basictable_lastname;

-- SQL Server: requires table-qualified name
DROP INDEX BasicTable.idx_basictable_lastname;

When to Drop an Index

An index should be dropped when it is no longer serving its purpose. The most common reasons to drop an index are: the query pattern it was designed for has changed and the index is no longer used; the table has become write-heavy and the index maintenance cost outweighs the read performance benefit; a duplicate index exists that covers the same columns; or the index was created experimentally and testing confirmed it does not improve the target queries.

Most database systems track index usage statistics. In MySQL, the performance_schema.table_io_waits_summary_by_index_usage view shows how many times each index has been used for reads. In PostgreSQL, pg_stat_user_indexes shows the idx_scan count for each index. An index with zero or near-zero scans over a representative time period is a candidate for removal.

Why Database Indexes Are Important

Faster Query Performance

Indexes allow the database engine to locate and retrieve matching rows without reading every row in the table. For a table with one million rows, a B-tree index reduces the number of comparisons needed to find a specific value from one million (full scan) to approximately twenty (log₂ of one million). This difference is invisible on small tables but becomes decisive as data volumes grow — a query that takes 50 milliseconds on ten thousand rows may take fifty seconds on ten million rows without an index, and remain under 100 milliseconds with one.

Efficient Sorting and Range Searches

Because indexes store their entries in sorted order, they accelerate both ORDER BY operations and range searches. A query with ORDER BY Lastname against an indexed Lastname column can read the index entries in sorted order rather than reading all rows and sorting them in memory. Range queries using BETWEEN or comparison operators (WHERE Price BETWEEN 10 AND 25) can use the index to jump to the start of the range and read forward, stopping when the range is exceeded — without examining rows outside the range at all.

Reduced I/O Load

Indexes reduce the number of data pages the engine reads from storage. A full table scan reads every data page that contains table rows. An index lookup reads only the index pages needed to find the matching keys, then reads only the specific data pages containing those rows. For a selective query — one that matches a small fraction of the total rows — this reduction in I/O can be dramatic, particularly on storage systems where I/O latency is significant.

Improved JOIN Operations

JOIN operations between tables are significantly faster when the join columns are indexed. When two tables are joined on ON Orders.ContactID = Contacts.ContactID, an index on Contacts.ContactID allows the engine to look up each ContactID value from the Orders table in the Contacts index rather than scanning the entire Contacts table for each Orders row. Without the index, a JOIN between a large Orders table and a large Contacts table requires a full scan of one table for every row in the other — an O(n×m) operation. With an index, it becomes O(n log m).

Unique Constraint Enforcement

Unique indexes enforce uniqueness at the database level. When a UNIQUE INDEX exists on a column, the engine checks the index on every INSERT and UPDATE to verify that the new value does not duplicate an existing entry. This check is fast — O(log n) — because the index is sorted. Without the index, enforcing uniqueness would require a full table scan on every write operation. Primary keys are always backed by a unique index; CREATE UNIQUE INDEX allows the same enforcement on any other column that requires uniqueness.


Index Trade-offs — Storage and Write Overhead

Index Size and Memory Pressure

Every index occupies storage space proportional to the number of rows in the table and the size of the indexed column values. A table with one million rows and ten indexes maintains ten separate sorted structures, each potentially as large as the indexed column data itself. Beyond disk space, indexes compete for space in the database engine's buffer pool — the in-memory cache of recently accessed data and index pages. A database with many large indexes may find that its buffer pool fills with index pages, leaving less room for the base table data and reducing the cache hit rate for queries.

The practical implication is to index deliberately. Every index you create is a commitment to ongoing storage and memory overhead. Index the columns your most important queries depend on, verify the performance improvement with EXPLAIN, and resist the temptation to index every column preemptively.

Write Performance Impact

Every INSERT, UPDATE, and DELETE operation must update all indexes defined on the table. An INSERT into BasicTable with three indexes — a primary key index, LastnameIndex, and a composite index on (Lastname, Firstname) — performs three index updates in addition to the base table insert. On tables with very high write rates — transaction logs, real-time event streams, sensor data tables — this per-write overhead accumulates and can become a bottleneck.
The trade-off is straightforward: indexes improve read performance at the cost of write performance. A table used primarily for reads (a reporting table, a reference data table, a historical archive) can sustain aggressive indexing. A table with extremely high write rates should be indexed minimally, with only the indexes that directly support the most critical read queries.

Indexing Strategy — How Many Is Too Many

There is no universal answer to how many indexes a table should have — it depends on the read-to-write ratio, the query patterns, the data volume, and the performance requirements. A practical starting point: index the primary key (automatic), index foreign key columns that appear in JOIN conditions, index columns that appear in WHERE clauses of your most frequent or most critical queries, and index columns used in ORDER BY on large result sets. Measure performance before and after creating each index using EXPLAIN and query timing. Remove indexes that are not used.

Lesson 7 examines how the database index engine uses indexes internally — the physical storage structures, the query optimizer's index selection logic, and the conditions under which the engine chooses a full table scan even when an index is available.


Creating Index - Exercise

Click on the Exercise link below to practice creating an index in SQL.
Creating Index - Exercise

SEMrush Software