Lesson 1

Fundamental Aspects of SQL

Module introduction

This module introduces the fundamentals of SQL: what it is, why it matters, and how it differs from “traditional” programming. You will also set up a practice database and load the PUBS sample data so you can run consistent queries throughout the course.

SQL (Structured Query Language) is the standard language used to define, query, and control data in a relational database management system (RDBMS). SQL includes statements for: data definition (creating and changing schema objects), data manipulation (querying and modifying rows), and data control (permissions and access control). SQL has been standardized through ANSI and ISO, which is why SQL skills transfer across many platforms.

SQL is one of the foundational building blocks of modern data architecture. It defines how applications and users create tables, enforce data rules, and retrieve information efficiently. By the end of this module, you should be comfortable reading SQL and writing basic queries—enough to start working effectively with databases in real systems.

SQL emerged from relational theory—specifically relational algebra and tuple relational calculus. In the relational model, data is represented as relations (tables), and queries are operations that transform one set of relations into another set of relations. Edgar F. Codd’s 1970 paper introduced the relational model, and SQL became the first broadly successful commercial language influenced by that work.

SQL is often described as a declarative language: you describe the result you want (the set of rows and columns), and the database engine decides how to produce it (index usage, join order, access paths). That is one of the biggest differences between SQL and general-purpose programming languages.

It’s also worth knowing that SQL is not a perfect “pure relational” language. For example, SQL permits duplicate rows unless you explicitly eliminate them (for example via DISTINCT), and it includes NULL to represent missing/unknown values. Even with these practical compromises, SQL remains the dominant language for relational systems and has evolved over decades with new capabilities.

Why SQL is useful

Information is one of the most valuable assets in modern organizations. Businesses store and analyze data about customers, orders, inventory, finances, operations, and more. A database system exists to store that data safely, allow many users to query it concurrently, and keep it consistent as it changes over time.

SQL is useful because it gives you:

A common language for data access: applications, analysts, and administrators can all use SQL to work with the same data.
Set-based querying: you can transform thousands (or millions) of rows at once without writing loops.
Data integrity support: constraints and keys help prevent redundant and inconsistent data.
Security and governance: GRANT/REVOKE and role-based permissions define who can do what.
Optimization: the database can choose efficient execution plans using statistics and indexes.

How SQL differs from programming

General-purpose languages (Java, C#, Python, etc.) are usually imperative: you write step-by-step instructions that specify control flow (loops, branches, algorithms). SQL is typically declarative: you specify the data set you want, and the database figures out the retrieval strategy.

That difference has a huge practical impact:

SQL is set-oriented: operations apply to sets of rows, not one row at a time.
The optimizer matters: performance depends on indexing, statistics, and query shape—not just correctness.
Schema design matters: keys, relationships, and normalization choices strongly affect correctness and speed.

Neighboring topic: most platforms also provide procedural extensions for database-side logic (for example PL/SQL in Oracle, T-SQL in SQL Server, PL/pgSQL in PostgreSQL). Those extensions can add loops and control-of-flow, but they are still best used when the work is data-centric and close to the database.

Modern data and SQL today

Data volume and variety have grown massively. Along with traditional structured data (tables of numbers and text), modern systems manage semi-structured and unstructured content such as documents, logs, messages, images, and event streams.

Relational databases evolved to meet this reality. Many engines now support JSON storage and querying, full-text search, spatial data, and integration with external runtimes and services. Even when data begins “unstructured,” organizations often need to model, validate, and query it reliably—which is exactly where relational design and SQL remain valuable.

A core design goal stays the same: reduce redundancy and improve correctness. Normalization and keys are not just academic topics; they are practical tools for preventing update anomalies and ensuring your data stays trustworthy over time.