Lesson 1

SQL Functions in Database Engines (Math, String, and Date Functions Explained)

Database engines do far more than store and retrieve rows — they provide a library of built-in functions that perform calculations, transform strings, format dates, and aggregate data directly within SQL queries. These functions are predefined actions applied to values you send to the engine, executing inside the database layer before results are returned to the application. Using functions effectively means less data transferred over the network, less processing in middleware or front-end code, and SQL that accurately expresses the intent of the query rather than delegating computation elsewhere. This module covers the three principal categories of built-in SQL functions — numeric/math, string/character, and date/time — and introduces user-defined functions (UDFs) that allow you to encapsulate custom logic in a reusable, callable form.

What Is a SQL Function?

Predefined Actions on Values

A function is a predefined routine that accepts zero or more input values, performs a defined operation, and returns a result. The database engine executes the function as part of query processing — the calling SQL statement provides the input, the function returns the output, and the query uses that output just as it would use any column value or literal. At the simplest level, you can ask the engine to perform arithmetic directly:

-- The engine computes 5 + 5 and returns 10 in an unnamed column
SELECT 5 + 5;

The engine resolves the expression and returns 10 as the result of your query — no application code required. Functions extend this capability from simple arithmetic to complex string manipulation, date arithmetic, statistical aggregation, and user-defined business logic.

Built-in vs User-Defined Functions

SQL functions fall into two broad categories. Built-in functions are provided by the database engine — they are available without any DDL, are optimized at the engine level, and are consistent across all databases that implement them (with syntax variations between implementations). User-defined functions (UDFs) are created with CREATE FUNCTION DDL — they encapsulate custom logic, accept parameters, and can be called from SELECT, WHERE, JOIN, and other SQL clauses exactly like built-in functions. UDFs are discussed in detail at the end of this lesson and are the subject of several lessons later in this module.

SQL Data Analytics

Scalar vs Table-Valued vs Aggregate

Three structural categories of SQL function serve distinct purposes. Scalar functions accept one or more values and return exactly one value — UPPER('hello') returns 'HELLO', ROUND(3.14159, 2) returns 3.14. Table-valued functions return a result set — a virtual table that can be used in the FROM clause of a SELECT statement, joined to other tables, or filtered with WHERE conditions. Aggregate functions operate on a set of rows and return one value summarizing the set — SUM(), COUNT(), AVG(), MIN(), MAX() are the standard aggregate functions; they are always used with GROUP BY when applied to subsets of rows. A fourth category — window functions using the OVER() clause — applies aggregate-like calculations across a defined window of rows without collapsing the result set to one row per group. Window functions are covered in the advanced lessons of this module.

Database Engine Support for Functions

Why Functions Belong in the Engine

When working with database engines that have capabilities beyond simple data retrieval, it makes sense to take advantage of those capabilities. Many additional features build on elements already covered in this course — clauses in the SELECT statement, joins, and views. Functions are another layer of that capability. SQL is a tool for pulling out raw data to answer questions; any actual processing of those answers belongs in high-level programming languages in middleware or front-end applications. A web server might connect to a database to retrieve a customer name and order details, but the database is not expected to format the output for display, compute compound interest, or validate a credit card number — those are application concerns. What the database can and should do is handle data-layer transformations: converting case, calculating totals, extracting date components, and applying business rules that are inherently data-centric.

Operator Precedence in SQL

The four basic arithmetic operators supported by all SQL implementations are multiplication (*), division (/), addition (+), and subtraction (-). What is not obvious from this list is the order of precedence. SQL follows standard mathematical precedence: multiplication and division are evaluated before addition and subtraction; operations of equal precedence are evaluated left to right; parentheses override all precedence rules and are evaluated innermost first.

-- Without parentheses: multiplication before addition
-- Result: 5 + 10 = 15  (not 5 * 5 = 25... wait — 5 + 5 * 2)
SELECT 5 + 5 * 2;    -- returns 15

-- With parentheses: addition first
SELECT (5 + 5) * 2;  -- returns 20

-- Division truncates in integer arithmetic (SQL Server, Oracle)
SELECT 7 / 2;        -- returns 3 (integer division)
SELECT 7.0 / 2;      -- returns 3.5 (decimal division)
SELECT 7 / 2.0;      -- returns 3.5

The integer division behavior differs between SQL implementations — MySQL returns 3 for SELECT 7 / 2; PostgreSQL returns 3 as well for integer operands but 3.5 for SELECT 7.0 / 2. Always cast at least one operand to a decimal type when fractional results are required.

Math Functions

The Four Basic Operators in Practice

The four arithmetic operators are used directly in SQL expressions — they are not called as named functions but as infix operators between operands. They apply to column values, literals, and subquery results:

-- Arithmetic on column values
SELECT
    unit_price,
    quantity,
    unit_price * quantity            AS line_total,
    unit_price * quantity * 0.20     AS vat_amount,
    unit_price * quantity * 1.20     AS total_with_vat
FROM order_lines
WHERE order_id = 1001;

This query computes three derived values from two stored columns without any application-layer arithmetic — the engine calculates all three expressions and returns them as additional columns in the result set.

Advanced Math Functions — ROUND, ABS, CEILING, FLOOR, POWER

Beyond the four operators, most SQL implementations provide a standard set of mathematical functions. These are part of the SQL standard and available across SQL Server, Oracle, PostgreSQL, and MySQL with consistent syntax.

-- ROUND: round to specified decimal places
SELECT ROUND(3.14159, 2);    -- returns 3.14
SELECT ROUND(3.14159, 0);    -- returns 3.0
SELECT ROUND(2.5, 0);        -- returns 3.0 (rounds half up)

-- ABS: absolute value
SELECT ABS(-42);             -- returns 42
SELECT ABS(unit_price - target_price) AS price_variance
FROM products;

-- CEILING / FLOOR: round up or down to nearest integer
SELECT CEILING(3.2);         -- returns 4
SELECT FLOOR(3.9);           -- returns 3

-- POWER: raise to a power
SELECT POWER(2, 8);          -- returns 256

-- MOD / % : modulo (remainder)
SELECT MOD(17, 5);           -- Oracle, MySQL: returns 2
SELECT 17 % 5;               -- SQL Server, PostgreSQL: returns 2

Note that MOD() uses function syntax in Oracle and MySQL while SQL Server and PostgreSQL use the % operator — a common cross-database syntax difference to check when writing portable SQL.

String (Character) Functions

UPPER and LOWER — Cross-Database Syntax

Converting string case is one of the most common string operations in SQL — used for case-insensitive comparisons, display formatting, and data normalization. The standard SQL functions are UPPER() and LOWER(), implemented consistently across SQL Server, Oracle, PostgreSQL, and MySQL:

-- Standard syntax — works in SQL Server, Oracle, PostgreSQL, MySQL
SELECT UPPER(customer_name) AS upper_name,
       LOWER(customer_name) AS lower_name
FROM customers;

-- Case-insensitive comparison using LOWER
SELECT *
FROM customers
WHERE LOWER(email) = LOWER('User@Example.COM');

The legacy functions UCASE() and LCASE() appear in older MySQL documentation and some teaching materials — they are MySQL-specific aliases for UPPER() and LOWER() and are not part of the SQL standard. Use UPPER() and LOWER() in all new development for cross-database portability.

String Length — LEN vs LENGTH

Returning the character count of a string value is a common operation for validation and data quality checks. The function name differs by implementation — SQL Server uses LEN(), while MySQL, PostgreSQL, and Oracle use LENGTH():

-- SQL Server
SELECT LEN(customer_name) AS name_length
FROM customers
WHERE LEN(customer_name) > 50;   -- find unusually long names

-- MySQL, PostgreSQL, Oracle
SELECT LENGTH(customer_name) AS name_length
FROM customers
WHERE LENGTH(customer_name) > 50;

-- Cross-database: use CHAR_LENGTH() in MySQL/PostgreSQL
-- for consistent Unicode character counting
SELECT CHAR_LENGTH(customer_name) AS name_length
FROM customers;

The distinction between LEN() and LENGTH() is one of the most frequently encountered cross-database portability issues when migrating SQL between SQL Server and other implementations.

ROUND for String-Adjacent Operations and Additional String Functions

ROUND() is a numeric function but is frequently used alongside string functions in reporting queries that format output — rounding a calculated price before converting it to a display string, for example. Additional string functions available in most SQL implementations include SUBSTRING() (extract a portion of a string), TRIM() (remove leading and trailing spaces), REPLACE() (substitute one substring for another), CONCAT() or the || operator (join strings), and CHARINDEX()/POSITION()/INSTR() (find a substring within a string). These functions are covered in detail in the dedicated string functions lesson later in this module.

User-Defined Functions — Module 4 Preview

Scalar UDF — Encapsulating Business Logic

A scalar user-defined function accepts parameters, performs a computation, and returns a single value. Functions allow encapsulation of frequently used logic — a function written to calculate VAT for a product price can be invoked wherever VAT calculations are needed, ensuring consistency and eliminating redundancy across queries, stored procedures, and views. The following example uses SQL Server 2022 syntax:

-- SQL Server 2022: scalar UDF for VAT calculation
CREATE OR ALTER FUNCTION dbo.CalculateVAT (@Price DECIMAL(10,2))
RETURNS DECIMAL(10,2)
AS
BEGIN
    RETURN @Price * 0.20;
END;
GO

-- Calling the UDF in a SELECT statement
SELECT
    product_name,
    unit_price,
    dbo.CalculateVAT(unit_price)          AS vat_amount,
    unit_price + dbo.CalculateVAT(unit_price) AS price_with_vat
FROM products;

In Oracle 23ai, the equivalent scalar function uses PL/SQL syntax; in PostgreSQL 16, the equivalent uses CREATE OR REPLACE FUNCTION with PL/pgSQL or SQL language. The concept — encapsulate logic, call by name with parameters, receive a scalar return value — is consistent across implementations.

Table-Valued UDF — Returning Result Sets

A table-valued function returns a result set rather than a single value. It can accept parameters and use them to filter or shape the returned rows — making it a parameterized view. The calling query treats the function result as a table in the FROM clause:

-- SQL Server 2022: inline table-valued function
CREATE OR ALTER FUNCTION dbo.GetSalesData
    (@StartDate DATE, @EndDate DATE)
RETURNS TABLE
AS
RETURN
(
    SELECT
        ProductID,
        SUM(QuantitySold)  AS TotalSold,
        SUM(SaleAmount)    AS TotalRevenue
    FROM Sales
    WHERE SaleDate BETWEEN @StartDate AND @EndDate
    GROUP BY ProductID
);
GO

-- Calling the table-valued function in a query
SELECT p.ProductName, s.TotalSold, s.TotalRevenue
FROM dbo.GetSalesData('2026-01-01', '2026-03-31') s
JOIN Products p ON p.ProductID = s.ProductID
ORDER BY s.TotalRevenue DESC;

The table-valued function is joined to the Products table exactly as any other table would be — the database engine materializes the function result and then applies the join. This is significantly more flexible than a static view because the date range is passed as a parameter at query time.

Performance Considerations

While functions enhance modularity and reduce code duplication, they must be used judiciously. Scalar UDFs applied to large datasets — particularly in WHERE clauses or JOIN conditions — can cause significant performance degradation. The database engine must execute the scalar function once per row in the result set, and the optimizer often cannot push predicates through scalar UDFs as efficiently as it can through inline expressions. SQL Server 2019 introduced scalar UDF inlining, which automatically rewrites eligible scalar UDFs as inline expressions at compile time — but this optimization applies only to functions that meet specific structural criteria. In Oracle and PostgreSQL, function-based indexes can partially mitigate this issue for frequently queried function expressions. Inline table-valued functions (iTVFs) — which use a single RETURN SELECT statement without a BEGIN/END block — are generally more efficient than multi-statement table-valued functions because the optimizer can reason about their structure. Always profile the execution plan before deploying UDFs in production queries against large tables.

In the next lesson, you will learn how to use character functions to manipulate string data in SQL queries.