Relational Constructs   «Prev  Next»

Lesson 8 Calculations and Domain Types
Objective List domain types that support calculations.

Calculations and Domain Types for Database Systems

An important advantage of using non-text data domains for columns in a relational database is that the RDBMS can perform calculations directly on the stored values. Text domains - CHAR and VARCHAR - store data as character strings. The RDBMS treats these values as text and cannot apply arithmetic or temporal operations to them. Non-text domains tell the RDBMS what kind of value a column holds, which enables it to apply the appropriate operations.

The four domain types that enable calculations on column contents are:

  1. INT
  2. DECIMAL
  3. DATE
  4. TIME

A practical example: to calculate the cost of an order where a customer has a 15% discount coupon, store the retail price in a DECIMAL column and the discount rate in a DECIMAL column, then multiply the retail price by 0.85 (1 - 0.15) to produce the discounted cost. This calculation is only possible because the columns holding price and discount data use numeric domains rather than text.

Numeric Domain Types for Calculations

Numeric domain types are designed to store numerical values and support the full range of arithmetic operations: addition, subtraction, multiplication, division, modulus, and integer division. Three categories of numeric domain types are available in SQL-based RDBMS platforms:

  1. Integer types (INT, SMALLINT, BIGINT) - store whole numbers, both positive and negative, with no decimal component. The range of values varies by subtype and platform. SMALLINT typically handles values from -32,768 to 32,767. Standard INT handles larger ranges. BIGINT handles very large integer values required by financial transaction IDs, row counts, and similar high-volume applications. Integer types are appropriate for counts, quantities, and identifiers.
  2. Floating-point types (FLOAT, DOUBLE) - store approximate numerical values with floating-point precision. These types are suitable for scientific and engineering measurements where a high degree of precision is needed but absolute exactness is not required. Floating-point types are not appropriate for financial calculations because their binary representation of decimal fractions can introduce small rounding errors that accumulate across multiple operations.
  3. Exact decimal types (DECIMAL, NUMERIC) - store exact numeric values with a defined number of decimal places. The designer specifies the total number of digits and the number of digits to the right of the decimal point. DECIMAL(10,2), for example, stores values up to 99,999,999.99 with exactly two decimal places. This makes DECIMAL the correct choice for monetary values, prices, tax rates, and any application where rounding errors are unacceptable.

Date and Time Domain Types for Calculations

Date and time domain types allow storage and manipulation of temporal values. Most SQL-based RDBMS platforms support arithmetic operations on date and time columns: adding or subtracting intervals, calculating the difference between two dates, and extracting specific components such as day, month, or year.

  • DATE - stores a calendar date without a time component. The RDBMS can subtract one DATE value from another to produce the number of days between them.
  • TIME - stores a time of day without a date component. TIME arithmetic enables calculations such as the duration between a shift start and end time.
  • TIMESTAMP - stores both date and time components, enabling more complex temporal calculations and comparisons involving both the date and the time of an event.
  • INTERVAL - represents a duration or span of time, used as the result or operand of arithmetic operations involving DATE, TIME, or TIMESTAMP values.

Date arithmetic has practical value in business applications. If an order was placed on 03/07/00 and shipped on 03/21/00, subtracting the placement date from the shipment date produces a difference of 14 days. A 14-day gap between order and shipment could indicate that the item was out of stock and required a special order from the producer - information that is immediately useful for inventory management and customer service.

Boolean Domain Types

Boolean domain types store logical true or false values, typically represented internally as 1 (true) or 0 (false). Boolean columns are not primarily used for arithmetic calculations, but they participate in logical operations - AND, OR, NOT - that are essential for filtering and conditional evaluation in queries. A WHERE clause that combines multiple conditions uses Boolean logic to determine which rows satisfy all the specified criteria.

Creating RDBMS Calculations

When you create a calculation in an RDBMS, you use arithmetic operators to tell the system which mathematical operation to perform. The following table lists the standard operators and the operations they represent:

Symbol Operation Example
+ Addition 5 + 2 = 7
- Subtraction 5 - 2 = 3
* Multiplication 5 * 2 = 10
/ Division 5 / 2 = 2.5
Mod Modulo division (remainder division) 5 Mod 2 = 1 (2 goes into 5 twice, with a remainder of 1)
\ Integer division (remainder discarded) 5 \ 2 = 2

The Eight Original Relational Operations

Eight operations were originally defined for relational databases by E.F. Codd as part of the relational algebra. These operations form the theoretical foundation of modern database query languages including SQL. Understanding them clarifies why relational databases are structured the way they are and how query languages achieve their expressive power.

  1. Selection - retrieves some or all of the rows in a table that satisfy a specified condition. In SQL this is the WHERE clause. For example: SELECT drink FROM fridge WHERE drink = 'juice' returns only the rows where the drink column contains 'juice'. Selection filters rows without changing the column structure.
  2. Projection - returns a subset of columns from a table or selection, dropping the columns that are not needed. In SQL this is expressed by naming specific columns in the SELECT list rather than using SELECT *. Projection reduces the width of the result without filtering rows.
  3. Union - combines the rows of two tables with compatible column structures into a single result, removing duplicate rows. Suppose a Competitors table lists this year's participants and a FormerCompetitors table lists participants from previous years. The union of the two tables produces a complete list of everyone who has competed in either year, with duplicates removed.
  4. Intersection - returns only the rows that appear in both tables. The intersection of FormerCompetitors and Competitors produces a list of participants who competed in a previous year and are competing again this year.
  5. Difference - returns the rows that appear in the first table but not in the second. The difference between FormerCompetitors and Competitors produces a list of people who competed previously but are not competing this year - useful for follow-up outreach.
  6. Cartesian Product - creates a result containing every possible combination of a row from the first table with a row from the second table. If one table contains values {1, 2, 3} and a second contains values {A, B, C}, their Cartesian product contains nine rows: 1/A, 1/B, 1/C, 2/A, 2/B, 2/C, 3/A, 3/B, 3/C. The Cartesian product is rarely used directly but is the mathematical basis for the JOIN operation.
  7. Join - a conditional Cartesian product that pairs rows from two tables only when they satisfy a specified condition. For example, joining a Competitors table with a NextOfKin table where the competitor's NextOfKin value matches the NextOfKin table's Name column produces a result that pairs each competitor with their corresponding next of kin record. JOIN is the most frequently used multi-table operation in SQL.
  8. Divide - the inverse of the Cartesian product. It uses one table to partition the records in another and finds all field values in the first table that are associated with every value in the second. If a first table contains the pairs {1/A, 1/B, 1/C, 2/A, 2/B, 2/C, 3/A, 3/B, 3/C} and a second table contains {1, 2, 3}, dividing the first by the second produces {A, B, C} - the values associated with all three rows of the second table.

Applications for Mathematical Calculations in Databases

The ability to perform calculations on stored data makes relational databases useful for a wide range of analytical and operational applications:

  1. Financial Analysis. Interest calculations, amortization schedules, present value computations, and portfolio analysis all require the precision of DECIMAL types and the ability to perform arithmetic directly in SQL queries. Performing these calculations in the database rather than in application code ensures that all users and systems work from the same computed values.
  2. Statistical Analysis. Databases store large volumes of data that benefit from aggregation functions such as SUM, AVG, MIN, MAX, COUNT, and standard deviation. SQL aggregate functions applied to numeric columns produce statistical summaries that reveal trends and patterns across the dataset.
  3. Data Mining. Calculations supporting clustering, classification, and association rule mining require repeated arithmetic operations across large datasets. Performing these operations close to the data - in the database engine rather than in application memory - reduces data transfer overhead and leverages the RDBMS's optimization capabilities.
  4. Scientific Research. Experimental results and simulation data stored in numeric columns can be analyzed using regression queries, statistical aggregations, and mathematical transformations applied directly in SQL. Date and timestamp columns support temporal analysis of time-series data.
  5. Manufacturing and Quality Control. Process capability indices, control chart calculations, and defect rate computations can be performed in the database against production data stored in numeric columns, providing real-time quality metrics for operational decisions.

Choosing Data Types for Cross-Domain Calculations

When a calculation involves columns of different numeric types, the result's data type determines how the value is stored and what precision is retained. Pay careful attention to the data types involved in any calculation that mixes INT and DECIMAL values.

For example, if you subtract a DECIMAL value of 5.25 from an INT value of 35, the result can be handled two ways:

  1. As an INT, the result is 29 - the decimal portion (0.75) is truncated.
  2. As a DECIMAL, the result is 29.75 - the full precision is preserved.

For financial and scientific applications, always use DECIMAL types for columns involved in calculations where precision matters. Storing prices, rates, and quantities as INT to save storage space and then performing arithmetic on them risks precision loss that compounds over many calculations. Always consult your RDBMS documentation for the rules governing cross-domain arithmetic on your specific platform.

Calculated Fields Belong in Views, Not Base Tables

Every column in a base table must independently and uniquely store information about the table's subject. A column's value should represent a raw attribute of the entity - data that was observed, entered, or imported, not derived from other columns in the same table.

Calculated fields violate this principle. A LineCost column that is computed by multiplying Quantity by SellingPrice and then applying a Discount does not independently store information - it derives its value entirely from other columns. Including such a column in a base table creates redundancy: the same information is now represented twice (as the source columns and as the calculated result), and any update to a source column can leave the calculated column out of sync.

The rule is: never include calculated fields in a base table. Instead, place calculated expressions in user views[1]. A view can define a calculated column as a SELECT expression derived from base table columns, and every query against the view will compute the value from the current source data. User views also serve the secondary purpose of specifying which users are permitted access to specific data in the database. Views are examined in detail later in this course.

The next lesson describes null values[2].

Calculation Domain - Exercise

Before moving on to the next lesson, click the Exercise link below to reinforce your understanding of calculated domains:
Calculation Domain - Exercise

[1] User views: Saved queries created with SQL that specify which users are permitted access to specific data in a database. Views can include calculated expressions derived from base table columns.
[2] Null value: An unknown or missing value in a column or field.
SEMrush Software 8 SEMrush Banner 8