Lesson 9	Relational Database Null Values
Objective	Explain the purpose of null values in a Relational Database

Null Values in a Relational Database

In a relational database, null values serve a specific purpose in denoting the absence of information or the inapplicability of a data element. The concept of null values, introduced by Dr. E.F. Codd in his seminal work on the relational model, plays a crucial role in managing data integrity, flexibility, and semantics within the database. The following points explain the purpose of null values in a relational database:

Representation of missing information: Null values allow for the representation of missing or unknown information in a database. When a specific attribute or data point is not available or has not been provided for a given tuple, a null value can be assigned to indicate that the information is currently unavailable. This facilitates the storage of partial information without resorting to artificial default values, which could lead to incorrect assumptions or conclusions.
Distinction from empty or zero values: Null values distinguish the absence of information from other valid data elements, such as empty strings, zeros, or other default values. By using null values, a relational database can differentiate between situations where the actual value is known to be zero or empty, and situations where the value is simply unknown or missing.
Support for optional attributes: In database design, some attributes may not be applicable to every tuple or may be optional. Null values enable the representation of such optional attributes, allowing for more flexible and adaptable schema design. This can be particularly useful in situations where data requirements evolve over time or where the data model needs to accommodate a wide range of scenarios.
Preservation of data integrity: By explicitly representing the absence of information through null values, relational databases can maintain data integrity when performing operations such as joins, selections, or projections. Queries and operations that involve null values can be designed to account for their specific semantics, ensuring that results accurately reflect the underlying data and its limitations.
Semantics in aggregate functions: In relational databases, aggregate functions such as COUNT, SUM, AVG, MIN, and MAX summarize or calculate data across multiple tuples. SUM, AVG, MIN, MAX, and COUNT applied to a column skip null values, while COUNT(*) counts every row regardless of nulls. As a result, the absence of information does not skew or distort the calculations.
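
The treatment of nulls by aggregate functions can be checked with a small experiment. The following is a minimal sketch using Python's built-in sqlite3 module; the employee table and its three rows are hypothetical and exist only for illustration. Other database systems are expected to behave the same way for these functions, but that is an assumption worth verifying on your own RDBMS.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 50000), ("Bob", None), ("Cy", 70000)],  # None is stored as NULL
)

# COUNT(*) counts rows; the column-based aggregates skip the NULL salary.
row = conn.execute(
    "SELECT COUNT(*), COUNT(salary), SUM(salary), AVG(salary), "
    "MIN(salary), MAX(salary) FROM employee"
).fetchone()
count_rows, count_salary, total, average, lowest, highest = row

print(row)
```

Note that the average is computed over the two known salaries, not over all three rows. Treating the null as zero would have lowered the average, which is exactly the distortion that skipping nulls avoids.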

In conclusion, null values in a relational database serve the vital purpose of representing the absence of information or the inapplicability of a data element. Their proper use and handling ensure that the database can accurately model complex and evolving data requirements while maintaining data integrity and flexibility.
Every column in a table should contain a value, though there may be times when the value is unknown. For example, consider the following table, which stores data relating to Stories on CD suppliers.

SUPPLIER
SupplierId	SupplierName	SupplierAddress	SupplierPhone	SupplierFax

Columns of the SUPPLIER table: 1) SupplierId, 2) SupplierName, 3) SupplierAddress, 4) SupplierPhone, 5) SupplierFax

Determine Column Values of Table

To communicate with suppliers you will need their name, address, phone number, and fax. If you do not know one or more of those pieces of data, you will not know what to enter into its corresponding column.
When the value to enter into a field is unknown, the field holds a null value. A null is different from a blank or a zero. A blank is used when you know that the column has no value to record.
For instance, if the supplier you met with does not have a fax number, you would enter a blank value into that column. A zero, by contrast, is a numeric value, stored in a column of a type such as INT or DECIMAL.
If a Stories on CD supplier gave the company a thousand free CDs to send to the next one thousand customers who placed an order, the RetailPrice column for that CD would contain a zero.
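
The difference between a null, a blank, and a zero can be sketched in SQL. The example below uses Python's built-in sqlite3 module. The supplier names, the cd table, and the choice of an empty string to stand in for a blank are all assumptions made for illustration. Some systems handle blanks differently (Oracle, for instance, treats an empty string as a null), so check how your own RDBMS behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, trimmed-down version of the SUPPLIER table.
conn.execute(
    "CREATE TABLE supplier (SupplierId INTEGER, SupplierName TEXT, SupplierFax TEXT)"
)
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?, ?)",
    [
        (1, "Acme Audio", None),      # fax number unknown: NULL
        (2, "Story House", ""),       # supplier has no fax: blank (empty string)
        (3, "CD Depot", "555-0100"),  # fax number known
    ],
)

# Hypothetical table of CDs with a RetailPrice column.
conn.execute("CREATE TABLE cd (Title TEXT, RetailPrice REAL)")
conn.executemany(
    "INSERT INTO cd VALUES (?, ?)",
    [("Free Sampler", 0), ("Unpriced Title", None)],  # zero versus NULL
)

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [r[0] for r in conn.execute(sql)]

unknown_fax = ids("SELECT SupplierId FROM supplier WHERE SupplierFax IS NULL")
blank_fax = ids("SELECT SupplierId FROM supplier WHERE SupplierFax = ''")
free_cds = ids("SELECT Title FROM cd WHERE RetailPrice = 0")

print(unknown_fax, blank_fax, free_cds)
```

Each query finds a different row: the null is matched only by IS NULL, the blank only by the comparison with an empty string, and the zero only by the comparison with 0.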

Blank and null used interchangeably:
In many cases it appears that a blank and a null can be used interchangeably. However, the distinction is important to the RDBMS. If a value is truly unknown, enter a null into the column. You can represent a blank by typing a single space in the field. How you enter a null into a field varies from RDBMS to RDBMS. In a data-entry form, one common method is to skip the field entirely by pressing the Tab key; in an SQL statement, you write the keyword NULL.
The columns that make up database tables fall into one of two general categories: key columns and descriptor columns. The next lesson discusses the difference between them. Before you move on to the next lesson, work through the exercise below to reinforce your understanding of the distinctions between nulls, zeros, and blanks.

Null, Zero, and Blank Values for Database Tables

How would you fill in each of the following attributes of a Customer entity if the value of the attribute is not known in advance?

Customer Apartment Number: blank.
Because the customer may not live in an apartment, this field should contain a blank. If the customer lived in an apartment, they would surely include it in their mailing address to ensure prompt delivery of their order.
Customer Phone Number: null.
We assume every customer has a phone, so entering a null into this field indicates the value exists but is not known.
Customer Order Total: zero.
A customer could use a coupon to reduce the total cost of their order to zero, so allowing a value of 0 is appropriate for this field.

Null Value in a DBMS
Question: What does a null value in DBMS represent?
Answer: Null is a special marker used in SQL to indicate that a data value does not exist in the database. SQL Null serves to fulfill the requirement that all true relational database management systems (RDBMS) support a representation of 1) missing information and 2) inapplicable information. Codd also introduced the use of the lowercase Greek omega symbol ω to represent Null in database theory. NULL is also an SQL reserved keyword used to identify the Null special marker.

Systematic Treatment of Null Values [three-valued logic]

As you know, null is a special database value that means "unknown." Its presence in a database brings special problems during data retrieval. Consider, for example, what happens if you have an employee relation that contains a column for salary. Assume that the salary is null for some portion of the rows.
Question: What, then, should happen if someone queries the table for all people who make more than $60,000 a year?
Should the rows with null be retrieved, or should they be left out? When the DBMS evaluates a null against the logical criterion of salary value greater than 60,000, it cannot determine whether the row containing the null meets the criterion. Maybe it does, and maybe it does not. For this reason, we say that relational databases use three-valued logic. The result of the evaluation of a logical expression is true, false, or maybe (SQL calls the third value "unknown"). Codd's third rule therefore deals with the issue of nulls:
Null values (distinct from the empty character string or a string of blank characters and distinct from zero or any other number) are supported in the fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of data type.
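
Three-valued logic can be observed directly by asking a DBMS to evaluate a few expressions that involve null. The sketch below uses Python's built-in sqlite3 module; the evaluate helper is a hypothetical convenience function, not part of any library. SQLite reports true as 1, false as 0, and the third value as NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(expression):
    """Return the SQL result of an expression: 1 (true), 0 (false), or None (unknown)."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]

expressions = [
    "NULL > 60000",  # comparison with an unknown value
    "NULL = NULL",   # two unknowns are not known to be equal
    "NULL AND 0",    # false regardless of the unknown operand
    "NULL AND 1",    # depends on the unknown operand
    "NULL OR 1",     # true regardless of the unknown operand
    "NULL OR 0",     # depends on the unknown operand
    "NOT NULL",      # negating unknown is still unknown
]
results = {expr: evaluate(expr) for expr in expressions}

for expr, value in results.items():
    print(f"{expr:14} -> {value}")
```

The result is unknown only when the null operand could change the outcome. When the other operand already decides the answer, as in NULL AND 0 or NULL OR 1, the expression evaluates to a definite false or true.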

Where to store NULL values: First, a relational DBMS must represent null the same way in all columns and rows where the user does not explicitly enter data values. The representation must be the same regardless of the data type of the column. Note that null is not the same as a space character or zero. It is not a character at all, so it has no ASCII or Unicode code; a DBMS typically records it with a separate marker, such as a per-column null flag. However, in most cases when you see a query’s result table on the screen, nulls do appear as blank.
Second, the DBMS must have some consistent, known way of handling those nulls when performing queries. Typically, you will find that rows with nulls are not retrieved by a query, such as the salary greater than 60,000 example, unless the user explicitly asks for rows with a value of null. Most relational DBMSs today adhere to a three-valued logic truth table to determine retrieval behavior when they encounter nulls.
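
The salary example can be tried out as follows. This is a minimal sketch using Python's built-in sqlite3 module, with a hypothetical employee table and made-up rows; the names helper is also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical employee relation in which one salary is unknown.
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 75000), ("Bob", 40000), ("Cy", None)],  # None is stored as NULL
)

def names(where_clause):
    """Return the names of employees matching the given WHERE clause."""
    sql = f"SELECT name FROM employee WHERE {where_clause} ORDER BY name"
    return [name for (name,) in conn.execute(sql)]

above = names("salary > 60000")            # criterion evaluates to true
not_above = names("NOT (salary > 60000)")  # criterion evaluates to false
unknown = names("salary IS NULL")          # rows must be requested explicitly

print(above, not_above, unknown)
```

The row with the null salary appears in neither of the first two results, because the criterion is neither true nor false for it. Only the explicit IS NULL test retrieves it.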
The inclusion of nulls in a relation can be extremely important. They provide a consistent way to distinguish between valid data such as a 0 and missing data. For example, there is a big difference between the balance in an account payable being 0 and being unknown. The account with 0 is something we like to see; the account with an unknown balance could be a significant problem. The concept of unknown values is not unique to relational databases. Regardless of the data model it uses, a DBMS must contend with the problem of how to behave when querying against a null.

Rule 3: Systematic Treatment of NULL Values

Question: What Are Codd's Rules of an RDBMS?
Rule 3 states that NULL values (distinct from an empty character string, a string of blank characters, or a numeric zero value) are supported in the RDBMS as a systematic representation of missing information, independent of the data type of the column containing the NULL value. This means that the database engine has to allow NULL values for any data type, as distinct and different from zeros, spaces, and N/A. This emphasizes the importance of the database supporting defined nullability (the ability to not have any value at all) and optionality (the ability for optional relationships to other data sets).
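
Both ideas, nulls that are independent of data type and nullability that is defined per column, can be sketched with Python's built-in sqlite3 module. The sample table and its columns are hypothetical. SQLite is also more loosely typed than most RDBMSs, so treat this as an illustration of the principle rather than of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one column that forbids nulls, four optional columns
# declared with different data types.
conn.execute(
    """
    CREATE TABLE sample (
        id      INTEGER NOT NULL,
        label   TEXT,
        amount  REAL,
        created DATE,
        payload BLOB
    )
    """
)

# Every optional column accepts NULL, whatever its declared type.
conn.execute("INSERT INTO sample VALUES (1, NULL, NULL, NULL, NULL)")
stored = conn.execute(
    "SELECT label IS NULL, amount IS NULL, created IS NULL, payload IS NULL "
    "FROM sample"
).fetchone()

# The NOT NULL column rejects a missing value.
try:
    conn.execute("INSERT INTO sample VALUES (NULL, 'x', 1.0, '2024-01-01', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(stored, rejected)
```

Declaring a column NOT NULL is how a designer states that the attribute is mandatory. Leaving the constraint off makes the attribute optional.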