Relational Constructs   «Prev  Next»

Null, Zero, and Blank Values for Database Tables

How would you specify the attributes for a Customer entity in a table column for that entity, if the value of the attribute is not known in advanced?
  1. Customer Apartment Number: blank.
    Because the customer may not live in an apartment, this field should contain a blank. If the customer lived in an apartment, they would surely include it in their mailing address to ensure prompt delivery of their order.
  2. Customer Phone Number: null.
    We assume every customer has a phone, so entering a null into this field indicates the value exists but is not known.
  3. Customer Order Total: zero.
    A customer could use a coupon to reduce the total cost of their order to zero, so allowing a value of 0 is appropriate for this field.

null value in DBMS

Question: What does a null value in DBMS represent?
Answer: Null is a special marker used in SQL to indicate that a data value does not exist in the database. SQL Null serves to fulfill the requirement that all true relational database management systems (RDBMS) support a representation of 1) missing information and 2) inapplicable information. Codd also introduced the use of the lowercase Greek omega symbol ω to represent Null in database theory. NULL is also an SQL reserved keyword used to identify the Null special marker.

Systematic Treatment of Null Values [three-valued logic]

As you know, null is a special database value that means "unknown." Its presence in a database brings special problems during data retrieval. Consider, for example, what happens if you have an employees’ relation that contains a column for salary. Assume that the salary is null for some portion of the rows.
Question: What, then, should happen if someone queries the table for all people who make more than $60,000 a year?
Should the rows with null be retrieved, or should they be left out? When the DBMS evaluates a null against the logical criterion of salary value greater than 60,000, it cannot determine whether the row containing the null meets the criteria. Maybe it does, and maybe it does not. For this relation, we say that relational databases use three-valued logic. The result of the evaluation of a logical expression is true, false, or maybe. Codd's third rule therefore deals with the issue of nulls:
Null values (distinct from the empty character string or a string of blank characters or any other number) are supported in the fully relational DBMS for representing missing information in a systematic way, independent of data type.

Where to store NULL values

First, a relational DBMS must store the same value for null in all columns and rows where the user does not explicitly enter data values. The value used for null must be the same, regardless of the data type of the column. Note that null is not the same as a space character or zero; it has its own distinct ASCII or UNICODE value. However, in most cases when you see a query’s result table on the screen, nulls do appear as blank.
Second, the DBMS must have some consistent, known way of handling those nulls when performing queries. Typically, you will find that rows with nulls are not retrieved by a query, such as the salary greater than 60,000 example, unless the user explicitly asks for rows with a value of null. Most relational DBMSs today adhere to a three-valued logic truth table to determine retrieval behavior when they encounter nulls.
The inclusion of nulls in a relation can be extremely important. They provide a consistent way to distinguish between valid data such as a 0 and missing data. For example, there is a big difference between the balance in an account payable being 0 and being unknown. The account with 0 is something we like to see; the account with an unknown balance could be a significant problem.
The concept of unknown values is not unique to relational databases. Regardless of the data model it uses, a DBMS must contend with the problem of how to behave when querying against a null.

Ad   Fundamentals of Database Systems