As with columns, relational databases have rules governing table rows:
- Each column in a row can contain only one value.
- There can be no duplicate rows: each instance of the entity described by the table cannot be represented more than once in the table.
- Each row has a primary key, which is a column, or set of columns whose values uniquely identify the row in the table.
Database design requires that every table in a relationship have a primary key. This is the reason why: Consider a table that lists customers for Stories on CD, Inc. If the same customer identification number were listed twice, orders from that customer could be linked to either instance of the customer ID. That would make it impossible to find every order placed by the customer if you searched based on only one of the customer's identification numbers.
You can think of a relational database as a collection of tables, each containing rows and columns. At this level, it looks a lot like a workbook containing several worksheets (or spreadsheets), although a worksheet is much less constrained than a database table is. You can put just about anything in any cell in a worksheet. In contrast, every entry in a particular column of a table is expected to contain the same kind of data. For example, all of the fields in a particular column should contain the same data type. A poorly designed database application may allow the user to introduce some irrelevant kinds of data into other fields. For example, if the database and user interface are not designed properly, you might be able to enter a string such as
none in a telephone number field. Where in reality, the field for this column may be a mandatory attribute.
In contrast, the cells of a spreadsheet do not really care what you put in them. The set of values that are allowed for a column is called the
domain [1] of the column. For example, a column's domain might be telephone numbers, bank account numbers, shoe sizes, or lamp colors.
Domain is closely related to data type but it is not quite the same. A column's data type is the kind of data that the column can hold.
The data types that you can use for a column depend on the particular database you are using but typical data types include
- integer,
- floating point number (a number with a decimal point),
- string, and
- date.
To see the difference between domain and data type, note that street address (345 Relational Road) and jersey color (red) are both strings.
However, the domain for the street address column is a valid physical street address, whereas the domain for the jersey color column is colors (and all colors may not be available if you only allow a few choices). You can think of the data type as the highest level or most general possible domain. For example, an address or color domain is a more restrictive subset of the domain allowing all strings.) The rows in a table correspond to column values that are related to each other according to the purpose of the table.
Some relational database management systems will not enforce the
no-duplicate-rows rule by default. Double-check your system if it does not activate it manually.
The next lesson describes column domains.