Physical Design   «Prev  Next»

Lesson 14Building the Database
ObjectiveExplain how to use SQL to build a Database

Creating Schema and Building a Database using SQL

Once you have created the schema and populated it with tables, it is time to build the database itself. This involves a use of SQL that is beyond the scope of this course.
To create a schema, you use the CREATE SCHEMA statement. In its simplest form it has the syntax:
CREATE SCHEMA schema_name
as in:
CREATE SCHEMA dispersednet

By default, a schema belongs to the user who created it (the user ID under which the schema was created). The owner of the schema is the only user ID that can modify the schema unless the owner grants that ability to other users. To assign a different owner to a schema, you add an AUTHORIZATION clause:

For example, to assign the schema of the Antique Opticals to the user ID DBA, someone could use:

When creating a schema, you can also create additional elements at the same time. To do so, you use braces to group the CREATE statements for the other elements, as in:
owner_user_id { 
// other CREATE statements go here 

SQL Commands used to create a Database Table

The following diagram gives you a flavor of the commands involved in creating a table for the course-project database. (The code assumes a schema named StoriesOnCDOrders already exists.) This automatically assigns the elements with the braces to the schema
SQL code
SQL Commands in the diagram above are detalied below
  1. CREATE TABLE CD: Tells the RDBMS to make a table named CD.
  2. CDNo int: Creates a field named CDNo with the data type Integer.
  3. CDTitle varchar (42): Creates a field named CDTitle that can contain up to 42 characters.
  4. DistID int: Creates a field named DistID with the data type Integer.
  5. CDPrice numeric (3, 2): Creates a field named CDPrice that can hold a numeric value with up to three digits to the left of the decimal and two to the right (e.g., 14.95).
  6. CategoryNo int: Creates a field named CategoryNo with the data type Integer.
  7. PRIMARY KEY (CDNo): Defines the CDNo field as the primary-key field for the table.
  8. FOREIGN KEY (DistID): Identifies the DistID field as a foreign-key field.
  9. FOREIGN KEY (CategoryNo): Identifies the CategoryNo field as a foreign-key field.

What is an Identifier in the context of a Database Object?

An identifier is a name that lets you refer to an object unambiguously within the hierarchy of database objects (whether a schema, database, column, key, index, view, constraint, or anything created with a CREATE statement). An identifier must be unique within its scope, which defines where and when it can be referenced. In general:
  1. Database names must be unique on a specified instance of a database server.
  2. Table and view names must be unique within a given schema (or database).
  3. Column, key, index, and constraint names must be unique within a given table or view.
This scheme lets you duplicate names for objects whose scopes do not overlap. You can give the same name to columns in different tables, for example, or to tables in different databases.

SQL Commands used to create Database Table

The operators of the relational algebra allow us to start with some given relations and obtain further relations from those given ones (for example, by performing queries). The given relations are referred to as base relations[1], the others are derived relations. In order to get us started, therefore, a relational system has to provide a means for defining those base relations in the first place. In SQL, this task is performed by the CREATE TABLE statement (the SQL counterpart to a base relation being, of course, a base table, which is what CREATE TABLE creates). And base relations obviously need to be named. For example:

But certain derived relations, including in particular what are called views, are named as well. A view (also known as a virtual relation) is a named relation whose value at any given time t is the result of evaluating a certain relational expression at that time t. Here’s an SQL example:
WHERE CITY = 'Paris' ) ;

Operate on Views

In principle, you can operate on views just as if they were base relations,12 but they are not base relations. Instead, you can think of a view as being "materialized". In effect, you can think of a base relation being constructed whose value is obtained by evaluating the specified relational expression─at the time the view in question is referenced. But I must emphasize that thinking of views being materialized in this way when they are referenced is purely conceptual and is only a way of thinking. It is not what is really supposed to happen; and it would not work for update operations in any case.
  1. Base relations really exist, that is, they are physically stored in the database.
  2. Views, by contrast, do not really exist, they merely provide different ways of looking at the base relations.

But the relational model has nothing to say as to what is physically stored. In fact, it has nothing to say about physical storage matters at all. In particular, it categorically does not say that base relations are physically stored. The only requirement is that there must be some mapping between whatever is physically stored and those base relations, so that those base relations can somehow be obtained when they are needed. If the base relations can be obtained from whatever is physically stored, then everything else can also be obtained. For example, we might physically store the join of 1) suppliers and 2) shipments, instead of storing them separately. Then base relations S and SP could be obtained, conceptually, by taking appropriate projections of that join. In other words, base relations are no more (and no less) physical than views are, as far as the relational model[2] is concerned. The fact that the relational model says nothing about physical storage is deliberate.
The idea was to give implementers the freedom to implement the model in whatever way they chose. In particular, in whatever way seemed likely to yield good performance without compromising on physical data independence. Most SQL product vendors seem not to have understood this point and instead, they map base tables fairly directly to physical storage. Database Management Systems therefore provide far less physical data independence than relational systems. Indeed, this state of affairs is reflected in the SQL standard itself (as well as in most other SQL documentation). The whole point about a view is that it is a table (or a relation). That is, we can perform the same kinds of operations on views as we can on regular relations (at least in the relational model), because views are "regular relations." The term relation is to mean a relation (possibly a base relation, possibly a view, possibly a query result, etc.) If I want to mean a base relation specifically, then I will say "base relation." I suggest strongly that you adopt the same discipline for yourself. Do not fall into the common trap of thinking the term relation means a base relation specifically or, in SQL terms, thinking the term table means a base table specifically. Likewise, do not fall into the common trap of thinking base relations (or base tables, in SQL) have to be physically stored.

Base Relations in Relational Algebra

Base relations are the core building blocks from which other relations or views can be derived.
Key characteristics of base relations:
  • Physically Stored: Base relations are physically stored on disk or other storage media, unlike derived relations, which are computed on demand.
  • Independent: They aren't computed from other relations. They serve as the foundation for constructing more complex relations through relational algebra operations.
  • Directly Manipulated: Users can directly interact with base relations through operations like insert, update, and delete.
  • Represented as Tables: They are typically visualized as tables with rows (tuples) representing individual records and columns (attributes) representing data fields.
Examples of base relations in a database:
  • Customers: Storing customer information (name, address, phone number, etc.)
  • Orders: Keeping track of orders placed (order ID, customer ID, order date, items ordered, etc.)
  • Products: Maintaining product information (product ID, name, description, price, etc.)
  • Employees: Holding employee data (employee ID, name, job title, salary, etc.)
Contrasting with Derived Relations:
  • Derived relations (views): These are virtual tables that are computed from base relations using relational algebra expressions. They don't physically store data but present a customized view of the underlying data.
  • Base relations serve as the starting point for creating derived relations.
Importance of base relations:
  • Foundation of data storage: They form the primary storage mechanism for data in relational databases.
  • Building blocks for queries: Relational algebra operators are applied to base relations to retrieve and manipulate data.
  • Data integrity: Enforcing integrity constraints (e.g., primary keys, foreign keys) on base relations ensures data consistency and accuracy.
  • Efficiency: Optimizing base relation storage and access methods is crucial for database performance.

The next lesson examines the completed data dictionary.
[1]base relations: In the context of relational algebra and relational database theory, base relations refer to the fundamental, stored tables that hold the actual data in a database.
[2] Relational Model: Relational Model represents how data is stored in Relational Databases. A relational database stores data in the form of relations (tables).

SEMrush Software