Database Design

Lesson 7: Data flow diagram
Objective: Explain the purpose of the data flow diagram.

Data Flow Diagram

As part of requirements analysis, the designer may request organizational charts from the business and supplement them with information gathered during user interviews to assess how data flow is handled within the organization. The designer uses this information to create a data flow diagram. These diagrams help the designer establish user views across multiple databases (and determine whether multiple databases are needed) and sort business objects by subject matter. Although there are different styles of data flow diagrams, they are similar in appearance and function and are read in the same way.

Creating Data Flow Diagrams

The two main styles of data flow diagrams are the Yourdon/DeMarco style and the Gane and Sarson style. The Yourdon/DeMarco style uses squares, circles, and parallel lines to represent data handlers, processes, and databases, respectively. The Gane and Sarson style uses squares, round-cornered rectangles, and open-ended rectangles for the same three elements. CASE tools are useful for creating data flow diagrams, just as they are for creating ER diagrams. Although different CASE tools use slightly different symbols, the diagrams illustrate the same information and are interpreted in the same manner.
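The symbol vocabularies of the two styles can be written down as a small lookup table. The following is only an illustrative sketch in Python; the names `NOTATIONS`, `symbol_for`, and `differing_elements` are made up for this example and do not come from any CASE tool.

```python
# Shapes used by each style, as described in the paragraph above.
NOTATIONS = {
    "Yourdon/DeMarco": {
        "data handler": "square",
        "process": "circle",
        "database": "parallel lines",
    },
    "Gane and Sarson": {
        "data handler": "square",
        "process": "round-cornered rectangle",
        "database": "open-ended rectangle",
    },
}


def symbol_for(style: str, element: str) -> str:
    """Return the shape that a style uses for one kind of element."""
    return NOTATIONS[style][element]


def differing_elements(style_a: str, style_b: str) -> list[str]:
    """List the element kinds that the two styles draw differently."""
    a, b = NOTATIONS[style_a], NOTATIONS[style_b]
    return [kind for kind in a if a[kind] != b[kind]]


print(symbol_for("Gane and Sarson", "process"))
print(differing_elements("Yourdon/DeMarco", "Gane and Sarson"))
```

Only the shapes for processes and databases differ, which is why a reader who knows one style can read the other.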

Data Flow Diagram (DFD)

A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system, modeling its process aspects. DFDs are a preliminary step used to create an overview of the system, which can later be elaborated. They can also be used to visualize data processing (structured design). A DFD shows the kind of information that will be input to and output from the system. It does not show the timing of processes or whether processes will operate in sequence or in parallel.

Yourdon/DeMarco Data Flow Diagram

The Yourdon/DeMarco data flow diagram, for example, includes symbols that illustrate:
  1. Who handles the data (text enclosed in a square)
  2. Where the data moves (arrows on the diagram)
  3. Where the data are stored (text surrounded by parallel lines)
  4. What is done to the data (text inside a circle)

The following diagram illustrates a typical Yourdon/DeMarco data flow diagram. Note the use of the symbols mentioned above:
  1. Text surrounded by a square indicates people who handle the data.
  2. Arrows connecting text indicate the movement of data.
  3. Text within circles indicates an operation performed on the data.
  4. Text enclosed by parallel lines indicates where the data is stored.
The diagram below shows how the data are passed from one component to the next. The numbered list that follows describes the process associated with each number in the diagram.
Data Flow Diagram
  1. A customer places an order.
  2. An employee takes the customer order.
  3. The order is stored in the database.
  4. An employee prepares the order for shipment.
  5. The order is shipped to the customer.
  6. The ship date is recorded in the database.
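
The six steps above can also be recorded as a small graph of nodes and labeled arrows. This is a minimal sketch, not a reconstruction of the figure: the process names ("Take order", "Prepare shipment"), the arrow endpoints, and the arrow labels are assumptions chosen to match the step descriptions.

```python
from dataclasses import dataclass

# Node kinds follow the symbol types listed earlier:
# handler (square), process (circle), store (parallel lines).
NODES = {
    "Customer": "handler",
    "Employee": "handler",
    "Take order": "process",         # assumed process name
    "Prepare shipment": "process",   # assumed process name
    "Database": "store",
}


@dataclass(frozen=True)
class Flow:
    step: int      # number from the list above
    source: str    # node the arrow leaves
    target: str    # node the arrow enters
    data: str      # label on the arrow


# Endpoints and labels are assumptions; the figure is not reproduced here.
FLOWS = [
    Flow(1, "Customer", "Take order", "order"),
    Flow(2, "Employee", "Take order", "order entry"),
    Flow(3, "Take order", "Database", "order"),
    Flow(4, "Employee", "Prepare shipment", "packed order"),
    Flow(5, "Prepare shipment", "Customer", "shipment"),
    Flow(6, "Prepare shipment", "Database", "ship date"),
]


def check(flows, nodes):
    """Raise ValueError if a flow refers to a node that was not declared."""
    for f in flows:
        for name in (f.source, f.target):
            if name not in nodes:
                raise ValueError(f"step {f.step}: unknown node {name!r}")


def data_stored(flows, nodes):
    """Return the labels of data that flow into a data store, in step order."""
    ordered = sorted(flows, key=lambda f: f.step)
    return [f.data for f in ordered if nodes[f.target] == "store"]


check(FLOWS, NODES)
print(data_stored(FLOWS, NODES))  # ['order', 'ship date']
```

Note that the structure records who handles the data, where it moves, and where it is stored, but nothing about timing, which matches what a DFD does and does not show.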


Difference between Data Modeling and Data Flow

One of the most common mistakes people make when they begin data modeling is confusing data models with data flows. A data flow shows how data are handled within an organization, including
  1. who handles the data,
  2. where the data are stored, and
  3. what is done to the data.
In contrast, a data model depicts the internal, logical relationships between the data, without regard to who is handling the data or what is being done with it.
Data flows are often documented in data flow diagrams (DFDs). For example, Figure 4-7 shows a top-level data flow diagram for Distributed Networks. The squares with drop shadows represent the people who are handling the data. Simple rectangles with numbers in them represent processes, or things that are done with the data. A place where data are stored (a data store) appears as two parallel lines, in this example containing the words "Main database." The arrows on the lines show the direction in which data pass from one place to another. Data flow diagrams are often exploded to show more detail.
For example, Figure 4-8 contains an explosion of the "Take order" process from Figure 4-7. You can now see that the process of taking an order involves two major steps: getting customer information and getting item information.

Figure 4-7: A top-level data flow diagram for Distributed Networks

Expand Process for further Detail

Each of the processes in Figure 4-8 can be expanded even further to show additional detail.
At this point, the diagrams are almost detailed enough so that an application designer can plan an application program.
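Exploding a process amounts to attaching child processes to it, level by level. The sketch below models that idea in Python under stated assumptions: the `Process` class, the process number "1" for "Take order", and the dotted numbering of sub-processes (1.1, 1.2) are illustrative choices, not details taken from the figures.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    number: str
    name: str
    children: list["Process"] = field(default_factory=list)

    def explode(self, *names: str) -> list["Process"]:
        """Attach numbered sub-processes and return the newly added ones."""
        start = len(self.children)
        for i, name in enumerate(names, start=start + 1):
            self.children.append(Process(f"{self.number}.{i}", name))
        return self.children[start:]

    def outline(self, depth: int = 0) -> list[str]:
        """Return an indented outline of this process and its explosions."""
        lines = ["  " * depth + f"{self.number} {self.name}"]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# "1" is an assumed process number for "Take order".
take_order = Process("1", "Take order")
take_order.explode("Get customer information", "Get item information")
print("\n".join(take_order.outline()))
```

Each child can be exploded again in the same way until the outline is detailed enough for an application designer to work from.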
Question: Where do the database and the ER diagram fit into all of this?
Answer: The entire ER diagram is buried inside the "Main database".
In fact, most CASE software allows you to link your ER diagram to a database's representation on a data flow diagram. Then, you can simply double-click on the database representation to bring the ER diagram into view.
There are a few guidelines for keeping data flows and data models separate:
  1. A data flow shows who uses or handles data. A data model does not.
  2. A data flow shows how data are gathered (the people or other sources from which they come). A data model does not.
  3. A data flow shows operations on data (the processes through which data are transformed). A data model does not.
  4. A data model shows how entities are interrelated. A data flow does not.
  5. A data model shows the attributes that describe data entities. A data flow does not.

Figure 4-8: An explosion of the "Take order" process from Figure 4-7
The bottom line is this: A data model contains information about the data being stored in the database (entities, attributes, and entity relationships). If data about an entity are not going to be stored in the database, then that entity should not be part of the data model. For example, although the DistributedNetworks data flow diagram shows the employee who handles most of the data, no data about employees are going to be stored in the database. Therefore, there is no employee entity in the ER diagram.
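
The rule in the paragraph above can be stated as a simple filter. In this hypothetical sketch, the flags for Customer and Order are assumptions made for illustration; only the Employee case (no data stored, so no entity) comes from the text.

```python
# Things that appear on the data flow diagram, each flagged with whether
# data about it will be stored in the database.
DFD_PARTICIPANTS = {
    "Customer": True,    # assumed for illustration
    "Order": True,       # assumed for illustration
    "Employee": False,   # stated in the text: no employee data are stored
}


def data_model_entities(participants: dict[str, bool]) -> list[str]:
    """Keep only the participants whose data are stored in the database."""
    return sorted(name for name, stored in participants.items() if stored)


print(data_model_entities(DFD_PARTICIPANTS))  # ['Customer', 'Order']
```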

It is not always necessary to create a data flow diagram. For example, Stories on CD, Inc. is small, with only one subject database. It is not necessary to create a diagram to understand the flow of data in this instance. Views created for this company would provide specific information in a specific format, and those views could be documented without recourse to a data flow diagram. However, a large organization with several subject databases (or one very large database) will require numerous user views that draw from different databases. In this case, a data flow diagram is important for two reasons:
  1. To determine what data from which databases go into different user views
  2. To assist application designers in planning database application programs
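
The first reason can be made concrete by listing which databases feed each user view. The view and database names below are hypothetical and serve only to show the bookkeeping that a data flow diagram supports.

```python
# Hypothetical user views mapped to the subject databases they draw from.
VIEW_SOURCES = {
    "Order status": {"Sales"},
    "Shipping manifest": {"Sales", "Inventory"},
    "Reorder report": {"Inventory"},
}


def views_spanning_databases(view_sources: dict[str, set[str]]) -> list[str]:
    """Return the views that draw data from more than one database."""
    return sorted(v for v, dbs in view_sources.items() if len(dbs) > 1)


def views_by_database(view_sources: dict[str, set[str]]) -> dict[str, list[str]]:
    """Invert the mapping: for each database, list the views that use it."""
    index: dict[str, list[str]] = {}
    for view, dbs in view_sources.items():
        for db in dbs:
            index.setdefault(db, []).append(view)
    return {db: sorted(views) for db, views in sorted(index.items())}


print(views_spanning_databases(VIEW_SOURCES))
print(views_by_database(VIEW_SOURCES))
```

With one subject database, every view trivially maps to the same source, which is why a small organization can skip the diagram.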

The larger the organization for which you are designing database(s), the more important it is to create a data flow diagram. The next lesson discusses user views.

