Design Strategy, Tools and Database Life Cycle
Regardless of the approach they take to database design, the overall strategy of database designers is the same:
- To accomplish certain time-tested objectives.
This module describes those objectives as well as two different approaches to database design, and illustrates database architecture. In addition, this module describes the design and post-design stages of the database life cycle, and outlines the capabilities of CASE tools.
The two approaches to database design are the 1) subject approach and 2) application approach.
Fixed Life Cycle
All computer software, including databases and their incorporated database models, has a fixed life cycle. That life cycle is the period of usefulness within acceptable limits of cost effectiveness. There comes a point where older legacy software is either too expensive to maintain or can be easily replaced. At that point, there is simply no reason to retain older software rather than rewrite it. In addition, older legacy databases (depending on how old they are) can present enormous problems for management in finding people to maintain them.
Module 3 Learning objectives
After completing this module, you will be able to:
- Describe the overall strategy of database design
- Describe the subject approach to database design
- Describe the application approach to database design
- Define Three Schema Architecture: 1) user view, 2) logical schema, and 3) physical schema
- Describe the design stages of the database life cycle
- Describe the post-design stages of the database life cycle
- Explain the use of CASE tools in database design
Stages of Data
What follows are the stages that new data goes through on its way from being used as a recent fact to becoming a historic fact and eventually to becoming a forgotten fact. The latter is a stage in the life cycle that we are having increasing difficulty believing in because of repeated requests by clients to unearth their archived data or restore it from backups. This is not the life cycle of an application development project, but many other processes can go through similar stages.
This is the point in time where someone asks for some data.
Let us consider the example of a "credit card transaction" "charge-back fee amount".
Say you are a small retail firm, and you want to begin accepting credit cards to pay for purchases. You contact several credit card companies to determine the process. You discover that different companies charge different fees for using their services.
The fees charged differ under different circumstances such as 1) cash advances or 2) payment overdue fees.
You realize that to understand what the provision of this service to your customers is going to cost you, you need to collect the service fee for each transaction separately to be able to better analyze profitability.
This recognition of the need for a data element is step one.
A data element is not captured just because it exists. In order to consider a data element, it needs to be of value to someone, or there needs to be a reason for using this element in the future. As you find out what impact the new data element has on the organization, you may discover other data elements also need to be available.
The Need It phase is the recognition of a new fact, which needs to be available in order to accomplish a task. You now know you need a Credit Card Transaction Charge-Back Fee Amount to assess profitability in your business, determine return on investment (ROI) by looking at service usage and cost, and even plan next year's budget. So what happens after you know you need it?
In this stage, you explore and analyze the data element you need to collect.
You ask questions about frequency, size, and business rules. You explore methods of capturing and storing data while investigating security, reliability, and quality issues.
You ask questions such as:
- How much can a credit card transaction charge-back fee amount be?
- If you are charged later, how do you tie it to the individual transaction?
- Is it credited to the store so the amount could be a plus or a minus value?
Generally, hundreds of questions arise to discover and plan for the true nature and business requirements of the data element. When you are in this Plan It stage, you look at different aspects of the data element and document them. This may be in the form of a formal specification or a lively brainstorming session. What happens after you make a plan for it?
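The planning questions above can be written down as explicit, checkable rules. The following is a minimal sketch in Python, not a definitive design: the names ChargeBackFee, validate_fee, and MAX_ABS_FEE are hypothetical, and the 500.00 limit is an assumed placeholder for whatever bound the card company's fee schedule actually implies. It assumes a negative amount represents a credit to the store.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List

# Assumed upper bound for illustration only; the real limit would come
# from the answer to "how much can a charge-back fee amount be?"
MAX_ABS_FEE = Decimal("500.00")


@dataclass(frozen=True)
class ChargeBackFee:
    transaction_id: str  # ties a fee charged later to its transaction
    amount: Decimal      # assumption: negative means a credit to the store


def validate_fee(fee: ChargeBackFee) -> List[str]:
    """Return the list of rule violations (empty if the fee passes)."""
    problems = []
    if not fee.transaction_id:
        problems.append("missing transaction_id")
    if abs(fee.amount) > MAX_ABS_FEE:
        problems.append("amount outside expected range")
    # Amounts are assumed to be recorded to the cent.
    if fee.amount != fee.amount.quantize(Decimal("0.01")):
        problems.append("amount has more than two decimal places")
    return problems


if __name__ == "__main__":
    print(validate_fee(ChargeBackFee("T1", Decimal("2.50"))))
    print(validate_fee(ChargeBackFee("", Decimal("-1.75"))))
```

Each rule corresponds to one planning question, so when the plan changes (for example, the card company changes its billing style), the change shows up as an edit to a single check.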
The Collect Data Stage is the part of the life cycle where you check to see if you were right and then implement the plan.
It is a time of testing and finally deploying the chosen method of data creation. You must sufficiently understand the data to at least draft a solution to the data requirements and to be able to verify that the plan is going to work. For this example, you start a collection of credit card transaction charge-back fee amounts and see if you learned enough to manage them properly. Sometimes this means you end up discovering you need more data elements or you did not plan for this one properly. This testing happens every time a new value (a real instance of the data element) comes into your hands to manage. You may find that things run smoothly for a period of time and then changes in the billing style of the credit card company impact your understanding, use, and definition of this data element. When that happens, the plan for the data element changes. None of these stages in the life cycle has to be sequential. Loopbacks can happen innumerable times.
So now you feel comfortable that you have the process of collecting credit card transaction charge-back fee amounts right.
After collecting a certain amount of data, you must decide how to store it, possibly using different techniques to support different processes.
For example, you may need a focused, simple, little data store, such as an Extensible Markup Language (XML) file to collect real-time data.
However, you also need a larger, more complex one to support the compilation of all the little data sets into one for corporate reporting. Imagine that the data elements are originally collected in dozens of little individual credit card scanners that are polled on a periodic basis. Those data values may need to be applied correctly to the sales data from the cash registers and finally aggregated and sent on to a different environment. Security or accessibility issues may require parallel storage (in case of power or equipment failure). This is another point at which it is critical to monitor the process by which you will collect this data element, with particular regard to correctness and to performance requirements. You are striving for a process that is accurate and reliable so that the Credit Card Transaction Charge-Back Fee Amount is correctly collected and stored each time, but at the same time you may be required to be sensitive to adding undue complexity (and therefore time) to the sale completion process.
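As a rough illustration of the small-store-then-compile idea, the sketch below writes each scanner's fees to a tiny XML document and then totals the polled documents per scanner. The element and attribute names (scanner, fee, id, transaction) and both function names are invented for this example; a real scanner format would differ. It uses only Python's standard library.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from decimal import Decimal


def scanner_xml(scanner_id, fees):
    """Serialize one scanner's fees; `fees` is a list of
    (transaction_id, amount) pairs. The layout is hypothetical."""
    root = ET.Element("scanner", id=scanner_id)
    for transaction_id, amount in fees:
        ET.SubElement(root, "fee", transaction=transaction_id).text = str(amount)
    return ET.tostring(root, encoding="unicode")


def total_fees_by_scanner(documents):
    """Parse the polled XML documents and total the fees per scanner."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for doc in documents:
        root = ET.fromstring(doc)
        for fee in root.findall("fee"):
            totals[root.get("id")] += Decimal(fee.text)
    return dict(totals)


if __name__ == "__main__":
    polled = [
        scanner_xml("S1", [("T1", Decimal("2.50")), ("T2", Decimal("-1.00"))]),
        scanner_xml("S2", [("T3", Decimal("0.75"))]),
    ]
    print(total_fees_by_scanner(polled))
```

Keeping the per-scanner format simple keeps the point-of-sale step cheap, while the heavier aggregation runs later and elsewhere, which is the trade-off between correctness and sale-completion time described above.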
But once you have collected and stored it to your satisfaction, what happens next?
This is where you start to use the data element to your advantage. This is a maturing of the data element itself. After it has been defined to the team's satisfaction, new ways to use it should start to spring up, especially if the data element is a timely and reliable addition to the inventory of business data elements. Equally, any problems with the original analysis will have become apparent and need revising. This is a time of integration with other data elements, sometimes even in other data collections with a new generation of highly complicated report writing.
The more the data element is requested for use, the more confident you can be that it was well-defined to satisfy the requirements.
The Credit Card Transaction Charge-Back Fee Amount becomes a new variable, additive fact, or decision point in the sales, marketing, finance, and accounting departments. The new data element could become a popular new addition to the data element resource pool for the enterprise. It is referred to in the formula for gross sales and net sales. The data warehousing team is asked to provide it in the monthly sales fact table. It becomes a part of the never-ending thirst for better understanding of the enterprise's health. So after it is aggregated, averaged, sliced, diced, racked, and stacked in reports and on screens all over the company, what can it do?
Act on It
Data elements can become the foundation for business action. They can even become the foundation for a new series of data elements that are used like a dashboard of combined data elements provided to influence decisions. You can achieve this stage only if you understand the data and it becomes a cornerstone of the data collection for the client. In combination, data elements can be the basis for business information and ultimately business knowledge. This is the crowning achievement for a piece of data. It has been proven to be a quality fact and can be relied on to support a decision. This data should have an auditable pedigree (traceable origin and processing) in order to stand the scrutiny of doubt if the action does not get the expected results. In other words, you had better know where it came from and who thought it was a true fact.
This is, of course, the targeted goal of data: to be useful. The Credit Card Transaction Charge-Back Fee Amount can become the basis for a large decision, such as another new program to provide a bank debit function for customers, which could be less expensive to the company than the charges from credit cards. It can become the basis for daily small decisions, such as offering discounts for cash sales of less than a certain value. It may even become the basis for a decision to cancel the credit card program and become the reason for its own removal from the data element inventory. However, once the data element has established itself as a useful addition to the company data resource, what do you do with it?
Now the data values have become a sizeable data set. The next step in the life cycle is to begin the process of archiving or preserving backup and historic copies for security, for restorability (in the case of disasters), and for reducing the bulk in the production data sets. Once data values have depreciated in relevance and value (sometimes that just means they are too old to be referenced daily), they are usually transferred to a nearby storage area. Only the data clients can determine how old a data value should be to be archived.
Archiving data is fraught with its own challenges. How should the data be archived? How accessible should it be? How safe should the archive be? Should it be saved in the same format, in the same structure, and according to the same rules as the live data?
The Credit Card Transaction Charge-Back Fee Amounts from the current week are critical to have available; the ones from last month are less so. The individual values from two years ago have almost lost their value. They may have value now only in an aggregated form for trend reporting.
You can begin to take slices of data from the production systems and archive them in several ways. A data warehouse may keep a larger set of older data than any other application in the company, but even it may need to resort to removing transaction-level details into various data storage tools that support accessibility performance at an appropriate level.
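One way to picture taking a slice of production data for archiving is sketched below, under stated assumptions. The 30-day retention window is an arbitrary placeholder, since only the data clients can decide the real cutoff, and the function names are hypothetical. Records are plain (day, amount) pairs; old rows are separated out and then reduced to monthly totals for trend reporting.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal


def split_for_archive(records, today, keep_days=30):
    """Split (day, amount) records into live rows and rows old enough
    to archive. keep_days=30 is an assumed default, not a recommendation."""
    cutoff = today - timedelta(days=keep_days)
    live = [r for r in records if r[0] >= cutoff]
    archived = [r for r in records if r[0] < cutoff]
    return live, archived


def monthly_totals(records):
    """Reduce transaction-level rows to one total per (year, month)."""
    totals = defaultdict(Decimal)
    for day, amount in records:
        totals[(day.year, day.month)] += amount
    return dict(totals)


if __name__ == "__main__":
    rows = [
        (date(2024, 3, 14), Decimal("2.00")),
        (date(2024, 1, 5), Decimal("1.50")),
        (date(2024, 1, 20), Decimal("0.50")),
    ]
    live, old = split_for_archive(rows, today=date(2024, 3, 15))
    print(live)
    print(monthly_totals(old))
```

The transaction-level rows in the archived slice can go to slower storage, while the monthly totals stay close at hand for reporting.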
This is the golden retirement stage of data values. They can still be useful but are no longer of immediate relevance. What happens when the data values or whole data elements finally become obsolete?
Well, you delete them from all live environments. They fall off everyone's list of responsibilities. They may not be deleted the way you think of a transaction deleting a record, but they are gone for all intents and purposes from anyone's ability to use them. Over time their very existence fades away. Data values and data elements get tossed out when they are no longer of value. Tapes are erased, backups deleted, and whole collection tools (such as old applications, measuring devices, and storage devices) are scrapped. Even if the data still exists (say, on an 8-inch floppy), the means to retrieve it may no longer be available. This is the rest in peace stage of data. The amount of data in our landfills is probably staggering, but our perception of the value has changed over time. It is hard now to conceive of any data being worthless. But we do know that we still come across bone piles of tapes, disks, and clunky old machines that contain data values and collections that exist nowhere else, so it seems appropriate to note that this stage still exists in the life of a data element. When credit cards no longer exist, the Credit Card Transaction Charge-Back Fee Amount will be as useful as a collection of payroll information from 1912. Only historians will be interested, and they will be interested in only samples, not all of it. But that leads us into the next step that can happen to a data element.
The next lesson outlines the objectives of database design strategy.