Recovery Considerations
Lesson 6: Identify the components of a disaster recovery plan
Objective: Identify the components of an effective disaster recovery plan.

Disaster Recovery Plan

Systems fail for a variety of reasons. Most failures fall into one of four categories:
  1. hardware,
  2. software,
  3. operational, or
  4. environmental.
Each category presents different challenges and requires specific recovery techniques. We'll cover each category briefly, after first looking at the components of the plan itself.

What are the components of an effective disaster recovery plan in Oracle?

An effective disaster recovery plan in Oracle should have the following components:
  1. Business impact analysis: A business impact analysis (BIA) is a critical first step in disaster recovery planning. It involves identifying the critical business processes and the systems and data required to support them. The BIA helps to determine the recovery time objectives (RTOs) and recovery point objectives (RPOs) for each system and application.
  2. Backup and recovery strategy: The backup and recovery strategy should include the type of backups to be taken (e.g., full backups and incremental backups, which in RMAN can be either differential or cumulative), the backup frequency, and the backup retention policy. It should also specify how backups will be stored, and how they will be tested and validated.
  3. Disaster recovery procedures: The disaster recovery procedures should detail the steps required to recover the Oracle environment in the event of a disaster. This includes the procedures for restoring data, recovering applications, and restoring the system to its normal state.
  4. Communication plan: The communication plan should outline how communication will be managed during and after a disaster. This includes how notifications will be sent to stakeholders, how updates will be communicated, and how the status of the recovery effort will be reported.
  5. Testing plan: The testing plan should specify how the disaster recovery plan will be tested and validated. This includes how often the plan will be tested, the scope of the testing, and how the testing results will be evaluated.
  6. Roles and responsibilities: The roles and responsibilities of the disaster recovery team should be clearly defined in the disaster recovery plan. This includes the responsibilities of the DBAs, system administrators, application owners, and other stakeholders.
  7. Training and awareness: The disaster recovery plan should include training and awareness programs for the disaster recovery team and other stakeholders. This includes training on the plan itself, as well as training on how to respond in the event of a disaster.

Overall, an effective disaster recovery plan in Oracle should be comprehensive, flexible, and regularly tested and updated. It should be designed to minimize downtime, data loss, and other negative impacts in the event of a disaster, and should enable the organization to recover quickly and efficiently.
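
To make the RTO and RPO targets from the business impact analysis concrete, here is a minimal Python sketch that compares them with the current backup strategy. The class name, field names, function name, and sample figures are all hypothetical illustrations rather than part of any Oracle tool, and the sketch assumes Python 3.9 or later.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SystemObjectives:
    """Recovery targets for one system, as a BIA might record them (hypothetical fields)."""
    name: str
    rto: timedelta                  # maximum tolerable downtime
    rpo: timedelta                  # maximum tolerable window of lost data
    backup_interval: timedelta      # time between backups or log shipments
    tested_restore_time: timedelta  # restore duration measured in the last recovery test


def find_gaps(system: SystemObjectives) -> list[str]:
    """Return readable descriptions of where the strategy misses the targets."""
    gaps = []
    # Worst case: a failure just before the next backup loses one full interval of data.
    if system.backup_interval > system.rpo:
        gaps.append(
            f"{system.name}: backup interval {system.backup_interval} "
            f"exceeds RPO {system.rpo}"
        )
    # The restore time measured in testing must fit inside the allowed downtime.
    if system.tested_restore_time > system.rto:
        gaps.append(
            f"{system.name}: tested restore time {system.tested_restore_time} "
            f"exceeds RTO {system.rto}"
        )
    return gaps


# Example with made-up figures: hourly backups cannot meet a 15-minute RPO.
orders = SystemObjectives(
    name="orders_db",
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=15),
    backup_interval=timedelta(hours=1),
    tested_restore_time=timedelta(hours=3),
)
for gap in find_gaps(orders):
    print(gap)
```

A check like this is only as good as its inputs: the restore time should come from an actual recovery test, not an estimate, which is one reason the testing plan is listed as a component in its own right.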

Hardware

When you think of all that goes on in a computer, it is a wonder that computers do not break down more often. From the disk heads skimming over a rotating platter to the millions of electronic switches flipping on and off, what couldn't go wrong? There is very little a DBA can do to prevent hardware failures. At best, the DBA can help minimize data loss by working with the hardware system managers to design a highly available and scalable system. Certain database features, such as database replication and standby databases, help minimize the loss of data during a hardware failure.

Software

No piece of software is perfect. There will be software bugs in the operating system, database software, and application software. The best a DBA can do is to make sure that the database software is reasonably current. In many shops, updating the database from one release to another is a major project, because the database software must be tested with the current operating system as well as the current version of the applications. As a consequence, most DBAs I know like to stay a software patch or two behind the database vendor's latest release.

Operational

To err is human. A DBA has the most control over the operational aspect of a database. Here are some questions you should ask yourself in order to try to reduce the chances of operational failures. Is there proper documentation and training for the DBAs? Are there adequate controls for the release of new applications or upgrades to the system? Do you plan, test, and reassess your backup and recovery strategy on a regular basis? Is your database secure? Do you leave the default passwords on your system? If you've answered all these questions (except for the last one) in the affirmative, you've already greatly reduced the likelihood of operational failures for your system.
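
The questions above can be kept as a simple checklist. The following is a minimal Python sketch, with hypothetical names (`CHECKLIST`, `open_risks`), that records the desired answer for each question and reports the ones that still need attention. Note that the last question is the only one where "yes" signals a problem.

```python
# Each entry pairs a question from the text with the answer that indicates good practice.
CHECKLIST = [
    ("Is there proper documentation and training for the DBAs?", True),
    ("Are there adequate controls for the release of new applications or upgrades to the system?", True),
    ("Do you plan, test, and reassess your backup and recovery strategy on a regular basis?", True),
    ("Is your database secure?", True),
    ("Do you leave the default passwords on your system?", False),  # "yes" here is a risk
]


def open_risks(answers):
    """Return the questions whose answer differs from the desired one.

    `answers` is a list of booleans, one per CHECKLIST entry, in the same order.
    """
    if len(answers) != len(CHECKLIST):
        raise ValueError("expected one answer per checklist question")
    return [
        question
        for (question, desired), given in zip(CHECKLIST, answers)
        if given != desired
    ]


# Example: everything is in order except that default passwords were left in place.
for question in open_risks([True, True, True, True, True]):
    print("Needs attention:", question)
```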

Environmental

No one can control external environmental problems, but some of them can be prepared for, including power outages, power surges, and temperature variations. Every location has its own risks, and while it is not possible to predict when the next earthquake will hit, a DBA must be prepared for the problems that one could cause. Your job is to assess the potential problems and plan accordingly.

The next lesson discusses why a DBA should test the backup and recovery plan.