Lesson 8

Fast-Start Fault Recovery Conclusion

Oracle's Fast-Start Fault Recovery feature is designed to minimize the time needed to recover from instance failures, thereby increasing system availability. In the context of Oracle RAC (Real Application Clusters), this feature becomes particularly useful because RAC is designed for high availability and load balancing across multiple instances. Here's how Fast-Start Fault Recovery functions in an Oracle RAC environment:

Understanding Fast-Start Fault Recovery:

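Fast-Start Fault Recovery reduces the time needed for instance recovery in two ways. Fast-Start checkpointing limits the amount of redo that must be applied during the roll forward phase. Fast-Start rollback (on-demand and parallel) lets the database open before all uncommitted transactions have been rolled back, with the remaining rollback done on demand or by parallel processes. Together these bring the database to a transactionally consistent, available state more quickly than traditional serial recovery.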

Components:

Redo Threads: In a RAC environment, each instance has its own redo thread. Fast-Start Fault Recovery helps in quicker application of these redo logs during the recovery process.
Checkpoints: Oracle databases use checkpoints to know how far they need to go back in the redo logs to begin recovery. By managing these checkpoints more efficiently, Fast-Start Fault Recovery limits the amount of work needed during the recovery process.
Rollback Segments: Any uncommitted transactions at the time of failure must be rolled back. Fast-Start Fault Recovery optimizes this process for speed.
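
The interplay of the three components above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not how Oracle implements recovery: the `RedoRecord` class, the `recover` function, and the sample data are all hypothetical, and the "datafile" is a plain dictionary. It shows that a more recent checkpoint leaves less redo to apply, while the recovered data is the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RedoRecord:
    scn: int             # position of the change in the redo stream
    txn: str             # transaction that made the change
    key: str             # row identifier (hypothetical)
    old: Optional[int]   # value before the change (undo information)
    new: int             # value after the change


def recover(datafile, redo, checkpoint_scn, committed):
    """Return (recovered data, number of redo records applied)."""
    data = dict(datafile)
    # Roll forward: replay only the redo written after the last checkpoint.
    to_apply = [r for r in redo if r.scn > checkpoint_scn]
    for rec in to_apply:
        data[rec.key] = rec.new
    # Roll back: undo changes of uncommitted transactions, newest first.
    for rec in sorted(redo, key=lambda r: r.scn, reverse=True):
        if rec.txn not in committed:
            if rec.old is None:
                data.pop(rec.key, None)
            else:
                data[rec.key] = rec.old
    return data, len(to_apply)


# Sample redo stream: T1 committed before the failure, T2 did not.
REDO = [
    RedoRecord(1, "T1", "a", 10, 11),
    RedoRecord(2, "T2", "b", 20, 21),
    RedoRecord(3, "T1", "c", 30, 31),
    RedoRecord(4, "T2", "a", 11, 12),
]

# Checkpoint taken after SCN 2: the datafile already holds changes 1 and 2.
recent, applied_recent = recover({"a": 11, "b": 21, "c": 30}, REDO, 2, {"T1"})
# No checkpoint since SCN 0: every redo record has to be replayed.
old, applied_old = recover({"a": 10, "b": 20, "c": 30}, REDO, 0, {"T1"})
```

In this sketch both calls end with the same data, but the first replays two redo records and the second replays four, which is the effect that more frequent checkpointing is meant to achieve.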

Key Concepts:

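Fast-Start I/O Target: The `FAST_START_IO_TARGET` parameter sets a target for the number of I/O operations (data blocks to be read and written) that instance recovery should require. Oracle writes dirty buffers to disk more aggressively to stay within this target, so a lower value shortens recovery at the cost of more checkpoint I/O during normal operation.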
Target Recovery Time: You can set a target recovery time (in seconds) using the `FAST_START_MTTR_TARGET` parameter. Oracle then tries to ensure that the instance recovery won't take longer than this time.

Working with RAC:

Load Distribution: In RAC, if one node fails, the surviving nodes take over the additional load. Fast-Start Fault Recovery shortens the instance recovery that follows the failure, which helps the cluster re-balance the workload sooner.
Cache Fusion: RAC uses Cache Fusion for block sharing between instances. Fast-Start helps in recovering the Global Cache more quickly in case of a node failure.
Global Resource Directory (GRD): Fast-Start Fault Recovery has to work in tandem with RAC’s GRD to coordinate the recovery process, especially when a failed instance is brought back online.
Parallel Recovery: Recovery processes can be run in parallel in a RAC environment, further reducing the time required to recover from a failure.
Dynamic Reconfiguration: RAC environments can dynamically adjust to configuration changes. Fast-Start Fault Recovery complements this by allowing quicker recovery times, thereby reducing the time during which the cluster is in a reconfigured, potentially suboptimal state.
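
The idea behind running rollback in parallel can be sketched with a toy example. This is an illustration only, assuming that each uncommitted transaction touched a disjoint set of rows so that transactions can be undone independently; the function names and data layout are hypothetical and do not reflect Oracle internals.

```python
from concurrent.futures import ThreadPoolExecutor


def undo_transaction(undo_records):
    """Return {key: original value} for one transaction.

    undo_records is a list of (key, old_value) pairs, oldest change first.
    Undo is applied newest first, so the oldest value is the one that remains.
    """
    restored = {}
    for key, old_value in reversed(undo_records):
        restored[key] = old_value
    return restored


def parallel_rollback(data, undo_by_txn, workers=4):
    """Undo every transaction in undo_by_txn, one worker task per transaction."""
    result = dict(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for restored in pool.map(undo_transaction, undo_by_txn.values()):
            result.update(restored)
    return result


# State after roll forward, with two uncommitted transactions still applied.
data = {"a": 12, "b": 21, "c": 31}
undo_by_txn = {
    "T2": [("a", 10), ("a", 11)],  # T2 changed a from 10 to 11, then to 12
    "T3": [("b", 20)],             # T3 changed b from 20 to 21
}

rolled_back = parallel_rollback(data, undo_by_txn)
serial = parallel_rollback(data, undo_by_txn, workers=1)
```

With `workers=1` the same function behaves like a serial rollback. The result is identical in both cases; only the elapsed time differs when there are many large transactions to undo.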

Tuning:

You may need to tune some parameters like `FAST_START_IO_TARGET` and `FAST_START_MTTR_TARGET` based on your specific RAC setup and SLAs.
Monitor the `V$INSTANCE_RECOVERY` view to get insights into your recovery settings and see if Oracle is meeting your target recovery time.
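
As a rough sketch of the monitoring step, the check below compares the `TARGET_MTTR` and `ESTIMATED_MTTR` columns of `V$INSTANCE_RECOVERY`. The query text is shown only as a string; running it requires a database connection, which is not shown here. The `mttr_status` helper, its 10% tolerance, and the sample values are assumptions chosen for illustration.

```python
# Oracle SQL that returns the configured and the currently estimated
# recovery time, both in seconds.
MTTR_QUERY = "SELECT TARGET_MTTR, ESTIMATED_MTTR FROM V$INSTANCE_RECOVERY"


def mttr_status(target_mttr, estimated_mttr, tolerance=0.10):
    """Classify the estimated recovery time against the target.

    Returns "ok" when the estimate is within the target, "warn" when it
    exceeds the target by no more than the given tolerance, and "over"
    otherwise. The tolerance is a hypothetical threshold, not an Oracle
    setting.
    """
    if estimated_mttr <= target_mttr:
        return "ok"
    if estimated_mttr <= target_mttr * (1 + tolerance):
        return "warn"
    return "over"


# Made-up sample rows of (TARGET_MTTR, ESTIMATED_MTTR).
samples = [(60, 45), (60, 64), (60, 90)]
statuses = [mttr_status(target, estimate) for target, estimate in samples]
```

A persistent "over" result in a check like this would suggest revisiting `FAST_START_MTTR_TARGET` or the I/O capacity available for checkpoint writes.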

Fast-Start Fault Recovery in Oracle RAC is designed to work hand-in-hand with RAC's native high-availability and load-balancing features to minimize downtime and ensure business continuity.

From a systems management perspective, this module has given you some good ideas about the new features within Oracle9i.
Backup and recovery in this version are much more robust than in previous versions of Oracle.
Now that you have completed this module, you should be able to:

Describe the concept and process of Fast-Start Fault Recovery
Explain the steps involved in Fast-Start rollback
Describe the effects of Fast-Start checkpointing
Implement Fast-Start parallel rollback
Explore the features of a read-only database
Create a standby database

Glossary

You were introduced to the following terms in this module:

Checkpointing: The process of writing dirty buffers from the buffer cache to the datafiles and recording the checkpoint position, which marks where in the redo log recovery must begin.
Dirty buffer: A buffer in the buffer cache whose contents have been changed by a transaction but not yet written to the datafiles.
Buffer cache: A memory area within the Oracle instance that holds copies of data blocks read from the datafiles; changes to records are made here first.
Parallel rollback: Rolling back uncommitted transactions using multiple processes working in parallel.
Rollback: The process in which the Oracle server restores the old values of a record when a transaction is not committed.
Roll forward phase: The phase of recovery in which all changes recorded in the redo log files are applied to the database.
Serial rollback: Rolling back uncommitted transactions using a single process.

In the next module you will learn about additional improvements Oracle8i brings to backup and recovery.

Conclusion - Exercise

Click the Exercise link below to evaluate the optimum backup strategy based on the business needs of the House-O-Pets.

Backup Strategy - Exercise