Introduce the Oracle DBA utilities and table-management features covered in this module
Oracle Improvements for Database Administrators
Oracle Database includes a set of command-line utilities and table-management features that help database administrators handle bulk-data loading, maintain and reorganize tables, manage temporary data, and make structural changes without unnecessary downtime or manual effort. This module covers the principal tools and capabilities in that category, beginning with SQL*Loader and expanding into the table-management features that complement it.
The lessons in this module address practical DBA tasks: loading external data files into Oracle tables efficiently, handling large object data, relocating and reorganizing tables, working with temporary tables, identifying and removing unused columns, and understanding the database capacity limits that govern object sizing and counts. Each topic represents a capability that is well established in Oracle Database and continues to be relevant in current Oracle releases, including Oracle 23ai.
Module Objectives
By the end of this module you will be able to:
Explain the purpose and principal components of SQL*Loader.
Identify SQL*Loader improvements and the loading options available for different data-volume and performance requirements.
Load large object (LOB) data using SQL*Loader.
Describe Oracle table-management capabilities, including table relocation and reorganization.
Explain when and why to create a temporary table.
Identify unused columns in a table and remove them safely.
Recognize important Oracle Database capacity and object limits.
SQL*Loader Overview
SQL*Loader is an Oracle command-line utility that reads data from external source files and loads that data into Oracle Database tables. It is one of the primary tools available to a DBA for bulk data ingestion — situations where inserting records one at a time through an application would be impractical due to data volume or the external origin of the source data.
The behavior of a SQL*Loader session is defined by a control file. The control file is a text file that tells SQL*Loader how to interpret the source data and how to load it. It can specify the input datafiles, record and field formats, target tables and columns, data transformations, record-selection conditions, and how rejected and discarded records should be handled. Separating the load instructions into a control file makes SQL*Loader sessions repeatable and auditable — the same control file can be reused across multiple load cycles with different input datafiles.
SQL*Loader connects to Oracle Database through Oracle Net. The SQL*Loader client can run on a system separate from the database server, provided that Oracle Net connectivity is available and that the input datafiles are accessible to the system running the client. The input datafiles themselves are read by the SQL*Loader client process, not independently by the database server.
SQL*Loader Capabilities
SQL*Loader supports a broad range of loading scenarios. The following capabilities are available in current Oracle Database releases:
Load data from one or more input datafiles in the same load session.
Load data into one or more Oracle Database tables in the same load session.
Read data files and other supported input streams available to the client environment.
Specify the character set of the source data.
Selectively load records based on conditions defined in the control file.
Transform field values during loading using SQL functions and expressions.
Generate sequential values for designated columns during loading.
Load object-relational data, collections, and large object (LOB) data, including data stored in secondary datafiles.
Produce a detailed log file and separate bad-file and discard-file output for rejected and discarded records.
Choose between conventional-path and direct-path loading based on data volume, table structure, and operational requirements.
Loading Methods: Conventional Path and Direct Path
SQL*Loader supports two primary loading methods, and selecting the appropriate one for a given situation is an important DBA decision.
Conventional-path loading processes records using standard SQL INSERT operations. It supports the full range of SQL and database functionality — triggers fire, integrity constraints are enforced in the standard way, and the load participates in normal Oracle transaction processing. Conventional-path loading is the more flexible of the two methods and is appropriate when the load must coexist with other database activity or when the full SQL processing pipeline is required.
Direct-path loading uses Oracle's direct-path load interface to write data blocks directly to the database, bypassing portions of the normal SQL-processing overhead. This approach can reduce load time significantly for large data volumes. However, direct-path loading has constraints: certain types of triggers do not fire, some constraints are deferred until after the load completes, and concurrent access to the target table may be restricted during the operation. Its suitability depends on the table structure, indexes, active constraints, concurrent workload, and the operational requirements of the environment.
Current Oracle documentation also describes automatic parallel direct-path loading in supported configurations, which can further reduce elapsed time for very large loads on appropriately configured systems.
SQL*Loader Input and Output Files
A SQL*Loader session involves several distinct files, each with a specific role. Understanding what each file contains — and what it does not contain — is essential for interpreting load results and troubleshooting problems.
Input Datafiles
Contain the source records that SQL*Loader reads during the load session. A single load session can read from one or more input datafiles. SQL*Loader's data parsing engine imposes relatively few restrictions on the format of the data in these files, which is one of the utility's principal strengths.
Loader Control File
Defines how SQL*Loader should interpret and load the source data. A single control file can reference multiple input datafiles, specify field delimiters and data types, identify target tables and columns, define selection criteria and transformations, and configure bad-file and discard-file behavior. The control file is what makes a SQL*Loader load repeatable.
Oracle Database (Tables and Indexes)
The destination for accepted records. Depending on the loading method and configuration, SQL*Loader inserts data into the specified tables and may rebuild or update associated indexes as part of the load operation.
Log File
Records information about the load operation itself: the SQL*Loader settings in effect, the input and output file names, row counts for loaded rows, rejected rows, and discarded rows, and any error or diagnostic messages generated during the session. The log file is the primary diagnostic resource after a load completes.
Bad File
Stores records that SQL*Loader attempted to load but could not, because of invalid data conversions, constraint violations, incorrect record structure, or other loading errors. The bad file contains the rejected source records, not diagnostic messages — those appear in the log file. Retaining bad files allows the DBA to correct the source data and reload only the failed records.
Discard File
Stores records that were structurally valid and processable but did not satisfy the record-selection conditions specified in the control file. A discard file is optional: it is created only when discard handling is enabled in the control file and records are actually discarded during the load. Discarded records are not errors — they were intentionally excluded by the selection criteria.
Figure 5-1 illustrates the data flow across these components for a typical SQL*Loader session.
Figure 5-1: SQL*Loader reads input data according to a control file, loads accepted records into Oracle Database tables, and writes processing details, rejected records, and optionally discarded records to separate output files.
The next lesson examines SQL*Loader improvements and the loading methods available for different data-volume and performance requirements.