SQL*Loader   «Prev  Next»

Lesson 2 SQL*Loader in Oracle 23ai
Objective Describe SQL*Loader applications in Oracle 23ai

SQL*Loader in Oracle Database 23ai

SQL*Loader is Oracle's command-line utility for loading data from external files into Oracle Database tables. It supports a broad range of source formats and loading scenarios — from simple delimited text files to complex hierarchical records containing large objects, object-relational structures, and collection types. This lesson covers the three principal approaches SQL*Loader supports for loading complex data, how rejected and discarded records are handled, and the structure of the parameters and control file that govern a load session.

Loading Complex Data into Oracle Database 23ai

SQL*Loader supports three distinct approaches for loading data that goes beyond flat relational rows. The composite diagram below illustrates all three workflows and their relationship to Oracle Database 23ai.
Composite diagram showing SQL*Loader loading LOB data, object attributes, and collections through control files, and a custom application loading data through the Oracle Direct Path Load API into Oracle Database 23ai
Figure 5-2: Oracle Database 23ai supports several approaches for loading complex data. SQL*Loader uses control-file definitions to load LOBs, object attributes, and collections, while custom applications can use the Oracle Direct Path Load API to supply data programmatically without a SQL*Loader control file.

SQL*Loader Loading Capabilities

1. Loading Data Directly into Large Objects

SQL*Loader can load data files directly into LOB columns — BLOB, CLOB, and NCLOB — in an Oracle Database table. The source content can be images, PDF documents, text files, binary data, or any other file type supported by the target LOB column.

The loading mechanism uses two levels of datafiles. A primary datafile contains one record per row to be loaded. Each record identifies the LOB content by reference — typically through a file name or path stored in a metadata field such as LOB_FILE_REF. The actual LOB content lives in secondary datafiles, which SQL*Loader reads separately and maps into the corresponding LOB column for each row.

The control file defines the relationship between the primary record fields and the secondary LOB files, specifying which field contains the file reference and how SQL*Loader should locate and read each secondary file. The result is a table where ordinary columns — such as DOCUMENT_ID and FILE_NAME — are populated from the primary record, and the CONTENT LOB column is populated from the referenced secondary file.

This approach is well suited for document management systems, media repositories, and any application that stores large binary or text content alongside structured metadata in the same Oracle table.

2. Loading Data into Oracle Objects and Collections

SQL*Loader can load data into object-relational structures: object types, object tables, nested tables, varying arrays (VARRAYs), and REF columns. Source records may contain scalar attributes, object attributes, collection elements, and REF values in a single structured record, and the control file maps each portion of the source record to the corresponding target structure in Oracle Database.

A source record for object loading typically contains several categories of data within one record block:
  • Scalar attributes — simple column values such as an ID or name.
  • Object attributes — values that map to the attributes of an Oracle object type, such as a composite address object containing Street, City, and Zip fields.
  • Collection elements — repeating values that populate a nested table or VARRAY, such as a list of phone numbers.
  • REF values — references to rows in object tables, loaded as REF column values.
The control file uses SQL*Loader's object-loading syntax to specify how each portion of the source record maps to the target object attributes and collection elements. Oracle Database 23ai extends SQL*Loader's loading support to include JSON columns and VECTOR columns, making it possible to load these modern data types using the same control-file driven workflow.

3. Using the Oracle Direct Path Load API

A custom application can load data into Oracle Database programmatically using Oracle's Direct Path Load API, bypassing the SQL*Loader command-line utility entirely. The application supplies both the data and the load configuration through the API, without a SQL*Loader control file.

This approach is appropriate when an application generates or transforms data at runtime — reading from a message queue, consuming a data stream, or performing in-memory transformations — and needs to write the results directly into Oracle tables at high throughput. The Oracle Direct Path Load API gives the application control over the load parameters that would ordinarily be defined in a SQL*Loader control file, but expressed programmatically rather than through a text file.

The Direct Path Load API uses Oracle's direct-path loading interface, which writes formatted data blocks directly to the database, reducing portions of the normal SQL processing overhead compared to conventional INSERT processing. As with SQL*Loader direct-path loads, constraints and triggers may be handled differently than in conventional loading — these considerations apply when choosing this approach for tables with complex integrity rules.

Discarded and Rejected Records

Not every record in the input datafile will be inserted into the database. SQL*Loader routes unsuccessful records to one of two output files depending on the reason the record was not loaded: the bad file or the discard file. Understanding the distinction between these two files is important for interpreting load results and correcting source data.

The Bad File

The bad file contains records that SQL*Loader attempted to load but could not, because of a formatting problem or a database rejection. If no bad file is explicitly specified in the control file and rejected records exist, SQL*Loader automatically creates one. The generated bad file takes the same name as the input datafile with a .bad extension replacing the original extension.

Two distinct events cause a record to be written to the bad file:

Records Rejected by SQL*Loader

SQL*Loader rejects a record before it reaches the database when the input format is invalid. Common causes include:
  • A missing second enclosure delimiter in an enclosed field.
  • A delimited field value that exceeds its maximum defined length.
  • A record structure that does not match the format defined in the control file.
These records never reach Oracle Database — they are intercepted during SQL*Loader's parsing phase and written immediately to the bad file.

Records Rejected by Oracle Database

A record that passes SQL*Loader's format validation is sent to Oracle Database for insertion as a row. The database evaluates the row against the table's constraints and data type rules. If the row fails that evaluation, the database rejects it and returns it to SQL*Loader, which writes it to the bad file. Common reasons for database rejection include:
  • A unique key or primary key constraint violation.
  • A NOT NULL constraint on a column for which the source record supplies no value.
  • A field value that cannot be converted to the target Oracle data type.
  • A check constraint that the row's values do not satisfy.

The Discard File

The discard file stores records that were structurally valid and processable by SQL*Loader, but did not satisfy the record-selection conditions specified in the control file. A discard file is not automatically created — it is generated only when discard handling is enabled in the control file and at least one record is actually discarded during the load.

Discarded records are not errors. They were intentionally excluded from the load because they did not meet the selection criteria. This is an important distinction from the bad file: a rejected record is one that SQL*Loader or Oracle Database could not process; a discarded record is one that was processed successfully but deliberately excluded.

Reviewing the discard file after a load is a useful way to verify that selection criteria are working as intended and that the right subset of the input data is being loaded.

SQL*Loader Parameters

SQL*Loader is invoked using the sqlldr command, optionally followed by parameters that establish session characteristics — the target database connection, the control file name, the bad file name, the log file name, the loading method, and so on. When the same parameters are used across multiple load sessions and their values seldom change, specifying them on the command line every time is inefficient. Two alternatives are available:
  1. Parameter file (PARFILE) — Parameters can be grouped in a plain-text parameter file, one parameter per line. The parameter file name is then passed on the command line using the PARFILE parameter. This keeps the command line short and makes the parameter set easy to version-control alongside the control file.
  2. OPTIONS clause — Certain parameters can be embedded directly in the SQL*Loader control file using the OPTIONS clause, placed at the top of the control file before the LOAD DATA statement. This approach is convenient when a control file is always run with the same settings.
Parameters specified on the command line take precedence over values specified in a parameter file or an OPTIONS clause, allowing individual load sessions to override the defaults without modifying the control file or parameter file.

SQL*Loader Control File

The SQL*Loader control file is a text file written in SQL*Loader's control language. It defines everything SQL*Loader needs to know about a load session: where to find the input data, how to parse it, which tables and columns to load, what transformations to apply, and how to handle records that do not load successfully. A control file is divided into up to three sections.

Section 1 — Session-Wide Settings

The first section establishes settings that apply to the entire load session:
  • Global options such as BINDSIZE, ROWS, and the number of records to skip at the start of the input file.
  • INFILE clauses that specify the path and format of each input datafile.
  • Optionally, inline data to be loaded, embedded directly in the control file rather than in a separate datafile.

Section 2 — INTO TABLE Blocks

The second section contains one or more INTO TABLE blocks. Each block specifies:
  • The name of the target Oracle table.
  • The loading mode — INSERT, APPEND, REPLACE, or TRUNCATE.
  • The column list and the field-to-column mappings that define how source record fields are assigned to target columns.
  • Any field transformations, selection conditions, or secondary datafile references for LOB loading.
Multiple INTO TABLE blocks allow a single SQL*Loader session to load data into more than one table from the same input datafile.

Section 3 — Inline Data (Optional)

If the control file includes inline data — source records embedded directly in the file — that content forms the third section, beginning after a BEGINDATA statement. SQL*Loader treats everything after BEGINDATA as input data rather than control file syntax, which means comments using the standard -- prefix are not supported in this section.

Control File Syntax Rules

  • Syntax is free-format: statements can span multiple lines, and whitespace is not significant.
  • Syntax is case-insensitive, with one exception: strings enclosed in single or double quotation marks are taken literally, preserving their exact case.
  • Comments begin with two hyphens (--) and extend to the end of the line. Comments are valid in sections 1 and 2 but not in the inline-data section.
  • The keywords CONSTANT and ZONE are reserved by SQL*Loader. Oracle recommends not using either as a table or column name to avoid conflicts during control file parsing.

The next lesson demonstrates SQL*Loader LOB loading and object-collection loading through worked control file examples.
SEMrush Software 2 SEMrush Banner 2