
Lesson 5 National Character Sets
Objective Choose a national character set for an Oracle 19c database.

National Character Sets in Oracle 19c

Understanding Character Sets in Oracle 19c

An Oracle 19c database uses two types of character sets: the database character set and the national character set. The database character set defines the encoding for data in CHAR, VARCHAR2, CLOB, and LONG columns, as well as identifiers and PL/SQL programs. The national character set is used exclusively for NCHAR, NVARCHAR2, and NCLOB columns, which are designed to store Unicode data for multilingual support or special characters not supported by the database character set.

Choosing a national character set is critical for applications requiring support for multiple languages or diverse character sets. This lesson outlines the steps to select an appropriate national character set for an Oracle 19c database.

Steps to Choose a National Character Set

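  1. Understand the Purpose of NCHAR Data Types: The NCHAR, NVARCHAR2, and NCLOB data types store Unicode data, enabling support for characters from multiple languages (e.g., Chinese, Arabic, or special symbols) that may not be covered by a non-Unicode database character set, such as the single-byte WE8MSWIN1252. If the database character set is already AL32UTF8, CHAR and VARCHAR2 columns can store the same Unicode characters.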
  2. Identify Application Requirements:
    • Determine if your application needs multilingual support or characters not supported by the database character set.
    • Assess storage needs, as the choice of national character set affects the size of data stored in NCHAR columns.
  3. Explore Available National Character Sets: Oracle 19c supports two national character sets:
    • AL16UTF16: The UTF-16 encoding of Unicode. It uses 2 bytes for most characters and 4 bytes (a surrogate pair) for supplementary characters, which keeps storage predictable for applications with extensive multilingual data.
    • UTF8: A variable-width Unicode encoding that uses 1 to 3 bytes per character, with each supplementary character stored as a pair of 3-byte surrogates (6 bytes). It is not the same as AL32UTF8, which is valid only as a database character set. UTF8 suits data that is mostly ASCII, where it saves storage.

    Note: AL16UTF16 is the default national character set, and has been since Oracle9i.

  4. Select the National Character Set: The national character set is set during database creation using one of these methods:
    • Database Configuration Assistant (DBCA): During database creation, DBCA prompts you to select the character set and national character set on the "Character Set" page. The default is AL16UTF16, but you can choose UTF8 based on your needs.
    • Manual Database Creation: Specify the national character set in the CREATE DATABASE statement. For example:
      CREATE DATABASE your_database_name
      ...
      NATIONAL CHARACTER SET AL16UTF16;
                          
      Replace AL16UTF16 with UTF8 if preferred.
  5. Consider AL16UTF16 vs. UTF8:
    • AL16UTF16: Choose for applications whose NCHAR data is mostly non-ASCII multilingual text. Nearly all characters take exactly 2 bytes, which simplifies length calculations, but ASCII text takes twice the space it would in UTF8.
    • UTF8: Choose for applications whose NCHAR data is mostly ASCII. ASCII characters take 1 byte each, but most Asian characters take 3 bytes instead of 2, and variable-width data may require more processing.
  6. Test and Validate:
    • Test the application to ensure it correctly handles all required characters with the chosen national character set.
    • Validate data migrations, imports, or exports to confirm that NCHAR, NVARCHAR2, and NCLOB data maintains integrity across systems.
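
The storage trade-off described in the steps above can be estimated before any database is created. The following is a rough sketch in Python, not Oracle code. It assumes that Python's utf-16-be codec produces the same byte counts as AL16UTF16, and it approximates Oracle's UTF8 by counting 6 bytes for each supplementary character and standard UTF-8 lengths for everything else. The function names and sample strings are hypothetical.

```python
def al16utf16_bytes(text: str) -> int:
    """Approximate AL16UTF16 size: UTF-16 without a byte-order mark."""
    return len(text.encode("utf-16-be"))


def oracle_utf8_bytes(text: str) -> int:
    """Approximate Oracle UTF8 size (assumption: surrogate pairs cost 6 bytes)."""
    total = 0
    for ch in text:
        if ord(ch) > 0xFFFF:
            total += 6  # supplementary character: two 3-byte surrogate halves
        else:
            total += len(ch.encode("utf-8"))  # 1 to 3 bytes inside the BMP
    return total


# Hypothetical sample values standing in for NCHAR column data.
samples = {
    "ascii": "Oracle",
    "chinese": "数据库",
    "mixed": "NCHAR 列",
}

for label, text in samples.items():
    print(label, al16utf16_bytes(text), oracle_utf8_bytes(text))
```

For these samples, the ASCII string is half the size under the UTF8 estimate (6 bytes versus 12), while the Chinese string is smaller under AL16UTF16 (6 bytes versus 9). Running the same estimate over representative application data gives a first indication of which national character set is cheaper to store.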

Viewing the National Character Set

To check the database and national character sets in Oracle 19c, query the NLS_DATABASE_PARAMETERS view. For example:

SELECT PARAMETER, VALUE
FROM NLS_DATABASE_PARAMETERS
WHERE PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');

PARAMETER                 VALUE
------------------------  ----------------
NLS_CHARACTERSET          AL32UTF8
NLS_NCHAR_CHARACTERSET    AL16UTF16
    

This output shows a typical Oracle 19c configuration with AL32UTF8 as the database character set and AL16UTF16 as the national character set.

Additional Considerations

Changing the National Character Set: Modifying the national character set after database creation is complex, often requiring data export and re-import or database recreation. Choose carefully during initial setup to avoid future complications.

Single-Byte Limitations: Certain database elements (e.g., database names, instance names, filenames, rollback segment names, and keywords) must use single-byte characters, even if the national character set supports multi-byte characters. For NCHAR and NVARCHAR2, length specifications always refer to the number of characters. CHAR and VARCHAR2 lengths, by contrast, default to bytes unless character semantics are specified (for example, VARCHAR2(10 CHAR)).
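
Because NCHAR and NVARCHAR2 lengths are declared in characters, the byte footprint of a column depends on the national character set. The sketch below works through that arithmetic in Python. It assumes an upper bound of 2 bytes per declared character for AL16UTF16 and 3 bytes for UTF8, and it uses 4000 bytes as the NVARCHAR2 limit, which applies when MAX_STRING_SIZE is STANDARD. The dictionary and function names are hypothetical.

```python
# Assumed maximum bytes per declared character for each national character set.
MAX_BYTES_PER_CHAR = {"AL16UTF16": 2, "UTF8": 3}


def max_column_bytes(declared_chars: int, ncharset: str) -> int:
    """Upper bound on bytes for a column declared as NVARCHAR2(declared_chars)."""
    return declared_chars * MAX_BYTES_PER_CHAR[ncharset]


def max_declared_chars(byte_limit: int, ncharset: str) -> int:
    """Largest declared length whose upper bound fits under byte_limit."""
    return byte_limit // MAX_BYTES_PER_CHAR[ncharset]


for ncharset in MAX_BYTES_PER_CHAR:
    print(
        ncharset,
        max_column_bytes(100, ncharset),     # bytes for NVARCHAR2(100)
        max_declared_chars(4000, ncharset),  # longest NVARCHAR2 under 4000 bytes
    )
```

Under these assumptions, NVARCHAR2(100) can occupy up to 200 bytes with AL16UTF16 and up to 300 bytes with UTF8, and the longest declarable NVARCHAR2 is 2000 characters with AL16UTF16 versus 1333 with UTF8.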

Summary

  • Select AL16UTF16 (default) or UTF8 based on your application’s multilingual and storage needs.
  • Use DBCA or the CREATE DATABASE statement to set the national character set during database creation.
  • Verify the character set configuration using NLS_DATABASE_PARAMETERS and test thoroughly to ensure data integrity.

By carefully choosing the national character set, you ensure your Oracle 19c database supports the multilingual and character encoding requirements of your application effectively.


Next Steps

Understanding how to choose a national character set prepares you to configure Oracle 19c databases for diverse applications. The next lesson will explore data conversion techniques for handling multilingual data across different character sets.

