Character functions: CONCAT, SUBSTR, LENGTH, INSTR
Objective
Describe the use of the character functions CONCAT, SUBSTR, LENGTH, and INSTR.
Oracle Character Functions CONCAT, SUBSTR, LENGTH, INSTR
Four commonly used string functions are 1) concatenate, 2) substring, 3) length, and 4) instring.
CONCAT or ||:
The concatenation function uses different syntax for different programming languages and software. For Oracle, you have two choices:
Use two vertical bars (||) to connect the two columns you wish to combine. This is by far the most commonly used variation and you can easily concatenate three columns (or expressions or literal values) together by repeating the vertical bars.
Use the CONCAT(a,b) function. This function concatenates the first column with the second column. To add a third column, you must add another CONCAT(a,b) around the first one.
Here is an example (using the course project) of the first, most commonly used, form of concatenation:
SELECT CITY || ', ' || STATE || ' ' || ZIP
FROM CUSTOMER;
The results look like this:
CITY||','||STATE||''||ZIP
-----------------------------------
Omaha, NE 98765
Makena, HI 99800
Hollywood, CA 10293
Kihei, HI 39873
The Oracle `SUBSTR` function extracts a substring from a given string, starting at a specified position and extending for a given number of characters. The command provided:
SELECT SUBSTR('PANDABEAR', 3, 4) "Substring"
FROM DUAL;
breaks down as follows:
`'PANDABEAR'` is the string from which the substring is to be extracted.
`3` is the starting position, indicating that the extraction should begin from the third character of the string, which is `N`.
`4` specifies the length of the substring to be extracted, counting from the starting position.
Therefore, starting from the third character `N`, and extracting four characters from this point, yields `NDAB`.
The output of the provided SQL command will be a single column with the header "Substring" and a single row containing the substring `NDAB`:
Substring
---------
NDAB
Oracle LENGTH: This simple function tells you the length (in number of characters) of the data in a column, expression, or literal. In the case of VARCHAR2 datatype columns, the defined length of the column is often higher than the actual data stored in each row's column. Use the Oracle length function to count the characters. If a field contains no data (is null) then its length is null.
The Oracle `INSTR` function is a string function used to determine the location of a substring within a string. It returns the position of the specified substring within a larger string, starting from a given position. The function is useful for parsing strings and for situations where you need to know the presence and specific location of a substring within a string.
Syntax The basic syntax of the `INSTR` function is:
`substring` is the sequence of characters to search for within the source string.
`start_position` is an optional argument specifying where in the string to start the search. If this is omitted, the default is 1, meaning the search starts at the beginning of the string.
`occurrence` is an optional argument that specifies which occurrence of the substring to search for. The default is 1, meaning the function will find the first occurrence of the substring.
Example Consider the following example:
SELECT INSTR('HELLO WORLD', 'WOR') AS Position
FROM DUAL;
In this example, the `INSTR` function searches for the substring `'WOR'` within the string `'HELLO WORLD'`. Since the search starts at the beginning of the string by default and we are looking for the first occurrence also by default, the function will return the position at which the substring `'WOR'` begins.
Explanation
In the string `'HELLO WORLD'`, the substring `'WOR'` starts at position 7. Therefore, the output of the example query will be:
Position
--------
7
This indicates that the substring `'WOR'` was found starting at the 7th character of the string `'HELLO WORLD'`.The `INSTR` function is particularly powerful in data manipulation and analysis tasks, where locating the position of specific patterns within strings is necessary, such as in data cleaning, extraction of specific data segments, or when processing log files and textual data.
The following series of images below shows the syntax for the INSTR function.
1) The syntax of the INSTR function, including parameters. If the character or characters are not found, the INSTR function returns zero.
2) This example shows how to search for the letter "a" in the LASTNAME column of the CUSTOMER table. The searching used in this function is case-sensitive. In this case, the default starting position and occurrence number were used because these two parameters were omitted.
3) Add a position number as the third parameter to start the search for the character at some point within the field. For example, if you want to find the word "the" in the LOG_TEXT column, but only in the 20th character or beyond, the query would look like the one shown here. The identical function was added to the WHERE clause to limit the rows returned to those that match the INSTR function.
SELECT INSTR(UPPER(LOG_TEXT), 'FISH', 20) POS,
LOG_TEXT
FROM PET_CARE_LOG
WHERE INSTR(UPPER(LOG_TEXT), 'FISH', 20) > 0
4) You can find the second, third, or other occurrence of the character by adding a third parameter to the function. For example, if you wish to find the second occurrence of the word "the" in the LOG_TEXT column, the query would be revised to include both the starting position (20) and the occurence number (2) as shown here.
5) You can use an INSTR function to set the starting position of another INSTR function. In this example, we have replaced the starting position (20) with an INSTR function that looks for period (.). In effect, we are setting the starting position at the end fo the first sentence in the text.
Oracle INSTR Function and Syntax
The INSTR function allows for simple or sophisticated searching through a string for a set of characters, not unlike LTRIM and RTRIM, except that INSTR does not clip anything off. It simply tells you where in the string it found what you were searching for.
This is similar to the LIKE logical operator. However, LIKE can only be used in a
where or
having
clause, and INSTR can be used anywhere except in the from clause. Of course, LIKE can be used for complex pattern searches that would be quite difficult, if even possible, using INSTR.
Here is the format for INSTR:
INSTR(string,set [,start [,occurrence ] ])
INSTR searches String
INSTR searches in the string for a certain set of characters. It has two options, one within the other. The first option is the default: It will look for the set starting at position 1. If you specify the location to start, it will skip over all the characters up to that point and begin its search there.
The second option is occurrence. A set of characters may occur more than once in a string, and you may really be interested only in whether something occurs more than once. By default, INSTR will look for the first occurrence of the set. By adding the option occurrence and making it equal to 3, for example, you can force INSTR to skip over the first two occurrences of the set and give the location of the third. Some examples will make all this simpler to grasp. Recall the table of magazine articles. Here is a list of their authors:
select Author from MAGAZINE;
AUTHOR
-------------------------
BONHOEFFER, DIETRICH
CHESTERTON, G.K.
RUTH, GEORGE HERMAN
WHITEHEAD, ALFRED
CROOKES, WILLIAM
To find the location of the first occurrence of the letter O, INSTR is used without its options and with set as 'O' (note the single quotation marks, since this is a literal), as shown in the following listing:
select Author, INSTR(Author,'O') from MAGAZINE;
AUTHOR INSTR(AUTHOR,'O')
------------------------- -----------------
BONHOEFFER, DIETRICH 2
CHESTERTON, G.K. 9
RUTH, GEORGE HERMAN 9
WHITEHEAD, ALFRED 0
CROOKES, WILLIAM 3
This is, of course, the same as the following:
select Author, INSTR(Author,'O',1,1) from MAGAZINE;
If INSTR had looked for the second occurrence of the letter O, it would have found
select Author, INSTR(Author,'O',1,2) from MAGAZINE;
AUTHOR INSTR(AUTHOR,'O',1,2)
------------------------- ---------------------
BONHOEFFER, DIETRICH 5
CHESTERTON, G.K. 0
RUTH, GEORGE HERMAN 0
WHITEHEAD, ALFRED 0
CROOKES, WILLIAM 4
INSTR found the second O in Bonhoeffer's name, at position 5, and in Crookes' name, at position 4.
Chesterton has only one O, so for him, Ruth, and Whitehead, the result is zero, meaning no success and no second O was found.
To tell INSTR to look for the second occurrence, you also must tell it where to start looking (in this case, position 1). The default value of start is 1, which means that is what it uses if you do not specify anything,
but the occurrence option requires a start, so you have to specify both. If set is not just one character but several, INSTR gives the location of the first letter of the set, as shown here:
select Author, INSTR(Author,'WILLIAM') from MAGAZINE;
AUTHOR INSTR(AUTHOR,'WILLIAM')
------------------------- -----------------------
BONHOEFFER, DIETRICH 0
CHESTERTON, G.K. 0
RUTH, GEORGE HERMAN 0
WHITEHEAD, ALFRED 0
CROOKES, WILLIAM 10
This has many useful applications, such as in the MAGAZINE table, for instance:
select Author, INSTR(Author,',') from MAGAZINE;
AUTHOR INSTR(AUTHOR,',')
------------------------- -----------------
BONHOEFFER, DIETRICH 11
CHESTERTON, G.K. 11
RUTH, GEORGE HERMAN 5
WHITEHEAD, ALFRED 10
CROOKES, WILLIAM 8
Oracle SUBSTR
The substring function, which returns part of the column, is written like as follows:
SUBSTR(col1, start, length)
You can combine the substring function with the instring function and create a useful tool.
Here is an example, using the Pet Store schema. Let us say you want to retrieve only the first sentence from each Pet Log entry. You can use the substring function to return a portion of the data, but the exact length of the sentence varies with each entry. By using the instring function, you can locate the first period (.) found in the data. This is where the sentence ends. Add one to this number, and you can retrieve the complete sentence,
including the period. The diagram below shows the SQL query that does it.
SELECT
SUBSTR(LOG_TEXT,1,INSTR(LOG_TEXT,'.') + 1)
FIRST_SENTENCE
FROM PET_CARE_LOG
The first command you see in the figure is a SQL*Plus column formatting command. It lets me show you the entire contents of the column in a narrow window. See the module titled "The Oracle SQL*Plus environment" for more information on SQL*Plus commands.
In the next lesson, you will learn use of the character functions UPPER, INITCAP, RTRIM, and SOUNDEX.