Lesson 8	Table high water marks and full-table scans
Objective	Describe the effect of the high water mark on full-table scans.

Table High Water Marks and Full-Table Scans

When we create an Oracle table we can specify a STORAGE clause; if we omit it, the table inherits the default storage settings of its tablespace. The storage clause determines the initial and next extent sizes, and Oracle allocates data blocks within the tablespace based upon the values of these parameters. To avoid reading storage that has never held rows, Oracle keeps track of the table high water mark. When a new extent is allocated for a table, Oracle dedicates the data blocks to the table, but only raises the high water mark to include the first five data blocks of the new extent. Let us take a look at an example. Assume that we have a new table called CUSTOMER with an initial extent of 50 megabytes and we have added ten rows to this table.

High Water Mark Workflow

1) Oracle allocates the customer table with an initial extent of 50 megabytes

2) We add ten rows to the table, causing Oracle to raise the high water mark over the first five data blocks of the table and place these blocks onto the table freelist

3) We now issue the statement 'select count(*) from customer' and we see that Oracle reads only the first five data blocks before returning the row count.
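The workflow above can be sketched in SQL. This is only an illustration: the column names customer_id and customer_name are assumptions made for the example, and the number of blocks actually read depends on how the tablespace manages its space.

```sql
-- Step 1: allocate the table with a 50 MB initial extent.
-- The columns are hypothetical; the lesson does not define them.
create table customer (
   customer_id   number,
   customer_name varchar2(40)
)
storage (initial 50m next 50m);

-- Step 2: add ten rows; only a few blocks fall below the high water mark.
insert into customer (customer_id, customer_name)
select level, 'Customer ' || level
from dual
connect by level <= 10;

commit;

-- Step 3: the full-table scan stops at the high water mark,
-- not at the end of the 50 MB extent.
select count(*) from customer;
```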

Oracle High Water Mark

The high water mark (HWM) is the boundary between the blocks that have at some point contained data and the blocks that have never been used. You might have 1000 blocks allocated to a table but only 500 under the high water mark. The blocks under the HWM are the blocks that are read when the table is fully scanned.
The HWM therefore records the table at its greatest size, the point at which it consumed the most space. As a table undergoes deletes and updates, rows shrink and table data blocks become empty. For performance reasons, Oracle keeps the existing high water mark rather than recalculating it when blocks at the "end" of the table (the last extent) become empty. For example, assume that you have a million-row table that takes 29 seconds to read. After deleting 897,000 rows, a full scan of the table will still take about 29 seconds, because the high water mark is not reset by delete operations.

Challenges with high water mark
The issue with the high water mark is that full-table scans always read up to the high water mark, even though Oracle may be reading through many empty blocks that were 1) allocated to the table, 2) used for rows, and 3) then emptied by deletes. No always-current dictionary column reveals the exact high water mark for an Oracle table, but for estimation purposes you can assume that it lies in the last extent that was allocated to the table. The following query works at the data file level instead: for each data file it reports the highest block allocated to any extent, converted to megabytes on the assumption of an 8 KB block size:
```
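-- Highest allocated block in each data file, reported in megabytes.
-- Assumes an 8 KB (8192-byte) database block size.
select
   a.tablespace_name,
   a.file_name,
   ceil( (nvl(b.hwm,1)*8192)/1024/1024 ) "Mo"   -- "Mo" = megaoctets (megabytes)
from dba_data_files a,
     ( select file_id, max(block_id+blocks-1) hwm
       from dba_extents
       group by file_id
     ) b
where a.file_id = b.file_id(+)   -- outer join keeps files that have no extents
order by a.tablespace_name, a.file_name;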
```
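For a single table, one common way to approximate the high water mark is to compare the block count recorded in the optimizer statistics with the blocks allocated to the segment. The sketch below assumes the CUSTOMER table lives in the current schema and that gathering statistics on it is acceptable; the figure is only as current as the last statistics run.

```sql
-- Refresh the optimizer statistics so that USER_TABLES.BLOCKS is current.
begin
   dbms_stats.gather_table_stats(ownname => user, tabname => 'CUSTOMER');
end;
/

-- BLOCKS in USER_TABLES: blocks below the high water mark (approximate).
-- BLOCKS in USER_SEGMENTS: blocks allocated to the segment.
select t.table_name,
       t.num_rows,
       t.blocks as blocks_below_hwm,
       s.blocks as blocks_allocated
from   user_tables t
join   user_segments s
  on   s.segment_name = t.table_name
 and   s.segment_type = 'TABLE'
where  t.table_name = 'CUSTOMER';
```

A large gap between blocks_below_hwm and the space that num_rows would actually need suggests that full-table scans are reading many empty blocks.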

Returning to the CUSTOMER example: if we did not have a high water mark, a full-table scan would read all 50 megabytes of the table looking for rows. Since all of the rows are below the high water mark, Oracle only reads up to the high water mark and we get our count in less than one second.
There is a danger with this mechanism. Let us use the same example, but this time allocate the customer table, add 100,000 rows and then delete 99,990 rows.

APPEND hint

The APPEND hint applies to INSERT statements that load rows from another table, that is, INSERT statements that use a subquery. This is appropriate when you need to copy a large volume of rows between tables. By bypassing the database buffer cache and appending the data directly to the segment above the high water mark, you save significant overhead. This is a popular method for inserting rows into a table very quickly. When you specify the APPEND hint, Oracle performs a direct-path insert. In a direct-path insert, the data is appended at the end of the table, rather than placed into free space within the blocks already allocated to that table.
The APPEND and APPEND_VALUES hints automatically convert a conventional insert operation into a direct-path insert operation. In addition, if you are using parallel operations during an insert, the default mode of operation is direct-path. If you want to avoid direct-path operations, you can use the NOAPPEND hint. Keep in mind that if you are running with either of the append hints, there is a risk of contention when multiple application processes insert rows into the same table. Because a direct-path insert writes above the high water mark of a segment, only one such operation can work on a segment at a time, so concurrent append operations against the same table wait for one another and performance suffers. However, if you have partitioned objects, you can still run several concurrent append operations, as long as each insert operates on a separate partition of the table.
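As a minimal sketch of a direct-path insert, the statement below copies rows from CUSTOMER into a second table. The target table customer_archive is hypothetical and is assumed to have the same columns as CUSTOMER.

```sql
-- Direct-path insert: rows are written above the high water mark
-- of customer_archive (a hypothetical table used for illustration).
insert /*+ APPEND */ into customer_archive
select *
from   customer;

-- Until the transaction ends, this session cannot read or modify
-- customer_archive again (Oracle raises ORA-12838), so commit promptly.
commit;
```

Because the new rows always go above the high water mark, repeated append loads into a table with many deleted rows keep raising the high water mark instead of reusing the empty blocks below it.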

1) We allocate the table at 50 megabytes.

2) Oracle continues to extend the high water mark as the 100,000 rows are added.

3) As we delete 99,990 rows, the rows are removed, yet the high water mark stays high.

Now, when we execute the statement

 select count(*) from customer;

the query runs for several minutes while all of the data blocks up to the high water mark are accessed.
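The scenario can be reproduced roughly as follows. This sketch assumes a CUSTOMER table with the hypothetical columns customer_id and customer_name, and a client such as SQL*Plus that understands the SET TIMING command; actual timings depend on row width, caching, and hardware.

```sql
set timing on

-- Load 100,000 rows; the high water mark rises as blocks fill.
insert into customer (customer_id, customer_name)
select level, 'Customer ' || level
from   dual
connect by level <= 100000;

commit;

-- Full scan with all rows present.
select count(*) from customer;

-- Remove all but ten rows; the high water mark does not move.
delete from customer
where  customer_id > 10;

commit;

-- Full scan again: ten rows are counted, but every block below
-- the unchanged high water mark is still read.
select count(*) from customer;
```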
For highly active tables where large numbers of rows are deleted, you will see very long response times for full-table scans. To fix this, you can reorganize the table to lower the high water mark, or you can force the use of an index so that the query avoids the full-table scan. The next lesson concludes this module.
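As a sketch of the reorganization options mentioned in this lesson, the statements below show two ways to lower the high water mark of the CUSTOMER table. The index name customer_pk_idx is hypothetical, and which option is appropriate depends on the tablespace type and on how much downtime is acceptable.

```sql
-- Option 1: rebuild the segment. The new segment's high water mark
-- covers only the rows that still exist.
alter table customer move;

-- MOVE changes the rowids, so indexes on the table become unusable
-- and must be rebuilt (customer_pk_idx is a hypothetical index name).
alter index customer_pk_idx rebuild;

-- Option 2: shrink the segment in place. This requires a tablespace
-- with automatic segment space management and row movement enabled.
alter table customer enable row movement;
alter table customer shrink space;
```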