Monitoring Databases  «Prev  Next»

Lesson 8 Maintaining database statistics
Objective Maintain statistics.

Maintain Database Index Statistics

SQL Server maintains statistics on every index and column in a table. The statistics are used by SQL Server to optimize the performance of your SQL statements. The database can be configured to automatically maintain statistics, or you can manually maintain the statistics. For instance, if you are retrieving data from a customer table where the ZipCode = 80013, having an index or group statistics can help SQL Server find out how many rows will be returned from the query. This helps SQL Server decide on indexes to use and in which order to join tables. In this lesson, we will cover four commands that will allow you to make use of database statistics.
  1. CREATE STATISTICS,
  2. ALTER DATABASE (for auto-create),
  3. DROP STATISTICS, and
  4. UPDATE STATISTICS.

Create statistics Command:

The create statistics command will allow SQL Server to analyze columns in a table by gathering histogram and density information. This information is used to estimate how many rows will have a given value for the column. The syntax of create statistics is as follows:
  1. Histogram: A range of values within an index.
  2. Density: The reciprocal of the number of distinct values. Lower density indicates higher uniqueness.”

CREATE STATISTICS statistics_name 
ON table_name (column1 [ ,...n ]) 
WITH [ FULLSCAN | SAMPLE number PERCENT | SAMPLE number ROWS ] 
     [ , NORECOMPUTE ]
Create Statistics in SQL Server
  1. A unique name identifying the statistic group
  2. The table for which you want statistics.
  3. The columns in the table on which to create statistics.
  4. Tells the create statistics command to scan the whole table when creating the statistics.
  5. Allows you to specify how much of the table to scan.
  6. Stops SQL Server from automatically updating statistics on this statistics
It is a good idea to consider generating statistics on columns that are not in an index AND are used in a WHERE clause or in a JOIN clause. This is because this can help SQL Server make better choices when optimizing queries.
To create statistics on the ZipCode column in the Customer table, you could use the following command:
create statistics CustStat1 on Customer (ZipCode)


Auto Create Statistics

Data-distribution statistics are a key factor in how the SQL Server Query Optimizer creates query execution plans. This option directs SQL Server to automatically create statistics for any columns for which statistics could be useful. The default for this option is set to ON. To set auto create statistics ON for database_sample
ALTER DATABASE database_sample 
SET AUTO_CREATE_STATISTICS ON;

Drop statistics:

The `DROP STATISTICS` command is used to remove a statistics object.
The syntax is:
DROP STATISTICS table_name.statistics_name [,...n]

Maintaining and Updating Statistics

✅ Transact-SQL Syntax:
UPDATE STATISTICS table_name  
    [ ( statistics_name [ ,...n ] ) | index_name ]  
    [ WITH  
        [ FULLSCAN | SAMPLE number { PERCENT | ROWS } ]  
        [ , { ALL | COLUMNS | INDEX } ]  
        [ , NORECOMPUTE ]  
    ]

When you do not specify an index or a statistics, all statistics are updated.
✅ Explanation (Rewritten for SQL Server 2022)
🔹 `WITH` Clause Options
Option SQL Server 2022 Description
FULLSCAN Updates statistics using all rows in the table — provides maximum accuracy.
SAMPLE number PERCENT Samples a percentage of rows. Improves performance when a full scan is unnecessary.
SAMPLE number ROWS Samples a fixed number of rows. Useful for very large tables.
ALL Updates all statistics: both index-based and column-based.
COLUMNS Updates only column-level statistics (non-indexed).
INDEX Updates only statistics associated with indexes.
NORECOMPUTE Prevents SQL Server from automatically updating the specified statistics in the future.

There is a core, built-in feature of the SQL Server Query Optimizer. This automatic process is controlled by a database setting called `AUTO_UPDATE_STATISTICS`. When this option is turned on (which it is by default), the Query Optimizer automatically detects when statistics are "stale" or out-of-date and updates them before compiling a query.
The "Intelligent Threshold"
The logic that determines when statistics are stale has become more intelligent over time.
  • Legacy Threshold: It used to be a simple rule where statistics were updated after about 20% of the rows in a table had been modified. This was inefficient for very large tables, as it could take millions of changes to trigger an update.
  • Modern Threshold: Since SQL Server 2016 (and available in earlier versions via Trace Flag 2371), the threshold is dynamic and more intelligent. It lowers as a table gets larger. This means a smaller percentage of row changes will trigger an update on a massive table, ensuring the statistics are kept up-to-date more effectively. This modern threshold is based on a formula that uses the square root of the number of table rows: `sqrt(1000 * table_rows)`.

Best Practices for SQL Server 2022

  • Use FULLSCAN for critical reports or predictable performance.
  • Use SAMPLE options for large tables where speed is preferred over precision.
  • Avoid NORECOMPUTE unless you have a strong reason — modern statistics auto-updating is intelligent and generally recommended.

The next lesson will show you how to view statistics information.

SEMrush Software