Oracle Database - Optimizer Statistics
Type of Statistics
- Dictionary by Table, Column
- System (of the operating System) during a regular workload (otherwise non appropriate values for the CPU costing for instance)
Articles Related
Query Optimizer Statistics in the Data Dictionary
Statistics are always treated as estimates and can become stale or out of date, even the microsecond after a complete computation.
The statistics used by the query optimizer are stored in the data dictionary. You can collect exact or estimated statistics about physical storage characteristics and data distribution in these schema objects by using the DBMS_STATS package.
To maintain the effectiveness of the query optimizer, you must have statistics that are representative of the data.
Good statistics improve the degree of accuracy of the measures that use the plan estimator to estimate a plan.
For table columns that contain values with large variations in number of duplicates, called skewed data, you should collect histograms.
The resulting statistics provide the query optimizer with information about data uniqueness and distribution. Using this information, the query optimizer is able to compute plan costs with a high degree of accuracy. This enables the query optimizer to choose the best execution plan based on the least cost.
Bad Statistics
For example, if a table has not been analyzed since it was created, and if it has less than DB_FILE_MULTIBLOCK_READ_COUNT blocks under the high water mark, then the optimizer thinks that the table is small and uses a full table scan. (Review the LAST_ANALYZED and BLOCKS columns in the ALL_TABLES table to examine the statistics)
No statistics
- If no statistics are available when using query optimization, the optimizer will do dynamic sampling depending on the setting of the OPTMIZER_DYNAMIC_SAMPLING initialization parameter.
- If the optimizer uses the cost-based approach for a SQL statement, and if some tables accessed by the statement have no statistics, then the optimizer uses internal information, such as the number of data blocks allocated to these tables, to estimate other statistics for these tables.