Optimization in SQL: Determining Stale Statistics in Oracle

Optimization in SQL

You can determine whether statistics are stale in Oracle using two methods. The first is to make Oracle tell you if it considers the stats are stale.

The second involves a comparison of the statistics of what the DBMS assumes a table to be – and what it really is. Here, we’ll help you understand and determine whether the stats are stale.

Other Reasons Why Oracle May Pick a Bad Plan (Aside from Stale Statistics)

Good statistics aren’t necessarily always perfect – they only have to be correct to a degree for the information in the table. In case your statements are running sluggishly because Oracle picked a bad plan, you may also take it as a fair sign that your statistics are stale.

That said, stale statistics aren’t the only culprits behind Oracle selecting a bad plan, triggering the need for optimization in SQL. There are others, such as the following:

  • If your data doesn’t include enough histograms, extended statistics, or joins (both correlated and anti-correlated ones), it may not be uniform enough for Oracle to perceive it as it should.
  • Oracle might find it hard to calculate the amount of data to expect. This generally happens when a query is written in a way that confuses the DBMS.
  • When Oracle lacks a precise representation of the duration involving the completion of one or more operations. The system statistics may be incorrect in such cases.
  • The data set may not be sufficiently represented in Oracle’s version of the statistics. This usually happens in large tables with highly varied data that a histogram cannot accurately represent.
  • Certain indexes typically used by Oracle may have become invalid.

Coaxing Oracle to Determine Stale Statistics

You can have Oracle tell you about stale stats easily if your priority is to improve performance of SQL query.  The catch is, you won’t be able to determine how stale those stats are. However, you will know if there have been enough changes in a table for Oracle to consider regathering statistics on it.

To find out if that’s the case, you’ll need to view the stale stats column in DBA_STATISTICS which you can do with the following query:

select stale_stats

from dba_statistics

where owner = ‘<name of table owner>’’

AND table_name = ‘<name of table>’

The column may return “YES”, “NO” or other results, indicating Oracle’s stance on the stats. “YES” means Oracle is ready to re-gather statistics, whereas “NO” shows Oracle believes the statistics don’t need updating.

The column may return null, indicating incomplete or absent stats altogether. Take care to enter the correct table owner and table names as the query won’t return any rows otherwise.

Checking Stats on Your Own: What to Do in Oracle Database and SQL

When you do this manually, you can find out “how stale” your stats are. That’s because you’ll be able to collect stats on the “stalest” tables first, reducing the number of changes needed to be made to the database. This way, you could also avoid accumulating stats in situations that could lead to contention.

The goal here is to draw a comparison between Oracle’s values and the table’s actual values. Typically, a difference of up to 10 per cent between the two is acceptable. Also, this method requires us to check two separate kinds of statistics –

  • Table Level Statistics – As the name suggests, you can verify various aspects of the table like the following:
  1. A number of rows and empty blocks – You can use a query like this:

DBA_TAB_STATISTICS.NUM_ROWS

select count(*)

from <table name>;

  1. Number of Empty blocks and
  2. A number of blocks taken up by the table – The statement below will help you retrieve the number of “occupied blocks”:

DBA_TAB_STATISTICS.BLOCKS –

DBA_TAB_STATISTICS.EMPTY_BLOCKS

select count(distinct substr(rowid, 7, 9))

from <table name>;

  • Column Level Statistics – These will help you determine things like:
  1. How many distinct values exist in a column – This proves useful for expressions that include “COLUMN = <any value>”.

Try the following:

select count(distinct <columnname>)

from <table name>;

  1. The column’s high and low values – These values prove useful for situations that need range-based predicates, such as those involving COLUMN <= <any value> or COLUMN between <START> and <END>.
  1. The number of nulls in a column – This query should help if you want to view the stats of a particular column: 

with cte (x)

as

(

   select /*+ inline */ <column name>

   from <table owner name>.<table name that you want to check>

)

select (select approx_count_distinct(x) from cte) distinct_values

     , (select count(*) – count(x) from cte) num_nulls

     , (select min(x) from cte) low_value

     , (select max(x) from cte) high_value

from dual