How to build indexes for slow first execution SQL – SQL Server?

You may suffer from SQL statements with a slow first execution time due to the long data cache process. The following SQL is simple that retrieves records from the EMPLOYEE table that if EMP_SALARY < 500000 and the result set is ordered by EMP_NAME.

Select emp_id,
    emp_name,
    emp_salary,
    emp_address,
    emp_telephone
from    employee
where  emp_salary < 500000
order by emp_name;

The following is the query plan that takes 9.51 seconds for the first execution and takes 0.99 seconds for the second execution without data cache.

The SQL cannot be tuned by SQL syntax rewrite or hints injection for both the first execution and the second execution, it is because SQL Server has selected the best query plan for this simple SQL statement. But the problem is that if the condition “where emp_salary < 500000” is changed; say from 500000 to 510000 or the EMPLOYEE data is flushed out from the memory, the execution time will then be prolonged up to 9.51 seconds.

Let’s see if we can build indexes to improve this situation. There is a common perception that a good index can help to improve both the first execution time and the second execution time. So, I use a tool to explore a lot of indexes configurations, but none of them can improve both executions’ performance. Here the following is the performance of the second execution with data cached for different indexes proposed by the tool. You can see the performance of “Index Set 1” is close to the original SQL performance with a little performance variation due to the system’s loading status and all other indexes sets are worse than the original SQL. Normally, we will give up the tuning of the SQL statement without even trying to see whether those recommended indexes are good for the first execution time.

I did a test for those recommended indexes to see whether they are helpful to improve the first execution time, it surprises me that the “Index Set 1” is tested with a significant improvement and improves the first execution time from 9.51 seconds to 0.65 seconds. It is a 14 times improvement that can make my database run more efficiently. So, you should be very careful to tune your SQL with new indexes that may not be good for your second execution with all data cached, but it may be very good for your first execution without data cached.

This kind of indexes recommendation can be achieved by Tosska SQL Tuning Expert Pro for SQL Server automatically.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to index SQL with aggregate function SQL for Oracle?

Here the following is an example SQL shows you that select the maximum emp_address which is not indexed in the EMPLOYEE table with 3 million records, the emp_grade is an indexed column.

select max(emp_address) from employee a
where emp_grade<4000

As 80% of the EMPLOYEE table’s records will be retrieved to examine the maximum emp_address string. The query plan of this SQL shows a Table Access Full on EMPLOYEE table is reasonable.

How many ways to build an index to improve this SQL?
Although it is simple SQL, there are still 3 ways to build an index to improve this SQL, the following are the possible indexes that can be built for the SQL, the first one is a single column index and the 2 and 3 are the composite index with a different order.
1. EMP_ADDRESS
2. EMP_GRADE, EMP_ADDRESS
3. EMP_ADDRESS, EMP_GRADE

Most people may use the EMP_ADDRESS as the first choice to improve this SQL, let’s see what the query plan is if we build a virtual index for the EMP_ADDRESS column in the following, you can see the estimated cost is reduced by almost half, but this query plan is finally not being used after the physical index is built for benchmarking due to actual statistics is collected.

The following query shows the EMP_ADDRESS index is not used and the query plan is the same as the original SQL without any new index built.

Let’s try the second composite index (EMP_GRADE, EMP_ADDRESS), the new query plan shows an Index Fast Full Scan of this index, it is a reasonable plan which no table’s data is needed to retrieve. So, the execution time is reduced from 16.83 seconds to 3.89 seconds.

Let’s test the last composite index (EMP_ADDRESS, EMP_GRADE) that EMP_ADDRESS is placed as the first column in the composite index, it creates a new query plan that shows an extra FIRST ROW operation for the INDEX FULL SCAN (MIN/MAX), it highly reduces the execution time from 16.83 seconds to 0.08 seconds.

So, indexing sometimes is an art that needs you to pay more attention to it, some potential solutions may perform excess your expectation.

The best index solution is now more than 200 times better than the original SQL without index, this kind of index recommendation can be achieved by Tosska SQL Tuning Expert for Oracle automatically.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to build indexes for multiple Max() functions for SQL Server?

For some SQL statements with multiple Max() functions in the select list and nothing in the Where clause, we have different methods to create new indexes to improve the SQL speed.

Here is an example SQL, it is to retrieve the maximum name and age from the employee table.

select max(emp_name),
     max(emp_age)
from  employee

The following is the query plan that takes 9.27 seconds.

The SQL cannot be tuned by SQL syntax rewrite or hints injection, and the SSMS cannot recommend any index to improve the SQL.

For this kind of SQL that we can consider building a composite index or two individual indexes for emp_name and emp_age.  A new composite of these two columns (emp_age, emp_name) can improve the SQL around 7 times. The following is the query plan shows that the new composite index is used, but it has to scan the entire index for these two stream aggregate operations before getting the max(emp_name) and max(emp_age).

How about if we build two individual indexes for emp_name and emp_age. The following is the result and query plan of these two indexes created. A Top operator selects the first row from each index and returns to the Stream Aggregate operation, and then a Nested Loops join the two maximum results together. It is 356 times much faster than the original SQL.

This kind of indexes recommendation can be achieved by Tosska SQL Tuning Expert Pro for SQL Server automatically.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How is the order of the columns in a composite index affecting a subquery performance for Oracle?

MySQL database and sql

We know the order of the columns in a composite index will determine the usage of the index or not against a table. A query will use a composite index only if the where clause of the query has at least the leading/left-most columns of the index in it. But, it is far more complicated in correlated subquery situations. Let’s have an example SQL to elaborate the details in the following.

SELECT D.*
FROM   department D
WHERE EXISTS (SELECT    Count(*)
         FROM     employee E
         WHERE     E.emp_id < 1050000
                AND E.emp_dept = D.dpt_id
         GROUP BY  E.emp_dept
         HAVING    Count(*) > 124)

Here the following is the query plan of the SQL, it takes 10 seconds to finish. We can see that the SQL can utilize E.emp_id and E.emp_dept indexes individually.

Let’s see if a new composite index can help to improve the SQL’s performance or not, as a rule of thumb, a higher selectivity column E.emp_id will be set as the first column in a composite index (E.emp_id, E.emp_dept).

The following is the query plan of a new composite index (E.emp_id, E.emp_dept) and the result performance is not good, it takes 11.8 seconds and it is even worse than the original query plan.

If we change the order of the columns in the composite index to (E.emp_dept, E.emp_id), the following query plan is generated and the speed is improved to 0.31 seconds.

The above two query plans are similar, the only difference is the “2” operation. The first composite index with first column E.emp_id uses an INDEX RANGE SCAN of the new composite index, but the second query plan uses an INDEX SKIP SCAN for the first column of E.emp_dept composite index. You can see there is an extra filter operation for E.emp_dept in the Predicate Information of INDEX RANGE SCAN of the index (E.emp_id, E.emp_dept). But the (E.emp_dept, E.emp_id) composite index use INDEX SKIP SCAN without extra operation to filter the E.emp_dept again.

So, you have to test the order of composite index very carefully for correlated subqueries, sometimes it will give you improvements that exceed your expectation.

This kind of index recommendation can be achieved by Tosska SQL Tuning Expert for Oracle automatically.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

Do not undermine your SQL Server’s potential ability

For some SQL statements that are failed to be tuned by syntax rewrite, hints injection, and all necessary indexes are built, people may think that hardware upgrade is the only way to resolve the performance problem. But, please don’t undermine your SQL Server’s SQL optimizer which can provide you with the ultimate performance solution that you may not have imagined before. What you need to do is to provide SQL Server with a set of proper new indexes.

Here is an example SQL, it is to retrieve the minimum employee’s salary and the emp_id that with salary greater than all salary of the emp_subsidiary table with subsidiary’s employees’ department = “AAA”.

SELECT emp_id,
    (SELECT min(emp_salary)
     FROM  employee)
FROM  employee
WHERE emp_salary > (SELECT max(emp_salary)
           FROM emp_subsidiary
           where  emp_dept = ‘AAA’)

Although all columns that show in the SQL are indexed, the following query plan takes 44 seconds.

The SQL cannot be tuned by SQL syntax rewrite or hints injection, and the SSMS can recommend only one index on one table for a SQL statement, it is failed to recommend any good index. So, the SQL cannot be tuned in any traditional way.

Let’s use our new A.I. index recommendation engine to see if there are any good index solutions. A set of indexes is recommended listed in the following. It takes only 0.55 seconds.

Example: 80 times faster A.I. SQL index recommendation

The query plan shows that two new indexes are used at the same time that the SSMS is not able to provide.

Tosska SQL Tuning Expert Pro is in-built with an A.I. engine to recommend indexes for multiple tables at the same time for a SQL statement. The new technology is so powerful to recommend multiple tables’ new indexes for a SQL at the same time, it means that how each new table’s indexes affect each other in the query plan will be considered by the engine. It is very helpful for SQL Server’s SQL optimizer to explore more potential query plans that could not be generated before. So, don’t undermine your SQL Server’s ability. Instead, use the right tool to tune your SQL statements before you are planning to upgrade your hardware.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to use ORDERED Hint to Tune a SQL with subquery for Oracle?

Here the following is the description of the ORDERED hint.

The ORDERED hint causes Oracle to join tables in the order in which they appear in the FROM clause.

If you omit the ORDERED hint from a SQL statement performing a join, then the optimizer chooses the order in which to join the tables. You might want to use the ORDERED hint to specify a join order if you know something about the number of rows selected from each table that the optimizer does not. Such information lets you choose an inner and outer table better than the optimizer could.

We usually use an ORDERED hint to control the john order, but how this hint causes a SQL with a subquery. Let’s use the following SQL as an example to see how ORDERED hint works for a subquery.

SELECT *
     FROM DEPARTMENT
where  dpt_id
     in (select emp_dept from employee
      where emp_id >3300000)

Here the following is the query plan of the SQL, it takes 68.84 seconds to finish. The query shows a “TABLE ACCESS FULL” of the DEPARTMENT table and “NESTED LOOPS SEMI” to an “INDEX RANGE SCAN” of EMPLOYEE.

If you think it is not an effective plan, you may want to try to reorder the join path and see if an ORDERED hint is working or not in a subquery case like this:

SELECT  /*+ ORDERED */ *
FROM  department
WHERE  dpt_id IN (SELECT  emp_dept
         FROM  employee
         WHERE  emp_id > 3300000)

Here is the query plan of the hinted SQL and the speed is 3.44 seconds which is 20 times better than the original SQL. The new query plan shows the new join order that EMPLOYEE is retrieve first and then hash join DEPARTMENT later. You can see the ORDERED hint will order the subquery’s table first. This new order clauses a new data retrieval method from the EMPLOYEE table, it makes the overall performance much better than the original query plan.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for Oracle automatically, there are other hints-injection SQL with better performance, but it is not suitable to discuss in this short article, maybe I can discuss later in my blog.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL Statements with NO_RANGE_OPTIMIZATION Hints Injection?

There are some SQL statements with performance problem can be tuned by Hints injection only. Here is an example to show you how to use NO_RANGE_OPTIMIZATION optimization hints to tune a SQL statement.

A simple example SQL that retrieves data from EMPLOYEE and EMP_SAL_HIST tables.

select * from employee a,emp_sal_hist h
where  a.emp_id =h.sal_emp_id
and  a.emp_dept < ‘B’
and h.sal_salary  between 1000000 and 2000000

Here the following are the query plans of this SQL, it takes 24.3 seconds. The query shows an Index Range Scan (EMPS_DPT_INX) of EMPLOYEE and then Nested Loop to EMP_SAL_HIST with a Non-Unique Key Lookup of SALS_EMP_INX index.

The EMP_SAL_HIST is the employee’s salary history table which keeps more than one salary record for each employee. So, EMPLOYEE to EMP_SAL_HIST is a one-to-many relationship. The speed of a nested loop operation is highly dependent on the driving path of two nested loop tables. MySQL SQL optimizer estimated that the condition (a.emp_dept < ‘B’) can rapidly reduce the result set, so the driving path that “from EMPLOYEE to EMP_SAL_HIST” is selected.

Unless you fully understand the data distribution and do a very precise calculation, otherwise you are not able to tell whether this driving path is the best or not.

How to make MySQL consider another driving path “from EMP_SAL_HIST to EMPLOYEE”? Let’s take a look at MySQL documentation:

NO_RANGE_OPTIMIZATION: Disable index range access for the specified table or indexes. This hint also disables Index Merge and Loose Index Scan for the table or indexes. By default, range access is a candidate optimization strategy, so there is no hint for enabling it.

This hint may be useful when the number of ranges may be high and range optimization would require many resources.

To disable the Index Range Scan of the EMPLOYEE table, I explicitly add a Hints /*+ QB_NAME(QB1) NO_RANGE_OPTIMIZATION(`a`@QB1) */  to the SQL statement and hope that MySQL will use the Index Range Scan by the condition (h.sal_salary between 1000000 and 2000000) as the first driving table.

select  /*+ QB_NAME(QB1) NO_RANGE_OPTIMIZATION(`a`@QB1) */ *
from    employee a,
     emp_sal_hist h
where a.emp_id = h.sal_emp_id
     and a.emp_dept < ‘B’
     and h.sal_salary between 1000000 and 2000000

Here is the result query plan of the Hints injected SQL and the execution time is reduced to 10.01 seconds. The new query plan shows that the driving path is changed from EMP_SAL_HIST table nested loop to EMPLOYEE table. So, sometimes you may make use of the NO_RANGE_OPTIMIZATION hint to control the driving path order to see if MySQL can run your SQL faster.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the Hints injected SQL is more than 2 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL Statement with CASE Expression for SQL Server II?

oracle database performance tuning

We have discussed how to tune a CASE expression SQL with hardcoded literals in my last blog:

How to Tune SQL Statement with CASE Expression for SQL Server I?

SELECT *
FROM EMPLOYEE
 WHERE
 CASE
  when emp_id  < 1001000 then ‘Old Employee’ 
  when emp_dept <‘B’     then ‘Old Department’
 ELSE  ‘Normal’
 END =  ‘old Employee’

If I change the hardcoded literal to a @var, what will be the performance of the last blog’s rewritten SQL?

SELECT *
FROM EMPLOYEE
 WHERE
 CASE
  when emp_id  < 1001000 then ‘Old Employee’ 
  when emp_dept <‘B’     then ‘Old Department’
 ELSE  ‘Normal’
 END =  @var

I use the same method in my last blog to rewrite this SQL into the following multiple OR syntax, but the SQL Server optimizer change back to a full table scan of the EMPLOYEE table. It is because the SQL Server cannot do a good cardinality estimation of the variable of @var.

select *
from  EMPLOYEE
where emp_id < 1005000
     and ‘Old Employee’ = @var
     or not ( emp_id < 1005000 )
     and emp_dept < ‘B’
     and ‘Old Department’ = @var
     or not ( emp_id < 1005000 )
     and not ( emp_dept < ‘B’ )
     and ‘Normal’ = @var

We can rewrite the CASE expression into the following syntax with multiple UNION ALL statements, this syntax is more complicated than the rewrite with multiple OR conditions in my last blog. But it can make SQL Server improve the query plan to be more efficient.

select *
from  EMPLOYEE
where emp_id < (select 1005000)
     and ‘Old Employee’ = @var
union all
select *
from  EMPLOYEE
where ( not ( emp_id < 1005000 )
       and ‘Old Employee’ = @var
     or @var is null )
     and emp_id >= 1005000
     and emp_dept < ‘B’
     and ‘Old Department’ = @var
union all
select *
from  EMPLOYEE
where ( not ( emp_id < 1005000 )
       and ‘Old Employee’ = @var
     or @var is null )
     and ( not ( emp_id >= 1005000
         and emp_dept < ‘B’
         and ‘Old Department’ = @var )
       or @var is null )
     and emp_id >= 1005000
     and emp_dept >= ‘B’
     and ‘Normal’ = @var

Here is the query plan of the rewritten SQL and the speed is 0.448 seconds. It is 5 times better than the original syntax. People may think that there are two table scan operations of EMPLOYEE that will slow down the whole process, but actually, the corresponding filter operations will stop the table scan operations immediately due to the filter conditions ‘Normal’ = @var and ‘Old Department’ = @var will not be satisfied. This kind of query plan cannot be generated by SQL Server’s internal SQL optimizer, it means that you cannot use Hints injection to get this query plan.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for SQL Server automatically.

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

How to Tune SQL Statement with CASE Expression for SQL Server I?

oracle database performance tuning

Here the following is a simple SQL statement with a CASE expression syntax.

SELECT *
FROM EMPLOYEE
WHERE
CASE
when emp_id  < 1001000 then ‘Old Employee’
when emp_dept <‘B’     then ‘Old Department’
ELSE  ‘Normal’
END =  ‘old Employee’

Here the following are the query plans of this SQL, it takes 2.23 seconds in a cold cache situation, which means data will be cached during the SQL is executing. The query shows a Full Table Scan of the EMPLOYEE table due to the CASE expression cannot utilize the emp_id index or emp_dept index.

We can rewrite the CASE expression into the following syntax with multiple OR conditions.

select *
from  EMPLOYEE
where emp_id < 1005000
and ‘Old Employee’ = ‘Old Employee’
or not ( emp_id < 1005000 )
and emp_dept < ‘B’
and ‘Old Department’ = ‘Old Employee’
or not ( emp_id < 1005000 )
and not ( emp_dept < ‘B’ )
and ‘Normal’ = ‘Old Employee’

Here is the query plan of the rewritten SQL and the speed is 0.086 seconds. It is 25 times better than the original syntax. The new query plan shows an Index Seek of EMP_ID index.

This SQL rewrite is useful when the CASE expression is equal to a hardcoded literal, but if the literal “  =’Old Employee’ ” replaced by a variable “ = :var ”, this rewrite may not be useful, I will discuss it in my next blog.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for SQL Server automatically.

Expert (TSES™) for SQL Server® – Tosska Technologies Limited

How to tune a SQL that cannot be tuned ?

oracle sql performance tuning

Some mission-critical SQL statements are already reached their maximum speed within the current indexes configuration.  It means that those SQL statements are not able to be improved by syntax rewrite or Hints injection. Most people may think that the only way to improve this kind of SQL may be by upgrading hardware.  For example, the following SQL statement has every column in WHERE clause is indexed and the best query plan is generated by Oracle already. There is no syntax rewrite or hints injection that can help Oracle to improve the SQL performance.

SELECT EMP_ID,
    EMP_NAME,
    SAL_EFFECT_DATE,
    SAL_SALARY
  FROM EMPLOYEE,
    EMP_SAL_HIST,
    DEPARTMENT,
    GRADE
WHERE EMP_ID = SAL_EMP_ID
  AND SAL_SALARY <200000
  AND EMP_DEPT = DPT_ID
  AND EMP_GRADE = GRD_ID 
  AND GRD_ID<1200    AND EMP_DEPT<‘D’

Here the following is the query plan and execution statistics of the SQL, it takes 2.33 seconds to extract all 502 records. It is not acceptable for a mission-critical SQL that is executed thousands of times in an hour. Do we have another choice if we don’t want to buy extra hardware to improve this SQL?

Introduce new plans for Oracle’s SQL optimizer to consider
Although all columns in the WHERE clause are indexed, can we build some compound indexes to help Oracle’s SQL optimizer to generate new query plans which may perform better than the original plan? Let’s see if we adopt the common practice that the following EMPLOYEE’s columns in red color can be used to compose a concatenated index (EMP_ID, EMP_DEPT, EMP_GRADE).

WHERE  EMP_ID = SAL_EMP_ID
  AND  SAL_SALARY <200000
  AND  EMP_DEPT = DPT_ID
  AND  EMP_GRADE = GRD_ID 
  AND  GRD_ID<1200
  AND  EMP_DEPT<‘D’

CREATE INDEX C##TOSSKA.TOSSKA_09145226686_V0043 ON C##TOSSKA.EMPLOYEE
(
 EMP_ID,
 EMP_DEPT,
 EMP_GRADE
)

The following is the query plan after the concatenated index is created. Unfortunately, the speed of the SQL is 2.40 seconds although a new query plan is introduced by Oracle’s SQL optimizer.

To be honest, it is difficult if we just rely on common practices or human knowledge to build indexes to improve this SQL. Let me imagine that if we got an AI engine that can help me to try the most effective compound indexes to explore Oracle’s SQL optimizer potential solutions for the SQL. The following concatenated indexes are the potential recommendation by the imagined AI engine.

CREATE INDEX C##TOSSKA.TOSSKA_13124445731_V0012 ON C##TOSSKA.EMP_SAL_HIST
(
 SAL_SALARY,
 SAL_EFFECT_DATE,
 SAL_EMP_ID
)
CREATE INDEX C##TOSSKA.TOSSKA_13124445784_V0044 ON C##TOSSKA.EMPLOYEE
(
 EMP_GRADE,
 EMP_DEPT,
 EMP_ID,
 EMP_NAME
)

The following is the query plan after these two concatenated indexes are created and the speed of the SQL is improved to 0.13 seconds. It is almost 18 times better than that of the original SQL without the new indexes.

The above indexes include some columns that appear on the SELECT list of the SQL and there is a correlated indexes relationship for Oracle’s SQL optimizer to generate the query plan, it means that missing any columns of the recommended indexes or reshuffling of the column position of the concatenated indexes may not be able to produce such query plan structure. So, it is difficult for a human expert to compose these two concatenated indexes manually. I am glad to tell you that this kind of AI engine is actually available in the following product.

https://www.tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/