How to Tune SQL Statement with IN Operator with an Expression List for SQL Server?

The following is an example shows a SQL statement with an IN List expression. The SQL retrieves records from EMPLOYEE table that EMP_DEPT should match any value in a list of values.

select EMP_ID
from EMPLOYEE
WHERE EMP_DEPT IN (‘AAD’,‘COM’,‘AAA’)
AND EMP_SALARY<10000000

Here the following are the query plans in the Tosska proprietary tree format, it takes 2.4 seconds to finish.

The query plan shows three Hash Match with EMPLOYEE’s indexes. For indexes EMPLOYEE_PK and emps_salary_inx are processing with EstimateRows up to 3000000, it seems too expensive since this condition EMP_DEPT IN (‘AAD’,’COM’,’AAA’) should rapidly trim down the return records. Let me rewrite the IN list into multiple UNION conditions in the following:

select  EMP_ID
from     EMPLOYEE E1
where  exists ( select ‘x’
                 where  E1.EMP_DEPT = ‘AAD’
                 union
                 select ‘x’
                 where  E1.EMP_DEPT = ‘COM’
                 union
                 select ‘x’
                 where  E1.EMP_DEPT = ‘AAA’)
      and EMP_SALARY < 10000000

This rewrite can force the IN list operation to be processed first before the condition EMP_SALARY < 10000000 takes place. Here the following is the query plan after rewrite, SQL server now can utilize Merge Join of 3 Nested Loop of “EMPS_DPT_INX index seek to RID Lookup of employee”. The speed now is 0.191 seconds and is much faster than the original SQL.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for SQL Server automatically, it shows that the rewrite is more than 12 times faster than the original SQL.  

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

How to Tune SQL with IN Subquery in Certain Environment for Oracle?

Here is an example SQL that retrieves data from EMPLOYEE table with employee’s ID code in EMP_SUBSIDIARY table.

select * from employee a
where a.emp_id
       in (select b.emp_id
             from emp_subsidiary b)

Here the following are the query plan of this SQL, it takes 10.03 seconds to finish.  The query plan is very simple since the syntax of the SQL is not complicated. The query plan shows a full table scan of EMPLOYEE and then nested to the index of EMPSB_EMP_ID, it looks like a good query plan, but I wonder if it is too expensive to have a full table scan of EMPLOYEE table?

In order to ask Oracle to consider other query plans, I added a dummy function Coalesce(b.emp_id,b.emp_id) in the subquery’s select list that artificially adding cost to the driving path from EMPLOYEE to EMP_SUBSIDIARY due to the index EMPSB_EMP_ID is disabled by this dummy function.

SELECT  *
FROM      employee a
WHERE  a.emp_id   IN (SELECT Coalesce(b.emp_id,b.emp_id)
                           FROM     emp_subsidiary b)

The rewritten SQL generates an adaptive query plan and a “nested loops” from EMPSB_EMP_ID to EMPLOYEE by EMPLOYEE_PK index. You can remove the steps in “Id” column with marked ‘-‘ from the plan to verify the result plan. The new plan now takes 4.13 seconds to finish only.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 2 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL with Exists Operator in Certain Environment for Oracle?

Here is an example SQL that retrieves data from EMPLOYEE table with “emp_id < 710000” and employee’s department code exists in DEPARTMENT table.

select  *
  from   employee
 where emp_id < 710000
      and  exists (select  ‘x’
                    from department
                 where dpt_id = emp_dept)

Here the following are the query plan of this SQL, it takes 34.22 seconds to finish.  The query plan is very complicated, although the SQL is quite simple. It is not abnormal that Oracle uses a complex solution to solve simple data retrieval.  This kind of complex plan steps is suitable for certain environments, but not for a simple database like this. I call it over-optimized query plan, which is due to the under estimated cost of this query plan. For complex plan like this, the cost estimation error is easily be amplified from step to step within the chain of plan steps.

In order to ask Oracle to consider other query plans, I rewrite the EXISTS to IN with a new “group by dpt_id” operation that force Oracle SQL optimizer to execute the subquery first.

SELECT  *
FROM     employee
WHERE  emp_id < 710000
       AND  emp_dept IN (SELECT      dpt_id
                            FROM         department
                            GROUP BY  dpt_id)

The rewritten SQL generates a simpler query plan and it is actually running faster with 5.59 seconds only.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 6 times faster than the original SQL. There is a SQL rewrite with even better performance, it is a little bit complicated to discuss in this short article here. May be we can discuss later.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL with SEMI JOIN by Hints INDEX_DESC Injection for Oracle?

Semi-join is introduced in Oracle 8.0. It provides an efficient method of performing a WHERE EXISTS or WHERE IN sub-queries. A semi-join returns one copy of each row in first table for which at least one match is found in second table, there is no need of further scanning of the second table once a record is found.

SELECT *
     FROM DEPARTMENT
where dpt_id
     in (select emp_dept from EMPLOYEE
        where emp_id >3300000)

Here the following is the query plan of this SQL, it takes 13.59 seconds to finish. The query shows a “NESTED LOOPS SEMI” from DEPARTMENT to EMPLOYEE table.

Basically, this SQL is difficult to optimize by just syntax rewrite due to the simplicity of the SQL syntax that Oracle is easily transformed into a canonical syntax internally, so not much alternative query plan can be triggered by syntax rewrite.

Let’s use Hints injection to the SQL and see if there any brutal force of hints injection can trigger a better performance plan. With our A.I. Hints Injection algorithm applying to the SQL, it comes up with a SQL with extraordinary performance improvement that even I cannot understand at the first glance.

SELECT  /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ *
FROM     department
WHERE  dpt_id IN (SELECT emp_dept
                           FROM     employee
                          WHERE  emp_id > 3300000)

Here is the query plan of the hints injected SQL and it is now running much faster. The new query plan shows that the “INDEX RANGE SCAN” of EMP_DPT_INX to EMPLOYEE table is changed to “INDEX RANGE SCAN DESCENDING” and the estimated cost is the same as the Original SQL.

The Hints /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */  injected SQL takes only 0.05 second, it is much faster than the original SQL, the reason behind is the employee records creation order in EMPLOYEE table, the higher the emp_id will be created later, so the corresponding records will be inserted into the right hand side of the EMP_DPT_INX index tree nodes. The “INDEX RANGE SCAN” in the original SQL plan that needs to scan a lot of records from left to right direction before it can hit one record for  the condition “WHERE  emp_id > 3300000”.  In contrast, the Hints injected SQL with the “INDEX RANGE SCAN DESCENDING” operation that can evaluate the WHERE condition with only one scan from right to left on EMP_DPT_INX index tree nodes. That explains why the Hints injected SQL outperformed the original SQL by more than 270 times.

It is common that we employ “transaction id”, “serial no” or “creation date” in our application design, this kind of the records are normally created alone with an increasing sequence order, there may be some SQL in your system can be improved by this technique.

This kind of rewrites or Hints injection can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the Hints injected SQL is more than 270 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL with OR statements?

It is common that the performance is not good if a SQL statement with OR conditions. Let’s have an example show you how to tune those SQL statements in certain situations. Here is an example SQL that extract records from EMPLOYEE table if (emp_grade < 1050 or emp_id<730000). Emp_grade and emp_id are indexed and they are not null field.

select * from employee
where emp_grade < 1050 or emp_id<730000

You can see MySQL SQL Optimizer use an Index Merge of emps_grade_inx and employee_pk to process the SQL, the performance is not good as expected since the result set is quite big for sort_union operation. It takes more than 40 seconds to finish the data retrieval. Let me rewrite the OR condition into the following UNION ALL statement, please make sure the emp_grade and emp_id are not null column, otherwise it may generate error result. The rewrite is simple that the first part extract data with emp_grade<1050, the second part of the UNION ALL retrieve records that satisfied with emp_id<730000, but it is not retrieved in the first part of the UNION ALL.

select    *
from       employee
where    emp_grade < 1050
union all
select    *
from       employee
where    not ( emp_grade < 1050 )
                  and emp_id < 730000

Here the following is the query plan of this SQL, it takes 12.46 seconds to finish. The query shows two “Index Range Scan” of EMPLOYEE_PK and EMPS_GRADE_INX to the employee table.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL with COUNT(*) statements ?

It is common that we used to count the number of records in a table.  You may encounter unexpected performance degradation in certain situations. Here is an example SQL that count number of records from EMPLOYEE table. There are number of indexes are built such as emp_id, emp_dept, emp_grade, emp_hire_date and etc….

SELECT COUNT(*)
   FROM EMPLOYEE;

You can see MySQL SQL Optimizer use a Full Index Scan of EMP_HIRE_DATE index, the performance is bad since unnecessary random reads is needed and it takes 3 minutes and 6 seconds to count a 3 million records in my computer.  I want to make use of Index Range Scan for specific index, let me rewrite the above SQL into the following syntax. If you know EMP_GRADE is indexed and it is not a nullable column, you can add a dummy condition EMP_GRADE>=’’. It fools MySQL SQL optimizer to consider using EMP_GRADE range index to retrieve the records and it is successfully generate a new plan in the following:

select    COUNT(*)
from       EMPLOYEE
where    EMP_GRADE >=  ‘ ‘

Here the following is the query plan of this SQL, it takes 2.6 seconds to finish. The query shows an “Index Range Scan” of employee table.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 71 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune substr(emp_name,5,4) SQL Statement?

There may be some business requirements that need to compare certain part of a column as a data retrieval criteria. Here is an example SQL that retrieves data from EMPLOYEE table employee’s name with a string pattern “Acco” start from 5 character of the emp_name.

select    *
  from    employee
where    substr(emp_name,5,4)=‘Acco’

Here the following are the query plans of this SQL, it takes 17 seconds to finish. The query shows a “Full Table Scan Employee”  

You can see that this SQL cannot utilize index scan even the emp_name is indexed field. Let me add a “Force Index(emp_name_inx)“ hints to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition emp_name >= ‘ ‘ , it is an always true condition that emp_name should be greater or equal to a smallest empty character.

select    *
from       employee force index(emp_name_inx)
where    substr(emp_name,5,4) = ‘Acco’
                 and emp_name >= ‘ ‘

Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows that an Index Range Scan is used now.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

Tosska In-Memory Maestro (TIM™) for Oracle® v1.1.0 officially released !

In-Memory SQL Optimization In-Memory Objects recommendation

Hong Kong, – June 1, 2017


I am excited to announce that Tosska has just released our first version of Tosska In-Memory Maestro (TIM™) for Oracle® v1.1.0, the ultimate tool to manage your powerful Oracle® In-Memory capabilities that every database administrator has been looking for !

Just go and download a free trial immediately to explore the fascinating features that TIM™ can offer you. Enjoy your test drive !