How to Tune SQL Statement with Multiple Union in Subquery for MySQL?

dba tuning

The following is an example shows a SQL statement with two union operator in a subquery. The SQL retrieve records from EMPLOYEE table that EMP_ID should satisfy with the union result set from two queries in a subquery.

select * from employee
  where emp_id IN
  (select emp_id from emp_subsidiary where emp_grade=1000
    union
    select emp_id from employee where emp_dept=‘AAA’)

Here the following are the query plans in Tosska proprietary tree format, it takes 3 minutes 27 seconds to finish.

The query plan shows a full table scan of EMPLOYEE table and the attached subquery will be executed for each of scanned record. So, you can see the query plan is very inefficient. If we know the union result set is small and it should be executed first, and then use EMP_ID index to retrieve EMPLOYEE table. Let me rewrite the Union subquery as a derived table expression in the following:

select *
from employee
where  emp_id in (select  emp_id
                       from    (select  emp_id
                             from     emp_subsidiary
                             where  emp_grade = 1000
                             union
                             select  emp_id
                             from     employee
                             where  emp_dept = ‘AAA’) DT1)

Now, you can see the Union subquery is executed first and use it to retrieve the EMPLOYEE table by EMP_ID index. The overall query is now become more reasonable and efficient.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 60 times faster than the original SQL.  There are some other rewrites with even better performance, but it is a little bit complicated to discuss in this short article, let’s discuss it in my coming blogs.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL Statement with IN List Bind Variables for MySQL?

dba tuning

The following is an example shows a SQL statement with a variable on the IN List operator. The SQL retrieve records from EMPLOYEE table that (EMP_ID,@1) should match any value in a set of values on the right-hand side.

select * from employee
where  (emp_id,@1) in ((1000000,’a’),(2000000,’b’),(3000000,’c’))

Here the following are the query plans in Tosska proprietary tree format, it takes 19 seconds to finish.

The query plan shows a full table scan of EMPLOYEE, it means MySQL cannot decompose the IN list syntax into a better syntax for cost evaluation and no index scan is used.

Let me rewrite the IN list into multiple OR conditions in the following:

select *
from     employee
where  (  (  emp_id = 1000000
           and @1 = ‘a’ )
           or ( emp_id = 2000000
              and @1 = ‘b’ )
           or ( emp_id = 3000000
              and @1 = ‘c’ ) )

Now, MySQL can utilize Single Row (constant) index search. The speed now is 0.0059 second and is much faster than original SQL.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3200 times faster than the original SQL.  

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL with LIKE ‘%ROGER%’ Expression for Oracle?

mysql query optimization

The LIKE is a logical operator that determines if a character string matches a specified pattern. A pattern may include regular characters and wildcard characters. The LIKE operator is used in the WHERE clause of the SELECT, UPDATE, and DELETE statements to filter rows based on pattern matching.

Here is an example SQL that retrieves data from EMPLOYEE table employee’s name with a string pattern like “ROGER%”. If the emp_name is indexed, the following SQL will utilize index of the emp_name and the speed of the SQL will be fine.

select *
from employee
where emp_name like ‘ROGER%’;

If user is looking for emp_name with pattern like ‘%ROGER%’, Oracle SQL Optimizer cannot utilize emp_name index to speed up the process, full table scan is normally be used and the performance will be bad.

select *
from employee
where emp_name like ‘%ROGER%’;

Here the following are the query plan of this SQL, it takes 5.88 seconds to finish. The query shows a “Full Table Scan” of employee table.

You can see that this SQL cannot utilize index scan even the emp_name is indexed. Let me add a hints /*+ INDEX(@SEL$1 EMPLOYEE) */ to the SQL and ask Oracle to use index to retrieve records.

SELECT    /*+ INDEX(@SEL$1 EMPLOYEE) */ *
FROM        employee
WHERE    emp_name LIKE ‘%ROGER%’

Here is the query plan of the rewritten SQL and it is now running faster. The new query plan shows that an Index Full Scan is used although the estimated cost is higher than the Original SQL.

You can also use hints /*+ index(employee EMP_NAME_INX) */  explicitly, if the  /*+ INDEX(@SEL$1 EMPLOYEE) */ hints doesn’t work for you.

SELECT    /*+ index(employee EMP_NAME_INX) */ *
FROM        employee
WHERE    emp_name LIKE ‘%ROGER%’

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 2 times faster than the original SQL. There is a SQL hints injection with even better performance, it is a little bit complicated to discuss in this short article here. May be we can discuss later.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune Delete SQL statement with subqueries for MySQL?

mysql query optimization

The following is an example shows a DELETE SQL statement with subquery. The SQL delete records from emp_subsidiary that should satisfy with three conditions shows in the following query.

delete from emp_subsidiary
where emp_id in
         (SELECT EMP_ID
              FROM EMPLOYEE
          WHERE emp_salary < 15000)
and emp_dept<‘D’
and emp_grade<1500

Here the following are the query plans in tabular format, it takes 8.88 seconds to finish.

Normally, DELETE SQL statements are difficult to tune due to the MySQL SQL optimizer generate a relative smaller plan space for DELETE statements compare to SELECT SQL statements. Simply speaking, there are not much alternative plans that MySQL will generate for you no matter how complicated SQL syntax you can rewrite for your DELETE statement.  But there is a loophole in MySQL version 8, which we have to aware of is the order of conditions listed in the DELETE statement. The following rewrite which reordered the filtering conditions and has the same query plan as the original SQL both in Tree Plan and Tabular Plan. But the speed is improved to 3.88 seconds.

delete from emp_subsidiary
where            emp_dept < ‘D’
                          and emp_grade < 1500

                          and emp_id in (select EMP_ID
                                 from   EMPLOYEE
                                 where  emp_salary < 15000)

Since there is no change in Tree Plan and Tabular Plan, we have to check the Visual Plan and found the following change in red box, it shows you that the Attached Condition’s execution order is changed and the time-consuming subquery is placed at the end of the Attached Condition. It means that either one of the first two conditions is false then the subquery is not necessary to execute. It is possibly can explain why the second DELETE statement is running much faster than the original SQL statement.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 2 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune CTE “WITH” SQL statement?

SQL Server database and SQL

CTE stands for common table expression. A CTE allows you to define a temporary named result set that available temporarily in the execution scope of a statement such as SELECT, INSERT, UPDATE, DELETE.

The following shows an example of a CTE in MySQL:

WITH
    cte1 AS (SELECT a, b FROM table1),
    cte2 AS (SELECT c, d FROM table2)
SELECT b, d FROM cte1 JOIN cte2
WHERE cte1.a = cte2.c;

The following is an example shows a SQL statement with CTE WITH expression. The retrieve records from EMPLOYEE that EMP_GRADE and EMP_DEPT have to satisfy the CTE selection result.

with DT1 as
(SELECT
                     EMP_GRADE GRADE,EMP_DEPT DEPT
  FROM     DEPARTMENT,
                     EMP_SUBSIDIARY
 WHERE   DPT_ID = EMP_DEPT
         AND   DPT_AVG_SALARY<500000
         AND   EMP_DEPT<‘D’
         and     EMP_SALARY<1200000
)
select * from  EMPLOYEE where (EMP_GRADE,EMP_DEPT) in
(select GRADE,DEPT from DT1)

Here the following are the query plan of this SQL, it takes 55.9 seconds to finish. The query shows a “Subquery2” with a Nested Loop from sub_emp_salary_inx to DEPARTMENT_PK.

I found the Rows=69606 of step 1 (1 Index Range Scan –  EMP_SUBSIDIARY –  sub_emp_salary_inx) is significant high, it is not reasonable for MySQL SQL optimizer to such path from EMP_SUBSIDIARY to DEPARTMENT. I believe that MySQL optimizer cannot do a good transitivity improvement DPT_ID.  So, I manually add a new condition as “and DPT_ID<‘D’“and a “group by 1,2” to narrow down the result set from CTE.

with DT1
           as (select   EMP_GRADE GRADE,
                                   EMP_DEPT  DEPT
                   from     DEPARTMENT,
                                   EMP_SUBSIDIARY
                 where    DPT_ID = EMP_DEPT
                                   and DPT_ID < ‘D’
                                   and DPT_AVG_SALARY < 500000
                                   and EMP_DEPT < ‘D’
                                   and EMP_SALARY < 1200000
                 group by 1,
                                    2)
  select *
  from         EMPLOYEE
  where      (EMP_GRADE,EMP_DEPT) in (select  GRADE,
                      DEPT
                     from     DT1)

Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows correct driving path from DEPARTMENT to EMP_SUBSIDIARY, the estimated Rows now are closer to reality. There are two new steps of GROUP and DT1 (materialized) to narrow down the result set of CTE to future improve the performance.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 2 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune “Not Exists” SQL statement?

MySQL database and SQL

The following is an example shows a SQL statement with “Not Exists” expression. The SQL retrieve records from emp_subsidiary that satisfy with the “Not  Exists” subquery.

select * from emp_subsidiary sub
where not (exists
(select ‘x’ from employee emp
      where emp.emp_salary<1000
           and emp.emp_dept=sub.emp_dept ))
and sub.emp_grade<1200

Here the following is the query plan of this SQL, it takes 13.36 seconds to finish. The query shows a Nested Loop from emp_subsidiary to the “Materialized Subquery2” from a full table scan of employee.

I found the Rows=2950038 of “Full Table Scan of employee” of step 2 is significantly high to constitute the materialized subquery2(view). In order to reduce the actual number of rows scan of this materialized subquery2(view). I moved the subquery of “Not Exists” to a CTE “WITH” statement and added a “group by 1” to reduce the result set of the CTE in the following.

with DT1
          as (select     emp.emp_dept
                from         employee emp
                where      emp.emp_salary < 1000
                group by 1)
    select *
    from     emp_subsidiary sub
    where  not ( sub.emp_dept in (select emp_dept
                                                                     from   DT1) )
                    and sub.emp_grade < 1200

The following is the query plan of the rewritten SQL and it takes only 2.32 seconds to complete. The new query plan shows an “Index Range Scan” to the Employee table plus a “GROUP” operation to narrow down the result set of the materialized subquery2.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 5 times faster than the original SQL. There are some other rewrites shown in this screen with even better performance, but they are more complicated in SQL syntax transformation and not suitable to discuss here in this short article.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune UPDATE SQL statement with IN subquery (I) ?

optimization of SQL queries

The following is an example shows an Update SQL statement with an “IN” subquery. It updates records from emp_subsidiary that satisfies the “IN” subquery conditions.

update emp_subsidiary set emp_name =‘Deleted Name’
where   emp_dept in
(select dpt_id from department
 where dpt_avg_salary<=6000);

Here the following is the query plan of this SQL, it takes 7.55 seconds to finish the update. The query shows an attached_subqueries attached to a Full Index Scan of emp_subsidiary table. It means that the 295344 rows in emp_subsidiary is going to check the subquery’s conditions one by one.

Let me rewrite the SQL into the following join update syntax.

update emp_subsidiary e1, department d1
set    e1.emp_name=‘Deleted Name’
where  e1.emp_dept = d1.dpt_id
       and d1.dpt_avg_salary <= 6000

The following is the query plan of the rewritten SQL and it takes only 1.22 seconds to complete. The new query plan shows a “Nested Loop” from Department table to Emp_subsidiary table, due to the condition “dpt_avg_salary <= 6000” has been executed before it is going to loop the Emp_subsidiary table, it saved a lot of unnecessary time to detect every record in Emp_subsidiary table.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL with OR statements?

Tosska Chat Box

It is common that the performance is not good if a SQL statement with OR conditions. Let’s have an example show you how to tune those SQL statements in certain situations. Here is an example SQL that extract records from EMPLOYEE table if (emp_grade < 1050 or emp_id<730000). Emp_grade and emp_id are indexed and they are not null field.

select * from employee
where emp_grade < 1050 or emp_id<730000

You can see MySQL SQL Optimizer use an Index Merge of emps_grade_inx and employee_pk to process the SQL, the performance is not good as expected since the result set is quite big for sort_union operation. It takes more than 40 seconds to finish the data retrieval. Let me rewrite the OR condition into the following UNION ALL statement, please make sure the emp_grade and emp_id are not null column, otherwise it may generate error result. The rewrite is simple that the first part extract data with emp_grade<1050, the second part of the UNION ALL retrieve records that satisfied with emp_id<730000, but it is not retrieved in the first part of the UNION ALL.

select    *
from       employee
where    emp_grade < 1050
union all
select    *
from       employee
where    not ( emp_grade < 1050 )
                  and emp_id < 730000

Here the following is the query plan of this SQL, it takes 12.46 seconds to finish. The query shows two “Index Range Scan” of EMPLOYEE_PK and EMPS_GRADE_INX to the employee table.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL with COUNT(*) statements ?

sql performance monitoring

It is common that we used to count the number of records in a table.  You may encounter unexpected performance degradation in certain situations. Here is an example SQL that count number of records from EMPLOYEE table. There are number of indexes are built such as emp_id, emp_dept, emp_grade, emp_hire_date and etc….

SELECT COUNT(*)
   FROM EMPLOYEE;

You can see MySQL SQL Optimizer use a Full Index Scan of EMP_HIRE_DATE index, the performance is bad since unnecessary random reads is needed and it takes 3 minutes and 6 seconds to count a 3 million records in my computer.  I want to make use of Index Range Scan for specific index, let me rewrite the above SQL into the following syntax. If you know EMP_GRADE is indexed and it is not a nullable column, you can add a dummy condition EMP_GRADE>=’’. It fools MySQL SQL optimizer to consider using EMP_GRADE range index to retrieve the records and it is successfully generate a new plan in the following:

select    COUNT(*)
from       EMPLOYEE
where    EMP_GRADE >=  ‘ ‘

Here the following is the query plan of this SQL, it takes 2.6 seconds to finish. The query shows an “Index Range Scan” of employee table.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 71 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune substr(emp_name,5,4) SQL Statement?

sql performance monitoring

There may be some business requirements that need to compare certain part of a column as a data retrieval criteria. Here is an example SQL that retrieves data from EMPLOYEE table employee’s name with a string pattern “Acco” start from 5 character of the emp_name.

select    *
  from    employee
where    substr(emp_name,5,4)=‘Acco’

Here the following are the query plans of this SQL, it takes 17 seconds to finish. The query shows a “Full Table Scan Employee”  

You can see that this SQL cannot utilize index scan even the emp_name is indexed field. Let me add a “Force Index(emp_name_inx)“ hints to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition emp_name >= ‘ ‘ , it is an always true condition that emp_name should be greater or equal to a smallest empty character.

select    *
from       employee force index(emp_name_inx)
where    substr(emp_name,5,4) = ‘Acco’
                 and emp_name >= ‘ ‘

Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows that an Index Range Scan is used now.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/