How to Tune CTE “WITH” SQL statement?

SQL Server database and SQL

CTE stands for common table expression. A CTE allows you to define a temporary named result set that available temporarily in the execution scope of a statement such as SELECT, INSERT, UPDATE, DELETE.

The following shows an example of a CTE in MySQL:

WITH
    cte1 AS (SELECT a, b FROM table1),
    cte2 AS (SELECT c, d FROM table2)
SELECT b, d FROM cte1 JOIN cte2
WHERE cte1.a = cte2.c;

The following is an example shows a SQL statement with CTE WITH expression. The retrieve records from EMPLOYEE that EMP_GRADE and EMP_DEPT have to satisfy the CTE selection result.

with DT1 as
(SELECT
                     EMP_GRADE GRADE,EMP_DEPT DEPT
  FROM     DEPARTMENT,
                     EMP_SUBSIDIARY
 WHERE   DPT_ID = EMP_DEPT
         AND   DPT_AVG_SALARY<500000
         AND   EMP_DEPT<‘D’
         and     EMP_SALARY<1200000
)
select * from  EMPLOYEE where (EMP_GRADE,EMP_DEPT) in
(select GRADE,DEPT from DT1)

Here the following are the query plan of this SQL, it takes 55.9 seconds to finish. The query shows a “Subquery2” with a Nested Loop from sub_emp_salary_inx to DEPARTMENT_PK.

I found the Rows=69606 of step 1 (1 Index Range Scan –  EMP_SUBSIDIARY –  sub_emp_salary_inx) is significant high, it is not reasonable for MySQL SQL optimizer to such path from EMP_SUBSIDIARY to DEPARTMENT. I believe that MySQL optimizer cannot do a good transitivity improvement DPT_ID.  So, I manually add a new condition as “and DPT_ID<‘D’“and a “group by 1,2” to narrow down the result set from CTE.

with DT1
           as (select   EMP_GRADE GRADE,
                                   EMP_DEPT  DEPT
                   from     DEPARTMENT,
                                   EMP_SUBSIDIARY
                 where    DPT_ID = EMP_DEPT
                                   and DPT_ID < ‘D’
                                   and DPT_AVG_SALARY < 500000
                                   and EMP_DEPT < ‘D’
                                   and EMP_SALARY < 1200000
                 group by 1,
                                    2)
  select *
  from         EMPLOYEE
  where      (EMP_GRADE,EMP_DEPT) in (select  GRADE,
                      DEPT
                     from     DT1)

Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows correct driving path from DEPARTMENT to EMP_SUBSIDIARY, the estimated Rows now are closer to reality. There are two new steps of GROUP and DT1 (materialized) to narrow down the result set of CTE to future improve the performance.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 2 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/