How to Tune Bad Performance SET ROWCOUNT SQL Statements for SQL Server?

sql performance monitoring

Some SQL statements will be running very slow after SET ROWCOUNT or TOP is used.  SET ROWCOUNT and TOP are used to tell SQL Server to select a specific number of rows from the SQL statements instead of extracting all records. Not many people know that SQL Server will try to re-optimize your SQL statements after you adding SET ROWCOUNT or TOP, the result is normally good after re-optimization of your SQL statements that can generate query plans for retrieving the first few records as fast as possible.

Good Example for Query Re-optimization for SET ROWCOUNT

Here the following is an example that shows the SQL takes 6.78 seconds to retrieve 217500 rows from the database, the query plan shows a good plan with a Hash Match for two Table Scan of [DEPARTMENT] and [EMPLOYEE].

The following screen shows the new query plan is generated after the SET ROWCOUNT 1 is used, the query plan is changed from Hash Match to Nested Loops. Nested Loops operation normally provides faster first few records retrieval time but may not be good for overall records extraction in certain situations. It is good to see that SQL Server uses only 0.013 seconds to extract the first row for this SQL.

Bad Example for Query Re-optimization for SET ROWCOUNT

Let’s see a bad example that shows how SQL Server degrades a good query plan to a bad query plan after the SET ROWCOUNT 1 is used. Here the following is an example that shows the SQL takes 0.118 seconds to retrieve 1613 rows from the database, the query plan is a little bit complex but it is a good query plan to retrieve all 1613 rows.

The following screen shows the new query plan is generated after the SET ROWCOUNT 1 is used, the query plan is now changed to Nested Loops with two Table Scans. The new query plan takes 1.312 seconds to extract only the first record, it is even slower than the 0.118 seconds that is used to extract all 1613 rows from the database.

How to Solve This Problem?

We can use Hints injection or SQL syntax rewrite to influence SQL Server to get back the original plan or generate an even better query plan for the SET ROWCOUNT or TOP operation. The following Hints injection generated a good query plan that is almost 90 times better than the original SQL with SET ROWCOUNT 1.

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

How to Tune Cold Cache SQL Statements for SQL Server?

sql optimizer for sql server

For SQL statements that are not executed frequently, so that the relevant data is no longer exists in the buffer cache, a cold cache will significantly affect the performance of a SQL statement. A good performance SQL for hot cache may not be performing well in a cold cache environment. Experience developers will tune their SQL running well for both environments.

Here the following is an example SQL:

select * from
EMPLOYEE A
 where  A.EMP_ID IN (SELECT B.EMP_ID from EMP_SUBSIDIARY B
                      where B.EMP_DEPT < ‘D’)

Here the following is the query plan in the Tosska proprietary tree format, it takes 8.024 seconds for the first execution with cache delay and it takes 3.7 seconds for the second execution without caching time.

According to the query plan, you may find that the most significant IO consumption is the Table Scan of [EMPLOYEE] table. To simulate the cold cache environment, we can use the DBCC DROPCLEANBUFFERS command to clear the data cache before each execution of rewritten or optimized SQL statement.

Let me add an optimizer hint OPTION(LOOP JOIN) to the SQL and try to change the query plan from a Hash Match to a Nested Loop join. So, the EMP_ID(EMPLOYEE_PK) and a RID Lookup to [EMPLOYEE] will be used instead of using Table Scan. I hope that the RID Lookup can select fewer data from hard disk with matched EMP_ID in both [EMPLOYEE] and [EMP_SUBSIDIARY].

select *
from  EMPLOYEE A
where A.EMP_ID in (select B.EMP_ID
          from   EMP_SUBSIDIARY B
          where   B.EMP_DEPT < ‘D’) OPTION(LOOP JOIN)

Here the following is the query plan, the time is reduced from 8.024 seconds to 1.565 seconds with data cache overhead, and the physical reads are also dropped from 190,621 to 39,044. It shows a wrong IO estimation If you just rely on the SQL Server’s EstimateIO x EstimiateExecutions in the query plan.

There are other even better tuning solutions for this SQL with the A.I. SQL tuning tool in the following:

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

The following SQL with an optimizer hint generate a more complicated query plan with the best execution time of 0.7 seconds. The SQL is tuned by cold cache simulation that data will be flushed before each execution of SQL alternatives.

How to Tune SQL Statement with EXISTS Subquery for SQL Server II ?

Optimization in SQL

In my last article that a SQL statement with an Exists subquery was improved 90 times by the following rewrite.

SELECT *
FROM DEPARTMENT
where exists (select ‘x’
         from employee
         where emp_id > 2700000
         and emp_dept=DPT_ID)

Query Plan:

Rewritten SQL syntax:

select  *
from DEPARTMENT
where  DPT_ID in (select    isnull(emp_dept,emp_dept)
         from      employee
         where   emp_id > 2700000
         group by emp_dept)

Query Plan:

Syntax Rewrite Solution
Syntax rewrite technique to improve SQL statements are commonly used by DBA or developers especially for Oracle or MySQL databases, but syntax rewrite is not easy to be applied by users who are using MS SQL Server or IBM Db2 LUW. The reason is that MS SQL Server and IBM Db2 LUW have a strong internal rewrite engine in their SQL optimizer. The internal SQL rewrite engine will try to rewrite a SQL syntax to their internal canonical syntax. It means that no matter how you rewrite your SQL statement, MS SQL Server and IBM Db2 LUW will try to rewrite the SQL back to their internal presumed good syntax, so it is difficult to tune a SQL if the so-called presumed good syntax is not good, since users are not easy to influence database SQL optimizer to generate a better query plan by simple SQL syntax rewrite.

Query Hints Injection Solution
To solve this problem, SQL Server provides Query Hints feature for users to help its SQL optimizer generate a better query plan. It is not like the SQL syntax rewrite method, experienced developers may tell what the final query plan will be for a rewritten syntax, Query Hints is a pinpoint solution that a query hint injection is normally applied to the specific step of the entire query plan, but a change to a plan step will incur domino effect to other plan steps in the entire query plan since MS SQL Server must adjust other plan steps to achieve what the user’s expectation for the query hint in the SQL statement. So, the final query plan is not easy to predict by users, especially for complex SQL statements.

The following SQL with Hints injection generated by Tosska SQL Tuning Expert is around 4 times better than the original SQL and takes 0.639 seconds.

select   *
from  DEPARTMENT
where exists ( select  ‘x’
        from  employee
        where emp_id > 2700000
           and emp_dept = DPT_ID) OPTION(LOOP JOIN,HASH GROUP)

There is an even better SQL with Hints injected, it is around 50 times better than original SQL and takes 0.055 seconds. This query plan is pretty close to the rewrite tuning in my last article.

select   *
from  DEPARTMENT
where exists ( select  ‘x’
        from  employee WITH(INDEX(EMPS_DPT_INX))
        where emp_id > 2700000
           and emp_dept = DPT_ID)

Syntax Rewrite plus Hints Injection Solution
For some SQL statements, a separate syntax rewrite method or a hints injection method may not be able to solve a complex SQL performance problem individually, some people may think that will it be possible if we rewrite a SQL and apply hints at the same time to improve a SQL statement? Yes, it is possible in the Tosska SQL Tuning Expert A.I. engine, this technology can solve more SQL performance problems by a computer algorithm ever before. I will discuss this technology later in my blog.

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

The following screen show Tosska SQL Tuning Expert can generate 178 distinguished query plans after investigated 300 SQL Hints injection, it is far out of what a human expert can achieve within 10 minutes. MS SQL Server is the most sensitive to Query Hints Injection database in the market, SQL Server query hints are normally able to influence SQL optimizer to generate a specific query plan, so the SQL tuning for MS SQL Server is far more challenging than other databases.

How to Tune SQL Statement with EXISTS Subquery for SQL Server I ?

sql server tuning tools

The following is an example that shows a SQL statement with an Exists subquery. The SQL retrieves records from the DEPARTMENT table that DPT_ID is found in emp_dept of employee table with emp_id > 2700000.

SELECT *
FROM DEPARTMENT
where exists (select ‘x’
         from employee
         where emp_id > 2700000
         and emp_dept=DPT_ID)

Here the following is the query plan in the Tosska proprietary tree format, it takes 2.23 seconds to finish.

The query plan shows two Hash Match from [EMPLOYEE].[EMPLOYEE_PK] to [EMPLOYEE].[EMPS_DPT_INX] and then Merge Join to a sorted [DEPARTMENT] table. This query plan looks reasonable, but the number of records scan from [EMPLOYEE] is too expensive at the first stage, can we use the small [DEPARTMENT] table to scan back the [EMPLOYEE] table to improve the SQL.

Let me rewrite the EXISTS subquery into an IN subquery in the following, but the query plan is not changed as expected.

select  *
from DEPARTMENT
where  DPT_ID in (select   emp_dept
         from     employee
         where  emp_id > 2700000)

I further rewrite the SQL and add the dummy function “isnull(emp_dept,emp_dept)” in the select list, but it cannot stop the operation of Hash Match to [EMPLOYEE].[EMPS_DPT_INX].

select  *
from DEPARTMENT
where  DPT_ID in (select    isnull(emp_dept,emp_dept)
         from      employee
         where   emp_id > 2700000)

To further enforce the restriction for stoping the operation “Hash Match to [EMPLOYEE].[EMPS_DPT_INX]”, I try to add a dummy “group by emp_dept” operation in the subquery.

select  *
from DEPARTMENT
where  DPT_ID in (select    isnull(emp_dept,emp_dept)
         from      employee
         where   emp_id > 2700000
         group by emp_dept)

Here the following is the query plan after the final rewrite, SQL server first uses a Table Scan of [DEPARTMENT] table and Nested Loop of “EMPS_DPT_INX index seek to RID Lookup of [EMPLOYEE]” with the Top 1 operation, so each record from [DEPARTMENT] table will match at most one record from [EMPLOYEE] only. The speed now is 0.024 seconds and is much faster than the original SQL.

Although the steps to the final rewrite is a little bit complicated, this kind of rewrites can be achieved by Tosska SQL Tuning Expert for SQL Server automatically, it shows that the rewrite is more than 90 times faster than the original SQL.  

Tosska SQL Tuning Expert (TSES™) for SQL Server® – Tosska Technologies Limited

How to Tune SQL Statements with CONCAT Operator for MySQL?

oracle sql performance tuning

There may be some business requirements that need to compare concatenate strings and column with a given unknown length of the bind variable. Here is an example SQL that retrieves data from EMPLOYEE and DEPARTMENT tables where employee’s department ID must concatenate two strings before it is compared to an unknown length of variable @dpt_var

select * from employee,department
where concat(concat(‘A’,emp_dept),‘B’) = @dpt_var
and  emp_dept= dpt_id

Here the following are the query plans of this SQL, it takes 23.8 seconds to finish. The query shows a “Full Table Scan Employee” to nested loop Department table.

You can see that this SQL cannot utilize index scan even the emp_dept is an indexed field. Let me add a “force index(EMPS_DPT_INX) hints to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition emp_dept >= ” , it is an always true condition that emp_dept should be greater or equal to a smallest empty character. It is to fool MySQL SQL optimizer that emp_dept’s index is a reasonable step.

select  *
from  employee force index(EMPS_DPT_INX),
     department
where  concat(concat(‘A’,emp_dept),‘B’) = @dpt_var
     and emp_dept >= ”
     and emp_dept = dpt_id

Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows that an Index Range Scan is used for Employee table first and then nested loop Department table.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL Statement with DECODE Expression for Oracle?

sql tuning for MySQL

Here the following is an example SQL statement with a DECODE expression syntax.

select  *  from employee
where  decode(emp_dept , ‘AAA’ , ‘ADM’ , ‘AAB’ , ‘ACC’ , emp_dept) = ‘ADM’

Here the following are the query plans of this SQL, it takes 6.41 seconds to finish. The query shows a Full Table Scan of EMPLOYEE table due to the DECODE expression cannot utilize the EMP_DEPT column’s index.

We can rewrite the DECODE expression into the following semantical equivalent SQL statement with multiple OR conditions.

SELECT   *
FROM      employee
WHERE  emp_dept = ‘AAA’
         AND ‘ADM’ = ‘ADM’
         OR  NOT ( emp_dept = ‘AAA’ )
              AND emp_dept = ‘AAB’
              AND ‘ACC’ = ‘ADM’
         OR  NOT ( emp_dept = ‘AAA’
                       OR emp_dept = ‘AAB’ )
              AND emp_dept = ‘ADM’

Here is the query plan of the rewritten SQL and the speed is 0.41 seconds. It is 15 times better than the original syntax. The new query plan shows a BITMAP OR of two INDEX RANGE SCAN of EMP_DEPT index.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for Oracle automatically, there are other rewrites with even better performance, but it is not suitable to discuss in the short article, maybe I can discuss later in my blog.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL statement with Transitive Dependency Improvement for MySQL?

oracle query optimizer too

The following is an example shows a SQL statement with two conditions “emp_dept=dpt_id and emp_dept<‘L’”

select  *  from employee,department
where  emp_dept=dpt_id
  and  emp_dept<‘L’
  and  emp_id<1500000
  and  emp_salary= dpt_avg_salary
order    by  dpt_avg_salary

Here the following is the query plan of this SQL in Tosska proprietary tree format, it takes 8.84 seconds to finish.

The query plan looks reasonable that shows a full table scan of DEPARTMENT to nested-loop EMPLOYEE table, the records in EMPLOYEE table being nested-loop must satisfy with the condition “emp_id<1500000” and the corresponding index EMPS_SALARY_INX is also used. Due to the number of records in the first driving table in a Nested Loop Join is very critical to the join performance, we should find a way to narrow down the number of result records of DEPARTMENT table before it is used to nested-loop EMPLOYEE table.

As the conditions “emp_dept=dpt_id and emp_dept<‘L’”, it implies that “dpt_id < ‘L’” is also true, let me add this extra condition to the SQL, it helps MySQL SQL optimizer to make a better decision with more information provided by the new SQL syntax, this technique is especially useful for MySQL database.
Remark:
Oracle or MS SQL Server are doing very good on their internal Transitive Dependency Improvement in their SQL optimizer already, so this technique may not work for Oracle and MS SQL Server.

select      *
from        employee,
       department
where     emp_dept = dpt_id
    and dpt_id < ‘L’
    and emp_dept < ‘L’
    and emp_id < 1500000
    and emp_salary = dpt_avg_salary
order by dpt_avg_salary

Let’s see the DEPARTMENT is now being filtered by the new condition “dpt_id < ‘L’ “ with an index range scan. You can see the estimated Rows 401 of DEPARTMENT table is now being trimmed down to 176. The rewritten SQL now takes only 3.8 seconds with such a simple change in syntax.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that this rewrite is more than 2 times faster than the original SQL with such an easy change in the syntax.
https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL Statement with Driving Path Control for MySQL?

sql tuning for MySQL

The following is an example shows a SQL statement with two potential table join paths.  “Employee to Department” and “Department to Employee” are potential driving paths which will be decided by MySQL SQL optimizer during SQL optimization stage.  

select  emp_id,emp_name,dpt_avg_salary
  from employee ,department
where emp_dept=dpt_id
and emp_dept like ‘A%’
and dpt_id like ‘A%’;

Here the following is the query plan selected by MySQL SQL optimizer in Tosska proprietary tree format, it takes 59 seconds to finish.

The query plan looks reasonable that uses DEPARTMENT’s Primary Key (DPT_ID) to fetch DEPARTMENT table first and then nested loop EMPLOYEE table by EMP_DEPT index, the speed of this query plan depends on the size of EMPLOYEE table and the records distribution according to the EMP_DEPT code.

If we want to change the driving path of the query plan from EMPLOYEE to DEPARTMENT, let me add a ifnull(dpt_id,dpt_id) dummy function to disable the DPT_ID index search, so it can artificially add cost to condition search DEPARTMENT table first. It means that using EMP_DEPT index search cost is relative cheaper now, so EMPLOYEE to DEPARTMENT driving path is probably be selected by MySQL SQL optimizer in the following:

select  emp_id,
        emp_name,
        dpt_avg_salary
from     employee,
        department
where  emp_dept = dpt_id
        and emp_dept like ‘A%’
        and ifnull(dpt_id,dpt_id) like ‘A%’

EMPLOYEE to DEPARTMENT driving path plan is generated by MySQL now and it takes only 18.8 seconds only to finish the query.

If we know that using the EMP_DEPT index is not that efficient due to the selectivity of “ like ‘A%’ “ condition may not high enough to utilize the index range scan. Let me add an additional ifnull(emp_dept,emp_dept) dummy function to disable the EMP_DEPT index range scan too in the following:

select  emp_id,
        emp_name,
        dpt_avg_salary
from     employee,
        department
where  emp_dept = dpt_id
        and ifnull(emp_dept,emp_dept) like ‘A%’
        and ifnull(dpt_id,dpt_id) like ‘A%’

Now, MySQL use full table scan of EMPLOYEE table to nested loop DEPARTMENT table. The speed is further improved to 15 seconds now.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the best rewrite is around 4 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to Tune SQL with IN Subquery in Certain Environment for Oracle?

performance tuning in sql MySQL

Here is an example SQL that retrieves data from EMPLOYEE table with employee’s ID code in EMP_SUBSIDIARY table.

select * from employee a
where a.emp_id
       in (select b.emp_id
             from emp_subsidiary b)

Here the following are the query plan of this SQL, it takes 10.03 seconds to finish.  The query plan is very simple since the syntax of the SQL is not complicated. The query plan shows a full table scan of EMPLOYEE and then nested to the index of EMPSB_EMP_ID, it looks like a good query plan, but I wonder if it is too expensive to have a full table scan of EMPLOYEE table?

In order to ask Oracle to consider other query plans, I added a dummy function Coalesce(b.emp_id,b.emp_id) in the subquery’s select list that artificially adding cost to the driving path from EMPLOYEE to EMP_SUBSIDIARY due to the index EMPSB_EMP_ID is disabled by this dummy function.

SELECT  *
FROM      employee a
WHERE  a.emp_id   IN (SELECT Coalesce(b.emp_id,b.emp_id)
                           FROM     emp_subsidiary b)

The rewritten SQL generates an adaptive query plan and a “nested loops” from EMPSB_EMP_ID to EMPLOYEE by EMPLOYEE_PK index. You can remove the steps in “Id” column with marked ‘-‘ from the plan to verify the result plan. The new plan now takes 4.13 seconds to finish only.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 2 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL with Exists Operator in Certain Environment for Oracle?

improve oracle database performance

Here is an example SQL that retrieves data from EMPLOYEE table with “emp_id < 710000” and employee’s department code exists in DEPARTMENT table.

select  *
  from   employee
 where emp_id < 710000
      and  exists (select  ‘x’
                    from department
                 where dpt_id = emp_dept)

Here the following are the query plan of this SQL, it takes 34.22 seconds to finish.  The query plan is very complicated, although the SQL is quite simple. It is not abnormal that Oracle uses a complex solution to solve simple data retrieval.  This kind of complex plan steps is suitable for certain environments, but not for a simple database like this. I call it over-optimized query plan, which is due to the under estimated cost of this query plan. For complex plan like this, the cost estimation error is easily be amplified from step to step within the chain of plan steps.

In order to ask Oracle to consider other query plans, I rewrite the EXISTS to IN with a new “group by dpt_id” operation that force Oracle SQL optimizer to execute the subquery first.

SELECT  *
FROM     employee
WHERE  emp_id < 710000
       AND  emp_dept IN (SELECT      dpt_id
                            FROM         department
                            GROUP BY  dpt_id)

The rewritten SQL generates a simpler query plan and it is actually running faster with 5.59 seconds only.

This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 6 times faster than the original SQL. There is a SQL rewrite with even better performance, it is a little bit complicated to discuss in this short article here. May be we can discuss later.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/