How to Tune SQL with DB Link for Oracle I?

Here is an example SQL query used to calculate the average salary of employees on the remote database @richdb in each department in the local database whose department name starts with the letter “D”.

SELECT   Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

Here the following is the query plan of this SQL, it takes 9.16 seconds to finish.  The query plan shows a Nested Loops from DEPARTMENT in local to EMPLOYEE in the remote database. Due to the size of the EMPLOYEE table being much larger than that of the DEPARTMENT table, the nested loop join path is not optimal in this case.

To ask Oracle to consider doing the join operation in the remote database @richdb, I added a Hint  /*+ DRIVING_SITE(employee) */ to tell Oracle to use EMPLOYEE table’s database @richdb as the driving site for the distributed query.

SELECT   /*+ DRIVING_SITE(employee) */ Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

The following query shows the driving site is changed to @richdb and remote retrieves DEPARTMENT data from the “local” database. Now the speed is improved to 5.94 seconds. But the query plan shows a little bit complicated, there is a view that is construed by a Hash Join of two “index fast full scan” of indexes from EMPLOYEE and DEPARTMENT.

I further change the SQL and added a dummy operation Coalesce(dpt_id,dpt_id) in the select list of the subquery to block the index fast full scan of the DEMPARTMENT table.

SELECT   /*+ DRIVING_SITE(employee) */ Avg(emp_salary),
         emp_dept
FROM     employee@richdb
WHERE    emp_dept IN (SELECT Coalesce(dpt_id,dpt_id)
                      FROM   department
                      WHERE  dpt_name LIKE ‘D%’)
GROUP BY emp_dept

The change gives the SQL a new query plan shown in the following, the performance significantly improved to 0.71 seconds. You can learn how the dummy operation Coalesce(dpt_id,dpt_id) affected the Oracle SQL optimizer decision in this example.

This kind of rewrite can be achieved by Tosska DB Ace for Oracle automatically, it shows that the rewrite is almost 13 times faster than the original SQL.

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

How to Tune SQL Statement with CASE Expression by Hints Injection for Oracle?

Here the following is a simple SQL statement with a CASE expression syntax.

SELECT *
FROM   employee
WHERE
      CASE
      WHEN emp_salary< 1000
      THEN  ‘low’
      WHEN emp_salary>100000
      THEN  ‘high’
      ELSE  ‘Normal’
      END = ‘low’

Here the following are the query plans of this SQL, it takes 4.64 seconds to finish. The query shows a Full Table Scan of the EMPLOYEE table due to the CASE expression cannot utilize the emp_salary index. It is because the CASE statement disabled the index range search of the emp_salary index.

Commonly, we will try to enable index search by forcing the SQL with an Index hint as the following:

SELECT/*+ INDEX(@SEL$1 EMPLOYEE) */ *
FROM   employee
WHERE CASE
      WHEN emp_salary < 1000
      THEN  ‘low’
      WHEN emp_salary > 100000
      THEN  ‘high’
      ELSE  ‘Normal’
     END = ‘low’

Although the CASE statement disabled the index range search of the emp_salary index, an index full scan is now enabled to help filter the result more quickly compared with the original full table scan of the EMPLOYEE table.

This hint injection takes 0.38 seconds and it is 12 times faster than the original SQL will full table scan. For this kind of SQL statement that you cannot change your source code, you can use SQL Patch with the hints and SQL text deployed to the database without the need of changing your source code.

If you can modify your source code, the best performance will be to rewrite the CASE expression into the following syntax with multiple OR conditions.

SELECT *
FROM   employee
WHERE emp_salary < 1000
     AND ‘low’ = ‘low’
     OR NOT  ( emp_salary < 1000 )
        AND  emp_salary > 100000
        AND  ‘high’ = ‘low’
     OR NOT  ( emp_salary < 1000
           OR emp_salary > 100000 )
        AND  ‘Normal’ = ‘low’

The new query plan shows an INDEX RANGE SCAN OF emp_salary index.

This kind of rewrite and hints injection can be achieved by Tosska SQL Tuning Expert Pro for Oracle automatically,

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to Tune SQL Statement with LCASE function on index field?

Some business requirements may need to compare the lower case of an indexed column to a given string as a data retrieval criterion.

Here is an example SQL that retrieves records from the EMPLOYEE table employee if the lower case of the name is equal to the string ‘richard’.

select  *
  from employee
where LCASE(emp_name)=‘richard’

Here the following are the query plans of this SQL, it takes 17 seconds to finish. The query shows a “Full Table Scan Employee”  

You can see that this SQL cannot utilize index scan even if the emp_name is an indexed field. Let me add a “Force Index(emp_name_inx)“hint to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition “emp_name >= ””, it is an always true condition that emp_name should be greater or equal to a smallest empty character, it is used to increase the cost of not using emp_name_inx index. There is another condition added “emp_name is null” to correct this condition if emp_name is a null value.

select  *
from   employee force index(EMPS_NAME_INX)
where  LCASE(emp_name) = ‘richard’
     and ( emp_name >=
        or emp_name is null )

Here is the query plan of the rewritten SQL and it is running much faster. The new query plan shows that an Index Scan is used now and takes 2.79 seconds only.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to use ROWID to improve an UPDATE statement for Oracle?

Here the following is an Update SQL with a subquery that updates the EMPLOYEE table if the emp_dept satisfies the records returned from a subquery.

update  employee
   set  emp_name = ‘testing’
 where  emp_dept IN (select dpt_id
            from department
          where dpt_name like ‘A%’)
and emp_grade>2000

You can see Oracle uses a Hash join of the DEPARTMENT table and EMPLOYEE table to execute the update process. This query plan takes 1.96 seconds to complete and no index is used even though emp_dept, dpt_id, and emp_grade are indexed columns. It looks like the most expansive operation is the Table Access Full scan of the EMPLOYEE table.

Let’s rewrite the SQL into the following syntax to eliminate EMPLOYEE’s Table Access Full operation from the query plan.  The new subquery with the italic Bold text is used to force the EMPLOYEE to extract records with emp_dept in the DEPARTMENT table with the dpt_name like ‘A%’. The ROWID returned from the EMPLOYEE(subquery) is to make sure a more efficient table ROWID access to the outer EMPLOYEE table.

UPDATE  employee
SET   emp_name=‘testing’
WHERE   ROWID IN (SELECT  ROWID
          FROM   employee
          WHERE  emp_dept IN (SELECT  dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE‘A%’))
     AND emp_grade > 2000

You can see the final query plan with this syntax has a better cost without full table access to the EMPLOYEE table. The new syntax takes 0.9 seconds and it is more than 2 times faster than the original syntax.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert Pro for Oracle automatically, there is another SQL rewrite with similar performance, but it is not suitable to discuss in this short article, maybe I can discuss it later in my blog.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to build indexes for multiple Max() functions for SQL Server?

For some SQL statements with multiple Max() functions in the select list and nothing in the Where clause, we have different methods to create new indexes to improve the SQL speed.

Here is an example SQL, it is to retrieve the maximum name and age from the employee table.
select   max(emp_name),
     max(emp_age)
from  employee

The following is the query plan that takes 9.27 seconds.

The SQL cannot be tuned by SQL syntax rewrite or hints injection, and the SSMS cannot recommend any index to improve the SQL.

For this kind of SQL that we can consider building a composite index or two individual indexes for emp_name and emp_age. A new composite of these two columns (emp_age, emp_name) can improve the SQL around 7 times. The following is the query plan shows that the new composite index is used, but it has to scan the entire index for these two stream aggregate operations before getting the max(emp_name) and max(emp_age).

How about if we build two individual indexes for emp_name and emp_age. The following is the result and query plan of these two indexes created. A Top operator selects the first row from each index and returns to the Stream Aggregate operation, and then a Nested Loops join the two maximum results together. It is 356 times much faster than the original SQL.

This kind of indexes recommendation can be achieved by Tosska SQL Tuning Expert Pro for SQL Server automatically:
Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to build indexes for slow first execution SQL – SQL Server?

You may suffer from SQL statements with a slow first execution time due to the long data cache process. The following SQL is simple that retrieves records from the EMPLOYEE table that if EMP_SALARY < 500000 and the result set is ordered by EMP_NAME.

Select emp_id,
    emp_name,
    emp_salary,
    emp_address,
    emp_telephone
from    employee
where  emp_salary < 500000
order by emp_name;

The following is the query plan that takes 9.51 seconds for the first execution and takes 0.99 seconds for the second execution without data cache.

The SQL cannot be tuned by SQL syntax rewrite or hints injection for both the first execution and the second execution, it is because SQL Server has selected the best query plan for this simple SQL statement. But the problem is that if the condition “where emp_salary < 500000” is changed; say from 500000 to 510000 or the EMPLOYEE data is flushed out from the memory, the execution time will then be prolonged up to 9.51 seconds.

Let’s see if we can build indexes to improve this situation. There is a common perception that a good index can help to improve both the first execution time and the second execution time. So, I use a tool to explore a lot of indexes configurations, but none of them can improve both executions’ performance. Here the following is the performance of the second execution with data cached for different indexes proposed by the tool. You can see the performance of “Index Set 1” is close to the original SQL performance with a little performance variation due to the system’s loading status and all other indexes sets are worse than the original SQL. Normally, we will give up the tuning of the SQL statement without even trying to see whether those recommended indexes are good for the first execution time.

I did a test for those recommended indexes to see whether they are helpful to improve the first execution time, it surprises me that the “Index Set 1” is tested with a significant improvement and improves the first execution time from 9.51 seconds to 0.65 seconds. It is a 14 times improvement that can make my database run more efficiently. So, you should be very careful to tune your SQL with new indexes that may not be good for your second execution with all data cached, but it may be very good for your first execution without data cached.

This kind of indexes recommendation can be achieved by Tosska SQL Tuning Expert Pro for SQL Server automatically.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to index SQL with aggregate function SQL for Oracle?

Here the following is an example SQL shows you that select the maximum emp_address which is not indexed in the EMPLOYEE table with 3 million records, the emp_grade is an indexed column.

select max(emp_address) from employee a
where emp_grade<4000

As 80% of the EMPLOYEE table’s records will be retrieved to examine the maximum emp_address string. The query plan of this SQL shows a Table Access Full on EMPLOYEE table is reasonable.

How many ways to build an index to improve this SQL?
Although it is simple SQL, there are still 3 ways to build an index to improve this SQL, the following are the possible indexes that can be built for the SQL, the first one is a single column index and the 2 and 3 are the composite index with a different order.
1. EMP_ADDRESS
2. EMP_GRADE, EMP_ADDRESS
3. EMP_ADDRESS, EMP_GRADE

Most people may use the EMP_ADDRESS as the first choice to improve this SQL, let’s see what the query plan is if we build a virtual index for the EMP_ADDRESS column in the following, you can see the estimated cost is reduced by almost half, but this query plan is finally not being used after the physical index is built for benchmarking due to actual statistics is collected.

The following query shows the EMP_ADDRESS index is not used and the query plan is the same as the original SQL without any new index built.

Let’s try the second composite index (EMP_GRADE, EMP_ADDRESS), the new query plan shows an Index Fast Full Scan of this index, it is a reasonable plan which no table’s data is needed to retrieve. So, the execution time is reduced from 16.83 seconds to 3.89 seconds.

Let’s test the last composite index (EMP_ADDRESS, EMP_GRADE) that EMP_ADDRESS is placed as the first column in the composite index, it creates a new query plan that shows an extra FIRST ROW operation for the INDEX FULL SCAN (MIN/MAX), it highly reduces the execution time from 16.83 seconds to 0.08 seconds.

So, indexing sometimes is an art that needs you to pay more attention to it, some potential solutions may perform excess your expectation.

The best index solution is now more than 200 times better than the original SQL without index, this kind of index recommendation can be achieved by Tosska SQL Tuning Expert for Oracle automatically.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to use FORCE INDEX Hints to tune an UPDATE SQL statement?

improve performance of sql query

We used to use FORCE INDEX hints to enable an index search for a SQL statement if a specific index is not used. It is due to the database SQL optimizer thinking that not using the specific index will perform better.  But enabling an index is not as simple as just adding an index search in the query plan, it may entirely change the structure of the query plan, which means that forecasting the performance of the new Force Index hints is not easy. Here is an example to show you how to use FORCE INDEX optimization hints to tune a SQL statement.

A simple example SQL that updates EMP_SUBSIDIARY if the emp_id is found in EMPLOYEE with certain criteria.

update EMP_SUBSIDIARY set emp_name=concat(emp_name,'(Headquarter)’)
where emp_id in
(SELECT emp_id
  FROM EMPLOYEE
WHERE  emp_salary <1000000
   and emp_grade<1150)

Here the following is the query plan of this SQL, it takes 18.38 seconds. The query shows a Full Table Scan of EMPLOYEE and then Nested Loop to EMP_SUBSIDIARY with a Unique Key Lookup of Emp_sub_PK index.

We can see that the filter condition “emp_salary <1000000 and emp_grade<1150” is used for the full table scan of EMPLOYEE. The estimated “filtered (ratio of rows produced per rows examined): 3.79%”, it seems the MySQL SQL optimizer is failed to use an index to scan the EMPLOYEE table. We should consider forcing MySQL to use either one of emp_salary or emp_grade index.

Unless you fully understand the data distribution and do a very precise calculation, otherwise you are not able to tell which index is the best?

Let’s try to force the index of emp_salary first.

update   EMP_SUBSIDIARY
set    emp_name=concat(emp_name,‘(Headquarter)’)
where emp_id in (select  emp_id
         from    EMPLOYEE FORCE INDEX(`emps_salary_inx`)
         where  emp_salary < 1000000
           and emp_grade < 1150)

This SQL takes 8.92 seconds and is 2 times better than the original query plan without force index hints.

Let’s try to force the index of emp_grade again.

update   EMP_SUBSIDIARY
set    emp_name=concat(emp_name,‘(Headquarter)’)
where emp_id in (select  emp_id
         from    EMPLOYEE FORCE INDEX(`emps_grade_inx`)
         where  emp_salary < 1000000
           and emp_grade < 1150)

Here is the result query plan of the Hints FORCE INDEX(`emps_grade_inx`) injected SQL and the execution time is reduced to 3.95 seconds. The new query plan shows an Index Range Scan of EMPLOYEE by EMP_GRADE index, the result is fed to a subquery2(temp table) and Nested Loop to EMP_SUBSIDIARY for the update. This query plan’s estimated cost is lower and performs better than the original SQL. It is due to the limited plan space in the real-time SQL optimization process, so this query plan cannot be generated for the original SQL text, so manual hints injection is necessary for this SQL statement to help MySQL database SQL optimizer to find a better query plan.

This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the Hints injected SQL is more than 4.6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to build indexes for multiple Max() functions for SQL Server?

For some SQL statements with multiple Max() functions in the select list and nothing in the Where clause, we have different methods to create new indexes to improve the SQL speed.

Here is an example SQL, it is to retrieve the maximum name and age from the employee table.

select max(emp_name),
     max(emp_age)
from  employee

The following is the query plan that takes 9.27 seconds.

The SQL cannot be tuned by SQL syntax rewrite or hints injection, and the SSMS cannot recommend any index to improve the SQL.

For this kind of SQL that we can consider building a composite index or two individual indexes for emp_name and emp_age.  A new composite of these two columns (emp_age, emp_name) can improve the SQL around 7 times. The following is the query plan shows that the new composite index is used, but it has to scan the entire index for these two stream aggregate operations before getting the max(emp_name) and max(emp_age).

How about if we build two individual indexes for emp_name and emp_age. The following is the result and query plan of these two indexes created. A Top operator selects the first row from each index and returns to the Stream Aggregate operation, and then a Nested Loops join the two maximum results together. It is 356 times much faster than the original SQL.

This kind of indexes recommendation can be achieved by Tosska SQL Tuning Expert Pro for SQL Server automatically.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How is the order of the columns in a composite index affecting a subquery performance for Oracle?

MySQL database and sql

We know the order of the columns in a composite index will determine the usage of the index or not against a table. A query will use a composite index only if the where clause of the query has at least the leading/left-most columns of the index in it. But, it is far more complicated in correlated subquery situations. Let’s have an example SQL to elaborate the details in the following.

SELECT D.*
FROM   department D
WHERE EXISTS (SELECT    Count(*)
         FROM     employee E
         WHERE     E.emp_id < 1050000
                AND E.emp_dept = D.dpt_id
         GROUP BY  E.emp_dept
         HAVING    Count(*) > 124)

Here the following is the query plan of the SQL, it takes 10 seconds to finish. We can see that the SQL can utilize E.emp_id and E.emp_dept indexes individually.

Let’s see if a new composite index can help to improve the SQL’s performance or not, as a rule of thumb, a higher selectivity column E.emp_id will be set as the first column in a composite index (E.emp_id, E.emp_dept).

The following is the query plan of a new composite index (E.emp_id, E.emp_dept) and the result performance is not good, it takes 11.8 seconds and it is even worse than the original query plan.

If we change the order of the columns in the composite index to (E.emp_dept, E.emp_id), the following query plan is generated and the speed is improved to 0.31 seconds.

The above two query plans are similar, the only difference is the “2” operation. The first composite index with first column E.emp_id uses an INDEX RANGE SCAN of the new composite index, but the second query plan uses an INDEX SKIP SCAN for the first column of E.emp_dept composite index. You can see there is an extra filter operation for E.emp_dept in the Predicate Information of INDEX RANGE SCAN of the index (E.emp_id, E.emp_dept). But the (E.emp_dept, E.emp_id) composite index use INDEX SKIP SCAN without extra operation to filter the E.emp_dept again.

So, you have to test the order of composite index very carefully for correlated subqueries, sometimes it will give you improvements that exceed your expectation.

This kind of index recommendation can be achieved by Tosska SQL Tuning Expert for Oracle automatically.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/