The following is an example shows an Update SQL statement with an “IN” subquery. It updates records from emp_subsidiary that satisfies the “IN” subquery conditions.
update emp_subsidiary set emp_name =‘Deleted Name’
where emp_dept in
(select dpt_id from department
where dpt_avg_salary<=6000);
Here the following is the query plan of this SQL, it takes 7.55 seconds to finish the update. The query shows an attached_subqueries attached to a Full Index Scan of emp_subsidiary table. It means that the 295344 rows in emp_subsidiary is going to check the subquery’s conditions one by one.
Let me rewrite the SQL into the following join update syntax.
update emp_subsidiary e1, department d1
set e1.emp_name=‘Deleted Name’
where e1.emp_dept = d1.dpt_id
and d1.dpt_avg_salary <= 6000
The following is the query plan of the rewritten SQL and it takes only 1.22 seconds to complete. The new query plan shows a “Nested Loop” from Department table to Emp_subsidiary table, due to the condition “dpt_avg_salary <= 6000” has been executed before it is going to loop the Emp_subsidiary table, it saved a lot of unnecessary time to detect every record in Emp_subsidiary table.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 6 times faster than the original SQL.
It is common that the performance is not good if a SQL statement with OR conditions. Let’s have an example show you how to tune those SQL statements in certain situations.
Here is an example SQL that extract records from EMPLOYEE table if (emp_grade < 1050 or emp_id<730000). Emp_grade and emp_id are indexed and they are not null field.
select * from employee
where emp_grade < 1050 or emp_id<730000
You can see MySQL SQL Optimizer use an Index Merge of emps_grade_inx and employee_pk to process the SQL, the performance is not good as expected since the result set is quite big for sort_union operation. It takes more than 40 seconds to finish the data retrieval. Let me rewrite the OR condition into the following UNION ALL statement, please make sure the emp_grade and emp_id are not null column, otherwise it may generate error result. The rewrite is simple that the first part extract data with emp_grade<1050, the second part of the UNION ALL retrieve records that satisfied with emp_id<730000, but it is not retrieved in the first part of the UNION ALL.
select *
from employee
where emp_grade < 1050
union all
select *
from employee
where not ( emp_grade < 1050 )
and emp_id < 730000
Here the following is the query plan of this SQL, it takes 12.46 seconds to finish. The query shows two “Index Range Scan” of EMPLOYEE_PK and EMPS_GRADE_INX to the employee table.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.
It is common that we used to count the number of records in a table. You may encounter unexpected performance degradation in certain situations.
Here is an example SQL that count number of records from EMPLOYEE table. There are number of indexes are built such as emp_id, emp_dept, emp_grade, emp_hire_date and etc….
SELECT COUNT(*)
FROM EMPLOYEE;
You can see MySQL SQL Optimizer use a Full Index Scan of EMP_HIRE_DATE index, the performance is bad since unnecessary random reads is needed and it takes 3 minutes and 6 seconds to count a 3 million records in my computer. I want to make use of Index Range Scan for specific index, let me rewrite the above SQL into the following syntax. If you know EMP_GRADE is indexed and it is not a nullable column, you can add a dummy condition EMP_GRADE>=’’. It fools MySQL SQL optimizer to consider using EMP_GRADE range index to retrieve the records and it is successfully generate a new plan in the following:
select COUNT(*)
from EMPLOYEE
where EMP_GRADE >= ‘ ‘
Here the following is the query plan of this SQL, it takes 2.6 seconds to finish. The query shows an “Index Range Scan” of employee table.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 71 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.
There may be some business requirements that need to compare certain part of a column as a data retrieval criteria.
Here is an example SQL that retrieves data from EMPLOYEE table employee’s name with a string pattern “Acco” start from 5 character of the emp_name.
select *
from employee
where substr(emp_name,5,4)=‘Acco’
Here the following are the query plans of this SQL, it takes 17 seconds to finish. The query shows a “Full Table Scan Employee”
You can see that this SQL cannot utilize index scan even the emp_name is indexed field. Let me add a “Force Index(emp_name_inx)“ hints to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition emp_name >= ‘ ‘ , it is an always true condition that emp_name should be greater or equal to a smallest empty character.
select *
from employee force index(emp_name_inx)
where substr(emp_name,5,4) = ‘Acco’
and emp_name >= ‘ ‘
Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows that an Index Range Scan is used now.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 6 times faster than the original SQL.
There may be some business requirements that needs to retrieve only some rows from a table (or join tables) randomly. This kind of SQL is normally hard to tune.
For example the following SQL retrieve one row from two tables join of Employee and Department, where Employee’s department code is ‘AAA’, and both Emp_dept and Dpt_id columns are indexed.
select *
from employee ,department
where emp_dept=dpt_id
and emp_dept = ‘AAA’
order by rand()
limit 1
Here the following is the query plan of this SQL, it takes 3.126 seconds to finish. The query shows a nested loop from Department table to Employee table to extract all records with ‘AAA’ department code. An Order operation is executed followed from the join result.
You can see the most expensive step is to extract all Employee data with department code “AAA”, if there is an unique key such Employee ID (EMP_ID) which can uniquely identify a row from the query, you can use the With common table expressions in MySQL version 8 to randomly select rows from the KEY column only with the same conditions given by the original query, for example the following blue colored SQL text in “With” clause, it randomly select 1 row of EMP_ID from the join query. It not only significantly reduces the size of the data retrieved from Employee table, but it also shrank the size of Order By operation. And then the main query will use the selected EMP_ID to extract specific row from original query, so the whole query will run faster with this new rewrite syntax.
with DT1
as (select EMP_ID
from employee,
department
where emp_dept = dpt_id
and emp_dept = ‘AAA’
order by rand() limit 1)
select *
from employee,
department
where emp_dept = dpt_id
and emp_dept = ‘AAA’
and EMP_ID in (select EMP_ID
from DT1)
order by rand() limit 1
/* Remark: “order by rand() limit 1” is used to make sure that only 1 row will be selected if the EMP_ID cannot uniquely identify only one row */
Here is the query plan of the rewritten SQL with less cost and run much faster.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 100 times faster than the original SQL.
There may be some business requirements that need to retrieve the first N number of rows from a join tables. Some people may encounter unexpected performance problem.
Here is an example SQL that retrieves first 1000 row from a join tables of Employee and Department in the order of dpt_id. Where emp_dept and dpt_id columns are both indexed.
select *
from employee ,department
where emp_dept=dpt_id
order by dpt_id
limit 1000
Here the following are the query plans of this SQL, it takes 1 mins and 8 seconds to finish. The query shows a nested loop from “Full Table Scan Employee” to “Unique Key Lookup Department” table to extract all records. An “Order by” operation of dpt_id of Department table is executed followed from this join result.
You can see the most expensive step is to extract all Employee data with Department data and then an Order by dpt_id of Department table. Let’s see if we rewrite the original SQL text into the syntax that “order by dpt_id” is changed to “order by emp_dept”. It is a semantically eqvialent rewrite of original SQL statement since “emp_dept = dpt_id”.
select *
from employee,
department
where emp_dept = dpt_id
order by emp_dept
limit 1000
Here is the query plan of the rewritten SQL with less cost and run much faster. The new query plan shows that no “Use temporary; Using filesort” in Tabular Explain, it means the Nested Loop operation will be stop at the Limit 1000 records is done.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 70 times faster than the original SQL.
The LIKE is a logical operator that determines if a character string matches a specified pattern. A pattern may include regular characters and wildcard characters. The LIKE operator is used in the WHERE clause of the SELECT, UPDATE, and DELETE statements to filter rows based on pattern matching.
Here is an example SQL that retrieves data from EMPLOYEE table employee’s name with a string pattern like “Mary%”. If the emp_name is indexed, the following SQL will utilize Index Range Scan of the emp_name and the speed of the SQL will be fine.
select *
from employee
where emp_name like ‘Mary%’;
If user is looking for emp_name with pattern like ‘%Mary%’, MySQL SQL Optimizer cannot user emp_name index to speed up the process, full table scan is normally be used and the performance will be bad too.
select *
from employee
where emp_name like ‘Mary%’;
Here the following are the query plan of this SQL, it takes 18.8 seconds to finish. The query shows a “Full Table Scan” of employee table.
You can see that this SQL cannot utilize index scan even the emp_name is indexed. Let me add a “Force Index(emp_name_inx)“ hints to the SQL and hope it can help MySQL SQL optimizer to use index scan, but it fails to enable the index scan anyway, so I add one more dummy condition emp_name >= ” , it is an always true condition that emp_name should always greater or equal to a smallest empty character.
select *
from employee force index(emp_name_inx)
where emp_name like ‘%Mary%’
and emp_name >= ‘ ‘;
Here is the query plan of the rewritten SQL and it is running faster. The new query plan shows that an Index Range Scan is used now.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 5 times faster than the original SQL.
Common SQL tuning methods
SQL tuning is the process to improve a SQL statement’s performance up to the user’s expectation. There are at least three methods which are commonly used by developers or DBAs, These methods are creating new indexes, rewriting the SQL syntax and applying Oracle Optimizer Hints to the SQL statements. Each method has its pros and cons and suitable for different stages of application cycle. Let’s discuss these three methods in the following.
Create new indexes for SQL statement
Creating new indexes for SQL statements are a very common method to improve SQL performance, it is especially important during database development. As new indexes to a SQL statement are not only affecting current SQL, it is also affecting other SQL statements running on the same database. So, it should be used very carefully in in production database. Normally, users are required to make impact analysis to other relevant SQL statements for the newly-created indexes.
Rewrite SQL statement
There are a lot of people teaching SQL rewrite skills over the internet. Most of those rewrite rules are inherited from Oracle rule-based SQL optimizer or older version of cost-based SQL optimizer. For example, some people may think that the following SQL may have different performances: Select * from A where A.unique_key in (select B.unique_key from B);
Select * from A where exists (select ‘x’ from B where A.unique_key=B.unique_key);
Actually, if you put these two SQLs into Oracle database, you may probably get the same query plan from Oracle; it is because Oracle SQL optimizer will rewrite these two SQLs into one standard form internally to enable better query plans generation. A stronger internal rewrite ability was developed by Oracle SQL optimizer in last two decades. So, some obvious problems such as “KEY IS NOT NULL” or “NVL(KEY,’ABC’) =’ABC’ ” were not able to use indexes are solved by Oracle already. Of course, there are still some limitations in Oracle SQL optimizer for complex SQL transformation, so experience users may still be able to tune a SQL statement through SQL rewrite to influence Oracle SQL optimizer to generate a better query plan. But this approach is getting more difficult to apply by DBAs since the relationship between SQL syntax and final query plan generation is getting weaker, this is because Oracle cost-based optimizer (CBO) is getting smarter and the syntax of the SQL text is no longer a dominating factor to the cost calculation and plans generation.
SQL rewrite is still useful both in development and production database, since it is an isolated change to a database and no other SQLs’ performance will be affected, and it is safer than building new indexes. But it requires SQL code changes in program sources, so unit test or integration test may still be necessary. In contrast, using hints to tune a SQL is relatively safer, since it is not easy to change the semantics of the SQL accidentally.
Apply hints to SQL statement
Most databases in the market provide some sorts of query plan control language to help its optimizer to better optimize a specific SQL statement. For example; Optimization Guidelines of IBM LUW and Plan Guides of SQL Server are provided for many years already. Oracle provides Optimizer Hints that embedded with SQL text as a remark to tell Oracle SQL optimizer how to optimize the SQL statement. As hints will not affect the semantic of the SQL statement, so it is relatively safer than SQL rewrite and building new indexes. There are more than a hundred of documented Optimize Hints provided by Oracle. Unless you are an expert in using Optimizer Hints; using the right hints to tune a SQL is not easy to be mastered even by a human expert, since it is close to a million of permutations if we just pick 3 hints out from a hundred of hints.
Let’s use a simple SQL example with the follow Optimizer Hints to show how Optimizer Hints works, the Hints is used to tell Oracle SQL optimizer to use index of DEPARTMENT table during query plan selection if possible. Oracle will try to find the lowest cost plan among all plans with index retrieval of DEPARTMENT table. Select /*+ INDEX(@QB2 DEPARTMENT) */ *
From employee
Where emp_dept in (Select /*+ QB_NAME(QB2) */ dpt_id
From department
Where dpt_name LIKE ‘D%’);
Oracle SQL Hints is not a programing language
Oracle Optimizer Hints is not a programing language that come with proper syntax check or error return. It means an invalid Hints instruction to a SQL statement that Oracle SQL optimizer will not return with error message. Furthermore, even if it is a valid Hints instruction that Oracle SQL optimizer actually cannot comply with, there will be no error message returned too. So, users have to do a lot of trial and error before it can influence SQL optimizer to generate a specific better query plan.
Knowing the solution
It is a very common tuning practice that people are trying to find the best query plan for a bad performance SQL statement, it works like a human mimic Oracle SQL optimizer’s job to optimize a SQL statement with human’s knowledge. If the SQL statement is simple and Oracle SQL optimizer had made an obvious mistake, probably human intervention may work fine for this kind of simple SQL. Here is an example: Select * from A where A.KEY<’ABC’;
If Oracle SQL optimizer fails to use index to retrieve records from table A and using index of KEY1 is actually a better solution. You can use Optimizer Hints to instruct Oracle to use index instead of full table scan for this SQL statement. Select /*+ INDEX(KEY1 KEY1_INX) */ from A where A.KEY1<’ABC’ and A.KEY2<123 and A.KEY3<’C’;
Knowing the best solution is easy for simple SQL statements, but it is difficult for complex SQL statements. Since there are a lot of execution steps in the query plan for complex SQL, human beings are not able to estimate each step and work out a series of query steps to compose the best performance query plan. So, using Oracle Optimizer Hints to instruct SQL optimizer to generate specific query plan for complex SQL may not be easy even for an experienced SQL tuning expert.
Knowing the problem
Instead of knowing the solution of a SQL statement, it is relatively easier for a human expert to find where the problem is in a complex query plan. The way to tell Oracle to bypass the problem is applying hints with prefix “NO_” such as NO_INDEX or NO_USE_HASH. It tells Oracle not to use the specified operation in the query plan and select another operation instead with the lowest cost. This approach is not commonly adopted in the market due to people are normally bound by solution oriented thinking.
For example: Select /*+ NO_INDEX(KEY2 KEY2_INX) */ * from A where A.KEY1<’ABC’ and A.KEY2<123 and A.KEY3<’C’;
If you think the KEY2_INX is not a good index to retrieve records from table A, you can disable the KEY2 index by applying /*+ NO_INDEX(KEY2 KEY2_INX) */ and let Oracle to select other index to retrieve the records.
Unlock the potential power of Oracle optimizer hints
Whether you are using solution oriented approach or problem bypassing approach, you need to know the details of the SQL’s query plan, data distribution, cost of each step, and try to predict what will be the final actual aggregated cost of the plan. Sometimes you have to use both techniques for complex SQL statements; you have to provide the best query steps for parts of the query plan and use “NO_*” to bypass those known issues in the query plan. It is a very complicated process and it is not easy to carry out by common SQL developers without in-depth knowledge of SQL tuning.
The following is an example shows a SQL that took 8.56 seconds to finish and the query plan looks normal. The SQL syntax is intact, so it is not much room for SQL rewrite to take place. May be parallel execution can help to improve the SQL statement, but which table or index should be used for parallel execution and how the new parallel execution steps are affecting the entire query plan. It is not an easy job even for an experienced SQL tuning expert. That is why the potential of Oracle optimizer hints is not be fully explored from the beginning.
With the help of a latest AI algorithm, a computer-searching engine can dramatically release user effort to discover the combination of Hints without going through a huge Hints permutation space. It makes Hints SQL tuning become easier and can solve more problems than you expected.
The following solution shows a series of hints combination that tell Oracle not to use EMPLOYEES.EMPSS_GRADE_INX index and exclude Hash Join to join table DEPARTMENTS and then use parallel index scan of table EMPLOYEES. It makes a new query plan that runs 70% faster than original plan. The whole tuning process is finished without human intervention and the result is safe since it does not involve any syntax rewrite.
Tuning SQL without touching the its SQL text
If you are a packaged application user, you might be wondering how you can tune your SQL if you can’t edit a query directly. Or, if you are an application developer, you want to have a quick fix on SQL performance without the time to go thought source code change and unit test. There are multiple features provided by Oracle such SQL Profiles, SQL Plan Baselines and SQL patch that you can use to tell Oracle to fix a SQL’s query plan without the need to change the SQL text. But the using of these features are limited by Hints injection only, you cannot rewrite a SQL with different syntax and ask the original SQL to accept a rewritten SQL’s query plan. So, hints-based SQL tuning is becoming more important in today’s rapidly changing database applications.
Actually, SQL performance should be detached from application source code, the SQL performance should be manageable to deploy and rollback anytime, anywhere. It should also be tailorable for one source SQL code to fit different sizes of databases. Hints-based SQL tuning will be the key to unlock the potential power of SQL performance management for Oracle databases and it is becoming more important in modern database SQL tuning process.