Semi-join is introduced in Oracle 8.0. It provides an efficient method of performing a WHERE EXISTS or WHERE IN sub-queries. A semi-join returns one copy of each row in first table for which at least one match is found in second table, there is no need of further scanning of the second table once a record is found.
in (select emp_dept from EMPLOYEE
where emp_id >3300000)
Here the following is the query plan of this SQL, it takes 13.59 seconds to finish. The query shows a “NESTED LOOPS SEMI” from DEPARTMENT to EMPLOYEE table.
Basically, this SQL is difficult to optimize by just syntax rewrite due to the simplicity of the SQL syntax that Oracle is easily transformed into a canonical syntax internally, so not much alternative query plan can be triggered by syntax rewrite.
Let’s use Hints injection to the SQL and see if there any brutal force of hints injection can trigger a better performance plan. With our A.I. Hints Injection algorithm applying to the SQL, it comes up with a SQL with extraordinary performance improvement that even I cannot understand at the first glance.
SELECT /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ *
WHERE dpt_id IN (SELECT emp_dept
WHERE emp_id > 3300000)
Here is the query plan of the hints injected SQL and it is now running much faster. The new query plan shows that the “INDEX RANGE SCAN” of EMP_DPT_INX to EMPLOYEE table is changed to “INDEX RANGE SCAN DESCENDING” and the estimated cost is the same as the Original SQL.
The Hints /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ injected SQL takes only 0.05 second, it is much faster than the original SQL, the reason behind is the employee records creation order in EMPLOYEE table, the higher the emp_id will be created later, so the corresponding records will be inserted into the right hand side of the EMP_DPT_INX index tree nodes. The “INDEX RANGE SCAN” in the original SQL plan that needs to scan a lot of records from left to right direction before it can hit one record for the condition “WHERE emp_id > 3300000”. In contrast, the Hints injected SQL with the “INDEX RANGE SCAN DESCENDING” operation that can evaluate the WHERE condition with only one scan from right to left on EMP_DPT_INX index tree nodes. That explains why the Hints injected SQL outperformed the original SQL by more than 270 times.
It is common that we employ “transaction id”, “serial no” or “creation date” in our application design, this kind of the records are normally created alone with an increasing sequence order, there may be some SQL in your system can be improved by this technique.
This kind of rewrites or Hints injection can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the Hints injected SQL is more than 270 times faster than the original SQL.
The following is an example shows a SQL statement with two union operator in a subquery. The SQL retrieve records from EMPLOYEE table that EMP_ID should satisfy with the union result set from two queries in a subquery.
select * from employee
where emp_id IN
(select emp_id from emp_subsidiary where emp_grade=1000
select emp_id from employee where emp_dept=‘AAA’)
Here the following are the query plans in Tosska proprietary tree format, it takes 3 minutes 27 seconds to finish.
The query plan shows a full table scan of EMPLOYEE table and the attached subquery will be executed for each of scanned record. So, you can see the query plan is very inefficient. If we know the union result set is small and it should be executed first, and then use EMP_ID index to retrieve EMPLOYEE table. Let me rewrite the Union subquery as a derived table expression in the following:
where emp_id in (select emp_id
from (select emp_id
where emp_grade = 1000
where emp_dept = ‘AAA’) DT1)
Now, you can see the Union subquery is executed first and use it to retrieve the EMPLOYEE table by EMP_ID index. The overall query is now become more reasonable and efficient.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 60 times faster than the original SQL. There are some other rewrites with even better performance, but it is a little bit complicated to discuss in this short article, let’s discuss it in my coming blogs.
Here the following is SQL statement with multiple OR subquery.
WHERE emp_id IN (SELECT emp_id FROM emp_subsidiary where emp_dept = ‘ACC’)
OR emp_id IN (SELECT emp_id FROM employee where emp_dept = ‘COM’)
OR emp_id = 600000
Here the following are the query plans of this SQL, it takes 29 seconds to finish. The query plan shows that the OR conditions are partially converted to Union All statement, the “OR emp_id = 600000” condition is not converted to Union All operation, so three is a full table access on Employee in the query plan is found and most of the time is spent on this step.
Let me rewrite the OR conditions in to a subquery with UNION ALL operations in the following.
WHERE emp_id IN (SELECT emp_id
FROM (SELECT emp_id
WHERE emp_dept = ‘ACC’
WHERE emp_dept = ‘COM’
FROM dual) dt1)
Here is the query plan of the rewritten SQL and the speed is 0.06 seconds. It is 480 times better than the original syntax. The extra “SELECT emp_id” from the “UNION ALL” subquery in green color is used to force the subquery have to be processed in a whole without merging into the main query.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for Oracle automatically, there are other rewrite with even better performance, but it is not suitable to discuss in the short article, maybe I can discuss later in my blog.
The following is an example shows a SQL statement with a variable on the IN List operator. The SQL retrieve records from EMPLOYEE table that (EMP_ID,@1) should match any value in a set of values on the right-hand side.
select * from employee
where (emp_id,@1) in ((1000000,’a’),(2000000,’b’),(3000000,’c’))
Here the following are the query plans in Tosska proprietary tree format, it takes 19 seconds to finish.
The query plan shows a full table scan of EMPLOYEE, it means MySQL cannot decompose the IN list syntax into a better syntax for cost evaluation and no index scan is used.
Let me rewrite the IN list into multiple OR conditions in the following:
where ( ( emp_id = 1000000
and @1 = ‘a’ )
or ( emp_id = 2000000
and @1 = ‘b’ )
or ( emp_id = 3000000
and @1 = ‘c’ ) )
Now, MySQL can utilize Single Row (constant) index search. The speed now is 0.0059 second and is much faster than original SQL.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3200 times faster than the original SQL.
It is common that the performance is not good if a SQL statement with OR conditions. Let’s have an example show you how to tune those SQL statements in certain situations.
Here is an example SQL that extract records from EMPLOYEE table if (emp_grade < 1050 or emp_id<730000). Emp_grade and emp_id are indexed and they are not null field.
select * from employee
where emp_grade < 1050 or emp_id<730000
You can see MySQL SQL Optimizer use an Index Merge of emps_grade_inx and employee_pk to process the SQL, the performance is not good as expected since the result set is quite big for sort_union operation. It takes more than 40 seconds to finish the data retrieval. Let me rewrite the OR condition into the following UNION ALL statement, please make sure the emp_grade and emp_id are not null column, otherwise it may generate error result. The rewrite is simple that the first part extract data with emp_grade<1050, the second part of the UNION ALL retrieve records that satisfied with emp_id<730000, but it is not retrieved in the first part of the UNION ALL.
where emp_grade < 1050
where not ( emp_grade < 1050 )
and emp_id < 730000
Here the following is the query plan of this SQL, it takes 12.46 seconds to finish. The query shows two “Index Range Scan” of EMPLOYEE_PK and EMPS_GRADE_INX to the employee table.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is more than 3 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.
It is common that we used to count the number of records in a table. You may encounter unexpected performance degradation in certain situations.
Here is an example SQL that count number of records from EMPLOYEE table. There are number of indexes are built such as emp_id, emp_dept, emp_grade, emp_hire_date and etc….
You can see MySQL SQL Optimizer use a Full Index Scan of EMP_HIRE_DATE index, the performance is bad since unnecessary random reads is needed and it takes 3 minutes and 6 seconds to count a 3 million records in my computer. I want to make use of Index Range Scan for specific index, let me rewrite the above SQL into the following syntax. If you know EMP_GRADE is indexed and it is not a nullable column, you can add a dummy condition EMP_GRADE>=’’. It fools MySQL SQL optimizer to consider using EMP_GRADE range index to retrieve the records and it is successfully generate a new plan in the following:
where EMP_GRADE >= ‘ ‘
Here the following is the query plan of this SQL, it takes 2.6 seconds to finish. The query shows an “Index Range Scan” of employee table.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 71 times faster than the original SQL. There are some other rewrites shown in this screen with comparable results too.
There may be some business requirements that needs to retrieve only some rows from a table (or join tables) randomly. This kind of SQL is normally hard to tune.
For example the following SQL retrieve one row from two tables join of Employee and Department, where Employee’s department code is ‘AAA’, and both Emp_dept and Dpt_id columns are indexed.
from employee ,department
and emp_dept = ‘AAA’
order by rand()
Here the following is the query plan of this SQL, it takes 3.126 seconds to finish. The query shows a nested loop from Department table to Employee table to extract all records with ‘AAA’ department code. An Order operation is executed followed from the join result.
You can see the most expensive step is to extract all Employee data with department code “AAA”, if there is an unique key such Employee ID (EMP_ID) which can uniquely identify a row from the query, you can use the With common table expressions in MySQL version 8 to randomly select rows from the KEY column only with the same conditions given by the original query, for example the following blue colored SQL text in “With” clause, it randomly select 1 row of EMP_ID from the join query. It not only significantly reduces the size of the data retrieved from Employee table, but it also shrank the size of Order By operation. And then the main query will use the selected EMP_ID to extract specific row from original query, so the whole query will run faster with this new rewrite syntax.
as (select EMP_ID
where emp_dept = dpt_id
and emp_dept = ‘AAA’
order by rand() limit 1)
where emp_dept = dpt_id
and emp_dept = ‘AAA’
and EMP_ID in (select EMP_ID
order by rand() limit 1
/* Remark: “order by rand() limit 1” is used to make sure that only 1 row will be selected if the EMP_ID cannot uniquely identify only one row */
Here is the query plan of the rewritten SQL with less cost and run much faster.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for MySQL automatically, it shows that the rewrite is almost 100 times faster than the original SQL.
Application package software is a collection of software programs that is developed for the purpose of being licensed to third-party organizations. Although
a package software may be tailored for a user’s specific needs through parameters or tables, the software itself is not developed specifically for an organization. So, users do not own the source code and have no way to modify the embedded SQL statements for performance tuning purpose. There are a lot of application packages built on Oracle RDBMS such as Siebel, PeopleSoft JD Edwards, SAP and so on. In order to help application packages’ users, Oracle provides some features for helping users to tune their SQL statements without the need to change their source code.
It is a profile generated by Oracle SQL Tuning Advisor. A SQL profile contains corrections for wrongly estimated statistics, auxiliary information. Therefore, SQL profile just guides the optimizer to a better plan, but they do not guarantee the same plan each time the statement is parsed. For certain SQL statements, no matter how good the statistics are corrected, Oracle SQL optimizer is still not able to generate a better plan in specific environments. For these kinds of SQL statements, human intervention is necessary, but SQL profile is not a convenient tool for developers to force Oracle to pick up a new plan without changing the program source code.
SQL plan baselines and stored outlines
Due to the Oracle environment changes or Oracle database version upgrade, it might target Oracle SQL optimizer to generate new plans for certain SQL statements. If it is not good, and we need something to preserve the old plans for the new environment. To achieve SQL plan stability, stored outlines was the major tool in earlier releases of Oracle Database. This feature is still supported in Oracle Database 11g; however, it might be depreciated in the future releases and replaced by SQL plan management. The mechanism of SQL Plan Baselines is to preserve the performance of specified SQL statements, regardless of changes in the database environment or release upgrade. Furthermore, create Plan Baselines manually for a SQL statement is possible, and this technique can help developers to guide Oracle SQL optimizer to generate a specific plan for a bad performance SQL statement. So, when Oracle SQL optimizer receive the same SQL statement next time, a better performance plan will be composed according to the new plan baselines stored in database. There is no need to change the SQL syntax in source programs.
For example, if you want to tune a SQL with execution plan-A that is currently used by Oracle SQL optimizer in your database, and you want to tune the SQL with Hints to make Oracle SQL optimizer to generate a new execution plan-B. What you have to do is to execute the tuned SQL with new Hints and use the following method provided by Oracle:
Execute the tuned SQL with Hints and plan B cached in SGA.
To enable the use of the tuned plan, manually alter the tuned plan to a fixed plan by setting its FIXED attribute to YES.
To enable the use of SQL plan baselines, make sure the OPTIMIZER_USE_SQL_PLAN_BASELINES initialization parameter is set to TRUE.
Weaknesses of using SQL Plan baselines for SQL tuning
As the SQL plan baselines was designed to preserve the performance of SQL statements such as after the following environment changes:
New optimizer version
Changes to optimizer statistics and optimizer parameters
Changes to schema and metadata definitions
Changes to system settings
SQL profile creating
You can see that it is not designed for the purpose of manual SQL tuning. There are also some additional limitations such as Parallel Hints is not supported by SQL Plan Baselines, you cannot load a Plan-B with Parallel Hints applied your SQL with bad performance of original Plan-A. Parallel Hints sometimes are very important for a better plan generation by Oracle SQL optimizer.
SQL Patches is part of the features provided by SQL Repair Advisor which is used to fix a SQL statement’s critical failures such as returning wrong result. The SQL Repair Advisor analyzes the problematic statement and in many cases recommends a SQL patch to repair the statement. The SQL patch is to influence the Oracle SQL optimizer to choose an alternate execution plan for future executions, instead of using the original problematic execution plan. There is a public API call to create SQL patches manually provided by Oracle Database 12c Release 2 onwards. The DBMS_SQLDIAG.CREATE_SQL_PATCH package can help users to create a SQL Patch for specific SQL statement for SQL tuning purpose. You can change a bad performance SQL statement’s execution plan without the need to modify the program source code as the following example:
Patch_name := SYS.DBMS_SQLDIAG.create_sql_patch(
sql_text => 'SELECT *
WHERE emps_dept IN
WHERE dpts_avg_salary <200000)', hint_text => 'INDEX(@SEL$1 EMPLOYEES) INDEX(@SEL$2 DEPARTMENTS)',
name => 'my_sql_patch_name');
If your database version is before Oracle database 12c Release 2, you must use this package DBMS_SQLDIAG_INTERNAL.i_create_patch instead. Both SQL text and SQL ID is able to be used for SQL hints injection. The injected hints for your SQL should be placed in hint_text input parameter. There is only one line of Hints text you can use for a SQL and there is no way to define your own query block name for any subqueries’ block. So, if your SQL has multiple subqueries and you want to instruct Oracle to do something in subqueries’ blocks, you must use Oracle default query block names in your injected hints text.
hint_text => ‘ INDEX(@SEL$1 EMPLOYEES) INDEX(@SEL$2 DEPARTMENTS) ‘
This hints text in the above example shows that @SEL$1 and @SEL$2 are default query block names provided by Oracle in the execution plan of the SQL. The Hints tells Oracle use index search for EMPLOYEES table in query block @SEL$1 and also use index search for DEPARTMENTS in query block @SEL$2.
Pros and cons of using SQL Patches to tune SQL
SQL Patches is more flexible to accept hints instructions without SQL Plan Baselines’ limitations, complex hints with parallel operations are normally accepted by SQL patches. There is no additional maintenance effort to tell Oracle to use the SQL Patches after it is created. Oracle will use the stored hints to optimize any SQL with the same SQL ID or SQL Text and generate a better performance execution plan. Furthermore, you can also use SQL Patches to disable a SQL with a destructive hints already written in a package application or even use it to control a bind-aware SQL execution behavior.
As the injected hints text must be placed in one text line and using default query block name only, manually compose a desire Hints to improve a SQL statement will be a difficult task for most SQL developers especially for complex SQL statements with many subqueries.
A tool to automatically create Hints and SQL Patches
There is only one tool in the market so far that is able to generate a better hints and create SQL Patch in a fully automatic way. Tosska SQL Tuning Expert Pro is a tool for users to improve SQL performance without touching their program source code. Users can deploy different performance query plans for various sizes of production databases without the effort of keeping multiple versions of the program source, and it is especially suitable for package application users who don’t own the source code of their applications. The tool will try most useful hints combinations to tune your bad performance SQL statement, the best Hints combination SQL performance will be benchmarking side by side with the original SQL. Users will get the exact performance improvement without any guesswork or uncertain cost assessment only.
I am excited to announce that Tosska has just released our first version of Tosska In-Memory Maestro (TIM™) for Oracle® v1.1.0, the ultimate tool to manage your powerful Oracle® In-Memory capabilities that every database administrator has been looking for !
Just go and download a free trial immediately to explore the fascinating features that TIM™ can offer you. Enjoy your test drive !
TSE™ & TSE Pro™ for Oracle have new releases for minor bug fix. Download now.