Here is an example SQL that retrieves data from EMPLOYEE table with employee’s ID code in EMP_SUBSIDIARY table.
select * from employee a
in (select b.emp_id
from emp_subsidiary b)
Here the following are the query plan of this SQL, it takes 10.03 seconds to finish. The query plan is very simple since the syntax of the SQL is not complicated. The query plan shows a full table scan of EMPLOYEE and then nested to the index of EMPSB_EMP_ID, it looks like a good query plan, but I wonder if it is too expensive to have a full table scan of EMPLOYEE table?
In order to ask Oracle to consider other query plans, I added a dummy function Coalesce(b.emp_id,b.emp_id) in the subquery’s select list that artificially adding cost to the driving path from EMPLOYEE to EMP_SUBSIDIARY due to the index EMPSB_EMP_ID is disabled by this dummy function.
FROM employee a
WHERE a.emp_id IN (SELECT Coalesce(b.emp_id,b.emp_id)
FROM emp_subsidiary b)
The rewritten SQL generates an adaptive query plan and a “nested loops” from EMPSB_EMP_ID to EMPLOYEE by EMPLOYEE_PK index. You can remove the steps in “Id” column with marked ‘-‘ from the plan to verify the result plan. The new plan now takes 4.13 seconds to finish only.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 2 times faster than the original SQL.
Here is an example SQL that retrieves data from EMPLOYEE table with “emp_id < 710000” and employee’s department code exists in DEPARTMENT table.
where emp_id < 710000
and exists (select ‘x’
where dpt_id = emp_dept)
Here the following are the query plan of this SQL, it takes 34.22 seconds to finish. The query plan is very complicated, although the SQL is quite simple. It is not abnormal that Oracle uses a complex solution to solve simple data retrieval. This kind of complex plan steps is suitable for certain environments, but not for a simple database like this. I call it over-optimized query plan, which is due to the under estimated cost of this query plan. For complex plan like this, the cost estimation error is easily be amplified from step to step within the chain of plan steps.
In order to ask Oracle to consider other query plans, I rewrite the EXISTS to IN with a new “group by dpt_id” operation that force Oracle SQL optimizer to execute the subquery first.
WHERE emp_id < 710000
AND emp_dept IN (SELECT dpt_id
GROUP BY dpt_id)
The rewritten SQL generates a simpler query plan and it is actually running faster with 5.59 seconds only.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 6 times faster than the original SQL. There is a SQL rewrite with even better performance, it is a little bit complicated to discuss in this short article here. May be we can discuss later.
Semi-join is introduced in Oracle 8.0. It provides an efficient method of performing a WHERE EXISTS or WHERE IN sub-queries. A semi-join returns one copy of each row in first table for which at least one match is found in second table, there is no need of further scanning of the second table once a record is found.
in (select emp_dept from EMPLOYEE
where emp_id >3300000)
Here the following is the query plan of this SQL, it takes 13.59 seconds to finish. The query shows a “NESTED LOOPS SEMI” from DEPARTMENT to EMPLOYEE table.
Basically, this SQL is difficult to optimize by just syntax rewrite due to the simplicity of the SQL syntax that Oracle is easily transformed into a canonical syntax internally, so not much alternative query plan can be triggered by syntax rewrite.
Let’s use Hints injection to the SQL and see if there any brutal force of hints injection can trigger a better performance plan. With our A.I. Hints Injection algorithm applying to the SQL, it comes up with a SQL with extraordinary performance improvement that even I cannot understand at the first glance.
SELECT /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ *
WHERE dpt_id IN (SELECT emp_dept
WHERE emp_id > 3300000)
Here is the query plan of the hints injected SQL and it is now running much faster. The new query plan shows that the “INDEX RANGE SCAN” of EMP_DPT_INX to EMPLOYEE table is changed to “INDEX RANGE SCAN DESCENDING” and the estimated cost is the same as the Original SQL.
The Hints /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ injected SQL takes only 0.05 second, it is much faster than the original SQL, the reason behind is the employee records creation order in EMPLOYEE table, the higher the emp_id will be created later, so the corresponding records will be inserted into the right hand side of the EMP_DPT_INX index tree nodes. The “INDEX RANGE SCAN” in the original SQL plan that needs to scan a lot of records from left to right direction before it can hit one record for the condition “WHERE emp_id > 3300000”. In contrast, the Hints injected SQL with the “INDEX RANGE SCAN DESCENDING” operation that can evaluate the WHERE condition with only one scan from right to left on EMP_DPT_INX index tree nodes. That explains why the Hints injected SQL outperformed the original SQL by more than 270 times.
It is common that we employ “transaction id”, “serial no” or “creation date” in our application design, this kind of the records are normally created alone with an increasing sequence order, there may be some SQL in your system can be improved by this technique.
This kind of rewrites or Hints injection can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the Hints injected SQL is more than 270 times faster than the original SQL.