The database is a vital part of a majority of IT systems as it contains that data that needs to be processed for utilization. SQL is the standard language for fetching data from the database using queries and query optimization in SQL Server helps maintain database performance.
The queries used to communicate with the database are almost always simple and quick. The most common operations include the four major data manipulation techniques (create, select, update, and delete) that execute at quick rates in most cases. However, as you already know, they aren’t the only tasks an administrator has to do in a database.
Why Experts Recommend Query Optimization in SQL
Operational databases may have a requirement to execute statements that take a long time. For instance, queries that need access to several tables or include tasks such as aggregation.
Although statements that operate on data warehouses aren’t time-critical, the ones running on operational databases are. They are necessary to fetch the requested information as quickly as possible – which makes SQL Server performance tuning important. One of the best examples of systems that need optimal SQL is one that has access to a geo server. A geo server contains millions of geographical information spread across countless tables.
Moreover, the database operations taking place on this data is often performance heavy. Such tasks may include the intersection of lands and calculations of area, the distance between two geographical locations, etc.
These involve complex geo operations that need a large amount of data present in multiple tables. If the SQL statement used for these operations isn’t efficient, it could take days or months to fetch the results. This is where the DBA will need query optimization in SQL Server to ensure the users don’t have to wait too long to get a result.
Some Quick Tips for Maintaining Optimal Database Query Performance
Take a look at some time-tested techniques of optimizing queries in SQL that not only save a lot of time and resources but are also useful for a wide range of databases:
- Check efficiency using LIMIT – Many times, the queries have to run on a huge scale of data. You would not want to wait for your statement to execute completely before finding out that you have used the wrong statement or it was inefficient. Therefore, limiting your statement to a smaller amount of information can help you check its validity and with SQL Server performance tuning. Once you’ve run the query and are satisfied with its efficiency, you can carry it out on the scale you want.
- Long statements can help – At times, you may have written statements that may seem simple to you as they didn’t take long to understand. However, they took too long to actually execute. In databases involving time-critical operations, the opposite would be preferable. Even though the queries will become complex and take a bit longer to understand, they will save a lot of time.
- Give preference to quicker data structures – This depends on the programming-based knowledge of the DBA. For example, integer comparisons are far quicker than string comparisons.
- Don’t be tempted to use IN operator – Checking the existence of certain data in a table may be important. But try not to use the IN operator as it slows down the result.
As mentioned before, these tips are useful for a majority of databases out there. Since SQL is a declarative language, certain databases are likely to optimize for the cases we have talked about above.
However, test the tips before you try to apply them for query optimization in your database. In case some of these don’t work, you may want to consider using a tuning tool to make things easier. Get in touch with us to know how our tuning tool can simplify database query optimization for you.
Your end-users may keep on complaining about some functions of their database application are running slow, but you may found that those SQL statements are already reached their maximum speed in the current Oracle and hardware configuration. There may be no way to improve the SQL unless you are willing to upgrade your hardware. To make your users feel better, sometimes, you don’t have to tune your SQL to run faster but to tune your SQL to run slower for certain application’s SQL statements.
This is an example SQL that is used to display the information from tables Emp_sal_hist and Employee if they are satisfied with certain criteria. This SQL is executed as an online query and users have to wait for at least 5 seconds before any data will be shown on screen after the mouse click.
select * from employee a,emp_sal_hist c
where a.emp_name like ‘A%’
order by c.sal_emp_id
Here the following is the query plan and execution statistics of the SQL, it takes 10.41 seconds to extract all 79374 records and the first records return time ”Response Time” is 5.72 seconds. The query shows a MERGE JOIN of EMPLOYEE and EMP_SAL_HIST table, there are two sorting operations of the corresponding tables before it is being merged into the final result. It is the reason that users have to wait at least 5 seconds before they can see anything shows on the screen.
As the condition “a.emp_id = c.sal_emp_id”, we know that “ORDER BY c.sal_emp_id“ is the same as “ORDER BY a.emp_id“, as SQL syntax rewrite cannot force a specified operation in the query plan for this SQL, I added an optimizer hint /*+ INDEX(@SEL$1 A EMPLOYEE_PK) */ to reduce the sorting time of order by a.emp_id.
SELECT /*+ INDEX(@SEL$1 A EMPLOYEE_PK) */ *
FROM employee a,
WHERE a.emp_name LIKE ‘A%’
ORDER BY c.sal_emp_id
Although the overall Elapsed Time is 3 seconds higher in the new query plan, the response time is now reduced from 5.72 seconds to 1.16 seconds, so the users can see the first page of information on the screen more promptly and I believe most users don’t care whether there are 3 more seconds for all 79374 records to be returned. That is why SQL tuning is an art rather than science when you are going to manage your users’ expectations.
This kind of rewrite can be achieved by Tosska SQL Tuning Expert for Oracle automatically.
Here is an example SQL that retrieves data from EMPLOYEE table with employee’s ID code in EMP_SUBSIDIARY table.
select * from employee a
in (select b.emp_id
from emp_subsidiary b)
Here the following are the query plan of this SQL, it takes 10.03 seconds to finish. The query plan is very simple since the syntax of the SQL is not complicated. The query plan shows a full table scan of EMPLOYEE and then nested to the index of EMPSB_EMP_ID, it looks like a good query plan, but I wonder if it is too expensive to have a full table scan of EMPLOYEE table?
In order to ask Oracle to consider other query plans, I added a dummy function Coalesce(b.emp_id,b.emp_id) in the subquery’s select list that artificially adding cost to the driving path from EMPLOYEE to EMP_SUBSIDIARY due to the index EMPSB_EMP_ID is disabled by this dummy function.
FROM employee a
WHERE a.emp_id IN (SELECT Coalesce(b.emp_id,b.emp_id)
FROM emp_subsidiary b)
The rewritten SQL generates an adaptive query plan and a “nested loops” from EMPSB_EMP_ID to EMPLOYEE by EMPLOYEE_PK index. You can remove the steps in “Id” column with marked ‘-‘ from the plan to verify the result plan. The new plan now takes 4.13 seconds to finish only.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 2 times faster than the original SQL.
Here is an example SQL that retrieves data from EMPLOYEE table with “emp_id < 710000” and employee’s department code exists in DEPARTMENT table.
where emp_id < 710000
and exists (select ‘x’
where dpt_id = emp_dept)
Here the following are the query plan of this SQL, it takes 34.22 seconds to finish. The query plan is very complicated, although the SQL is quite simple. It is not abnormal that Oracle uses a complex solution to solve simple data retrieval. This kind of complex plan steps is suitable for certain environments, but not for a simple database like this. I call it over-optimized query plan, which is due to the under estimated cost of this query plan. For complex plan like this, the cost estimation error is easily be amplified from step to step within the chain of plan steps.
In order to ask Oracle to consider other query plans, I rewrite the EXISTS to IN with a new “group by dpt_id” operation that force Oracle SQL optimizer to execute the subquery first.
WHERE emp_id < 710000
AND emp_dept IN (SELECT dpt_id
GROUP BY dpt_id)
The rewritten SQL generates a simpler query plan and it is actually running faster with 5.59 seconds only.
This kind of rewrites can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the rewrite is more than 6 times faster than the original SQL. There is a SQL rewrite with even better performance, it is a little bit complicated to discuss in this short article here. May be we can discuss later.
Semi-join is introduced in Oracle 8.0. It provides an efficient method of performing a WHERE EXISTS or WHERE IN sub-queries. A semi-join returns one copy of each row in first table for which at least one match is found in second table, there is no need of further scanning of the second table once a record is found.
in (select emp_dept from EMPLOYEE
where emp_id >3300000)
Here the following is the query plan of this SQL, it takes 13.59 seconds to finish. The query shows a “NESTED LOOPS SEMI” from DEPARTMENT to EMPLOYEE table.
Basically, this SQL is difficult to optimize by just syntax rewrite due to the simplicity of the SQL syntax that Oracle is easily transformed into a canonical syntax internally, so not much alternative query plan can be triggered by syntax rewrite.
Let’s use Hints injection to the SQL and see if there any brutal force of hints injection can trigger a better performance plan. With our A.I. Hints Injection algorithm applying to the SQL, it comes up with a SQL with extraordinary performance improvement that even I cannot understand at the first glance.
SELECT /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ *
WHERE dpt_id IN (SELECT emp_dept
WHERE emp_id > 3300000)
Here is the query plan of the hints injected SQL and it is now running much faster. The new query plan shows that the “INDEX RANGE SCAN” of EMP_DPT_INX to EMPLOYEE table is changed to “INDEX RANGE SCAN DESCENDING” and the estimated cost is the same as the Original SQL.
The Hints /*+ INDEX_DESC(@SEL$2 EMPLOYEE) */ injected SQL takes only 0.05 second, it is much faster than the original SQL, the reason behind is the employee records creation order in EMPLOYEE table, the higher the emp_id will be created later, so the corresponding records will be inserted into the right hand side of the EMP_DPT_INX index tree nodes. The “INDEX RANGE SCAN” in the original SQL plan that needs to scan a lot of records from left to right direction before it can hit one record for the condition “WHERE emp_id > 3300000”. In contrast, the Hints injected SQL with the “INDEX RANGE SCAN DESCENDING” operation that can evaluate the WHERE condition with only one scan from right to left on EMP_DPT_INX index tree nodes. That explains why the Hints injected SQL outperformed the original SQL by more than 270 times.
It is common that we employ “transaction id”, “serial no” or “creation date” in our application design, this kind of the records are normally created alone with an increasing sequence order, there may be some SQL in your system can be improved by this technique.
This kind of rewrites or Hints injection can be achieved by Tosska SQL Tuning Expert for Oracle automatically, it shows that the Hints injected SQL is more than 270 times faster than the original SQL.