How to tune ad-hoc SQL with a Plan Guide for SQL Server?

Plan Guides are a useful way to tune the SQL of third-party applications in MS SQL Server when you need to optimize a specific query, or set of queries, generated by the application without changing the application’s source code.

The steps are as follows:

  1. Identify the query or queries that are causing performance issues in the application. You can use SQL Server Profiler or Extended Events to capture and analyze the SQL statements generated by the application.
  2. Create a Plan Guide that steers the optimizer toward a better execution plan for the identified query or queries. A Plan Guide does not change the query text; it attaches query hints (or a fixed XML showplan) to the matching statement to influence the optimizer’s decisions.
  3. Test the Plan Guide to ensure that it provides the desired performance improvements and does not cause any unintended side effects.
  4. Deploy the Plan Guide to production and monitor the performance of the application to ensure that the Plan Guide is being used and is providing the desired performance improvements.
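
Steps 3 and 4 can be supported with a few catalog queries. The following is a minimal T-SQL sketch, not a complete monitoring routine; the guide name PG_example in the commented-out call is a hypothetical placeholder.

```sql
-- List the plan guides in the current database and check that each one is enabled.
SELECT pg.plan_guide_id,
       pg.name,
       pg.is_disabled,
       pg.scope_type_desc,
       pg.query_text,
       pg.hints
FROM   sys.plan_guides AS pg;

-- Validate every plan guide against the current schema.
-- A row is returned only for a guide that is no longer valid
-- (for example, after an index named in a hint has been dropped).
SELECT pg.name,
       v.msgnum,
       v.severity,
       v.state,
       v.message
FROM   sys.plan_guides AS pg
CROSS APPLY sys.fn_validate_plan_guide(pg.plan_guide_id) AS v;

-- Quick rollback if a guide causes side effects.
-- ASSUMPTION: PG_example is a placeholder; uncomment and use a real guide name.
-- EXEC sp_control_plan_guide @operation = N'DISABLE', @name = N'PG_example';
```

When a statement is compiled with a Plan Guide, the XML showplan of that statement carries PlanGuideDB and PlanGuideName attributes, which is one way to confirm in production that the guide is actually being used.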

Before optimizing an ad-hoc SQL statement without modifying the application’s source code, it is crucial to understand how SQL Server matches the statement sent by the application against the one specified in the Plan Guide. The text registered in the Plan Guide should be the text exactly as the application sends it, including whitespace and comments, and the source of the SQL statement (standalone or inside a batch) must match as well. Plan Guides are created with the system stored procedure sp_create_plan_guide.

Today, the focus is on tuning ad-hoc SQL (@type = N'SQL') using a Plan Guide. Ad-hoc SQL comes in two forms: standalone SQL (@module_or_batch = NULL) and SQL within a batch text (@module_or_batch = N'batch_text'). For instance, if an application program sends the following SQL on its own, without any other code, it is standalone SQL.
select top 10 * from  employee;
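
A minimal sketch of a Plan Guide for this standalone statement could look as follows. The guide name PG_employee_top10_standalone and the MAXDOP 1 hint are placeholders chosen only for illustration, not a recommended hint; the statement text should be copied from a trace of what the application really sends.

```sql
-- Sketch: Plan Guide for the standalone ad-hoc statement.
-- ASSUMPTIONS: the guide name and the MAXDOP 1 hint are illustrative placeholders.
EXEC sp_create_plan_guide
     @name            = N'PG_employee_top10_standalone',
     @stmt            = N'select top 10 * from  employee;',  -- text as sent by the application
     @type            = N'SQL',
     @module_or_batch = NULL,   -- standalone: the batch is the statement itself
     @params          = NULL,   -- the statement is not parameterized
     @hints           = N'OPTION (MAXDOP 1)';
```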
The example below illustrates a batch text that contains the SQL statement above, which needs to be optimized by a Plan Guide. Here the statement sits in the middle of the batch text. Because the same SQL statement can also arrive as part of a batch, the batch must be identified by setting @module_or_batch = N'batch_text'. Consequently, two Plan Guides must be created for the same SQL statement: one for the standalone statement and one for the batch text. To identify the source of an ad-hoc SQL statement accurately, use SQL Server Profiler to capture the statement that requires optimization.

select count(*) from employee;
select top 10 * from  employee;
where emp_id in (select emp_id id
                             from emp_subsidiary
                             where emp_dept<'h')

order by emp_name;
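
For the batch case, the sketch below registers the same statement together with the batch that contains it. To keep it short it uses a simplified, hypothetical two-statement batch rather than the full batch above; the guide name and the MAXDOP 1 hint are again placeholders.

```sql
-- Sketch: Plan Guide for the same statement when it arrives inside a batch.
-- ASSUMPTIONS: simplified two-statement batch, placeholder guide name and hint.
-- The line break inside @module_or_batch is part of the batch text.
EXEC sp_create_plan_guide
     @name            = N'PG_employee_top10_in_batch',
     @stmt            = N'select top 10 * from  employee;',
     @type            = N'SQL',
     @module_or_batch = N'select count(*) from employee;
select top 10 * from  employee;',
     @params          = NULL,
     @hints           = N'OPTION (MAXDOP 1)';
```

The batch text in @module_or_batch has to be the batch as captured from the application, which is why capturing it with a trace is safer than retyping it.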

Microsoft SQL Server Management Studio provides a tool that helps users create a Plan Guide for a SQL statement without executing the system stored procedure manually. However, you still need to know the type of SQL statement being optimized and the meaning of the parameters you have to enter.

Although the steps to create a Plan Guide may seem complicated to newcomers, they are worthwhile when you need to improve SQL performance but cannot alter the source code or lack permission to modify it. The most challenging and time-consuming part is finding the best query hint for the SQL statement (@hints = N'OPTION(query_hint [ ,...n ])'). Unless you have in-depth knowledge of SQL tuning techniques and enough time to experiment, you may need a product that streamlines the whole process: capturing the SQL, identifying the SQL source type, tuning query hints automatically, and deploying Plan Guides easily.

Tosska DB Ace Enterprise for SQL Server – Tosska Technologies Limited
DBAS Tune SQL PG Standalone – YouTube
DBAS Tune SQL PG Batch – YouTube

The Overlooked Significance of Performance Deployment in Software Deployment

Performance deployment is a critical aspect of software deployment that is often undervalued. While it’s important to ensure that software is deployed correctly, it’s equally important to optimize its performance in the production environment. By recognizing the significance of performance deployment during the software deployment process, developers can ensure that their software performs well and meets the needs of its users. Focusing on performance deployment can help prevent performance issues and improve user satisfaction with the software.

The Missing Link of Performance Deployment Between Testing Database and Production Database
Despite extensive pre-deployment testing, performance problems can still appear at deployment time in certain development environments. The following issues may arise:

  1. Inability to copy production data to the testing database.
  2. Significant differences in hardware and software configuration between the testing and production databases.
  3. Inability to test software in the production database due to security restrictions.
  4. The utilization of DML SQL statements in the new software that may damage the data integrity of the production database.

It is not unusual for users to face performance issues or encounter application errors following a release of new application code.

Ensuring Performance Deployment with a Pre-Deployment Process
The following instructions present a novel approach to ensuring reliable performance when deploying software. The idea is simple: since it is not feasible to run the new application code on the production database, why not obtain the query plan of every SQL statement from the production database instead? This way, we can assess the performance of each SQL statement in the application code that is about to be deployed to production.

Suppose the new application code contains 10 SQL statements that we need to identify in the testing database. We first clear the shared pool and then run the new application in the testing database to isolate these 10 statements. We can then capture the 10 SQL statements and obtain their query plans from the production database. The table below presents the possible outcomes of the query plan comparison.

Observation | Possible reasons
Explain Plan error in the production database | The SQL statement requires access to objects that are not present in the production database.
Query plan changes | Significant statistical differences between the testing and production databases, including differences in the database schema such as missing or new partitions and other structural changes. Benchmarking the SQL may be necessary because performance may change significantly.
Unused indexes | Some indexes used in the testing database are not used in the production database. Benchmarking the SQL may be necessary because performance may change significantly.
Newly used indexes | Some indexes used in the production database are not used in the testing database. Benchmarking the SQL may be necessary because performance may change significantly.
Total cost changes | Changes in the overall query plan cost for the 10 SQL statements. If the production database has a larger data volume than the testing database, the cost change will be higher.
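
The capture-and-explain steps described above can be sketched in Oracle SQL as follows. This is an illustration under stated assumptions rather than the product's method: APP_USER is a hypothetical application schema, predeploy_01 is an arbitrary statement ID, and the explained statement is only a stand-in for one of the captured statements.

```sql
-- Testing database: clear the shared pool so that only the new code's SQL is cached afterwards.
ALTER SYSTEM FLUSH SHARED_POOL;

-- (run the new application against the testing database here)

-- Testing database: list the statements that the new code issued.
-- ASSUMPTION: APP_USER is a hypothetical schema name for the application.
SELECT sql_id,
       sql_fulltext
FROM   v$sqlarea
WHERE  parsing_schema_name = 'APP_USER'
ORDER  BY first_load_time;

-- Production database: obtain the query plan without executing the statement.
-- ASSUMPTION: the statement below is a stand-in for a captured statement.
EXPLAIN PLAN SET STATEMENT_ID = 'predeploy_01' FOR
select max(emp_address) from employee where emp_grade < 4000;

-- Production database: display the plan that was just stored in PLAN_TABLE.
SELECT plan_table_output
FROM   TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'predeploy_01', 'TYPICAL'));
```

EXPLAIN PLAN only records the plan and does not run the statement, so DML statements can be assessed on production without touching the data.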

DBAO SQL Performance Tracker – YouTube
Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

An Example to Show How to Tune SQL with Query Store for SQL Server

The Query Store feature in SQL Server serves as a valuable tool for troubleshooting performance issues by allowing users to quickly identify performance degradation caused by changes to query plans.
For example, when the following SQL statement is executed in SSMS, it takes 15,579 ms to finish.

Using the Top Resource Consuming Queries feature in Query Store, we can see that the SQL with Query ID 23713 and its corresponding Plan ID 37290 are displayed in the Plan Summary window.

To obtain the SQL text from SQL Server, you can extract it manually by Query ID from the relevant system views, namely sys.query_store_query and sys.query_store_query_text. Alternatively, a tool can extract the SQL text for you, as displayed on the screen below.
The tool accepts a Query ID or partial SQL text to locate a specific SQL statement in Query Store for SQL tuning.
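
The manual extraction mentioned above can be sketched with a join of the two views. Query ID 23713 comes from the example; the search fragment 'employee' in the second query is only an assumed example of partial SQL text.

```sql
-- Look up the SQL text of a query by its Query ID.
SELECT q.query_id,
       qt.query_sql_text
FROM   sys.query_store_query      AS q
JOIN   sys.query_store_query_text AS qt
       ON qt.query_text_id = q.query_text_id
WHERE  q.query_id = 23713;

-- Or locate candidate queries by a fragment of the SQL text.
-- ASSUMPTION: 'employee' is an illustrative search fragment.
SELECT q.query_id,
       qt.query_sql_text
FROM   sys.query_store_query      AS q
JOIN   sys.query_store_query_text AS qt
       ON qt.query_text_id = q.query_text_id
WHERE  qt.query_sql_text LIKE N'%employee%';
```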

The screen below shows how the product endeavors to enhance SQL performance by injecting a range of Hint combinations into queries and creating corresponding Plan Guides for analysis. When done manually, this process can be difficult, as there are many possible permutations of Hints to assess. Without a comprehensive understanding of SQL tuning and the underlying problems with the query plan, identifying the best combination of Hints may require extensive trial and error.
This tool is a fully automated SQL tuning solution that utilizes Query Store. In its investigation, the tool injected 100 different Hints into the SQL queries and identified 75 unique query plans. After conducting a benchmark, it was found that the Query Store 66 (QS 66) resulted in the best performance, achieving a processing time savings of 98.45%. The optimized query included the following Hints:
OPTION(HASH JOIN, TABLE HINT(employee, INDEX(EMPS_GRADE_INX)))

Once we have determined the optimal Hints for the SQL statement, we can Force this plan for the SQL query, as displayed on the screen below. By doing so, the performance of the SQL will be improved the next time it is executed by the user’s program, without requiring any modifications to its source code.
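
Outside the tool, forcing a plan can be sketched in T-SQL as follows. The Plan ID of the better plan is not given in this article, so @plan_id is deliberately left as a NULL placeholder that has to be filled in from the list returned by the first query.

```sql
DECLARE @query_id bigint = 23713;   -- Query ID from the example above
DECLARE @plan_id  bigint = NULL;    -- ASSUMPTION: placeholder, set to the Plan ID of the better plan

-- List the plans that Query Store has recorded for this query.
SELECT p.plan_id,
       p.is_forced_plan,
       p.last_execution_time
FROM   sys.query_store_plan AS p
WHERE  p.query_id = @query_id
ORDER  BY p.last_execution_time DESC;

-- Force the chosen plan once a Plan ID has been filled in.
IF @plan_id IS NOT NULL
    EXEC sp_query_store_force_plan @query_id = @query_id, @plan_id = @plan_id;

-- To undo later:
-- EXEC sp_query_store_unforce_plan @query_id = 23713, @plan_id = <the forced Plan ID>;
```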

The screen below shows that executing the same SQL statement in SSMS now performs significantly better. The CPU time has decreased from 54202 ms to 391 ms, roughly a 139-fold improvement, while the elapsed time has dropped from 15579 ms to 294 ms, roughly a 53-fold improvement.

A new product designed to optimize SQL statements for Query Store
Tosska DB Ace for SQL Server marks a significant leap forward in this domain since it surpasses the reactive recovery capabilities of Query Store and introduces proactive SQL performance enhancement. This pioneering technology allows users to extract SQL from the Query Store and optimize it by creating new and improved query plans within the Query Store. With Tosska DB Ace, users can implement these new plans to their SQL without requiring any modifications to the program source code or extensive testing.

Tosska DB Ace Enterprise for SQL Server – Tosska Technologies Limited
DBAS Tune SQL QS – YouTube

How To Use 80/20 Rule To Tune A Database Application II?

The previous article, “How To Use 80/20 Rule To Tune A Database Application I”, demonstrated how the 80/20 Rule can be applied to evaluate the overall performance of the SQL workload in a database. In this example, a set of 90 SQL statements retrieved from the Oracle SGA is presented in a chart that lists each statement by resource usage in descending order, with the most resource-intensive SQL on the left. The analysis reveals that roughly 14.44% of the SQL statements consume 80% of the total elapsed time, while 21.11% of the SQL statements consume 80% of the total CPU time, indicating that the SQL workload distribution aligns well with the 80/20 rule. Therefore, tuning the SQL may not be necessary since it is unlikely to result in significant performance improvements.

However, to further optimize the database performance cost-effectively, it is recommended to conduct an in-depth investigation of the top 20% of high workload SQL statements. This will reveal that the resource utilization drops steeply in the first few SQL statements, making them the most critical candidates for optimization.

Let’s lower the threshold from 80% to 60% of the total resource consumption and examine which SQL statements account for it. The results are interesting: 3 SQL statements account for 60% of the elapsed time, 6 SQL statements account for 60% of the CPU time, and only one SQL statement accounts for 60% of the disk reads. By focusing on these SQL statements, it is possible to address up to 60% of the database workload. For instance, if the database is experiencing an IO bottleneck, concentrating on that one SQL statement could yield savings of up to 60% on disk reads.

You can utilize Excel to conduct a simulation of the 80/20 rule analysis described above, providing a comprehensive overview of the distribution of the SQL workload. This approach facilitates a rapid evaluation of the overall health of the database’s SQL performance, as well as the associated costs and benefits of optimizing high workload SQL statements. Furthermore, the SQL resource spectrum analysis is integrated into our Tosska DB Ace for Oracle software.
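
As an alternative to a spreadsheet, the same cumulative distribution can be sketched directly in Oracle SQL. This is a minimal illustration that reads v$sqlarea and ranks statements by elapsed time; it is not the analysis built into the product.

```sql
-- Rank cached statements by elapsed time and show the running share of the total.
-- The first row whose cumulative_pct reaches 60 (or 80) tells how many statements
-- account for that share of the workload.
SELECT sql_id,
       elapsed_time,
       ROW_NUMBER() OVER (ORDER BY elapsed_time DESC, sql_id) AS sql_rank,
       ROUND(100 * SUM(elapsed_time) OVER (ORDER BY elapsed_time DESC, sql_id
                                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                 / NULLIF(SUM(elapsed_time) OVER (), 0), 2)   AS cumulative_pct
FROM   v$sqlarea
ORDER  BY elapsed_time DESC, sql_id;
```

Replacing elapsed_time with cpu_time or disk_reads gives the corresponding view for CPU time or disk reads.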

Tosska DB Ace Enterprise for Oracle – Tosska Technologies Limited

DBAO Inspect SQL – YouTube

How to Tune SQL Statement with LCASE function on index field?

Some business requirements may need to compare the lower case of an indexed column to a given string as a data retrieval criterion.

Here is an example SQL that retrieves records from the EMPLOYEE table if the lower case of the name is equal to the string ‘richard’.

select  *
  from employee
where LCASE(emp_name)='richard'

The following is the query plan of this SQL, which takes 17 seconds to finish. The query plan shows a “Full Table Scan” on EMPLOYEE.

You can see that this SQL cannot use an index scan even though emp_name is an indexed column. Adding a “FORCE INDEX(EMPS_NAME_INX)” hint alone does not help: the MySQL SQL optimizer still does not use the index. So I add one more dummy condition, “emp_name >= ''”. It is always true for any non-null value, because every string is greater than or equal to the empty string, and it is used to increase the cost of not using the EMPS_NAME_INX index. A second condition, “emp_name is null”, is added so that the dummy condition stays true when emp_name is null.

select  *
from   employee force index(EMPS_NAME_INX)
where  LCASE(emp_name) = 'richard'
     and ( emp_name >= ''
        or emp_name is null )

Here is the query plan of the rewritten SQL, which runs much faster. The new query plan shows that an Index Scan is now used, and the SQL takes only 2.79 seconds.
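
To compare the two plans yourself, both statements can be prefixed with EXPLAIN. This is a simple sketch that reuses the table and index names from the example.

```sql
-- Plan of the original SQL.
EXPLAIN
select *
from   employee
where  LCASE(emp_name) = 'richard';

-- Plan of the rewritten SQL with the hint and the dummy condition.
EXPLAIN
select *
from   employee force index (EMPS_NAME_INX)
where  LCASE(emp_name) = 'richard'
  and  ( emp_name >= ''
         or emp_name is null );
```

In the EXPLAIN output, type = ALL indicates a full table scan, while a non-NULL key column names the index that the optimizer has chosen.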

This kind of rewrite can be achieved automatically by Tosska SQL Tuning Expert for MySQL, which shows that the rewrite is more than 6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/

How to use ROWID to improve an UPDATE statement for Oracle?

The following is an UPDATE statement with a subquery; it updates the EMPLOYEE table for rows whose emp_dept is returned by the subquery.

update  employee
   set  emp_name = 'testing'
 where  emp_dept IN (select dpt_id
            from department
          where dpt_name like 'A%')
and emp_grade>2000

You can see that Oracle uses a Hash Join of the DEPARTMENT and EMPLOYEE tables to execute the update. This query plan takes 1.96 seconds to complete, and no index is used even though emp_dept, dpt_id, and emp_grade are indexed columns. The most expensive operation appears to be the Table Access Full scan of the EMPLOYEE table.

Let’s rewrite the SQL into the following syntax to eliminate the Table Access Full operation on EMPLOYEE from the query plan. The new subquery that selects ROWID from EMPLOYEE forces Oracle to extract the EMPLOYEE records whose emp_dept is in the DEPARTMENT rows with dpt_name like ‘A%’. The ROWIDs returned by this subquery then allow a more efficient ROWID access to the outer EMPLOYEE table.

UPDATE  employee
SET   emp_name='testing'
WHERE   ROWID IN (SELECT  ROWID
          FROM   employee
          WHERE  emp_dept IN (SELECT  dpt_id
                      FROM   department
                      WHERE  dpt_name LIKE 'A%'))
     AND emp_grade > 2000

You can see that the final query plan with this syntax has a lower cost and no full table access to the EMPLOYEE table. The new syntax takes 0.9 seconds, more than 2 times faster than the original syntax.

This kind of rewrite can be achieved automatically by Tosska SQL Tuning Expert Pro for Oracle. There is another SQL rewrite with similar performance, but it is beyond the scope of this short article; I may discuss it later in my blog.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to build indexes for multiple Max() functions for SQL Server?

For some SQL statements with multiple Max() functions in the select list and nothing in the Where clause, we have different methods to create new indexes to improve the SQL speed.

Here is an example SQL that retrieves the maximum name and age from the employee table.
select   max(emp_name),
     max(emp_age)
from  employee

The following is the query plan that takes 9.27 seconds.

The SQL cannot be tuned by SQL syntax rewrite or hint injection, and SSMS does not recommend any index to improve it.

For this kind of SQL, we can consider building either a composite index or two individual indexes on emp_name and emp_age. A new composite index on these two columns (emp_age, emp_name) makes the SQL around 7 times faster. The following query plan shows that the new composite index is used, but the entire index has to be scanned for the two Stream Aggregate operations before max(emp_name) and max(emp_age) are returned.

What if we build two individual indexes on emp_name and emp_age instead? The following is the result and query plan with these two indexes created. A Top operator selects the first row from each index and returns it to a Stream Aggregate operation, and then a Nested Loops operator joins the two maximum results together. It is 356 times faster than the original SQL.
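
The two index options discussed above can be sketched in T-SQL as follows. The index names are hypothetical, and the options are meant to be benchmarked one at a time, so the composite index is dropped before the individual indexes are created.

```sql
-- Option 1: one composite index on both columns.
-- ASSUMPTION: all index names here are illustrative.
CREATE INDEX emps_age_name_inx ON employee (emp_age, emp_name);

-- (benchmark the SQL here, then remove the composite index)
DROP INDEX emps_age_name_inx ON employee;

-- Option 2: two individual indexes, one per column.
CREATE INDEX emps_name_inx ON employee (emp_name);
CREATE INDEX emps_age_inx  ON employee (emp_age);

-- The SQL under test.
select   max(emp_name),
         max(emp_age)
from     employee;
```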

This kind of index recommendation can be achieved automatically by Tosska SQL Tuning Expert Pro for SQL Server:
Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to build indexes for slow first execution SQL – SQL Server?

You may suffer from SQL statements with a slow first execution because of the long time needed to load data into the cache. The following simple SQL retrieves records from the EMPLOYEE table where EMP_SALARY < 500000, with the result set ordered by EMP_NAME.

Select emp_id,
    emp_name,
    emp_salary,
    emp_address,
    emp_telephone
from    employee
where  emp_salary < 500000
order by emp_name;

The following is the query plan, which takes 9.51 seconds for the first execution (without data cached) and 0.99 seconds for the second execution (with data cached).

The SQL cannot be tuned by SQL syntax rewrite or hint injection for either the first or the second execution, because SQL Server has already selected the best query plan for this simple SQL statement. The problem is that if the condition “where emp_salary < 500000” changes, say from 500000 to 510000, or the EMPLOYEE data is flushed out of memory, the execution time goes back up to 9.51 seconds.

Let’s see if we can build indexes to improve this situation. There is a common perception that a good index should improve both the first and the second execution time. So I used a tool to explore many index configurations, but none of them improved both executions. The following is the performance of the second execution, with data cached, for the different index sets proposed by the tool. You can see that the performance of “Index Set 1” is close to that of the original SQL, with a small variation due to system load, and all other index sets are worse than the original SQL. Normally, we would give up tuning the SQL statement at this point without even checking whether the recommended indexes help the first execution.

I then tested the recommended indexes to see whether they improve the first execution time. To my surprise, “Index Set 1” brings a significant improvement, cutting the first execution time from 9.51 seconds to 0.65 seconds. That is a 14 times improvement that can make my database run more efficiently. So be careful when you tune SQL with new indexes: an index that does not help the second execution with all data cached may still be very good for the first execution without data cached.
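
A first execution without cached data can be reproduced on a test server with the sketch below. It is an assumption about how such a benchmark might be run, not the tool's procedure, and DBCC DROPCLEANBUFFERS should not be run on a production server because it empties the buffer pool for every database.

```sql
-- TEST SERVERS ONLY: write dirty pages to disk, then empty the buffer pool.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;

SET STATISTICS TIME ON;

-- First execution: data has to be read from disk.
Select emp_id, emp_name, emp_salary, emp_address, emp_telephone
from   employee
where  emp_salary < 500000
order by emp_name;

-- Second execution: data is now cached in memory.
Select emp_id, emp_name, emp_salary, emp_address, emp_telephone
from   employee
where  emp_salary < 500000
order by emp_name;

SET STATISTICS TIME OFF;
```

Repeating the run after each candidate index set is created shows the effect on the first and second execution separately.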

This kind of index recommendation can be achieved automatically by Tosska SQL Tuning Expert Pro for SQL Server.

Tosska SQL Tuning Expert Pro (TSES Pro™) for SQL Server – Tosska Technologies Limited

How to index SQL with aggregate function SQL for Oracle?

The following example SQL selects the maximum emp_address, which is not indexed, from the EMPLOYEE table with 3 million records; emp_grade is an indexed column.

select max(emp_address) from employee a
where emp_grade<4000

Since 80% of the EMPLOYEE table’s records must be retrieved to find the maximum emp_address string, the Table Access Full on the EMPLOYEE table shown in the query plan is reasonable.

How many ways are there to build an index to improve this SQL?
Although it is a simple SQL, there are still 3 ways to build an index to improve it. The following are the possible indexes: the first is a single-column index, and the second and third are composite indexes with different column orders.
1. EMP_ADDRESS
2. EMP_GRADE, EMP_ADDRESS
3. EMP_ADDRESS, EMP_GRADE

Most people may pick EMP_ADDRESS as the first choice to improve this SQL. Let’s see what the query plan looks like if we build a virtual index on the EMP_ADDRESS column. You can see that the estimated cost is reduced by almost half, but this query plan is not used once the physical index is built for benchmarking, because actual statistics are then collected.

The following query plan shows that the EMP_ADDRESS index is not used; the plan is the same as that of the original SQL without any new index.

Let’s try the second composite index (EMP_GRADE, EMP_ADDRESS). The new query plan shows an Index Fast Full Scan of this index; it is a reasonable plan in which no table data needs to be retrieved. As a result, the execution time is reduced from 16.83 seconds to 3.89 seconds.

Let’s test the last composite index (EMP_ADDRESS, EMP_GRADE), in which EMP_ADDRESS is the first column. It produces a new query plan with an extra FIRST ROW operation over an INDEX FULL SCAN (MIN/MAX), which cuts the execution time from 16.83 seconds to 0.08 seconds.
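
The three candidate indexes can be sketched in Oracle SQL as follows. The index names are hypothetical, and each candidate is meant to be benchmarked on its own, so it is dropped before the next one is created.

```sql
-- Candidate 1: single-column index.
-- ASSUMPTION: all index names here are illustrative.
CREATE INDEX emps_address_inx ON employee (emp_address);
-- (benchmark the SQL here)
DROP INDEX emps_address_inx;

-- Candidate 2: composite index with the filter column first.
CREATE INDEX emps_grade_address_inx ON employee (emp_grade, emp_address);
-- (benchmark the SQL here)
DROP INDEX emps_grade_address_inx;

-- Candidate 3: composite index with the aggregated column first.
CREATE INDEX emps_address_grade_inx ON employee (emp_address, emp_grade);

-- The SQL under test.
select max(emp_address) from employee a
where emp_grade<4000;
```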

So indexing is sometimes an art that deserves close attention; some candidate solutions may exceed your expectations.

The best index solution is now more than 200 times faster than the original SQL without an index. This kind of index recommendation can be achieved automatically by Tosska SQL Tuning Expert for Oracle.

https://tosska.com/tosska-sql-tuning-expert-pro-tse-pro-for-oracle/

How to use FORCE INDEX Hints to tune an UPDATE SQL statement?

We commonly use FORCE INDEX hints to enable an index search for a SQL statement when a specific index is not used, which happens when the database SQL optimizer estimates that not using the index will perform better. But enabling an index is not as simple as adding an index search to the query plan: it may change the structure of the query plan entirely, so forecasting the performance of a new FORCE INDEX hint is not easy. Here is an example that shows how to use FORCE INDEX hints to tune a SQL statement.

The following simple SQL updates EMP_SUBSIDIARY if the emp_id is found in EMPLOYEE with certain criteria.

update EMP_SUBSIDIARY set emp_name=concat(emp_name,'(Headquarter)')
where emp_id in
(SELECT emp_id
  FROM EMPLOYEE
WHERE  emp_salary <1000000
   and emp_grade<1150)

The following is the query plan of this SQL, which takes 18.38 seconds. The query plan shows a Full Table Scan of EMPLOYEE and then a Nested Loop to EMP_SUBSIDIARY with a Unique Key Lookup on the Emp_sub_PK index.

We can see that the filter condition “emp_salary <1000000 and emp_grade<1150” is applied during the full table scan of EMPLOYEE. The estimated “filtered” value (the ratio of rows produced per rows examined) is 3.79%, which suggests that the MySQL SQL optimizer has failed to use an index to scan the EMPLOYEE table. We should consider forcing MySQL to use the index on either emp_salary or emp_grade.

Unless you fully understand the data distribution and do a very precise calculation, you cannot tell which index is the best.
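
A rough first look at the data distribution can be sketched with a single MySQL query that counts how many rows satisfy each condition. Selectivity alone does not decide which hint wins, since a forced index can change the whole plan structure, so both hints still need to be benchmarked.

```sql
-- Count the rows matched by each filter condition on its own and by both together.
-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT COUNT(*)                                        AS total_rows,
       SUM(emp_salary < 1000000)                       AS salary_matches,
       SUM(emp_grade < 1150)                           AS grade_matches,
       SUM(emp_salary < 1000000 AND emp_grade < 1150)  AS both_match
FROM   EMPLOYEE;
```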

Let’s try to force the index of emp_salary first.

update   EMP_SUBSIDIARY
set    emp_name=concat(emp_name,'(Headquarter)')
where emp_id in (select  emp_id
         from    EMPLOYEE FORCE INDEX(`emps_salary_inx`)
         where  emp_salary < 1000000
           and emp_grade < 1150)

This SQL takes 8.92 seconds and is about 2 times faster than the original SQL without FORCE INDEX hints.

Now let’s try forcing the emp_grade index.

update   EMP_SUBSIDIARY
set    emp_name=concat(emp_name,'(Headquarter)')
where emp_id in (select  emp_id
         from    EMPLOYEE FORCE INDEX(`emps_grade_inx`)
         where  emp_salary < 1000000
           and emp_grade < 1150)

Here is the resulting query plan of the SQL with the FORCE INDEX(`emps_grade_inx`) hint injected; the execution time is reduced to 3.95 seconds. The new query plan shows an Index Range Scan of EMPLOYEE by the EMP_GRADE index; the result is fed into subquery2 (a temp table) and then a Nested Loop to EMP_SUBSIDIARY for the update. This query plan has a lower estimated cost and performs better than the original. Because the plan space explored during real-time SQL optimization is limited, the optimizer cannot generate this plan for the original SQL text, so manual hint injection is needed to help the MySQL SQL optimizer find a better query plan.

This kind of rewrite can be achieved automatically by Tosska SQL Tuning Expert for MySQL, which shows that the hint-injected SQL is more than 4.6 times faster than the original SQL.

https://tosska.com/tosska-sql-tuning-expert-tse-for-mysql-2/